Call the OpenAI-compatible Chat Completions API

Point curl, GitHub Actions, or the OpenAI Python SDK at Auxot's Chat Completions endpoint — same JSON OpenAI clients expect — with `model: "auto"` riding your GPU → CLI → cloud order.

Plus: three Admin-Agent paste saves — unstick 401/base-url mistakes, sketch a two-turn tool loop without SDK folklore, and decide stream-off vs SSE for CI logs.

Audience: Developers · Admins
Time: ~8 min
Prerequisites: A working personal or team API key ([Generate your first API key](/tutorials/generate-your-first-api-key)). Comfort with env vars or curl. At least one healthy inference path ([Take Auxot's pulse in 10 seconds](/tutorials/take-auxots-pulse)); otherwise calls authenticate but then fail upstream. For **live tool execution** (not just `tool_calls` in JSON), a tools worker must be connected on stacks that use external tools ([Run the tools worker (auxot-tools)](/tutorials/run-the-tools-worker)).
You'll end up with: One successful `POST /api/openai/chat/completions` with Bearer auth, the correct **`/api/openai`** SDK base (no `/chat/completions` suffix), and an optional streaming proof, plus clarity on how this path differs from Anthropic Messages ([Call the Anthropic-compatible Messages API](/tutorials/call-the-anthropic-compatible-messages-api)).

When a tutorial shows italic text in quotation marks, it usually mirrors a label or helper string inside Auxot. Product copy changes between releases — if something reads differently in your workspace, trust what you see on screen.

Callouts with a Worth knowing gold accent are meant as must-read context before you move on. Blockquotes that open with Tip are lighter, optional depth.

Why this matters

The Generate your first API key tutorial ends with a single curl, which is enough to prove the key works. Most automation, though, stays on the OpenAI Chat Completions dialect long-term:

  • messages array (system, user, assistant, tool).
  • Optional tools + tool_choice for function calling.
  • stream: true → Server-Sent Events chunks shaped like OpenAI.

Auxot serves POST /api/openai/chat/completions (OpenAI-Compatible API). Swap host + Auxot API key; keep your SDK’s chat.completions.create calls.

Claude-first stacks belong on /api/anthropic/v1/messages instead (Call the Anthropic-compatible Messages API). The JSON differs, but the routing idea behind model: "auto" is the same (Providers overview).

Nothing streams until a client actually sends an HTTP request, and that holds for scripts, CI jobs, and notebooks alike.


Quick start

  1. Key: Settings → API Keys (user.… or team.…) (Create a shared Team API Key when automation should outlive one person).
  2. Origin: scheme + host (+ port in dev). Typical Auxot Server docs use 8420; OSS auxot-router examples often use 8080 (don’t mix them blindly) (Run the open-source inference router).
  3. Path: POST …/api/openai/chat/completions (full URL), or configure the SDK base_url as https://YOUR-HOST/api/openai. Do not append /chat/completions to the SDK base (OpenAI-Compatible API).
  4. Headers: Authorization: Bearer <key>, Content-Type: application/json.
  5. Body: at minimum model (often "auto") and messages. Add stream, tools, max_tokens, etc. per the manual table (OpenAI-Compatible API).

Done? Response JSON shows choices[0].message (non-stream) or SSE data: lines ending in [DONE], not Anthropic content blocks.
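
The five steps above can be sketched in standard-library Python. This is a minimal illustration, not Auxot's official client: `build_chat_request` is a hypothetical helper name, the host and key are placeholders, and the request is only assembled, never sent.

```python
import json
import urllib.request


def build_chat_request(origin: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Assemble (but do not send) a Chat Completions POST for an Auxot origin."""
    # Full URL = origin (scheme + host + optional port) + canonical path.
    url = origin.rstrip("/") + "/api/openai/chat/completions"
    body = {
        "model": "auto",  # let Auxot route; pin a concrete model ID if you need to
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder origin and key; nothing goes over the wire here.
request = build_chat_request("https://YOUR-AUXOT-HOST", "user.YOUR_KEY_HERE", "Reply with ok.")
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen(request)` would perform the call; keeping construction separate makes the URL and headers easy to inspect when a request fails.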


The agent can do that?

1. Unstick auth + base URL

After a failed request, paste redacted details into Admin Agent chat:

OpenAI SDK pointed at Auxot. HTTP [status]. base_url=[what you set]. Example full URL if any. Key prefix user. or team. only — full secret redacted. Error body excerpt if JSON. What should I verify first — trailing slash, wrong port (8420 vs 8080), `/api/openai/v1` alias confusion, or nginx vs API host?

Why it’s non-obvious: OpenAI’s official SDK defaults to api.openai.com, so your failure is usually a base_url that is off by one path segment or a TLS hostname mismatch, not “the model is dumb.”
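
Before pasting into the Admin Agent, a quick local check can catch the most common base_url slips. The sketch below is illustrative only: `base_url_notes` is a hypothetical helper, and its rules simply mirror the checklist in the prompt above (trailing slash, `/chat/completions` suffix, `/v1` alias, port confusion). It does not contact any server.

```python
from urllib.parse import urlsplit


def base_url_notes(base_url: str) -> list[str]:
    """Return things worth verifying about a candidate SDK base_url."""
    notes = []
    parts = urlsplit(base_url)
    path = parts.path

    if parts.scheme not in ("http", "https") or not parts.netloc:
        notes.append("base_url needs a scheme and host")
    if path.endswith("/"):
        notes.append("trailing slash: try removing it")
        path = path.rstrip("/")
    if path.endswith("/chat/completions"):
        notes.append("drop /chat/completions; the SDK appends it to the base")
    elif path == "/api/openai/v1":
        notes.append("/api/openai/v1 is an alias; /api/openai is the canonical base")
    elif path != "/api/openai":
        notes.append(f"unexpected path {path!r}; expected /api/openai")
    if parts.port in (8420, 8080):
        # Both ports appear in docs, so confirm which service you are hitting.
        notes.append("confirm the port: Auxot Server docs use 8420, auxot-router examples 8080")
    return notes


for note in base_url_notes("https://YOUR-HOST/api/openai/chat/completions"):
    print("-", note)
```

An empty list means the URL has the expected shape; it says nothing about whether the key or the host is valid.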

2. Two-turn tool sketch (Chat Completions shape)

Draft minimal Chat Completions JSON for Auxot: (1) user message + tools array with one function; (2) assistant message with tool_calls; (3) tool role message with tool_call_id + JSON string result; (4) final assistant reply. Placeholder names only — I'll curl myself.

Why it’s non-obvious: Anthropic uses tool_use / tool_result blocks (Call the Anthropic-compatible Messages API), while OpenAI nests tool_calls on the assistant message and adds a separate tool role row. Copying fields across dialects breaks silently.
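
For orientation, here is roughly what that four-part sequence looks like as Python dictionaries in the OpenAI Chat Completions shape. Everything specific is a placeholder invented for illustration: the `get_weather` function, the `call_placeholder_1` id, and the weather values are not from Auxot's manual.

```python
import json

# Hypothetical tool definition; name and parameters are placeholders.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# (1) First request: user message plus the tools array.
first_request = {
    "model": "auto",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
}

# (2) Assistant turn as it would come back; arguments is a JSON *string*.
assistant_turn = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_placeholder_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": json.dumps({"city": "Paris"})},
    }],
}

# (3) Tool result: tool_call_id echoes the id from step 2 (made-up values).
tool_turn = {
    "role": "tool",
    "tool_call_id": assistant_turn["tool_calls"][0]["id"],
    "content": json.dumps({"temp_c": 18, "sky": "cloudy"}),
}

# Second request; (4) the final assistant reply arrives in its response.
second_request = {
    "model": "auto",
    "messages": first_request["messages"] + [assistant_turn, tool_turn],
    "tools": tools,
}

print(json.dumps(second_request, indent=2))
```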

3. Stream vs non-stream for automation

Batch job in CI: should I set stream false and parse one JSON body, or stream true and accumulate deltas for [metrics | UX | progress logs]? Gate on log volume and retry semantics only.

Why it’s non-obvious: SSE is nicer interactively, but stream: false is often simpler for jq and exit-code gates. Paste your workload and pick deliberately.
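
To make the non-streaming side of that trade-off concrete, here is a small sketch of an exit-code gate over a single JSON body. `gate_exit_code` is a hypothetical helper and the sample body is hand-written; the only assumption about the response is the choices[0].message.content path described earlier.

```python
import json


def gate_exit_code(body_text: str, expected: str) -> int:
    """Return 0 if the single JSON body contains the expected reply, else 1."""
    try:
        body = json.loads(body_text)
        content = body["choices"][0]["message"]["content"] or ""
    except (json.JSONDecodeError, KeyError, IndexError, TypeError):
        return 1  # malformed or error-shaped bodies fail the gate
    return 0 if expected.lower() in content.lower() else 1


sample_body = '{"choices": [{"message": {"role": "assistant", "content": "ok"}}]}'
print(gate_exit_code(sample_body, "ok"))  # prints 0
```

In a CI step the result would typically be handed to `sys.exit(...)`, so one parse decides pass or fail with no delta accumulation.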


Go deeper

model: "auto"

Auxot picks among healthy providers (Providers overview). Pin a concrete model ID when you need stable routing or comparisons (Read provider routing like an operator).

/api/openai/v1/... aliases

Still accepted for picky clients (OpenAI-Compatible API), but prefer the canonical paths in new scripts so teammates can grep for one pattern.

Tool calls ≠ executed tools

Chat Completions can return tool_calls JSON. Actually running MCP/built-ins requires Auxot’s tools path to be connected (Tool Worker Policies, Run the tools worker (auxot-tools)). Scripts that fake tool results (for example, unit tests) can loop locally without a worker.

Minimal curl (non-streaming)
curl -sS "https://YOUR-AUXOT-HOST/api/openai/chat/completions" \
  -H "Authorization: Bearer user.YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "user", "content": "Reply with the single word ok."}
    ]
  }' | jq '.choices[0].message.content'

Swap host and key; install jq if you want one-field extraction.

OpenAI Python SDK (from manual)
from openai import OpenAI

client = OpenAI(
    base_url="https://YOUR-AUXOT-HOST/api/openai",
    api_key="user.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
)

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)

The same pattern works with stream=True; iterate over the returned chunks as described in the OpenAI SDK docs (OpenAI-Compatible API).
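
A streaming variant might look like the sketch below. It assumes the OpenAI Python SDK's usual chunk shape (`chunk.choices[0].delta.content`); `stream_reply` is a hypothetical helper that takes an already-configured client, so the block itself does not import the SDK or make a network call.

```python
def stream_reply(client, prompt: str, model: str = "auto") -> str:
    """Stream one completion, echo deltas as they arrive, and return the full text."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    parts = []
    for chunk in stream:
        if not chunk.choices:  # some chunks may carry no choices at all
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # role-only and final chunks have no content
            parts.append(delta)
            print(delta, end="", flush=True)
    print()
    return "".join(parts)


# Usage (placeholders, requires `pip install openai`):
# from openai import OpenAI
# client = OpenAI(base_url="https://YOUR-AUXOT-HOST/api/openai", api_key="user.YOUR_KEY_HERE")
# text = stream_reply(client, "Hello!")
```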


Walkthrough

Step 1: Export secrets locally

export AUXOT_BASE_URL="https://YOUR-AUXOT-HOST"   # no trailing slash
export AUXOT_API_KEY="user.your_key_here"

Never commit real keys; CI should use encrypted variables (Pick personal or team API keys for real work).
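
If a Python script consumes these variables, reading them in one place and failing fast keeps later errors readable. A minimal sketch, with `load_auxot_config` as a hypothetical helper name; the variable names match the exports above.

```python
import os


def load_auxot_config(environ=os.environ) -> tuple[str, str]:
    """Read AUXOT_BASE_URL and AUXOT_API_KEY, failing fast if either is missing."""
    missing = [name for name in ("AUXOT_BASE_URL", "AUXOT_API_KEY") if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    base_url = environ["AUXOT_BASE_URL"].rstrip("/")  # tolerate a stray trailing slash
    return base_url, environ["AUXOT_API_KEY"]


# Example with an explicit mapping instead of the real environment:
base_url, api_key = load_auxot_config(
    {"AUXOT_BASE_URL": "https://YOUR-AUXOT-HOST/", "AUXOT_API_KEY": "user.your_key_here"}
)
print(base_url)
```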

Step 2: Prove non-streaming completions

curl -sS "$AUXOT_BASE_URL/api/openai/chat/completions" \
  -H "Authorization: Bearer $AUXOT_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg m "auto" '{model:$m, messages:[{role:"user", content:"Ping with one word."}]}')"

Expect HTTP 200 and choices[0].message.content present.

Step 3: Prove streaming

Repeat with "stream": true in the body; stdout should show data: {…} lines and a final data: [DONE] (OpenAI-Compatible API).
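
To turn those data: lines back into text, a script needs only a few lines of parsing. The sketch below assumes OpenAI-shaped chunks (text under choices[].delta.content) and already-decoded string lines; `collect_sse_text` is a hypothetical helper and the sample lines are hand-written, not captured from a server.

```python
import json


def collect_sse_text(lines) -> str:
    """Join content deltas from OpenAI-style SSE lines, stopping at [DONE]."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)


sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Pi"}}]}',
    'data: {"choices": [{"delta": {"content": "ng"}}]}',
    "data: [DONE]",
]
print(collect_sse_text(sample))  # prints "Ping"
```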

Step 4 (optional): Tool-call round trip

Send a tools array like the manual example (OpenAI-Compatible API). If finish_reason is tool_calls, append the assistant and tool messages and POST again. Your script supplies the tool JSON unless Auxot executes tools for that deployment path (Run the tools worker (auxot-tools)).
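
The "append and POST again" step can be sketched as one function for the case where your script supplies the tool results. `next_messages` and the `run_tool` callback are hypothetical names; the response is treated as a plain dict in the OpenAI Chat Completions shape.

```python
import json
from typing import Callable, Optional


def next_messages(messages: list, response: dict, run_tool: Callable) -> Optional[list]:
    """Return the message list for the follow-up POST, or None if no tools were requested."""
    choice = response["choices"][0]
    if choice.get("finish_reason") != "tool_calls":
        return None  # the model answered directly; nothing to append
    assistant = choice["message"]
    follow_up = list(messages) + [assistant]
    for call in assistant["tool_calls"]:
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        result = run_tool(call["function"]["name"], args)
        follow_up.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must match the id the assistant issued
            "content": json.dumps(result),
        })
    return follow_up
```

A unit test can pass a fake `run_tool` that returns canned data, which exercises the loop without a tools worker.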

Step 5: Wire CI

GitHub Actions: store AUXOT_API_KEY and AUXOT_BASE_URL as secrets, then run the same curl command or a pip install openai step. Keep logs free of Bearer tokens.
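
If a CI script echoes request details for debugging, a small redaction pass helps keep tokens out of logs. This is a sketch under assumptions: `redact` is a hypothetical helper, and the pattern for bare `user.` / `team.` keys is a guess at the key format that may produce false positives, so tune it to your real keys.

```python
import re

# "Bearer <token>" in any header-like text.
_BEARER = re.compile(r"(Bearer\s+)\S+", re.IGNORECASE)
# Bare keys with a user./team. prefix; length and alphabet are assumptions.
_KEY = re.compile(r"\b(user|team)\.[A-Za-z0-9_\-]{8,}")


def redact(text: str) -> str:
    """Mask Bearer tokens and key-like strings before text reaches a log."""
    text = _BEARER.sub(r"\1***", text)
    return _KEY.sub(r"\1.***", text)


print(redact("Authorization: Bearer user.abcdefgh1234"))  # prints "Authorization: Bearer ***"
```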


What’s next

Reference