Endpoint
POST /api/openai/chat/completions
Auxot implements the OpenAI Chat Completions API specification, so any tool or SDK that works with OpenAI can point at Auxot instead by changing the base URL.
Authenticate with a personal (user.…) or team (team.…) API key, created under Settings → API Keys on the Auxot Server (see Authentication).
Request Format
curl http://localhost:8420/api/openai/chat/completions \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain the difference between TCP and UDP."}
    ],
    "temperature": 0.7,
    "max_tokens": 1024,
    "stream": false
  }'
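The same request can be assembled from Python with only the standard library. This is a minimal sketch rather than an official client: the helper name `build_chat_request` is hypothetical, and it assumes the endpoint path and bearer-token header shown in the curl example. It builds the request without sending it, so it can be inspected before a server is running.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, messages, model="auto", **params):
    """Build (but do not send) a POST request for the chat completions endpoint.

    Hypothetical helper for illustration; it is not part of Auxot.
    """
    payload = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/openai/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "http://localhost:8420",
    "<api-key>",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the difference between TCP and UDP."},
    ],
    temperature=0.7,
    max_tokens=1024,
    stream=False,
)

# To send it against a running server:
# with urllib.request.urlopen(request) as resp:
#     body = json.load(resp)
```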
Supported Parameters
| Parameter | Type | Description |
|---|---|---|
| model | string | Model ID or "auto" for router-selected. |
| messages | array | Conversation messages (system, user, assistant, tool roles). |
| temperature | float | Sampling temperature (0.0–2.0). Default: 1.0. |
| max_tokens | integer | Maximum tokens to generate. |
| stream | boolean | Enable streaming via Server-Sent Events. Default: false. |
| tools | array | Tool definitions for function calling. |
| tool_choice | string/object | Control tool use: "auto", "none", or specific tool. |
| top_p | float | Nucleus sampling parameter. |
| stop | string/array | Stop sequences. |
When model is "auto", Auxot picks the best available provider using priority routing (GPU → CLI → Cloud). You can also specify a concrete model ID like "gpt-4o" or "claude-sonnet-4-20250514" to target a specific model.
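The priority order can be pictured as a first-match lookup. The sketch below only illustrates the GPU → CLI → Cloud ordering described above. It is not Auxot's router, the names `PROVIDER_PRIORITY` and `pick_provider` are hypothetical, and the real router may take other factors into account.

```python
# Priority order taken from the text above: GPU first, then CLI, then Cloud.
# The lowercase labels are illustrative, not Auxot identifiers.
PROVIDER_PRIORITY = ("gpu", "cli", "cloud")


def pick_provider(available):
    """Return the highest-priority provider kind present in `available`, or None."""
    for kind in PROVIDER_PRIORITY:
        if kind in available:
            return kind
    return None


# With no GPU provider available, the CLI provider wins over Cloud.
choice = pick_provider({"cloud", "cli"})
```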
Response Format
{
  "id": "chatcmpl-auxot-abc123",
  "object": "chat.completion",
  "created": 1710432000,
  "model": "llama-3.3-70b-q4_k_m",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are both transport layer protocols..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 156,
    "total_tokens": 184
  }
}
The model field reports which model actually served the request, which is useful when you use "auto" routing.
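As a rough illustration, the fields most callers need can be read straight from the parsed JSON. The sketch below works on an abbreviated copy of the example response above and makes no network call. The variable names are arbitrary.

```python
import json

# Abbreviated copy of the example response shown above.
raw = """{
  "id": "chatcmpl-auxot-abc123",
  "model": "llama-3.3-70b-q4_k_m",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are both transport layer protocols..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 28, "completion_tokens": 156, "total_tokens": 184}
}"""

response = json.loads(raw)

served_by = response["model"]  # the model that actually handled the request
reply = response["choices"][0]["message"]["content"]
usage = response["usage"]

print(f"served by {served_by}, {usage['total_tokens']} tokens")
```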
Streaming
Set "stream": true to receive responses as Server-Sent Events:
curl http://localhost:8420/api/openai/chat/completions \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Write a haiku about servers."}],
    "stream": true
  }'
Response stream:
data: {"id":"chatcmpl-auxot-abc123","choices":[{"delta":{"role":"assistant"},"index":0}]}
data: {"id":"chatcmpl-auxot-abc123","choices":[{"delta":{"content":"Silicon"},"index":0}]}
data: {"id":"chatcmpl-auxot-abc123","choices":[{"delta":{"content":" hums"},"index":0}]}
...
data: {"id":"chatcmpl-auxot-abc123","choices":[{"delta":{},"index":0,"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":17,"total_tokens":29}}
data: [DONE]
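A client reassembles the reply by concatenating the `content` of each delta until it sees `[DONE]`. The sketch below shows one way to do that over an iterable of already-received lines. The function name `collect_stream_text` is hypothetical, and it assumes the stream has the shape shown above.

```python
import json


def collect_stream_text(lines):
    """Concatenate delta content from SSE `data:` lines until `[DONE]`."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# The first three chunks from the example stream, followed by the sentinel.
sample = [
    'data: {"id":"chatcmpl-auxot-abc123","choices":[{"delta":{"role":"assistant"},"index":0}]}',
    'data: {"id":"chatcmpl-auxot-abc123","choices":[{"delta":{"content":"Silicon"},"index":0}]}',
    'data: {"id":"chatcmpl-auxot-abc123","choices":[{"delta":{"content":" hums"},"index":0}]}',
    "data: [DONE]",
]
text = collect_stream_text(sample)
print(text)  # prints "Silicon hums"
```

The role-only first chunk and the empty final delta contribute no text, so they are skipped.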
Tool Calls (Function Calling)
Define tools in the request and the model can call them:
curl http://localhost:8420/api/openai/chat/completions \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "user", "content": "What is the weather in San Francisco?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'
The model responds with a tool call:
{
  "choices": [{
    "message": {
      "role": "assistant",
      "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\": \"San Francisco\"}"
        }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}
You then send the tool result back in a follow-up request with role: "tool".
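The follow-up message list can be built as sketched below. This assumes the OpenAI Chat Completions convention, in which each `tool` message carries a `tool_call_id` matching the call it answers. Auxot is assumed to expect the same, since it implements that specification. The `get_weather` implementation, the `TOOLS` registry, and `build_followup_messages` are hypothetical stand-ins.

```python
import json


def get_weather(location):
    """Hypothetical stand-in for a real weather lookup."""
    return {"location": location, "forecast": "placeholder"}


# Maps tool names from the request to local callables (illustrative only).
TOOLS = {"get_weather": get_weather}


def build_followup_messages(messages, assistant_message):
    """Append the assistant's tool-call turn plus one `tool` message per call."""
    followup = list(messages) + [assistant_message]
    for call in assistant_message.get("tool_calls", []):
        func = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        followup.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result to the call it answers
            "content": json.dumps(func(**args)),
        })
    return followup


messages = [{"role": "user", "content": "What is the weather in San Francisco?"}]
assistant_message = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"location": "San Francisco"}'},
    }],
}
followup = build_followup_messages(messages, assistant_message)
```

Sending `followup` as the `messages` array of a new request lets the model produce its final answer from the tool output.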
Using with the OpenAI Python SDK
from openai import OpenAI

# Point the SDK at Auxot instead of api.openai.com.
client = OpenAI(
    base_url="http://localhost:8420/api/openai",
    api_key="user.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",  # personal or team Auxot key
)

response = client.chat.completions.create(
    model="auto",  # let Auxot route the request
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
Using with Cursor (local Auxot)
Cursor can send chat requests through Auxot’s OpenAI-compatible endpoint: set the OpenAI API Key to an Auxot key and override the OpenAI base URL with http://<host>:<port>/api/openai. See Claude Code & Cursor (Local) for the API key, the Cursor UI settings, and the Claude Code environment configuration on a single Auxot instance.
/api/openai/v1/... aliases are still accepted for client compatibility.