Endpoint
`POST /api/anthropic/v1/messages`
Auxot implements the Anthropic Messages API specification, allowing any tool built for the Anthropic API to work with Auxot by changing the base URL. This is particularly useful for Claude Code integration.
Use a personal (`user.…`) or team (`team.…`) API key from Settings → API Keys (see Authentication below).
Request Format
```bash
curl http://localhost:8420/api/anthropic/v1/messages \
  -H "x-api-key: <api-key>" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "max_tokens": 1024,
    "system": "You are a senior software engineer.",
    "messages": [
      {"role": "user", "content": "Review this Go function for potential issues:\n\nfunc divide(a, b int) int {\n return a / b\n}"}
    ]
  }'
```
Supported Parameters
| Parameter | Type | Description |
|---|---|---|
| `model` | string | Model ID, or `"auto"` for router-selected. |
| `messages` | array | Conversation messages (`user` and `assistant` roles). |
| `system` | string | System prompt (separate from `messages` in the Anthropic format). |
| `max_tokens` | integer | Maximum tokens to generate (required). |
| `temperature` | float | Sampling temperature (0.0–1.0). Default: 1.0. |
| `stream` | boolean | Enable streaming via Server-Sent Events. Default: `false`. |
| `tools` | array | Tool definitions for function calling. |
| `tool_choice` | object | Control tool use: `{"type": "auto"}`, `{"type": "any"}`, or `{"type": "tool", "name": "..."}`. |
| `top_p` | float | Nucleus sampling parameter. |
| `stop_sequences` | array | Stop sequences. |
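For instance, the optional sampling parameters can be combined in a single request body (the values below are illustrative, not recommendations):

```json
{
  "model": "auto",
  "max_tokens": 512,
  "temperature": 0.2,
  "top_p": 0.9,
  "stop_sequences": ["END"],
  "messages": [
    {"role": "user", "content": "Summarize the tradeoffs of mutexes vs channels in Go."}
  ]
}
```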
Authentication
The Anthropic-compatible endpoint accepts API keys via either header:
- `x-api-key: <key>` (Anthropic convention)
- `Authorization: Bearer <key>` (standard Bearer auth)
Both work identically. Use whichever your client library expects.
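As a sketch, a client can build its headers either way from the same key (the `auth_headers` helper below is illustrative, not part of any SDK):

```python
def auth_headers(api_key: str, style: str = "x-api-key") -> dict:
    """Build request headers for the Anthropic-compatible endpoint.

    Both header styles carry the same key; pick whichever the client expects.
    """
    headers = {
        "anthropic-version": "2023-06-01",
        "Content-Type": "application/json",
    }
    if style == "x-api-key":
        headers["x-api-key"] = api_key                    # Anthropic convention
    else:
        headers["Authorization"] = f"Bearer {api_key}"    # standard Bearer auth
    return headers
```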
Response Format
```json
{
  "id": "msg-auxot-abc123",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-20250514",
  "content": [
    {
      "type": "text",
      "text": "This function has a critical issue: division by zero. When `b` is 0, this will cause a runtime panic in Go.\n\nHere's a safer version:\n\n```go\nfunc divide(a, b int) (int, error) {\n if b == 0 {\n return 0, errors.New(\"division by zero\")\n }\n return a / b, nil\n}\n```"
    }
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 45,
    "output_tokens": 87
  }
}
```
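Note that `content` is a list of typed blocks, so clients should join the `text` blocks rather than assume a single entry. A minimal helper sketch (not part of any SDK):

```python
def extract_text(message: dict) -> str:
    """Join all text blocks in an Anthropic-format message,
    skipping non-text blocks such as tool_use."""
    return "".join(
        block["text"]
        for block in message.get("content", [])
        if block.get("type") == "text"
    )

# Shape matches the documented response above (text shortened here).
response = {
    "id": "msg-auxot-abc123",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "This function has a critical issue..."}],
    "stop_reason": "end_turn",
}
print(extract_text(response))  # → This function has a critical issue...
```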
Streaming
Set `"stream": true` to receive Server-Sent Events:

```bash
curl http://localhost:8420/api/anthropic/v1/messages \
  -H "x-api-key: <api-key>" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Explain mutex vs channel in Go."}],
    "stream": true
  }'
```
Response stream:
```
event: message_start
data: {"type":"message_start","message":{"id":"msg-auxot-abc123","type":"message","role":"assistant","model":"claude-sonnet-4-20250514","content":[],"usage":{"input_tokens":15}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"In Go, both"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" mutexes and channels"}}

...

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":142}}

event: message_stop
data: {"type":"message_stop"}
```
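Clients reassemble the reply by concatenating the `text_delta` payloads from `content_block_delta` events. A minimal parser sketch over raw SSE lines (error handling and other event types omitted):

```python
import json

def accumulate_text(sse_lines):
    """Collect text_delta fragments from an Anthropic-style SSE stream.

    Expects an iterable of raw lines; only `data:` lines carrying
    content_block_delta events contribute to the output.
    """
    parts = []
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip `event:` lines and blank keep-alives
        event = json.loads(line[len("data:"):].strip())
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta["text"])
    return "".join(parts)

# Sample lines taken from the stream shown above.
stream = [
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"In Go, both"}}',
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" mutexes and channels"}}',
    'event: message_stop',
    'data: {"type":"message_stop"}',
]
print(accumulate_text(stream))  # → In Go, both mutexes and channels
```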
Tool Calls
Define tools using the Anthropic tool format:
```bash
curl http://localhost:8420/api/anthropic/v1/messages \
  -H "x-api-key: <api-key>" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Look up the latest deploy status for the payments service."}
    ],
    "tools": [
      {
        "name": "get_deploy_status",
        "description": "Get the latest deployment status for a service",
        "input_schema": {
          "type": "object",
          "properties": {
            "service": {"type": "string", "description": "Service name"}
          },
          "required": ["service"]
        }
      }
    ]
  }'
```
The model responds with a tool use block:
```json
{
  "content": [
    {
      "type": "tool_use",
      "id": "toolu_abc123",
      "name": "get_deploy_status",
      "input": {"service": "payments"}
    }
  ],
  "stop_reason": "tool_use"
}
```
Send the result back in the next request:
```json
{
  "messages": [
    {"role": "user", "content": "Look up the latest deploy status for the payments service."},
    {"role": "assistant", "content": [{"type": "tool_use", "id": "toolu_abc123", "name": "get_deploy_status", "input": {"service": "payments"}}]},
    {"role": "user", "content": [{"type": "tool_result", "tool_use_id": "toolu_abc123", "content": "payments v2.4.1 deployed successfully at 2026-03-14T15:30:00Z"}]}
  ]
}
```
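This round trip can be automated: read the `tool_use` blocks from the assistant message, run each tool, and append a single `tool_result` user message. A sketch, where the `run_tool` dispatch callable is hypothetical and stands in for your own tool execution:

```python
def build_tool_results(assistant_message: dict, run_tool) -> dict:
    """Execute each tool_use block via run_tool(name, input) and wrap
    the outputs as one tool_result user message."""
    results = [
        {
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": run_tool(block["name"], block["input"]),
        }
        for block in assistant_message["content"]
        if block.get("type") == "tool_use"
    ]
    return {"role": "user", "content": results}

# Assistant message matching the documented tool_use response above.
assistant = {
    "content": [
        {"type": "tool_use", "id": "toolu_abc123",
         "name": "get_deploy_status", "input": {"service": "payments"}}
    ],
    "stop_reason": "tool_use",
}

# Hypothetical dispatch: a real client would look up and call its own tool here.
reply = build_tool_results(
    assistant, lambda name, args: f"{args['service']} v2.4.1 deployed"
)
```

Append `reply` to the `messages` array and re-send the request; the model then answers using the tool output.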
Using with Claude Code
Point Claude Code at your Auxot instance to route all inference through your own infrastructure. The Anthropic client appends `/v1/messages` to the base URL itself, so the base URL should stop at `/api/anthropic`:

```bash
export ANTHROPIC_BASE_URL=http://localhost:8420/api/anthropic
export ANTHROPIC_API_KEY=user.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
claude
```
Claude Code will now use Auxot for all completions. If you have GPU workers running a compatible model, requests will route there first; otherwise they’ll fall through to CLI workers or cloud Anthropic.
See Claude Code & Cursor (Local) → for API key creation, shell persistence, LAN base URLs, and Cursor OpenAI override on the same instance.
Using with the Anthropic Python SDK
```python
import anthropic

# The SDK appends /v1/messages itself, so the base URL stops at /api/anthropic.
client = anthropic.Anthropic(
    base_url="http://localhost:8420/api/anthropic",
    api_key="user.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
)

message = client.messages.create(
    model="auto",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)
```