Endpoint

POST /api/anthropic/v1/messages

Auxot implements the Anthropic Messages API specification, allowing any tool built for the Anthropic API to work with Auxot by changing the base URL. This is particularly useful for Claude Code integration.

Use a personal (user.…) or team (team.…) API key from Settings → API Keys (see Authentication below).

Request Format

curl http://localhost:8420/api/anthropic/v1/messages \
  -H "x-api-key: <api-key>" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "max_tokens": 1024,
    "system": "You are a senior software engineer.",
    "messages": [
      {"role": "user", "content": "Review this Go function for potential issues:\n\nfunc divide(a, b int) int {\n  return a / b\n}"}
    ]
  }'

Supported Parameters

Parameter        Type      Description
model            string    Model ID, or "auto" to let the router select a model.
messages         array     Conversation messages (user and assistant roles).
system           string    System prompt (separate from messages in the Anthropic format).
max_tokens       integer   Maximum tokens to generate. Required.
temperature      float     Sampling temperature (0.0–1.0). Default: 1.0.
stream           boolean   Enable streaming via Server-Sent Events. Default: false.
tools            array     Tool definitions for function calling.
tool_choice      object    Control tool use: {"type": "auto"}, {"type": "any"}, or {"type": "tool", "name": "..."}.
top_p            float     Nucleus sampling parameter.
stop_sequences   array     Custom stop sequences.
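Only model, max_tokens, and messages are required; everything else in the table is optional. As a minimal sketch in Python, building and serializing a valid request body:

```python
import json

# Minimal request body: model, max_tokens, and messages are the only
# required fields; all other parameters in the table are optional.
body = {
    "model": "auto",      # or a specific model ID
    "max_tokens": 1024,   # required by the Messages API
    "messages": [
        {"role": "user", "content": "Hello!"},
    ],
}

payload = json.dumps(body)
print(payload)
```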

Authentication

The Anthropic-compatible endpoint accepts API keys via either header:

  • x-api-key: <key> (Anthropic convention)
  • Authorization: Bearer <key> (standard Bearer auth)

Both work identically. Use whichever your client library expects.
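As an illustration, a hypothetical helper that builds the header set for either style (the header names come from the list above; the function itself is not part of any SDK):

```python
# Illustrative helper, not part of any SDK: build request headers
# using either of the two accepted authentication styles.
def auth_headers(api_key: str, style: str = "x-api-key") -> dict:
    headers = {
        "anthropic-version": "2023-06-01",
        "Content-Type": "application/json",
    }
    if style == "x-api-key":
        headers["x-api-key"] = api_key                  # Anthropic convention
    else:
        headers["Authorization"] = f"Bearer {api_key}"  # standard Bearer auth
    return headers
```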

Response Format

{
  "id": "msg-auxot-abc123",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-20250514",
  "content": [
    {
      "type": "text",
      "text": "This function has a critical issue: division by zero. When `b` is 0, this will cause a runtime panic in Go.\n\nHere's a safer version:\n\n```go\nfunc divide(a, b int) (int, error) {\n  if b == 0 {\n    return 0, errors.New(\"division by zero\")\n  }\n  return a / b, nil\n}\n```"
    }
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 45,
    "output_tokens": 87
  }
}
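Clients typically pull the generated text out of the content array and the token counts out of usage. A minimal sketch, using a trimmed copy of the example response above:

```python
import json

# Example response, trimmed from the document above.
raw = """{
  "id": "msg-auxot-abc123",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "This function has a critical issue."}],
  "stop_reason": "end_turn",
  "usage": {"input_tokens": 45, "output_tokens": 87}
}"""

resp = json.loads(raw)

# Concatenate all text blocks; non-text blocks (e.g. tool_use) are skipped.
text = "".join(b["text"] for b in resp["content"] if b["type"] == "text")
total_tokens = resp["usage"]["input_tokens"] + resp["usage"]["output_tokens"]

print(text)
print(total_tokens)  # 132
```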

Streaming

Set "stream": true to receive Server-Sent Events:

curl http://localhost:8420/api/anthropic/v1/messages \
  -H "x-api-key: <api-key>" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Explain mutex vs channel in Go."}],
    "stream": true
  }'

Response stream:

event: message_start
data: {"type":"message_start","message":{"id":"msg-auxot-abc123","type":"message","role":"assistant","model":"claude-sonnet-4-20250514","content":[],"usage":{"input_tokens":15}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"In Go, both"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" mutexes and channels"}}

...

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":142}}

event: message_stop
data: {"type":"message_stop"}
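To reassemble the reply, a client only needs the text_delta payloads. A minimal sketch that parses `data:` lines of the kinds shown above (event framing simplified; a production client should use a real SSE parser or the SDK's streaming helper):

```python
import json

# Simplified SSE handling: scan data: lines and accumulate
# text_delta fragments into the final message text.
def collect_text(sse_lines):
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip event: lines and blank keep-alives
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event["delta"]
            if delta.get("type") == "text_delta":
                parts.append(delta["text"])
    return "".join(parts)

# Sample data: lines taken from the stream shown above.
stream = [
    'data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"In Go, both"}}',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" mutexes and channels"}}',
    'data: {"type":"message_stop"}',
]
print(collect_text(stream))  # In Go, both mutexes and channels
```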

Tool Calls

Define tools using the Anthropic tool format:

curl http://localhost:8420/api/anthropic/v1/messages \
  -H "x-api-key: <api-key>" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Look up the latest deploy status for the payments service."}
    ],
    "tools": [
      {
        "name": "get_deploy_status",
        "description": "Get the latest deployment status for a service",
        "input_schema": {
          "type": "object",
          "properties": {
            "service": {"type": "string", "description": "Service name"}
          },
          "required": ["service"]
        }
      }
    ]
  }'

The model responds with a tool_use content block and stop_reason "tool_use":

{
  "content": [
    {
      "type": "tool_use",
      "id": "toolu_abc123",
      "name": "get_deploy_status",
      "input": {"service": "payments"}
    }
  ],
  "stop_reason": "tool_use"
}

Send the result back in the next request:

{
  "messages": [
    {"role": "user", "content": "Look up the latest deploy status for the payments service."},
    {"role": "assistant", "content": [{"type": "tool_use", "id": "toolu_abc123", "name": "get_deploy_status", "input": {"service": "payments"}}]},
    {"role": "user", "content": [{"type": "tool_result", "tool_use_id": "toolu_abc123", "content": "payments v2.4.1 deployed successfully at 2026-03-14T15:30:00Z"}]}
  ]
}
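The round trip above can be sketched as a small dispatch loop. Here get_deploy_status is a stand-in local function and its status string is invented for illustration; only the message shapes follow the format shown above:

```python
# Stand-in local implementation of the tool; in practice this would
# query your deployment system.
def get_deploy_status(service: str) -> str:
    return f"{service} v2.4.1 deployed successfully"

TOOLS = {"get_deploy_status": get_deploy_status}

def tool_results_message(assistant_content):
    """Turn the tool_use blocks of an assistant turn into the user
    message carrying the matching tool_result blocks."""
    results = []
    for block in assistant_content:
        if block["type"] != "tool_use":
            continue
        output = TOOLS[block["name"]](**block["input"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # must match the tool_use id
            "content": output,
        })
    return {"role": "user", "content": results}

# The tool_use block from the example response above.
assistant_turn = [
    {"type": "tool_use", "id": "toolu_abc123",
     "name": "get_deploy_status", "input": {"service": "payments"}},
]
msg = tool_results_message(assistant_turn)
```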

Using with Claude Code

Point Claude Code at your Auxot instance to route all inference through your own infrastructure:

export ANTHROPIC_BASE_URL=http://localhost:8420/api/anthropic/v1
export ANTHROPIC_API_KEY=user.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

claude

Claude Code will now use Auxot for all completions. If you have GPU workers running a compatible model, requests will route there first; otherwise they’ll fall through to CLI workers or cloud Anthropic.

See Claude Code & Cursor (Local) for API key creation, shell persistence, LAN base URLs, and the Cursor OpenAI override on the same instance.

Using with the Anthropic Python SDK

import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:8420/api/anthropic/v1",
    api_key="user.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
)

message = client.messages.create(
    model="auto",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)

print(message.content[0].text)