Docker (Single Instance)

The simplest production deployment uses a single Docker container with embedded Redis:

docker run -d \
  -p 8080:8080 \
  -e AUXOT_ADMIN_KEY_HASH=argon2id\$... \
  -e AUXOT_API_KEY_HASH=argon2id\$... \
  --name auxot-router \
  ghcr.io/auxothq/auxot-router:latest

The auxot-router image is ~10 MB; it is built FROM scratch, with no OS layer.

To generate the key hashes before deploying:

docker run --rm ghcr.io/auxothq/auxot-router:latest setup
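The same deployment can be expressed as a compose file. A minimal sketch, assuming only the image, port, and environment variables from the docker run example above (compose uses "$$" to escape a literal "$" in values):

```yaml
# docker-compose.yml (sketch): single router instance with embedded Redis
services:
  auxot-router:
    image: ghcr.io/auxothq/auxot-router:latest
    ports:
      - "8080:8080"
    environment:
      # "$$" escapes a literal "$" in compose files
      AUXOT_ADMIN_KEY_HASH: "argon2id$$..."
      AUXOT_API_KEY_HASH: "argon2id$$..."
    restart: unless-stopped
```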

Fly.io

The router fits comfortably on Fly.io’s free tier (256 MB RAM). Generate secrets in Fly format:

auxot-router setup --fly
# Output: fly secrets set AUXOT_ADMIN_KEY_HASH=... AUXOT_API_KEY_HASH=...

Then deploy:

fly launch --image ghcr.io/auxothq/auxot-router:latest
fly secrets set AUXOT_ADMIN_KEY_HASH=... AUXOT_API_KEY_HASH=...
fly deploy
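fly launch writes a fly.toml for the app. A minimal sketch of what it might contain for this deployment; the field values here are assumptions, not generated output:

```toml
# fly.toml (sketch; app name and vm size are assumptions)
app = "auxot-router"

[build]
  image = "ghcr.io/auxothq/auxot-router:latest"

[http_service]
  internal_port = 8080
  force_https = true

[[vm]]
  memory = "256mb"
```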

Bare Metal

# Download and set up
curl -Lo auxot-router https://github.com/auxothq/auxot/releases/latest/download/auxot-router-$(uname -s)-$(uname -m)
chmod +x auxot-router
./auxot-router setup --write-env
source .env
./auxot-router
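To keep the bare-metal router running across reboots, a systemd unit can wrap the steps above. A sketch, assuming the binary and the .env written by setup --write-env live in /opt/auxot; the paths, user, and service name are assumptions:

```ini
# /etc/systemd/system/auxot-router.service (sketch)
[Unit]
Description=Auxot router
After=network-online.target
Wants=network-online.target

[Service]
User=auxot
EnvironmentFile=/opt/auxot/.env
ExecStart=/opt/auxot/auxot-router
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with systemctl daemon-reload && systemctl enable --now auxot-router.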

Multi-Instance Scaling

To run multiple router instances behind a load balancer, point all instances at the same external Redis:

AUXOT_REDIS_URL=redis://redis-host:6379 ./auxot-router

Workers connect to any router instance; jobs are coordinated through Redis.
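A compose sketch of this topology, assuming two instances behind an external load balancer (not shown) and the same key hashes as the single-instance example; the service names are arbitrary, and compose resolves redis by service name:

```yaml
# docker-compose.yml (sketch): two routers sharing one Redis
services:
  redis:
    image: redis:7-alpine
  router-1:
    image: ghcr.io/auxothq/auxot-router:latest
    environment:
      AUXOT_REDIS_URL: "redis://redis:6379"
      AUXOT_ADMIN_KEY_HASH: "argon2id$$..."   # "$$" escapes "$" in compose
      AUXOT_API_KEY_HASH: "argon2id$$..."
  router-2:
    image: ghcr.io/auxothq/auxot-router:latest
    environment:
      AUXOT_REDIS_URL: "redis://redis:6379"
      AUXOT_ADMIN_KEY_HASH: "argon2id$$..."
      AUXOT_API_KEY_HASH: "argon2id$$..."
```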


Docker — Worker (GPU or CLI)

Run a GPU or CLI worker alongside the router:

docker run -d \
  -e AUXOT_GPU_KEY=adm_xxx \
  -e AUXOT_ROUTER_URL=router:8080 \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  -v auxot-models:/home/nonroot/.auxot \
  --name auxot-worker \
  ghcr.io/auxothq/auxot-worker:latest

The volume mount persists downloaded models and llama.cpp binaries across container restarts. ANTHROPIC_API_KEY is only required when the router assigns this worker to CLI mode.
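AUXOT_ROUTER_URL=router:8080 implies the worker can resolve the router by hostname, e.g. on a shared Docker network. A compose sketch; the service names, and relying on compose's implicit shared network, are assumptions:

```yaml
# docker-compose.yml (sketch): router plus one GPU/CLI worker
services:
  router:
    image: ghcr.io/auxothq/auxot-router:latest
    ports:
      - "8080:8080"
    environment:
      AUXOT_ADMIN_KEY_HASH: "argon2id$$..."   # "$$" escapes "$" in compose
      AUXOT_API_KEY_HASH: "argon2id$$..."
  worker:
    image: ghcr.io/auxothq/auxot-worker:latest
    environment:
      AUXOT_GPU_KEY: "adm_xxx"
      AUXOT_ROUTER_URL: "router:8080"
    volumes:
      - auxot-models:/home/nonroot/.auxot

volumes:
  auxot-models:
```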


Docker — Agent

Run an autonomous agent worker:

docker run -d \
  -e AUXOT_AGENT_KEY=agnt.abc123... \
  -e AUXOT_ROUTER_URL=your-auxot-instance.com \
  -v /your/agent-workspace:/home/agent \
  --name auxot-agent \
  ghcr.io/auxothq/auxot-agent:latest

If /home/agent/SOUL.md is missing, the worker still starts: it connects in bootstrap mode with instructions to interview you and then write a real SOUL.md. Optionally, run npx @open-gitagent/gitagent init on the host workspace for a fuller gitagent layout before bind-mounting. Bind-mount a host directory to persist the workspace across restarts.


Worker Daemon Install

Install a GPU worker as a persistent system service:

# Linux (systemd) or macOS (launchd)
./auxot-worker install \
  --name qwen \
  --gpu-key adm_xxx \
  --router-url router:8080

The daemon auto-restarts on crash and starts on system boot.


Air-Gapped Deployment

For environments without internet access, pre-stage all dependencies:

  1. Download the model GGUF file on an internet-connected machine
  2. Download the llama.cpp server binary from GitHub Releases
  3. Transfer both to the air-gapped machine

Then start the worker with explicit local paths:

./auxot-worker \
  --model-path /data/models/Qwen3.5-35B-A3B-Q4_K_S.gguf \
  --llama-server-path /data/bin/llama-server \
  --gpu-key adm_xxx \
  --router-url router:8080

No HuggingFace or GitHub access is required at runtime.
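A pre-flight check before starting the worker can catch a missed transfer step. A sketch: check_staged is a hypothetical helper, and the paths are the staging locations from the example above.

```shell
# Verify that both staged artifacts exist and the binary is executable.
check_staged() {
  [ -f "$1" ] && [ -x "$2" ]
}

if check_staged /data/models/Qwen3.5-35B-A3B-Q4_K_S.gguf /data/bin/llama-server; then
  echo "artifacts staged; safe to start the worker"
else
  echo "missing model or llama-server binary" >&2
fi
```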


Docker Images

Image                               | Size     | Contents
------------------------------------|----------|---------
ghcr.io/auxothq/auxot-router:latest | ~10 MB   | Router binary only, FROM scratch
ghcr.io/auxothq/auxot-tools:latest  | ~140 MB  | Tools worker, bun, pnpm
ghcr.io/auxothq/auxot-worker:latest | ~400 MB  | GPU + CLI worker, Claude Code CLI, Node.js
ghcr.io/auxothq/auxot-agent:latest  | (varies) | Agent worker, gitagent-compatible workspace, Debian slim (no Node.js)

All images are multi-arch (linux/amd64, linux/arm64) and published to GitHub Container Registry.