Docker (Single Instance)
The simplest production deployment uses a single Docker container with embedded Redis:
```bash
docker run -d \
  -p 8080:8080 \
  -e AUXOT_ADMIN_KEY_HASH=argon2id\$... \
  -e AUXOT_API_KEY_HASH=argon2id\$... \
  --name auxot-router \
  ghcr.io/auxothq/auxot-router:latest
```
The auxot-router image is ~10 MB — it uses FROM scratch with no OS layer.
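An image this small is typically produced with a two-stage build that copies a static binary into an empty base. A minimal sketch of that pattern (the Go toolchain, stage names, and build path are illustrative assumptions, not the project's actual build):

```dockerfile
# Illustrative two-stage build: static binary, no OS layer in the final image
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# CGO_ENABLED=0 produces a fully static binary that can run FROM scratch
RUN CGO_ENABLED=0 go build -o /auxot-router ./cmd/auxot-router

FROM scratch
COPY --from=build /auxot-router /auxot-router
EXPOSE 8080
ENTRYPOINT ["/auxot-router"]
```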
To generate the key hashes before deploying:
```bash
docker run --rm ghcr.io/auxothq/auxot-router:latest setup
```
Fly.io
The router fits comfortably on Fly.io’s free tier (256 MB RAM). Generate secrets in Fly format:
```bash
auxot-router setup --fly
# Output: fly secrets set AUXOT_ADMIN_KEY_HASH=... AUXOT_API_KEY_HASH=...
```
Then deploy:
```bash
fly launch --image ghcr.io/auxothq/auxot-router:latest
fly secrets set AUXOT_ADMIN_KEY_HASH=... AUXOT_API_KEY_HASH=...
fly deploy
```
Bare Metal
```bash
# Download and set up
curl -Lo auxot-router https://github.com/auxothq/auxot/releases/latest/download/auxot-router-$(uname -s)-$(uname -m)
chmod +x auxot-router
./auxot-router setup --write-env
source .env
./auxot-router
```
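For source .env to work here, the file needs to export the variables it sets. An illustrative sketch of what --write-env produces (hash values elided; the exact layout may differ):

```bash
# .env (illustrative; real hashes are generated by `setup`)
export AUXOT_ADMIN_KEY_HASH='argon2id$...'
export AUXOT_API_KEY_HASH='argon2id$...'
```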
Multi-Instance Scaling
To run multiple router instances behind a load balancer, point all instances at the same external Redis:
```bash
AUXOT_REDIS_URL=redis://redis-host:6379 ./auxot-router
```
Workers connect to any router instance — jobs are coordinated through Redis.
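One way to wire this up is Docker Compose; a sketch under stated assumptions (service names, host ports, and the choice of redis:7-alpine are illustrative, and a load balancer such as nginx or Traefik would sit in front of the two routers):

```yaml
# Illustrative: two router replicas coordinated through one Redis
services:
  redis:
    image: redis:7-alpine
  router-1:
    image: ghcr.io/auxothq/auxot-router:latest
    environment:
      AUXOT_REDIS_URL: redis://redis:6379
      AUXOT_ADMIN_KEY_HASH: ${AUXOT_ADMIN_KEY_HASH}
      AUXOT_API_KEY_HASH: ${AUXOT_API_KEY_HASH}
    ports: ["8081:8080"]
  router-2:
    image: ghcr.io/auxothq/auxot-router:latest
    environment:
      AUXOT_REDIS_URL: redis://redis:6379
      AUXOT_ADMIN_KEY_HASH: ${AUXOT_ADMIN_KEY_HASH}
      AUXOT_API_KEY_HASH: ${AUXOT_API_KEY_HASH}
    ports: ["8082:8080"]
```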
Docker — Worker (GPU or CLI)
Run a GPU or CLI worker alongside the router:
```bash
docker run -d \
  -e AUXOT_GPU_KEY=adm_xxx \
  -e AUXOT_ROUTER_URL=router:8080 \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  -v auxot-models:/home/nonroot/.auxot \
  --name auxot-worker \
  ghcr.io/auxothq/auxot-worker:latest
```
The volume mount persists downloaded models and llama.cpp binaries across container restarts. ANTHROPIC_API_KEY is only required when the router assigns this worker to CLI mode.
Docker — Agent
Run an autonomous agent worker:
```bash
docker run -d \
  -e AUXOT_AGENT_KEY=agnt.abc123... \
  -e AUXOT_ROUTER_URL=your-auxot-instance.com \
  -v /your/agent-workspace:/home/agent \
  --name auxot-agent \
  ghcr.io/auxothq/auxot-agent:latest
```
If /home/agent/SOUL.md is missing, the worker still starts: it connects in bootstrap mode with instructions to interview you and then write a real SOUL.md. Optionally, run npx @open-gitagent/gitagent init on the host workspace before bind-mounting for a fuller gitagent layout. Bind-mount a host directory to persist the workspace across restarts.
Worker Daemon Install
Install a GPU worker as a persistent system service:
```bash
# Linux (systemd) or macOS (launchd)
./auxot-worker install \
  --name qwen \
  --gpu-key adm_xxx \
  --router-url router:8080
```
The daemon auto-restarts on crash and starts on system boot.
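On Linux, this means install registers a systemd unit. A sketch of what such a unit typically contains (the path, unit name, and exact directives are illustrative; the file the installer actually writes may differ):

```ini
# /etc/systemd/system/auxot-worker-qwen.service (illustrative)
[Unit]
Description=Auxot GPU worker (qwen)
After=network-online.target

[Service]
ExecStart=/usr/local/bin/auxot-worker --gpu-key adm_xxx --router-url router:8080
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```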
Air-Gapped Deployment
For environments without internet access, pre-stage all dependencies:
- Download the model GGUF file on an internet-connected machine
- Download the llama.cpp server binary from GitHub Releases
- Transfer both to the air-gapped machine
```bash
./auxot-worker \
  --model-path /data/models/Qwen3.5-35B-A3B-Q4_K_S.gguf \
  --llama-server-path /data/bin/llama-server \
  --gpu-key adm_xxx \
  --router-url router:8080
```
No HuggingFace or GitHub access is required at runtime.
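Since nothing is fetched at runtime, a misplaced file only surfaces when the worker tries to start. A small pre-flight check can catch that earlier; a minimal sketch (the check_staging helper is illustrative, not part of auxot):

```shell
# Verify staged artifacts before launching the worker offline (sketch).
# Usage: check_staging <model.gguf> <llama-server>
check_staging() {
  model="$1"; llama="$2"
  [ -f "$model" ] || { echo "missing model: $model" >&2; return 1; }
  [ -x "$llama" ] || { echo "missing or non-executable llama-server: $llama" >&2; return 1; }
  echo "staging OK"
}

# Example, matching the launch command above:
# check_staging /data/models/Qwen3.5-35B-A3B-Q4_K_S.gguf /data/bin/llama-server
```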
Docker Images
| Image | Size | Contents |
|---|---|---|
| ghcr.io/auxothq/auxot-router:latest | ~10 MB | Router binary only, FROM scratch |
| ghcr.io/auxothq/auxot-tools:latest | ~140 MB | Tools worker, bun, pnpm |
| ghcr.io/auxothq/auxot-worker:latest | ~400 MB | GPU + CLI worker, Claude Code CLI, Node.js |
| ghcr.io/auxothq/auxot-agent:latest | (varies) | Agent worker, gitagent-compatible workspace, Debian slim (no Node.js) |
All images are multi-arch (linux/amd64, linux/arm64) and published to GitHub Container Registry.