Architecture Overview
How Auxot Works
A unified AI platform that routes requests across your GPUs, CLI tools, and cloud APIs — governed by policy, managed by agents, extended with skills.
Platform Architecture
One control plane. Any mix of providers — GPUs, CLI workers, cloud APIs. Every request governed.
Unified Provider Routing
Auxot routes every request through a priority cascade — trying your fastest, cheapest providers first and falling back automatically.
- GPU first — On-prem and cloud GPUs get priority for lowest latency and cost
- CLI fallback — Claude Code workers handle agentic tool-use workloads
- Cloud overflow — OpenAI and Anthropic APIs provide elastic capacity when local providers are saturated
- Mixed mode — Run all three provider types simultaneously for different workloads
- Automatic failover — If a provider goes down, traffic reroutes instantly
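The cascade above can be sketched in a few lines. This is an illustrative simulation, not Auxot's internal code: the `Provider` shape and provider names are assumptions, and failure is modeled as a raised exception.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Provider:
    name: str
    handler: Callable[[str], str]  # raises RuntimeError when down or saturated

def route(prompt: str, cascade: list[Provider]) -> tuple[str, str]:
    """Try providers in priority order; fall back automatically on failure."""
    errors = []
    for p in cascade:
        try:
            return p.name, p.handler(prompt)
        except RuntimeError as exc:
            errors.append(f"{p.name}: {exc}")  # record and fall through
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# GPU first, CLI fallback, cloud overflow
def gpu(_prompt):  raise RuntimeError("saturated")
def cli(prompt):   return f"cli answered: {prompt}"
def cloud(prompt): return f"cloud answered: {prompt}"

cascade = [Provider("gpu", gpu), Provider("cli", cli), Provider("cloud", cloud)]
name, answer = route("hello", cascade)  # gpu is saturated, so cli picks it up
```

The key property is that callers never see the failover: they get one answer, and the cascade decides where it came from.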
Admin Agent Setup
Configure your entire platform through a guided conversation with the Admin Agent — no YAML, no dashboards.
Connect Providers
"Add our 4×A100 server at gpu-rack.internal and set up Anthropic as our cloud fallback."
Define Teams & Policies
"Create an Engineering team with interactive priority and 50 req/min. Data Science gets batch priority."
Configure Routing
"Route coding requests to GPU first, then Claude Code CLI. Use Anthropic API only when GPUs are full."
Deploy Agents
"Create a code-review agent with the review skill and connect it to our GitHub MCP server."
Go Live
Admin Agent validates the configuration, generates API keys, and hands the system to your teams.
Agents, Skills & MCP
Auxot's agent architecture separates personality from capability. Build domain-specific agents by composing reusable skills and connecting external tools.
- Agent = personality + skills + tools — An Agent pairs a named role with a set of skills and MCP tool connections
- Skills = reusable behavior — Skills are composable instruction sets that can be shared across Agents. Write once, attach to many agents.
- MCP = external tools — Model Context Protocol connects Agents to GitHub, Jira, Slack, databases, or any system with an MCP server
- Admin Agent — The built-in Agent that manages your platform through conversation
- Custom Agents — Build domain-specific agents for code review, ops triage, data analysis, or any workflow your team needs
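The separation of personality from capability can be pictured as plain composition. A minimal sketch, assuming hypothetical `Skill` and `Agent` shapes (the real Auxot objects are not documented here):

```python
from dataclasses import dataclass, field

@dataclass
class Skill:
    name: str
    instructions: str  # reusable behavior, shareable across Agents

@dataclass
class Agent:
    name: str
    role: str
    skills: list[Skill] = field(default_factory=list)
    mcp_servers: list[str] = field(default_factory=list)  # external tools

    def system_prompt(self) -> str:
        """Compose the agent's role with its attached skills."""
        parts = [f"You are {self.name}, {self.role}."]
        parts += [s.instructions for s in self.skills]
        return "\n".join(parts)

# Write the skill once, attach it to any agent that needs it
review = Skill("review", "Review diffs for correctness and style.")
reviewer = Agent("code-review", "a code review agent",
                 skills=[review], mcp_servers=["github-mcp"])
```

Because skills are data, not subclasses, the same `review` skill can be attached to a PR bot, an ops-triage agent, or anything else without duplication.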
Team Boundaries & Governance
AI access is defined by leadership — not by whoever grabs a key first. Teams, roles, and service classes determine who can use AI, which providers they access, and how much capacity they consume.
- Organization & team structure — Mirror your org chart in Auxot
- Role-based access — Admins, members, and service accounts with distinct permissions
- Provider policies — Control which teams can use GPU, CLI, or cloud providers
- Usage limits — Per-team rate limits and concurrency caps enforced at runtime
- Audit trail — Every request logged with team, user, provider, duration, and token usage
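Runtime-enforced rate limits like the "50 req/min" example are typically token buckets. A minimal per-team sketch under that assumption (not Auxot's actual enforcement code):

```python
import time

class TeamRateLimiter:
    """Token bucket: allows `rate_per_min` requests per minute per team."""
    def __init__(self, rate_per_min: int):
        self.capacity = rate_per_min
        self.tokens = float(rate_per_min)
        self.refill_per_sec = rate_per_min / 60.0
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

engineering = TeamRateLimiter(rate_per_min=50)
```

A request that fails `allow()` would be rejected (or queued) before it ever reaches a provider, which is what makes the limit a policy rather than a suggestion.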
Standard API Access
Auxot exposes OpenAI-compatible and Anthropic-compatible endpoints. Any tool that speaks these protocols connects without code changes.
- OpenAI-compatible — /api/openai/v1/chat/completions
- Anthropic-compatible — /api/anthropic/v1/messages
- Streaming — Server-Sent Events matching the upstream API format exactly
- Tool calls — Full function calling for agentic workflows
Your developers point Cursor, Claude Code, or any custom integration at Auxot. They use their familiar tools. Leadership controls what runs underneath.
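For an existing OpenAI-style integration, switching to Auxot is a base-URL change. A stdlib-only sketch of the request shape, assuming a hypothetical deployment host (`auxot.internal`) and placeholder key; the request is constructed but not sent:

```python
import json
import urllib.request

BASE = "https://auxot.internal"  # hypothetical host for your deployment

payload = {
    "model": "default",  # illustrative model name
    "messages": [{"role": "user", "content": "Summarize this PR."}],
    "stream": False,
}

req = urllib.request.Request(
    url=f"{BASE}/api/openai/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer <team-api-key>",  # issued by the Admin Agent
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it; omitted since no server is running here.
```

The same change applies to SDKs that accept a base URL, so tools keep their native protocol while Auxot decides which provider actually serves the request.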
Your providers. Your agents. Your rules.
Define how AI runs inside your organization.
Built for organizations where AI governance is a leadership responsibility.