Run a five-minute operator health check
Use **System Health** deliberately: run the same five-minute loop before a stand-up or after someone says “the AI feels slow,” so gray dots and quota bars tell a story instead of eating your afternoon.
Plus: three Admin Agent prompts that turn a messy dashboard into a ranked briefing, a cascade-aware hypothesis list, and a calm escalator note you can paste to leadership.
| Audience | Admins · Developers |
|---|---|
| Time | ~5 min |
| Prerequisites | You can open **System Health** ([Take Auxot's pulse in 10 seconds](/tutorials/take-auxots-pulse)). Helpful: you already know **Settings → Providers** exists ([Connect a cloud AI model](/tutorials/connect-a-cloud-ai-model)) and you’ve skimmed routing basics ([Read provider routing like an operator](/tutorials/read-provider-routing-like-an-operator)). |
| You'll end up with | A repeatable five-minute pass across Platform Status, providers, workers, integrations, and Recent Activity. Plus three prompts that compress what you saw into next steps without pretending the dashboard fixes itself. |
When a tutorial shows italic text in quotation marks, it usually mirrors a label or helper string inside Auxot. Product copy changes between releases — if something reads differently in your workspace, trust what you see on screen.
Callouts with a Worth knowing gold accent are meant as must-read context before you move on. Blockquotes that open with Tip are lighter, optional depth.
Why this matters
Take Auxot’s pulse in 10 seconds teaches the glance: green vs not green, where to click next. This lesson is the loop: same page, fixed order, enough seconds per card that you notice load shapes, heartbeat gaps, and integration flaps before they become Slack threads.
Operators stop guessing whether “slow” means you, quota, GPU sleep, or Discord reconnect flapping. You read System Health once the way an on-call engineer reads a status board: top to bottom, evidence before narrative.
The next time someone says “models felt weird yesterday,” you already know whether your providers were saturated, workers were offline, or nothing moved on the board at all.
System Health stays one URL; this ritual decides how hard you look before you open Audit Logs or ping a vendor.
Quick start
- Open System Health: click System Health in the left menu.
- Lock onto Platform Status: read license plus Postgres/Redis rollup first. If this strip is red or yellow, stop the full walkthrough and treat it as an infrastructure-first problem (Take Auxot’s pulse in 10 seconds).
- Walk Model Providers in order: scan each row for Online, Over quota, or Offline, and notice the load bars (busy vs idle relative to this session).
- Skim Tool Workers and Integrations: empty cards mean “unused,” not healthy by default; anything Offline or Disconnected gets a mental flag before you blame prompts.
- Scan Recent Activity for patterns: spikes, repeated ✗ marks, or the same worker dropping and reconnecting; you are looking for rhythm, not reading every line.
Done? You can say in one sentence what was fine, what was trending wrong, and whether traffic behavior could have slid toward cloud fallback (Read provider routing like an operator).
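If you prefer to jot the loop down in a structured way, the sketch below shows one possible shape for your notes. It keeps the fixed card order and the infrastructure-first stop from the steps above. Every name here (`LoopNotes`, `CARD_ORDER`, `one_sentence_summary`) is hypothetical and only for note-taking; nothing in it reads from or calls Auxot.

```python
from dataclasses import dataclass, field

# Hypothetical note-taking helper; these names do not come from Auxot.
# Card order mirrors the Quick start list.
CARD_ORDER = [
    "Platform Status",
    "Model Providers",
    "Tool Workers",
    "Integrations",
    "Recent Activity",
]


@dataclass
class LoopNotes:
    # Color you read off the Platform Status strip: "green", "yellow", or "red".
    platform_status: str
    # Card name -> short notes on anything you mentally flagged.
    flags: dict = field(default_factory=dict)


def one_sentence_summary(notes: LoopNotes) -> str:
    # Infrastructure-first: a non-green strip ends the walkthrough early.
    if notes.platform_status != "green":
        return (
            f"Platform Status is {notes.platform_status}: "
            "treat as infrastructure-first and stop the walkthrough."
        )
    flagged = [
        f"{card}: {', '.join(notes.flags[card])}"
        for card in CARD_ORDER
        if notes.flags.get(card)
    ]
    if not flagged:
        return "Platform Status is green and nothing was flagged on any card."
    return "Platform Status is green; flagged " + "; ".join(flagged) + "."
```

For example, `one_sentence_summary(LoopNotes("green", {"Integrations": ["Discord Disconnected"]}))` gives you a single line you can paste into the first prompt below.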
The agent can do that?
You finished one lap around the board. These three prompts ask the Admin Agent to compress what you saw, tie symptoms to routing cascades, and draft comms if something was actually degraded.
1. Severity-tagged briefing
I just ran my five-minute System Health loop. Paste facts only: Platform Status color, each Model Provider name + status + whether load bars looked pegged, Tool Workers row summary, Integrations row summary, plus one sentence on Recent Activity patterns. Respond with P1/P2/P3 issues only, each with next human action (no autonomous fixes).
Why it’s non-obvious: Without a severity ladder, every yellow feels like fire. You still decide what to act on; the Admin Agent sorts noise.
2. Cascade-aware hypotheses
Symptom from the business: [latency / uneven quality / surprise cloud spend]. Here's what System Health showed this morning: [paste short notes]. Explain how GPU → CLI → cloud failover could produce that symptom. Rank hypotheses, then say what I'd verify first under Settings → Providers vs Audit Logs → Jobs.
Why it’s non-obvious: The same dashboard snapshot can fit three different root-cause stories. Paste your notes first so the Admin Agent reasons against facts, not gut feel.
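To make the cascade concrete, here is a minimal sketch of a strict first-available fallthrough in the GPU → CLI → cloud order named in the prompt. It is an assumption for illustration, not Auxot's routing implementation: the real rules depend on how your providers are configured, and `first_available_tier` plus the availability flags are hypothetical.

```python
from typing import Dict, Optional

# Tier order taken from the prompt text; real routing may be configured differently.
CASCADE = ["GPU", "CLI", "cloud"]


def first_available_tier(available: Dict[str, bool]) -> Optional[str]:
    """Return the first tier marked available, or None if every tier is down."""
    for tier in CASCADE:
        if available.get(tier, False):
            return tier
    return None


# GPU workers offline and CLI saturated: traffic lands on cloud,
# which users may not notice but the bill will.
print(first_available_tier({"GPU": False, "CLI": False, "cloud": True}))
```

Read this way, “surprise cloud spend” with happy users is consistent with the upper tiers being unavailable for a while, which is the kind of hypothesis the prompt asks the Admin Agent to rank.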
3. Status line for leadership
Something on System Health was red or flapping in the last hour. Draft a four-sentence update for leadership: current customer impact (honest), what we know vs don't, what we're doing next, when we'll check back. If everything was actually green, say so and stop.
Why it’s non-obvious: Panic drafts invent incidents. This prompt forces evidence-backed tone because you tied it to a page you read.
Go deeper
How this differs from the ten-second pulse
Pulse is threshold: are we broadly OK? This loop is pattern: loads creeping, quotas leaning, integrations reconnecting, activity clustering failures. Same page, different level of attention.
How this pairs with provider routing
When auto traffic moves tiers, users often stay happy while ops sees cloud spend move (Read provider routing like an operator). System Health shows who is hot; Audit Logs → Jobs shows who actually answered when you need receipts (Trace a failing job end to end).
When to skip straight to Audit Logs
If Platform Status is green but chat still fails, you may be past “dashboard story” and into a single job path: open Trace a failing job end to end. System Health tells you whether the infrastructure was fundamentally broken; Audit Logs tell you what happened to a single job.
Walkthrough
Step 1: Platform Status and license
Read Platform Status first. Green plus “All systems operational” means Auxot’s core dependencies are responding. Yellow or red means database or cache reachability problems that Settings tweaks won’t fix on their own.
Note the license line while you are here: it confirms which tier this session reflects (Free vs paid labels).
Step 2: Model Providers as a load story
Work top to bottom through Model Providers.
- Online / Healthy with low bars: quiet capacity.
- Bars pegged across multiple rows: you’re in a contention hour; expect latency even if nothing is “offline.”
- Over quota: billing or rate limits on the vendor side; routing may fall through to other tiers if configured (Read provider routing like an operator).
- Offline / Error: click the row to open Settings → Providers for the exact error string.
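The reading rules in the list above can be sketched as a small function. This is a hedged illustration only: the rows are tuples you type in by hand from the card, the `PEGGED` cutoff of 0.9 is an arbitrary assumption (the dashboard shows bars, not numbers), and `read_model_providers` is a hypothetical name.

```python
from typing import List, Tuple

# Hypothetical cutoff for "pegged"; eyeball the bar and pick a number from 0.0 to 1.0.
PEGGED = 0.9


def read_model_providers(rows: List[Tuple[str, str, float]]) -> List[str]:
    """rows: (provider name, status label, eyeballed load). Returns notes worth flagging."""
    notes = []
    # Pegged bars on several Online rows suggest contention even with nothing offline.
    pegged = [name for name, status, load in rows if status == "Online" and load >= PEGGED]
    if len(pegged) >= 2:
        notes.append("contention: bars pegged on " + ", ".join(pegged))
    for name, status, _load in rows:
        if status == "Over quota":
            notes.append(f"{name}: over quota, check vendor billing or rate limits")
        elif status in ("Offline", "Error"):
            notes.append(f"{name}: {status.lower()}, open Settings → Providers for the error string")
    return notes
```

A quiet board (every row Online with low bars) returns an empty list, which matches the “quiet capacity” reading.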
Step 3: Connected Agent Filesystems (often empty)
If you use dedicated agent workspaces, this card matters. Online here is not the same as “chat works” (the UI calls that out). If you do not use filesystem attachments, an empty card is normal.
Step 4: Tool Workers
Online means auxiliary workers (search, sandboxes, custom tool hosts) answered heartbeats recently. Offline often means a process stopped or a network path broke. Check those before you rewrite agent instructions.
Step 5: Integrations
Slack, Discord, and similar rows should read Connected when those bots must answer chat traffic. If integrations flap while Model Providers look fine, suspect identity or channel wiring before you blame models (Link your chat apps to your Auxot account).
Step 6: Recent Activity as a rhythm check
Glance for bursts of failures, repeated reconnects, or patterns that line up with a deploy window. Remember the feed is session-scoped for live streaming; permanent history lives in View your audit logs.
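If you want to check a suspected burst rather than trust your eye, a sliding-window count is one simple approach. The sketch below assumes you have copied a few event times and outcomes by hand; the feed format, the 60-second window, and the threshold of three failures are all assumptions, and `failure_bursts` is a hypothetical helper.

```python
from typing import List, Tuple


def failure_bursts(
    events: List[Tuple[float, bool]],
    window_s: float = 60.0,
    threshold: int = 3,
) -> List[float]:
    """events: (timestamp in seconds, ok). Returns start times of failure bursts.

    A burst is `threshold` or more failures that all fall within `window_s` seconds.
    """
    fails = sorted(t for t, ok in events if not ok)
    bursts: List[float] = []
    start = 0
    for end, t in enumerate(fails):
        # Shrink the window from the left until it spans at most window_s seconds.
        while t - fails[start] > window_s:
            start += 1
        in_window = end - start + 1
        # Record each burst once, keyed by the first failure in the window.
        if in_window >= threshold and (not bursts or bursts[-1] != fails[start]):
            bursts.append(fails[start])
    return bursts
```

Compare any burst start time against your deploy window; a match is a hint to follow up in the audit logs, not proof of cause.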
What’s next
- → Take Auxot’s pulse in 10 seconds. The ten-second habit that belongs before every stand-up.
- → Poll an intake job until it finishes. Same operator rigor for `202` + `work_id` loops when HTTP callers need a terminal status, not only a green card.
- → Read provider routing like an operator. Turn “why did cloud spend spike?” into a cascade story.
- → Trace a failing job end to end. When one job matters more than the whole board.
- → Connect a GPU worker. When Model Providers shows gray worker rows that are yours to fix.
Reference
- Pages in Auxot: System Health, Settings → Providers
- Manual: Providers overview
- See also: Poll an intake job until it finishes, Build a cross-meeting commitments ledger, Read provider routing like an operator, Trace a failing job end to end, Connect a cloud AI model, View your audit logs