Batch spreadsheet rows through a workflow
Turn a CSV or sheet export into **one workflow design**, stable **correlation IDs** per row, and either **intake loops** ([Trigger a workflow with an intake webhook](/tutorials/trigger-a-workflow-with-an-intake-webhook)) or disciplined **New task** batches, with sampling for QC and payloads that are easy to search in Audit Logs.
Plus: three Admin-Agent passes — draft a minimal JSON shape per row before you script, narrate partial failure vs poison rows, and write a five-row smoke script checklist before you unleash hundreds.
| Audience | Admins · Developers |
|---|---|
| Time | ~12 min |
| Prerequisites | A runnable workflow with steps that match one row’s worth of work ([Run a workflow](/tutorials/run-a-workflow)). For HTTP loops: intake fluency ([Trigger a workflow with an intake webhook](/tutorials/trigger-a-workflow-with-an-intake-webhook)); **`team.`** keys ([Create a shared Team API Key](/tutorials/create-a-team-api-key)). Helpful: CI glue ([Trigger a workflow from GitHub Actions](/tutorials/trigger-a-workflow-from-github-actions)), MCP sheets connectors ([Add an MCP server](/tutorials/add-an-mcp-server)). |
| You'll end up with | A named pattern your team can reuse — **payload contract**, **row key** in every POST or task note, **QC sample size**, and **where to look** when row 37 fails alone ([Trace a failing job end to end](/tutorials/trace-a-failing-job-end-to-end)). |
When a tutorial shows italic text in quotation marks, it usually mirrors a label or helper string inside Auxot. Product copy changes between releases — if something reads differently in your workspace, trust what you see on screen.
Callouts with a Worth knowing gold accent are meant as must-read context before you move on. Blockquotes that open with Tip are lighter, optional depth.
Why this matters
Spreadsheets are how ops and revenue teams actually hand you fifty similar jobs. Auxot does not ship a magic Import CSV → batch of tasks button. Instead, you compose primitives you already have: workflows for the repeatable pipeline, intakes (or careful New task) for starters, and Audit Logs for receipts.
The failure mode is always the same: vague columns, giant pasted grids in Chat, and no stable ID, so when something breaks you cannot answer “which row was it?”
Today you lock a small JSON shape (even if the sheet has twenty columns), decide automation vs human queue, and treat partial failure as normal, not a crisis.
Nothing fans out across rows on its own: you define the payload, you throttle volume, you spot-check outputs.
Quick start
- Freeze the workflow: one board handles one row’s lifecycle (Run a workflow), for example classify, enrich, approve, and file (whatever repeats per row).
- Pick row identity: choose a stable `row_key` (spreadsheet row number, CRM id, UUID you mint). Every POST or task carries it unchanged, so search wins later (Trace a failing job end to end).
- Shrink columns: map sheet headers → JSON fields your agent steps actually read; drop noise so prompts stay small and PII stays deliberate.
- Choose starter:
  - HTTP loop: a script, Actions, or worker sends `POST /v1/intake/{INTAKE_ID}` per row (Trigger a workflow with an intake webhook); paste-ready CI skeleton in Trigger a workflow from GitHub Actions.
  - Manual waves: export CSV → New task in batches small enough that mistakes stay cheap; still paste `row_key` into task metadata, or keep a first-column convention your humans honor.
- QC before scale: run five rows, compare outputs against a rubric, then widen. Token usage and audit-log volume scale linearly with row count.
Done? Same row_key appears in intake JSON and shows up traceably in Threads / Jobs when you filter or search, not “some spreadsheet somewhere.”
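The "shrink columns" and "row identity" steps above can be sketched as a small mapping function. This is a minimal sketch, not Auxot's contract: the sheet headers, the JSON key names, and the `FIELD_MAP` and `build_payload` names are all hypothetical. Only `row_key` comes from this tutorial.

```python
import json

# Hypothetical mapping: sheet header -> JSON key the workflow steps read.
# Columns that are not listed here (notes, PII you chose to omit) never ship.
FIELD_MAP = {
    "Company Name": "company",
    "Contact Email": "email",
    "Requested Plan": "plan",
}


def build_payload(row: dict, row_key: str) -> dict:
    """Keep only mapped, non-empty columns and attach row_key unchanged."""
    payload = {"row_key": row_key}
    for header, key in FIELD_MAP.items():
        value = (row.get(header) or "").strip()
        if value:
            payload[key] = value
    return payload


row = {
    "Company Name": " Acme ",
    "Contact Email": "ops@acme.test",
    "Requested Plan": "",
    "Internal Notes": "do not send",
}
payload = build_payload(row, "SMOKE-001")
print(json.dumps(payload))
```

Keeping the mapping in one dictionary makes the payload contract reviewable in a single place before any row is sent.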
The agent can do that?
1. Payload sketch before the first curl
Chat → Admin Agent:
We're batching spreadsheet rows into intake JSON for workflow "[name]". Columns: [paste header row]. List minimal JSON keys the workflow steps need, flag PII fields to hash or omit, propose example payload for row_key "SMOKE-001".
Why it’s non-obvious: Scripts copy wide sheets verbatim: Admin Agent forces minimalism after you paste headers; you still approve what ships.
2. Partial failure narrative
Intake loop processed 200 rows; 7 workflows failed or stalled. Given pattern — isolated failures vs systemic — ordered checklist: payload shape changes, rate limits, human step backlog, and tool timeouts. No blame; bullets only.
Why it’s non-obvious: “Batch broke” sounds binary, but failures usually fall into classes of rows. The agent works only from the counts and rough error hints you paste; you still open the Audit rows yourself.
3. Smoke-run discipline
Draft a pre-run checklist: 5 rows, expected column mapping proof, where row_key must appear in Audit Logs search, and when to abort before full send — markdown bullets.
Why it’s non-obvious: Skipping the smoke run costs more than writing the checklist, which keeps humans from defending a thousand bad tasks.
Go deeper
Idempotency and retries
If your loop retries POSTs, pair row_key with an idempotency_key field your operators recognize: duplicate tasks are worse than duplicate HTTP tries (Harden your intake webhooks).
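One way to derive such a key is to hash the row's identity together with its content, so a retry of the same row produces the same value. This is a sketch under assumptions: the `idempotency_key` function and the key format are illustrative, and whether your intake deduplicates on this field depends on how you configured it (see Harden your intake webhooks).

```python
import hashlib
import json


def idempotency_key(row_key: str, payload: dict) -> str:
    """Return a stable key for (row_key, payload); key order does not matter."""
    # Canonical JSON: sorted keys and fixed separators give identical bytes
    # for identical content, regardless of dict insertion order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{row_key}\n{canonical}".encode("utf-8")).hexdigest()
    # Prefix with row_key so operators can recognize the key at a glance.
    return f"{row_key}-{digest[:16]}"


key_a = idempotency_key("row-37", {"company": "Acme", "plan": "pro"})
key_b = idempotency_key("row-37", {"plan": "pro", "company": "Acme"})
key_c = idempotency_key("row-37", {"company": "Acme", "plan": "team"})
print(key_a == key_b, key_a == key_c)
```

Because the content is part of the hash, an edited row gets a new key. If you want an edited row to count as the same task, hash `row_key` alone instead.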
MCP and live sheets
Some teams attach Google Sheets or similar via MCP (Add an MCP server): great for pulling fresh cells, still separate from starting workflow tasks at volume; don’t confuse read tools with task starters unless you designed that explicitly.
Humans in the loop
Workflow human steps slow batches predictably: tune concurrency or batch size so approval columns don’t overload reviewers (Run a workflow).
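Batch size can be enforced in the sending script itself. The sketch below splits rows into waves and pauses between them. The `waves` and `send_in_waves` helpers are hypothetical, and the wave size and pause length are numbers to tune against your reviewers' real throughput.

```python
import time


def waves(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_in_waves(items, size, pause_seconds, send, sleep=time.sleep):
    """Send each wave, pausing between waves so human steps can drain."""
    sent = 0
    for index, wave in enumerate(waves(items, size)):
        if index:
            sleep(pause_seconds)  # pause between waves, not before the first
        for item in wave:
            send(item)
            sent += 1
    return sent


# Demo with stand-ins: record the sends and pauses instead of doing real work.
pauses, delivered = [], []
total = send_in_waves(list(range(7)), 3, 60, delivered.append, sleep=pauses.append)
print(total, pauses)
```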
Writing back to the spreadsheet
Batch workflows often need to land results back in a sheet. The pattern that keeps the live sheet safe is the DRAFT_ tab pattern: wire a Google Sheets MCP (Composio Google Sheets is the cleanest managed-OAuth path), then point the workflow at a DRAFT_summary tab. The workflow writes the proposed rows to the DRAFT_ tab. You review the rows, then copy them into the live tab yourself. The live tab is the canonical record; the DRAFT_ tab is the review gate. See Use a DRAFT_ tab for agent spreadsheet writes for the full wiring (tool-policy shape, smoke test, operating note).
No spreadsheet MCP exposes a “draft cell” mode; once a workflow step calls a write tool, the cell is live. The DRAFT_ tab gives you the review step the cloud API doesn’t.
If the batch data is sensitive (PII, internal contract figures), the offline-file model is the safer alternative: an Excel MCP that writes a local .xlsx your team merges into the canonical sheet manually. No cloud, no OneDrive, no Graph API in the loop.
Pair either path with the smoke-run discipline above: never wire a write tool to the live sheet without a five-row smoke first.
Walkthrough
Step 1: Export and clean
CSV from Excel/Sheets → trim trailing blanks → confirm UTF-8 so names don’t corrupt mid-flight.
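A minimal cleaning pass might look like the sketch below. It assumes the spreadsheet's physical line number is an acceptable `row_key`; the `load_rows` helper and the `row-N` format are illustrative. For a real file, open it with `open(path, newline="", encoding="utf-8-sig")`, which strips a byte-order mark and raises `UnicodeDecodeError` if the export is not UTF-8.

```python
import csv
import io


def load_rows(text_stream):
    """Read CSV text, trim cells, skip empty rows, and tag each row with row_key."""
    reader = csv.reader(text_stream)
    headers = [h.strip() for h in next(reader)]
    rows = []
    for cells in reader:
        values = [c.strip() for c in cells]
        if not any(values):
            continue  # blank line or a row of empty cells
        row = dict(zip(headers, values))
        # reader.line_num counts physical lines read so far (header is line 1),
        # so the key keeps matching the sheet even when blank rows are skipped.
        row["row_key"] = f"row-{reader.line_num}"
        rows.append(row)
    return rows


sample = "name,email\nAda,ada@example.test\n,\n\nLin ,lin@example.test\n"
rows = load_rows(io.StringIO(sample))
print(rows)
```

Deriving the key from the physical line keeps it pointing at the right sheet row after trimming. If the sheet already has a CRM id or UUID column, prefer that, since line numbers shift when someone re-sorts the sheet.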
Step 2: Wire intake (automation path)
Mint intake on the workflow (Trigger a workflow with an intake webhook) → test one curl → promote to script or Actions (Trigger a workflow from GitHub Actions).
Step 3: Loop with logs
For each row: build JSON including `row_key` → POST → capture `work_id` → optional poll. Log failures with `row_key` beside the HTTP status (never log Bearer tokens).
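The loop can be sketched with the standard library alone. Several details are assumptions: `BASE_URL` is a placeholder host, the response is assumed to be a JSON object containing `work_id` alongside the `202` status, and `post_row` and `run_batch` are hypothetical names. Check the Intake Webhooks manual for the real contract.

```python
import json
import logging
import urllib.error
import urllib.request

log = logging.getLogger("batch")
BASE_URL = "https://auxot.example.test"  # placeholder host, replace with yours


def post_row(payload, intake_id, api_key):
    """POST one row; return (status, body). Status is None on network errors."""
    request = urllib.request.Request(
        f"{BASE_URL}/v1/intake/{intake_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status, json.loads(response.read() or b"{}")
    except urllib.error.HTTPError as err:
        return err.code, {}
    except urllib.error.URLError:
        return None, {}


def run_batch(payloads, send):
    """Send every payload; map row_key -> work_id and collect failures."""
    accepted, failed = {}, []
    for payload in payloads:
        row_key = payload["row_key"]
        status, body = send(payload)
        if status == 202 and "work_id" in body:
            accepted[row_key] = body["work_id"]
        else:
            failed.append((row_key, status))
            # Log row_key and status only; the API key never reaches the log.
            log.warning("row_key=%s failed with HTTP status %s", row_key, status)
    return accepted, failed


def fake_send(payload):
    """Stand-in sender for a dry run: row-37 fails alone."""
    if payload["row_key"] == "row-37":
        return 500, {}
    return 202, {"work_id": "w-" + payload["row_key"]}


accepted, failed = run_batch([{"row_key": "row-36"}, {"row_key": "row-37"}], fake_send)
print(accepted, failed)
```

For a real run, pass `lambda p: post_row(p, INTAKE_ID, API_KEY)` as `send`. Keeping the sender injectable lets you rehearse the loop without touching the network.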
Step 4: Manual path alternative
Sort by priority → New task top ten → verify board columns → continue. Keep the same `row_key` in the opening prompt or in a task-title convention.
Step 5: Sample QC
Pick N random rows (or a first/middle/last stratified sample), compare workflow outputs to the rubric, and document exceptions before the next thousand.
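A reproducible way to combine both sampling ideas is to always include the first, middle, and last rows and then add a seeded random draw. The `qc_sample` helper and its defaults are illustrative, and the sketch assumes every `row_key` is unique.

```python
import random


def qc_sample(row_keys, n_random=5, seed=0):
    """Return first/middle/last keys plus n_random others, in original order."""
    if not row_keys:
        return []
    picks = {row_keys[0], row_keys[len(row_keys) // 2], row_keys[-1]}
    rest = [key for key in row_keys if key not in picks]
    rng = random.Random(seed)  # fixed seed so reviewers can re-derive the sample
    picks.update(rng.sample(rest, min(n_random, len(rest))))
    return [key for key in row_keys if key in picks]


keys = [f"row-{n}" for n in range(2, 202)]  # 200 rows, sheet lines 2..201
sample = qc_sample(keys, n_random=5, seed=42)
print(len(sample))
```

Recording the seed next to the rubric results lets someone else audit exactly the same rows later.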
What’s next
- → Trigger a workflow with an intake webhook. The `202` + `work_id` contract this lesson assumes for loops.
- → Trigger a workflow from GitHub Actions. Use it when the spreadsheet trigger belongs next to repo events.
- → Trace a failing job end to end. Turn `row_key` + `work_id` into a coherent story across tabs.
- → Harden your intake webhooks. Before production volume meets the public internet.
- → Triage and follow up on inbound leads. Same row-wise mindset when rows are leads, not generic tasks.
- → Use a DRAFT_ tab for agent spreadsheet writes. The canonical wiring when the batch workflow writes results back to a spreadsheet.
- → Plan for retention and deletion requests. Bulk runs touch threads and exports: know the off-ramp before procurement asks.
Reference
- Manual: Intake Webhooks, API authentication
- Pages in Auxot: Workflows, Settings → Intakes, Audit Logs
- See also: Catch regressions after you change an agent, Plan for retention and deletion requests, Run a workflow, Add an MCP server, Create a shared Team API Key