Automation Rules
Close the production-to-evaluation loop with a filter → sample → actions rules engine. Potato runs rules over every incoming agent trace to route it to the annotation queue, curate it into a dataset, run an evaluator, fire a webhook, or notify annotators.
Automation rules close the production-to-evaluation loop: a programmable filter → sampling rate → actions pipeline runs over every item entering Potato — whether loaded from data files or ingested at runtime via the trace webhook or tracing SDK. Each matching rule can route the item to the annotation queue, curate it into an eval dataset, run an evaluator, fire an outbound webhook, or notify annotators.
Enabling

    automation:
      enabled: true
      rules:
        - name: route-errors
          when: {field: status, in: [error, failed]}
          sample_rate: 1.0   # 0.0–1.0 (default 1.0 = every match)
          actions:
            - {type: add_to_queue, priority: 100, reason: "Agent errored"}
            - {type: add_to_dataset, dataset: errors-to-fix}
            - {type: run_evaluator, evaluator: trajectory_match}
            - {type: fire_webhook, url: "https://example.com/hook"}
            - {type: notify, message: "New error trace"}

How a rule fires
A rule fires when both hold:
- `when` matches: the shared condition grammar (same as triage) supports `equals`, `in`, `contains`, `exists`, `lt`/`lte`/`gt`/`gte`, and dotted field paths (`metadata.score`). A list of conditions is AND-ed; an empty `when` matches everything.
- `sample_rate` selects it: sampling is deterministic on a hash of (item id, rule name), so re-processing the same item yields the same decision (idempotent, replay-safe).
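
The two checks above can be sketched in Python. This is an illustration under stated assumptions, not Potato's actual implementation: the helper names (`get_path`, `condition_matches`, `when_matches`, `sampled`, `rule_fires`) are hypothetical, the `exists: true` condition shape is assumed by analogy with the `in` example, and SHA-256 stands in for whatever hash Potato uses.

```python
import hashlib
import operator
from typing import Any

_MISSING = object()  # sentinel for "field not present"
_COMPARATORS = {"lt": operator.lt, "lte": operator.le,
                "gt": operator.gt, "gte": operator.ge}


def get_path(item: dict, path: str) -> Any:
    """Resolve a dotted field path such as 'metadata.score'."""
    current: Any = item
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def condition_matches(cond: dict, item: dict) -> bool:
    """Evaluate one condition, e.g. {"field": "status", "in": ["error"]}."""
    value = get_path(item, cond["field"])
    if "exists" in cond:  # assumed shape: {"field": ..., "exists": true}
        return (value is not _MISSING) == bool(cond["exists"])
    if value is _MISSING:
        return False
    if "equals" in cond:
        return value == cond["equals"]
    if "in" in cond:
        return value in cond["in"]
    if "contains" in cond:
        return isinstance(value, (str, list, tuple)) and cond["contains"] in value
    for name, compare in _COMPARATORS.items():
        if name in cond:
            return compare(value, cond[name])
    return False  # unknown operator: treat as no match


def when_matches(when: Any, item: dict) -> bool:
    """An empty `when` matches everything; a list of conditions is AND-ed."""
    if not when:
        return True
    conditions = when if isinstance(when, list) else [when]
    return all(condition_matches(c, item) for c in conditions)


def sampled(item_id: str, rule_name: str, sample_rate: float) -> bool:
    """Deterministic sampling: the same (item id, rule name) always decides the same way."""
    digest = hashlib.sha256(f"{item_id}:{rule_name}".encode("utf-8")).digest()
    # Keep the top 53 bits so the bucket is an exact float in [0, 1).
    bucket = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    return bucket < sample_rate


def rule_fires(rule: dict, item_id: str, item: dict) -> bool:
    return (when_matches(rule.get("when"), item)
            and sampled(item_id, rule["name"], rule.get("sample_rate", 1.0)))


rule = {"name": "route-errors",
        "when": {"field": "status", "in": ["error", "failed"]},
        "sample_rate": 1.0}
print(rule_fires(rule, "trace-42", {"status": "error"}))  # True: matches, and 1.0 samples every match
```

Because the bucket depends only on the item id and the rule name, replaying the same trace through the same rule gives the same sampling decision, and different rules sample independently of one another.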
Actions
| Action | When it runs | Effect |
|---|---|---|
| `add_to_queue` | inline (fast) | Boost the item's triage priority so it surfaces first |
| `add_to_dataset` | inline (fast) | Append the item as an example to a dataset |
| `notify` | inline (fast) | Notify connected annotators via SSE |
| `run_evaluator` | background worker | Score the item with an evaluator; store the score on the item |
| `fire_webhook` | background worker | POST {rule, item_id, item_data} to an external URL |
Fast actions run inline in the ingestion path; heavy actions (run_evaluator, fire_webhook) are dispatched to a background worker so ingestion never blocks. Every action records an outcome, and failures are caught — automation never breaks ingestion.
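
The inline/background split can be pictured with a small Python sketch. It is a minimal illustration under assumptions, not Potato's code: the names `dispatch`, `run_action`, `worker`, `start_worker`, `jobs`, and `outcomes`, and the shape of the outcome records, are all hypothetical.

```python
import queue
import threading
from typing import Callable, Dict, List

# Action types taken from the table above; everything else here is hypothetical.
BACKGROUND = {"run_evaluator", "fire_webhook"}

Handler = Callable[[dict, str], None]

outcomes: List[dict] = []          # one record per action, success or failure
outcomes_lock = threading.Lock()
jobs: queue.Queue = queue.Queue()  # heavy actions wait here for the worker


def run_action(handlers: Dict[str, Handler], action: dict, item_id: str) -> None:
    """Run one action and record its outcome; an exception never escapes."""
    outcome = {"item_id": item_id, "type": action.get("type"), "ok": True, "error": None}
    try:
        handlers[action["type"]](action, item_id)
    except Exception as exc:  # catch everything so ingestion keeps going
        outcome["ok"] = False
        outcome["error"] = repr(exc)
    with outcomes_lock:
        outcomes.append(outcome)


def worker() -> None:
    """Background loop that drains heavy actions off the ingestion path."""
    while True:
        handlers, action, item_id = jobs.get()
        try:
            run_action(handlers, action, item_id)
        finally:
            jobs.task_done()


def start_worker() -> threading.Thread:
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def dispatch(actions: List[dict], item_id: str, handlers: Dict[str, Handler]) -> None:
    """Run fast actions inline; enqueue heavy ones and return immediately."""
    for action in actions:
        if action.get("type") in BACKGROUND:
            jobs.put((handlers, action, item_id))
        else:
            run_action(handlers, action, item_id)
```

In this sketch `dispatch` returns as soon as the inline actions finish, so a slow evaluator or an unreachable webhook endpoint only delays the worker thread. A failing handler shows up as an outcome with `ok` set to `False` rather than as an exception in the caller.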
Inspecting
The admin dashboard links to Automation (/admin/automation), showing configured rules, activity counters, and recent action outcomes (also available as JSON at /admin/automation/status and /admin/automation/outcomes).
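
For scripted checks, the two JSON endpoints can be polled with the Python standard library. The sketch below assumes a server reachable at `http://localhost:8000` (adjust to your deployment) and makes no assumption about the fields in the responses. The helper names `automation_url`, `fetch_json`, and `fetch_automation` are hypothetical, and if your admin routes require authentication you would need to add the appropriate credentials.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:8000"  # assumption: replace with your server's address


def automation_url(base_url: str, view: str) -> str:
    """Build /admin/automation/status or /admin/automation/outcomes."""
    return f"{base_url.rstrip('/')}/admin/automation/{view}"


def fetch_json(url: str, timeout: float = 10.0) -> Any:
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def fetch_automation(base_url: str = BASE_URL) -> Dict[str, Any]:
    """Fetch both views; the response schema is whatever the server returns."""
    return {view: fetch_json(automation_url(base_url, view))
            for view in ("status", "outcomes")}
```

Calling `fetch_automation()` against a running instance returns both payloads keyed by view name, which can then be printed or compared between runs.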
Related
- Full reference on Read the Docs: all action types and the condition grammar, version-matched
- Datasets & Experiments: `add_to_dataset` targets
- Programmatic Evaluators: `run_evaluator`
- Triage Queue: shares the condition grammar
- Tracing SDK: a source of incoming traces