
Signal-Based Triage Queue


The triage queue prioritizes annotation by a per-item quality signal, so reviewers see the worst or most-suspect items first instead of working in arrival (FIFO) order. The signal can be an agent error, a production thumbs-down, a low automated score, or any custom field. It is read for both statically loaded data and traces ingested at runtime, and it surfaces in two places: a banner during annotation and the /admin/triage-queue ranking page.

When human review time is scarce, the order in which items reach annotators matters. Routing the most informative items first is the triage half of an active evaluation loop, and it pairs naturally with Judge Alignment to send disagreements and errors to people first.

Figure: Triage badge during annotation. A priority badge explains why an item was flagged for review.

Configuration

yaml
triage:
  enabled: true
  order: desc            # high priority first (default); 'asc' = low first
  default_priority: 0    # items matching no rule
  show_badge: true       # banner during annotation explaining the priority
  rules:                 # evaluated in order; highest matching priority wins
    - name: "Agent errored"
      badge: "Agent errored"     # banner text (defaults to name)
      priority: 100
      when:
        field: status            # dotted paths allowed, e.g. metadata.tags
        equals: error
    - name: "Negative feedback"
      priority: 80
      when:
        field: feedback
        in: [thumbs_down, negative]
    - name: "Low quality score"
      priority: 60
      when:
        field: score
        lt: 0.5
 
# Serve the highest-priority items first. If you enable triage without setting
# assignment_strategy, Potato defaults to `priority` automatically.
assignment_strategy: priority

If you omit rules (and signal_field), Potato uses a turnkey default set: error status (100), negative feedback (80), and score below 0.5 (60).

Condition operators

| Operator | Meaning |
| --- | --- |
| equals | exact match (strings are case-insensitive) |
| in | value is one of a list |
| contains | list field contains, or substring match |
| lt / lte / gt / gte | numeric comparison |
| exists | field is present or absent (true/false) |
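To make the operator semantics concrete, here is a minimal Python sketch of how a rule set like the one above could be evaluated against a single item. It is not Potato's implementation: the helper names (`get_path`, `matches`, `triage`) are hypothetical, and details such as how a missing field is treated are assumptions.

```python
import operator
from typing import Any, Optional

_MISSING = object()  # sentinel meaning "field not present on the item"

# Numeric operators from the table, mapped to Python comparisons.
_NUMERIC = {"lt": operator.lt, "lte": operator.le,
            "gt": operator.gt, "gte": operator.ge}


def get_path(item: dict, path: str) -> Any:
    """Resolve a dotted path such as 'metadata.tags'; _MISSING if absent."""
    cur: Any = item
    for key in path.split("."):
        if not isinstance(cur, dict) or key not in cur:
            return _MISSING
        cur = cur[key]
    return cur


def _norm(value: Any) -> Any:
    """Lower-case strings so 'equals' compares case-insensitively."""
    return value.lower() if isinstance(value, str) else value


def matches(item: dict, when: dict) -> bool:
    """Evaluate one 'when' condition. Assumption: a missing field only
    ever satisfies 'exists: false'."""
    value = get_path(item, when["field"])
    if "exists" in when:
        return (value is not _MISSING) == bool(when["exists"])
    if value is _MISSING:
        return False
    if "equals" in when:
        return _norm(value) == _norm(when["equals"])
    if "in" in when:
        return value in when["in"]
    if "contains" in when:
        if isinstance(value, str):          # substring match
            return str(when["contains"]) in value
        if isinstance(value, list):         # list membership
            return when["contains"] in value
        return False
    for op, compare in _NUMERIC.items():
        if op in when:
            try:
                return compare(float(value), float(when[op]))
            except (TypeError, ValueError):
                return False                # non-numeric value: no match
    return False


def triage(item: dict, rules: list, default_priority: int = 0) -> tuple:
    """Return (priority, badge) for the highest-priority matching rule."""
    best: Optional[dict] = None
    for rule in rules:
        if matches(item, rule["when"]):
            if best is None or rule["priority"] > best["priority"]:
                best = rule
    if best is None:
        return default_priority, None
    return best["priority"], best.get("badge", best["name"])


RULES = [
    {"name": "Agent errored", "priority": 100,
     "when": {"field": "status", "equals": "error"}},
    {"name": "Negative feedback", "priority": 80,
     "when": {"field": "feedback", "in": ["thumbs_down", "negative"]}},
    {"name": "Low quality score", "priority": 60,
     "when": {"field": "score", "lt": 0.5}},
]
```

With these rules, an item that both errored and scored low is flagged by the higher-priority rule, and an item matching nothing falls back to `default_priority`.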

Reading a numeric signal directly

Instead of, or in addition to, rules, you can read a number straight from a field:

yaml
triage:
  enabled: true
  signal_field: quality_score   # used as the priority when no rule matches
  invert_signal: true           # lower score => higher priority
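The exact transformation behind invert_signal is not spelled out here, so the following Python sketch assumes the simplest reading: the score is negated, so that under the default descending order a lower score ranks first. The function name `signal_priority` and its parameters are hypothetical.

```python
def signal_priority(item: dict, signal_field: str,
                    invert_signal: bool = False,
                    default_priority: float = 0) -> float:
    """Fallback priority read straight from a numeric field.

    Assumption: inversion is plain negation. Non-numeric or missing
    values fall back to default_priority.
    """
    value = item.get(signal_field)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return default_priority
    return -value if invert_signal else value
```

Under this assumption, an item with `quality_score: 0.2` gets a higher priority than one with `quality_score: 0.9` when `invert_signal` is true.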

How priority drives assignment

Set assignment_strategy: priority. When a user needs items, the queue is sorted by each item's stored triage_priority (descending by default; order: asc flips it), with ties broken by the original load order for determinism, and the top items are assigned. The signal is computed once at load or ingestion time and stored on the item, so assignment stays cheap.
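The ordering described above can be sketched in a few lines of Python. This is an illustration only: `next_items` is a hypothetical helper, and it assumes items are held in their original load order with a stored `triage_priority` on each.

```python
def next_items(items: list, k: int, order: str = "desc") -> list:
    """Pick the k items to assign next.

    Sort by stored triage_priority (descending unless order == 'asc'),
    breaking ties by the item's position in the original load order.
    """
    sign = -1 if order == "desc" else 1
    ranked = sorted(
        enumerate(items),
        key=lambda pair: (sign * pair[1]["triage_priority"], pair[0]),
    )
    return [item for _, item in ranked[:k]]


queue = [
    {"id": "a", "triage_priority": 0},
    {"id": "b", "triage_priority": 100},
    {"id": "c", "triage_priority": 60},
    {"id": "d", "triage_priority": 100},
]
```

Here `b` and `d` tie at 100, so `b` is served first because it was loaded earlier. Since the priority is already stored on each item, the sort key is a plain lookup and no rule is re-evaluated at assignment time.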

The badge (show_badge: true) is independent of the strategy. It explains why an item was flagged even if you keep a different assignment strategy.

The admin queue page

text
GET /admin/triage-queue              # JSON
GET /admin/triage-queue?format=html  # rendered page

Send the X-API-Key header. The page shows every remaining (incomplete) item ranked by priority, with the rule that flagged it, the current annotation count, and whether it is already assigned.
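As a rough illustration, the request could be issued from Python with the standard library. The base URL and key passed in are placeholders, the function names are hypothetical, and the JSON response is returned as-is because its schema is not described here.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_triage_request(base_url: str, api_key: str,
                         html: bool = False) -> urllib.request.Request:
    """Build a GET request for the triage queue, JSON by default."""
    url = base_url.rstrip("/") + "/admin/triage-queue"
    if html:
        url += "?" + urlencode({"format": "html"})
    # urllib normalizes the stored header name's capitalization;
    # HTTP header names are case-insensitive, so the server still sees it.
    return urllib.request.Request(
        url, headers={"X-API-Key": api_key}, method="GET"
    )


def fetch_triage_queue(base_url: str, api_key: str):
    """Send the request and decode the JSON body (schema not assumed)."""
    request = build_triage_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_triage_queue` with the address of a running server and a valid key returns the decoded ranking; pass `html=True` to `build_triage_request` to target the rendered page instead.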

Runtime ingestion

Because the scorer runs as items are added, traces ingested at runtime through trace ingestion (the webhook endpoint or a Langfuse poller) are scored as they arrive and slot into the priority queue automatically. A low-scoring or errored trace pushed in mid-session jumps ahead of clean ones still waiting.

Notes and limitations

  • Priority is computed at insertion time and stored on the item. After you edit triage.rules, restart the server: everything is re-scored on the next load.
  • A malformed rule logs a warning and is skipped; it never blocks data loading.
  • Triage orders which items are served. It does not change per-item annotation caps.

For implementation details, see the source documentation.