Semantic Curation (Catalog)

Semantic Curation helps you find what to review by similarity, not just by rules or uncertainty. An embedding index over your items powers two features: similarity search ("find traces like this failure") and dynamic slices, which are saved semantic and metadata filters that automatically include new matching traces and can be curated into datasets. It complements rule-based triage and model-uncertainty active learning.

Enabling

yaml
curation:
  enabled: true
  model_name: all-MiniLM-L6-v2   # any sentence-transformers model
  embed_on_ingest: false          # set to true to index runtime-ingested traces on arrival
  text_key: task_description      # which field to embed

Embeddings are lazy: sentence-transformers is imported only when you build the index, never at startup, so boot stays fast. Install it with pip install sentence-transformers, or wire a custom embedder. When enabled, the admin dashboard shows a Catalog link.
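
The lazy-loading behaviour described above can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the `LazyEmbedder` class and its `loader` hook are hypothetical names chosen for illustration, and the way a custom embedder is really wired in may differ. Only `SentenceTransformer(model_name).encode(texts)` is taken from the sentence-transformers library itself.

```python
def _load_sentence_transformer(model_name):
    # Imported inside the function, not at module top, so that importing
    # this module at startup never pulls in sentence-transformers.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_name)


class LazyEmbedder:
    """Hypothetical wrapper that defers model loading to the first encode()."""

    def __init__(self, model_name="all-MiniLM-L6-v2", loader=None):
        self.model_name = model_name
        # `loader` is an assumed hook for a custom embedder: any callable that
        # takes a model name and returns an object with an .encode(texts) method.
        self._loader = loader or _load_sentence_transformer
        self._model = None  # nothing heavy is loaded at construction time

    def encode(self, texts):
        if self._model is None:
            # First use (for example, when building the index) triggers the load.
            self._model = self._loader(self.model_name)
        return self._model.encode(texts)
```

Constructing `LazyEmbedder()` is cheap. The import and the model load happen once, on the first `encode()` call, which matches the "only when you build the index" behaviour in spirit.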

Build, search, slice

bash
# Build the index over current items
curl -X POST localhost:8000/admin/catalog/api/build -H "X-API-Key: <key>"
 
# Search by text query (or by an anchor instance to find neighbours)
curl -X POST localhost:8000/admin/catalog/api/search -H "X-API-Key: <key>" \
  -H "Content-Type: application/json" -d '{"query": "tool call failed", "top_k": 10, "threshold": 0.3}'
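
To make the `top_k` and `threshold` parameters concrete, here is a small Python sketch of a similarity search over an in-memory index. It assumes that `threshold` is a minimum cosine similarity and that results are ranked from most to least similar. The source does not state the metric, so treat both as assumptions; the `search` function and the toy two-dimensional vectors are purely illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index, query_vec, top_k=10, threshold=0.3):
    """Return up to top_k (item_id, score) pairs with score >= threshold."""
    scored = [(item_id, cosine(vec, query_vec)) for item_id, vec in index.items()]
    kept = [(item_id, score) for item_id, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)  # best match first
    return kept[:top_k]


# Toy index: item id -> embedding (real embeddings have hundreds of dimensions).
index = {
    "trace-1": [1.0, 0.0],
    "trace-2": [0.8, 0.6],
    "trace-3": [0.0, 1.0],
}

# Text-query case: the query would be embedded first; here we pass a vector directly.
hits = search(index, [1.0, 0.0], top_k=2, threshold=0.3)

# Anchor-instance case: reuse an indexed item's own vector to find its neighbours.
neighbours = search(index, index["trace-3"], top_k=10, threshold=0.3)
```

In this toy data, `trace-3` is orthogonal to the first query and falls below the threshold, so it is dropped before `top_k` is applied.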

A slice is a saved filter resolved on demand against the current index, so traces ingested after you saved it are automatically included if they match. It combines an optional semantic neighborhood with a metadata filter:

bash
curl -X POST localhost:8000/admin/catalog/api/slices -H "X-API-Key: <key>" \
  -H "Content-Type: application/json" \
  -d '{"name": "tool-errors", "query": "tool call failed", "threshold": 0.3,
       "metadata_filter": [{"field": "metadata.outcome", "equals": "error"}]}'
 
# Curate the resolved instances straight into a dataset
curl -X POST localhost:8000/admin/catalog/api/slices/tool-errors/to_dataset \
  -H "X-API-Key: <key>" -H "Content-Type: application/json" \
  -d '{"dataset": "tool-errors-to-fix"}'
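
The "resolved on demand" behaviour of a slice can be sketched in Python as follows. This is a simplified model, not the tool's code: `resolve_slice`, `get_field`, and `word_overlap` are hypothetical helpers, word overlap stands in for real embedding similarity, and reading `metadata.outcome` as a dotted path into nested keys is an assumption. The slice definition itself mirrors the request body shown above.

```python
def get_field(item, dotted):
    """Look up a dotted path such as 'metadata.outcome' in nested dicts."""
    value = item
    for key in dotted.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def word_overlap(query, item):
    """Stand-in for embedding similarity: Jaccard overlap of lowercase words."""
    q = set(query.lower().split())
    t = set(item["task_description"].lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0


def resolve_slice(slice_def, items, similarity=word_overlap):
    """Return ids of items matching the optional query and every metadata filter."""
    matched = []
    for item in items:
        if "query" in slice_def:
            if similarity(slice_def["query"], item) < slice_def["threshold"]:
                continue  # outside the semantic neighborhood
        filters = slice_def.get("metadata_filter", [])
        if all(get_field(item, f["field"]) == f["equals"] for f in filters):
            matched.append(item["id"])
    return matched


slice_def = {
    "name": "tool-errors", "query": "tool call failed", "threshold": 0.3,
    "metadata_filter": [{"field": "metadata.outcome", "equals": "error"}],
}
items = [
    {"id": "a", "task_description": "tool call failed on retry",
     "metadata": {"outcome": "error"}},
    {"id": "b", "task_description": "tool call failed",
     "metadata": {"outcome": "success"}},
    {"id": "c", "task_description": "summarize the report",
     "metadata": {"outcome": "error"}},
]
before = resolve_slice(slice_def, items)

# A trace ingested after the slice was saved is picked up on the next resolve.
items.append({"id": "d", "task_description": "search tool call failed",
              "metadata": {"outcome": "error"}})
after = resolve_slice(slice_def, items)
```

Because the slice stores only the filter and never a list of ids, nothing has to be updated when new traces arrive. Each resolve simply re-evaluates the filter against whatever is currently indexed.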