Semantic Curation (Catalog)
Semantic Curation helps you find what to review by similarity, not just rules or uncertainty. An embedding index over your items powers similarity search ("find traces like this failure") and dynamic slices — saved semantic + metadata filters that auto-include new matching traces and curate into datasets. It complements rule-based triage and model-uncertainty active learning.
Enabling
curation:
  enabled: true
  model_name: all-MiniLM-L6-v2  # any sentence-transformers model
  embed_on_ingest: false  # index runtime-ingested traces on arrival
  text_key: task_description  # which field to embed

Embeddings are lazy: sentence-transformers is imported only when you build the index, never at startup, so boot stays fast. Install it with pip install sentence-transformers, or wire a custom embedder. When enabled, the admin dashboard shows a Catalog link.
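The lazy-loading behaviour can be pictured roughly as below. This is a minimal sketch, not the project's actual code: the `LazyEmbedder` class and its `loader` parameter are hypothetical names chosen for illustration. Only `SentenceTransformer(...)` and its `encode` method come from the sentence-transformers library. The import sits inside the loader function, so nothing heavy is touched until the first call to `embed`.

```python
from typing import Any, Callable, Optional, Sequence


class LazyEmbedder:
    """Hypothetical wrapper that defers loading the embedding model."""

    def __init__(
        self,
        model_name: str = "all-MiniLM-L6-v2",
        loader: Optional[Callable[[str], Any]] = None,
    ) -> None:
        self.model_name = model_name
        # A custom loader stands in for "wire a custom embedder" (assumption).
        self._loader = loader or self._load_sentence_transformer
        self._model: Any = None  # nothing is loaded at construction time

    @staticmethod
    def _load_sentence_transformer(model_name: str) -> Any:
        # Deferred import: only runs the first time embed() is called.
        from sentence_transformers import SentenceTransformer

        return SentenceTransformer(model_name)

    def embed(self, texts: Sequence[str]) -> Any:
        if self._model is None:
            self._model = self._loader(self.model_name)
        return self._model.encode(list(texts))
```

Constructing `LazyEmbedder()` is cheap. The model is loaded once, on the first `embed(...)` call, which in this picture would happen when the index is built.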
Build, search, slice
# Build the index over current items
curl -X POST localhost:8000/admin/catalog/api/build -H "X-API-Key: <key>"
# Search by text query (or by an anchor instance to find neighbours)
curl -X POST localhost:8000/admin/catalog/api/search -H "X-API-Key: <key>" \
-H "Content-Type: application/json" -d '{"query": "tool call failed", "top_k": 10, "threshold": 0.3}'

A slice is a saved filter resolved on demand against the current index, so traces ingested after you saved it are automatically included if they match. It combines an optional semantic neighborhood with a metadata filter:
curl -X POST localhost:8000/admin/catalog/api/slices -H "X-API-Key: <key>" \
-H "Content-Type: application/json" \
-d '{"name": "tool-errors", "query": "tool call failed", "threshold": 0.3,
"metadata_filter": [{"field": "metadata.outcome", "equals": "error"}]}'
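
To make "resolved on demand" concrete, here is a small self-contained sketch of how a search and a slice could behave. It is an illustration under stated assumptions, not the service's implementation: it assumes cosine similarity, treats `threshold` as a minimum similarity, stores a query vector instead of query text, and uses toy 2-d vectors in place of real embeddings. The names `cosine`, `get_path`, `search`, and `Slice` are hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def get_path(item: Dict[str, Any], dotted: str) -> Any:
    """Follow a dotted field path such as 'metadata.outcome'."""
    cur: Any = item
    for part in dotted.split("."):
        if not isinstance(cur, dict) or part not in cur:
            return None
        cur = cur[part]
    return cur


def search(index: Dict[str, Vector], query_vec: Vector,
           top_k: int = 10, threshold: float = 0.3) -> List[Tuple[str, float]]:
    """Return up to top_k (id, score) pairs with score >= threshold, best first."""
    scored = [(item_id, cosine(vec, query_vec)) for item_id, vec in index.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:top_k]


@dataclass
class Slice:
    """A saved filter: optional semantic neighbourhood plus metadata conditions."""
    name: str
    query_vec: Optional[Vector] = None
    threshold: float = 0.3
    metadata_filter: List[Dict[str, Any]] = field(default_factory=list)

    def resolve(self, index: Dict[str, Vector],
                items: Dict[str, Dict[str, Any]]) -> List[str]:
        # Nothing is cached: membership is recomputed against the current index.
        if self.query_vec is None:
            candidates = list(index)
        else:
            hits = search(index, self.query_vec, top_k=len(index),
                          threshold=self.threshold)
            candidates = [item_id for item_id, _ in hits]
        return [i for i in candidates
                if all(get_path(items[i], cond["field"]) == cond["equals"]
                       for cond in self.metadata_filter)]


index = {"t1": [1.0, 0.0], "t2": [0.9, 0.1], "t3": [0.0, 1.0]}
items = {
    "t1": {"metadata": {"outcome": "error"}},
    "t2": {"metadata": {"outcome": "ok"}},
    "t3": {"metadata": {"outcome": "error"}},
}
tool_errors = Slice("tool-errors", query_vec=[1.0, 0.0], threshold=0.3,
                    metadata_filter=[{"field": "metadata.outcome", "equals": "error"}])
before = tool_errors.resolve(index, items)  # ['t1']

# A trace indexed after the slice was saved is picked up on the next resolve.
index["t4"] = [0.8, 0.2]
items["t4"] = {"metadata": {"outcome": "error"}}
after = tool_errors.resolve(index, items)  # ['t1', 't4']
```

Here `t2` is close to the query but fails the metadata condition, and `t3` passes the metadata condition but falls below the threshold, so neither is included. Because the slice stores only the filter, `t4` joins the result without the slice being edited.
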
# Curate the resolved instances straight into a dataset
curl -X POST localhost:8000/admin/catalog/api/slices/tool-errors/to_dataset \
-H "X-API-Key: <key>" -H "Content-Type: application/json" \
-d '{"dataset": "tool-errors-to-fix"}'

Related
- Full reference on Read the Docs — full slice/embedding API, version-matched
- Datasets & Experiments — slice curation target
- Automation Rules — rule-based routing (shares the condition grammar)
- Triage Queue — signal-based prioritization