Choosing an Annotation Scheme
How to map your research question to the right Potato annotation type: radio, multiselect, span, likert, slider, pairwise, best-worst, multirate, rubric, and more.
The annotation scheme is the shape of the question you ask annotators. Pick it by asking what kind of answer you need: one category, several categories, a region of the item, a position on a scale, or a comparison between items. Potato supports more than 30 annotation types; this guide narrows them down.
For the full list of options and their settings, see the Annotation Schemes reference.
A decision guide
| If you need… | Use this type | Example |
|---|---|---|
| Exactly one category | radio | Sentiment: positive / negative / neutral |
| Several categories at once | multiselect | Topics present in an article |
| A category from a long list | select (dropdown) | Country, language, ICD code |
| A region inside the text/audio | span | Named entities, error spans |
| A position on a scale | likert | Agreement, fluency, quality |
| A continuous value | slider | Confidence 0–100 |
| The better of two items | pairwise | Which model reply is better? |
| Best and worst of a set | best_worst_scaling | Rank translations by fluency |
| The same scale across many items | multirate | Rate each retrieved document |
| Multiple weighted criteria | rubric_eval | MT-Bench-style LLM scoring |
| A written answer | text | Justification, correction |
Worked example: single vs. multiple labels
If categories are mutually exclusive, use radio so annotators must pick one:
annotation_schemes:
  - annotation_type: radio
    name: stance
    description: "What stance does the post take?"
    labels: [Supports, Opposes, Neutral]
    sequential_key_binding: true

If an item can have several labels at once, use multiselect and set limits:
annotation_schemes:
  - annotation_type: multiselect
    name: topics
    description: "Select every topic the article covers."
    labels: [Politics, Technology, Health, Sports, Business]
    min_selections: 1
    max_selections: 3

In the radio example, sequential_key_binding: true lets annotators press number keys instead of clicking, which speeds up large jobs.
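The selection limits can drift out of sync with the label list as a scheme evolves. Here is a minimal sketch of a sanity check, assuming the scheme has already been loaded into a plain Python dict with the keys used in the multiselect example. The function name check_multiselect_limits is hypothetical and not part of Potato, and the fallback values for missing limits are an assumption rather than documented Potato behavior.

```python
def check_multiselect_limits(scheme):
    """Return a list of problems with a multiselect scheme's selection limits."""
    problems = []
    n_labels = len(scheme.get("labels", []))
    # Assumption: a missing limit means "no constraint" (0 or all labels).
    lo = scheme.get("min_selections", 0)
    hi = scheme.get("max_selections", n_labels)
    if lo < 0:
        problems.append("min_selections is negative")
    if hi > n_labels:
        problems.append("max_selections exceeds the number of labels")
    if lo > hi:
        problems.append("min_selections is greater than max_selections")
    return problems


# The multiselect scheme from the example, written as a dict.
topics = {
    "annotation_type": "multiselect",
    "name": "topics",
    "labels": ["Politics", "Technology", "Health", "Sports", "Business"],
    "min_selections": 1,
    "max_selections": 3,
}
print(check_multiselect_limits(topics))  # prints []
```

An empty list means the limits are consistent with the five labels; removing labels without lowering max_selections would surface as a reported problem.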
Combine schemes for richer tasks
You can stack several schemes on one screen, for example a category plus a free-text reason. Pair them with conditional logic so the reason appears only when it is needed:
annotation_schemes:
  - annotation_type: radio
    name: quality
    description: "Is this answer acceptable?"
    labels: [Good, Bad]
  - annotation_type: text
    name: reason
    description: "If Bad, briefly explain why."
    label_requirement:
      required: false
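Because reason is not required in this config, a Bad judgment can be submitted without an explanation. Here is a minimal sketch of a post-hoc check, assuming each exported annotation has been flattened into a dict keyed by scheme name with an id field. Potato's actual export format may differ, and missing_reasons is a hypothetical helper, not a Potato function.

```python
def missing_reasons(records):
    """Return ids of records labeled Bad whose reason is empty or absent."""
    flagged = []
    for rec in records:
        # Treat None and whitespace-only strings as "no reason given".
        reason = (rec.get("reason") or "").strip()
        if rec.get("quality") == "Bad" and not reason:
            flagged.append(rec["id"])
    return flagged


# Made-up records for illustration only (assumed flattened layout).
records = [
    {"id": "a1", "quality": "Good", "reason": ""},
    {"id": "a2", "quality": "Bad", "reason": "Cites a source that does not exist."},
    {"id": "a3", "quality": "Bad", "reason": "   "},
]
print(missing_reasons(records))  # prints ['a3']
```

Flagged items can be sent back for a second pass, which keeps the free-text field optional for Good answers while still catching unexplained rejections.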