Task Assignment
Control how Potato distributes annotation items to annotators. Covers all assignment strategies including the custom Batch strategy for repeat-round studies, and reclaiming abandoned assignments from Prolific or QC-blocked workers.
Task assignment controls which items each annotator sees, how many they complete, how many annotations each item receives, and the order items appear in. This page covers Potato's assignment strategies, the custom Batch strategy for repeat-round study designs, and how to reclaim assignments abandoned by crowd workers.
Assignment strategies
Set assignment_strategy to one of:
| Strategy | What it does |
|---|---|
| random | Assigns items randomly (the default). |
| fixed_order | Assigns items in dataset order. |
| least_annotated | Prioritizes items with the fewest annotations so far. |
| max_diversity | Prioritizes items with the most disagreement among existing annotations. |
| diversity_clustering | Embeds and clusters items, then serves them round-robin across clusters. |
| batch | Restricts assignment to explicit annotator/item cohorts (see below). |
| priority | Serves the highest-priority items first; see the Triage Queue. |
| active_learning | Uses a model to prioritize uncertain items. |
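As an illustration of what a strategy like least_annotated has to decide, the sketch below picks the next item for one annotator. It is not Potato's implementation: the function name, its arguments, and the tie-breaking rule are assumptions made for the example, and the per-item cap mirrors the max_annotations_per_item setting (-1 for unlimited).

```python
from collections import Counter


def pick_least_annotated(item_ids, annotation_counts, already_seen, max_per_item=-1):
    """Return the next item for one annotator, or None if nothing is eligible.

    Hypothetical helper: annotation_counts should be a Counter so that items
    without annotations count as 0.
    """
    candidates = [
        item
        for item in item_ids
        if item not in already_seen
        and (max_per_item == -1 or annotation_counts[item] < max_per_item)
    ]
    if not candidates:
        return None
    # min() keeps the first of equally ranked items, so ties fall back to
    # dataset order (an assumption of this sketch).
    return min(candidates, key=lambda item: annotation_counts[item])


counts = Counter({"a": 2, "b": 0, "c": 1})
next_item = pick_least_annotated(["a", "b", "c"], counts, already_seen={"b"}, max_per_item=3)
```

Here the annotator has already seen "b", so the choice is between "a" (two annotations) and "c" (one), and the sketch returns "c".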
assignment_strategy: random
max_annotations_per_user: 10 # -1 for unlimited
max_annotations_per_item: 3 # -1 for unlimited
Custom Batch assignment
The batch strategy assigns predefined batches of items to specific annotators. It is built for repeat-round study designs, where the same annotators who saw a first-round batch must receive the matching second-round batch.
assignment_strategy: batch
num_annotators_per_item: 4
batch_assignment:
  groups:
    - name: round1_batch_a
      annotators: ["u1", "u2", "u3", "u4"]
      instances: ["r2_item_001", "r2_item_002"]
For long batches, move the instance list into a separate data file (json, jsonl, csv, tsv, or parquet); IDs are read with item_properties.id_key:
batch_assignment:
  groups:
    - name: round1_batch_a
      annotators: ["u1", "u2", "u3", "u4"]
      instances_file: batches/round1_batch_a.csv
Items can also name their allowed annotators directly, which is useful when round-2 data is generated from round-1 annotations:
assignment_strategy: batch
batch_assignment:
  annotator_key: round1_annotators
Users outside the configured cohorts receive no items under this strategy.
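The cohort rule can be pictured as a lookup from a user to the items they are allowed to see. The sketch below is a hypothetical model of that rule, not Potato's code: allowed_items, the "id" field, and the idea of combining groups with annotator_key in one call are all assumptions made for illustration.

```python
def allowed_items(user, groups, items=(), annotator_key=None):
    """Return the item IDs a user may receive under a batch-style rule.

    Hypothetical helper: `groups` mirrors the batch_assignment.groups config,
    and `items` are dataset rows that may list their own allowed annotators.
    """
    allowed = []
    # Cohorts declared in the config: every member gets the group's instances.
    for group in groups:
        if user in group["annotators"]:
            allowed.extend(group["instances"])
    # Items that name their allowed annotators in a field of their own.
    if annotator_key is not None:
        for item in items:
            if user in item.get(annotator_key, []):
                allowed.append(item["id"])
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(allowed))


groups = [
    {
        "name": "round1_batch_a",
        "annotators": ["u1", "u2", "u3", "u4"],
        "instances": ["r2_item_001", "r2_item_002"],
    }
]
items = [{"id": "r2_item_003", "round1_annotators": ["u1"]}]

u1_items = allowed_items("u1", groups, items, annotator_key="round1_annotators")
outsider_items = allowed_items("u9", groups, items, annotator_key="round1_annotators")
```

A user who appears in no cohort and is named by no item, such as "u9" here, ends up with an empty list, which matches the behavior described above.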
Reclaiming abandoned assignments
In crowdsourcing batches, workers can return, time out, or fail quality checks after receiving assigned items. With instance_reclaim enabled, Potato returns assigned-but-unannotated items to the pool so they can be assigned again.
instance_reclaim:
  enabled: true
  timeout_hours: 24
  preserve_completed_annotations: true
Reclaiming runs automatically for stale assignments when assignment runs, for Prolific workers whose submissions become RETURNED, TIMED-OUT, or REJECTED, and for users blocked by an attention-check failure (who release their unannotated items immediately).
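For the timeout case, a stale assignment is one that was handed out more than timeout_hours ago and still has no annotation. The following sketch shows one way such a check could look. The record layout (user, item, assigned_at, annotated) and the function name are assumptions for the example, and treating an assignment exactly at the cutoff as still fresh is a choice made here rather than documented behavior.

```python
from datetime import datetime, timedelta, timezone


def stale_assignments(assignments, now, timeout_hours=24):
    """Return assigned-but-unannotated records older than the timeout.

    Hypothetical helper: each record is a dict with 'assigned_at' (datetime)
    and 'annotated' (bool) keys.
    """
    cutoff = now - timedelta(hours=timeout_hours)
    return [
        record
        for record in assignments
        # Completed work is never reclaimed by the timeout check itself.
        if not record["annotated"] and record["assigned_at"] < cutoff
    ]


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
assignments = [
    {"user": "u1", "item": "i1", "assigned_at": now - timedelta(hours=30), "annotated": False},
    {"user": "u1", "item": "i2", "assigned_at": now - timedelta(hours=30), "annotated": True},
    {"user": "u2", "item": "i3", "assigned_at": now - timedelta(hours=2), "annotated": False},
]
to_reclaim = [record["item"] for record in stale_assignments(assignments, now)]
```

Only "i1" qualifies: "i2" is old but already annotated, and "i3" is unannotated but was assigned two hours ago.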
You can decide per reason whether to keep a reclaimed worker's completed annotations. This lets you trust partial work from a timed-out Prolific worker while discarding everything from a worker blocked by quality control:
instance_reclaim:
  enabled: true
  timeout_hours: 24
  preserve_completed_annotations: true # default for reasons not overridden below
  prolific:
    status_policies:
      TIMED-OUT:
        preserve_completed_annotations: true
      RETURNED:
        preserve_completed_annotations: true
      REJECTED:
        preserve_completed_annotations: false
  quality_control:
    preserve_completed_annotations: false
When preserve_completed_annotations is false, Potato clears that user's annotations for their assigned items, removes their annotator credit, and returns the items to the pool. The failed attention-check response that triggers a block is never kept.
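The per-reason override amounts to a lookup with a fallback: use the setting for the specific reason if one exists, otherwise the top-level default. The sketch below models that lookup on a plain dict. It assumes that prolific.status_policies and quality_control sit directly under instance_reclaim, and the function name and the reason labels ("prolific", "quality_control", "timeout") are hypothetical rather than taken from Potato.

```python
def should_preserve(reclaim_config, reason, prolific_status=None):
    """Decide whether a reclaimed worker's completed annotations are kept.

    Hypothetical helper: `reclaim_config` is the instance_reclaim section
    loaded as a dict, with the top-level default set explicitly.
    """
    default = reclaim_config["preserve_completed_annotations"]
    if reason == "prolific":
        # Prolific overrides are keyed by submission status.
        policies = reclaim_config.get("prolific", {}).get("status_policies", {})
        override = policies.get(prolific_status, {})
    else:
        # Other reasons look for a section named after the reason.
        override = reclaim_config.get(reason, {})
    # Fall back to the top-level default when no override is configured.
    return override.get("preserve_completed_annotations", default)


reclaim_config = {
    "enabled": True,
    "timeout_hours": 24,
    "preserve_completed_annotations": True,
    "prolific": {
        "status_policies": {
            "TIMED-OUT": {"preserve_completed_annotations": True},
            "RETURNED": {"preserve_completed_annotations": True},
            "REJECTED": {"preserve_completed_annotations": False},
        }
    },
    "quality_control": {"preserve_completed_annotations": False},
}

keep_timed_out = should_preserve(reclaim_config, "prolific", "TIMED-OUT")
keep_rejected = should_preserve(reclaim_config, "prolific", "REJECTED")
keep_blocked = should_preserve(reclaim_config, "quality_control")
```

With this configuration, partial work from a timed-out Prolific worker is kept, while work from a rejected or QC-blocked worker is discarded. A reason with no override, such as a plain timeout, falls back to the top-level value.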
Related
- Heterogeneous Coverage — per-item annotator caps and overlap sampling
- Crowdsourcing — MTurk and Prolific integration
- Signal-Based Triage Queue — the priority strategy
For implementation details, see the source documentation.