Task Assignment

Control how Potato distributes annotation items to annotators. Covers all assignment strategies including the custom Batch strategy for repeat-round studies, and reclaiming abandoned assignments from Prolific or QC-blocked workers.

Task assignment controls which items each annotator sees, how many they complete, how many annotations each item receives, and the order items appear in. This page covers Potato's assignment strategies, the custom Batch strategy for repeat-round study designs, and how to reclaim assignments abandoned by crowd workers.

Assignment strategies

Set assignment_strategy to one of:

Strategy              What it does
random                Assigns items randomly (the default).
fixed_order           Assigns items in dataset order.
least_annotated       Prioritizes items with the fewest annotations so far.
max_diversity         Prioritizes items with the most disagreement among existing annotations.
diversity_clustering  Embeds and clusters items, then serves them round-robin across clusters.
batch                 Restricts assignment to explicit annotator/item cohorts (see below).
priority              Serves the highest-priority items first; see the Triage Queue.
active_learning       Uses a model to prioritize uncertain items.

yaml
assignment_strategy: random
max_annotations_per_user: 10    # -1 for unlimited
max_annotations_per_item: 3     # -1 for unlimited
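
To make the interaction between a strategy and the two limits concrete, here is a minimal Python sketch of a least_annotated-style selection. It is not Potato's implementation: the function name pick_least_annotated, the counts dictionary, and the tie-breaking rule are assumptions made for illustration. Only the meaning of -1 (unlimited) and the idea of preferring the least-annotated item come from the configuration above.

```python
from typing import Dict, Optional, Set


def pick_least_annotated(
    counts: Dict[str, int],
    seen_by_user: Set[str],
    max_annotations_per_item: int = 3,
) -> Optional[str]:
    """Return the next item for a user, or None if nothing is eligible.

    Hypothetical helper: `counts` maps item ID -> annotations so far,
    `seen_by_user` holds items this user has already been given.
    """
    candidates = [
        item_id
        for item_id, n in counts.items()
        if item_id not in seen_by_user
        # -1 means "unlimited", mirroring the YAML comment above.
        and (max_annotations_per_item == -1 or n < max_annotations_per_item)
    ]
    if not candidates:
        return None
    # Fewest annotations first; ties broken by item ID (an assumption)
    # so the result is deterministic.
    return min(candidates, key=lambda item_id: (counts[item_id], item_id))


counts = {"item_a": 2, "item_b": 0, "item_c": 3, "item_d": 1}
next_item = pick_least_annotated(counts, seen_by_user={"item_b"})
print(next_item)
```

In this example item_c is skipped because it has already reached the per-item limit, and item_b is skipped because the user has already seen it.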

Custom Batch assignment

The batch strategy assigns predefined batches of items to specific annotators. It is built for repeat-round study designs, where the same annotators who saw a first-round batch must receive the matching second-round batch.

yaml
assignment_strategy: batch
num_annotators_per_item: 4
 
batch_assignment:
  groups:
    - name: round1_batch_a
      annotators: ["u1", "u2", "u3", "u4"]
      instances: ["r2_item_001", "r2_item_002"]

For long batches, move the instance list into a separate data file (json, jsonl, csv, tsv, or parquet) and reference it with instances_file. Item IDs are read from the field named by item_properties.id_key:

yaml
batch_assignment:
  groups:
    - name: round1_batch_a
      annotators: ["u1", "u2", "u3", "u4"]
      instances_file: batches/round1_batch_a.csv
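
As a rough illustration of what such a file might contain and how its IDs could be read, the following Python sketch parses a CSV with the standard csv module. The column name id, the sample rows, and the helper read_instance_ids are assumptions; in a real project the column would be whatever item_properties.id_key is set to, and Potato's own loader may behave differently.

```python
import csv
import io
from typing import List, TextIO


def read_instance_ids(handle: TextIO, id_key: str = "id") -> List[str]:
    """Read item IDs from a CSV file object (hypothetical helper).

    `id_key` stands in for the value of item_properties.id_key.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or id_key not in reader.fieldnames:
        raise KeyError(f"column {id_key!r} not found in batch file")
    # Skip rows whose ID cell is empty.
    return [row[id_key] for row in reader if row[id_key]]


# Invented sample content standing in for batches/round1_batch_a.csv.
sample = io.StringIO(
    "id,text\n"
    "r2_item_001,First example\n"
    "r2_item_002,Second example\n"
)
ids = read_instance_ids(sample)
print(ids)
```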

Items can also name their allowed annotators directly, which is useful when round-2 data is generated from round-1 annotations. Set annotator_key to the item field that lists the allowed annotators:

yaml
assignment_strategy: batch
 
batch_assignment:
  annotator_key: round1_annotators

Users outside the configured cohorts receive no items under this strategy.
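
The cohort rule can be sketched in a few lines of Python. This is an illustration of the behavior described above, not Potato's code: the function items_for_user is hypothetical, and the group dictionaries simply mirror the groups entries in the YAML.

```python
from typing import Dict, List


def items_for_user(user_id: str, groups: List[Dict]) -> List[str]:
    """Return the item IDs a user is allowed to receive (hypothetical helper)."""
    allowed: List[str] = []
    for group in groups:
        if user_id in group["annotators"]:
            for instance_id in group["instances"]:
                # Avoid duplicates if a user belongs to overlapping groups.
                if instance_id not in allowed:
                    allowed.append(instance_id)
    return allowed


groups = [
    {
        "name": "round1_batch_a",
        "annotators": ["u1", "u2", "u3", "u4"],
        "instances": ["r2_item_001", "r2_item_002"],
    }
]

print(items_for_user("u1", groups))  # member of the cohort
print(items_for_user("u9", groups))  # outside every cohort
```

A user who appears in no group gets an empty list, which matches the rule that users outside the configured cohorts receive no items.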

Reclaiming abandoned assignments

In crowdsourcing batches, workers can return, time out, or fail quality checks after receiving assigned items. With instance_reclaim enabled, Potato returns assigned-but-unannotated items to the pool so they can be assigned again.

yaml
instance_reclaim:
  enabled: true
  timeout_hours: 24
  preserve_completed_annotations: true

Reclaiming runs automatically in three cases: for stale assignments (those older than timeout_hours), which are checked whenever items are assigned; for Prolific workers whose submission status becomes RETURNED, TIMED-OUT, or REJECTED; and for users blocked by an attention-check failure, whose unannotated items are released immediately.
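
The stale-assignment case can be pictured with a short Python sketch. It assumes that "stale" means assigned more than timeout_hours ago and still unannotated; the record layout and the function find_stale_items are hypothetical and only illustrate the idea.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def find_stale_items(
    assignments: List[Dict], now: datetime, timeout_hours: float = 24
) -> List[str]:
    """Return IDs of assigned-but-unannotated items past the timeout.

    Hypothetical record layout: item_id, assigned_at (datetime), annotated (bool).
    """
    cutoff = now - timedelta(hours=timeout_hours)
    return [
        a["item_id"]
        for a in assignments
        if not a["annotated"] and a["assigned_at"] < cutoff
    ]


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
assignments = [
    {"item_id": "item_1", "assigned_at": now - timedelta(hours=30), "annotated": False},
    {"item_id": "item_2", "assigned_at": now - timedelta(hours=30), "annotated": True},
    {"item_id": "item_3", "assigned_at": now - timedelta(hours=2), "annotated": False},
]
stale = find_stale_items(assignments, now)
print(stale)
```

Only item_1 qualifies here: item_2 was already annotated, and item_3 is still within the 24-hour window.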

You can decide per reason whether to keep a reclaimed worker's completed annotations. This lets you trust partial work from a timed-out Prolific worker while discarding everything from a worker blocked by quality control:

yaml
instance_reclaim:
  enabled: true
  timeout_hours: 24
  preserve_completed_annotations: true   # default for reasons not overridden below
 
  prolific:
    status_policies:
      TIMED-OUT:
        preserve_completed_annotations: true
      RETURNED:
        preserve_completed_annotations: true
      REJECTED:
        preserve_completed_annotations: false
 
  quality_control:
    preserve_completed_annotations: false

When preserve_completed_annotations is false, Potato clears that user's annotations for their assigned items, removes their annotator credit, and returns the items to the pool. The failed attention-check response that triggers a block is never kept.
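
One way to read the per-reason configuration is as a lookup that falls back to the top-level default. The Python sketch below illustrates that reading under stated assumptions: the reason labels "timeout", "prolific", and "quality_control" and the function should_preserve are hypothetical, and the dictionary simply mirrors the YAML above. Potato's actual resolution logic may differ.

```python
from typing import Dict, Optional


def should_preserve(
    config: Dict, reason: str, prolific_status: Optional[str] = None
) -> bool:
    """Decide whether a reclaimed worker's completed annotations are kept.

    Hypothetical helper; `config` mirrors the instance_reclaim block.
    """
    default = config["preserve_completed_annotations"]
    if reason == "prolific":
        policies = config.get("prolific", {}).get("status_policies", {})
        policy = policies.get(prolific_status, {})
        return policy.get("preserve_completed_annotations", default)
    if reason == "quality_control":
        qc = config.get("quality_control", {})
        return qc.get("preserve_completed_annotations", default)
    # Any reason without an override (e.g. a plain timeout) uses the default.
    return default


instance_reclaim = {
    "enabled": True,
    "timeout_hours": 24,
    "preserve_completed_annotations": True,
    "prolific": {
        "status_policies": {
            "TIMED-OUT": {"preserve_completed_annotations": True},
            "RETURNED": {"preserve_completed_annotations": True},
            "REJECTED": {"preserve_completed_annotations": False},
        }
    },
    "quality_control": {"preserve_completed_annotations": False},
}

print(should_preserve(instance_reclaim, "prolific", "TIMED-OUT"))
print(should_preserve(instance_reclaim, "prolific", "REJECTED"))
print(should_preserve(instance_reclaim, "quality_control"))
```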

For implementation details, see the source documentation.