# Task Assignment

Source: https://www.potatoannotator.com/docs/deployment/task-assignment

**Task assignment controls which items each annotator sees, how many they complete, how many annotations each item receives, and the order items appear in.** This page covers Potato's assignment strategies, the custom Batch strategy for repeat-round study designs, and how to reclaim assignments abandoned by crowd workers.

## Assignment strategies

Set `assignment_strategy` to one of:

| Strategy | What it does |
|----------|--------------|
| `random` | Assigns items randomly (the default). |
| `fixed_order` | Assigns items in dataset order. |
| `least_annotated` | Prioritizes items with the fewest annotations so far. |
| `max_diversity` | Prioritizes items with the most disagreement among existing annotations. |
| `diversity_clustering` | Embeds and clusters items, then serves them round-robin across clusters. |
| `batch` | Restricts assignment to explicit annotator/item cohorts (see below). |
| `priority` | Serves the highest-priority items first; see the [Triage Queue](/docs/agent-evaluation/triage-queue). |
| `active_learning` | Uses a model to prioritize uncertain items. |

```yaml
assignment_strategy: random
max_annotations_per_user: 10    # -1 for unlimited
max_annotations_per_item: 3     # -1 for unlimited
```
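
As a rough illustration of how the two caps interact with a strategy such as `least_annotated`, the sketch below picks the next item for one user. It is not Potato's implementation: `next_item`, `counts`, and `done_by_user` are hypothetical names, and the tie-breaking rule (dataset order) is an assumption. As in the config above, `-1` is treated as unlimited.

```python
def next_item(item_ids, counts, done_by_user, max_per_user=10, max_per_item=3):
    """Return the next item ID for a user, or None if nothing is eligible.

    item_ids:     item IDs in dataset order
    counts:       dict mapping item ID -> annotations received so far
    done_by_user: set of item IDs this user has already annotated
    """
    # Per-user cap: stop once the user has reached their quota (-1 = unlimited).
    if max_per_user != -1 and len(done_by_user) >= max_per_user:
        return None

    # Per-item cap: skip items that already have enough annotations,
    # and never show a user an item they have already annotated.
    eligible = [
        i for i in item_ids
        if i not in done_by_user
        and (max_per_item == -1 or counts.get(i, 0) < max_per_item)
    ]
    if not eligible:
        return None

    # least_annotated: fewest annotations first; min() keeps dataset order on ties.
    return min(eligible, key=lambda i: counts.get(i, 0))
```

Swapping the final `min(...)` for a random choice or for `eligible[0]` would give the corresponding `random` or `fixed_order` behavior in this toy model.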

## Custom Batch assignment

The `batch` strategy assigns predefined batches of items to specific annotators. It is built for repeat-round study designs, where the same annotators who saw a first-round batch must receive the matching second-round batch.

```yaml
assignment_strategy: batch
max_annotations_per_item: 4

batch_assignment:
  groups:
    - name: round1_batch_a
      annotators: ["u1", "u2", "u3", "u4"]
      instances: ["r2_item_001", "r2_item_002"]
```

For long batches, move the instance list into a separate data file (`json`, `jsonl`, `csv`, `tsv`, or `parquet`). Instance IDs are read from the field named by `item_properties.id_key`:

```yaml
batch_assignment:
  groups:
    - name: round1_batch_a
      annotators: ["u1", "u2", "u3", "u4"]
      instances_file: batches/round1_batch_a.csv
```

Items can also name their allowed annotators directly, which is useful when round-2 data is generated from round-1 annotations:

```yaml
assignment_strategy: batch

batch_assignment:
  annotator_key: round1_annotators
```

Users outside the configured cohorts receive no items under this strategy.
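
The two ways of defining cohorts (explicit groups and a per-item annotator field) can be pictured with the sketch below. It is a simplified model, not Potato's code: `allowed_items` is a hypothetical helper, items are assumed to be dicts with an `id` field, and the field named by `annotator_key` is assumed to hold a list of user IDs.

```python
def allowed_items(user, items, groups=None, annotator_key=None):
    """Return the IDs of items a user may see under a batch-style policy."""
    # Collect every instance ID from the groups that list this user.
    group_ids = set()
    for group in groups or []:
        if user in group["annotators"]:
            group_ids.update(group["instances"])

    allowed = []
    for item in items:
        # Assumption: the annotator_key field is a list of user IDs on the item.
        named = annotator_key is not None and user in item.get(annotator_key, [])
        if item["id"] in group_ids or named:
            allowed.append(item["id"])
    return allowed


groups = [
    {"name": "round1_batch_a", "annotators": ["u1", "u2"], "instances": ["r2_item_001"]},
]
items = [
    {"id": "r2_item_001"},
    {"id": "r2_item_002", "round1_annotators": ["u3"]},
]
print(allowed_items("u1", items, groups=groups))
print(allowed_items("u9", items, groups=groups, annotator_key="round1_annotators"))
```

The second call returns an empty list, which mirrors the rule that users outside every cohort receive nothing.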

## Reclaiming abandoned assignments

In crowdsourcing batches, workers can return, time out, or fail quality checks after receiving assigned items. With `instance_reclaim` enabled, Potato returns assigned-but-unannotated items to the pool so they can be assigned again.

```yaml
instance_reclaim:
  enabled: true
  timeout_hours: 24
  preserve_completed_annotations: true
```

Reclaiming runs automatically in three cases:

- **Stale assignments:** checked each time assignment runs.
- **Prolific status changes:** workers whose submissions become `RETURNED`, `TIMED-OUT`, or `REJECTED`.
- **Attention-check blocks:** users blocked by an attention-check failure release their unannotated items immediately.
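
For the stale-assignment case, the check can be thought of as a simple age filter. The sketch below is illustrative only: `stale_assignments` and the record fields (`user`, `item_id`, `assigned_at`, `annotated`) are hypothetical, and treating "stale" as strictly older than `timeout_hours` is an assumption.

```python
from datetime import datetime, timedelta


def stale_assignments(assignments, now, timeout_hours=24):
    """Return (user, item_id) pairs that are unannotated and past the timeout."""
    cutoff = now - timedelta(hours=timeout_hours)
    return [
        (a["user"], a["item_id"])
        for a in assignments
        # Completed items are never reclaimed by the timeout check.
        if not a["annotated"] and a["assigned_at"] < cutoff
    ]


now = datetime(2024, 1, 2, 12, 0)
assignments = [
    {"user": "u1", "item_id": "a", "assigned_at": datetime(2024, 1, 1, 11, 0), "annotated": False},
    {"user": "u1", "item_id": "b", "assigned_at": datetime(2024, 1, 1, 11, 0), "annotated": True},
    {"user": "u2", "item_id": "c", "assigned_at": datetime(2024, 1, 2, 11, 0), "annotated": False},
]
print(stale_assignments(assignments, now))
```

Only item `a` qualifies here: `b` is already annotated and `c` was assigned one hour ago.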

You can decide per reason whether to keep a reclaimed worker's completed annotations. This lets you trust partial work from a timed-out Prolific worker while discarding everything from a worker blocked by quality control:

```yaml
instance_reclaim:
  enabled: true
  timeout_hours: 24
  preserve_completed_annotations: true   # default for reasons not overridden below

  prolific:
    status_policies:
      TIMED-OUT:
        preserve_completed_annotations: true
      RETURNED:
        preserve_completed_annotations: true
      REJECTED:
        preserve_completed_annotations: false

  quality_control:
    preserve_completed_annotations: false
```

When `preserve_completed_annotations` is `false`, Potato clears that user's annotations for their assigned items, removes their annotator credit, and returns the items to the pool. The failed attention-check response that triggers a block is never kept.
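
One way to read the per-reason overrides is as a lookup that falls back to the top-level setting. The sketch below mirrors the YAML above as a Python dict. `should_preserve` and the reason labels (`"prolific"`, `"quality_control"`, `"timeout"`) are hypothetical names for illustration, and the sketch expects the top-level `preserve_completed_annotations` key to be set.

```python
def should_preserve(config, reason, prolific_status=None):
    """Resolve preserve_completed_annotations for a single reclaim event."""
    # Top-level value applies to any reason that is not overridden.
    default = config["preserve_completed_annotations"]

    if reason == "prolific":
        policies = config.get("prolific", {}).get("status_policies", {})
        override = policies.get(prolific_status, {})
    elif reason == "quality_control":
        override = config.get("quality_control", {})
    else:
        # Any other reason (for example a timeout) has no override in this sketch.
        override = {}

    return override.get("preserve_completed_annotations", default)


config = {
    "enabled": True,
    "timeout_hours": 24,
    "preserve_completed_annotations": True,
    "prolific": {
        "status_policies": {
            "TIMED-OUT": {"preserve_completed_annotations": True},
            "RETURNED": {"preserve_completed_annotations": True},
            "REJECTED": {"preserve_completed_annotations": False},
        }
    },
    "quality_control": {"preserve_completed_annotations": False},
}
print(should_preserve(config, "prolific", "REJECTED"))
print(should_preserve(config, "timeout"))
```

With the example config, a `REJECTED` Prolific submission resolves to `False`, while a plain timeout falls back to the top-level `True`.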

## Related

- [Heterogeneous Coverage](/docs/deployment/heterogeneous-coverage) — per-item annotator caps and overlap sampling
- [Crowdsourcing](/docs/deployment/crowdsourcing) — MTurk and Prolific integration
- [Signal-Based Triage Queue](/docs/agent-evaluation/triage-queue) — the `priority` strategy

For implementation details, see the [source documentation](https://github.com/davidjurgens/potato/blob/main/docs/advanced/task_assignment.md).
