# Heterogeneous Annotator Coverage

Source: https://www.potatoannotator.com/docs/deployment/heterogeneous-coverage

**Heterogeneous coverage lets you assign different numbers of annotators to different items** instead of a uniform cap. The common research design is one annotator on most items, with two or three overlapping on a 5–10% sample to monitor quality. Potato expresses that through the `num_annotators_per_item` and `per_annotator_quota` config blocks.

## Per-item annotator caps

`num_annotators_per_item` is the canonical key. It accepts a single integer for a uniform cap, or a structured mapping with a default, an overlap sample, and an optional adaptive boost:

```yaml
num_annotators_per_item:
  default: 1
  overlap_sample:
    fraction: 0.1
    count: 3
    stratify_by: domain
    seed: 42
  adaptive:
    enabled: true
    disagreement_threshold: 0.5
    boost_to: 3
  min: 1
```

`max_annotations_per_item` is now a deprecated alias for `num_annotators_per_item: <int>`.
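As a rough illustration of how the integer form, the structured form, and the deprecated alias can be reduced to one shape, here is a minimal Python sketch. The helper name `normalize_coverage` is hypothetical, and letting the canonical key win when both keys are present is an assumption, not documented Potato behavior.

```python
def normalize_coverage(config):
    """Return the per-item coverage settings as a mapping, or None if unset."""
    # Assumption: the canonical key takes precedence over the deprecated alias.
    raw = config.get("num_annotators_per_item")
    if raw is None:
        # The deprecated alias only ever carries a plain integer.
        raw = config.get("max_annotations_per_item")
    if raw is None:
        return None
    if isinstance(raw, int):
        return {"default": raw}  # uniform cap
    return dict(raw)  # structured mapping: default, overlap_sample, adaptive, ...


print(normalize_coverage({"max_annotations_per_item": 3}))  # {'default': 3}
print(normalize_coverage({"num_annotators_per_item": {"default": 1}}))  # {'default': 1}
```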

### Overlap sample

The `overlap_sample` block raises the cap on a deterministic subset of items for quality monitoring. Sampling happens once at startup, and the chosen items are stamped with `required_annotations` so the assignment logic treats them as high-coverage.

| Field | Type | Description |
|-------|------|-------------|
| `fraction` | float in (0, 1] | proportion of items to sample |
| `count` | int ≥ 2 | annotator cap for sampled items (must exceed `default`) |
| `stratify_by` | string (optional) | item-data field used to stratify the sample |
| `seed` | int (optional) | RNG seed; defaults to the global `random_seed` |

When `stratify_by` is set, the fraction is applied per stratum, so every category contributes proportionally.
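The per-stratum sampling described above can be sketched in plain Python. This is an illustration only: the function name `pick_overlap_items`, the `id` field, and the rounding rule are assumptions, and Potato's own sampler may differ in all three.

```python
import random
from collections import defaultdict


def pick_overlap_items(items, fraction, stratify_by=None, seed=0):
    """Return the set of item ids chosen for the overlap sample."""
    rng = random.Random(seed)  # fixed seed, so every startup picks the same items
    strata = defaultdict(list)
    for item in items:
        key = item.get(stratify_by) if stratify_by else None
        strata[key].append(item["id"])

    chosen = set()
    for key in sorted(strata, key=str):  # stable stratum order keeps the result deterministic
        ids = sorted(strata[key])
        # Assumed rounding rule: nearest integer, at least one item per stratum.
        k = min(len(ids), max(1, round(fraction * len(ids))))
        chosen.update(rng.sample(ids, k))
    return chosen


items = [
    {"id": f"item{i:02d}", "domain": "news" if i < 10 else "reviews"}
    for i in range(20)
]
sample = pick_overlap_items(items, fraction=0.2, stratify_by="domain", seed=42)
print(len(sample))  # 4: two items from each domain
print(sorted(sample))
```

Because the fraction is applied inside each stratum, a small category cannot be crowded out of the sample by a large one.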

### Adaptive boost

Adaptive boost expands the cap on an item whose early annotators disagreed. Once an item has at least two annotations and its disagreement score crosses `disagreement_threshold`, its cap is raised to `boost_to` and the item re-enters the assignment queue. The boost is one-shot per item.
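A minimal sketch of the one-shot boost follows. The disagreement score used here (the share of annotations that differ from the majority label) is an assumed stand-in, since this page does not define the score. Treating a score equal to the threshold as crossing it is also an assumption, and the `labels`, `cap`, and `boosted` fields are hypothetical.

```python
from collections import Counter


def disagreement(labels):
    """Share of annotations that differ from the majority label (assumed metric)."""
    majority_count = Counter(labels).most_common(1)[0][1]
    return 1 - majority_count / len(labels)


def maybe_boost(item, threshold, boost_to):
    """Raise the item's cap once if early annotators disagree; return True if boosted."""
    if item.get("boosted") or len(item["labels"]) < 2:
        return False  # already boosted, or too few annotations to judge
    if disagreement(item["labels"]) < threshold:
        return False
    item["cap"] = max(item["cap"], boost_to)
    item["boosted"] = True  # one-shot: the same item is never boosted twice
    return True


item = {"labels": ["positive", "negative"], "cap": 2}
print(maybe_boost(item, threshold=0.5, boost_to=3))  # True
print(item["cap"])  # 3
print(maybe_boost(item, threshold=0.5, boost_to=3))  # False
```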

## Per-annotator quota

`per_annotator_quota` controls how many items each annotator is assigned, independently of the per-item caps:

```yaml
per_annotator_quota:
  default: 100
  by_user:
    alice: 30
  by_user_role:
    expert: 30
    novice: 200

user_roles:
  alice: expert
  carol: novice
```

Resolution order: `by_user[uid]` → `by_user_role[user_roles[uid]]` → `default`.
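The lookup chain can be written out as a short Python function. The name `resolve_quota` is hypothetical; the dictionaries mirror the YAML above.

```python
def resolve_quota(uid, quota_cfg, user_roles):
    """Resolve a user's quota: by_user, then by_user_role, then default."""
    by_user = quota_cfg.get("by_user", {})
    if uid in by_user:
        return by_user[uid]
    role = user_roles.get(uid)  # None when the user has no role entry
    by_role = quota_cfg.get("by_user_role", {})
    if role in by_role:
        return by_role[role]
    return quota_cfg["default"]


quota_cfg = {
    "default": 100,
    "by_user": {"alice": 30},
    "by_user_role": {"expert": 30, "novice": 200},
}
user_roles = {"alice": "expert", "carol": "novice"}

print(resolve_quota("alice", quota_cfg, user_roles))  # 30, from by_user
print(resolve_quota("carol", quota_cfg, user_roles))  # 200, from by_user_role
print(resolve_quota("bob", quota_cfg, user_roles))    # 100, from default
```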

## Adjudication auto-routing

When the adjudication block is enabled, overlap-sample items that reach their cap are scored automatically and pushed into the adjudication queue if agreement falls below `agreement_threshold`. Low-agreement items therefore surface as soon as the sample saturates, rather than waiting for an adjudicator to rebuild the queue manually.

```yaml
adjudication:
  enabled: true
  adjudicator_users: [admin]
  min_annotations: 2
  agreement_threshold: 0.75
```
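The routing decision can be sketched as below. The agreement score here (the fraction of annotator pairs that chose the same label) is an assumption made for illustration, as this page does not say how Potato scores agreement. The function names are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(labels):
    """Fraction of annotator pairs that chose the same label (assumed score)."""
    pairs = list(combinations(labels, 2))
    return sum(a == b for a, b in pairs) / len(pairs)


def needs_adjudication(labels, cap, min_annotations=2, agreement_threshold=0.75):
    """Return True when a saturated item should be queued for adjudication."""
    if len(labels) < 2 or len(labels) < max(cap, min_annotations):
        return False  # not saturated yet, or nothing to compare
    return pairwise_agreement(labels) < agreement_threshold


print(needs_adjudication(["pos", "pos", "neg"], cap=3))  # True: 1 of 3 pairs agree
print(needs_adjudication(["pos", "pos", "pos"], cap=3))  # False: full agreement
print(needs_adjudication(["pos", "neg"], cap=3))         # False: cap not reached
```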

## Inspecting agreement

Once overlap-sample items saturate, agreement statistics are available at `/admin/iaa`, which computes the metric set appropriate to each schema's `annotation_type`: for example, Cohen's and Fleiss' kappa for nominal schemes, weighted kappa for ordinal ones, and token-level kappa plus span F1 for spans. See the [inter-annotator agreement guide](/docs/guides/inter-annotator-agreement) for what these metrics mean.
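For intuition about what a kappa value measures, here is a small standalone computation of Cohen's kappa for two annotators on nominal labels. It is the textbook formula, not Potato's implementation, and the function name is hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in the same order."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


a = ["yes", "yes", "no", "no"]
b = ["yes", "no", "no", "no"]
print(cohens_kappa(a, b))  # 0.5
```

Here the annotators agree on 3 of 4 items (0.75 observed), chance agreement is 0.5, and kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.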

## Example

A runnable demonstration lives at `examples/advanced/heterogeneous-coverage/`. From the repository root:

```bash
python potato/flask_server.py start examples/advanced/heterogeneous-coverage/config.yaml -p 8000
```

It uses 20 items across two domains, samples 20% for 3-annotator overlap stratified by domain, enables an adaptive boost at threshold 0.5, defines two expertise tiers, and routes low-agreement items into adjudication.

## Related

- [Task Assignment](/docs/deployment/task-assignment) — assignment strategies
- [Inter-annotator agreement guide](/docs/guides/inter-annotator-agreement) — the metrics behind `/admin/iaa`
- [Crowdsourcing](/docs/deployment/crowdsourcing) — MTurk and Prolific integration

For implementation details, see the [source documentation](https://github.com/davidjurgens/potato/blob/main/docs/advanced/heterogeneous_coverage.md).
