# MACE Competence Estimation

Source: https://www.potatoannotator.com/docs/features/mace

MACE (Multi-Annotator Competence Estimation) is a Variational Bayes EM algorithm that jointly estimates **true labels** for each item and **annotator competence** scores. It models each annotation as coming from an annotator who is either "knowing" (produces the correct label) or "guessing" (produces a random label). An annotator's competence score, between 0.0 and 1.0, is the estimated probability that they are knowing rather than guessing.

## When to Use MACE

MACE is useful when you have multiple annotators labeling the same items and want to:

- Identify which annotators are most reliable
- Produce higher-quality predicted labels by weighting annotator contributions
- Detect low-quality annotators (spammers) automatically
- Measure label uncertainty (entropy) per item

MACE works with categorical annotation types: `radio`, `likert`, `select`, and `multiselect`. It does not apply to free-text, span, slider, or numeric annotations.

## How It Works

1. **Data extraction**: Potato collects all annotations for each schema across all annotators, building an items-by-annotators matrix
2. **EM algorithm**: MACE runs several random restarts (`num_restarts`) of the Variational Bayes EM algorithm, each for `num_iters` iterations, and keeps the solution with the highest log-likelihood
3. **Output**: For each schema, MACE produces predicted labels, label entropy (uncertainty), and per-annotator competence scores
4. **Triggering**: MACE runs automatically after every N new annotations (set by `trigger_every_n`), or can be triggered manually via the admin API
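
To make the knowing/guessing model and the restart loop concrete, here is a minimal sketch in pure Python. It is not Potato's implementation: it uses plain EM with a small smoothing constant instead of Variational Bayes with `alpha`/`beta` priors, and the names `run_em` and `mace_sketch` are hypothetical.

```python
import math
import random

def run_em(annotations, labels, num_iters=50, seed=0):
    """One simplified EM run. annotations: {item_id: {annotator_id: label}}."""
    rng = random.Random(seed)
    annotators = sorted({a for votes in annotations.values() for a in votes})
    # Random start: competence theta per annotator, uniform guessing distribution xi.
    theta = {a: rng.uniform(0.3, 0.9) for a in annotators}
    xi = {a: {l: 1.0 / len(labels) for l in labels} for a in annotators}
    smooth = 0.01  # assumption: simple additive smoothing, not the VB priors
    posteriors, log_lik = {}, 0.0
    for _ in range(num_iters):
        know = {a: smooth for a in annotators}
        guess = {a: {l: smooth for l in labels} for a in annotators}
        posteriors, log_lik = {}, 0.0
        for item, votes in annotations.items():
            # E-step: posterior over the item's true label (uniform prior).
            joint = {}
            for t in labels:
                p = 1.0 / len(labels)
                for a, lab in votes.items():
                    p *= theta[a] * (lab == t) + (1 - theta[a]) * xi[a][lab]
                joint[t] = p
            total = sum(joint.values())
            log_lik += math.log(total)
            post = {t: p / total for t, p in joint.items()}
            posteriors[item] = post
            # Expected "knowing" vs "guessing" counts for each annotation.
            for a, lab in votes.items():
                p_know = theta[a] / (theta[a] + (1 - theta[a]) * xi[a][lab])
                know[a] += post[lab] * p_know
                guess[a][lab] += 1.0 - post[lab] * p_know
        # M-step: re-estimate competence and guessing distributions.
        for a in annotators:
            guess_total = sum(guess[a].values())
            theta[a] = know[a] / (know[a] + guess_total)
            xi[a] = {l: c / guess_total for l, c in guess[a].items()}
    # log_lik and posteriors come from the final E-step.
    return log_lik, theta, posteriors

def mace_sketch(annotations, labels, num_restarts=10, num_iters=50):
    """Run several restarts and keep the one with the highest log-likelihood."""
    runs = [run_em(annotations, labels, num_iters, seed=s) for s in range(num_restarts)]
    _, theta, posteriors = max(runs, key=lambda run: run[0])
    predicted = {item: max(post, key=post.get) for item, post in posteriors.items()}
    return predicted, theta
```

On a toy dataset where two annotators always agree with each other and a third always answers the same label, this sketch should assign high competence to the first two and a competence near zero to the third.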

## Configuration

```yaml
mace:
  enabled: true

  # Run MACE after every N new annotations
  trigger_every_n: 10

  # Minimum annotators per item before the item is included (must be >= 2)
  min_annotations_per_item: 3

  # Minimum eligible items before MACE will run
  min_items: 5

  # EM algorithm parameters
  num_restarts: 10
  num_iters: 50
  alpha: 0.5    # Prior for annotator spamming (Beta distribution)
  beta: 0.5     # Prior for guessing strategy (Dirichlet distribution)
```

### Minimal Configuration

```yaml
mace:
  enabled: true
```

This uses all defaults: MACE runs after every 10 new annotations, includes only items with at least 3 annotators, requires at least 5 eligible items, and performs 10 restarts of 50 iterations each.

## Configuration Reference

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `enabled` | boolean | `false` | Enable MACE |
| `trigger_every_n` | integer | `10` | Run after every N new annotations |
| `min_annotations_per_item` | integer | `3` | Minimum annotators per item (must be >= 2) |
| `min_items` | integer | `5` | Minimum eligible items before running |
| `num_restarts` | integer | `10` | Random restarts for EM |
| `num_iters` | integer | `50` | EM iterations per restart |
| `alpha` | float | `0.5` | Prior for annotator spamming |
| `beta` | float | `0.5` | Prior for guessing strategy |

## Admin API Endpoints

All MACE endpoints require admin authentication via the `X-API-Key` header.

### Overview

```bash
curl http://localhost:8000/admin/api/mace/overview \
  -H "X-API-Key: your-admin-key"
```

Returns annotator competence scores and MACE status:

```json
{
  "enabled": true,
  "has_results": true,
  "schemas": ["sentiment"],
  "annotator_competence": {
    "user_1": {"average": 0.92, "per_schema": {"sentiment": 0.92}},
    "user_2": {"average": 0.85, "per_schema": {"sentiment": 0.85}},
    "user_3": {"average": 0.45, "per_schema": {"sentiment": 0.45}}
  },
  "total_annotations": 30,
  "annotations_until_next_run": 0
}
```
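
A response shaped like the example above can be post-processed with a few lines of Python. The sketch below assumes exactly the fields shown in that example; the helpers `rank_annotators` and `flag_low_competence` are hypothetical and not part of Potato.

```python
import json

def rank_annotators(overview, schema=None):
    """Return (annotator, score) pairs sorted from most to least competent."""
    scores = {}
    for user, entry in overview["annotator_competence"].items():
        if schema is None:
            scores[user] = entry["average"]
        elif schema in entry["per_schema"]:
            scores[user] = entry["per_schema"][schema]
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

def flag_low_competence(overview, threshold=0.5, schema=None):
    """List annotators whose score falls below the threshold."""
    return [user for user, score in rank_annotators(overview, schema) if score < threshold]

# Trimmed copy of the example response above.
response_text = """
{
  "enabled": true,
  "has_results": true,
  "schemas": ["sentiment"],
  "annotator_competence": {
    "user_1": {"average": 0.92, "per_schema": {"sentiment": 0.92}},
    "user_2": {"average": 0.85, "per_schema": {"sentiment": 0.85}},
    "user_3": {"average": 0.45, "per_schema": {"sentiment": 0.45}}
  }
}
"""
overview = json.loads(response_text)
ranking = rank_annotators(overview, schema="sentiment")
flagged = flag_low_competence(overview)
```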

### Predictions

```bash
curl "http://localhost:8000/admin/api/mace/predictions?schema=sentiment" \
  -H "X-API-Key: your-admin-key"
```

Returns predicted labels and entropy for each item.

### Manual Trigger

```bash
curl -X POST http://localhost:8000/admin/api/mace/trigger \
  -H "X-API-Key: your-admin-key"
```
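
The three `curl` calls above can also be issued from Python's standard library. This sketch only builds the requests; the helper name `mace_request` is hypothetical, and the base URL and key are the placeholder values used in the examples above.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000/admin/api/mace"

def mace_request(endpoint, api_key, method="GET", **params):
    """Build an authenticated request for one of the MACE admin endpoints."""
    url = f"{BASE_URL}/{endpoint}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"X-API-Key": api_key}, method=method)

overview_req = mace_request("overview", "your-admin-key")
predictions_req = mace_request("predictions", "your-admin-key", schema="sentiment")
trigger_req = mace_request("trigger", "your-admin-key", method="POST")

# To send one of them (requires a running Potato server):
# with urllib.request.urlopen(trigger_req) as resp:
#     print(resp.status, resp.read().decode())
```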

## Interpreting Results

### Annotator Competence

- **0.9 - 1.0**: Highly reliable annotator
- **0.7 - 0.9**: Good annotator, occasional disagreements
- **0.5 - 0.7**: Moderate annotator, may benefit from additional training
- **Below 0.5**: Potential spammer or confused annotator
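
The bands above translate directly into a small lookup. This is a sketch with a hypothetical helper, `competence_band`; assigning each boundary value (0.5, 0.7, 0.9) to the higher band is an assumption, since the ranges above share their endpoints.

```python
def competence_band(score):
    """Map a competence score in [0.0, 1.0] to the descriptive bands above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("competence must be between 0.0 and 1.0")
    if score >= 0.9:
        return "highly reliable"
    if score >= 0.7:
        return "good"
    if score >= 0.5:
        return "moderate"
    return "potential spammer or confused"

bands = {user: competence_band(s) for user, s in
         {"user_1": 0.92, "user_2": 0.85, "user_3": 0.45}.items()}
```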

### Label Entropy

- **Near 0.0**: High confidence in the predicted label
- **Above 0.5**: Moderate uncertainty, item may be genuinely ambiguous
- **Near log(num_labels)**: Maximum uncertainty, no consensus
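
To see where these reference points come from, the sketch below computes Shannon entropy for a label distribution. It assumes the natural logarithm, which is consistent with the `log(num_labels)` maximum above but is not confirmed by this page; `label_entropy` and `normalized_entropy` are hypothetical helpers.

```python
import math

def label_entropy(probs):
    """Shannon entropy (natural log) of a distribution over labels."""
    return sum(-p * math.log(p) for p in probs if p > 0)

def normalized_entropy(probs):
    """Entropy divided by its maximum, log(num_labels), giving a value in [0, 1]."""
    return label_entropy(probs) / math.log(len(probs))

confident = label_entropy([1.0, 0.0, 0.0])      # all mass on one label
ambiguous = label_entropy([0.5, 0.5, 0.0])      # split between two labels
no_consensus = label_entropy([1/3, 1/3, 1/3])   # uniform over three labels
```

Because the maximum depends on the number of labels, dividing by `log(num_labels)` as in `normalized_entropy` can make scores comparable across schemas with different label counts.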

## Adjudication Integration

When both MACE and adjudication are enabled, MACE's predicted labels appear as an additional signal in the adjudication interface:

```yaml
adjudication:
  enabled: true
  adjudicator_users: ["admin"]
  min_annotations: 2

mace:
  enabled: true
  trigger_every_n: 10
  min_annotations_per_item: 2
```

## Best Practices

1. **Start with defaults** - the default configuration works well for most scenarios
2. **Monitor competence scores** - use the admin dashboard to track annotator quality over time
3. **Combine with training phases** - use training to qualify annotators, then MACE to monitor ongoing quality
4. **Set appropriate thresholds** - lower `min_annotations_per_item` for smaller annotation projects (the minimum allowed value is 2)

## Further Reading

- [Quality Control](/docs/features/quality-control) - Other quality control mechanisms
- [Admin Dashboard](/docs/features/admin-dashboard) - Monitoring annotation progress
- [AI Support](/docs/features/ai-support) - AI-assisted annotation

For implementation details, see the [source documentation](https://github.com/davidjurgens/potato/blob/main/docs/mace.md).