# Coreference Chains

Source: https://www.potatoannotator.com/docs/annotation-types/coreference

Coreference annotation lets annotators group text spans that refer to the same entity. The resulting chains support entity resolution, pronoun resolution, and discourse analysis.

## Overview

A coreference chain is a collection of mentions (text spans) that all refer to the same real-world entity. For example:

> "**Marie Curie** was a physicist. **She** won the Nobel Prize. **The scientist** changed **her** field forever."

The spans "Marie Curie", "She", "The scientist", and "her" all refer to the same person and form a single coreference chain.
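As a rough illustration of the idea, a chain can be modeled as a list of character-offset spans over the text. The sketch below is not Potato's internal representation; the `Mention` class and `find_mention` helper are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A text span (hypothetical structure, for illustration only)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    text: str


def find_mention(document: str, surface: str, search_from: int = 0) -> Mention:
    """Locate the first occurrence of `surface` at or after `search_from`."""
    start = document.index(surface, search_from)
    return Mention(start, start + len(surface), surface)


document = (
    "Marie Curie was a physicist. She won the Nobel Prize. "
    "The scientist changed her field forever."
)

# One coreference chain: every mention refers to the same person.
chain = [
    find_mention(document, "Marie Curie"),
    find_mention(document, "She"),
    find_mention(document, "The scientist"),
    find_mention(document, "her"),
]

for mention in chain:
    print(mention.start, mention.end, repr(mention.text))
```

Storing offsets rather than just the surface text keeps repeated strings (for example, two separate occurrences of "she") distinguishable.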

## Quick Start

Coreference annotation requires two schema components:

1. A **span schema** for creating mentions
2. A **coreference schema** for grouping mentions into chains

```yaml
annotation_schemes:
  - annotation_type: span
    name: mentions
    description: Highlight all entity mentions
    labels:
      - name: MENTION
        tooltip: "Any reference to an entity"
    sequential_key_binding: true

  - annotation_type: coreference
    name: coref_chains
    description: Group mentions that refer to the same entity
    span_schema: mentions
    allow_singletons: true
```
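Because the coreference schema depends on the span schema by name, a mismatch between `span_schema` and the span schema's `name` is an easy mistake to make. The following is a minimal sketch of a pre-flight check, assuming the configuration has already been loaded into a Python list of dictionaries. The function `check_span_schema_refs` is hypothetical and is not part of Potato.

```python
def check_span_schema_refs(annotation_schemes: list[dict]) -> list[str]:
    """Return one error message per coreference schema whose
    `span_schema` does not name a span schema in the same config."""
    span_names = {
        scheme["name"]
        for scheme in annotation_schemes
        if scheme.get("annotation_type") == "span"
    }
    errors = []
    for scheme in annotation_schemes:
        if scheme.get("annotation_type") != "coreference":
            continue
        target = scheme.get("span_schema")
        if target not in span_names:
            errors.append(
                f"{scheme.get('name')}: span_schema {target!r} "
                "does not match any span schema"
            )
    return errors


# Mirrors the Quick Start configuration, written as a Python literal.
annotation_schemes = [
    {
        "annotation_type": "span",
        "name": "mentions",
        "description": "Highlight all entity mentions",
        "labels": [{"name": "MENTION", "tooltip": "Any reference to an entity"}],
        "sequential_key_binding": True,
    },
    {
        "annotation_type": "coreference",
        "name": "coref_chains",
        "description": "Group mentions that refer to the same entity",
        "span_schema": "mentions",
        "allow_singletons": True,
    },
]

print(check_span_schema_refs(annotation_schemes))
```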

## Configuration Options

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `annotation_type` | string | Required | Must be `"coreference"` |
| `name` | string | Required | Unique identifier for this schema |
| `description` | string | Required | Instructions displayed to annotators |
| `span_schema` | string | Required | Name of the span schema providing mentions |
| `entity_types` | list | `[]` | Entity type categories that can be assigned to chains |
| `allow_singletons` | boolean | `true` | Allow chains with only one mention |
| `visual_display.highlight_mode` | string | `"background"` | Visual style: `"background"`, `"bracket"`, or `"underline"` |
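The table above can be read as a small validation-and-defaults routine. The sketch below applies the documented defaults and rejects values the table does not list. It assumes a schema held as a Python dictionary; `normalize_coref_schema` is a hypothetical helper, not Potato's own loader, whose actual validation behavior may differ.

```python
REQUIRED_FIELDS = ("annotation_type", "name", "description", "span_schema")
HIGHLIGHT_MODES = {"background", "bracket", "underline"}


def normalize_coref_schema(schema: dict) -> dict:
    """Return a copy of `schema` with the documented defaults filled in.

    Raises ValueError for a missing required field, a wrong
    annotation_type, or an unlisted highlight_mode.
    """
    missing = [field for field in REQUIRED_FIELDS if field not in schema]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    if schema["annotation_type"] != "coreference":
        raise ValueError('annotation_type must be "coreference"')

    visual = dict(schema.get("visual_display") or {})
    visual.setdefault("highlight_mode", "background")
    if visual["highlight_mode"] not in HIGHLIGHT_MODES:
        raise ValueError(f"unknown highlight_mode: {visual['highlight_mode']!r}")

    normalized = dict(schema)
    normalized.setdefault("entity_types", [])
    normalized.setdefault("allow_singletons", True)
    normalized["visual_display"] = visual
    return normalized


minimal = {
    "annotation_type": "coreference",
    "name": "coref_chains",
    "description": "Group mentions that refer to the same entity",
    "span_schema": "mentions",
}
print(normalize_coref_schema(minimal))
```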

## Examples

### With Entity Types

Classify chains by entity type:

```yaml
annotation_schemes:
  - annotation_type: span
    name: ner
    description: Mark named entities
    labels:
      - name: ENTITY
        tooltip: "Any named entity mention"

  - annotation_type: coreference
    name: coref
    description: Create coreference chains
    span_schema: ner
    entity_types:
      - name: PERSON
        color: "#6E56CF"
      - name: ORGANIZATION
        color: "#22C55E"
      - name: LOCATION
        color: "#3B82F6"
      - name: OTHER
        color: "#F59E0B"
```

### Without Singletons

For tasks where every mention must link to at least one other mention, set `allow_singletons` to `false`:

```yaml
annotation_schemes:
  - annotation_type: span
    name: mentions
    description: Highlight co-referring mentions
    labels:
      - name: MENTION

  - annotation_type: coreference
    name: strict_coref
    description: All mentions must be part of a chain with at least 2 mentions
    span_schema: mentions
    allow_singletons: false
```

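### Custom Visual Display

Change how chain members are highlighted with `visual_display.highlight_mode`:

```yaml
annotation_schemes:
  # Span schema that provides the mentions referenced below
  - annotation_type: span
    name: mentions
    description: Highlight all entity mentions
    labels:
      - name: MENTION

  - annotation_type: coreference
    name: coref
    description: Link coreference chains
    span_schema: mentions
    visual_display:
      highlight_mode: "underline"  # Options: background, bracket, underline
```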

## User Interface

### Creating Chains

1. **Create mentions**: Use the span annotation tool to highlight all entity mentions
2. **Select mentions**: Click on the highlighted spans you want to chain together
3. **Create chain**: Click "New Chain" to group the selected mentions

### Managing Chains

- **Add to Chain**: Select additional mentions and click "Add to Chain"
- **Merge Chains**: Select multiple chains and click "Merge Chains" to combine them
- **Remove Mention**: Select a mention and click "Remove Mention" to remove it from its chain
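Conceptually, these operations are set manipulations over span IDs. The following sketch models each chain as a Python set. It only illustrates the semantics and is not how the Potato interface is implemented; all function names and the short IDs `m1` to `m4` are hypothetical.

```python
def add_to_chain(chains: list[set[str]], index: int, span_ids) -> list[set[str]]:
    """Return new chains with `span_ids` added to the chain at `index`."""
    updated = [set(chain) for chain in chains]
    updated[index].update(span_ids)
    return updated


def merge_chains(chains: list[set[str]], indices) -> list[set[str]]:
    """Return new chains with the chains at `indices` combined into one."""
    selected = set(indices)
    merged = set().union(*(chains[i] for i in selected))
    kept = [set(chain) for i, chain in enumerate(chains) if i not in selected]
    kept.append(merged)
    return kept


def remove_mention(chains: list[set[str]], span_id: str) -> list[set[str]]:
    """Return new chains without `span_id`; chains left empty are dropped."""
    remaining = (chain - {span_id} for chain in chains)
    return [chain for chain in remaining if chain]


chains = [{"m1", "m2"}, {"m3"}, {"m4"}]
chains = add_to_chain(chains, 1, ["m2b"])   # chain 1 now holds m3 and m2b
chains = merge_chains(chains, [0, 1])       # m1, m2, m3, m2b become one chain
chains = remove_mention(chains, "m4")       # the now-empty chain disappears
print(chains)
```

Each function returns a new list instead of mutating its input, which makes the steps easy to test and to undo.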

### Color Coding

Each chain is automatically assigned a distinct color. Mentions in the same chain share the same color, making it easy to visually identify chain membership.

## Output Format

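Coreference annotations are saved as span links. Each entry in `span_links` describes one chain: `schema` names the coreference schema, `span_ids` lists the mentions in the chain, and `entity_type` gives the chain's entity type: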

```json
{
  "span_links": [
    {
      "schema": "coref_chains",
      "link_type": "coreference",
      "span_ids": ["mentions_0_5_MENTION", "mentions_34_37_MENTION", "mentions_72_85_MENTION"],
      "entity_type": "PERSON"
    },
    {
      "schema": "coref_chains",
      "link_type": "coreference",
      "span_ids": ["mentions_15_23_MENTION", "mentions_95_97_MENTION"],
      "entity_type": "ORGANIZATION"
    }
  ]
}
```
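For downstream processing, it is often convenient to turn this structure into a lookup from span ID to chain. The sketch below does so with the standard library, using the example output above as input. It treats span IDs as opaque strings and makes no assumption about how they are constructed; the variable names are illustrative.

```python
import json
from collections import defaultdict

raw = """
{
  "span_links": [
    {
      "schema": "coref_chains",
      "link_type": "coreference",
      "span_ids": ["mentions_0_5_MENTION", "mentions_34_37_MENTION", "mentions_72_85_MENTION"],
      "entity_type": "PERSON"
    },
    {
      "schema": "coref_chains",
      "link_type": "coreference",
      "span_ids": ["mentions_15_23_MENTION", "mentions_95_97_MENTION"],
      "entity_type": "ORGANIZATION"
    }
  ]
}
"""

data = json.loads(raw)

chain_of: dict[str, int] = {}              # span ID -> chain index
chains_by_type = defaultdict(list)         # entity type -> chain indices

for index, link in enumerate(data["span_links"]):
    if link["link_type"] != "coreference":
        continue  # skip any other kinds of span link
    chains_by_type[link.get("entity_type")].append(index)
    for span_id in link["span_ids"]:
        chain_of[span_id] = index

# Two spans corefer exactly when they map to the same chain index.
same = chain_of["mentions_0_5_MENTION"] == chain_of["mentions_34_37_MENTION"]
print(same, dict(chains_by_type))
```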

## Recommended Workflow

1. **First pass** - Read through the text and highlight all entity mentions
2. **Second pass** - Group mentions into coreference chains
3. **Review** - Check that all mentions are correctly assigned and no chains are missing
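Part of the review step can be automated once annotations are exported. The sketch below compares the full set of span IDs against the chains and reports mentions that were never assigned, mentions that appear in more than one chain (usually a mistake in coreference annotation), and chain members that do not correspond to any known span. The function `review_assignments` and the short IDs are hypothetical.

```python
from collections import Counter


def review_assignments(all_span_ids, span_links: list[dict]) -> dict[str, list[str]]:
    """Summarize assignment problems for a single annotated item."""
    # Count, per span ID, how many distinct chains contain it.
    counts = Counter(
        span_id for link in span_links for span_id in set(link["span_ids"])
    )
    known = set(all_span_ids)
    linked = set(counts)
    return {
        "unassigned": sorted(known - linked),
        "in_multiple_chains": sorted(s for s, n in counts.items() if n > 1),
        "unknown_spans": sorted(linked - known),
    }


all_span_ids = ["m1", "m2", "m3", "m4"]
span_links = [
    {"link_type": "coreference", "span_ids": ["m1", "m2"]},
    {"link_type": "coreference", "span_ids": ["m2", "m5"]},
]

report = review_assignments(all_span_ids, span_links)
print(report)
```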

## Best Practices

1. **Define clear mention boundaries** - establish guidelines for what counts as a mention
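2. **Handle nested mentions** - decide how to annotate mentions that contain other mentions, such as "Microsoft" inside "the CEO of Microsoft"
3. **Consider generic references** - decide whether non-specific references (for example, "scientists" in "scientists often disagree") belong in chains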
4. **Train annotators** - coreference is complex; provide examples and practice rounds
5. **Use entity types sparingly** - too many can slow annotation without improving data quality

## Further Reading

- [Span Annotation](/docs/annotation-types/span-annotation) - Creating text spans
- [Entity Linking](/docs/annotation-types/entity-linking) - Linking spans to knowledge bases
- [Span Linking](/docs/annotation-types/span-linking) - Other types of span relationships

For implementation details, see the [source documentation](https://github.com/davidjurgens/potato/blob/main/docs/coreference_annotation.md).
