Coreference Chains
Group text spans that refer to the same entity for coreference resolution tasks.
Coreference annotation allows annotators to group text spans that refer to the same entity. This is essential for entity resolution, pronoun resolution, and discourse analysis.
Overview
A coreference chain is a collection of mentions (text spans) that all refer to the same real-world entity. For example:
"Marie Curie was a physicist. She won the Nobel Prize. The scientist changed her field forever."
The spans "Marie Curie", "She", "The scientist", and "her" all refer to the same person and form a single coreference chain.
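To make the example concrete, here is a minimal Python sketch that locates each mention in the sentence above and stores the chain as a list of character-offset spans. The `Mention` class and the `find_mentions` helper are hypothetical names used for illustration only; they are not part of the annotation tool.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mention:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    text: str

TEXT = ("Marie Curie was a physicist. She won the Nobel Prize. "
        "The scientist changed her field forever.")

def find_mentions(text, surface_forms):
    """Locate each surface form in order, scanning left to right."""
    mentions, cursor = [], 0
    for form in surface_forms:
        start = text.index(form, cursor)  # raises ValueError if absent
        end = start + len(form)
        mentions.append(Mention(start, end, form))
        cursor = end  # the next search starts after this mention
    return mentions

# One chain: every mention refers to the same person.
chain = find_mentions(TEXT, ["Marie Curie", "She", "The scientist", "her"])
for m in chain:
    print(m.start, m.end, repr(m.text))
```

Storing offsets rather than bare strings keeps repeated words (for example, two separate occurrences of "she") distinguishable as separate mentions.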
Quick Start
Coreference annotation requires two schema components:
- A span schema for creating mentions
- A coreference schema for grouping mentions into chains
annotation_schemes:
  - annotation_type: span
    name: mentions
    description: Highlight all entity mentions
    labels:
      - name: MENTION
        tooltip: "Any reference to an entity"
    sequential_key_binding: true
  - annotation_type: coreference
    name: coref_chains
    description: Group mentions that refer to the same entity
    span_schema: mentions
    allow_singletons: true
Configuration Options
| Field | Type | Default | Description |
|---|---|---|---|
| annotation_type | string | Required | Must be "coreference" |
| name | string | Required | Unique identifier for this schema |
| description | string | Required | Instructions displayed to annotators |
| span_schema | string | Required | Name of the span schema providing mentions |
| entity_types | list | [] | List of entity type categories |
| allow_singletons | boolean | true | Allow chains with only one mention |
| visual_display.highlight_mode | string | "background" | Visual style: "background", "bracket", or "underline" |
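The table can be turned into a simple pre-flight check. The sketch below is not part of the tool: it assumes the configuration has already been loaded into Python dictionaries, and `validate_coref_schema` is a hypothetical helper that applies the required fields, defaults, and allowed values listed above. The check that `span_schema` names a span schema in the same config is an assumption based on the field's description.

```python
REQUIRED = ("annotation_type", "name", "description", "span_schema")
HIGHLIGHT_MODES = {"background", "bracket", "underline"}

def validate_coref_schema(schema, all_schemes):
    """Return a list of problems found in one coreference schema dict."""
    problems = [f"missing required field: {f}" for f in REQUIRED if f not in schema]
    if schema.get("annotation_type") != "coreference":
        problems.append('annotation_type must be "coreference"')
    # Assumption: span_schema must name a span schema defined in the same config.
    span_names = {s.get("name") for s in all_schemes
                  if s.get("annotation_type") == "span"}
    if "span_schema" in schema and schema["span_schema"] not in span_names:
        problems.append(f"span_schema {schema['span_schema']!r} is not a defined span schema")
    # Optional fields fall back to the defaults from the table.
    if not isinstance(schema.get("entity_types", []), list):
        problems.append("entity_types must be a list")
    if not isinstance(schema.get("allow_singletons", True), bool):
        problems.append("allow_singletons must be a boolean")
    mode = schema.get("visual_display", {}).get("highlight_mode", "background")
    if mode not in HIGHLIGHT_MODES:
        problems.append(f"unknown highlight_mode: {mode!r}")
    return problems

schemes = [
    {"annotation_type": "span", "name": "mentions",
     "description": "Highlight all entity mentions"},
    {"annotation_type": "coreference", "name": "coref_chains",
     "description": "Group mentions that refer to the same entity",
     "span_schema": "mentions", "allow_singletons": True},
]
print(validate_coref_schema(schemes[1], schemes))  # an empty list means no problems
```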
Examples
With Entity Types
Classify chains by entity type:
annotation_schemes:
  - annotation_type: span
    name: ner
    description: Mark named entities
    labels:
      - name: ENTITY
        tooltip: "Any named entity mention"
  - annotation_type: coreference
    name: coref
    description: Create coreference chains
    span_schema: ner
    entity_types:
      - name: PERSON
        color: "#6E56CF"
      - name: ORGANIZATION
        color: "#22C55E"
      - name: LOCATION
        color: "#3B82F6"
      - name: OTHER
        color: "#F59E0B"
Without Singletons
For tasks where every mention must link to at least one other mention:
annotation_schemes:
  - annotation_type: span
    name: mentions
    description: Highlight co-referring mentions
    labels:
      - name: MENTION
  - annotation_type: coreference
    name: strict_coref
    description: All mentions must be part of a chain with at least 2 mentions
    span_schema: mentions
    allow_singletons: false
Custom Visual Display
annotation_schemes:
  - annotation_type: coreference
    name: coref
    description: Link coreference chains
    span_schema: mentions
    visual_display:
      highlight_mode: "underline"  # Options: background, bracket, underline
User Interface
Creating Chains
- Create mentions: Use the span annotation tool to highlight all entity mentions
- Select mentions: Click on the highlighted spans you want to chain together
- Create chain: Click "New Chain" to group the selected mentions
Managing Chains
- Add to Chain: Select additional mentions and click "Add to Chain"
- Merge Chains: Select multiple chains and click "Merge Chains" to combine them
- Remove Mention: Select a mention and click "Remove Mention" to remove it from its chain
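As a rough mental model, the three operations above correspond to simple list operations on the mention IDs in each chain. The following sketch is illustrative only: the function names are hypothetical and do not reflect how the interface is implemented. It also shows where the `allow_singletons` setting could matter, since removing a mention may leave a chain with a single member.

```python
def add_to_chain(chain, mention_ids):
    """Return a copy of the chain extended with new mentions, skipping duplicates."""
    result = list(chain)
    for mention_id in mention_ids:
        if mention_id not in result:
            result.append(mention_id)
    return result

def merge_chains(*chains):
    """Combine several chains into one, preserving first-seen order."""
    merged = []
    for chain in chains:
        merged = add_to_chain(merged, chain)
    return merged

def remove_mention(chain, mention_id):
    """Return a copy of the chain without the given mention."""
    return [m for m in chain if m != mention_id]

a = ["mentions_0_5_MENTION", "mentions_34_37_MENTION"]
b = ["mentions_72_85_MENTION"]

merged = merge_chains(a, b)
print(merged)

shrunk = remove_mention(a, "mentions_34_37_MENTION")
# A one-mention chain is a singleton; it is only acceptable
# when allow_singletons is true.
print(len(shrunk) == 1)
```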
Color Coding
Each chain is automatically assigned a distinct color. Mentions in the same chain share the same color, making it easy to visually identify chain membership.
Output Format
Coreference annotations are saved as span links:
{
  "span_links": [
    {
      "schema": "coref_chains",
      "link_type": "coreference",
      "span_ids": ["mentions_0_5_MENTION", "mentions_34_37_MENTION", "mentions_72_85_MENTION"],
      "entity_type": "PERSON"
    },
    {
      "schema": "coref_chains",
      "link_type": "coreference",
      "span_ids": ["mentions_15_23_MENTION", "mentions_95_97_MENTION"],
      "entity_type": "ORGANIZATION"
    }
  ]
}
Recommended Workflow
- First pass - Read through the text and highlight all entity mentions
- Second pass - Group mentions into coreference chains
- Review - Check that all mentions are correctly assigned and no chains are missing
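Part of the review step can be automated on exported annotations. The sketch below assumes the export has the `span_links` structure shown under Output Format and that you keep a separate list of all mention IDs created in the first pass. `review_chains` is a hypothetical helper, not a feature of the tool, and the sample data is made up for the example.

```python
import json

def review_chains(export, all_mention_ids, allow_singletons=True):
    """Report unassigned mentions, mentions listed more than once, and singletons."""
    seen, repeated, singletons = set(), set(), []
    for index, link in enumerate(export.get("span_links", [])):
        if link.get("link_type") != "coreference":
            continue  # skip other kinds of span links
        span_ids = link.get("span_ids", [])
        if len(span_ids) < 2 and not allow_singletons:
            singletons.append(index)
        for span_id in span_ids:
            if span_id in seen:
                repeated.add(span_id)  # usually a mistake: one mention, several chains
            seen.add(span_id)
    return {
        "unassigned": sorted(set(all_mention_ids) - seen),
        "listed_more_than_once": sorted(repeated),
        "singleton_chains": singletons,
    }

export = json.loads("""
{"span_links": [
  {"schema": "coref_chains", "link_type": "coreference",
   "span_ids": ["mentions_0_5_MENTION", "mentions_34_37_MENTION"],
   "entity_type": "PERSON"},
  {"schema": "coref_chains", "link_type": "coreference",
   "span_ids": ["mentions_15_23_MENTION"],
   "entity_type": "ORGANIZATION"}
]}
""")
mentions = ["mentions_0_5_MENTION", "mentions_34_37_MENTION",
            "mentions_15_23_MENTION", "mentions_95_97_MENTION"]

report = review_chains(export, mentions, allow_singletons=False)
print(report)
```

Here the report flags `mentions_95_97_MENTION` as unassigned and the second chain as a singleton, which are the two cases a reviewer would want to revisit.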
Best Practices
- Define clear mention boundaries - establish guidelines for what counts as a mention
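- Handle nested mentions - decide how to annotate cases like "the CEO of Microsoft", where "Microsoft" is itself a mention inside a longer mention
- Consider generic references - decide whether mentions of a class rather than a specific individual (such as "doctors" in "doctors work long hours") should be annotated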
- Train annotators - coreference is complex; provide examples and practice rounds
- Use entity types sparingly - too many can slow annotation without improving data quality
Further Reading
- Span Annotation - Creating text spans
- Entity Linking - Linking spans to knowledge bases
- Span Linking - Other types of span relationships
For implementation details, see the source documentation.