Coreference Resolution (OntoNotes)

Link pronouns and noun phrases to the entities they refer to in text. This design is based on the OntoNotes coreference annotation guidelines and the CoNLL shared tasks. Annotators identify mention spans and cluster coreferent mentions together.
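
As a rough illustration of what "clustering coreferent mentions" means, the sketch below merges pairwise "same entity" links into entity clusters with a small union-find. The function name `cluster_mentions` and the `(start, end, text)` mention tuples are hypothetical and chosen for illustration only; they are not part of this design or of Potato.

```python
from collections import defaultdict


def cluster_mentions(mentions, same_entity_pairs):
    """Group mentions into entity clusters from pairwise 'same entity' links."""
    # Every mention starts out as the representative of its own cluster.
    parent = {m: m for m in mentions}

    def find(m):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in same_entity_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(list)
    for m in mentions:
        clusters[find(m)].append(m)
    return list(clusters.values())


# Hypothetical mentions as (start, end, surface text); offsets are assumed
# to be 0-based with an exclusive end.
mentions = [(0, 11, "The company"), (38, 40, "It"), (22, 36, "record profits")]
links = [((0, 11, "The company"), (38, 40, "It"))]
print(cluster_mentions(mentions, links))
```

Mentions that never receive a "same entity" link stay in singleton clusters, which is one common way to treat them; whether singletons are kept depends on the annotation guidelines in use.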

Configuration File: config.yaml

# Coreference Resolution Configuration
# Link pronouns and mentions to their referents

task_dir: "."
annotation_task_name: "Coreference Resolution"

data_files:
  - "data/documents.json"

item_properties:
  id_key: "id"
  text_key: "display"
  text_display_key: "display"

user_config:
  allow_all_users: true

annotation_schemes:
  - annotation_type: "radio"
    name: "coreference"
    description: "Does the highlighted mention refer to the same entity as the target?"
    labels:
      - name: "Same Entity"
        tooltip: "The mention refers to the same entity"
        key_value: "y"
        color: "#22c55e"
      - name: "Different Entity"
        tooltip: "The mention refers to a different entity"
        key_value: "n"
        color: "#ef4444"
      - name: "Ambiguous"
        tooltip: "Cannot determine from context"
        key_value: "a"
        color: "#eab308"

  - annotation_type: "radio"
    name: "mention_type"
    description: "What type of mention is this?"
    labels:
      - name: "Pronoun"
        tooltip: "he, she, it, they, etc."
      - name: "Proper noun"
        tooltip: "Names of people, places, organizations"
      - name: "Common noun"
        tooltip: "the company, the scientist, etc."
      - name: "Demonstrative"
        tooltip: "this, that, these, those"

  - annotation_type: "text"
    name: "antecedent"
    description: "What is the full antecedent (what does this refer to)?"

  - annotation_type: "likert"
    name: "difficulty"
    description: "How difficult was this decision?"
    size: 5
    min_label: "Very easy"
    max_label: "Very difficult"

output: "annotation_output/"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
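
One quick sanity check before launching a task like this is to confirm that no two labels in a scheme share the same `key_value`. The sketch below assumes the configuration has already been parsed into plain Python dictionaries (for example with a YAML parser) and that `key_value` acts as a per-label keyboard shortcut; the helper name `duplicate_shortcuts` is hypothetical.

```python
def duplicate_shortcuts(scheme):
    """Return key_value entries that are assigned to more than one label."""
    seen, dupes = set(), set()
    for label in scheme.get("labels", []):
        key = label.get("key_value")
        if key is None:
            continue  # labels without a shortcut are ignored
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return sorted(dupes)


# A hand-copied subset of the "coreference" scheme from the config above.
coreference_scheme = {
    "annotation_type": "radio",
    "name": "coreference",
    "labels": [
        {"name": "Same Entity", "key_value": "y"},
        {"name": "Different Entity", "key_value": "n"},
        {"name": "Ambiguous", "key_value": "a"},
    ],
}
print(duplicate_shortcuts(coreference_scheme))  # prints []
```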

Sample Data: sample-data.json

[
  {
    "id": "coref_001",
    "text": "Sarah told Emily that she would help with the project tomorrow.",
    "target_entity": "Sarah",
    "mention": "she",
    "mention_position": [
      22,
      25
    ],
    "display": "**Text:** Sarah told Emily that **[she]** would help with the project tomorrow.\n\n**Target Entity:** Sarah\n**Mention to evaluate:** she\n\n*Question: Does 'she' refer to 'Sarah'?*"
  },
  {
    "id": "coref_002",
    "text": "The company announced record profits. It plans to expand into new markets next year.",
    "target_entity": "The company",
    "mention": "It",
    "mention_position": [
      38,
      40
    ],
    "display": "**Text:** The company announced record profits. **[It]** plans to expand into new markets next year.\n\n**Target Entity:** The company\n**Mention to evaluate:** It\n\n*Question: Does 'It' refer to 'The company'?*"
  }
]

// ... and 2 more items
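
Items in this shape can be generated rather than written by hand. The sketch below builds one item from the raw text, the target entity, and the mention, and derives `mention_position` and `display` from them. It assumes character offsets are 0-based with an exclusive end and that the first occurrence of the mention is the intended one; the helper name `make_item` is hypothetical.

```python
import json


def make_item(item_id, text, target_entity, mention, search_from=0):
    """Build one data item; offsets are 0-based, end-exclusive (an assumption)."""
    start = text.find(mention, search_from)
    if start == -1:
        raise ValueError(f"mention {mention!r} not found in text")
    end = start + len(mention)

    # Wrap the mention in **[...]** as in the sample "display" strings.
    marked = f"{text[:start]}**[{mention}]**{text[end:]}"
    display = (
        f"**Text:** {marked}\n\n"
        f"**Target Entity:** {target_entity}\n"
        f"**Mention to evaluate:** {mention}\n\n"
        f"*Question: Does '{mention}' refer to '{target_entity}'?*"
    )
    return {
        "id": item_id,
        "text": text,
        "target_entity": target_entity,
        "mention": mention,
        "mention_position": [start, end],
        "display": display,
    }


item = make_item(
    "coref_001",
    "Sarah told Emily that she would help with the project tomorrow.",
    "Sarah",
    "she",
)
print(json.dumps(item, indent=2))
```

Deriving the offsets from the text keeps `mention_position` in sync with `mention`; for texts where the mention string occurs more than once, pass `search_from` to pick the intended occurrence.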

Get This Design

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/information-extraction/coreference-resolution
potato start config.yaml

Details

Annotation Types

likert, radio, text

Domain

NLP, Linguistics

Use Cases

Coreference Resolution, Entity Linking, Discourse Analysis
