Building Your First NER Annotation Task

Step-by-step tutorial on creating a named entity recognition annotation task with span labels and keyboard shortcuts.

By Potato Team

Named Entity Recognition (NER) is one of the most common NLP tasks. In this tutorial, you'll learn how to create a complete NER annotation interface with span highlighting, keyboard shortcuts, and entity type selection.

What We're Building

By the end of this tutorial, you'll have an annotation interface where annotators can:

  • Highlight text spans by clicking and dragging
  • Assign entity types (Person, Organization, Location, etc.)
  • Use keyboard shortcuts for faster annotation
  • Edit or delete existing annotations

Prerequisites

  • Potato installed (pip install potato-annotation)
  • Basic familiarity with YAML
  • Sample text data to annotate

Step 1: Configure the Annotation Scheme

Create a config.yaml file:

annotation_task_name: "Named Entity Recognition"
 
data_files:
  - data/sentences.json
 
item_properties:
  id_key: id
  text_key: text
 
# Enable span annotation
annotation_schemes:
  - annotation_type: span
    name: entities
    description: "Highlight and label named entities in the text"
    labels:
      - name: PER
        description: "Person names"
        color: "#FF6B6B"
        keyboard_shortcut: "p"
      - name: ORG
        description: "Organizations"
        color: "#4ECDC4"
        keyboard_shortcut: "o"
      - name: LOC
        description: "Locations"
        color: "#45B7D1"
        keyboard_shortcut: "l"
      - name: DATE
        description: "Dates and times"
        color: "#96CEB4"
        keyboard_shortcut: "d"
      - name: MISC
        description: "Miscellaneous entities"
        color: "#FFEAA7"
        keyboard_shortcut: "m"
    min_spans: 0  # Allow sentences with no entities
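
Two labels that share a keyboard shortcut are easy to introduce as the label set grows. The sketch below shows one way to catch that before launching the task. It is not part of Potato: the labels list is copied by hand from the config above, and find_shortcut_conflicts is a hypothetical helper name.

```python
# Labels copied by hand from config.yaml above (only the fields needed here).
labels = [
    {"name": "PER", "keyboard_shortcut": "p"},
    {"name": "ORG", "keyboard_shortcut": "o"},
    {"name": "LOC", "keyboard_shortcut": "l"},
    {"name": "DATE", "keyboard_shortcut": "d"},
    {"name": "MISC", "keyboard_shortcut": "m"},
]


def find_shortcut_conflicts(labels):
    """Return {shortcut: [label names]} for shortcuts used by more than one label."""
    by_key = {}
    for label in labels:
        # Compare case-insensitively (an assumption): "P" and "p" count as the same key.
        key = label["keyboard_shortcut"].lower()
        by_key.setdefault(key, []).append(label["name"])
    return {key: names for key, names in by_key.items() if len(names) > 1}


conflicts = find_shortcut_conflicts(labels)
print(conflicts or "No shortcut conflicts")
```

If you later add a label such as PRODUCT, the check makes it obvious that "p" is already taken by PER.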

Step 2: Prepare Your Data

Create data/sentences.json with one JSON object per line. Each object needs the id and text keys named under item_properties:

{"id": "1", "text": "Apple Inc. announced that CEO Tim Cook will visit Paris next Tuesday."}
{"id": "2", "text": "The United Nations headquarters in New York hosted delegates from Japan."}
{"id": "3", "text": "Dr. Sarah Johnson published her research at Stanford University in March 2024."}
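
Malformed lines and duplicate ids are cheaper to catch before annotation starts than after. Here is a minimal sketch of a pre-flight check, assuming the one-object-per-line layout shown above. validate_records is a hypothetical helper, not a Potato function.

```python
import json


def validate_records(lines, id_key="id", text_key="text"):
    """Parse JSON Lines input and return (records, problems)."""
    records, problems, seen_ids = [], [], set()
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {line_no}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {line_no}: expected a JSON object")
            continue
        for key in (id_key, text_key):
            if key not in record:
                problems.append(f"line {line_no}: missing '{key}'")
        item_id = record.get(id_key)
        if item_id is not None:
            if item_id in seen_ids:
                problems.append(f"line {line_no}: duplicate id {item_id!r}")
            seen_ids.add(item_id)
        records.append(record)
    return records, problems


sample = [
    '{"id": "1", "text": "Apple Inc. announced that CEO Tim Cook will visit Paris next Tuesday."}',
    '{"id": "2", "text": "The United Nations headquarters in New York hosted delegates from Japan."}',
]
records, problems = validate_records(sample)
print(len(records), "records,", len(problems), "problems")
```

To check the real file, pass an open file handle instead of the sample list, for example validate_records(open("data/sentences.json", encoding="utf-8")).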

Step 3: Add Annotation Guidelines

Help your annotators with clear guidelines:

# Add to config.yaml
annotation_guidelines:
  title: "NER Annotation Guidelines"
  content: |
    ## Entity Types
 
    **PER (Person)**: Names of people, including fictional characters
    - Examples: "John Smith", "Dr. Johnson", "Batman"
 
    **ORG (Organization)**: Companies, institutions, agencies
    - Examples: "Apple Inc.", "United Nations", "Stanford University"
 
    **LOC (Location)**: Places, including countries, cities, landmarks
    - Examples: "Paris", "New York", "Mount Everest"
 
    **DATE**: Dates, times, and temporal expressions
    - Examples: "Tuesday", "March 2024", "next week"
 
    **MISC**: Other named entities not fitting above categories
    - Examples: "Nobel Prize", "iPhone", "COVID-19"
 
    ## Annotation Rules
    1. Include titles (Dr., Mr.) with person names
    2. For nested entities, annotate the largest meaningful span
    3. Don't include articles (the, a) in entity spans
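
Rule 3 is simple enough to spot-check mechanically once annotations come in. The sketch below flags span texts that begin with an English article. starts_with_article is a hypothetical helper, and the sample spans are illustrative only.

```python
ARTICLES = ("the ", "a ", "an ")


def starts_with_article(span_text):
    """True if the span begins with an English article followed by a space."""
    return span_text.lower().startswith(ARTICLES)


for span in ["The United Nations", "United Nations", "Apple Inc.", "a Nobel Prize"]:
    flag = "check" if starts_with_article(span) else "ok"
    print(f"{flag:5} {span}")
```

Treat the result as a prompt for review, not an automatic error: some names (for example "The Hague") legitimately include the article.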

Step 4: Start Annotating

Launch your NER task:

potato start config.yaml

Annotation Workflow

  1. Select text: Click and drag to highlight a span
  2. Choose entity type: Click a label button or press its keyboard shortcut (for example, p for PER)
  3. Edit annotations: Click an existing span to modify or delete it
  4. Submit: Press Enter or click Submit when you are done

Step 5: Review Output

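Annotations are saved in JSONL format, one record per line. The record for the first sentence, pretty-printed here for readability, looks like this. The start and end values are character offsets into text, with end exclusive: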

{
  "id": "1",
  "text": "Apple Inc. announced that CEO Tim Cook will visit Paris next Tuesday.",
  "annotations": {
    "entities": [
      {"start": 0, "end": 10, "label": "ORG", "text": "Apple Inc."},
      {"start": 30, "end": 38, "label": "PER", "text": "Tim Cook"},
      {"start": 50, "end": 55, "label": "LOC", "text": "Paris"},
      {"start": 56, "end": 68, "label": "DATE", "text": "next Tuesday"}
    ]
  }
}
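
Because each span stores both its offsets and its text, you can sanity-check the output by slicing. The sketch below assumes character offsets with an exclusive end, which is what the ORG span (0 to 10 for "Apple Inc.") suggests. find_offset_mismatches is a hypothetical helper, and the LOC span in the sample is shifted by one on purpose to show what a mismatch looks like.

```python
def find_offset_mismatches(record, scheme="entities"):
    """Return spans whose stored text differs from text[start:end]."""
    text = record["text"]
    mismatches = []
    for span in record["annotations"][scheme]:
        found = text[span["start"]:span["end"]]
        if found != span["text"]:
            # Keep the original span fields and add what the offsets actually select.
            mismatches.append({**span, "found": found})
    return mismatches


record = {
    "id": "1",
    "text": "Apple Inc. announced that CEO Tim Cook will visit Paris next Tuesday.",
    "annotations": {
        "entities": [
            {"start": 0, "end": 10, "label": "ORG", "text": "Apple Inc."},
            {"start": 30, "end": 38, "label": "PER", "text": "Tim Cook"},
            # Deliberately off by one to demonstrate a mismatch.
            {"start": 51, "end": 56, "label": "LOC", "text": "Paris"},
        ]
    },
}

mismatches = find_offset_mismatches(record)
for span in mismatches:
    print(f"{span['label']}: expected {span['text']!r}, offsets select {span['found']!r}")
```

Running a check like this over the whole output file is a quick way to catch offset drift, for instance after the source text has been re-encoded or trimmed.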

Tips for Better NER Annotation

  1. Consistent guidelines: Clear rules reduce disagreement
  2. Training examples: Show annotators edge cases before they start
  3. Regular calibration: Discuss difficult cases as a team
  4. Measure agreement: Compute inter-annotator agreement regularly to spot labels or guidelines that cause confusion
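
One simple way to put a number on agreement for span tasks is pairwise exact-match F1: a span counts as agreed only if both annotators chose the same start, end, and label. This is a sketch of that one metric, not something Potato is known to report. The function name and the two annotators' spans are made up for illustration.

```python
def span_set(spans):
    """Reduce span dicts to comparable (start, end, label) tuples."""
    return {(s["start"], s["end"], s["label"]) for s in spans}


def exact_match_f1(spans_a, spans_b):
    """Pairwise F1 over exact (start, end, label) matches; 1.0 if both are empty."""
    a, b = span_set(spans_a), span_set(spans_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical annotations of sentence 1 by two annotators.
annotator_a = [
    {"start": 0, "end": 10, "label": "ORG"},
    {"start": 30, "end": 38, "label": "PER"},  # "Tim Cook"
    {"start": 50, "end": 55, "label": "LOC"},
]
annotator_b = [
    {"start": 0, "end": 10, "label": "ORG"},
    {"start": 26, "end": 38, "label": "PER"},  # "CEO Tim Cook"
    {"start": 50, "end": 55, "label": "LOC"},
]

print(f"Exact-match F1: {exact_match_f1(annotator_a, annotator_b):.2f}")
```

Here the two annotators disagree only on whether "CEO" belongs to the person span, which is exactly the kind of boundary case worth raising in a calibration session.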

Next Steps


Need help? Check our span annotation documentation for more details.