Building Your First NER Annotation Task
Step-by-step tutorial on creating a named entity recognition annotation task with span labels and keyboard shortcuts.
By Potato Team
Named Entity Recognition (NER) is one of the most common NLP tasks. In this tutorial, you'll learn how to create a complete NER annotation interface with span highlighting, keyboard shortcuts, and entity type selection.
What We're Building
By the end of this tutorial, you'll have an annotation interface where annotators can:
- Highlight text spans by clicking and dragging
- Assign entity types (Person, Organization, Location, etc.)
- Use keyboard shortcuts for faster annotation
- Edit or delete existing annotations
Prerequisites
- Potato installed (pip install potato-annotation)
- Basic familiarity with YAML
- Sample text data to annotate
Step 1: Configure the Annotation Scheme
Create a config.yaml file:
annotation_task_name: "Named Entity Recognition"
data_files:
  - data/sentences.json
item_properties:
  id_key: id
  text_key: text
# Enable span annotation
annotation_schemes:
  - annotation_type: span
    name: entities
    description: "Highlight and label named entities in the text"
    labels:
      - name: PER
        description: "Person names"
        color: "#FF6B6B"
        keyboard_shortcut: "p"
      - name: ORG
        description: "Organizations"
        color: "#4ECDC4"
        keyboard_shortcut: "o"
      - name: LOC
        description: "Locations"
        color: "#45B7D1"
        keyboard_shortcut: "l"
      - name: DATE
        description: "Dates and times"
        color: "#96CEB4"
        keyboard_shortcut: "d"
      - name: MISC
        description: "Miscellaneous entities"
        color: "#FFEAA7"
        keyboard_shortcut: "m"
    min_spans: 0  # Allow sentences with no entities

Step 2: Prepare Your Data
Create data/sentences.json with your text data, one JSON object per line:
{"id": "1", "text": "Apple Inc. announced that CEO Tim Cook will visit Paris next Tuesday."}
{"id": "2", "text": "The United Nations headquarters in New York hosted delegates from Japan."}
{"id": "3", "text": "Dr. Sarah Johnson published her research at Stanford University in March 2024."}

Step 3: Add Annotation Guidelines
Help your annotators with clear guidelines:
# Add to config.yaml
annotation_guidelines:
  title: "NER Annotation Guidelines"
  content: |
    ## Entity Types
    **PER (Person)**: Names of people, including fictional characters
    - Examples: "John Smith", "Dr. Johnson", "Batman"
    **ORG (Organization)**: Companies, institutions, agencies
    - Examples: "Apple Inc.", "United Nations", "Stanford University"
    **LOC (Location)**: Places, including countries, cities, landmarks
    - Examples: "Paris", "New York", "Mount Everest"
    **DATE**: Dates, times, and temporal expressions
    - Examples: "Tuesday", "March 2024", "next week"
    **MISC**: Other named entities not fitting above categories
    - Examples: "Nobel Prize", "iPhone", "COVID-19"

    ## Annotation Rules
    1. Include titles (Dr., Mr.) with person names
    2. For nested entities, annotate the largest meaningful span
    3. Don't include articles (the, a) in entity spans

Step 4: Start Annotating
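Before launching, a quick pre-flight check can catch two easy mistakes: two labels sharing a keyboard shortcut, and data lines that are malformed or reuse an id. The sketch below is not part of Potato. `duplicate_shortcuts` and `check_items` are hypothetical helpers, and the label list is copied by hand from the Step 1 config rather than parsed from the YAML file.

```python
import json
from collections import Counter


def duplicate_shortcuts(labels):
    """Return shortcut keys that are assigned to more than one label."""
    counts = Counter(label["keyboard_shortcut"] for label in labels)
    return sorted(key for key, n in counts.items() if n > 1)


def check_items(lines, id_key="id", text_key="text"):
    """Return a list of problems in JSON-lines data; an empty list means OK."""
    problems, seen_ids = [], set()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            item = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append(f"line {number}: invalid JSON ({err.msg})")
            continue
        if not isinstance(item, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue
        for key in (id_key, text_key):
            if key not in item:
                problems.append(f"line {number}: missing '{key}'")
        if id_key in item:
            item_id = str(item[id_key])
            if item_id in seen_ids:
                problems.append(f"line {number}: duplicate id {item_id!r}")
            seen_ids.add(item_id)
    return problems


# Labels copied by hand from the Step 1 config (an assumption of this sketch).
labels = [
    {"name": "PER", "keyboard_shortcut": "p"},
    {"name": "ORG", "keyboard_shortcut": "o"},
    {"name": "LOC", "keyboard_shortcut": "l"},
    {"name": "DATE", "keyboard_shortcut": "d"},
    {"name": "MISC", "keyboard_shortcut": "m"},
]
sample_lines = [
    '{"id": "1", "text": "Apple Inc. announced that CEO Tim Cook will visit Paris next Tuesday."}',
    '{"id": "2", "text": "The United Nations headquarters in New York hosted delegates from Japan."}',
]
print(duplicate_shortcuts(labels))  # prints []
print(check_items(sample_lines))    # prints []
```

To check the real file, an open file handle can be passed instead of the in-memory list, for example `check_items(open("data/sentences.json", encoding="utf-8"))`.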
Launch your NER task:
potato start config.yaml

Annotation Workflow
- Select text: Click and drag to highlight a span
- Choose entity type: Click a label button or press its keyboard shortcut (for example, p for PER)
- Edit annotations: Click an existing span to modify or delete
- Submit: Press Enter or click Submit when done
Step 5: Review Output
Annotations are saved in JSONL format, one record per line (pretty-printed here for readability):
{
  "id": "1",
  "text": "Apple Inc. announced that CEO Tim Cook will visit Paris next Tuesday.",
  "annotations": {
    "entities": [
      {"start": 0, "end": 10, "label": "ORG", "text": "Apple Inc."},
      {"start": 30, "end": 38, "label": "PER", "text": "Tim Cook"},
      {"start": 50, "end": 55, "label": "LOC", "text": "Paris"},
      {"start": 56, "end": 68, "label": "DATE", "text": "next Tuesday"}
    ]
  }
}

Tips for Better NER Annotation
- Consistent guidelines: Clear rules reduce disagreement
- Training examples: Show annotators edge cases before they start
- Regular calibration: Discuss difficult cases as a team
- Measure agreement: Use inter-annotator agreement to identify issues
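As one way to put the last tip into practice, the sketch below computes exact-match span F1 between two annotators on the same item. This is an illustrative choice of metric computed outside Potato, not a Potato feature. `span_agreement_f1` is a hypothetical helper, and annotator B's spans are made up for the example.

```python
def span_agreement_f1(spans_a, spans_b):
    """Exact-match F1 between two annotators' spans for the same item.

    Two spans count as a match only if start, end, and label are all equal.
    """
    set_a = {(s["start"], s["end"], s["label"]) for s in spans_a}
    set_b = {(s["start"], s["end"], s["label"]) for s in spans_b}
    if not set_a and not set_b:
        return 1.0  # both annotators agree there are no entities
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


# Annotator A labels "Tim Cook"; annotator B (hypothetical) labels only "Cook".
annotator_a = [
    {"start": 0, "end": 10, "label": "ORG"},
    {"start": 30, "end": 38, "label": "PER"},
    {"start": 50, "end": 55, "label": "LOC"},
]
annotator_b = [
    {"start": 0, "end": 10, "label": "ORG"},
    {"start": 34, "end": 38, "label": "PER"},
    {"start": 50, "end": 55, "label": "LOC"},
]
print(round(span_agreement_f1(annotator_a, annotator_b), 2))  # prints 0.67
```

Items with a low score are good candidates for the calibration discussions mentioned above. Exact matching is strict; a boundary disagreement like "Tim Cook" versus "Cook" counts as a full miss, so a partial-overlap variant may be more forgiving if boundaries are the main source of disagreement.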
Next Steps
- Add a training phase to onboard annotators
- Set up multiple annotators for redundancy
- Export to Hugging Face format for model training
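For the export step, token-classification training commonly expects tokens with BIO tags rather than character offsets. The following is a minimal sketch of that conversion, assuming spans shaped like the entries in the Step 5 output. `spans_to_bio` is a hypothetical helper, the `tokens` and `ner_tags` field names are an assumed layout, whitespace tokenization is a simplification, and the spans for the second sample sentence are hand-labeled for illustration.

```python
import re


def spans_to_bio(text, spans):
    """Convert character-offset spans to whitespace tokens with BIO tags."""
    # Sanity-check offsets first: the slice must reproduce the saved span text.
    for span in spans:
        if text[span["start"]:span["end"]] != span["text"]:
            raise ValueError(f"offset mismatch for {span!r}")

    tokens, tags = [], []
    for match in re.finditer(r"\S+", text):
        tag = "O"
        for span in spans:
            # A token belongs to a span if their character ranges overlap.
            if match.start() < span["end"] and match.end() > span["start"]:
                # The token containing the span's first character gets B-.
                prefix = "B" if match.start() <= span["start"] else "I"
                tag = f"{prefix}-{span['label']}"
                break
        tokens.append(match.group())
        tags.append(tag)
    return {"tokens": tokens, "ner_tags": tags}


text = "The United Nations headquarters in New York hosted delegates from Japan."
spans = [
    {"start": 4, "end": 18, "label": "ORG", "text": "United Nations"},
    {"start": 35, "end": 43, "label": "LOC", "text": "New York"},
    {"start": 66, "end": 71, "label": "LOC", "text": "Japan"},
]
example = spans_to_bio(text, spans)
print(example["ner_tags"])
# prints ['O', 'B-ORG', 'I-ORG', 'O', 'O', 'B-LOC', 'I-LOC', 'O', 'O', 'O', 'B-LOC']
```

Because tokens are split on whitespace, trailing punctuation stays attached (the last token is "Japan."). A real export would likely use the tokenizer of the model being trained and map label strings to integer ids.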
Need help? Check our span annotation documentation for more details.