# Named Entity Recognition

Source: https://www.potatoannotator.com/docs/guides/named-entity-recognition

**Named entity recognition (NER) is the task of finding and classifying named things in text: people, organizations, locations, dates, and more. It is a [span annotation](/docs/guides/span-annotation) task whose labels are entity types.** NER is a building block for search, knowledge graphs, redaction, and information extraction.

See [Named-entity recognition](https://en.wikipedia.org/wiki/Named-entity_recognition) for background.

## Choosing a label set

Start from a standard scheme and trim it to your domain:

- **CoNLL-2003**: `PER`, `ORG`, `LOC`, `MISC`. A good minimal default.
- **OntoNotes**: 18 types including dates, money, and percentages, for richer needs.
- **Domain-specific**: biomedical (genes, diseases), legal (statutes, parties), or finance.

Fewer, well-defined types tend to give higher inter-annotator agreement. Add a type only when a real downstream use needs it.
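
As a minimal sketch of trimming a standard scheme, the snippet below collapses OntoNotes-style types into a five-label set. The grouping in `ONTONOTES_TO_TRIMMED` and the helper `trim_label` are hypothetical choices made for illustration, not a Potato or OntoNotes default; adjust them to what your downstream use needs.

```python
from typing import Optional

# Hypothetical grouping of OntoNotes types into a trimmed label set.
# Types left out of the mapping (e.g. CARDINAL, MONEY) are dropped entirely.
ONTONOTES_TO_TRIMMED = {
    "PERSON": "PERSON",
    "ORG": "ORGANIZATION",
    "GPE": "LOCATION",
    "LOC": "LOCATION",
    "FAC": "LOCATION",
    "DATE": "DATE",
    "TIME": "DATE",
    "NORP": "MISC",
    "PRODUCT": "MISC",
    "EVENT": "MISC",
    "WORK_OF_ART": "MISC",
    "LAW": "MISC",
    "LANGUAGE": "MISC",
}


def trim_label(label: str) -> Optional[str]:
    """Return the trimmed label, or None if the type is not kept."""
    return ONTONOTES_TO_TRIMMED.get(label)


# Example (text, fine-grained label) pairs, made up for illustration.
spans = [
    ("Ada Lovelace", "PERSON"),
    ("Paris", "GPE"),
    ("French", "LANGUAGE"),
    ("42", "CARDINAL"),
]

# Keep only spans whose type survives the trimming.
trimmed = [
    (text, trim_label(label))
    for text, label in spans
    if trim_label(label) is not None
]
print(trimmed)
```

Keeping the mapping in one explicit table makes it easy to review with annotators and to re-expand later if a dropped type turns out to matter.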

## Building the task in Potato

```yaml
annotation_schemes:
  - annotation_type: span
    name: entities
    description: "Highlight each named entity and select its type."
    labels: [PERSON, ORGANIZATION, LOCATION, DATE, MISC]
    label_colors:
      PERSON: "#3b82f6"
      ORGANIZATION: "#10b981"
      LOCATION: "#f59e0b"
      DATE: "#8b5cf6"
      MISC: "#6b7280"
    tooltips:
      PERSON: "Names of people, e.g. 'Ada Lovelace'."
      ORGANIZATION: "Companies, agencies, teams, e.g. 'United Nations'."
      LOCATION: "Cities, countries, landmarks, e.g. 'Paris'."
      DATE: "Dates and time expressions, e.g. 'next Monday'."
      MISC: "Named entities that fit none of the above."
    allow_overlapping: false
    sequential_key_binding: true
```

The [named entity recognition showcase](/showcase/named-entity-recognition) runs this configuration with sample data.

## Boundary rules that prevent disagreement

Most NER disagreement is about *where* an entity starts and ends, not *what* it is. Decide and document:

- Do titles count? ("**Dr.** Jane Smith" vs. "Dr. **Jane Smith**".)
- Do you include "the" in "**the** United Nations"?
- How do you tag nested entities like "Bank of **England**"? If you need them, set `allow_overlapping: true`.
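
To see whether boundary rules are the real source of disagreement, it can help to separate boundary mismatches from type mismatches. The sketch below assumes each annotator's spans are available as character offsets plus a label; the `Span` type and the `span_of` and `classify_disagreements` helpers are hypothetical names, not part of Potato.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str


def span_of(text: str, phrase: str, label: str) -> Span:
    """Build a Span from the first occurrence of phrase in text."""
    start = text.index(phrase)
    return Span(start, start + len(phrase), label)


def overlaps(a: Span, b: Span) -> bool:
    return a.start < b.end and b.start < a.end


def classify_disagreements(ann_a, ann_b):
    """Count how overlapping spans from two annotators differ."""
    counts = {"exact": 0, "boundary_only": 0, "type_only": 0,
              "boundary_and_type": 0, "unmatched": 0}
    for a in ann_a:
        # Pair each span from A with the first overlapping span from B.
        b = next((s for s in ann_b if overlaps(a, s)), None)
        if b is None:
            counts["unmatched"] += 1
            continue
        same_bounds = (a.start, a.end) == (b.start, b.end)
        same_type = a.label == b.label
        if same_bounds and same_type:
            counts["exact"] += 1
        elif same_type:
            counts["boundary_only"] += 1
        elif same_bounds:
            counts["type_only"] += 1
        else:
            counts["boundary_and_type"] += 1
    # Spans from B that overlap nothing in A are also unmatched.
    counts["unmatched"] += sum(
        1 for b in ann_b if not any(overlaps(a, b) for a in ann_a)
    )
    return counts


text = "Dr. Jane Smith visited the United Nations in Paris."
annotator_a = [
    span_of(text, "Dr. Jane Smith", "PERSON"),
    span_of(text, "the United Nations", "ORGANIZATION"),
    span_of(text, "Paris", "LOCATION"),
]
annotator_b = [
    span_of(text, "Jane Smith", "PERSON"),
    span_of(text, "United Nations", "ORGANIZATION"),
    span_of(text, "Paris", "LOCATION"),
]
counts = classify_disagreements(annotator_a, annotator_b)
print(counts)
```

In this made-up example both disagreements are boundary-only (a title and a leading "the"), which is the pattern that a written boundary rule resolves.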

## From labels to a model

Export to [CoNLL](https://en.wikipedia.org/wiki/CoNLL) format, which represents entities with token-level [BIO/IOB tags](https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)), or to spaCy format. See [Exporting Annotations for ML](/docs/guides/exporting-annotations-for-ml).
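
For intuition about what BIO tagging looks like, here is a minimal sketch that turns character-offset spans into token-level tags. It is not Potato's exporter: the `to_bio` helper, the regex tokenizer, and the sample sentence are assumptions for illustration, and the code assumes non-overlapping spans (as with `allow_overlapping: false`) whose boundaries fall on token boundaries.

```python
import re


def to_bio(text, spans):
    """Convert (start, end, label) character spans to (token, tag) pairs.

    Assumes spans do not overlap and align with token boundaries.
    """
    rows = []
    # Simple tokenizer: runs of word characters, or single punctuation marks.
    for match in re.finditer(r"\w+|[^\w\s]", text):
        tag = "O"
        for start, end, label in spans:
            if match.start() >= start and match.end() <= end:
                # The first token of a span gets B-, later tokens get I-.
                prefix = "B-" if match.start() == start else "I-"
                tag = prefix + label
                break
        rows.append((match.group(), tag))
    return rows


text = "Ada Lovelace joined the United Nations in Paris."


def find(phrase, label):
    """Locate the first occurrence of phrase and return its span."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


spans = [
    find("Ada Lovelace", "PERSON"),
    find("United Nations", "ORGANIZATION"),
    find("Paris", "LOCATION"),
]

rows = to_bio(text, spans)
# Print one token and its tag per line, in a CoNLL-like two-column layout.
for token, tag in rows:
    print(f"{token}\t{tag}")
```

Tokens inside no entity get `O`; a real pipeline would use the same tokenizer as the downstream model so that tags and tokens stay aligned.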

## Further reading

- [Span Annotation](/docs/guides/span-annotation)
- [Entity Linking](/docs/guides/entity-linking), connecting entities to a knowledge base
- [Relation and Event Extraction](/docs/guides/relation-and-event-extraction)
