Entity Linking
How to annotate entity links, which connect mentions in text to entries in a knowledge base such as Wikidata, and how to set up a linking task in Potato.
Entity linking connects a mention in text to a specific entry in a knowledge base, for example resolving "Paris" to the capital of France rather than to the city in Texas or to a person of that name. Where named entity recognition finds that something is an entity, entity linking decides which real-world entity it is.
For background, see Entity linking and the common target knowledge bases, Wikidata and Wikipedia. The task is closely related to word-sense disambiguation.
What annotators do
- A mention span is identified (often pre-marked from an NER pass).
- The annotator searches the knowledge base and selects the matching entry.
- If no entry fits, they mark it NIL (not in the knowledge base).
The NIL case is essential: without it, annotators force-fit mentions to wrong entries and corrupt the data.
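The workflow above implies that every linked mention ends up with either a knowledge-base ID or NIL. A minimal sketch of a post-annotation check follows, assuming the knowledge base is Wikidata (item IDs are "Q" followed by digits) and that annotations have been exported to simple records. The `LinkedMention` structure and the function names are hypothetical and are not Potato's output format.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Wikidata item IDs are "Q" followed by a positive integer, e.g. Q90.
QID_PATTERN = re.compile(r"Q[1-9]\d*")
NIL = "NIL"


@dataclass
class LinkedMention:
    """Hypothetical export record: one mention span plus the annotator's link."""
    text: str    # surface form of the mention, e.g. "Paris"
    start: int   # character offset where the span starts
    end: int     # character offset where the span ends (exclusive)
    kb_id: str   # raw value typed or selected by the annotator


def normalize_kb_id(raw: str) -> Optional[str]:
    """Return a cleaned QID or "NIL"; return None if the value is neither."""
    value = raw.strip().upper()
    if value == NIL:
        return NIL
    if QID_PATTERN.fullmatch(value):
        return value
    return None


def find_invalid_links(mentions: List[LinkedMention]) -> List[LinkedMention]:
    """Collect mentions whose kb_id is neither a well-formed QID nor NIL."""
    return [m for m in mentions if normalize_kb_id(m.kb_id) is None]
```

A check like this only confirms that the ID is well formed. It cannot tell whether the annotator picked the right entry, so it complements review rather than replacing it.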
Setting it up in Potato
Potato supports entity linking with a typeahead search against a knowledge base (Wikidata, UMLS, or a custom list), so annotators pick from real candidates rather than typing IDs. The entity linking showcase is a working example.
annotation_schemes:
  - annotation_type: span
    name: mentions
    description: "Mark the mention to link."
    labels: [Entity]
  - annotation_type: text
    name: kb_id
    description: "Search the knowledge base and enter the matching ID, or write NIL if none fits."
Quality considerations
- Candidate quality. A good typeahead with descriptions reduces wrong picks far more than longer guidelines.
- Ambiguity defaults. Tell annotators what to do when two entries seem equally valid.
- Granularity. Link to the company or the subsidiary? The film or the franchise? Decide once.
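To illustrate the candidate-quality point, here is a rough sketch of how candidates with descriptions could be retrieved from Wikidata's public `wbsearchentities` API and shown next to an explicit NIL option. This is not how Potato's typeahead is implemented. The function names, the `User-Agent` string, and the display format are assumptions made for illustration.

```python
import json
from typing import List
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(mention: str, language: str = "en", limit: int = 5) -> str:
    """Build a wbsearchentities query URL for a mention string."""
    params = {
        "action": "wbsearchentities",
        "search": mention,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def format_candidates(payload: dict) -> List[str]:
    """Turn a search response into display lines: ID, label, description."""
    lines = []
    for hit in payload.get("search", []):
        label = hit.get("label", "")
        # Descriptions are what let annotators tell same-named entries apart.
        description = hit.get("description", "no description")
        lines.append(f"{hit['id']}  {label}  ({description})")
    # Always offer NIL so annotators are not pushed toward a wrong entry.
    lines.append("NIL  none of the above")
    return lines


def fetch_candidates(mention: str) -> List[str]:
    """Query Wikidata over the network and return formatted candidates."""
    # Hypothetical User-Agent; replace it with one that identifies your project.
    request = Request(
        build_search_url(mention),
        headers={"User-Agent": "entity-linking-example/0.1"},
    )
    with urlopen(request) as response:
        return format_candidates(json.load(response))
```

Calling `fetch_candidates("Paris")` needs network access and returns whatever Wikidata currently ranks highest, so the output can change over time.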