Entity Linking
Link span annotations to external knowledge bases like Wikidata, UMLS, or custom APIs.
Entity linking enables annotators to connect span annotations to external knowledge bases (KBs) like Wikidata or UMLS. This creates semantic links between text mentions and canonical entities, valuable for named entity recognition, concept normalization, and knowledge graph construction.
How It Works
When entity linking is enabled for a span annotation schema:
- Annotators highlight text and assign a label (e.g., "PERSON", "ORGANIZATION")
- A link icon appears on the span's control bar
- Clicking the icon opens a search modal to find matching KB entities
- The selected entity ID is stored with the span annotation
- Linked spans display a filled icon and show entity details on hover
Quick Start
Enable entity linking by adding the entity_linking configuration to a span schema:

    annotation_schemes:
      - annotation_type: span
        name: ner
        description: Named Entity Recognition with KB linking
        labels:
          - name: PERSON
            tooltip: "People's names"
          - name: ORGANIZATION
            tooltip: "Companies, agencies, institutions"
          - name: LOCATION
            tooltip: "Places, cities, countries"
        entity_linking:
          enabled: true
          knowledge_bases:
            - name: wikidata
              type: wikidata
              language: en

Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | false | Enable entity linking for this schema |
| knowledge_bases | list | [] | List of KB configurations |
| auto_search | boolean | true | Automatically search when the modal opens |
| required | boolean | false | Require entity link before saving span |
| multi_select | boolean | false | Allow linking to multiple entities |
Knowledge Base Configuration
| Option | Type | Default | Description |
|---|---|---|---|
| name | string | required | Unique identifier for this KB |
| type | string | required | KB type: wikidata, umls, or rest |
| api_key | string | null | API key for authenticated services |
| base_url | string | null | Base URL for REST APIs |
| language | string | "en" | Language code for search results |
| timeout | integer | 10 | Request timeout in seconds |
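
The two option tables can be read as a small schema. The following is a minimal sketch of applying the documented defaults and checking the required options before use. The class and function names (`KnowledgeBaseConfig`, `EntityLinkingConfig`, `parse_entity_linking`) are hypothetical and are not part of the tool; the `extra_params` field is taken from the REST example in this document rather than from the tables.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# KB types listed in the Knowledge Base Configuration table.
KB_TYPES = {"wikidata", "umls", "rest"}


@dataclass
class KnowledgeBaseConfig:
    name: str                       # required
    type: str                       # required
    api_key: Optional[str] = None
    base_url: Optional[str] = None
    language: str = "en"
    timeout: int = 10               # seconds
    # Assumption: not in the table, but used by the REST example.
    extra_params: dict[str, Any] = field(default_factory=dict)


@dataclass
class EntityLinkingConfig:
    enabled: bool = False
    knowledge_bases: list[KnowledgeBaseConfig] = field(default_factory=list)
    auto_search: bool = True
    required: bool = False
    multi_select: bool = False


def parse_entity_linking(raw: dict[str, Any]) -> EntityLinkingConfig:
    """Apply documented defaults and reject obviously invalid KB entries."""
    kbs: list[KnowledgeBaseConfig] = []
    seen: set[str] = set()
    for entry in raw.get("knowledge_bases", []):
        for key in ("name", "type"):
            if key not in entry:
                raise ValueError(f"KB entry is missing required option {key!r}")
        if entry["type"] not in KB_TYPES:
            raise ValueError(f"unknown KB type: {entry['type']!r}")
        if entry["name"] in seen:
            raise ValueError(f"duplicate KB name: {entry['name']!r}")
        seen.add(entry["name"])
        kbs.append(KnowledgeBaseConfig(**entry))
    options = {k: v for k, v in raw.items() if k != "knowledge_bases"}
    return EntityLinkingConfig(knowledge_bases=kbs, **options)
```

Passing `{"enabled": True, "knowledge_bases": [{"name": "wikidata", "type": "wikidata"}]}` yields a config whose omitted options fall back to the defaults in the tables. Unknown option names raise a `TypeError` from the dataclass constructor, which is a deliberate choice in this sketch to surface typos early.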
Supported Knowledge Bases
Wikidata
Free, open knowledge base with 100+ million entities. No API key required.

    entity_linking:
      enabled: true
      knowledge_bases:
        - name: wikidata
          type: wikidata
          language: en

Features multilingual labels, entity aliases (e.g., "NYC" finds "New York City"), and links to Wikipedia articles.
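
To illustrate what a Wikidata lookup involves, here is a rough sketch built on Wikidata's public `wbsearchentities` API action. The source does not say how the tool itself queries Wikidata, so treat this as an assumption; `build_search_url` and `parse_candidates` are hypothetical helper names, and the output keys simply mirror the `kb_id` and `kb_label` fields used in this document's data format.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(query: str, language: str = "en", limit: int = 5) -> str:
    """Build a wbsearchentities request URL for the given mention text."""
    params = {
        "action": "wbsearchentities",
        "search": query,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def parse_candidates(response: dict) -> list[dict]:
    """Reduce the API's 'search' hits to the fields a linking modal needs."""
    return [
        {
            "kb_id": hit["id"],
            "kb_label": hit.get("label", ""),
            "description": hit.get("description", ""),
        }
        for hit in response.get("search", [])
    ]


# Hand-written, abbreviated stand-in for an API response (real ones carry more fields).
sample = {"search": [{"id": "Q60", "label": "New York City"}]}
candidates = parse_candidates(sample)
```

The URL from `build_search_url("NYC")` can be fetched with any HTTP client, using the KB's configured `timeout`; keeping URL construction and parsing separate from the network call makes both easy to test offline.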
UMLS
Comprehensive medical and biomedical terminology. Requires a free API key from the UMLS Terminology Services (UTS).

    entity_linking:
      enabled: true
      knowledge_bases:
        - name: umls
          type: umls
          api_key: ${UMLS_API_KEY}

Includes medical concepts, drugs, diseases, procedures, and cross-references to 200+ source vocabularies (SNOMED CT, ICD-10, MeSH, RxNorm).
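
The `${UMLS_API_KEY}` placeholder keeps the key out of the config file. The source does not describe how the placeholder is resolved, so the following is only a sketch of one plausible approach, using Python's standard `os.path.expandvars`; `resolve_secret` is a hypothetical helper, not a function of the tool.

```python
import os


def resolve_secret(value: str) -> str:
    """Expand ${VAR} references and fail loudly if a variable is unset."""
    # expandvars leaves references to unset variables unchanged,
    # so a leftover "${" means the lookup failed.
    expanded = os.path.expandvars(value)
    if "${" in expanded:
        raise KeyError(f"environment variable referenced in {value!r} is not set")
    return expanded
```

With `UMLS_API_KEY` exported in the shell before launching the tool, `resolve_secret("${UMLS_API_KEY}")` returns the key. Raising on an unset variable avoids sending the literal placeholder text to the API as if it were a credential.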
Custom REST APIs
Connect to any knowledge base with a REST API:

    entity_linking:
      enabled: true
      knowledge_bases:
        - name: internal_kb
          type: rest
          base_url: https://api.example.com
          api_key: optional_api_key
          extra_params:
            search_endpoint: /search
            entity_endpoint: /entity/{entity_id}
            search_query_param: q
            results_path: data.results
            entity_id_field: id
            label_field: name
            description_field: description

Multiple Knowledge Bases
Configure multiple KBs to let annotators choose the most appropriate source:

    entity_linking:
      enabled: true
      knowledge_bases:
        - name: wikidata
          type: wikidata
          language: en
        - name: umls
          type: umls
          api_key: ${UMLS_API_KEY}
        - name: company_entities
          type: rest
          base_url: https://internal.company.com/api/entities

A dropdown in the search modal lets annotators switch between configured knowledge bases.
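
For `rest` knowledge bases such as `company_entities`, the `extra_params` mappings shown under Custom REST APIs (`results_path`, `entity_id_field`, `label_field`, `description_field`, `entity_endpoint`) describe where to find results in the service's JSON. The sketch below shows one way such mappings could be interpreted. It assumes a dotted `results_path` walks nested objects and that `{entity_id}` is substituted into the endpoint; the function names and the sample payload are hypothetical.

```python
from typing import Any


def dig(payload: Any, dotted_path: str) -> Any:
    """Follow a dotted path such as 'data.results' through nested dicts."""
    current = payload
    for key in dotted_path.split("."):
        current = current[key]
    return current


def map_results(payload: dict, extra_params: dict) -> list[dict]:
    """Translate a service-specific response into uniform candidate records."""
    hits = dig(payload, extra_params["results_path"])
    return [
        {
            "kb_id": hit[extra_params["entity_id_field"]],
            "kb_label": hit[extra_params["label_field"]],
            "description": hit.get(extra_params["description_field"], ""),
        }
        for hit in hits
    ]


def entity_url(base_url: str, extra_params: dict, entity_id: str) -> str:
    """Fill the {entity_id} placeholder in the configured entity endpoint."""
    path = extra_params["entity_endpoint"].format(entity_id=entity_id)
    return base_url.rstrip("/") + path


# Values copied from the Custom REST APIs example.
params = {
    "entity_endpoint": "/entity/{entity_id}",
    "results_path": "data.results",
    "entity_id_field": "id",
    "label_field": "name",
    "description_field": "description",
}
# Hypothetical response from an internal service.
payload = {"data": {"results": [{"id": "E1", "name": "Acme Corp"}]}}
candidates = map_results(payload, params)
```

Mapping every backend to the same candidate shape is what would let a single search modal serve Wikidata, UMLS, and custom services alike.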
Multi-Select Mode
Enable multi-select to allow linking a span to multiple entities, useful for ambiguous mentions:

    entity_linking:
      enabled: true
      multi_select: true
      knowledge_bases:
        - name: wikidata
          type: wikidata
          language: en

Data Format
Entity-linked spans include additional fields in the output:

    {
      "id": "instance_001",
      "text": "Albert Einstein was born in Ulm, Germany in 1879.",
      "annotations": {
        "ner": {
          "spans": [
            {
              "text": "Albert Einstein",
              "start": 0,
              "end": 15,
              "label": "PERSON",
              "kb_id": "Q937",
              "kb_source": "wikidata",
              "kb_label": "Albert Einstein"
            },
            {
              "text": "Ulm",
              "start": 28,
              "end": 31,
              "label": "LOCATION",
              "kb_id": "Q3012",
              "kb_source": "wikidata",
              "kb_label": "Ulm"
            }
          ]
        }
      }
    }

Best Practices
- Enable auto-search for efficiency: it pre-populates the search with the span text.
- Don't require linking unless essential: a required link blocks annotation when no matching entity is found.
- Set the timeout to suit your network: slow connections may need more than the 10-second default.
- Match the KB to the entity type: use Wikidata for general entities, UMLS for biomedical terms, and custom APIs for domain-specific entities.
- Use multi-select for ambiguous mentions such as abbreviations, common names, and polysemous terms.
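
When linking is left optional, unlinked spans can be audited after annotation rather than blocked at save time. The sketch below checks an exported instance in the format shown under Data Format: it lists spans without a `kb_id` and spans whose offsets do not match their text. `audit_links` is a hypothetical helper, and the sketch assumes the single-entity output shape documented here.

```python
def audit_links(instance: dict, schema: str = "ner") -> dict:
    """Report unlinked spans and spans whose offsets disagree with their text."""
    text = instance["text"]
    spans = instance["annotations"][schema]["spans"]
    unlinked = [s for s in spans if not s.get("kb_id")]
    misaligned = [s for s in spans if text[s["start"]:s["end"]] != s["text"]]
    return {"total": len(spans), "unlinked": unlinked, "misaligned": misaligned}


# Example instance: one linked span and one span left unlinked by the annotator.
instance = {
    "id": "instance_001",
    "text": "Albert Einstein was born in Ulm, Germany in 1879.",
    "annotations": {
        "ner": {
            "spans": [
                {"text": "Ulm", "start": 28, "end": 31, "label": "LOCATION",
                 "kb_id": "Q3012", "kb_source": "wikidata", "kb_label": "Ulm"},
                {"text": "Germany", "start": 33, "end": 40, "label": "LOCATION"},
            ]
        }
    }
}
report = audit_links(instance)
```

Here `report["unlinked"]` contains only the "Germany" span, which a reviewer can then link or deliberately leave open.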
Further Reading
- Span Annotation - Basic span annotation setup
- Coreference Chains - Grouping entity mentions
- Event Annotation - N-ary event structures
For implementation details, see the source documentation.