Complex Named Entity Recognition (MultiCoNER)
Recognize complex and emerging named entities. Based on SemEval 2022/2023 MultiCoNER. Identify creative works, products, groups, and other challenging entity types.
Configuration file: config.yaml
# Complex Named Entity Recognition (MultiCoNER)
# Based on SemEval 2022/2023 MultiCoNER Shared Tasks
# Paper: https://aclanthology.org/2022.semeval-1.196/
#
# Traditional NER focuses on Person, Location, Organization.
# Complex NER handles challenging entities like:
# - Creative works ("Dial M for Murder", "Game of Thrones")
# - Products ("iPhone 15", "Tesla Model S")
# - Groups ("Anonymous", "BTS Army")
#
# Entity Types (Coarse-grained):
# - PER: Person names
# - LOC: Locations, facilities
# - CORP: Corporations, businesses
# - GRP: Other groups (bands, teams, movements)
# - PROD: Products (consumer goods, vehicles)
# - CW: Creative works (movies, books, songs)
#
# Challenges:
# - Creative works can be any linguistic form
# - Product names blend with common words
# - Group names may be descriptive phrases
# - Emerging entities lack context
#
# Annotation Guidelines:
# 1. Mark the full entity span including modifiers
# 2. Creative works include titles in any form
# 3. Products include brand + product name
# 4. When uncertain, consider: would this have a Wikipedia page?
annotation_task_name: "Complex Named Entity Recognition"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
annotation_schemes:
  - annotation_type: span
    name: entities
    description: "Highlight all named entities in the text"
    labels:
      - "Person"
      - "Location"
      - "Corporation"
      - "Group"
      - "Product"
      - "Creative Work"
    label_colors:
      "Person": "#3b82f6"
      "Location": "#22c55e"
      "Corporation": "#8b5cf6"
      "Group": "#f59e0b"
      "Product": "#06b6d4"
      "Creative Work": "#ec4899"
    tooltips:
      "Person": "Names of people (including fictional characters)"
      "Location": "Places, addresses, facilities, geographic features"
      "Corporation": "Companies, businesses, corporations"
      "Group": "Other groups: bands, sports teams, movements, organizations"
      "Product": "Consumer products: devices, vehicles, software, games"
      "Creative Work": "Movies, TV shows, books, songs, albums, artworks"
    allow_overlapping: false
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
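The six UI labels in the config correspond to the coarse-grained MultiCoNER codes listed in the comments (PER, LOC, CORP, GRP, PROD, CW). The following Python sketch shows one way that mapping could be used to turn character-offset spans into token-level BIO tags for downstream training. The `(start, end, label)` tuple layout, the whitespace tokenizer, and the helper names `spans_to_bio` and `find_span` are assumptions made for illustration; they are not Potato's actual output schema.

```python
import re

# UI labels from config.yaml mapped to the coarse codes in the config comments.
LABEL_TO_TAG = {
    "Person": "PER",
    "Location": "LOC",
    "Corporation": "CORP",
    "Group": "GRP",
    "Product": "PROD",
    "Creative Work": "CW",
}


def find_span(text, phrase, label):
    """Hypothetical helper: build a (start, end, label) span for a phrase."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


def spans_to_bio(text, spans):
    """Convert character spans (end exclusive) to (token, BIO tag) pairs.

    Assumes non-overlapping spans, matching allow_overlapping: false.
    """
    # Whitespace tokenization, keeping each token's character offsets.
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    tags = ["O"] * len(tokens)
    for start, end, label in sorted(spans):
        code = LABEL_TO_TAG[label]
        first = True
        for i, (_, tok_start, tok_end) in enumerate(tokens):
            # A token belongs to the span if the two character ranges overlap.
            if tok_start < end and tok_end > start:
                tags[i] = ("B-" if first else "I-") + code
                first = False
    return [(tok, tag) for (tok, _, _), tag in zip(tokens, tags)]


text = ("Apple released the new iPhone 15 Pro Max yesterday "
        "at their headquarters in Cupertino.")
spans = [
    find_span(text, "Apple", "Corporation"),
    find_span(text, "iPhone 15 Pro Max", "Product"),
    find_span(text, "Cupertino", "Location"),
]
for token, tag in spans_to_bio(text, spans):
    print(f"{token}\t{tag}")
```

Because tokens are split on whitespace only, trailing punctuation stays attached to the token (here `Cupertino.`), so a real pipeline would likely swap in the tokenizer used by the target model.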
Sample data: sample-data.json
[
  {
    "id": "cner_001",
    "text": "I just finished watching Breaking Bad on Netflix. It's one of the best shows ever made."
  },
  {
    "id": "cner_002",
    "text": "Apple released the new iPhone 15 Pro Max yesterday at their headquarters in Cupertino."
  }
]
// ... and 8 more items
Download this design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/named-entity-recognition/complex-ner
potato start config.yaml
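The config's `item_properties` expect every item in the data file to carry an `id` and a `text` field. When swapping in your own data, a quick pre-flight check before `potato start` might look like the Python sketch below. The `check_items` helper is hypothetical, and the rule that ids should be unique is an assumption that seems reasonable rather than a documented requirement.

```python
import json

ID_KEY = "id"      # item_properties.id_key in config.yaml
TEXT_KEY = "text"  # item_properties.text_key in config.yaml


def check_items(items):
    """Return a list of human-readable problems found in the data items."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        item_id = item.get(ID_KEY)
        if not item_id:
            problems.append(f"item {index}: missing '{ID_KEY}'")
        elif item_id in seen_ids:
            # Assumption: ids are expected to be unique across the file.
            problems.append(f"item {index}: duplicate id '{item_id}'")
        else:
            seen_ids.add(item_id)
        text = item.get(TEXT_KEY)
        if not isinstance(text, str) or not text.strip():
            problems.append(f"item {index}: missing or empty '{TEXT_KEY}'")
    return problems


# Inline copy of the two sample items; for a real file, read and parse
# sample-data.json with json.load instead.
sample = json.loads("""
[
  {"id": "cner_001",
   "text": "I just finished watching Breaking Bad on Netflix. It's one of the best shows ever made."},
  {"id": "cner_002",
   "text": "Apple released the new iPhone 15 Pro Max yesterday at their headquarters in Cupertino."}
]
""")
print(check_items(sample) or "no problems found")
```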
Details
Annotation types
Domain
Use cases
Tags
Found a problem or want to improve this design?
Open an issue
Related designs
CrossRE: Cross-Domain Relation Extraction
Cross-domain relation extraction across 6 domains (news, politics, science, music, literature, AI). Annotators identify entities and label 17 relation types between entity pairs, enabling study of domain transfer in relation extraction.
Dialogue Relation Extraction (DialogRE)
Extract relations between entities in dialogue. Based on Yu et al., ACL 2020. Identify 36 relation types between speakers and entities mentioned in conversations.
Event Annotation
N-ary event annotation with trigger spans and typed argument roles. Annotate events like ATTACK, HIRE, and TRAVEL with constrained entity arguments and hub-spoke arc visualization.