SciER: Scientific Entity and Relation Extraction
Entity and relation extraction for scientific papers. Annotators identify scientific entities (Method, Task, Material, Metric, Generic) and link them with typed relations (Used-for, Feature-of, Hyponym-of, Part-of, Compare, Conjunction, Evaluate-for).
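To make the task concrete, one annotated sentence can be pictured as a set of typed spans plus a directed relation between them. The record shape below is illustrative only (the field names are hypothetical, not Potato's actual output schema):

```python
# Illustrative record for one annotated sentence. Field names are
# hypothetical -- Potato's real output JSON may differ.
record = {
    "text": "We apply BERT to named entity recognition.",
    "entities": [
        {"id": "e1", "span": (9, 13), "label": "Method"},   # "BERT"
        {"id": "e2", "span": (17, 41), "label": "Task"},    # "named entity recognition"
    ],
    "relations": [
        # Relations are directional: the subject (BERT) is Used-for the object (the task).
        {"subj": "e1", "obj": "e2", "label": "Used-for"},
    ],
}

# Recover the surface form of each entity from its character span.
for ent in record["entities"]:
    start, end = ent["span"]
    print(ent["label"], "=", record["text"][start:end])
```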
Configuration file: config.yaml
# SciER: Scientific Entity and Relation Extraction
# Based on Zhang et al., EMNLP 2024
# Paper: https://aclanthology.org/2024.emnlp-main.59/
# Dataset: https://github.com/Alab-NII/SciER
#
# This task annotates scientific entities and relations in research papers.
# Annotators first identify scientific entities, then link them with typed
# relations describing how the entities interact.
#
# Entity Types:
# - Method: Algorithms, models, techniques, systems
# - Task: Research tasks, problems, objectives
# - Material: Datasets, corpora, resources, inputs
# - Metric: Evaluation measures, scores, performance indicators
# - Generic: General scientific terms not fitting other categories
#
# Relation Types:
# - Used-for: Entity A is used for entity B
# - Feature-of: Entity A is a feature/property of entity B
# - Hyponym-of: Entity A is a subtype of entity B
# - Part-of: Entity A is a component of entity B
# - Compare: Entity A is compared with entity B
# - Conjunction: Entity A and entity B are coordinated
# - Evaluate-for: Entity A is used to evaluate entity B
#
# Annotation Guidelines:
# 1. Read the sentence in context of the paper section
# 2. Identify all scientific entity mentions
# 3. Assign appropriate entity types
# 4. For each pair of related entities, select the relation type
# 5. Relations are directional: consider which entity is subject vs object
annotation_task_name: "SciER: Scientific Entity and Relation Extraction"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
annotation_schemes:
  # Step 1: Identify scientific entities
  - annotation_type: span
    name: scientific_entities
    description: "Highlight all scientific entities in the text"
    labels:
      - "Method"
      - "Task"
      - "Material"
      - "Metric"
      - "Generic"
    label_colors:
      "Method": "#3b82f6"
      "Task": "#ef4444"
      "Material": "#22c55e"
      "Metric": "#f59e0b"
      "Generic": "#8b5cf6"
    keyboard_shortcuts:
      "Method": "1"
      "Task": "2"
      "Material": "3"
      "Metric": "4"
      "Generic": "5"
    tooltips:
      "Method": "Algorithms, models, techniques, or systems (e.g., BERT, gradient descent, CNN)"
      "Task": "Research tasks, problems, or objectives (e.g., named entity recognition, machine translation)"
      "Material": "Datasets, corpora, resources, or inputs (e.g., CoNLL-2003, Wikipedia, training data)"
      "Metric": "Evaluation measures or performance indicators (e.g., F1 score, BLEU, accuracy)"
      "Generic": "General scientific terms not fitting other categories (e.g., features, parameters, embeddings)"
    allow_overlapping: false
  # Step 2: Link entities with typed relations
  - annotation_type: span_link
    name: scientific_relations
    description: "Draw relations between pairs of scientific entities"
    labels:
      - "Used-for"
      - "Feature-of"
      - "Hyponym-of"
      - "Part-of"
      - "Compare"
      - "Conjunction"
      - "Evaluate-for"
    tooltips:
      "Used-for": "Entity A is used for entity B (e.g., BERT is Used-for NER)"
      "Feature-of": "Entity A is a feature or property of entity B"
      "Hyponym-of": "Entity A is a subtype or instance of entity B"
      "Part-of": "Entity A is a component or part of entity B"
      "Compare": "Entity A is compared with entity B"
      "Conjunction": "Entity A and entity B are coordinated or listed together"
      "Evaluate-for": "Entity A is used to evaluate entity B (e.g., F1 is Evaluate-for NER)"
html_layout: |
  <div style="margin-bottom: 10px; padding: 8px; background: #f0f4f8; border-radius: 4px;">
    <strong>Section:</strong> {{context}}
  </div>
  <div style="font-size: 16px; line-height: 1.6;">
    {{text}}
  </div>
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
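The two schemes above produce linked annotations: span labels from `scientific_entities` and typed, directional links from `scientific_relations`. A minimal validation sketch for such output, assuming a hypothetical record shape (Potato's real JSON layout may differ):

```python
# Label sets copied from the config above.
ENTITY_LABELS = {"Method", "Task", "Material", "Metric", "Generic"}
RELATION_LABELS = {"Used-for", "Feature-of", "Hyponym-of", "Part-of",
                   "Compare", "Conjunction", "Evaluate-for"}

def validate(record):
    """Check that every label comes from the schema and that each
    relation endpoint refers to an annotated entity."""
    entity_ids = set()
    for ent in record.get("entities", []):
        if ent["label"] not in ENTITY_LABELS:
            return False
        entity_ids.add(ent["id"])
    for rel in record.get("relations", []):
        if rel["label"] not in RELATION_LABELS:
            return False
        # Relations are directional: both subject and object must exist.
        if rel["subj"] not in entity_ids or rel["obj"] not in entity_ids:
            return False
    return True
```

A check like this is useful as a post-annotation sanity pass, since span-link tools cannot always prevent a relation from outliving a deleted entity span.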
Sample data: sample-data.json
[
{
"id": "scier_001",
"text": "We apply BERT to the task of named entity recognition on the CoNLL-2003 dataset, achieving a new state-of-the-art F1 score of 93.5%.",
"context": "Experiments"
},
{
"id": "scier_002",
"text": "Our proposed transformer-based architecture incorporates a multi-head attention mechanism and positional encodings to capture long-range dependencies in text classification.",
"context": "Methods"
}
]
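Each item's fields are interpolated into the config's `html_layout` by filling the `{{context}}` and `{{text}}` placeholders. A rough sketch of that substitution (the `render` helper is hypothetical, not Potato's actual template engine):

```python
def render(layout: str, item: dict) -> str:
    """Replace {{key}} placeholders in the layout with the item's field values."""
    out = layout
    for key, value in item.items():
        out = out.replace("{{" + key + "}}", str(value))
    return out

item = {
    "id": "scier_001",
    "text": ("We apply BERT to the task of named entity recognition on the "
             "CoNLL-2003 dataset, achieving a new state-of-the-art F1 score of 93.5%."),
    "context": "Experiments",
}
html = render("<strong>Section:</strong> {{context}}\n{{text}}", item)
```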
// ... and 8 more items

Get this design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/relation-extraction/scier-scientific-entity-relations
potato start config.yaml
Found a problem or want to improve this design? Create an issue.

Related designs
CrossRE: Cross-Domain Relation Extraction
Cross-domain relation extraction across 6 domains (news, politics, science, music, literature, AI). Annotators identify entities and label 17 relation types between entity pairs, enabling study of domain transfer in relation extraction.
Event Coreference and Relations (MAVEN-ERE)
Unified event relation extraction covering coreference, temporal, causal, and subevent relations. Annotators identify event mentions and link them with typed relations (coreference, temporal_before/after/overlap, causal, subevent). Based on the MAVEN-ERE dataset which provides large-scale, unified annotations for four types of event relations.
Legal Event Coreference (LegalCore)
Event coreference resolution in legal documents including court opinions, contracts, and statutes. Annotators identify legal event mentions such as actions, states, obligations, and violations, then link coreferent events that refer to the same real-world event. Based on the LegalCore dataset for domain-specific event coreference in the legal domain.