Universal Dependencies - Dependency Parsing Annotation
Dependency parsing and POS tagging annotation based on Universal Dependencies v2 (Nivre et al., LREC 2020). Annotators build syntactic dependency trees and label parts of speech using the UD tagset.
Configuration file: config.yaml
# Universal Dependencies - Dependency Parsing Annotation
# Based on Nivre et al., LREC 2020
# Paper: https://aclanthology.org/2020.lrec-1.497/
# Dataset: https://universaldependencies.org/
#
# This task asks annotators to build dependency trees and tag parts of speech
# for English sentences following the Universal Dependencies v2 guidelines.
# The tree_annotation scheme is used for dependency structure, while the
# span scheme labels parts of speech.
#
# POS Tags (Universal POS):
# - NOUN: Common nouns
# - VERB: Main verbs
# - ADJ: Adjectives
# - ADV: Adverbs
# - DET: Determiners
# - ADP: Adpositions (prepositions, postpositions)
# - PRON: Pronouns
# - CCONJ: Coordinating conjunctions
# - PUNCT: Punctuation marks
# - NUM: Numerals
#
# Annotation Guidelines:
# 1. Read the entire sentence before beginning annotation
# 2. Identify the root of the main clause (usually the main verb)
# 3. Build the dependency tree top-down from the root
# 4. Tag each token with its Universal POS tag
# 5. Use the span annotation to highlight and tag each word
annotation_task_name: "Universal Dependencies - Dependency Parsing Annotation"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
port: 8000
server_name: localhost
annotation_schemes:
  - annotation_type: tree_annotation
    name: dependency_tree
    description: "Build the syntactic dependency tree for this sentence"
  - annotation_type: span
    name: pos_tags
    description: "Tag each word with its Universal POS tag"
    labels:
      - "NOUN"
      - "VERB"
      - "ADJ"
      - "ADV"
      - "DET"
      - "ADP"
      - "PRON"
      - "CCONJ"
      - "PUNCT"
      - "NUM"
annotation_instructions: |
  You will be shown an English sentence. Your task is to:
  1. Build a dependency tree by identifying the syntactic head of each word.
     - Start by finding the root (usually the main verb).
     - For each other word, identify which word it depends on.
  2. Tag each word with its Universal POS tag using span annotation.
  Follow the Universal Dependencies v2 guidelines for both dependency relations
  and part-of-speech tags.
html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #0369a1;">Sentence:</strong>
      <p style="font-size: 18px; line-height: 2.0; margin: 8px 0 0 0; font-family: monospace;">{{text}}</p>
    </div>
  </div>
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
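The dependency_tree scheme asks annotators to pick one syntactic head per word. The following is a minimal Python sketch of how such a tree could be checked for well-formedness. It assumes the tree is stored as a list of 1-based head indices with 0 marking the root (the convention used by the CoNLL-U format); the tool's own output format may differ, and `validate_tree` is a hypothetical helper, not part of the showcase. The head list shown is one plausible analysis of the first sample sentence; the attachment of "near the door" could reasonably be annotated differently.

```python
def validate_tree(heads):
    """Return a list of problems found in a head-index list (empty if valid).

    heads[i] is the 1-based index of the head of token i + 1; 0 marks the root.
    This storage layout is an assumption made for illustration.
    """
    n = len(heads)
    problems = []
    roots = [i + 1 for i, h in enumerate(heads) if h == 0]
    if len(roots) != 1:
        problems.append(f"expected exactly one root, found {len(roots)}")
    for i, h in enumerate(heads, start=1):
        if not 0 <= h <= n:
            problems.append(f"token {i}: head {h} is out of range")
        elif h == i:
            problems.append(f"token {i}: token is its own head")
    if problems:
        return problems
    # Follow head links upward from every token. In a valid tree the root is
    # reached in at most n steps, so a longer walk means there is a cycle.
    for i in range(1, n + 1):
        node, steps = i, 0
        while node != 0:
            node = heads[node - 1]
            steps += 1
            if steps > n:
                problems.append(f"token {i}: cycle detected")
                break
    return problems


tokens = ["The", "cat", "sat", "on", "the", "mat", "near", "the", "door", "."]
heads = [2, 3, 0, 6, 6, 3, 9, 9, 6, 3]  # "sat" (token 3) is the root
print(validate_tree(heads))  # prints []
```

A check like this only verifies tree shape (single root, no cycles); it says nothing about whether the chosen heads follow the UD v2 guidelines.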
Sample data: sample-data.json
[
  {
    "id": "ud_001",
    "text": "The cat sat on the mat near the door."
  },
  {
    "id": "ud_002",
    "text": "She quickly finished her homework before dinner."
  }
]
// ... and 8 more items
Get this design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/parsing/ud-dependency-parsing
potato start config.yaml
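Before running `potato start` on your own data, it can help to confirm that every item carries the keys named under item_properties in config.yaml (`id` and `text`). The sketch below is one way to do that in Python; `check_items` is a hypothetical helper, and flagging duplicate ids rests on the assumption that ids are meant to be unique.

```python
import json


def check_items(items, id_key="id", text_key="text"):
    """Return a list of problems in a list of data items (empty if none)."""
    problems = []
    seen = set()
    for pos, item in enumerate(items):
        item_id = item.get(id_key)
        if not item_id:
            problems.append(f"item {pos}: missing '{id_key}'")
        elif item_id in seen:
            # Assumption: ids should be unique within a data file.
            problems.append(f"item {pos}: duplicate id '{item_id}'")
        else:
            seen.add(item_id)
        text = item.get(text_key)
        if not isinstance(text, str) or not text.strip():
            problems.append(f"item {pos}: missing or empty '{text_key}'")
    return problems


# The two items shown in the sample data, inlined so the sketch is self-contained.
sample = json.loads("""
[
  {"id": "ud_001", "text": "The cat sat on the mat near the door."},
  {"id": "ud_002", "text": "She quickly finished her homework before dinner."}
]
""")
print(check_items(sample))  # prints []
```

To check a file on disk instead, load it with `json.load` and pass the resulting list to `check_items`.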
Found a problem or want to improve this design? Open an issue.
Related designs
PDTB 2.0 - Discourse Relations Tree Annotation
Discourse relation annotation with tree structure, based on the Penn Discourse TreeBank 2.0 (Prasad et al., LREC 2008). Annotators identify discourse connectives, mark argument spans, and build hierarchical discourse trees representing how text segments relate to each other.
OntoNotes - Coreference Resolution
Coreference resolution annotation based on the OntoNotes 5.0 corpus (Pradhan et al., CoNLL 2012). Annotators identify coreferent mentions -- expressions that refer to the same real-world entity -- and link them into coreference chains across multi-sentence text.
Aspect-Based Sentiment Analysis
Identification of aspect terms in review text with sentiment polarity classification for each aspect. Based on SemEval-2016 Task 5 (ABSA).