Universal Dependencies - Dependency Parsing Annotation

Dependency parsing and POS tagging annotation based on Universal Dependencies v2 (Nivre et al., LREC 2020). Annotators build syntactic dependency trees and label parts of speech using the UD tagset.

Configuration file: config.yaml

# Universal Dependencies - Dependency Parsing Annotation
# Based on Nivre et al., LREC 2020
# Paper: https://aclanthology.org/2020.lrec-1.497/
# Dataset: https://universaldependencies.org/
#
# This task asks annotators to build dependency trees and tag parts of speech
# for English sentences following the Universal Dependencies v2 guidelines.
# The tree_annotation scheme is used for dependency structure, while the
# span scheme labels parts of speech.
#
# POS Tags (a subset of the Universal POS tagset):
# - NOUN: Common nouns
# - VERB: Main verbs
# - ADJ: Adjectives
# - ADV: Adverbs
# - DET: Determiners
# - ADP: Adpositions (prepositions, postpositions)
# - PRON: Pronouns
# - CONJ: Conjunctions (UD v1 tag; UD v2 splits it into CCONJ and SCONJ)
# - PUNCT: Punctuation marks
# - NUM: Numerals
#
# Annotation Guidelines:
# 1. Read the entire sentence before beginning annotation
# 2. Identify the root of the main clause (usually the main verb)
# 3. Build the dependency tree top-down from the root
# 4. Tag each token with its Universal POS tag
# 5. Use the span annotation to highlight and tag each word

annotation_task_name: "Universal Dependencies - Dependency Parsing Annotation"
task_dir: "."

data_files:
  - sample-data.json

item_properties:
  id_key: "id"
  text_key: "text"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

port: 8000
server_name: localhost

annotation_schemes:
  - annotation_type: tree_annotation
    name: dependency_tree
    description: "Build the syntactic dependency tree for this sentence"

  - annotation_type: span
    name: pos_tags
    description: "Tag each word with its Universal POS tag"
    labels:
      - "NOUN"
      - "VERB"
      - "ADJ"
      - "ADV"
      - "DET"
      - "ADP"
      - "PRON"
      - "CONJ"
      - "PUNCT"
      - "NUM"

annotation_instructions: |
  You will be shown an English sentence. Your task is to:
  1. Build a dependency tree by identifying the syntactic head of each word.
     - Start by finding the root (usually the main verb).
     - For each other word, identify which word it depends on.
  2. Tag each word with its Universal POS tag using span annotation.

  Follow the Universal Dependencies v2 guidelines for both dependency relations
  and part-of-speech tags.

html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #0369a1;">Sentence:</strong>
      <p style="font-size: 18px; line-height: 2.0; margin: 8px 0 0 0; font-family: monospace;">{{text}}</p>
    </div>
  </div>

allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
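
The head-selection step described in annotation_instructions can be sanity-checked offline. The following is a minimal sketch, not part of the showcase: the export format of the tree_annotation scheme is not shown here, so the head-index list is an assumed encoding (1-based head positions with 0 for the root, as in CoNLL-U), and the function names validate_heads and attachment_agreement are hypothetical. The heads given for the first sample sentence are one plausible analysis, not gold annotation. Because annotation_per_instance is 2, the sketch also computes the share of tokens on which two annotators chose the same head.

```python
from typing import List


def validate_heads(heads: List[int]) -> List[str]:
    """Return the problems found in a head-index encoding of a dependency tree.

    heads[i] is the 1-based position of the head of token i+1; 0 marks the root.
    This encoding is an assumption made for illustration.
    """
    problems: List[str] = []
    n = len(heads)
    roots = [i for i, h in enumerate(heads, start=1) if h == 0]
    if len(roots) != 1:
        problems.append(f"expected exactly one root, found {len(roots)}")
    for i, h in enumerate(heads, start=1):
        if h < 0 or h > n:
            problems.append(f"token {i}: head {h} out of range")
        elif h == i:
            problems.append(f"token {i}: token is its own head")
    if problems:
        return problems
    # Walk from each token up to the root; more than n steps implies a cycle.
    for i in range(1, n + 1):
        node, steps = i, 0
        while node != 0:
            node = heads[node - 1]
            steps += 1
            if steps > n:
                problems.append(f"token {i}: cycle detected")
                break
    return problems


def attachment_agreement(a: List[int], b: List[int]) -> float:
    """Fraction of tokens for which two annotators picked the same head."""
    if len(a) != len(b):
        raise ValueError("annotations cover different numbers of tokens")
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Tokens: The cat sat on the mat near the door .
annotator_1 = [2, 3, 0, 6, 6, 3, 9, 9, 6, 3]  # "door" attached to "mat"
annotator_2 = [2, 3, 0, 6, 6, 3, 9, 9, 3, 3]  # "door" attached to "sat"

print(validate_heads(annotator_1))                      # []
print(attachment_agreement(annotator_1, annotator_2))   # 0.9
```

The single disagreement above is the attachment of "near the door", which is the kind of ambiguity the two-annotator setting is meant to surface.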

Sample data: sample-data.json

[
  {
    "id": "ud_001",
    "text": "The cat sat on the mat near the door."
  },
  {
    "id": "ud_002",
    "text": "She quickly finished her homework before dinner."
  }
]

// ... and 8 more items
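
Before starting the server it can help to confirm that every item carries the fields named under item_properties. The sketch below is an illustration under assumptions rather than part of Potato: check_items is a hypothetical helper, and it only tests for the "id" and "text" keys from the config plus unique ids. The two items are copied from the sample above; the remaining items are not reproduced.

```python
import json
from typing import Any, List


def check_items(items: List[Any], id_key: str = "id", text_key: str = "text") -> List[str]:
    """Return a list of problems: missing fields, empty text, duplicate ids."""
    problems: List[str] = []
    seen = set()
    for pos, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"item {pos}: not a JSON object")
            continue
        item_id = item.get(id_key)
        text = item.get(text_key)
        if not isinstance(item_id, str) or not item_id:
            problems.append(f"item {pos}: missing or empty '{id_key}'")
        elif item_id in seen:
            problems.append(f"item {pos}: duplicate id '{item_id}'")
        else:
            seen.add(item_id)
        if not isinstance(text, str) or not text.strip():
            problems.append(f"item {pos}: missing or empty '{text_key}'")
    return problems


sample = json.loads("""
[
  {"id": "ud_001", "text": "The cat sat on the mat near the door."},
  {"id": "ud_002", "text": "She quickly finished her homework before dinner."}
]
""")

print(check_items(sample))  # []
```

To check the real file, load it with json.load on an open handle to sample-data.json and pass the result to check_items.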

Get this design

View on GitHub

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/parsing/ud-dependency-parsing
potato start config.yaml

Details

Annotation types

tree_annotation, span

Domain

NLP, Syntax, Linguistics

Use cases

Dependency Parsing, POS Tagging, Syntactic Analysis

Related designs

PDTB 2.0 - Discourse Relations Tree Annotation

OntoNotes - Coreference Resolution

Aspect-Based Sentiment Analysis