PDTB 2.0 - Discourse Relations Tree Annotation
Discourse relation annotation with tree structure, based on the Penn Discourse TreeBank 2.0 (Prasad et al., LREC 2008). Annotators identify discourse connectives, mark argument spans, and build hierarchical discourse trees representing how text segments relate to each other.
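One way to picture what a finished annotation contains is the minimal sketch below. It is not Potato's output format: the `Span`, `Relation`, and `span_of` names are hypothetical, and the sentence is taken from the sample data for illustration. It records one explicit relation as three character-offset spans plus a top-level sense, with a `children` list for nesting relations into a tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Top-level relation types named in this design.
SENSES = ("Temporal", "Contingency", "Comparison", "Expansion")


@dataclass
class Span:
    label: str  # "Connective", "Arg1", or "Arg2"
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive


@dataclass
class Relation:
    sense: str
    connective: Optional[Span]  # None would represent an implicit relation
    arg1: Span
    arg2: Span
    children: List["Relation"] = field(default_factory=list)  # nested relations


def span_of(text: str, phrase: str, label: str) -> Span:
    """Locate the first occurrence of phrase in text (hypothetical helper)."""
    start = text.index(phrase)  # raises ValueError if the phrase is absent
    return Span(label, start, start + len(phrase))


text = ("However, its stock price declined because investors were "
        "concerned about slowing growth in the Asian market.")

relation = Relation(
    sense="Contingency",
    connective=span_of(text, "because", "Connective"),
    arg1=span_of(text, "its stock price declined", "Arg1"),
    # Arg2 is the clause syntactically bound to the connective.
    arg2=span_of(text, "investors were concerned about slowing growth "
                       "in the Asian market", "Arg2"),
)
```

Storing offsets rather than copied strings keeps the spans tied to the source text, so overlapping or nested arguments can be checked later.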
Configuration file: config.yaml
# PDTB 2.0 - Discourse Relations Tree Annotation
# Based on Prasad et al., LREC 2008
# Paper: https://aclanthology.org/L08-1093/
# Dataset: https://catalog.ldc.upenn.edu/LDC2008T05
#
# This task involves two complementary annotation layers:
#   1. Span annotation: Mark discourse connectives and their argument spans
#      - Connective: The explicit discourse connective (e.g., "because", "however")
#      - Arg1: The first argument of the discourse relation
#      - Arg2: The second argument (syntactically bound to the connective)
#
#   2. Tree annotation: Build hierarchical discourse structure showing
#      how text segments relate to each other through:
#      - Temporal: before, after, simultaneous
#      - Contingency: cause, condition
#      - Comparison: contrast, concession
#      - Expansion: conjunction, restatement, alternative
#
# Guidelines:
#   - Arg2 is always the argument syntactically bound to the connective
#   - Arg1 is the other argument in the relation
#   - Implicit relations exist between adjacent sentences without connectives
annotation_task_name: "PDTB 2.0: Discourse Relations Tree Annotation"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
port: 8000
server_name: localhost
annotation_schemes:
  - annotation_type: span
    name: discourse_spans
    description: "Mark discourse connectives and their argument spans (Arg1, Arg2)"
    labels:
      - "Connective"
      - "Arg1"
      - "Arg2"
    tooltips:
      "Connective": "The explicit discourse connective word or phrase (e.g., because, however, although)"
      "Arg1": "The first argument of the discourse relation"
      "Arg2": "The second argument, syntactically bound to the connective"
  - annotation_type: tree_annotation
    name: discourse_tree
    description: "Build a hierarchical discourse tree showing how text segments relate through temporal, causal, comparative, and expansive relations"
annotation_instructions: |
  Annotate discourse relations in the text:
  1. Identify explicit discourse connectives (e.g., "because", "however", "although", "and").
  2. Mark the connective span, then mark Arg1 and Arg2 for each relation.
  3. Arg2 is always the argument syntactically bound to the connective.
  4. Build the discourse tree to show hierarchical relations between segments.
  5. Relation types: Temporal, Contingency, Comparison, Expansion.
html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #faf5ff; border: 1px solid #e9d5ff; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #7c3aed;">Text:</strong>
      <p style="font-size: 16px; line-height: 1.9; margin: 8px 0 0 0;">{{text}}</p>
    </div>
    <div style="background: #fefce8; border: 1px solid #fde68a; border-radius: 8px; padding: 12px;">
      <p style="font-size: 13px; color: #713f12; margin: 0;"><strong>Instructions:</strong> Highlight connectives and argument spans, then build the discourse tree structure.</p>
    </div>
  </div>
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
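Because `item_properties` names the keys each data item must carry (`id_key: "id"`, `text_key: "text"`), a quick pre-flight check on the data file can catch problems before the server starts. The sketch below is a stand-alone illustration, not part of Potato: `validate_items` is a hypothetical helper, and the inline JSON stands in for the contents of `sample-data.json`.

```python
import json


def validate_items(items, id_key="id", text_key="text"):
    """Return a list of problems; an empty list means the items look usable."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        # Every item needs both keys named in item_properties.
        for key in (id_key, text_key):
            if key not in item:
                problems.append(f"item {index}: missing key '{key}'")
        # IDs should be unique so annotations map back to one item.
        item_id = item.get(id_key)
        if item_id is not None:
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id '{item_id}'")
            seen_ids.add(item_id)
    return problems


# Inline stand-in for the data file; the second item is deliberately broken.
raw = """[
  {"id": "pdtb_001", "text": "The company reported strong quarterly earnings."},
  {"id": "pdtb_001"}
]"""

problems = validate_items(json.loads(raw))
for problem in problems:
    print(problem)
```

To check the real file, the same function could be called on `json.load(open("sample-data.json"))`, assuming the file is a JSON array as in the sample.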
Sample data: sample-data.json
[
  {
    "id": "pdtb_001",
    "text": "The company reported strong quarterly earnings. However, its stock price declined because investors were concerned about slowing growth in the Asian market. As a result, several analysts downgraded their recommendations."
  },
  {
    "id": "pdtb_002",
    "text": "Although the experiment yielded promising results, the sample size was too small to draw definitive conclusions. The researchers plan to replicate the study with a larger cohort. In addition, they will employ more rigorous statistical methods to account for potential confounds."
  }
]
// ... and 8 more items
Get this design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/discourse/pdtb-discourse-relations-tree
potato start config.yaml
Related designs
Universal Dependencies - Dependency Parsing Annotation
Dependency parsing and POS tagging annotation based on Universal Dependencies v2 (Nivre et al., LREC 2020). Annotators build syntactic dependency trees and label parts of speech using the UD tagset.
OntoNotes - Coreference Resolution
Coreference resolution annotation based on the OntoNotes 5.0 corpus (Pradhan et al., CoNLL 2012). Annotators identify coreferent mentions -- expressions that refer to the same real-world entity -- and link them into coreference chains across multi-sentence text.
Aspect-Based Sentiment Analysis
Identification of aspect terms in review text with sentiment polarity classification for each aspect. Based on SemEval-2016 Task 5 (ABSA).