SemEval-2007 - Word Sense Disambiguation
Word sense disambiguation task based on the SemEval-2007 English lexical sample (Pradhan et al.). Annotators identify the correct sense of a target word in context from a provided list of sense definitions.
Configuration file: config.yaml
# SemEval-2007 - Word Sense Disambiguation
# Based on Pradhan et al., SemEval 2007
# Paper: https://aclanthology.org/S07-1003/
# Dataset: https://web.eecs.umich.edu/~mihalcea/downloads.html
#
# This task presents a sentence with a highlighted target word and asks
# annotators to select the correct sense from a list of definitions.
# Annotators also judge the clarity of the sense distinction.
#
# Sense Options:
# - Sense 1-4: Predefined sense definitions for the target word
# - None of the above: For cases where no provided sense fits
#
# Context Clarity:
# - Clear Sense: The context unambiguously indicates one sense
# - Ambiguous: The context could support multiple senses
# - Insufficient Context: Not enough context to determine the sense
#
# Annotation Guidelines:
# 1. Read the full sentence and identify the target word
# 2. Review all sense definitions provided
# 3. Select the sense that best fits the usage in context
# 4. Judge whether the context makes the sense clear or ambiguous
annotation_task_name: "SemEval-2007 - Word Sense Disambiguation"
task_dir: "."
data_files:
- sample-data.json
item_properties:
id_key: "id"
text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
port: 8000
server_name: localhost
annotation_schemes:
- annotation_type: select
name: word_sense
description: "Select the correct sense of the target word in this context"
labels:
- "Sense 1"
- "Sense 2"
- "Sense 3"
- "Sense 4"
- "None of the above"
- annotation_type: radio
name: context_clarity
description: "How clear is the word sense from the given context?"
labels:
- "Clear Sense"
- "Ambiguous"
- "Insufficient Context"
keyboard_shortcuts:
"Clear Sense": "1"
"Ambiguous": "2"
"Insufficient Context": "3"
tooltips:
"Clear Sense": "The context clearly indicates one specific sense of the target word"
"Ambiguous": "The context could support multiple senses of the target word"
"Insufficient Context": "There is not enough context to reliably determine the sense"
annotation_instructions: |
You will see a sentence with a target word highlighted. Below the sentence,
you will find definitions for different senses of the target word.
1. Read the sentence carefully, paying attention to the context around the target word.
2. Review all the sense definitions provided.
3. Select the sense that best matches how the word is used in this sentence.
4. If none of the provided senses fit, select "None of the above."
5. Judge whether the context makes the sense clear, ambiguous, or insufficient.
html_layout: |
<div style="padding: 15px; max-width: 800px; margin: auto;">
<div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
<strong style="color: #0369a1;">Sentence:</strong>
<p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
</div>
<div style="background: #fef3c7; border: 1px solid #fde68a; border-radius: 8px; padding: 12px; margin-bottom: 16px; display: inline-block;">
<strong style="color: #92400e;">Target Word:</strong>
<span style="font-size: 16px; font-weight: bold; color: #78350f;">{{target_word}}</span>
</div>
<div style="background: #f8fafc; border: 1px solid #e2e8f0; border-radius: 8px; padding: 16px;">
<strong style="color: #475569;">Sense Definitions:</strong>
<p style="font-size: 15px; line-height: 1.7; margin: 8px 0 0 0; white-space: pre-line;">{{sense_definitions}}</p>
</div>
</div>
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
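The config assigns two annotators to each instance (`annotation_per_instance: 2`), which makes pairwise agreement a natural quality check once annotations are collected. A minimal sketch of Cohen's kappa over two annotators' sense labels, using only the standard library (the label lists below are illustrative; this does not assume any particular Potato output schema):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items in order.

    observed: fraction of items where the annotators agree.
    expected: chance agreement from each annotator's label distribution.
    """
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[l] * counts_b[l] for l in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)
```

Values near 1.0 indicate strong agreement; values near 0 mean agreement is no better than chance, which for WSD often signals that the sense inventory is too fine-grained or the contexts are genuinely ambiguous.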
Sample data: sample-data.json
[
{
"id": "wsd_001",
"text": "The bank approved the loan application after reviewing the applicant's credit history.",
"target_word": "bank",
"sense_definitions": "Sense 1: A financial institution that accepts deposits and channels the money into lending activities.\nSense 2: The land alongside or sloping down to a river or lake.\nSense 3: A long pile or heap of something.\nSense 4: A set of similar things arranged in a row."
},
{
"id": "wsd_002",
"text": "She decided to run for the local council seat in the upcoming election.",
"target_word": "run",
"sense_definitions": "Sense 1: To move at a speed faster than a walk, with both feet off the ground.\nSense 2: To compete as a candidate in a political election.\nSense 3: To operate or manage a business or organization.\nSense 4: To flow or extend in a particular direction."
}
]
// ... and 8 more items

Get This Design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/word-sense/wsd-semeval2007
potato start config.yaml
Related Designs
Financial PhraseBank - Sentiment Classification
Sentiment classification of financial news sentences based on the Financial PhraseBank dataset (Malo et al., JASIST 2014). Annotators classify sentences from financial news articles into fine-grained and coarse sentiment categories.
KG-BERT Knowledge Graph Triple Validation
Validate knowledge graph triples for correctness and annotate relation types based on the KG-BERT framework. Annotators assess whether entity-relation-entity triples are valid, classify the relation type, and provide entity descriptions.
MS MARCO - Passage Relevance Ranking
Passage relevance ranking based on the MS MARCO dataset (Nguyen et al., NeurIPS 2016 Workshop). Annotators assess the relevance of a candidate passage to a given search query using a graded relevance scale.