Lexical Complexity Prediction
Predict the complexity of words in context using both Likert scale and continuous slider ratings, based on SemEval-2021 Task 1 (Shardlow et al.). Annotators assess how difficult a target word is for a non-native English speaker to understand.
Configuration file: config.yaml
# Lexical Complexity Prediction
# Based on Shardlow et al., SemEval 2021
# Paper: https://aclanthology.org/2021.semeval-1.1/
# Dataset: https://sites.google.com/view/lcpsharedtask2021
#
# Annotators rate how complex a target word is within its sentence context.
# Complexity is assessed on a 5-point Likert scale and a continuous slider
# from 0 (very simple) to 1 (very complex), targeting non-native speakers.
annotation_task_name: "Lexical Complexity Prediction"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
port: 8000
server_name: localhost
annotation_schemes:
  - annotation_type: likert
    name: complexity_likert
    description: "Rate the complexity of the highlighted target word on a 5-point scale."
    min_label: "Very Simple"
    max_label: "Very Complex"
    size: 5
  - annotation_type: slider
    name: complexity_slider
    description: "Rate the complexity of the target word on a continuous scale from 0 (simplest) to 1 (most complex)."
    min_value: 0
    max_value: 1
    starting_value: 0.5
annotation_instructions: |
  You will see a sentence with a highlighted target word. Your task is to rate how
  complex the target word is for a non-native English speaker.
  1. Read the sentence and identify the target word shown in bold.
  2. Rate the word's complexity on the 5-point Likert scale (Very Simple to Very Complex).
  3. Also provide a continuous complexity rating using the slider (0 = simplest, 1 = most complex).
  Consider factors like word frequency, number of syllables, and whether the word
  has a simpler synonym that could replace it.
html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #0369a1;">Sentence:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
    </div>
    <div style="background: #fefce8; border: 1px solid #fde68a; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #a16207;">Target Word:</strong>
      <span style="font-size: 18px; font-weight: bold; color: #b45309;">{{target_word}}</span>
    </div>
  </div>
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
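The layout in this config references `{{text}}` and `{{target_word}}`, and `item_properties` names `id` and `text` as keys, so each data item presumably needs all three fields. The following is a minimal sketch of a pre-flight check on the data, assuming the `{{...}}` placeholders map directly onto item keys (an assumption about the templating, not something this page confirms). `required_fields` and `check_items` are hypothetical helper names and are not part of Potato.

```python
import re

# Matches the {{name}} placeholders used in html_layout.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def required_fields(html_layout, id_key="id", text_key="text"):
    """Collect the item keys that the config appears to rely on."""
    fields = {id_key, text_key}
    fields.update(PLACEHOLDER.findall(html_layout))
    return fields


def check_items(items, fields):
    """Return (item id, problem) pairs; an empty list means no problems found."""
    problems = []
    for item in items:
        item_id = item.get("id", "<missing id>")
        for field in sorted(fields):
            if not item.get(field):
                problems.append((item_id, f"missing field: {field}"))
        # The target word should occur in its sentence (case-insensitive).
        text = item.get("text", "")
        target = item.get("target_word", "")
        if target and target.lower() not in text.lower():
            problems.append((item_id, "target word not found in text"))
    return problems


# Abbreviated stand-in for the html_layout value (assumption: only the
# placeholders matter for this check).
layout = "<p>{{text}}</p><span>{{target_word}}</span>"

items = [
    {
        "id": "lcp_001",
        "text": "The patient was diagnosed with a severe case of pneumonia "
                "after presenting with persistent cough and dyspnea.",
        "target_word": "dyspnea",
    },
    {
        "id": "lcp_002",
        "text": "The children were playing in the garden and seemed very happy.",
        "target_word": "happy",
    },
]

print(check_items(items, required_fields(layout)))
```

The substring test is deliberately loose: it would also accept a target that only appears inside a longer word, so a stricter token-level match may be preferable for real data.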
Sample data: sample-data.json
[
{
"id": "lcp_001",
"text": "The patient was diagnosed with a severe case of pneumonia after presenting with persistent cough and dyspnea.",
"target_word": "dyspnea"
},
{
"id": "lcp_002",
"text": "The children were playing in the garden and seemed very happy.",
"target_word": "happy"
}
]
// ... and 8 more items
Get this design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/semeval/2021/task01-lexical-complexity
potato start config.yaml
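Once ratings are collected, the Likert and slider responses can be put on a common 0 to 1 scale for comparison. The sketch below assumes Likert responses are integers from 1 to 5 and slider responses are floats in [0, 1]. The output file layout is not shown on this page, so the input lists are hypothetical, and the linear mapping (rating - 1) / (size - 1) is a common convention rather than a confirmed part of this design.

```python
from statistics import mean


def likert_to_unit(rating, size=5):
    """Map a 1..size Likert rating linearly onto [0, 1]."""
    if not 1 <= rating <= size:
        raise ValueError(f"rating {rating} outside 1..{size}")
    return (rating - 1) / (size - 1)


def aggregate(likert_ratings, slider_ratings, size=5):
    """Average each scheme across annotators, both on the 0-1 scale."""
    likert_mean = mean(likert_to_unit(r, size) for r in likert_ratings)
    slider_mean = mean(slider_ratings)
    return likert_mean, slider_mean


# Hypothetical ratings from two annotators for a single item
# (two matches annotation_per_instance in the config).
likert_mean, slider_mean = aggregate([4, 5], [0.7, 0.9])
print(f"likert (rescaled): {likert_mean:.3f}, slider: {slider_mean:.3f}")
```

A large gap between the two means for the same item may indicate that annotators are using the two scales differently, which is worth checking before merging them into a single score.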
Related designs
Graded Word Similarity in Context
Rate the semantic similarity of two words in their respective contexts on a graded scale, based on SemEval-2020 Task 3 (Armendariz et al.). Annotators assess how similar word meanings are when each word appears in a specific sentence context.
Multilingual Semantic Word Similarity
Graded word similarity judgment across multiple languages, based on SemEval-2017 Task 2. Annotators rate how semantically similar two words are on a continuous scale, supporting cross-lingual evaluation of distributional semantic models.
Semantic Textual Relatedness
Semantic textual relatedness task requiring annotators to rate the degree of semantic relatedness between sentence pairs using both a Likert scale and a continuous slider. Based on SemEval-2024 Task 1 (STR).