Annotation Showcase

Browse 378+ ready-to-use annotation configurations. Download configs and start annotating immediately.

Text Annotation (172)

Adverse Drug Event Extraction (CADEC)

Named entity recognition for adverse drug events from patient-reported experiences, based on the CADEC corpus (Karimi et al., 2015). Annotates drugs, adverse effects, symptoms, diseases, and findings from colloquial health forum posts with mapping to medical vocabularies (SNOMED-CT, MedDRA).

span, radio

Biomedical NLP

AMI Meeting Multi-Tier Annotation

advanced

Multi-tier ELAN-style annotation of multi-party meeting recordings. Annotators segment speaker turns, head gestures, and focus of attention on parallel timeline tiers, then classify dialogue acts and topic segments. Based on the AMI Meeting Corpus.

video_annotation, radio

Discourse Analysis

Analysis of Clinical Text: Disorder Identification and Normalization

advanced

Identify disorder mentions and their attributes in clinical discharge summaries, based on SemEval-2015 Task 14 (Elhadad et al.). Annotators mark disorder spans, body locations, severity indicators, and classify the assertion status of each disorder.

span, radio

SemEval

Aspect-Based Sentiment Analysis

intermediate

Identification of aspect terms in review text with sentiment polarity classification for each aspect. Based on SemEval-2016 Task 5 (ABSA).

span, radio

SemEval

Aspect-Based Sentiment Analysis (Original ABSA)

intermediate

Identify aspect terms in review text and classify their sentiment polarity, based on SemEval-2014 Task 4 (Pontiki et al.). Annotators highlight aspect terms and assign sentiment labels across restaurant and laptop review domains.

span, radio

SemEval

Biomedical Entity Linking (MedMentions)

advanced

Entity mention detection and UMLS concept linking for biomedical text based on MedMentions. Annotators identify biomedical entity mentions in PubMed abstracts and link them to UMLS Concept Unique Identifiers (CUIs), supporting large-scale biomedical knowledge base construction and clinical NLP.

radio, span, text

Biomedical NLP

Biomedical Named Entity Recognition (JNLPBA)

advanced

Named entity recognition for biomedical text based on the JNLPBA shared task. Annotate entities including proteins, DNA, RNA, cell lines, and cell types following BioNLP community standards.

span

Biomedical
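
Span annotations like the entity types above are often converted to token-level BIO tags before model training. The following is a minimal sketch of that conversion, assuming character-offset spans with exclusive end positions, non-overlapping spans, and simple whitespace tokenization. The function name `spans_to_bio` and the example sentence are hypothetical and not part of any showcase config.

```python
def spans_to_bio(text, spans):
    """Convert character-offset spans to BIO tags over whitespace tokens.

    spans: list of (start, end, label) tuples, end exclusive.
    Assumes spans do not overlap and align with token boundaries.
    """
    tokens, tags = [], []
    pos = 0
    for tok in text.split():
        start = text.index(tok, pos)  # locate the token at or after pos
        end = start + len(tok)
        pos = end
        tag = "O"
        for s, e, label in spans:
            if start >= s and end <= e:
                # First token of a span gets B-, later tokens get I-
                tag = ("B-" if start == s else "I-") + label
                break
        tokens.append(tok)
        tags.append(tag)
    return tokens, tags


# Hypothetical example using JNLPBA-style labels
text = "IL-2 gene expression in T cells"
spans = [(0, 9, "DNA"), (24, 31, "cell_type")]
tokens, tags = spans_to_bio(text, spans)
for tok, tag in zip(tokens, tags):
    print(f"{tok}\t{tag}")
```

Real pipelines usually use a proper tokenizer and need a policy for spans that cut through tokens. This sketch simply leaves such tokens tagged `O`.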

BioNLP 2011 - Gene Regulation Event Extraction

advanced

Biomedical event extraction for gene regulation, based on the BioNLP 2011 Shared Task (Kim et al., ACL Workshop 2011). Annotators identify biological entities and mark regulatory events such as gene expression, transcription, and protein catabolism in scientific abstracts.

event_annotation, span

NLP

Audio Annotation (30)

DISPLACE 2024 - Speaker and Language Diarization

advanced

Speaker and language diarization in multilingual conversational audio. Annotators mark speaker turn boundaries, identify speakers, and label the language of each segment in conversational environments (Kundu et al., INTERSPEECH 2024).

radio, span

Speech Processing

Sound Event Detection

advanced

Temporal sound event annotation with strong labels following DCASE Challenge protocols.

span, multiselect

Audio

Speaker Diarization

intermediate

Identify and label different speakers in audio recordings with timestamp-based segment annotation.

span, radio

Speech

ToBI Prosodic Annotation

advanced

Multi-tier prosodic annotation following the Tones and Break Indices (ToBI) framework. Annotators label pitch accents, phrase accents, boundary tones, and break indices on speech utterances, producing a layered prosodic transcription aligned to the audio timeline (Silverman et al., Speech Communication 1992).

span, radio, text

Prosody

Acoustic Scene Classification

beginner

Classify audio recordings by acoustic environment following the TUT/DCASE dataset format.

radio, likert

Audio

Audio Transcription Review

intermediate

Review and correct automatic speech recognition transcriptions with waveform visualization.

likert, multiselect, radio, +1

Audio

Audio-Visual Sentiment Analysis

intermediate

Rate sentiment in speech segments following CMU-MOSI and CMU-MOSEI multimodal annotation protocols.

likert, radio

Audio

AudioHate - Audio Hate Speech Detection

intermediate

Audio hate speech detection with explanations. Annotators classify audio clips for hate speech presence, identify target groups, and note acoustic indicators such as tone, emphasis, and prosody (Guo et al., SIGDIAL 2024).

radio, multiselect

Speech Processing

Image Annotation (40)

Breakfast Actions Segmentation

advanced

Fine-grained temporal action segmentation of breakfast preparation activities. Annotators label sequences of cooking actions like 'take cup', 'pour milk', 'stir'.

text, video_annotation

Computer Vision

EPIC-KITCHENS Egocentric Action Annotation

advanced

Annotate fine-grained actions in egocentric kitchen videos with verb-noun pairs. Identify cooking actions from a first-person perspective.

radio, text, video_annotation

Computer Vision

FineGym Action Segmentation

advanced

Annotate fine-grained gymnastic actions with hierarchical labels. Identify specific elements, sub-actions, and routines in competition videos.

radio, video_annotation

Computer Vision

FineSports Fine-grained Action Recognition

advanced

Fine-grained sports action annotation with hierarchical labels and person tracking. Annotators draw bounding boxes around athletes and label fine-grained actions within a sports action hierarchy.

multiselect, video_annotation

Computer Vision

Harmony4D Human Interaction Tracking

advanced

Close-range human interaction tracking and annotation. Annotators track multiple people during close physical interactions (dancing, martial arts, collaborative tasks) with bounding boxes and interaction labels.

multiselect, video_annotation

Computer Vision

How2Sign Sign Language Multi-Tier Annotation

advanced

Multi-tier ELAN-style annotation of continuous American Sign Language videos. Annotators segment sign glosses, mark mouthing patterns, classify sign handedness, and provide English translations aligned to video timelines. Based on the How2Sign large-scale multimodal ASL dataset.

video_annotation, radio, text

Sign Language

MSAD Multi-Scenario Anomaly Detection

intermediate

Video anomaly detection across multiple scenarios. Annotators watch surveillance-style videos and mark temporal segments containing anomalous events, classifying the anomaly type.

video_annotation, radio

Computer Vision

ADE20K Semantic Segmentation

advanced

Comprehensive scene parsing with 150 semantic categories (Zhou et al., CVPR 2017). Annotate indoor and outdoor scenes with pixel-level labels covering objects, parts, and stuff classes.

multiselect, radio

Computer Vision

Video Annotation (28)

ActivityNet Captions Dense Annotation

advanced

Dense temporal annotation with natural language descriptions. Annotators segment videos into events and write descriptive captions for each temporal segment.

video_annotation, text

Computer Vision

ActivityNet Temporal Localization

intermediate

Temporal activity localization in untrimmed videos. Annotators identify activity instances by marking precise start and end timestamps across 200 activity classes.

video_annotation

Computer Vision
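
Start and end timestamps from two annotators (or an annotator and a reference) are commonly compared with temporal intersection-over-union. The following is a minimal sketch, assuming each segment is a `(start, end)` pair in seconds. The function name `temporal_iou` and the example values are illustrative only.

```python
def temporal_iou(seg_a, seg_b):
    """Temporal IoU of two (start, end) segments given in seconds."""
    # Overlap is zero when the segments are disjoint
    inter = max(0.0, min(seg_a[1], seg_b[1]) - max(seg_a[0], seg_b[0]))
    union = (seg_a[1] - seg_a[0]) + (seg_b[1] - seg_b[0]) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical segments marked by two annotators for the same activity
annotator_1 = (10.0, 20.0)
annotator_2 = (15.0, 25.0)
print(round(temporal_iou(annotator_1, annotator_2), 3))  # 5 s overlap over a 15 s union
```

A threshold on this value (0.5 is a common choice) can then decide whether two marked segments count as the same activity instance.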

AVA Atomic Visual Actions

advanced

Spatio-temporal action annotation in movie clips. Annotators localize people with bounding boxes and label their atomic actions (pose, person-object, person-person interactions) in 1-second intervals.

multiselect, video_annotation

Computer Vision

Charades Indoor Activity Segmentation

intermediate

Multi-label temporal activity segmentation in indoor home videos. Annotators identify action instances using compositional verb-object labels (e.g., 'opening door', 'sitting on chair') with precise temporal boundaries.

video_annotation

Computer Vision

Charades-STA Temporal Grounding

intermediate

Ground natural language descriptions to video segments. Given a sentence describing an action, identify the exact temporal boundaries where that action occurs.

radio, video_annotation

Computer Vision

Clinical TempEval - Temporal Information Extraction from Clinical Notes

advanced

Extraction of temporal information from clinical text, identifying time expressions, event mentions, and their temporal relations. Based on SemEval-2016 Task 12 (Clinical TempEval).

span, radio

SemEval

DiDeMo Moment Retrieval

intermediate

Localizing natural language descriptions to specific video moments. Given a text query, annotators identify the corresponding temporal segment in the video.

radio, video_annotation

Computer Vision

Ego4D: Egocentric Video Episodic Memory Annotation

advanced

Annotate egocentric (first-person) video for episodic memory tasks including activity segmentation, hand state tracking, natural language query generation, and scene narration. Supports temporal segment annotation with multiple label tiers for the Ego4D benchmark.

video_annotation, text, radio

Egocentric Vision

Comparison Tasks (2)

Best-Worst Scaling

beginner

MaxDiff annotation where annotators select the best and worst items from a set for relative comparison.

best-worst

NLP
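
Best-worst judgments are typically turned into per-item scores by simple counting: the number of times an item was chosen best, minus the number of times it was chosen worst, divided by the number of times it was shown. The following is a minimal sketch of that counting procedure. The input layout (tuples of `(items, best, worst)`) and the name `bws_scores` are assumptions for illustration, not the export format of any particular tool.

```python
from collections import Counter


def bws_scores(judgments):
    """Score items from best-worst judgments by counting.

    judgments: iterable of (items, best, worst), where items is the set
    shown to the annotator and best/worst are the selected items.
    Returns a dict mapping item -> score in [-1, 1].
    """
    best, worst, seen = Counter(), Counter(), Counter()
    for items, b, w in judgments:
        seen.update(items)  # every shown item counts as one appearance
        best[b] += 1
        worst[w] += 1
    return {item: (best[item] - worst[item]) / seen[item] for item in seen}


# Hypothetical judgments over five items, shown four at a time
judgments = [
    (("A", "B", "C", "D"), "A", "D"),
    (("A", "B", "C", "E"), "B", "C"),
    (("A", "C", "D", "E"), "A", "D"),
]
scores = bws_scores(judgments)
for item, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{item}: {score:+.2f}")
```

Scores near +1 mean an item was almost always picked as best and scores near -1 mean it was almost always picked as worst. Sorting by score gives a relative ranking of the items.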

Ranking Task

beginner

Drag-and-drop ranking interface to order items from best to worst.

ranking

NLP

Preference Learning (25)

Interpretable Semantic Textual Similarity

advanced

Fine-grained semantic similarity assessment between sentence pairs with span alignment, combining chunk-level annotation with graded similarity scoring. Based on SemEval-2016 Task 2.

span, likert

SemEval

SaGA Gesture-Speech Alignment Multi-Tier Annotation

advanced

Multi-tier ELAN-style annotation of co-speech gestures and their alignment with spoken language. Annotators segment gesture phases and types on parallel timeline tiers, classify handedness and spatial reference frames, and transcribe concurrent speech. Based on the SaGA corpus.

video_annotation, radio, text

Gesture Studies

AlpacaEval: Instruction-Following Preference Evaluation

intermediate

Pairwise preference annotation for instruction-following language models. Annotators compare two model responses side by side, select their preferred response, indicate preference strength, and rate individual response quality across diverse instruction categories.

pairwise, radio, text

AI Evaluation

AlpacaFarm Preference Simulation

intermediate

Simulate human preferences for instruction-following responses. Create preference data for efficient RLHF research and LLM evaluation.

likert, radio

Natural Language Processing

Arena Hard Auto - LLM Pairwise Evaluation

intermediate

Pairwise evaluation of LLM responses on challenging prompts from the Arena Hard benchmark (Li et al., arXiv 2024). Annotators compare two responses on a continuous scale and rate question difficulty.

pairwise, likert

NLP

BeaverTails Safety Preference

advanced

Annotate AI responses for safety across multiple harm categories. Identify unsafe content and rate response quality for building safer AI systems.

multiselect, radio

AI Safety

Chatbot Arena - Pairwise Comparison with Best-Worst Scaling

intermediate

Pairwise comparison and best-worst scaling of chatbot responses, based on the Chatbot Arena framework (Chiang et al., ICML 2024). Annotators compare pairs of LLM-generated responses and rank sets of responses using best-worst scaling methodology.

bws, pairwise

NLP

Constitutional AI Harmlessness Evaluation

intermediate

Evaluate AI assistant responses for harmlessness and helpfulness based on the Constitutional AI framework by Anthropic. Annotators rate responses on a harmfulness scale, assess helpfulness, and provide explanations for their judgments.

radio, likert, text

AI Safety

Surveys (51)

ESA: Error Span Annotation for Machine Translation

advanced

Error span annotation for machine translation output. Annotators identify error spans in translations, classify error types (accuracy, fluency, terminology, style), and rate severity.

span, radio, likert

NLP

LongEval: Faithfulness Evaluation for Long-form Summarization

advanced

Faithfulness evaluation of long-form summaries. Annotators identify atomic content units in summaries, check each against source documents for faithfulness, and rate overall summary quality.

span, radio, likert

NLP

News Headline Emotion Roles (GoodNewsEveryone)

advanced

Annotate emotions in news headlines with semantic roles. Based on Bostan et al., LREC 2020. Identify emotion, experiencer, cause, target, and textual cue.

likert, radio, span

NLP

NLI with Explanations (e-SNLI)

intermediate

Natural language inference with human explanations. Based on e-SNLI (Camburu et al., NeurIPS 2018). Classify entailment/contradiction/neutral and provide natural language justifications.

likert, radio, span

NLP

RT-2 - Robotic Action Annotation

advanced

Robotic manipulation task evaluation and action segmentation based on RT-2 (Brohan et al., CoRL 2023). Annotators evaluate task success, describe actions, rate execution quality, and segment video into action phases.

radio, text, likert, +1

Robotics

Scientific Claim Verification (SciFact)

advanced

Verify scientific claims against evidence from research abstracts. Based on SciFact (Wadden et al., EMNLP 2020). Classify claims as supported, refuted, or having insufficient evidence, and identify rationale sentences.

likert, radio, span

NLP

AnnoMI Counselling Dialogue Annotation

advanced

Annotation of motivational interviewing counselling dialogues based on the AnnoMI dataset. Annotators label therapist and client utterances for MI techniques (open questions, reflections, affirmations) and client change talk (sustain talk, change talk), with quality ratings for therapeutic interactions.

radio, multiselect, likert

NLP

Argument Reasoning Comprehension (ARCT)

advanced

Identify implicit warrants in arguments. Based on Habernal et al., NAACL 2018 / SemEval 2018 Task 12. Given a claim and premise, choose the correct warrant that connects them.

likert, radio

NLP

Evaluation Tasks (30)

Code Review Annotation (CodeReviewer)

advanced

Annotation of code review activities based on the CodeReviewer benchmark. Annotators identify issues in code diffs, classify defect types, assign severity levels, make review decisions, and provide natural language review comments, supporting research in automated code review and software engineering.

span, radio, text

Software Engineering

EA-MT - Entity-Aware Machine Translation

advanced

Entity-aware machine translation evaluation requiring annotators to identify entity spans, classify translation errors, and provide corrected translations. Based on SemEval-2025 Task 2.

span, radio, text

SemEval

FAVA: Fine-grained Hallucination Annotations for Faithful Generation

advanced

Fine-grained hallucination span annotation. Annotators identify hallucinated spans in LLM output and classify hallucination types (entity error, relation error, contradicted, invented, subjective, unverifiable). Based on the FAVA framework for fine-grained faithfulness evaluation.

span, radio

NLP

MathDial - Tutoring Dialogue Quality Annotation

intermediate

Annotate math tutoring dialogues for guidance correctness, tutoring strategies, and key concepts, based on the MathDial dataset (Macina et al., Findings of EMNLP 2023). Supports evaluation of AI-generated tutoring interactions for K-12 math problems.

radio, multiselect, span

NLP

#HashtagWars - Learning a Sense of Humor

beginner

Humor ranking of tweets submitted to Comedy Central's @midnight #HashtagWars, classifying comedic quality. Based on SemEval-2017 Task 6.

radio

SemEval

ArgSciChat Scientific Argumentation Dialogue

intermediate

Annotation of argumentative dialogues about scientific papers based on the ArgSciChat dataset. Annotators label dialogue turns for argument components (claim, evidence, rebuttal) and assess argument quality dimensions such as clarity, relevance, and persuasiveness.

multiselect, radio

NLP

Argument Quality Assessment

intermediate

Multi-dimensional argument quality annotation based on the Wachsmuth et al. (2017) taxonomy. Rates arguments on three dimensions: Cogency (logical validity), Effectiveness (persuasive power), and Reasonableness (contribution to resolution). Used in Dagstuhl-ArgQuality and GAQCorpus datasets.

multiselect, radio

NLP

Bias Benchmark for QA (BBQ)

intermediate

Annotate question-answering examples designed to probe social biases. Based on BBQ (Parrish et al., Findings of ACL 2022). Annotators select the correct answer given a context, assess the direction of bias in the question, categorize the type of bias, and explain their reasoning.

radio, text

NLP

Have a design to share?

Contribute your annotation configurations to help the community.

Submit Your Design | View on GitHub