Annotation Showcase

Browse 100+ ready-to-use annotation configurations. Download configs and start annotating immediately.

📝 Text Annotation (48)

📝

Adverse Drug Event Extraction (CADEC)

intermediate

Named entity recognition for adverse drug events from patient-reported experiences, based on the CADEC corpus (Karimi et al., 2015). Annotates drugs, adverse effects, symptoms, diseases, and findings from colloquial health forum posts with mapping to medical vocabularies (SNOMED-CT, MedDRA).

span, radio
Biomedical NLP
📝

Argument Reasoning Comprehension (ARCT)

advanced

Identify implicit warrants in arguments. Based on Habernal et al., NAACL 2018 / SemEval 2018 Task 12. Given a claim and premise, choose the correct warrant that connects them.

radio
NLP
📝

BigEarthNet Remote Sensing Classification

intermediate

Multi-label land cover classification from Sentinel-2 imagery (Sumbul et al., IGARSS 2019). Classify satellite patches into 43 land cover classes following the CORINE taxonomy.

multiselect
Remote Sensing
📝

Biomedical Named Entity Recognition (JNLPBA)

advanced

Named entity recognition for biomedical text based on the JNLPBA shared task. Annotate entities including proteins, DNA, RNA, cell lines, and cell types following BioNLP community standards.

span
Biomedical
📝

Chemical-Disease Relation Extraction (BC5CDR)

advanced

Extract chemical-disease relations from biomedical literature. Based on BioCreative V CDR task. Identify chemical and disease entities, then annotate causal relationships between them (chemical induces disease).

span, radio
Biomedical
📝

CheXpert Chest X-Ray Classification

advanced

Multi-label classification of chest radiographs for 14 observations (Irvin et al., AAAI 2019). Annotate chest X-rays with pathology labels including uncertainty handling for clinical findings.

radio, multiselect
Medical Imaging
📝

Claim Perspectives (Perspectrum)

intermediate

Annotate diverse perspectives on claims with stance and evidence. Based on Chen et al., NAACL 2019. Identify supporting and opposing perspectives for controversial claims.

radio
NLP
📝

Commonsense Inference (ATOMIC 2020)

intermediate

Annotate commonsense inferences about events, mental states, and social interactions. Based on ATOMIC 2020 (Hwang et al., AAAI 2021). Generate if-then knowledge about causes, effects, intents, and reactions.

radio, text
NLP
📝

Commonsense QA Explanation (ECQA)

intermediate

Annotate explanations for commonsense QA with positive and negative properties. Based on ECQA (Aggarwal et al., ACL 2021). Explain why an answer is correct and why others are wrong.

radio, text
NLP
📝

Complex Named Entity Recognition (MultiCoNER)

advanced

Recognize complex and emerging named entities. Based on SemEval 2022/2023 MultiCoNER. Identify creative works, products, groups, and other challenging entity types.

span
NLP
📝

Connotation Frames of Power and Agency

intermediate

Annotate verbs for implied power and agency of participants. Based on Sap et al., EMNLP 2017. Capture how actions implicitly convey power dynamics and agency levels.

radio
NLP
📝

Coreference Resolution (OntoNotes)

advanced

Link pronouns and noun phrases to the entities they refer to in text. Based on the OntoNotes coreference annotation guidelines and CoNLL shared tasks. Identify mention spans and cluster coreferent mentions together.

span, radio
NLP
📝

Deceptive Review Detection

intermediate

Distinguish between truthful and deceptive (fake) reviews. Based on Ott et al., ACL 2011. Separate fake reviews written to deceive from genuine customer experiences.

radio, multiselect
NLP
📝

Dialogue Act Labeling

intermediate

Classify utterances in conversations by their communicative function (question, statement, request, etc.).

radio
NLP
📝

Dialogue Relation Extraction (DialogRE)

advanced

Extract relations between entities in dialogue. Based on Yu et al., ACL 2020. Identify 36 relation types between speakers and entities mentioned in conversations.

span, radio
NLP
📝

DocBank Document Layout Detection

intermediate

Document layout analysis benchmark (Li et al., COLING 2020). Detect and classify document elements including titles, abstracts, paragraphs, figures, tables, and captions.

bbox, multiselect
Document AI
📝

Dynamic Hate Speech Detection

intermediate

Hate speech classification with fine-grained type labels based on the Dynamically Generated Hate Speech Dataset (Vidgen et al., ACL 2021). Classify content as hateful or not, then identify hate type (animosity, derogation, dehumanization, threatening, support for hateful entities) and target group.

radio, multiselect
NLP
📝

Emotion Cause Extraction (RECCON)

advanced

Extract emotion causes from conversational text based on RECCON (Poria et al., EMNLP 2020). Identify which utterances and specific spans caused an emotion expressed in dialogue.

radio, span
NLP
📝

Event Argument Extraction (MAVEN-Arg)

advanced

Document-level event argument extraction based on MAVEN-Arg (Wang et al., ACL 2024). Annotates event triggers with their argument roles including Agent, Patient, Location, Time, Instrument, and more. Supports both entity and non-entity arguments across document context.

span, radio
NLP
📝

Fact Verification

intermediate

Classify claims as supported, refuted, or lacking sufficient information, based on the provided evidence.

radio, text
NLP
📝

Fine-Grained Propaganda Detection

advanced

Span-level annotation of propaganda techniques in news articles based on SemEval 2020 Task 11 (Da San Martino et al., EMNLP 2019). Identifies 14 techniques including Loaded Language, Name Calling, Appeal to Fear, and more across rhetorical dimensions (ethos, logos, pathos).

span
NLP
📝

GoEmotions - Fine-Grained Emotion Classification

intermediate

Multi-label emotion classification with 27 emotion categories plus neutral, based on the Google Research GoEmotions dataset (Demszky et al., ACL 2020). Taxonomy covers 12 positive, 11 negative, and 4 ambiguous emotions designed for Reddit comment analysis.

multiselect, radio
NLP
📝

Hate Speech Detection

intermediate

Identify and categorize hate speech, offensive language, and toxic content in text.

radio, multiselect
NLP
📝

HateXplain - Explainable Hate Speech Detection

advanced

Multi-task hate speech annotation with classification (hate/offensive/normal), target community identification, and rationale span highlighting. Based on the HateXplain benchmark (Mathew et al., AAAI 2021) - the first dataset covering classification, target identification, and rationale extraction.

radio, multiselect, span
NLP
📝

Implicit Hate Speech Detection

advanced

Detect and categorize implicit hate speech using a six-category taxonomy. Based on ElSherief et al., EMNLP 2021. Identifies grievance, incitement, stereotypes, inferiority, irony, and threats.

radio
NLP
📝

Intent Classification

beginner

Classify user utterances into intents for chatbot and virtual assistant training.

radio, multiselect
NLP
📝

MIMIC-CXR Chest Radiograph Classification

advanced

Large-scale chest radiograph classification based on MIMIC-CXR (Johnson et al., Scientific Data 2019). Multi-label classification with 14 observations derived from radiology reports.

radio, multiselect
Medical Imaging
📝

Moral Stories Annotation

intermediate

Annotate moral reasoning in situated narratives. Based on Emelin et al., EMNLP 2021. Evaluate whether actions adhere to or diverge from social norms given situations and intentions.

radio
NLP
📝

Named Entity Recognition

intermediate

Span-based entity labeling for identifying people, organizations, locations, and more.

span
NLP
📝

News Headline Emotion Roles (GoodNewsEveryone)

advanced

Annotate emotions in news headlines with semantic roles. Based on Bostan et al., LREC 2020. Identify emotion, experiencer, cause, target, and textual cue.

radio, span
NLP
📝

NLI with Explanations (e-SNLI)

intermediate

Natural language inference with human explanations. Based on e-SNLI (Camburu et al., NeurIPS 2018). Classify entailment/contradiction/neutral and provide natural language justifications.

radio, text
NLP
📝

Political Discourse Analysis (AgoraSpeech)

intermediate

Multi-task annotation of political speeches covering sentiment, polarization, populism, topic identification, and named entities. Based on AgoraSpeech (Sermpezis et al., 2025), featuring human-validated labels for comprehensive political discourse analysis.

radio, multiselect, span
Political Science
📝

Question Answering

intermediate

Annotate answer spans in text passages for reading comprehension tasks.

span, radio
NLP
📝

Rationale Annotation (ERASER)

intermediate

Annotate rationales (evidence spans) that justify classification decisions. Based on the ERASER benchmark (DeYoung et al., ACL 2020). Identify which parts of the text are necessary and sufficient for making a prediction.

radio, span
NLP
📝

Reading Comprehension QA

intermediate

Evaluate question-answer pairs for reading comprehension by verifying answers and rating quality.

radio, span, text
NLP
📝

Relation Extraction

advanced

Identify and classify relationships between entities in text (e.g., works-for, located-in, married-to).

radio, span
NLP
📝

Rumor Stance Detection (PHEME)

intermediate

Classify stance toward rumors in social media threads. Based on PHEME (Zubiaga et al.). Label replies as supporting, denying, querying, or commenting on rumorous claims.

radio
NLP
📝

RuSentiment - Social Media Sentiment

beginner

5-class sentiment annotation for social media posts based on RuSentiment (Rogers et al., COLING 2018). Includes Positive, Negative, Neutral, Speech Act (greetings/thanks), and Skip categories. Annotators achieved a Fleiss' kappa of 0.654 at a speed of 250-350 posts per hour.

radio
NLP
📝

Sarcasm Detection

intermediate

Identify sarcastic statements and label their type and target in social media and conversational text.

radio, text
NLP
📝

Scientific Claim Verification (SciFact)

advanced

Verify scientific claims against evidence from research abstracts. Based on SciFact (Wadden et al., EMNLP 2020). Classify claims as supported, refuted, or having insufficient evidence, and identify rationale sentences.

radio, span
NLP
📝

Semantic Similarity

beginner

Rate the semantic similarity between pairs of sentences on a continuous scale.

slider, likert
NLP
📝

Sentiment Analysis

beginner

Simple 3-way sentiment classification with radio buttons. Perfect for social media analysis, product reviews, and customer feedback.

radio
NLP
📝

Social Bias Frames (SBIC)

advanced

Annotate social media posts for bias using structured frames. Based on Sap et al., ACL 2020. Identify offensiveness, intent, implied stereotypes, and targeted groups.

radio, multiselect, text
NLP
📝

Social Determinants of Health (SDOH) Extraction

advanced

Event-based extraction of social determinants of health from clinical notes based on the n2c2 2022 Track 2 shared task and SHAC corpus. Annotates substance use (alcohol, drug, tobacco), employment, and living status with temporal and status attributes.

span, radio, multiselect
Clinical NLP
📝

Stance Detection (VAST)

intermediate

Detect the stance of a text toward a given topic. Based on VAST (Allaway & McKeown, EMNLP 2020) for zero-shot stance detection. Classify text as expressing favor, opposition, or neutrality toward various topics.

radio
NLP
📝

SWBD-DAMSL Dialogue Acts

advanced

Dialogue act annotation following the Switchboard DAMSL tagset (Jurafsky et al., 1997). Covers statements, questions, backchannels, agreements, and more for conversational speech analysis. Annotators achieved an inter-rater kappa of 0.80 across 1,155 conversations.

radio, multiselect
NLP
📝

Temporal Relation Annotation (TempEval-3)

advanced

Annotate temporal relations between events and time expressions following TimeML guidelines. Based on TempEval-3 shared task. Label relations as BEFORE, AFTER, SIMULTANEOUS, or VAGUE to capture how events relate in time.

span, radio
NLP
📝

Toxic Spans Detection

intermediate

Character-level toxic span annotation based on SemEval-2021 Task 5 (Pavlopoulos et al., 2021). Instead of binary toxicity classification, annotators identify the specific words/phrases that make a comment toxic, enabling more nuanced content moderation.

span, radio
NLP

🎧 Audio Annotation (16)

🎧

Acoustic Scene Classification

beginner

Classify audio recordings by acoustic environment following the TUT/DCASE dataset format.

radio, likert
Audio
🎧

Audio Transcription Review

intermediate

Review and correct automatic speech recognition transcriptions with waveform visualization.

text, audio
Audio
🎧

Audio-Visual Sentiment Analysis

intermediate

Rate sentiment in speech segments following CMU-MOSI and CMU-MOSEI multimodal annotation protocols.

likert, radio
Audio
🎧

AudioSet Event Classification

intermediate

Multi-label audio event tagging following the AudioSet ontology for weak supervision.

multiselect
Audio
🎧

Continuous Emotion Rating

intermediate

Rate emotional dimensions (valence, arousal, dominance) continuously following MSP-IMPROV protocol.

slider, likert
Audio
🎧

Environmental Sound Classification

beginner

Classify environmental sounds into categories following UrbanSound8K and ESC-50 datasets.

radio
Audio
🎧

Keyword Spotting

beginner

Classify spoken commands and keywords following the Google Speech Commands dataset format.

radio
Audio
🎧

Music Genre Classification

beginner

Classify music clips into genres and subgenres with mood and instrumentation tags.

radio, multiselect
Music
🎧

Music Tagging

intermediate

Multi-label music tagging following MagnaTagATune dataset format for instrument and genre annotation.

multiselect, likert
Audio
🎧

Respiratory Sound Classification

advanced

Classify lung and respiratory sounds for medical diagnosis following ICBHI 2017 Challenge format.

radio, multiselect
Audio
🎧

Sound Event Detection

advanced

Temporal sound event annotation with strong labels following DCASE Challenge protocols.

span, multiselect
Audio
🎧

Speaker Diarization

intermediate

Identify and label different speakers in audio recordings with timestamp-based segment annotation.

span, radio
Speech
🎧

Speech Emotion Recognition

intermediate

Classify emotional states from speech audio including happiness, sadness, anger, fear, and more.

radio, likert
Speech
🎧

Speech Emotion Recognition

intermediate

Classify emotional content in speech following IEMOCAP and CREMA-D annotation schemes.

radio, likert
Audio
🎧

Speech Intelligibility Rating

intermediate

Rate speech intelligibility for pathological speech following TORGO database annotation protocols.

likert, radio, text
Audio
🎧

Speech Quality MOS Rating

beginner

Rate speech quality using Mean Opinion Score following ITU-T P.800 and Blizzard Challenge protocols.

likert, radio
Audio

🖼️ Image Annotation (23)

🖼️

ADE20K Semantic Segmentation

advanced

Comprehensive scene parsing with 150 semantic categories (Zhou et al., CVPR 2017). Annotate indoor and outdoor scenes with pixel-level labels covering objects, parts, and stuff classes.

polygon, multiselect
Computer Vision
🖼️

BDD100K Autonomous Driving Segmentation

advanced

Large-scale diverse driving video dataset (Yu et al., CVPR 2020). Annotate driving scenes with bounding boxes, lane markings, drivable areas, and full-frame instance segmentation.

bbox, polygon, multiselect
Computer Vision
🖼️

CelebA Face Attributes Classification

intermediate

Large-scale face attributes dataset with 40 binary attributes (Liu et al., ICCV 2015). Annotate celebrity face images with attributes including hair color, age, gender, and facial features.

multiselect, bbox
Computer Vision
🖼️

Cityscapes Instance Segmentation

advanced

Urban scene understanding with instance-level semantic labeling (Cordts et al., CVPR 2016). Annotate street scenes with pixel-level labels for 30 classes across vehicles, humans, construction, and nature.

polygon, multiselect
Computer Vision
🖼️

CUB-200-2011 Fine-Grained Bird Classification

advanced

Fine-grained visual categorization of 200 bird species (Wah et al., 2011). Annotate bird images with species labels, part locations, and attribute annotations.

radio, multiselect, bbox
Computer Vision
🖼️

DeepFashion Fine-Grained Fashion Classification

intermediate

Large-scale fashion dataset with attribute prediction, consumer-to-shop matching, and landmark detection (Liu et al., CVPR 2016). Annotate clothing with 50 categories and 1,000 attributes.

radio, multiselect, bbox
Computer Vision
🖼️

DOTA Aerial Image Object Detection

advanced

Oriented bounding box detection in aerial images (Xia et al., CVPR 2018). Detect 15 object categories with arbitrary orientations including planes, ships, vehicles, and sports facilities.

bbox, multiselect
Remote Sensing
🖼️

Image Captioning Evaluation

beginner

Rate AI-generated image captions for accuracy, fluency, and detail.

likert, image
Computer Vision
🖼️

Image Classification

beginner

Multi-class image classification with thumbnail preview and zoom controls.

radio, image
Computer Vision
🖼️

Image Segmentation

advanced

Draw polygon masks around objects for semantic segmentation tasks.

polygon, image
Computer Vision
🖼️

ImageNet Image Classification

intermediate

Large-scale image classification following the ImageNet dataset (Deng et al., CVPR 2009). Classify images into 1000+ synsets organized according to the WordNet hierarchy.

radio, multiselect
Computer Vision
🖼️

iWildCam Wildlife Detection & Classification

intermediate

Camera trap image classification for wildlife monitoring (Beery et al., CVPR 2019). Classify wildlife species from camera trap images across diverse ecosystems worldwide.

radio, multiselect, bbox
Computer Vision
🖼️

KITTI Road Object Detection

advanced

Autonomous driving benchmark for object detection (Geiger et al., CVPR 2012). Annotate vehicles, pedestrians, and cyclists with 3D bounding boxes and occlusion/truncation labels.

bbox, radio
Computer Vision
🖼️

LIP Human Parsing

advanced

Pixel-level human body part segmentation (Gong et al., CVPR 2017). Parse human images into 20 semantic parts including hair, face, arms, legs, and clothing items.

polygon, multiselect
Computer Vision
🖼️

MS COCO Object Detection & Segmentation

advanced

Object detection and instance segmentation annotation following the MS COCO format (Lin et al., ECCV 2014). Annotate objects with bounding boxes and polygon segmentation masks across 80 common object categories.

bbox, polygon
Computer Vision
🖼️

MVTec AD Industrial Defect Detection

intermediate

Anomaly detection and localization in industrial images (Bergmann et al., CVPR 2019). Detect defects across 15 object and texture categories including metal nuts, transistors, and leather.

radio, bbox, polygon
Computer Vision
🖼️

Object Detection

intermediate

Draw bounding boxes around objects for object detection model training.

bbox, image
Computer Vision
🖼️

Open Images V6 Object Detection

advanced

Large-scale object detection following Open Images V6 (Kuznetsova et al., IJCV 2020). Annotate 600 object classes with bounding boxes, visual relationships, and instance segmentation masks.

bbox, multiselect, radio
Computer Vision
🖼️

PASCAL VOC Object Detection

intermediate

Bounding box object detection following the PASCAL Visual Object Classes challenge (Everingham et al., IJCV 2010). Annotate 20 object categories with axis-aligned bounding boxes.

bbox, radio
Computer Vision
🖼️

Places365 Scene Classification

intermediate

Scene recognition and classification following the Places365 dataset (Zhou et al., TPAMI 2017). Classify images into 365 scene categories spanning indoor, outdoor, and natural environments.

radio, multiselect
Computer Vision
🖼️

Visual Question Answering

beginner

Answer questions about images for VQA dataset creation.

text, radio, image
Computer Vision
🖼️

WikiArt Artwork Classification

intermediate

Art classification by style, genre, and artist (Saleh & Elgammal, 2015). Classify paintings into artistic movements, genres, and predict artist attribution.

radio, multiselect
Computer Vision
🖼️

xView Satellite Object Detection

advanced

Large-scale overhead imagery object detection (Lam et al., arXiv 2018). Detect 60 object classes including vehicles, buildings, and infrastructure from satellite images.

bbox, multiselect
Remote Sensing

📊 Surveys (9)

📊

Argument Quality Assessment

intermediate

Multi-dimensional argument quality annotation based on the Wachsmuth et al. (2017) taxonomy. Rates arguments on three dimensions: Cogency (logical validity), Effectiveness (persuasive power), and Reasonableness (contribution to resolution). Used in Dagstuhl-ArgQuality and GAQCorpus datasets.

likert, radio
NLP
📊

Emotion Detection (SemEval-2018 Task 1)

intermediate

Multi-label emotion classification with intensity ratings based on SemEval-2018 Task 1. Annotate text for emotions (anger, anticipation, disgust, fear, joy, love, optimism, pessimism, sadness, surprise, trust) with intensity scales.

multiselect, likert
NLP
📊

Empathetic Dialogue Annotation

intermediate

Annotate emotional situations and empathetic responses in conversations. Based on EmpatheticDialogues (Rashkin et al., ACL 2019). Classify the emotional context and evaluate response empathy.

radio, likert
NLP
📊

Likert Scale Survey

beginner

Multi-question survey using Likert scales to measure agreement, satisfaction, or frequency.

likert, multirate
Research
📊

Machine Translation Evaluation

intermediate

Evaluate machine translation quality with adequacy and fluency ratings.

likert, multiselect
NLP
📊

Social Chemistry 101 (Social Norms)

advanced

Annotate rules-of-thumb for social and moral norms. Based on Forbes et al., EMNLP 2020. Capture 12 dimensions of social judgment including cultural pressure, moral foundations, and legality.

radio, likert
NLP
📊

Survey Feedback

beginner

Multi-question survey with Likert scales, text fields, and multiple choice.

likert, text, radio
Survey
📊

Text Summarization Evaluation

intermediate

Rate the quality of AI-generated summaries on fluency, coherence, and faithfulness.

likert, text
NLP
📊

Toxicity Detection

intermediate

Multi-label toxicity classification with severity ratings for content moderation.

multiselect, likert
NLP

Have a design to share?

Contribute your annotation configurations to help the community.