Annotation Showcase
Browse 100+ ready-to-use annotation configurations. Download configs and start annotating immediately.
📝 Text Annotation (48)
Adverse Drug Event Extraction (CADEC)
Intermediate: Named entity recognition for adverse drug events from patient-reported experiences, based on the CADEC corpus (Karimi et al., 2015). Annotates drugs, adverse effects, symptoms, diseases, and findings from colloquial health forum posts with mapping to medical vocabularies (SNOMED-CT, MedDRA).
Argument Reasoning Comprehension (ARCT)
Advanced: Identify implicit warrants in arguments. Based on Habernal et al., NAACL 2018 / SemEval 2018 Task 12. Given a claim and premise, choose the correct warrant that connects them.
BigEarthNet Remote Sensing Classification
Intermediate: Multi-label land cover classification from Sentinel-2 imagery (Sumbul et al., IGARSS 2019). Classify satellite patches into 43 land cover classes following the CORINE taxonomy.
Biomedical Named Entity Recognition (JNLPBA)
Advanced: Named entity recognition for biomedical text based on the JNLPBA shared task. Annotate entities including proteins, DNA, RNA, cell lines, and cell types following BioNLP community standards.
Chemical-Disease Relation Extraction (BC5CDR)
Advanced: Extract chemical-disease relations from biomedical literature. Based on BioCreative V CDR task. Identify chemical and disease entities, then annotate causal relationships between them (chemical induces disease).
CheXpert Chest X-Ray Classification
Advanced: Multi-label classification of chest radiographs for 14 observations (Irvin et al., AAAI 2019). Annotate chest X-rays with pathology labels including uncertainty handling for clinical findings.
Claim Perspectives (Perspectrum)
Intermediate: Annotate diverse perspectives on claims with stance and evidence. Based on Chen et al., NAACL 2019. Identify supporting and opposing perspectives for controversial claims.
Commonsense Inference (ATOMIC 2020)
Intermediate: Annotate commonsense inferences about events, mental states, and social interactions. Based on ATOMIC 2020 (Hwang et al., AAAI 2021). Generate if-then knowledge about causes, effects, intents, and reactions.
Commonsense QA Explanation (ECQA)
Intermediate: Annotate explanations for commonsense QA with positive and negative properties. Based on ECQA (Aggarwal et al., ACL 2021). Explain why an answer is correct and why others are wrong.
Complex Named Entity Recognition (MultiCoNER)
Advanced: Recognize complex and emerging named entities. Based on SemEval 2022/2023 MultiCoNER. Identify creative works, products, groups, and other challenging entity types.
Connotation Frames of Power and Agency
Intermediate: Annotate verbs for implied power and agency of participants. Based on Sap et al., EMNLP 2017. Capture how actions implicitly convey power dynamics and agency levels.
Coreference Resolution (OntoNotes)
Advanced: Link pronouns and noun phrases to the entities they refer to in text. Based on the OntoNotes coreference annotation guidelines and CoNLL shared tasks. Identify mention spans and cluster coreferent mentions together.
Deceptive Review Detection
Intermediate: Distinguish between truthful and deceptive (fake) reviews. Based on Ott et al., ACL 2011. Identify fake reviews written to deceive vs genuine customer experiences.
Dialogue Act Labeling
Intermediate: Classify utterances in conversations by their communicative function (question, statement, request, etc.).
Dialogue Relation Extraction (DialogRE)
Advanced: Extract relations between entities in dialogue. Based on Yu et al., ACL 2020. Identify 36 relation types between speakers and entities mentioned in conversations.
DocBank Document Layout Detection
Intermediate: Document layout analysis benchmark (Li et al., COLING 2020). Detect and classify document elements including titles, abstracts, paragraphs, figures, tables, and captions.
Dynamic Hate Speech Detection
Intermediate: Hate speech classification with fine-grained type labels based on the Dynamically Generated Hate Speech Dataset (Vidgen et al., ACL 2021). Classify content as hateful or not, then identify hate type (animosity, derogation, dehumanization, threatening, support for hateful entities) and target group.
Emotion Cause Extraction (RECCON)
Advanced: Extract emotion causes from conversational text based on RECCON (Poria et al., 2021). Identify which utterances and specific spans caused an emotion expressed in dialogue.
Event Argument Extraction (MAVEN-Arg)
Advanced: Document-level event argument extraction based on MAVEN-Arg (Wang et al., ACL 2024). Annotates event triggers with their argument roles including Agent, Patient, Location, Time, Instrument, and more. Supports both entity and non-entity arguments across document context.
Fact Verification
Intermediate: Verify claims as supported, refuted, or not enough information based on provided evidence.
Fine-Grained Propaganda Detection
Advanced: Span-level annotation of propaganda techniques in news articles based on SemEval 2020 Task 11 (Da San Martino et al., EMNLP 2019). Identifies 14 techniques including Loaded Language, Name Calling, Appeal to Fear, and more across rhetorical dimensions (ethos, logos, pathos).
GoEmotions - Fine-Grained Emotion Classification
Intermediate: Multi-label emotion classification with 27 emotion categories plus neutral, based on the Google Research GoEmotions dataset (Demszky et al., ACL 2020). Taxonomy covers 12 positive, 11 negative, and 4 ambiguous emotions designed for Reddit comment analysis.
Hate Speech Detection
Intermediate: Identify and categorize hate speech, offensive language, and toxic content in text.
HateXplain - Explainable Hate Speech Detection
Advanced: Multi-task hate speech annotation with classification (hate/offensive/normal), target community identification, and rationale span highlighting. Based on the HateXplain benchmark (Mathew et al., AAAI 2021) - the first dataset covering classification, target identification, and rationale extraction.
Implicit Hate Speech Detection
Advanced: Detect and categorize implicit hate speech using a six-category taxonomy. Based on ElSherief et al., EMNLP 2021. Identifies grievance, incitement, stereotypes, inferiority, irony, and threats.
Intent Classification
Beginner: Classify user utterances into intents for chatbot and virtual assistant training.
MIMIC-CXR Chest Radiograph Classification
Advanced: Large-scale chest radiograph classification based on MIMIC-CXR (Johnson et al., Scientific Data 2019). Multi-label classification with 14 observations derived from radiology reports.
Moral Stories Annotation
Intermediate: Annotate moral reasoning in situated narratives. Based on Emelin et al., EMNLP 2021. Evaluate whether actions adhere to or diverge from social norms given situations and intentions.
Named Entity Recognition
Intermediate: Span-based entity labeling for identifying people, organizations, locations, and more.
News Headline Emotion Roles (GoodNewsEveryone)
Advanced: Annotate emotions in news headlines with semantic roles. Based on Bostan et al., LREC 2020. Identify emotion, experiencer, cause, target, and textual cue.
NLI with Explanations (e-SNLI)
Intermediate: Natural language inference with human explanations. Based on e-SNLI (Camburu et al., NeurIPS 2018). Classify entailment/contradiction/neutral and provide natural language justifications.
Political Discourse Analysis (AgoraSpeech)
Intermediate: Multi-task annotation of political speeches covering sentiment, polarization, populism, topic identification, and named entities. Based on AgoraSpeech (Sermpezis et al., 2025), featuring human-validated labels for comprehensive political discourse analysis.
Question Answering
Intermediate: Annotate answer spans in text passages for reading comprehension tasks.
Rationale Annotation (ERASER)
Intermediate: Annotate rationales (evidence spans) that justify classification decisions. Based on the ERASER benchmark (DeYoung et al., ACL 2020). Identify which parts of the text are necessary and sufficient for making a prediction.
Reading Comprehension QA
Intermediate: Evaluate question-answer pairs for reading comprehension by verifying answers and rating quality.
Relation Extraction
Advanced: Identify and classify relationships between entities in text (e.g., works-for, located-in, married-to).
Rumor Stance Detection (PHEME)
Intermediate: Classify stance toward rumors in social media threads. Based on PHEME (Zubiaga et al.). Label replies as supporting, denying, querying, or commenting on rumorous claims.
RuSentiment - Social Media Sentiment
Beginner: 5-class sentiment annotation for social media posts based on RuSentiment (Rogers et al., COLING 2018). Includes Positive, Negative, Neutral, Speech Act (greetings/thanks), and Skip categories. Achieved 0.654 Fleiss kappa with 250-350 posts/hour annotation speed.
Sarcasm Detection
Intermediate: Identify sarcastic statements and label their type and target in social media and conversational text.
Scientific Claim Verification (SciFact)
Advanced: Verify scientific claims against evidence from research abstracts. Based on SciFact (Wadden et al., EMNLP 2020). Classify claims as supported, refuted, or having insufficient evidence, and identify rationale sentences.
Semantic Similarity
Beginner: Rate the semantic similarity between pairs of sentences on a continuous scale.
Sentiment Analysis
Beginner: Simple 3-way sentiment classification with radio buttons. Perfect for social media analysis, product reviews, and customer feedback.
Social Bias Frames (SBIC)
Advanced: Annotate social media posts for bias using structured frames. Based on Sap et al., ACL 2020. Identify offensiveness, intent, implied stereotypes, and targeted groups.
Social Determinants of Health (SDOH) Extraction
Advanced: Event-based extraction of social determinants of health from clinical notes based on the n2c2 2022 Track 2 shared task and SHAC corpus. Annotates substance use (alcohol, drug, tobacco), employment, and living status with temporal and status attributes.
Stance Detection (VAST)
Intermediate: Detect the stance of a text toward a given topic. Based on VAST (Allaway & McKeown, EMNLP 2020) for zero-shot stance detection. Classify text as expressing favor, opposition, or neutrality toward various topics.
SWBD-DAMSL Dialogue Acts
Advanced: Dialogue act annotation following the Switchboard DAMSL tagset (Jurafsky et al., 1997). Covers statements, questions, backchannels, agreements, and more for conversational speech analysis. Achieved 0.80 inter-rater Kappa on 1,155 conversations.
Temporal Relation Annotation (TempEval-3)
Advanced: Annotate temporal relations between events and time expressions following TimeML guidelines. Based on TempEval-3 shared task. Label relations as BEFORE, AFTER, SIMULTANEOUS, or VAGUE to capture how events relate in time.
Toxic Spans Detection
Intermediate: Character-level toxic span annotation based on SemEval-2021 Task 5 (Pavlopoulos et al., 2021). Instead of binary toxicity classification, annotators identify the specific words/phrases that make a comment toxic, enabling more nuanced content moderation.
🎧 Audio Annotation (16)
Acoustic Scene Classification
Beginner: Classify audio recordings by acoustic environment following the TUT/DCASE dataset format.
Audio Transcription Review
Intermediate: Review and correct automatic speech recognition transcriptions with waveform visualization.
Audio-Visual Sentiment Analysis
Intermediate: Rate sentiment in speech segments following CMU-MOSI and CMU-MOSEI multimodal annotation protocols.
AudioSet Event Classification
Intermediate: Multi-label audio event tagging following the AudioSet ontology for weak supervision.
Continuous Emotion Rating
Intermediate: Rate emotional dimensions (valence, arousal, dominance) continuously following MSP-IMPROV protocol.
Environmental Sound Classification
Beginner: Classify environmental sounds into categories following UrbanSound8K and ESC-50 datasets.
Keyword Spotting
Beginner: Classify spoken commands and keywords following the Google Speech Commands dataset format.
Music Genre Classification
Beginner: Classify music clips into genres and subgenres with mood and instrumentation tags.
Music Tagging
Intermediate: Multi-label music tagging following MagnaTagATune dataset format for instrument and genre annotation.
Respiratory Sound Classification
Advanced: Classify lung and respiratory sounds for medical diagnosis following ICBHI 2017 Challenge format.
Sound Event Detection
Advanced: Temporal sound event annotation with strong labels following DCASE Challenge protocols.
Speaker Diarization
Intermediate: Identify and label different speakers in audio recordings with timestamp-based segment annotation.
Speech Emotion Recognition
Intermediate: Classify emotional states from speech audio including happiness, sadness, anger, fear, and more.
Speech Emotion Recognition
Intermediate: Classify emotional content in speech following IEMOCAP and CREMA-D annotation schemes.
Speech Intelligibility Rating
Intermediate: Rate speech intelligibility for pathological speech following TORGO database annotation protocols.
Speech Quality MOS Rating
Beginner: Rate speech quality using Mean Opinion Score following ITU-T P.800 and Blizzard Challenge protocols.
🖼️ Image Annotation (23)
ADE20K Semantic Segmentation
Advanced: Comprehensive scene parsing with 150 semantic categories (Zhou et al., CVPR 2017). Annotate indoor and outdoor scenes with pixel-level labels covering objects, parts, and stuff classes.
BDD100K Autonomous Driving Segmentation
Advanced: Large-scale diverse driving video dataset (Yu et al., CVPR 2020). Annotate driving scenes with bounding boxes, lane markings, drivable areas, and full-frame instance segmentation.
CelebA Face Attributes Classification
Intermediate: Large-scale face attributes dataset with 40 binary attributes (Liu et al., ICCV 2015). Annotate celebrity face images with attributes including hair color, age, gender, and facial features.
Cityscapes Instance Segmentation
Advanced: Urban scene understanding with instance-level semantic labeling (Cordts et al., CVPR 2016). Annotate street scenes with pixel-level labels for 30 classes across vehicles, humans, construction, and nature.
CUB-200-2011 Fine-Grained Bird Classification
Advanced: Fine-grained visual categorization of 200 bird species (Wah et al., 2011). Annotate bird images with species labels, part locations, and attribute annotations.
DeepFashion Fine-Grained Fashion Classification
Intermediate: Large-scale fashion dataset with attribute prediction, consumer-to-shop matching, and landmark detection (Liu et al., CVPR 2016). Annotate clothing with 50 categories and 1,000 attributes.
DOTA Aerial Image Object Detection
Advanced: Oriented bounding box detection in aerial images (Xia et al., CVPR 2018). Detect 15 object categories with arbitrary orientations including planes, ships, vehicles, and sports facilities.
Image Captioning Evaluation
Beginner: Rate AI-generated image captions for accuracy, fluency, and detail.
Image Classification
Beginner: Multi-class image classification with thumbnail preview and zoom controls.
Image Segmentation
Advanced: Draw polygon masks around objects for semantic segmentation tasks.
ImageNet Image Classification
Intermediate: Large-scale image classification following the ImageNet dataset (Deng et al., CVPR 2009). Classify images into 1000+ synsets organized according to the WordNet hierarchy.
iWildCam Wildlife Detection & Classification
Intermediate: Camera trap image classification for wildlife monitoring (Beery et al., CVPR 2019). Classify wildlife species from camera trap images across diverse ecosystems worldwide.
KITTI Road Object Detection
Advanced: Autonomous driving benchmark for object detection (Geiger et al., CVPR 2012). Annotate vehicles, pedestrians, and cyclists with 3D bounding boxes and occlusion/truncation labels.
LIP Human Parsing
Advanced: Pixel-level human body part segmentation (Gong et al., CVPR 2017). Parse human images into 20 semantic parts including hair, face, arms, legs, and clothing items.
MS COCO Object Detection & Segmentation
Advanced: Object detection and instance segmentation annotation following the MS COCO format (Lin et al., ECCV 2014). Annotate objects with bounding boxes and polygon segmentation masks across 80 common object categories.
MVTec AD Industrial Defect Detection
Intermediate: Anomaly detection and localization in industrial images (Bergmann et al., CVPR 2019). Detect defects across 15 object and texture categories including metal nuts, transistors, and leather.
Object Detection
Intermediate: Draw bounding boxes around objects for object detection model training.
Open Images V6 Object Detection
Advanced: Large-scale object detection following Open Images V6 (Kuznetsova et al., IJCV 2020). Annotate 600 object classes with bounding boxes, visual relationships, and instance segmentation masks.
PASCAL VOC Object Detection
Intermediate: Bounding box object detection following the PASCAL Visual Object Classes challenge (Everingham et al., IJCV 2010). Annotate 20 object categories with axis-aligned bounding boxes.
Places365 Scene Classification
Intermediate: Scene recognition and classification following the Places365 dataset (Zhou et al., TPAMI 2017). Classify images into 365 scene categories spanning indoor, outdoor, and natural environments.
Visual Question Answering
Beginner: Answer questions about images for VQA dataset creation.
WikiArt Artwork Classification
Intermediate: Art classification by style, genre, and artist (Saleh & Elgammal, 2015). Classify paintings into artistic movements, genres, and predict artist attribution.
xView Satellite Object Detection
Advanced: Large-scale overhead imagery object detection (Lam et al., arXiv 2018). Detect 60 object classes including vehicles, buildings, and infrastructure from satellite images.
⚖️ Comparison Tasks (3)
Best-Worst Scaling
Beginner: MaxDiff annotation where annotators select the best and worst items from a set for relative comparison.
Pairwise Preference
Beginner: Compare two items and select the preferred one. Great for ranking and preference learning.
Ranking Task
Beginner: Drag-and-drop ranking interface to order items from best to worst.
📊 Surveys (9)
Argument Quality Assessment
Intermediate: Multi-dimensional argument quality annotation based on the Wachsmuth et al. (2017) taxonomy. Rates arguments on three dimensions: Cogency (logical validity), Effectiveness (persuasive power), and Reasonableness (contribution to resolution). Used in Dagstuhl-ArgQuality and GAQCorpus datasets.
Emotion Detection (SemEval-2018 Task 1)
Intermediate: Multi-label emotion classification with intensity ratings based on SemEval-2018 Task 1. Annotate text for emotions (anger, anticipation, disgust, fear, joy, love, optimism, pessimism, sadness, surprise, trust) with intensity scales.
Empathetic Dialogue Annotation
Intermediate: Annotate emotional situations and empathetic responses in conversations. Based on EmpatheticDialogues (Rashkin et al., ACL 2019). Classify the emotional context and evaluate response empathy.
Likert Scale Survey
Beginner: Multi-question survey using Likert scales to measure agreement, satisfaction, or frequency.
Machine Translation Evaluation
Intermediate: Evaluate machine translation quality with adequacy and fluency ratings.
Social Chemistry 101 (Social Norms)
Advanced: Annotate rules-of-thumb for social and moral norms. Based on Forbes et al., EMNLP 2020. Capture 12 dimensions of social judgment including cultural pressure, moral foundations, and legality.
Survey Feedback
Beginner: Multi-question survey with Likert scales, text fields, and multiple choice.
Text Summarization Evaluation
Intermediate: Rate the quality of AI-generated summaries on fluency, coherence, and faithfulness.
Toxicity Detection
Intermediate: Multi-label toxicity classification with severity ratings for content moderation.
Have a design to share?
Contribute your annotation configurations to help the community.