Potato 2.2: Events, Entity Linking, Export, and 55 Survey Instruments

Potato 2.2.0 adds 9 new annotation schemas, a pluggable export system, MACE competence estimation, 55 validated survey instruments, and remote data sources.

By Potato Team

We're excited to announce Potato 2.2.0, a major feature release that significantly expands what you can annotate and how you manage annotation quality. This update adds 9 new annotation schemas, a pluggable export system, MACE competence estimation, 55 validated survey instruments, and remote data sources.

New Annotation Schemas

Event Annotation

The headline annotation feature of v2.2 is N-ary event annotation. Events consist of a trigger span (the word indicating the event) and argument spans with typed semantic roles. A hub-spoke arc visualization connects triggers to their arguments.

yaml
annotation_schemes:
  - annotation_type: event_annotation
    name: events
    span_schema: entities
    event_types:
      - type: "ATTACK"
        trigger_labels: ["EVENT_TRIGGER"]
        arguments:
          - role: "attacker"
            entity_types: ["PERSON", "ORGANIZATION"]
            required: true
          - role: "target"
            entity_types: ["PERSON", "ORGANIZATION", "LOCATION"]
            required: true

This opens up information extraction, semantic role labeling, and knowledge graph construction tasks that previously required custom tooling.
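
To make the role constraints concrete, here is a minimal Python sketch of how a downstream script might check an exported event against the ATTACK definition above. The record layout (`type`, `arguments`, `role`, `entity_type`) and the `validate_event` helper are assumptions made for illustration, not Potato's actual export format or API.

```python
# Hypothetical mirror of the ATTACK event type from the YAML config above.
EVENT_TYPES = {
    "ATTACK": {
        "attacker": {"entity_types": {"PERSON", "ORGANIZATION"}, "required": True},
        "target": {"entity_types": {"PERSON", "ORGANIZATION", "LOCATION"}, "required": True},
    }
}


def validate_event(event, event_types=EVENT_TYPES):
    """Return a list of problems with an event record; an empty list means it is valid."""
    roles = event_types.get(event["type"])
    if roles is None:
        return [f"unknown event type: {event['type']}"]

    problems = []
    filled = set()
    for arg in event["arguments"]:
        spec = roles.get(arg["role"])
        if spec is None:
            problems.append(f"unknown role: {arg['role']}")
        elif arg["entity_type"] not in spec["entity_types"]:
            problems.append(f"{arg['role']} cannot be a {arg['entity_type']}")
        filled.add(arg["role"])

    # Every role marked required in the schema must be filled by some argument.
    for role, spec in roles.items():
        if spec["required"] and role not in filled:
            problems.append(f"missing required role: {role}")
    return problems


# Assumed record shape: one trigger plus typed argument spans.
complete = {
    "type": "ATTACK",
    "trigger": "raided",
    "arguments": [
        {"role": "attacker", "entity_type": "ORGANIZATION"},
        {"role": "target", "entity_type": "LOCATION"},
    ],
}
incomplete = {
    "type": "ATTACK",
    "trigger": "raided",
    "arguments": [{"role": "attacker", "entity_type": "LOCATION"}],
}
```

Calling `validate_event(complete)` returns an empty list, while `validate_event(incomplete)` reports both the disallowed entity type and the missing `target` role.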

Read the Event Annotation documentation →

Entity Linking

Span annotations can now be linked to external knowledge bases. Annotators highlight text, assign a label, then use a search modal to find and link the matching Wikidata, UMLS, or custom KB entity.

yaml
annotation_schemes:
  - annotation_type: span
    name: ner
    labels: [PERSON, ORGANIZATION, LOCATION]
    entity_linking:
      enabled: true
      knowledge_bases:
        - name: wikidata
          type: wikidata
          language: en

Entity linking supports a multi-select mode for ambiguous entities and multiple knowledge bases in a single task.

Read the Entity Linking documentation →

Triage, Pairwise, Coreference, and More

Six additional annotation types round out the v2.2 schema additions:

  • Triage — Accept/reject/skip interface for rapid data screening with auto-advance and keyboard shortcuts
  • Pairwise Comparison — Binary A/B or scale slider for preference learning and RLHF data collection
  • Conversation Trees — Hierarchical tree annotation with per-node ratings and path selection
  • Coreference Chains — Group coreferring mentions into chains with visual indicators
  • Segmentation Masks — New fill, eraser, and brush tools for pixel-level image annotation
  • Discontinuous Spans — Set allow_discontinuous: true to allow non-contiguous text selections

Intelligent Annotation

MACE Competence Estimation

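MACE (Multi-Annotator Competence Estimation) uses a variational Bayes EM algorithm to jointly estimate each item's true label and each annotator's competence score (from 0.0 to 1.0). The scores identify reliable annotators and flag likely spammers, and the predicted labels are higher quality than labels that treat every annotator as equally reliable.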

yaml
mace:
  enabled: true
  trigger_every_n: 10
  min_annotations_per_item: 3

MACE runs automatically in the background and integrates with the admin dashboard and adjudication system.
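
For intuition about what a competence score means, the sketch below runs plain EM on a simplified MACE-style model: each annotator either copies the true label with probability equal to their competence, or guesses uniformly at random. This is an illustration only. It is not variational Bayes, it assumes a uniform guessing distribution, and it is not Potato's implementation; the function name and data layout are made up for the example.

```python
from collections import defaultdict


def estimate_competence(annotations, num_labels, iterations=20):
    """annotations: list of (item_id, annotator_id, label), labels in range(num_labels).

    Returns (competence per annotator, most probable label per item).
    """
    by_item = defaultdict(list)
    for item, annotator, label in annotations:
        by_item[item].append((annotator, label))
    competence = {annotator: 0.5 for _, annotator, _ in annotations}
    posteriors = {}

    for _ in range(iterations):
        # E-step: posterior over each item's true label given current competences.
        for item, item_votes in by_item.items():
            scores = []
            for t in range(num_labels):
                p = 1.0
                for annotator, label in item_votes:
                    theta = competence[annotator]
                    copied = theta if label == t else 0.0
                    p *= copied + (1.0 - theta) / num_labels
                scores.append(p)
            total = sum(scores)
            posteriors[item] = [s / total for s in scores]

        # M-step: competence = expected fraction of votes that were copies of the truth.
        sums, counts = defaultdict(float), defaultdict(int)
        for item, item_votes in by_item.items():
            for annotator, label in item_votes:
                theta = competence[annotator]
                copy_prob = theta / (theta + (1.0 - theta) / num_labels)
                sums[annotator] += posteriors[item][label] * copy_prob
                counts[annotator] += 1
        # Clip away from 0 and 1 so no vote ever gets exactly zero probability.
        competence = {
            a: min(max(sums[a] / counts[a], 1e-6), 1.0 - 1e-6) for a in counts
        }

    labels = {
        item: max(range(num_labels), key=lambda t: q[t])
        for item, q in posteriors.items()
    }
    return competence, labels


# Toy data: ann_a and ann_c are always right, ann_b errs once, ann_d always answers 0.
truth = [0, 1, 2, 0, 1, 2, 0, 1]
votes = []
for item, label in enumerate(truth):
    votes.append((item, "ann_a", label))
    votes.append((item, "ann_b", 2 if item == 1 else label))
    votes.append((item, "ann_c", label))
    votes.append((item, "ann_d", 0))

competence, labels = estimate_competence(votes, num_labels=3)
```

On this toy data the always-zero annotator ends up with a much lower competence than the others, and the recovered labels match `truth`.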

Read the MACE documentation →

Option Highlighting

Option highlighting is a new AI feature that analyzes content and highlights the most likely correct options for discrete annotation tasks. The top-k options display at full opacity with a star indicator, while less likely options are dimmed.

yaml
ai_support:
  option_highlighting:
    enabled: true
    top_k: 3
    dim_opacity: 0.4

Read the Option Highlighting documentation →
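
The effect of top_k and dim_opacity can be sketched in a few lines of Python. The example assumes the AI backend returns one likelihood score per option; the `highlight_options` helper and its return shape are hypothetical and only illustrate the display rule described above.

```python
def highlight_options(scores, top_k=3, dim_opacity=0.4):
    """Map each option to display hints: full opacity and a star for the top_k
    highest-scoring options, dim_opacity and no star for the rest."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    top = set(ranked[:top_k])
    return {
        option: {
            "opacity": 1.0 if option in top else dim_opacity,
            "starred": option in top,
        }
        for option in scores
    }


# Assumed model output: one score per label option.
scores = {"joy": 0.52, "anger": 0.05, "sadness": 0.21, "fear": 0.04, "surprise": 0.18}
display = highlight_options(scores, top_k=3, dim_opacity=0.4)
```

Here `joy`, `sadness`, and `surprise` stay at full opacity with a star, while `anger` and `fear` are rendered at 0.4 opacity. All options remain selectable in this sketch; only their visual weight changes.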

Diversity Ordering

Items are embedded with a sentence-transformer model and clustered by similarity; round-robin sampling then presents items from different clusters in turn. This reduces annotator fatigue and improves coverage of the topic space.

yaml
assignment_strategy: diversity_clustering
diversity_ordering:
  enabled: true
  prefill_count: 100

Read the Diversity Ordering documentation →
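
The round-robin step can be illustrated independently of the embedding model. The sketch below assumes cluster assignments have already been computed (for example by k-means over the embeddings) and interleaves items so that consecutive items come from different clusters where possible. The `round_robin_order` function is a hypothetical helper, not part of Potato.

```python
from collections import defaultdict
from itertools import chain, zip_longest


def round_robin_order(item_ids, cluster_ids):
    """Interleave items by taking one from each cluster in turn."""
    clusters = defaultdict(list)
    for item, cluster in zip(item_ids, cluster_ids):
        clusters[cluster].append(item)

    # zip_longest pads exhausted clusters with a sentinel, which is filtered out.
    sentinel = object()
    rounds = zip_longest(*clusters.values(), fillvalue=sentinel)
    return [item for item in chain.from_iterable(rounds) if item is not sentinel]


# Six items in three clusters of unequal size.
order = round_robin_order(
    ["a", "b", "c", "d", "e", "f"],
    [0, 0, 0, 1, 1, 2],
)
# order == ["a", "d", "f", "b", "e", "c"]
```

Once the smaller clusters run out, the remaining items from the largest cluster are served back to back, so diversity is highest at the start of the queue.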

Export System

The new export CLI (python -m potato.export) converts annotations to 6 industry-standard formats with a single command:

bash
python -m potato.export --config config.yaml --format coco --output ./export/
python -m potato.export --config config.yaml --format yolo --output ./export/
python -m potato.export --config config.yaml --format conll_2003 --output ./export/

Supported formats: COCO, YOLO, Pascal VOC, CoNLL-2003, CoNLL-U, and Segmentation Masks. The system is extensible — create custom exporters by subclassing BaseExporter.
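
To show the kind of conversion an exporter performs, here is a self-contained Python sketch that turns character-offset spans into a two-column, CoNLL-style token/tag listing with BIO tags. It does not use or reproduce Potato's BaseExporter interface, whose methods are not described in this post. The input layout is an assumption, spans are assumed to align with token boundaries, and the real CoNLL-2003 files carry additional columns.

```python
def to_bio_lines(tokens, spans):
    """tokens: list of (text, start, end) with character offsets.
    spans: list of (start, end, label) with character offsets.
    Returns one "token tag" line per token, joined by newlines."""
    lines = []
    for text, start, end in tokens:
        tag = "O"
        for span_start, span_end, label in spans:
            # A token belongs to a span if it lies entirely inside it.
            if start >= span_start and end <= span_end:
                prefix = "B-" if start == span_start else "I-"
                tag = prefix + label
                break
        lines.append(f"{text} {tag}")
    return "\n".join(lines)


# "Ada Lovelace visited London"
tokens = [("Ada", 0, 3), ("Lovelace", 4, 12), ("visited", 13, 20), ("London", 21, 27)]
spans = [(0, 12, "PERSON"), (21, 27, "LOCATION")]
conll_text = to_bio_lines(tokens, spans)
```

For this sentence the result tags `Ada` as `B-PERSON`, `Lovelace` as `I-PERSON`, `visited` as `O`, and `London` as `B-LOCATION`.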

Read the Export Formats documentation →

Remote Data Sources

Load annotation data from URLs, S3, Google Drive, Dropbox, Hugging Face datasets, Google Sheets, and SQL databases:

yaml
data_sources:
  - type: huggingface
    dataset: "squad"
    split: "train"
 
  - type: s3
    bucket: "my-annotation-data"
    key: "datasets/items.jsonl"

Remote data sources support partial/incremental loading for large datasets, local caching, and secure credential management through environment variables.

Read the Remote Data Sources documentation →

Survey Instruments

A library of 55 validated questionnaires ready to use in prestudy and poststudy phases:

yaml
phases:
  prestudy:
    type: prestudy
    instrument: "tipi"      # 10-item personality questionnaire
 
  poststudy:
    type: poststudy
    instrument: "phq-9"     # 9-item depression screening

Instruments span 8 categories: Personality (BFI-2, TIPI), Mental Health (PHQ-9, GAD-7), Affect (PANAS), Self-Concept (RSE), Social Attitudes (SDO-7, MFQ), Response Style, Short-Form versions, and Demographic Batteries from major surveys (ANES, GSS, ESS).

Read the Survey Instruments documentation →

UX Improvements

  • Video object tracking with keyframe interpolation
  • Bounding box annotation on PDF pages
  • External AI config file support
  • Form layout grid improvements

Upgrading to v2.2

bash
pip install --upgrade potato-annotation

Existing v2.0 and v2.1 configurations work without changes — all new features are opt-in through additional config blocks.

Have questions or feedback? Join our Discord or open an issue on GitHub.