Announcements · 5 min read

Potato 2.2: Events, Entity Linking, Export, and 55 Survey Instruments

Potato 2.2.0 adds 9 new annotation schemas, a pluggable export system, MACE competence estimation, 55 validated survey instruments, and remote data sources.

Potato Team

We're excited to announce Potato 2.2.0, a major feature release that significantly expands what you can annotate and how you manage annotation quality. This update adds 9 new annotation schemas, a pluggable export system, MACE competence estimation, 55 validated survey instruments, and remote data sources.

New Annotation Schemas

Event Annotation

The headline annotation feature of v2.2 is N-ary event annotation. Events consist of a trigger span (the word indicating the event) and argument spans with typed semantic roles. A hub-spoke arc visualization connects triggers to their arguments.

yaml
annotation_schemes:
  - annotation_type: event_annotation
    name: events
    span_schema: entities
    event_types:
      - type: "ATTACK"
        trigger_labels: ["EVENT_TRIGGER"]
        arguments:
          - role: "attacker"
            entity_types: ["PERSON", "ORGANIZATION"]
            required: true
          - role: "target"
            entity_types: ["PERSON", "ORGANIZATION", "LOCATION"]
            required: true

This opens up information extraction, semantic role labeling, and knowledge graph construction tasks that previously required custom tooling.
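
To make the trigger-plus-arguments structure concrete, here is a minimal Python sketch of how an annotated event could be checked against the ATTACK definition above. The `Span`, `Event`, and `validate_event` names and the `EVENT_TYPES` dictionary are hypothetical illustrations, not Potato's internal data model or API.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory mirror of the ATTACK event type from the config above.
EVENT_TYPES = {
    "ATTACK": {
        "attacker": {"entity_types": {"PERSON", "ORGANIZATION"}, "required": True},
        "target": {
            "entity_types": {"PERSON", "ORGANIZATION", "LOCATION"},
            "required": True,
        },
    }
}


@dataclass
class Span:
    start: int
    end: int
    label: str


@dataclass
class Event:
    event_type: str
    trigger: Span
    arguments: dict = field(default_factory=dict)  # role name -> Span


def validate_event(event, event_types=EVENT_TYPES):
    """Return a list of problems; an empty list means the event fits the schema."""
    roles = event_types.get(event.event_type)
    if roles is None:
        return [f"unknown event type: {event.event_type}"]
    problems = []
    for role, rule in roles.items():
        span = event.arguments.get(role)
        if span is None:
            if rule["required"]:
                problems.append(f"missing required role: {role}")
        elif span.label not in rule["entity_types"]:
            problems.append(f"role {role} cannot be filled by {span.label}")
    for role in event.arguments:
        if role not in roles:
            problems.append(f"unexpected role: {role}")
    return problems
```

An event with a trigger and only an `attacker` argument would be reported as missing the required `target` role. Adding a `target` span labeled `LOCATION` makes it valid.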

Read the Event Annotation documentation →

Entity Linking

Span annotations can now be linked to external knowledge bases. Annotators highlight text, assign a label, then use a search modal to find and link the matching Wikidata, UMLS, or custom KB entity.

yaml
annotation_schemes:
  - annotation_type: span
    name: ner
    labels: [PERSON, ORGANIZATION, LOCATION]
    entity_linking:
      enabled: true
      knowledge_bases:
        - name: wikidata
          type: wikidata
          language: en

Entity linking supports a multi-select mode for ambiguous entities, and a single task can use multiple knowledge bases.
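
For a sense of what a knowledge-base lookup involves, the sketch below builds a request for Wikidata's public `wbsearchentities` search endpoint and reduces a response to candidate entities. This post does not describe how Potato queries Wikidata internally, so treat the helper names (`wikidata_search_url`, `parse_candidates`) as hypothetical; the sample payload is hand-written in the shape that endpoint returns.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def wikidata_search_url(query, language="en", limit=5):
    """Build a wbsearchentities request URL for a highlighted text span."""
    params = {
        "action": "wbsearchentities",
        "search": query,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def parse_candidates(payload):
    """Reduce a wbsearchentities JSON response to (id, label, description) tuples."""
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in payload.get("search", [])
    ]


# Hand-written sample response, not fetched from the live API.
sample = {
    "search": [
        {"id": "Q42", "label": "Douglas Adams", "description": "English writer"}
    ]
}
candidates = parse_candidates(sample)
```

The annotator would then pick one of the candidates (or several, in multi-select mode) to attach to the span.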

Read the Entity Linking documentation →

Triage, Pairwise, Coreference, and More

Six additional annotation types round out the v2.2 schema additions:

  • Triage — Accept/reject/skip interface for rapid data screening with auto-advance and keyboard shortcuts
  • Pairwise Comparison — Binary A/B or scale slider for preference learning and RLHF data collection
  • Conversation Trees — Hierarchical tree annotation with per-node ratings and path selection
  • Coreference Chains — Group coreferring mentions into chains with visual indicators
  • Segmentation Masks — New fill, eraser, and brush tools for pixel-level image annotation
  • Discontinuous Spans — allow_discontinuous: true for non-contiguous text selections

Intelligent Annotation

MACE Competence Estimation

MACE (Multi-Annotator Competence Estimation) uses a Variational Bayes EM algorithm to jointly estimate the true label of each item and a competence score (0.0-1.0) for each annotator. It identifies reliable annotators, detects spammers, and produces higher-quality predicted labels.

yaml
mace:
  enabled: true
  trigger_every_n: 10
  min_annotations_per_item: 3

MACE runs automatically in the background and integrates with the admin dashboard and adjudication system.
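
The joint estimation idea can be illustrated with a deliberately simplified sketch. It uses plain EM rather than the Variational Bayes variant, and it assumes that a "spamming" annotator picks labels uniformly at random. Both are simplifying assumptions; this is not Potato's implementation, and `estimate_competence` is a hypothetical name.

```python
def estimate_competence(annotations, num_labels, iterations=50):
    """Simplified MACE-style EM.

    annotations: dict mapping item id -> dict mapping annotator id -> label index.
    Returns (competence per annotator, predicted label per item).
    """
    annotators = sorted({a for votes in annotations.values() for a in votes})
    theta = {a: 0.8 for a in annotators}  # initial competence guess
    spam = 1.0 / num_labels  # assumption: spammers answer uniformly at random
    posteriors = {}
    for _ in range(iterations):
        # E-step: posterior over each item's true label given current competences.
        for item, votes in annotations.items():
            scores = []
            for t in range(num_labels):
                p = 1.0
                for a, label in votes.items():
                    p *= theta[a] * (label == t) + (1.0 - theta[a]) * spam
                scores.append(p)
            total = sum(scores)
            posteriors[item] = [s / total for s in scores]
        # M-step: competence = expected share of an annotator's non-spam answers.
        for a in annotators:
            expected, count = 0.0, 0
            not_spam = theta[a] / (theta[a] + (1.0 - theta[a]) * spam)
            for item, votes in annotations.items():
                if a in votes:
                    expected += posteriors[item][votes[a]] * not_spam
                    count += 1
            # Clamp away from 0 and 1 so the E-step never divides by zero.
            theta[a] = min(max(expected / count, 1e-6), 1.0 - 1e-6)
    predicted = {
        item: max(range(num_labels), key=q.__getitem__)
        for item, q in posteriors.items()
    }
    return theta, predicted
```

Given three annotators who agree with each other and one who always answers with the same label, the sketch assigns the first three a competence near 1.0 and pushes the fourth toward 0.0, while the predicted labels follow the reliable annotators.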

Read the MACE documentation →

Option Highlighting

Option highlighting is a new AI feature that analyzes the content of each item and highlights the most likely correct options for discrete annotation tasks. The top-k options display at full opacity with a star indicator, while less likely options are dimmed.

yaml
ai_support:
  option_highlighting:
    enabled: true
    top_k: 3
    dim_opacity: 0.4
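
How `top_k` and `dim_opacity` interact can be sketched in a few lines of Python. The function name and the likelihood scores are made up for illustration; how Potato's AI backend actually scores the options is not covered in this post.

```python
def option_opacities(scores, top_k=3, dim_opacity=0.4):
    """Map each option to (opacity, starred) given per-option likelihood scores."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    top = set(ranked[:top_k])
    return {
        option: (1.0, True) if option in top else (dim_opacity, False)
        for option in scores
    }


# Made-up scores for a four-way sentiment question.
scores = {"positive": 0.70, "neutral": 0.20, "negative": 0.08, "mixed": 0.02}
display = option_opacities(scores, top_k=2)
```

With `top_k=2`, "positive" and "neutral" stay at full opacity with a star, and the other two options are rendered at 0.4 opacity. All options remain selectable in this sketch; only their visual emphasis changes.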

Read the Option Highlighting documentation →

Diversity Ordering

Items are embedded with a sentence-transformer model and grouped into clusters of similar items. Round-robin sampling then presents items from different clusters in turn, which reduces annotator fatigue and improves coverage of the topic space.

yaml
assignment_strategy: diversity_clustering
diversity_ordering:
  enabled: true
  prefill_count: 100
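
The round-robin step can be sketched as follows, assuming cluster IDs have already been computed (for example by running k-means over the embeddings; the clustering method Potato uses is not specified here). The function name is hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest

_MISSING = object()  # sentinel for clusters that run out of items early


def round_robin_order(item_ids, cluster_ids):
    """Interleave items so that consecutive items come from different clusters."""
    clusters = defaultdict(list)
    for item, cluster in zip(item_ids, cluster_ids):
        clusters[cluster].append(item)
    ordered = []
    # Take one item from each cluster per round until all clusters are empty.
    for round_items in zip_longest(*clusters.values(), fillvalue=_MISSING):
        ordered.extend(item for item in round_items if item is not _MISSING)
    return ordered


order = round_robin_order(
    ["a", "b", "c", "d", "e", "f"],
    [0, 0, 0, 1, 1, 2],
)
```

Here `order` is `["a", "d", "f", "b", "e", "c"]`: the first three items come from three different clusters, and the largest cluster is spread across the queue instead of appearing as one block.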

Read the Diversity Ordering documentation →

Export System

The new export CLI (python -m potato.export) converts annotations to 6 industry-standard formats with a single command:

bash
python -m potato.export --config config.yaml --format coco --output ./export/
python -m potato.export --config config.yaml --format yolo --output ./export/
python -m potato.export --config config.yaml --format conll_2003 --output ./export/

Supported formats: COCO, YOLO, Pascal VOC, CoNLL-2003, CoNLL-U, and Segmentation Masks. The system is extensible — create custom exporters by subclassing BaseExporter.
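
To show the kind of conversion an exporter performs, the sketch below turns a COCO-style pixel bounding box (`[x_min, y_min, width, height]`) into a YOLO label line (class ID followed by center and size, normalized by the image dimensions). It is a standalone illustration of the two formats, not Potato's exporter code, and it does not use the `BaseExporter` interface, which this post does not show.

```python
def coco_to_yolo(bbox, image_width, image_height):
    """Convert [x_min, y_min, width, height] in pixels to normalized YOLO values."""
    x_min, y_min, width, height = bbox
    return (
        (x_min + width / 2) / image_width,   # x center
        (y_min + height / 2) / image_height,  # y center
        width / image_width,
        height / image_height,
    )


def yolo_line(class_id, bbox, image_width, image_height):
    """Format one object as a line of a YOLO label file."""
    values = coco_to_yolo(bbox, image_width, image_height)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


line = yolo_line(0, [100, 50, 200, 100], image_width=400, image_height=200)
```

For a 200x100 box at (100, 50) in a 400x200 image, `line` is `0 0.500000 0.500000 0.500000 0.500000`.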

Read the Export Formats documentation →

Remote Data Sources

Load annotation data from URLs, S3, Google Drive, Dropbox, Hugging Face datasets, Google Sheets, and SQL databases:

yaml
data_sources:
  - type: huggingface
    dataset: "squad"
    split: "train"
 
  - type: s3
    bucket: "my-annotation-data"
    key: "datasets/items.jsonl"

Remote sources support partial/incremental loading for large datasets, local caching, and secure credential management through environment variables.
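
The two ideas in that sentence, reading credentials from the environment and loading a large JSONL file in pieces, can be sketched with the standard library alone. Both helper names are hypothetical and the batch size is arbitrary; Potato's own loader may work differently.

```python
import json
import os
from itertools import islice


def require_env(name):
    """Read a credential from the environment instead of the config file."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


def iter_batches(lines, batch_size=100):
    """Yield lists of parsed JSONL records, at most batch_size at a time."""
    # Skip blank lines and parse lazily, so only one batch is held in memory.
    records = (json.loads(line) for line in lines if line.strip())
    while True:
        batch = list(islice(records, batch_size))
        if not batch:
            return
        yield batch
```

`iter_batches` accepts any iterable of lines, such as an open file or a streamed HTTP response, so the same loop works for local and remote sources.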

Read the Remote Data Sources documentation →

Survey Instruments

Potato 2.2 ships with a library of 55 validated questionnaires, ready to use in the prestudy and poststudy phases:

yaml
phases:
  prestudy:
    type: prestudy
    instrument: "tipi"      # 10-item personality questionnaire
 
  poststudy:
    type: poststudy
    instrument: "phq-9"     # 9-item depression screening

Instruments span 8 categories: Personality (BFI-2, TIPI), Mental Health (PHQ-9, GAD-7), Affect (PANAS), Self-Concept (RSE), Social Attitudes (SDO-7, MFQ), Response Style, Short-Form versions, and Demographic Batteries from major surveys (ANES, GSS, ESS).

Read the Survey Instruments documentation →

UX Improvements

  • Video object tracking with keyframe interpolation
  • Bounding box annotation on PDF pages
  • External AI config file support
  • Form layout grid improvements

Upgrading to v2.2

bash
pip install --upgrade potato-annotation

Existing v2.0 and v2.1 configurations work without changes — all new features are opt-in through additional config blocks.

Have questions or feedback? Join our Discord or open an issue on GitHub.