# What's New

Source: https://www.potatoannotator.com/docs/getting-started/whats-new-v2

This page covers new features and improvements across Potato v2.x releases.

---

## Potato 2.3.0

*Released March 9, 2026*

Potato 2.3 is the largest release in Potato's history, introducing agentic annotation, Solo Mode, Best-Worst Scaling, SSO/OAuth authentication, Parquet export, 15 new demo projects, and security hardening.

### Agentic Annotation

A complete system for evaluating AI agents through human annotation. Includes 12 trace format converters, 3 specialized display types, and 9 pre-built annotation schemas.

**12 Trace Format Converters** — Import agent traces from OpenAI, Anthropic, SWE-bench, OpenTelemetry, MCP, CrewAI/AutoGen/LangGraph, LangChain, LangFuse, ReAct, WebArena/VisualWebArena, ATIF, and raw browser recordings. Auto-detection available.

```yaml
agentic:
  enabled: true
  trace_converter: react       # or openai, anthropic, webarena, auto, etc.
  trace_file: "data/traces.jsonl"
```
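
To give a feel for what a trace converter has to do, the sketch below splits a ReAct-style transcript into structured steps. The `Thought:` / `Action:` / `Observation:` line prefixes and the `parse_react_trace` helper are assumptions for illustration only; this is not Potato's converter API or its internal step format.

```python
import re

# Hypothetical helper, not part of Potato's API.
STEP_PREFIX = re.compile(r"^(Thought|Action|Observation):\s*(.*)$")

def parse_react_trace(text):
    """Group Thought/Action/Observation lines into a list of step dicts."""
    steps = []
    current = {}
    for line in text.splitlines():
        match = STEP_PREFIX.match(line.strip())
        if not match:
            continue  # ignore lines without a recognized prefix
        kind, content = match.group(1).lower(), match.group(2)
        # A new Thought, or a repeated field, starts the next step.
        if kind in current or (kind == "thought" and current):
            steps.append(current)
            current = {}
        current[kind] = content
    if current:
        steps.append(current)
    return steps

trace = """Thought: I need the weather.
Action: search[weather Ann Arbor]
Observation: 12C and cloudy.
Thought: I can answer now.
Action: finish[12C and cloudy]"""

steps = parse_react_trace(trace)
```

Each resulting dict corresponds to one step card that an annotator could rate individually.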

**3 Display Types:**

- **Agent Trace Display** — Color-coded step cards with collapsible observations, JSON pretty-printing, and timeline sidebar for tool-using agents
- **Web Agent Trace Display** — Full screenshots with SVG overlays showing click targets, text inputs, and scroll actions; filmstrip navigation for browsing agents
- **Interactive Chat Display** — Live chat mode (annotator interacts with agent via proxy) and trace review mode for conversational agents

**Per-Turn Ratings** — Rate individual steps alongside the overall trace for fine-grained evaluation.

**9 Pre-Built Schemas** — `agent_task_success`, `agent_step_correctness`, `agent_error_taxonomy`, `agent_safety`, `agent_efficiency`, `agent_instruction_following`, `agent_explanation_quality`, `agent_web_action_correctness`, `agent_conversation_quality`.

**Agent Proxy System** — OpenAI, HTTP, and echo proxies for live agent evaluation.

[Learn more about Agentic Annotation →](/docs/features/agentic-annotation)

---

### Solo Mode

A 12-phase workflow in which a single human annotator collaborates with an LLM to label an entire dataset. It targets 95%+ agreement with multi-annotator pipelines while requiring only 10-15% of the human labels.

**The 12 Phases:**

1. Seed Annotation — human labels 50 diverse instances
2. Initial LLM Calibration — LLM labels using seed examples
3. Confusion Analysis — identify systematic disagreement patterns
4. Guideline Refinement — LLM proposes, human approves updated guidelines
5. Labeling Function Generation — ALCHEmist-inspired programmatic rules
6. Active Labeling — human labels most informative instances
7. Automated Refinement Loop — iterative re-labeling with improved guidelines
8. Disagreement Exploration — human resolves LLM/LF conflicts
9. Edge Case Synthesis — LLM generates ambiguous examples for human labeling
10. Cascaded Confidence Escalation — human reviews lowest-confidence labels
11. Prompt Optimization — DSPy-inspired automated prompt search
12. Final Validation — random sample review

```yaml
solo_mode:
  enabled: true
  llm:
    endpoint_type: openai
    model: "gpt-4o"
    api_key: ${OPENAI_API_KEY}
  seed_count: 50
  accuracy_threshold: 0.92
```

**Multi-Signal Instance Prioritization** — 6 weighted pools (uncertain, disagreement, boundary, novel, error_pattern, random) for selecting the most valuable instances.
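
A minimal sketch of weighted pool selection, assuming each pool is a simple queue of instance ids. The weights and the `pick_next_instance` helper are hypothetical; this page does not state Potato's defaults or its internal selection logic.

```python
import random

# Hypothetical weights; Potato's defaults are not stated on this page.
POOL_WEIGHTS = {
    "uncertain": 0.30,
    "disagreement": 0.25,
    "boundary": 0.15,
    "novel": 0.10,
    "error_pattern": 0.10,
    "random": 0.10,
}

def pick_next_instance(pools, weights, rng):
    """Pick a pool in proportion to its weight, then pop an instance from it."""
    # Only consider pools that still have instances left.
    candidates = [name for name in weights if pools.get(name)]
    if not candidates:
        return None
    chosen = rng.choices(candidates, weights=[weights[n] for n in candidates], k=1)[0]
    return chosen, pools[chosen].pop(0)

pools = {"uncertain": ["inst_7", "inst_3"], "novel": ["inst_9"], "random": []}
rng = random.Random(0)  # seeded so a run is repeatable
picked = [pick_next_instance(pools, POOL_WEIGHTS, rng) for _ in range(4)]
```

Empty pools are skipped, so the remaining weights are effectively renormalized over the pools that still have candidates.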

[Learn more about Solo Mode →](/docs/features/solo-mode)

---

### Best-Worst Scaling

Efficient comparative annotation where annotators select the best and worst items from tuples. Automatic tuple generation with balanced incomplete block designs and three scoring methods (Counting, Bradley-Terry, Plackett-Luce).

```yaml
annotation_schemes:
  - annotation_type: best_worst_scaling
    name: fluency
    items_key: "translations"
    tuple_size: 4
    best_label: "Most Fluent"
    worst_label: "Least Fluent"
    scoring:
      method: bradley_terry
```
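
As a rough illustration of the Counting method, the sketch below uses the usual best-minus-worst formula from the BWS literature. Whether Potato normalizes scores in exactly this way is an assumption, and `bws_counting_scores` is a hypothetical helper rather than a Potato function.

```python
from collections import Counter

def bws_counting_scores(judgments):
    """Score each item as (#best - #worst) / #appearances, in [-1, 1]."""
    best, worst, seen = Counter(), Counter(), Counter()
    for tuple_items, best_item, worst_item in judgments:
        seen.update(tuple_items)
        best[best_item] += 1
        worst[worst_item] += 1
    return {item: (best[item] - worst[item]) / seen[item] for item in seen}

# Each judgment: (items shown, item picked as best, item picked as worst).
judgments = [
    (("A", "B", "C", "D"), "A", "D"),
    (("A", "B", "C", "E"), "A", "C"),
    (("B", "C", "D", "E"), "B", "D"),
]
scores = bws_counting_scores(judgments)
```

Bradley-Terry and Plackett-Luce instead fit a probabilistic model to the same judgments, which generally helps when items appear in an uneven number of tuples.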

[Learn more about Best-Worst Scaling →](/docs/annotation-types/best-worst-scaling)

---

### SSO & OAuth Authentication

Production-ready authentication with Google OAuth (domain restriction), GitHub OAuth (organization restriction), and generic OIDC (Okta, Azure AD, Auth0, Keycloak). Supports auto-registration, mixed mode, and session management.

```yaml
authentication:
  method: google_oauth
  google_oauth:
    client_id: ${GOOGLE_CLIENT_ID}
    client_secret: ${GOOGLE_CLIENT_SECRET}
    allowed_domains:
      - "umich.edu"
    auto_register: true
```
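
Conceptually, domain restriction compares the domain of the authenticated email address with `allowed_domains`. The sketch below assumes an exact, case-insensitive match; `is_allowed_email` is a hypothetical helper, and how Potato treats subdomains is not stated here.

```python
def is_allowed_email(email, allowed_domains):
    """Return True if the email's domain exactly matches an allowed domain."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        return False  # not a usable address
    return domain.lower() in {d.lower() for d in allowed_domains}

allowed = ["umich.edu"]
results = {
    "jane@umich.edu": is_allowed_email("jane@umich.edu", allowed),
    "jane@umich.edu.example.com": is_allowed_email("jane@umich.edu.example.com", allowed),
}
```

Matching on the full domain rather than a substring keeps look-alike domains such as `umich.edu.example.com` out.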

[Learn more about SSO & OAuth →](/docs/deployment/sso-oauth)

---

### Parquet Export

Export annotations to Apache Parquet format, producing three structured files: `annotations.parquet`, `spans.parquet`, and `items.parquet`. Supports snappy, gzip, zstd, lz4, and brotli compression, incremental export, and date/annotator partitioning. Compatible with pandas, DuckDB, PyArrow, Polars, and Hugging Face Datasets.

```yaml
parquet_export:
  enabled: true
  output_dir: "output/parquet/"
  compression: zstd
  auto_export: true
```

[Learn more about Parquet Export →](/docs/features/parquet-export)

---

### 15 New Demo Projects

New demos in `project-hub/` covering agentic annotation (5 demos), Solo Mode (3 demos), Best-Worst Scaling (3 demos), authentication (2 demos), and export workflows (2 demos). Start any demo with `potato start config.yaml`.

---

### Security Hardening

- Cryptographically secure session tokens with configurable expiration
- CSRF protection enabled by default
- Rate limiting on authentication endpoints
- Input sanitization for user-provided content
- Dependency audit with all packages updated
- Content Security Policy headers

---

### Other Improvements

- Custom trace converters for unsupported agent frameworks
- Hybrid Solo Mode with multi-annotator verification sampling
- BWS admin dashboard tab with score convergence charts
- Incremental Parquet export with date partitioning

---

### v2.2 vs v2.3 Comparison

| Feature | v2.2 | v2.3 |
|---------|------|------|
| Agentic Annotation | Not available | 12 converters, 3 displays, 9 schemas |
| Solo Mode | Not available | 12-phase human-LLM workflow |
| Best-Worst Scaling | Not available | BWS with 3 scoring methods |
| Authentication | Username only | + Google OAuth, GitHub OAuth, OIDC |
| Parquet Export | Not available | 3-file Parquet with 5 compression codecs (snappy, gzip, zstd, lz4, brotli) |
| Demo Projects | 125+ | 140+ (15 new) |
| Security | Basic | CSRF, rate limiting, CSP, secure sessions |

---

## Potato 2.2.0

*Released February 20, 2026*

Potato 2.2 is a major feature release with 9 new annotation schemas, a pluggable export system, MACE competence estimation, 55 validated survey instruments, and remote data sources.

### New Annotation Schemas (9)

**Event Annotation** — N-ary event structures with trigger spans and typed argument roles. Annotate events like ATTACK, HIRE, and TRAVEL with constrained entity arguments and hub-spoke arc visualization.

```yaml
annotation_schemes:
  - annotation_type: event_annotation
    name: events
    span_schema: entities
    event_types:
      - type: "ATTACK"
        trigger_labels: ["EVENT_TRIGGER"]
        arguments:
          - role: "attacker"
            entity_types: ["PERSON", "ORGANIZATION"]
            required: true
```
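
The argument constraints in this config can be read as a small validation rule: each filled role must accept the entity type of its span, and every required role must be present. A minimal sketch under that reading follows; the event dict layout and `validate_event` are assumptions, not Potato's internal representation.

```python
# Mirrors the ATTACK definition from the config above as a plain dict.
EVENT_TYPES = {
    "ATTACK": {
        "arguments": [
            {"role": "attacker", "entity_types": ["PERSON", "ORGANIZATION"], "required": True},
        ]
    }
}

def validate_event(event, event_types):
    """Return a list of problems; an empty list means the event is valid."""
    spec = event_types.get(event["type"])
    if spec is None:
        return [f"unknown event type: {event['type']}"]
    problems = []
    roles = {arg["role"]: arg for arg in spec["arguments"]}
    # Every filled role must exist and accept the entity type of its span.
    for role, entity_type in event["arguments"].items():
        if role not in roles:
            problems.append(f"unexpected role: {role}")
        elif entity_type not in roles[role]["entity_types"]:
            problems.append(f"{role} cannot be filled by {entity_type}")
    # Every required role must be filled.
    for role, arg in roles.items():
        if arg.get("required") and role not in event["arguments"]:
            problems.append(f"missing required role: {role}")
    return problems

ok = validate_event({"type": "ATTACK", "arguments": {"attacker": "PERSON"}}, EVENT_TYPES)
bad = validate_event({"type": "ATTACK", "arguments": {"attacker": "LOCATION"}}, EVENT_TYPES)
```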

[Learn more about Event Annotation →](/docs/annotation-types/event-annotation)

**Entity Linking** — Link span annotations to external knowledge bases (Wikidata, UMLS, custom REST APIs). Add an `entity_linking:` block to any span schema to enable KB search and linking.

[Learn more about Entity Linking →](/docs/annotation-types/entity-linking)

**Triage** — Prodigy-style accept/reject/skip interface for rapid data screening. Customizable labels, keyboard shortcuts, and auto-advance for high-throughput annotation.

[Learn more about Triage →](/docs/annotation-types/triage)

**Pairwise Comparison** — Compare two items with binary (click preferred tile) or scale (slider) modes. Supports `items_key`, `allow_tie`, `scale:` block with configurable range.

[Learn more about Pairwise Comparison →](/docs/annotation-types/pairwise-comparison)

**Conversation Trees** — Annotate hierarchical conversation structures with per-node ratings, path selection, and branch comparison.

[Learn more about Conversation Trees →](/docs/annotation-types/conversation-trees)

**Coreference Chains** — Group coreferring text mentions into chains with visual indicators. Supports entity types, singleton control, and multiple highlight modes.

[Learn more about Coreference Chains →](/docs/annotation-types/coreference)

**Segmentation Masks** — New `fill`, `eraser`, and `brush` tools for pixel-level image segmentation.

**Bounding Box for PDF/Documents** — Draw boxes on PDF pages for document annotation tasks.

**Discontinuous Spans** — `allow_discontinuous: true` enables selecting non-contiguous text segments as a single span.

---

### Intelligent Annotation

**MACE Competence Estimation** — Variational Bayes EM algorithm that jointly estimates true labels and annotator competence scores (0.0-1.0). Works with radio, likert, select, and multiselect schemas.

```yaml
mace:
  enabled: true
  trigger_every_n: 10
  min_annotations_per_item: 3
```

[Learn more about MACE →](/docs/features/mace)

**Option Highlighting** — LLM-based highlighting of likely correct options for discrete annotation tasks. Highlights top-k options with a star indicator while dimming less-likely options.

```yaml
ai_support:
  option_highlighting:
    enabled: true
    top_k: 3
    dim_opacity: 0.4
```
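
To make `top_k` and `dim_opacity` concrete, here is a small sketch that turns per-option likelihoods into display hints. The scores and the `highlight_options` helper are hypothetical; how Potato obtains likelihoods from the LLM is not covered on this page.

```python
def highlight_options(scores, top_k=3, dim_opacity=0.4):
    """Map each option to display hints: starred for the top-k, dimmed otherwise."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    top = set(ranked[:top_k])
    return {
        option: {"starred": option in top, "opacity": 1.0 if option in top else dim_opacity}
        for option in scores
    }

# Hypothetical LLM likelihoods for five labels.
scores = {"joy": 0.50, "anger": 0.05, "sadness": 0.20, "fear": 0.15, "neutral": 0.10}
display = highlight_options(scores)
```

All options stay selectable in this sketch; the dimming only changes how they are rendered.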

[Learn more about Option Highlighting →](/docs/features/option-highlighting)

**Diversity Ordering** — Embedding-based clustering and round-robin sampling to ensure annotators see diverse content rather than similar items in sequence.

```yaml
assignment_strategy: diversity_clustering
diversity_ordering:
  enabled: true
  prefill_count: 100
```
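
The round-robin step can be sketched as follows, assuming cluster labels have already been computed (for example by k-means over item embeddings, which is omitted here). `round_robin_order` is a hypothetical helper, not Potato's implementation.

```python
from collections import defaultdict
from itertools import zip_longest

def round_robin_order(item_ids, cluster_labels):
    """Take one item from each cluster in turn until every cluster is exhausted."""
    clusters = defaultdict(list)
    for item_id, label in zip(item_ids, cluster_labels):
        clusters[label].append(item_id)
    missing = object()  # sentinel marking clusters that have run out
    ordered = []
    for round_items in zip_longest(*clusters.values(), fillvalue=missing):
        ordered.extend(item for item in round_items if item is not missing)
    return ordered

order = round_robin_order(
    ["a1", "a2", "a3", "b1", "c1", "c2"],
    [0, 0, 0, 1, 2, 2],
)
```

Items from the largest cluster end up clustered at the tail once the smaller clusters run out, which is one reason to cap how many items are pre-ordered at a time.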

[Learn more about Diversity Ordering →](/docs/features/diversity-ordering)

---

### Export System

A new pluggable export CLI (`python -m potato.export`) converts annotations to 6 industry-standard formats: COCO, YOLO, Pascal VOC, CoNLL-2003, CoNLL-U, and Segmentation Masks.

```bash
python -m potato.export --config config.yaml --format coco --output ./export/
```

[Learn more about Export Formats →](/docs/features/export-formats)

---

### Remote Data Sources

Load annotation data from URLs, S3, Google Drive, Dropbox, Hugging Face, Google Sheets, and SQL databases via the new `data_sources:` config block. Includes partial loading, caching, and credential management.

[Learn more about Remote Data Sources →](/docs/features/remote-data-sources)

---

### Survey Instruments

55 validated questionnaires across 8 categories (Personality, Mental Health, Affect, Self-Concept, Social Attitudes, Response Style, Short-Form, Demographics). Use in prestudy/poststudy phases with `instrument: "tipi"`.

[Learn more about Survey Instruments →](/docs/features/survey-instruments)

---

### Other Improvements

- Video object tracking with keyframe interpolation
- External AI config file support
- Form layout grid improvements
- Format handlers for PDF, Word, code, and spreadsheets

---

## Potato 2.1.0

*Released February 5, 2026*

Potato 2.1 introduces the instance display system, visual AI support, span linking, multi-field span annotation, and layout customization.

### Instance Display System

A new `instance_display` config block that separates content display from annotation. Display any combination of images, videos, audio, text, and dialogues alongside any annotation schemes.

```yaml
instance_display:
  fields:
    - key: image_url
      type: image
      display_options:
        max_width: 600
        zoomable: true
    - key: description
      type: text

annotation_schemes:
  - annotation_type: radio
    name: category
    labels: [nature, urban, people]
```

Supports 11 display types: `text`, `html`, `image`, `video`, `audio`, `dialogue`, `pairwise`, `code`, `spreadsheet`, `document`, and `pdf`.

[Learn more about Instance Display →](/docs/core-concepts/instance-display)

---

### Multi-Field Span Annotation

Span annotation schemes now support a `target_field` option to annotate across multiple text fields in the same instance.

```yaml
annotation_schemes:
  - annotation_type: span
    name: source_entities
    target_field: "source_text"
    labels: [PERSON, ORGANIZATION]

  - annotation_type: span
    name: summary_entities
    target_field: "summary"
    labels: [PERSON, ORGANIZATION]
```

[Learn more about Span Annotation →](/docs/annotation-types/span-annotation)

---

### Span Linking

A new `span_link` annotation type for creating typed relationships between annotated spans. Supports directed and undirected links, n-ary relationships, visual arc display, and label constraints.

```yaml
annotation_schemes:
  - annotation_type: span
    name: entities
    labels:
      - name: "PERSON"
        color: "#3b82f6"
      - name: "ORGANIZATION"
        color: "#22c55e"

  - annotation_type: span_link
    name: relations
    span_schema: entities
    link_types:
      - name: "WORKS_FOR"
        directed: true
        allowed_source_labels: ["PERSON"]
        allowed_target_labels: ["ORGANIZATION"]
        color: "#dc2626"
```
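
The label constraints amount to a check on the source and target span labels of a proposed link. The sketch below mirrors the `WORKS_FOR` definition above; `link_is_allowed` is a hypothetical helper, and the handling of undirected links is an assumption.

```python
LINK_TYPES = {
    "WORKS_FOR": {
        "directed": True,
        "allowed_source_labels": ["PERSON"],
        "allowed_target_labels": ["ORGANIZATION"],
    }
}

def link_is_allowed(link_type, source_label, target_label, link_types=LINK_TYPES):
    """Check a proposed link against the label constraints of its type."""
    spec = link_types.get(link_type)
    if spec is None:
        return False
    forward = (source_label in spec["allowed_source_labels"]
               and target_label in spec["allowed_target_labels"])
    if spec["directed"]:
        return forward
    # Assumption: an undirected link may be drawn in either direction.
    backward = (target_label in spec["allowed_source_labels"]
                and source_label in spec["allowed_target_labels"])
    return forward or backward

valid = link_is_allowed("WORKS_FOR", "PERSON", "ORGANIZATION")
reversed_link = link_is_allowed("WORKS_FOR", "ORGANIZATION", "PERSON")
```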

[Learn more about Span Linking →](/docs/annotation-types/span-linking)

---

### Visual AI Support

Four new vision endpoints for AI-powered image and video annotation assistance:

- **YOLO** — Fast local object detection
- **Ollama Vision** — Local vision-language models (LLaVA, Qwen-VL)
- **OpenAI Vision** — GPT-4o cloud vision
- **Anthropic Vision** — Claude with vision

Features include object detection, pre-annotation, classification, hints, scene detection, keyframe detection, and object tracking.

[Learn more about Visual AI Support →](/docs/features/visual-ai-support)

---

### Layout Customization

Create sophisticated custom visual layouts using HTML templates and CSS. Potato generates an editable layout file, or you can provide a fully custom template with grid layouts, color-coded options, and section styling.

```yaml
task_layout: layouts/custom_task_layout.html
```

Three example layouts included: content moderation, dialogue QA, and medical review.

[Learn more about Layout Customization →](/docs/features/layout-customization)

---

### Label Rationales

A fourth AI capability, alongside hints, keyword highlighting, and label suggestions, that generates balanced explanations for why each label might apply, helping annotators understand different classification perspectives.

```yaml
ai_support:
  features:
    rationales:
      enabled: true
```

[Learn more about AI Support →](/docs/features/ai-support)

---

### Other Improvements

- 50+ new tests for improved reliability
- Responsive design improvements
- Enhanced project-hub organization with layout examples
- Bug fixes across annotation types

---

### v2.0 vs v2.1 Comparison

| Feature | v2.0 | v2.1 |
|---------|------|------|
| Instance Display | Via annotation hacks | Dedicated `instance_display` block |
| Span Targets | Single text field | Multi-field with `target_field` |
| Span Linking | Not available | Full `span_link` type |
| Visual AI | Not available | YOLO, Ollama Vision, OpenAI Vision, Anthropic Vision |
| Layout Customization | Basic auto-generated | Auto-generated + custom templates |
| AI Capabilities | 3 (hints, keywords, suggestions) | 4 (+ rationales) |

---

## Potato 2.0

Potato 2.0 is a major release that adds AI assistance, active learning, training phases, multi-phase workflows, and a MySQL backend.

### AI Support

Integrate Large Language Models to assist annotators with intelligent hints, keyword highlighting, and label suggestions.

**Supported providers:**
- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude 3, Claude 3.5)
- Google (Gemini)
- Ollama (local models)
- vLLM (self-hosted)

```yaml
ai_support:
  enabled: true
  endpoint_type: openai
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
  features:
    hints:
      enabled: true
    label_suggestions:
      enabled: true
```

[Learn more about AI Support →](/docs/features/ai-support)

---

### Audio Annotation

Full-featured audio annotation with waveform visualization powered by Peaks.js. Create segments, label time regions, and annotate speech with keyboard shortcuts.

**Key features:**
- Waveform visualization
- Segment creation and labeling
- Per-segment annotation questions
- 15+ keyboard shortcuts
- Server-side waveform caching

```yaml
annotation_schemes:
  - annotation_type: audio
    name: speakers
    mode: label
    labels:
      - Speaker A
      - Speaker B
```

[Learn more about Audio Annotation →](/docs/features/audio-annotation)

---

### Active Learning

Automatically prioritize annotation instances based on model uncertainty. Train classifiers on existing annotations and focus annotators on the most informative examples.

**Capabilities:**
- Multiple classifier options (LogisticRegression, RandomForest, SVC, MultinomialNB)
- Various vectorizers (TF-IDF, Count, Hashing)
- Model persistence across restarts
- LLM-enhanced selection
- Multi-schema support

```yaml
active_learning:
  enabled: true
  schema_names:
    - sentiment
  min_instances_for_training: 30
  update_frequency: 50
  classifier:
    type: LogisticRegression
```
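
One common way to turn "model uncertainty" into an ordering is to rank instances by the entropy of the predicted class distribution. The sketch below assumes that measure and uses made-up probabilities; this page does not say which uncertainty measure Potato applies.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def rank_by_uncertainty(predictions):
    """Return instance ids ordered from most to least uncertain."""
    return sorted(predictions, key=lambda i: entropy(predictions[i]), reverse=True)

# Hypothetical classifier outputs: instance id -> class probabilities.
predictions = {
    "doc_1": [0.98, 0.01, 0.01],
    "doc_2": [0.34, 0.33, 0.33],
    "doc_3": [0.70, 0.20, 0.10],
}
queue = rank_by_uncertainty(predictions)
```

Here `doc_2` comes first because the classifier is nearly undecided between the three classes, so a human label for it is likely to be the most informative.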

[Learn more about Active Learning →](/docs/features/active-learning)

---

### Training Phase

Qualify annotators with practice questions before the main task. Provide immediate feedback and ensure quality through configurable passing criteria.

**Features:**
- Practice questions with known answers
- Immediate feedback and explanations
- Configurable passing criteria
- Retry options
- Progress tracking in admin dashboard

```yaml
phases:
  training:
    enabled: true
    data_file: "data/training.json"
    passing_criteria:
      min_correct: 8
      total_questions: 10
```

[Learn more about Training Phase →](/docs/features/training-phase)

---

### Enhanced Admin Dashboard

Comprehensive monitoring and management interface for annotation tasks.

**Dashboard tabs:**
- **Overview**: High-level metrics and completion rates
- **Annotators**: Performance tracking, timing analysis
- **Instances**: Browse data with disagreement scores
- **Configuration**: Real-time settings adjustment

```yaml
admin_api_key: ${ADMIN_API_KEY}
```

[Learn more about Admin Dashboard →](/docs/features/admin-dashboard)

---

### Database Backend

MySQL support for large-scale deployments with connection pooling and transaction support.

```yaml
database:
  type: mysql
  host: localhost
  database: potato_db
  user: ${DB_USER}
  password: ${DB_PASSWORD}
```

Potato automatically creates required tables on first startup.

---

### Annotation History

Complete tracking of all annotation changes with timestamps, user IDs, and action types. Enables auditing and behavioral analysis.

```json
{
  "history": [
    {
      "timestamp": "2024-01-15T10:30:00Z",
      "user": "annotator_1",
      "action": "create",
      "schema": "sentiment",
      "value": "Positive"
    }
  ]
}
```
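
A history in this shape is straightforward to audit with a few lines of Python. The sketch below counts actions per user and keeps the most recent value per schema; the `update` action and the second record are hypothetical additions for illustration.

```python
import json
from collections import Counter

raw = """
{
  "history": [
    {"timestamp": "2024-01-15T10:30:00Z", "user": "annotator_1",
     "action": "create", "schema": "sentiment", "value": "Positive"},
    {"timestamp": "2024-01-15T10:31:10Z", "user": "annotator_1",
     "action": "update", "schema": "sentiment", "value": "Neutral"}
  ]
}
"""

def summarize_history(document):
    """Count actions per user and keep the latest value per (user, schema)."""
    # ISO 8601 timestamps in the same format and zone sort correctly as strings.
    events = sorted(document["history"], key=lambda e: e["timestamp"])
    actions = Counter((e["user"], e["action"]) for e in events)
    latest = {(e["user"], e["schema"]): e["value"] for e in events}
    return actions, latest

actions, latest = summarize_history(json.loads(raw))
```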

---

### Multi-Phase Workflows

Build complex annotation workflows with multiple sequential phases:

1. **Consent** - Informed consent collection
2. **Pre-study** - Demographics and screening
3. **Instructions** - Task guidelines
4. **Training** - Practice questions
5. **Annotation** - Main task
6. **Post-study** - Feedback surveys

```yaml
phases:
  consent:
    enabled: true
    data_file: "data/consent.json"
  prestudy:
    enabled: true
    data_file: "data/demographics.json"
  training:
    enabled: true
    data_file: "data/training.json"
  poststudy:
    enabled: true
    data_file: "data/feedback.json"
```

[Learn more about Multi-Phase Workflows →](/docs/features/surveyflow)

---

## v2.0 Configuration Changes

### New Configuration Structure

Potato 2.0 groups the item keys under `item_properties` and replaces `output_file` with an output directory and format:

**v1 (old):**
```yaml
data_files:
  - data.json
id_key: id
text_key: text
output_file: annotations.json
```

**v2 (new):**
```yaml
data_files:
  - "data/data.json"

item_properties:
  id_key: id
  text_key: text

output_annotation_dir: "output/"
output_annotation_format: "json"
```

### Security Requirement

Configuration files must now be located within the `task_dir`:

```yaml
# Valid - config.yaml is in the project directory
task_dir: "."

# Valid - config in configs/ subdirectory
task_dir: "my_project/"
```

---

## Quick Comparison

| Feature | v1 | v2.0 | v2.1 | v2.2 | v2.3 |
|---------|----|----|------|------|------|
| AI/LLM Support | No | Yes | Yes + Visual AI + Rationales | + Option Highlighting | + Solo Mode |
| Agentic Annotation | No | No | No | No | 12 converters, 3 displays |
| Best-Worst Scaling | No | No | No | No | Yes (3 scoring methods) |
| Audio Annotation | Basic | Full waveform | Full waveform | Full waveform | Full waveform |
| Active Learning | No | Yes | Yes | Yes + Diversity Ordering | + Solo Mode integration |
| Instance Display | No | No | Yes | Yes | Yes |
| Span Linking | No | No | Yes | Yes | Yes |
| Event Annotation | No | No | No | Yes | Yes |
| Entity Linking | No | No | No | Yes | Yes |
| Pairwise/Triage/Coreference/Trees | No | No | No | Yes | Yes |
| Layout Customization | No | Auto-generated | Auto + Custom templates | Auto + Custom templates | Auto + Custom templates |
| Training Phase | No | Yes | Yes | Yes | Yes |
| Admin Dashboard | Basic | Enhanced | Enhanced | Enhanced + MACE | + BWS tab, Solo Mode |
| Database Backend | File only | File + MySQL | File + MySQL | File + MySQL | File + MySQL |
| Export CLI | No | No | No | Yes (COCO, YOLO, CoNLL, etc.) | + Parquet |
| Authentication | Username | Username | Username | Username | + Google/GitHub OAuth, OIDC |
| Survey Instruments | No | No | No | 55 validated questionnaires | 55 validated questionnaires |
| Remote Data Sources | No | No | No | S3, GDrive, HuggingFace, etc. | S3, GDrive, HuggingFace, etc. |

---

## Migration Guide

### Updating Your Configuration (v1 to v2)

1. **Data configuration**
   ```yaml
   # Old
   id_key: id
   text_key: text

   # New
   item_properties:
     id_key: id
     text_key: text
   ```

2. **Output configuration**
   ```yaml
   # Old
   output_file: annotations.json

   # New
   output_annotation_dir: "output/"
   output_annotation_format: "json"
   ```

3. **Config file location**
   Ensure your config file is inside the project directory.
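
The first two steps are mechanical enough to script. Below is a minimal sketch that rewrites a v1 config, already loaded as a dict, into the v2 layout. `migrate_v1_config` is a hypothetical helper, and deriving the output format from the old file extension is an assumption.

```python
import os

def migrate_v1_config(v1):
    """Return a v2-style config dict from a v1-style one (input is not modified)."""
    moved = ("id_key", "text_key", "output_file")
    v2 = {key: value for key, value in v1.items() if key not in moved}
    # Step 1: id_key and text_key move under item_properties.
    item_properties = {key: v1[key] for key in ("id_key", "text_key") if key in v1}
    if item_properties:
        v2["item_properties"] = item_properties
    # Step 2: output_file becomes an output directory plus a format.
    if "output_file" in v1:
        # Assumption: reuse the old file extension, falling back to "json".
        extension = os.path.splitext(v1["output_file"])[1].lstrip(".") or "json"
        v2["output_annotation_dir"] = "output/"
        v2["output_annotation_format"] = extension
    return v2

v1 = {
    "data_files": ["data.json"],
    "id_key": "id",
    "text_key": "text",
    "output_file": "annotations.json",
}
v2 = migrate_v1_config(v1)
```

Keys the helper does not recognize are copied through unchanged, so review the result by hand before relying on it.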

### Starting the Server

```bash
# v2 command
python -m potato start config.yaml -p 8000

# Or shorthand
potato start config.yaml
```

---

## Getting Started

Ready to try Potato? Start with the [Quick Start Guide](/docs/getting-started/quick-start) or explore specific features:

**v2.3 Features:**
- [Agentic Annotation](/docs/features/agentic-annotation) - Evaluate AI agents with 12 converters and 3 display types
- [Solo Mode](/docs/features/solo-mode) - Human-LLM collaborative labeling
- [Best-Worst Scaling](/docs/annotation-types/best-worst-scaling) - Comparative annotation with scoring
- [SSO & OAuth](/docs/deployment/sso-oauth) - Google, GitHub, and OIDC authentication
- [Parquet Export](/docs/features/parquet-export) - Columnar data export

**v2.2 Features:**
- [Event Annotation](/docs/annotation-types/event-annotation) - N-ary event structures
- [Entity Linking](/docs/annotation-types/entity-linking) - Knowledge base linking
- [Triage](/docs/annotation-types/triage) - Rapid data screening
- [Coreference Chains](/docs/annotation-types/coreference) - Entity coreference
- [Conversation Trees](/docs/annotation-types/conversation-trees) - Hierarchical dialogue annotation
- [MACE](/docs/features/mace) - Annotator competence estimation
- [Option Highlighting](/docs/features/option-highlighting) - AI-assisted option guidance
- [Diversity Ordering](/docs/features/diversity-ordering) - Embedding-based item ordering
- [Export Formats](/docs/features/export-formats) - Export CLI with 6 formats
- [Remote Data Sources](/docs/features/remote-data-sources) - Cloud data loading
- [Survey Instruments](/docs/features/survey-instruments) - 55 validated questionnaires

**v2.1 Features:**
- [Instance Display](/docs/core-concepts/instance-display) - Multi-modal content display
- [Visual AI Support](/docs/features/visual-ai-support) - AI for image and video annotation
- [Span Linking](/docs/annotation-types/span-linking) - Entity relationship annotation

**Core Features:**
- [AI Support](/docs/features/ai-support) - Intelligent annotation assistance
- [Active Learning](/docs/features/active-learning) - Smart instance prioritization
- [Audio Annotation](/docs/features/audio-annotation) - Waveform-based annotation
- [Training Phase](/docs/features/training-phase) - Annotator qualification
- [Admin Dashboard](/docs/features/admin-dashboard) - Monitoring and management
