
Configuration Basics

Learn the fundamentals of Potato configuration files.

Potato uses YAML configuration files to define annotation tasks. This guide covers the essential configuration options.

Configuration File Structure

A basic Potato configuration has these main sections:

# Server settings
port: 8000
server_name: localhost
task_name: "My Annotation Task"
 
# Data configuration
data_files:
  - data.json
id_key: id
text_key: text
 
# Output settings
output_file: annotations.json
 
# Annotation schemes
annotation_schemes:
  - annotation_type: radio
    name: my_annotation
    labels:
      - Label 1
      - Label 2
 
# User settings
allow_all_users: true

Essential Settings

Server Configuration

port: 8000              # Port to run the server on
server_name: localhost  # Server hostname
task_name: "My Task"    # Display name for your task

Data Configuration

data_files:
  - data.json           # Path to your data file(s)
  - more_data.json      # You can specify multiple files
 
id_key: id              # Field containing unique ID
text_key: text          # Field containing text to annotate

Supported data formats:

  • JSON (.json)
  • JSON Lines (.jsonl)
  • CSV (.csv)
  • TSV (.tsv)
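
To illustrate how id_key and text_key relate to a data file, the sketch below writes the same two records as JSON Lines and as CSV using only Python's standard library. The function name write_sample_data, the file names, and the record contents are made up for this example. The one-object-per-line layout and the CSV header row are assumptions, so check your Potato version for the exact shape it expects in each format.

```python
import csv
import json
from pathlib import Path

# Hypothetical records: "id" and "text" match id_key and text_key above.
RECORDS = [
    {"id": "item-1", "text": "I love this product."},
    {"id": "item-2", "text": "The delivery was late."},
]


def write_sample_data(directory):
    """Write RECORDS as sample.jsonl and sample.csv; return both paths."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)

    # JSON Lines: one JSON object per line.
    jsonl_path = out_dir / "sample.jsonl"
    with jsonl_path.open("w", encoding="utf-8") as handle:
        for record in RECORDS:
            handle.write(json.dumps(record) + "\n")

    # CSV: a header row naming the fields, then one row per record.
    csv_path = out_dir / "sample.csv"
    with csv_path.open("w", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["id", "text"])
        writer.writeheader()
        writer.writerows(RECORDS)

    return jsonl_path, csv_path
```

Either file could then be listed under data_files, with id_key: id and text_key: text pointing at the two fields.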

Output Configuration

output_file: annotations.json   # Where to save annotations

Annotation Schemes

Define one or more annotation schemes:

annotation_schemes:
  - annotation_type: radio      # Type of annotation
    name: sentiment             # Internal name
    description: "Select the sentiment"  # Instructions
    labels:                     # Options for annotators
      - Positive
      - Negative
      - Neutral

Available Annotation Types

Type         Description
radio        Single choice selection
multiselect  Multiple choice selection
likert       Rating on a scale
text         Free text input
number       Numeric input
span         Text span highlighting
slider       Continuous range selection
multirate    Rate multiple items
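
A typo in annotation_type is easy to make, so a small pre-flight check can help. The sketch below assumes the annotation_schemes list has already been loaded into Python dictionaries (for example with a YAML parser) and compares each entry against the types listed above. The function find_scheme_problems is hypothetical and not part of Potato, and treating a missing name as a problem is an assumption of this sketch.

```python
# Types copied from the list of available annotation types above.
KNOWN_TYPES = {
    "radio", "multiselect", "likert", "text",
    "number", "span", "slider", "multirate",
}


def find_scheme_problems(schemes):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    for index, scheme in enumerate(schemes):
        kind = scheme.get("annotation_type")
        if kind not in KNOWN_TYPES:
            problems.append(f"scheme {index}: unknown annotation_type {kind!r}")
        # Assumption: every scheme should carry a non-empty internal name.
        if not scheme.get("name"):
            problems.append(f"scheme {index}: missing name")
    return problems
```

Calling find_scheme_problems on the sentiment scheme shown earlier returns an empty list, while an entry with annotation_type: dropdown would be reported.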

User Configuration

Allow all users

allow_all_users: true

Restrict to specific users

allow_all_users: false
authorized_users:
  - user1@example.com
  - user2@example.com
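
The two modes above amount to a simple rule, sketched here in Python. This is an illustration of the rule as described on this page, not Potato's own access check. The function is_user_allowed is hypothetical, and treating a missing allow_all_users as false is a choice made for this sketch.

```python
def is_user_allowed(config, user):
    """Return True if `user` may annotate under the given settings."""
    # Assumption: when allow_all_users is absent, fall back to the allow list.
    if config.get("allow_all_users", False):
        return True
    return user in config.get("authorized_users", [])


open_config = {"allow_all_users": True}
restricted_config = {
    "allow_all_users": False,
    "authorized_users": ["user1@example.com", "user2@example.com"],
}
```

With restricted_config, only the two listed addresses pass the check; with open_config, any user does.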

Task Directory

The task_dir setting defines the root directory for relative paths:

task_dir: ./my-task/
data_files:
  - data/input.json    # Resolves to ./my-task/data/input.json
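
The resolution rule in the comment above can be reproduced with pathlib, which is handy for checking where a relative entry will end up before starting the server. The helper resolve_data_path is hypothetical. Leaving absolute paths unchanged is how pathlib joins paths and is assumed, not confirmed, to match Potato's behavior.

```python
from pathlib import Path


def resolve_data_path(task_dir, data_file):
    """Join a data_files entry onto task_dir.

    pathlib keeps an absolute right-hand path as-is, so only
    relative entries are placed under task_dir.
    """
    return Path(task_dir) / data_file


resolved = resolve_data_path("./my-task/", "data/input.json")
```

Here resolved points at data/input.json inside the my-task directory, matching the comment in the configuration above.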

Full Example

Here's a complete configuration for a sentiment analysis task:

# config.yaml
port: 8000
server_name: localhost
task_name: "Sentiment Analysis"
task_dir: ./
 
# Data
data_files:
  - data/tweets.json
id_key: id
text_key: text
context_key: metadata
 
# Output
output_file: output/annotations.json
 
# Annotation
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment expressed in this tweet?"
    labels:
      - Positive
      - Negative
      - Neutral
    keyboard_shortcuts:
      Positive: "1"
      Negative: "2"
      Neutral: "3"
 
# Users
allow_all_users: true
 
# Assignment
instances_per_annotator: 100
annotation_per_instance: 2
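
The two assignment settings are not defined on this page. Assuming annotation_per_instance is the number of different annotators who label each item and instances_per_annotator caps how many items one annotator receives, the sketch below estimates how many annotators a dataset needs. The function min_annotators and the dataset size of 1000 are hypothetical.

```python
import math


def min_annotators(num_instances, annotation_per_instance, instances_per_annotator):
    """Smallest number of annotators that can cover the requested labels."""
    total_annotations = num_instances * annotation_per_instance
    by_workload = math.ceil(total_annotations / instances_per_annotator)
    # Each item needs that many distinct people, however small the dataset is.
    return max(annotation_per_instance, by_workload)


# Hypothetical dataset of 1000 tweets with the settings from the example above.
needed = min_annotators(1000, annotation_per_instance=2, instances_per_annotator=100)
```

Under these assumptions, 1000 tweets labeled twice each at 100 items per person works out to 20 annotators.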
