Configuration Basics

Learn the fundamentals of Potato configuration files.

Potato uses YAML configuration files to define annotation tasks. This guide covers the essential configuration options.

Configuration File Structure

A basic Potato configuration has these main sections:

yaml
# Task settings
annotation_task_name: "My Annotation Task"
port: 8000
 
# Data configuration
data_files:
  - data.json
 
item_properties:
  id_key: id
  text_key: text
 
# Output settings
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
 
# Annotation schemes
annotation_schemes:
  - annotation_type: radio
    name: my_annotation
    labels:
      - Label 1
      - Label 2
 
# User settings
allow_all_users: true

Essential Settings

Task and Server Configuration

yaml
annotation_task_name: "My Task"  # Display name for your task
port: 8000                       # Port to run the server on

Data Configuration

yaml
data_files:
  - data.json           # Path to your data file(s)
  - more_data.json      # You can specify multiple files
 
item_properties:
  id_key: id            # Field containing unique ID
  text_key: text        # Field containing text to annotate

Supported data formats:

  • JSON (.json)
  • JSON Lines (.jsonl)
  • CSV (.csv)
  • TSV (.tsv)
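
As a rough illustration, the Python sketch below writes the same records as JSON Lines and as CSV, using the id and text field names from the item_properties example above. It is not part of Potato: the sample sentences and helper names are made up, and the exact record layout Potato expects for each format is an assumption here.

```python
import csv
import json

# Hypothetical sample records; "id" and "text" match id_key and text_key above.
records = [
    {"id": "item_1", "text": "The battery lasts all day."},
    {"id": "item_2", "text": "The screen cracked within a week."},
]


def write_jsonl(path, rows):
    """Write one JSON object per line (JSON Lines)."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


def write_csv(path, rows):
    """Write rows as CSV with a header row naming the id and text columns."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "text"])
        writer.writeheader()
        writer.writerows(rows)


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def read_csv(path):
    """Read a CSV file back into a list of dicts keyed by the header row."""
    with open(path, encoding="utf-8", newline="") as f:
        return [dict(row) for row in csv.DictReader(f)]
```

Calling write_jsonl("data.jsonl", records) or write_csv("data.csv", records) produces a file that could then be listed under data_files. A TSV variant would pass delimiter="\t" to the csv writer and reader.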

Output Configuration

yaml
output_annotation_dir: "annotation_output/"   # Directory for annotation files
output_annotation_format: "json"              # Format: json, jsonl, csv, tsv

Annotation Schemes

Define one or more annotation schemes:

yaml
annotation_schemes:
  - annotation_type: radio      # Type of annotation
    name: sentiment             # Internal name
    description: "Select the sentiment"  # Instructions
    labels:                     # Options for annotators
      - Positive
      - Negative
      - Neutral

Available Annotation Types

Type          Description
radio         Single choice selection
multiselect   Multiple choice selection
likert        Rating on a scale
text          Free text input
number        Numeric input
span          Text span highlighting
slider        Continuous range selection
multirate     Rate multiple items
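
A quick pre-flight check can catch a misspelled annotation type before the server starts. The sketch below assumes the configuration has already been parsed into a Python dict (for example by a YAML parser). The find_unknown_types helper is hypothetical, not a Potato function, and the set of types is simply copied from the table above.

```python
# Annotation types listed in the table above.
KNOWN_TYPES = {
    "radio", "multiselect", "likert", "text",
    "number", "span", "slider", "multirate",
}


def find_unknown_types(config):
    """Return (name, annotation_type) pairs whose type is not in KNOWN_TYPES."""
    problems = []
    for scheme in config.get("annotation_schemes", []):
        kind = scheme.get("annotation_type")
        if kind not in KNOWN_TYPES:
            problems.append((scheme.get("name"), kind))
    return problems
```

An empty result means every scheme uses one of the listed types. It says nothing about whether the other keys in each scheme are valid.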

User Configuration

Allow all users

yaml
allow_all_users: true

Restrict to specific users

yaml
allow_all_users: false
authorized_users:
  - user1@example.com
  - user2@example.com
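
The two settings can be read as a simple allow-list rule. The sketch below restates that rule in Python on an already-parsed config dict. It illustrates the intended behavior and is not Potato's actual implementation: is_authorized is a hypothetical helper, and treating a missing allow_all_users as false is an assumption.

```python
def is_authorized(config, username):
    """Return True if username may annotate under the given config dict."""
    # Assumption: a missing allow_all_users key is treated as false.
    if config.get("allow_all_users", False):
        return True
    return username in config.get("authorized_users", [])
```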

Task Directory

The task_dir setting defines the root directory for relative paths:

yaml
task_dir: ./my-task/
data_files:
  - data/input.json    # Resolves to ./my-task/data/input.json
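
The resolution rule in the comment above can be sketched with pathlib. This is an illustration on an already-parsed config dict, not Potato's own code: resolve_data_files is a hypothetical helper, and both the "." default for a missing task_dir and the pass-through of absolute paths are assumptions.

```python
from pathlib import Path


def resolve_data_files(config):
    """Join each relative data_files entry onto task_dir."""
    # Assumption: a missing task_dir means the current directory.
    task_dir = Path(config.get("task_dir", "."))
    resolved = []
    for entry in config.get("data_files", []):
        path = Path(entry)
        # Assumption: absolute paths are used as given.
        resolved.append(path if path.is_absolute() else task_dir / path)
    return resolved
```

With task_dir set to ./my-task/ and a data file of data/input.json, the helper returns a single path equal to my-task/data/input.json, matching the comment in the YAML example.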

Full Example

Here's a complete configuration for a sentiment analysis task:

yaml
# config.yaml
annotation_task_name: "Sentiment Analysis"
port: 8000
task_dir: ./
 
# Data
data_files:
  - data/tweets.json
 
item_properties:
  id_key: id
  text_key: text
  context_key: metadata
 
# Output
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
 
# Annotation
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment expressed in this tweet?"
    labels:
      - name: Positive
        key_value: "1"
      - name: Negative
        key_value: "2"
      - name: Neutral
        key_value: "3"
    sequential_key_binding: true
 
# Users
allow_all_users: true
 
# Assignment
instances_per_annotator: 100
annotation_per_instance: 2
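
The two assignment settings imply a lower bound on how many annotators a task needs. The arithmetic below rests on two assumptions: instances_per_annotator is read as the maximum number of items one annotator receives, and annotation_per_instance as the number of distinct annotators who label each item. The min_annotators helper and the dataset size in the usage note are hypothetical.

```python
import math


def min_annotators(num_items, annotation_per_instance, instances_per_annotator):
    """Lower bound on annotators needed to cover every item the required number of times."""
    total_annotations = num_items * annotation_per_instance
    by_workload = math.ceil(total_annotations / instances_per_annotator)
    # Assumption: the same annotator never labels one item twice, so at least
    # annotation_per_instance distinct annotators are needed.
    return max(by_workload, annotation_per_instance)
```

For a hypothetical dataset of 1000 tweets with the settings above, min_annotators(1000, 2, 100) works out to 2000 annotations spread over at most 100 items each, so 20 annotators.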

Next Steps