Configuration Basics

Learn the fundamentals of Potato configuration files.

Potato uses YAML configuration files to define annotation tasks. This guide covers the essential configuration options.

Configuration File Structure

A basic Potato configuration has these main sections:

yaml

# Task settings
annotation_task_name: "My Annotation Task"
port: 8000
 
# Data configuration
data_files:
  - data.json
 
item_properties:
  id_key: id
  text_key: text
 
# Output settings
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
 
# Annotation schemes
annotation_schemes:
  - annotation_type: radio
    name: my_annotation
    labels:
      - Label 1
      - Label 2
 
# User settings
allow_all_users: true

Essential Settings

Task and Server Configuration

yaml

annotation_task_name: "My Task"  # Display name for your task
port: 8000                       # Port to run the server on

Data Configuration

yaml

data_files:
  - data.json           # Path to your data file(s)
  - more_data.json      # You can specify multiple files
 
item_properties:
  id_key: id            # Field containing unique ID
  text_key: text        # Field containing text to annotate

Supported data formats:

JSON (.json)
JSON Lines (.jsonl)
CSV (.csv)
TSV (.tsv)
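
As a rough illustration of what an input file might look like, the Python sketch below writes a small JSON Lines file whose records carry the `id` and `text` fields named by `id_key` and `text_key` above. The file name `data.jsonl`, the sample sentences, and the helper functions are assumptions made for this example and are not part of Potato.

```python
import json
from pathlib import Path

# Hypothetical sample records; the field names match id_key and text_key above.
SAMPLE_ITEMS = [
    {"id": "item-1", "text": "The battery lasts all day."},
    {"id": "item-2", "text": "The screen cracked within a week."},
]


def write_jsonl(items, path):
    """Write one JSON object per line and return the path written."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as handle:
        for item in items:
            handle.write(json.dumps(item, ensure_ascii=False) + "\n")
    return path


def read_jsonl(path):
    """Read the file back as a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


if __name__ == "__main__":
    written = write_jsonl(SAMPLE_ITEMS, "data.jsonl")
    print(f"Wrote {len(read_jsonl(written))} items to {written}")
```

Reading the file back before pointing `data_files` at it is a cheap way to confirm that every record has both fields.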

Output Configuration

yaml

output_annotation_dir: "annotation_output/"   # Directory for annotation files
output_annotation_format: "json"              # Format: json, jsonl, csv, tsv

Annotation Schemes

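Define one or more annotation schemes. Each entry in the list sets the annotation type, an internal name, the instructions shown to annotators, and the available labels: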

yaml

annotation_schemes:
  - annotation_type: radio      # Type of annotation
    name: sentiment             # Internal name
    description: "Select the sentiment"  # Instructions
    labels:                     # Options for annotators
      - Positive
      - Negative
      - Neutral

Available Annotation Types

Type	Description
`radio`	Single choice selection
`multiselect`	Multiple choice selection
`likert`	Rating on a scale
`text`	Free text input
`number`	Numeric input
`span`	Text span highlighting
`slider`	Continuous range selection
`multirate`	Rate multiple items

User Configuration

Allow all users

yaml

allow_all_users: true

Restrict to specific users

yaml

allow_all_users: false
authorized_users:
  - user1@example.com
  - user2@example.com
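
The intended effect of these two settings can be sketched in Python as below. This is an illustration of the logic only, not Potato's actual access check: the function `is_authorized` and the treatment of a missing `allow_all_users` key as false are assumptions.

```python
def is_authorized(config, username):
    """Return True if `username` may annotate under the given settings."""
    # Assumption: a missing allow_all_users key is treated as false here.
    if config.get("allow_all_users", False):
        return True
    # An absent or empty authorized_users list admits nobody.
    return username in (config.get("authorized_users") or [])


open_config = {"allow_all_users": True}
restricted_config = {
    "allow_all_users": False,
    "authorized_users": ["user1@example.com", "user2@example.com"],
}

if __name__ == "__main__":
    print(is_authorized(open_config, "anyone@example.com"))
    print(is_authorized(restricted_config, "user1@example.com"))
    print(is_authorized(restricted_config, "stranger@example.com"))
```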

Task Directory

The task_dir setting defines the root directory against which relative paths in the configuration, such as the entries in data_files, are resolved:

yaml

task_dir: ./my-task/
data_files:
  - data/input.json    # Resolves to ./my-task/data/input.json
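
The resolution described in the comment above can be mimicked with Python's `pathlib`. The helper `resolve_data_path` is hypothetical, and leaving absolute paths untouched is an assumption rather than documented Potato behavior.

```python
from pathlib import Path


def resolve_data_path(task_dir, data_file):
    """Join a relative data file onto task_dir; leave absolute paths alone."""
    candidate = Path(data_file)
    if candidate.is_absolute():
        # Assumption: absolute paths bypass task_dir entirely.
        return candidate
    return Path(task_dir) / candidate


if __name__ == "__main__":
    # Same values as the YAML example above.
    print(resolve_data_path("./my-task/", "data/input.json"))
```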

Full Example

Here's a complete configuration for a sentiment analysis task:

yaml

# config.yaml
annotation_task_name: "Sentiment Analysis"
port: 8000
task_dir: ./
 
# Data
data_files:
  - data/tweets.json
 
item_properties:
  id_key: id
  text_key: text
  context_key: metadata
 
# Output
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
 
# Annotation
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment expressed in this tweet?"
    labels:
      - name: Positive
        key_value: "1"
      - name: Negative
        key_value: "2"
      - name: Neutral
        key_value: "3"
    sequential_key_binding: true
 
# Users
allow_all_users: true
 
# Assignment
instances_per_annotator: 100
annotation_per_instance: 2
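
For planning purposes, the two assignment values suggest a quick estimate of how many annotators a dataset needs. The sketch below assumes that `instances_per_annotator` caps the items one annotator receives and that `annotation_per_instance` is the number of different annotators who should label each item. Neither meaning is spelled out in this guide, and the dataset sizes are hypothetical.

```python
import math


def min_annotators(num_items, annotation_per_instance, instances_per_annotator):
    """Estimate the fewest annotators needed to cover every item."""
    total_annotations = num_items * annotation_per_instance
    by_workload = math.ceil(total_annotations / instances_per_annotator)
    # Assumption: one annotator never labels the same item twice, so at
    # least annotation_per_instance distinct annotators are required.
    return max(by_workload, annotation_per_instance)


if __name__ == "__main__":
    # Hypothetical dataset of 1000 tweets with the settings shown above.
    print(min_annotators(1000, annotation_per_instance=2, instances_per_annotator=100))
```

With 1000 items, 2 annotations per item, and 100 items per annotator, the estimate is 20 annotators.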

Next Steps

Explore Data Formats in detail
Learn about Annotation Schemes
Customize the interface with UI Configuration
Use the Preview CLI to validate your configuration