Configuration Basics
Learn Potato's YAML configuration format — task settings, data file paths, annotation schemes, output formats, and user management essentials.
Potato uses YAML configuration files to define annotation tasks. This guide covers the essential configuration options.
Configuration File Structure
A basic Potato configuration has these main sections:
yaml
# Task settings
annotation_task_name: "My Annotation Task"
port: 8000
# Data configuration
data_files:
  - data.json
item_properties:
  id_key: id
  text_key: text
# Output settings
output_annotation_dir: "annotation_output/"
export_annotation_format: "json"
# Annotation schemes
annotation_schemes:
  - annotation_type: radio
    name: my_annotation
    labels:
      - Label 1
      - Label 2
# User settings
user_config:
  allow_all_users: true

Essential Settings
Task and Server Configuration
yaml
annotation_task_name: "My Task" # Display name for your task
port: 8000 # Port to run the server on

Data Configuration
yaml
data_files:
  - data.json # Path to your data file(s)
  - more_data.json # You can specify multiple files
item_properties:
  id_key: id # Field containing unique ID
  text_key: text # Field containing text to annotate

Supported data formats:
- JSON (`.json`)
- JSON Lines (`.jsonl`)
- CSV (`.csv`)
- TSV (`.tsv`)
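As a pre-flight check before starting the server, the sketch below loads a data file in any of the formats above and confirms that every record carries the fields named by `id_key` and `text_key`, with no duplicate IDs. It is an illustration only and not part of Potato: `load_items` is a hypothetical helper, and because this guide does not say whether a `.json` file holds one JSON array or one object per line, the sketch accepts either layout.

```python
import csv
import io
import json
from pathlib import Path


def load_items(path, id_key="id", text_key="text"):
    """Load records from a .json, .jsonl, .csv, or .tsv file and check their keys.

    Hypothetical helper for illustration; Potato's own loader may differ.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    raw = path.read_text(encoding="utf-8")

    if suffix in (".json", ".jsonl"):
        try:
            # First try the whole file as a single JSON document.
            parsed = json.loads(raw)
            records = parsed if isinstance(parsed, list) else [parsed]
        except json.JSONDecodeError:
            # Otherwise assume one JSON object per non-empty line.
            records = [json.loads(line) for line in raw.splitlines() if line.strip()]
    elif suffix in (".csv", ".tsv"):
        delimiter = "," if suffix == ".csv" else "\t"
        records = list(csv.DictReader(io.StringIO(raw), delimiter=delimiter))
    else:
        raise ValueError(f"Unsupported data format: {suffix}")

    seen_ids = set()
    for n, record in enumerate(records, start=1):
        for key in (id_key, text_key):
            if key not in record:
                raise ValueError(f"{path.name}, record {n}: missing field '{key}'")
        if record[id_key] in seen_ids:
            raise ValueError(f"{path.name}: duplicate ID {record[id_key]!r}")
        seen_ids.add(record[id_key])
    return records
```

With the configuration shown earlier, calling `load_items("data.json", id_key="id", text_key="text")` either returns the parsed records or raises a `ValueError` that names the first problem it finds.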
Output Configuration
yaml
output_annotation_dir: "annotation_output/" # Directory for annotation files
export_annotation_format: "json" # Format: json, jsonl, csv, tsv

Annotation Schemes
Define one or more annotation schemes:
yaml
annotation_schemes:
  - annotation_type: radio # Type of annotation
    name: sentiment # Internal name
    description: "Select the sentiment" # Instructions
    labels: # Options for annotators
      - Positive
      - Negative
      - Neutral

Available Annotation Types
| Type | Description |
|---|---|
| `radio` | Single choice selection |
| `multiselect` | Multiple choice selection |
| `likert` | Rating on a scale |
| `text` | Free text input |
| `number` | Numeric input |
| `span` | Text span highlighting |
| `slider` | Continuous range selection |
| `multirate` | Rate multiple items |
| `select` | Dropdown single selection |
| `pairwise` | Pairwise comparison |
| `best_worst` | Best-worst scaling |
| `soft_label` | Soft label distribution |
| `confidence_annotation` | Annotation with confidence |
| `constant_sum` | Constant sum allocation |
| `range_slider` | Range slider selection |
| `semantic_differential` | Semantic differential scale |
| `hierarchical_multiselect` | Hierarchical multi-level selection |
| `card_sort` | Card sorting |
| `rubric_eval` | Rubric-based evaluation |
| `extractive_qa` | Extractive question answering |
| `error_span` | Error span highlighting |
| `triage` | Triage classification |
| `coreference` | Coreference annotation |
| `span_link` | Span linking |
| `entity_linking` | Entity linking |
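With this many options, a typo in `annotation_type` is easy to make. The sketch below checks an already-parsed `annotation_schemes` list (for example, the output of a YAML loader) against the type names in the table above and flags repeated `name` values. `check_schemes` is a hypothetical helper, not a Potato API, and the uniqueness requirement on `name` is an assumption that this guide does not state explicitly.

```python
# Type names copied from the table above.
KNOWN_TYPES = {
    "radio", "multiselect", "likert", "text", "number", "span", "slider",
    "multirate", "select", "pairwise", "best_worst", "soft_label",
    "confidence_annotation", "constant_sum", "range_slider",
    "semantic_differential", "hierarchical_multiselect", "card_sort",
    "rubric_eval", "extractive_qa", "error_span", "triage", "coreference",
    "span_link", "entity_linking",
}


def check_schemes(schemes):
    """Return a list of problem descriptions; an empty list means none were found."""
    problems = []
    seen_names = set()
    for i, scheme in enumerate(schemes, start=1):
        kind = scheme.get("annotation_type")
        name = scheme.get("name")
        if kind not in KNOWN_TYPES:
            problems.append(f"scheme {i}: unknown annotation_type {kind!r}")
        if not name:
            problems.append(f"scheme {i}: missing name")
        elif name in seen_names:
            # Assumption: each scheme needs its own internal name.
            problems.append(f"scheme {i}: duplicate name {name!r}")
        else:
            seen_names.add(name)
    return problems


schemes = [
    {"annotation_type": "radio", "name": "sentiment",
     "labels": ["Positive", "Negative", "Neutral"]},
    # Misspelled type name and a repeated internal name.
    {"annotation_type": "multi_select", "name": "sentiment"},
]
for problem in check_schemes(schemes):
    print(problem)
```

Run as written, this reports two problems for the second scheme: the unknown type `'multi_select'` and the duplicate name `'sentiment'`.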
User Configuration
Allow all users
yaml
user_config:
  allow_all_users: true

Restrict to specific users
yaml
user_config:
  allow_all_users: false
  authorized_users:
    - user1@example.com
    - user2@example.com

Task Directory
The `task_dir` setting defines the root directory against which relative paths are resolved:
yaml
task_dir: ./my-task/
data_files:
  - data/input.json # Resolves to ./my-task/data/input.json

Full Example
Here's a complete configuration for a sentiment analysis task:
yaml
# config.yaml
annotation_task_name: "Sentiment Analysis"
port: 8000
task_dir: ./
# Data
data_files:
  - data/tweets.json
item_properties:
  id_key: id
  text_key: text
  context_key: metadata
# Output
output_annotation_dir: "annotation_output/"
export_annotation_format: "json"
# Annotation
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment expressed in this tweet?"
    labels:
      - name: Positive
        key_value: "1"
      - name: Negative
        key_value: "2"
      - name: Neutral
        key_value: "3"
    sequential_key_binding: true
# Users
user_config:
  allow_all_users: true
# Assignment
instances_per_annotator: 100
annotation_per_instance: 2

Next Steps
- Explore Data Formats in detail
- Learn about Annotation Schemes
- Customize the interface with UI Configuration
- Use the Preview CLI to validate your configuration