Multi-Phase Workflows

Build complex annotation workflows with surveys, training, and branching logic.

Potato 2.0 supports structured annotation workflows with multiple sequential phases including consent, pre-study surveys, instructions, training, annotation, and post-study feedback.

Available Phases

| Phase | Description |
|-------|-------------|
| consent | Informed consent collection |
| prestudy | Pre-annotation surveys (demographics, screening) |
| instructions | Task guidelines and information |
| training | Practice questions with feedback |
| annotation | Main annotation task (always required) |
| poststudy | Post-annotation surveys and feedback |

Basic Configuration

Use the phases section in your configuration:

phases:
  consent:
    enabled: true
    data_file: "data/consent.json"
 
  prestudy:
    enabled: true
    data_file: "data/demographics.json"
 
  instructions:
    enabled: true
    content: "data/instructions.html"
 
  training:
    enabled: true
    data_file: "data/training.json"
    schema_name: sentiment
    passing_criteria:
      min_correct: 8
 
  # annotation phase is always enabled
 
  poststudy:
    enabled: true
    data_file: "data/feedback.json"
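
The ordering rule can be illustrated with a short Python sketch. It is not Potato's implementation: it only mirrors the phase order listed above, and it assumes that a phase without `enabled: true` is skipped. The names `PHASE_ORDER` and `active_phases` are hypothetical.

```python
# Sketch only: mirrors the documented phase order, not Potato's internals.
PHASE_ORDER = ["consent", "prestudy", "instructions", "training", "annotation", "poststudy"]


def active_phases(phases_config):
    """Return the phases a participant would see, in documented order."""
    active = []
    for name in PHASE_ORDER:
        if name == "annotation":
            # The annotation phase is always required.
            active.append(name)
        elif phases_config.get(name, {}).get("enabled", False):
            # Assumption: a phase that is absent or not enabled is skipped.
            active.append(name)
    return active


example = {
    "consent": {"enabled": True},
    "training": {"enabled": False},
    "poststudy": {"enabled": True},
}
print(active_phases(example))  # ['consent', 'annotation', 'poststudy']
```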

Survey Question Types

Survey phases support these question types:

Radio (Single Choice)

{
  "name": "experience",
  "type": "radio",
  "description": "How much annotation experience do you have?",
  "labels": ["None", "Some (< 10 hours)", "Moderate", "Extensive"],
  "required": true
}

Checkbox/Multiselect

{
  "name": "languages",
  "type": "checkbox",
  "description": "What languages do you speak fluently?",
  "labels": ["English", "Spanish", "French", "German", "Chinese", "Other"]
}

Text Input

{
  "name": "occupation",
  "type": "text",
  "description": "What is your occupation?",
  "required": true
}

Number Input

{
  "name": "years_experience",
  "type": "number",
  "description": "Years of professional experience",
  "min": 0,
  "max": 50
}

Likert Scale

{
  "name": "familiarity",
  "type": "likert",
  "description": "How familiar are you with this topic?",
  "size": 5,
  "min_label": "Not familiar",
  "max_label": "Very familiar"
}

Select (Dropdown)

{
  "name": "country",
  "type": "select",
  "description": "Select your country",
  "labels": ["USA", "Canada", "UK", "Germany", "France", "Other"]
}
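
Before launching a study, it can help to sanity-check survey files against the question shapes shown above. The following Python sketch is one way to do that. The rules are inferred from the examples in this section and are assumptions, not Potato's own validation logic; `check_question` is a hypothetical helper.

```python
# Sketch only: rules inferred from the examples above, not Potato's validator.
KNOWN_TYPES = {"radio", "checkbox", "text", "number", "likert", "select"}
NEEDS_LABELS = {"radio", "checkbox", "select"}


def check_question(question):
    """Return a list of problems found in one survey question dict."""
    problems = []
    for key in ("name", "type", "description"):
        if key not in question:
            problems.append(f"missing '{key}'")
    qtype = question.get("type")
    if qtype is not None and qtype not in KNOWN_TYPES:
        problems.append(f"unknown type '{qtype}'")
    # Choice questions need options, given inline or through a template.
    if qtype in NEEDS_LABELS and not (question.get("labels") or question.get("template")):
        problems.append("choice question has no 'labels' or 'template'")
    if qtype == "likert" and "size" not in question:
        problems.append("likert question has no 'size'")
    if qtype == "number" and "min" in question and "max" in question:
        if question["min"] > question["max"]:
            problems.append("'min' is greater than 'max'")
    return problems


print(check_question({"name": "x", "type": "radio", "description": "Pick one"}))
```

An empty list means the question passed every check in this sketch.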

Consent Phase

Collect informed consent before starting:

phases:
  consent:
    enabled: true
    data_file: "data/consent.json"

consent.json:

[
  {
    "name": "consent_agreement",
    "type": "radio",
    "description": "I have read and understood the research consent form and agree to participate.",
    "labels": ["I agree", "I do not agree"],
    "right_label": "I agree",
    "required": true
  }
]

The right_label field names the answer a participant must select in order to proceed.
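
The gating rule can be sketched in a few lines of Python. This is an illustration of the behavior described above, not Potato's implementation. The `answers` mapping (question name to selected label) and the `can_proceed` function are hypothetical.

```python
# Sketch only: illustrates the right_label rule, not Potato's implementation.
def can_proceed(questions, answers):
    """True if every question with a right_label received exactly that answer."""
    for question in questions:
        expected = question.get("right_label")
        if expected is not None and answers.get(question["name"]) != expected:
            return False
    return True


consent = [{
    "name": "consent_agreement",
    "type": "radio",
    "labels": ["I agree", "I do not agree"],
    "right_label": "I agree",
    "required": True,
}]
print(can_proceed(consent, {"consent_agreement": "I agree"}))         # True
print(can_proceed(consent, {"consent_agreement": "I do not agree"}))  # False
```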

Pre-Study Surveys

Collect demographics or screening questions:

phases:
  prestudy:
    enabled: true
    data_file: "data/demographics.json"

demographics.json:

[
  {
    "name": "age_range",
    "type": "radio",
    "description": "What is your age range?",
    "labels": ["18-24", "25-34", "35-44", "45-54", "55+"],
    "required": true
  },
  {
    "name": "education",
    "type": "radio",
    "description": "Highest level of education completed",
    "labels": ["High school", "Bachelor's degree", "Master's degree", "Doctoral degree", "Other"],
    "required": true
  },
  {
    "name": "english_native",
    "type": "radio",
    "description": "Is English your native language?",
    "labels": ["Yes", "No"],
    "required": true
  }
]
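
As a rough illustration of what `required` implies, the Python sketch below lists required questions that were left unanswered. It assumes that `required` defaults to false when omitted and that answers arrive as a name-to-value mapping. Both are assumptions, and `missing_required` is a hypothetical helper rather than part of Potato.

```python
# Sketch only: assumes "required" defaults to false and answers are a dict.
def missing_required(questions, answers):
    """Return names of required questions with no non-empty answer."""
    missing = []
    for question in questions:
        if not question.get("required", False):
            continue
        value = answers.get(question["name"])
        if value is None or value == "" or value == []:
            missing.append(question["name"])
    return missing


demographics = [
    {"name": "age_range", "type": "radio", "required": True},
    {"name": "education", "type": "radio", "required": True},
    {"name": "english_native", "type": "radio", "required": True},
]
print(missing_required(demographics, {"age_range": "25-34", "education": ""}))
# ['education', 'english_native']
```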

Instructions Phase

Display task instructions:

phases:
  instructions:
    enabled: true
    content: "data/instructions.html"

Or use inline content:

phases:
  instructions:
    enabled: true
    inline_content: |
      <h2>Task Instructions</h2>
      <p>In this task, you will classify the sentiment of product reviews.</p>
      <ul>
        <li><strong>Positive:</strong> Expresses satisfaction or praise</li>
        <li><strong>Negative:</strong> Expresses dissatisfaction or criticism</li>
        <li><strong>Neutral:</strong> Factual or mixed sentiment</li>
      </ul>

Training Phase

Add practice questions with feedback (see Training Phase for details):

phases:
  training:
    enabled: true
    data_file: "data/training.json"
    schema_name: sentiment
    passing_criteria:
      min_correct: 8
      total_questions: 10
    show_explanations: true
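
How a passing criterion could be evaluated is sketched below in Python. This is an assumption about the semantics, not Potato's code: `min_correct` is read as "at least this many correct answers", `require_all_correct` as "no mistakes allowed", and `total_questions` as descriptive only. The function `passes_training` is hypothetical.

```python
# Sketch only: one plausible reading of passing_criteria, not Potato's logic.
def passes_training(results, criteria):
    """results: list of booleans, True where the practice answer was correct."""
    if criteria.get("require_all_correct", False):
        return all(results)
    # total_questions is treated as informational here (an assumption).
    return sum(results) >= criteria.get("min_correct", 0)


criteria = {"min_correct": 8, "total_questions": 10}
attempt = [True] * 8 + [False] * 2
print(passes_training(attempt, criteria))  # True
```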

Post-Study Surveys

Collect feedback after annotation:

phases:
  poststudy:
    enabled: true
    data_file: "data/feedback.json"

feedback.json:

[
  {
    "name": "difficulty",
    "type": "likert",
    "description": "How difficult was this task?",
    "size": 5,
    "min_label": "Very easy",
    "max_label": "Very difficult"
  },
  {
    "name": "clarity",
    "type": "likert",
    "description": "How clear were the instructions?",
    "size": 5,
    "min_label": "Very unclear",
    "max_label": "Very clear"
  },
  {
    "name": "suggestions",
    "type": "text",
    "description": "Any suggestions for improvement?",
    "textarea": true,
    "required": false
  }
]

Built-in Templates

Potato includes predefined label sets for common survey questions:

| Template | Labels |
|----------|--------|
| countries | List of countries |
| languages | Common languages |
| ethnicity | Ethnicity options |
| religion | Religion options |

Use templates in your questions:

{
  "name": "country",
  "type": "select",
  "description": "Select your country",
  "template": "countries"
}

Free Response Fields

Add optional text input alongside structured questions:

{
  "name": "topics",
  "type": "checkbox",
  "description": "Which topics interest you?",
  "labels": ["Technology", "Sports", "Politics", "Entertainment"],
  "free_response": true,
  "free_response_label": "Other (please specify)"
}

Page Headers

Customize survey section headers:

{
  "page_header": "Demographics Survey",
  "questions": [
    {"name": "age", "type": "radio", ...},
    {"name": "gender", "type": "radio", ...}
  ]
}

Complete Example

task_name: "Sentiment Analysis Study"
task_dir: "."
port: 8000
 
# Data configuration
data_files:
  - "data/reviews.json"
 
item_properties:
  id_key: id
  text_key: text
 
# Annotation scheme
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment of this review?"
    labels:
      - Positive
      - Negative
      - Neutral
    sequential_key_binding: true
 
# Multi-phase workflow
phases:
  consent:
    enabled: true
    data_file: "data/consent.json"
 
  prestudy:
    enabled: true
    data_file: "data/demographics.json"
 
  instructions:
    enabled: true
    content: "data/instructions.html"
 
  training:
    enabled: true
    data_file: "data/training.json"
    schema_name: sentiment
    passing_criteria:
      min_correct: 8
      total_questions: 10
    retries:
      enabled: true
      max_retries: 2
    show_explanations: true
 
  # annotation phase is always enabled
 
  poststudy:
    enabled: true
    data_file: "data/feedback.json"
 
# Output
output_annotation_dir: "output/"
output_annotation_format: "json"
 
# User access
allow_all_users: true
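
A configuration like this references several files, and a missing one is easy to overlook. The Python sketch below collects the referenced paths and reports the ones that do not exist. It operates on an already parsed configuration dict (for example, one loaded with PyYAML's `yaml.safe_load`, if installed) and assumes paths are relative to the task directory. `referenced_files` and `missing_files` are hypothetical helpers, not part of Potato.

```python
from pathlib import Path


# Sketch only: a pre-flight check, independent of Potato itself.
def referenced_files(config):
    """List file paths named by data_files and by each enabled phase."""
    paths = list(config.get("data_files", []))
    for phase in config.get("phases", {}).values():
        if not phase.get("enabled", False):
            continue
        for key in ("data_file", "content"):
            if key in phase:
                paths.append(phase[key])
    return paths


def missing_files(config, base_dir="."):
    """Return the referenced paths that do not exist under base_dir."""
    return [p for p in referenced_files(config) if not (Path(base_dir) / p).is_file()]


config = {
    "data_files": ["data/reviews.json"],
    "phases": {
        "consent": {"enabled": True, "data_file": "data/consent.json"},
        "instructions": {"enabled": True, "content": "data/instructions.html"},
    },
}
print(missing_files(config))  # lists whichever of these files are absent
```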

Legacy Configuration

The older surveyflow configuration format is still supported for backward compatibility:

surveyflow:
  enabled: true
  phases:
    - name: pre_survey
      type: survey
      questions: survey_questions.json
    - name: main_annotation
      type: annotation

However, we recommend using the new phases format for new projects.

Best Practices

1. Keep Surveys Concise

Long surveys reduce completion rates. Focus on essential questions only.

2. Use Training for Complex Tasks

Training phases improve annotation quality, especially for nuanced tasks.

3. Set Reasonable Passing Criteria

# Too strict - may exclude good annotators
passing_criteria:
  require_all_correct: true
 
# Better - allows for learning
passing_criteria:
  min_correct: 8
  total_questions: 10
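
To see why the stricter setting can exclude good annotators, consider a hypothetical annotator who answers each practice question correctly 90% of the time. The Python sketch below computes the chance of passing under each criterion, assuming 10 independent questions. The accuracy figure and the independence assumption are illustrative, not measured.

```python
from math import comb


def pass_probability(accuracy, total, min_correct):
    """P(at least min_correct of total independent answers are correct)."""
    return sum(
        comb(total, k) * accuracy**k * (1 - accuracy) ** (total - k)
        for k in range(min_correct, total + 1)
    )


print(round(pass_probability(0.9, 10, 10), 3))  # all 10 correct: 0.349
print(round(pass_probability(0.9, 10, 8), 3))   # at least 8 of 10: 0.93
```

Under these assumptions, roughly two out of three such annotators would fail an all-correct requirement, while about 93% would pass with a threshold of 8 out of 10.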

4. Provide Clear Instructions

Include examples in your instructions phase to clarify expectations.

5. Test the Full Flow

Complete the entire workflow yourself before deployment to catch issues.

6. Use Required Fields Wisely

Mark a question as required only if it is essential. Optional questions tend to get higher-quality responses.

Crowdsourcing Integration

For Prolific or MTurk, configure completion codes:

phases:
  poststudy:
    enabled: true
    data_file: "data/feedback.json"
    show_completion_code: true
    completion_code_format: "POTATO-{user_id}-{timestamp}"
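
The placeholders in the format string look like Python `str.format` fields. The sketch below shows how such a template would expand under that assumption. Whether Potato substitutes values this way, and what its timestamp looks like, are not confirmed here. `make_completion_code` and the sample values are hypothetical.

```python
# Sketch only: assumes str.format-style substitution of the placeholders.
def make_completion_code(fmt, user_id, timestamp):
    """Fill the {user_id} and {timestamp} placeholders in a format string."""
    return fmt.format(user_id=user_id, timestamp=timestamp)


code = make_completion_code("POTATO-{user_id}-{timestamp}", "worker42", 1700000000)
print(code)  # POTATO-worker42-1700000000
```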

See Crowdsourcing for more details.