Multi-Phase Workflows

Build complex annotation workflows with surveys, training, and branching logic.

Potato 2.0 supports structured annotation workflows with multiple sequential phases including consent, pre-study surveys, instructions, training, annotation, and post-study feedback.

Available Phases

Phase	Description
`consent`	Informed consent collection
`prestudy`	Pre-annotation surveys (demographics, screening)
`instructions`	Task guidelines and information
`training`	Practice questions with feedback
`annotation`	Main annotation task (always required)
`poststudy`	Post-annotation surveys and feedback

Basic Configuration

Use the `phases` section in your configuration:

yaml

phases:
  consent:
    enabled: true
    data_file: "data/consent.json"
 
  prestudy:
    enabled: true
    data_file: "data/demographics.json"
 
  instructions:
    enabled: true
    content: "data/instructions.html"
 
  training:
    enabled: true
    data_file: "data/training.json"
    schema_name: sentiment
    passing_criteria:
      min_correct: 8
 
  # annotation phase is always enabled
 
  poststudy:
    enabled: true
    data_file: "data/feedback.json"
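To reason about which phases a participant will actually see, it can help to walk the parsed `phases` mapping in the order given in the Available Phases table. The following is a minimal sketch, not part of Potato itself: `PHASE_ORDER` and `enabled_phases` are hypothetical names, and it assumes that a phase missing from the configuration (or lacking `enabled`) is skipped.

```python
# Phase order taken from the "Available Phases" table.
PHASE_ORDER = ["consent", "prestudy", "instructions", "training", "annotation", "poststudy"]


def enabled_phases(phases_config):
    """Return the phases a participant would pass through, in order.

    `phases_config` is the parsed `phases` mapping from the configuration.
    The annotation phase is always included, even when it is not listed.
    """
    active = []
    for name in PHASE_ORDER:
        if name == "annotation":
            active.append(name)
            continue
        settings = phases_config.get(name) or {}
        # Assumption: an absent phase, or one without `enabled`, is skipped.
        if settings.get("enabled", False):
            active.append(name)
    return active


config = {
    "consent": {"enabled": True, "data_file": "data/consent.json"},
    "prestudy": {"enabled": False, "data_file": "data/demographics.json"},
    "training": {"enabled": True, "data_file": "data/training.json"},
}
print(enabled_phases(config))  # prints ['consent', 'training', 'annotation']
```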

Survey Question Types

Survey phases support these question types:

Radio (Single Choice)

json

{
  "name": "experience",
  "type": "radio",
  "description": "How much annotation experience do you have?",
  "labels": ["None", "Some (< 10 hours)", "Moderate", "Extensive"],
  "required": true
}

Checkbox/Multiselect

json

{
  "name": "languages",
  "type": "checkbox",
  "description": "What languages do you speak fluently?",
  "labels": ["English", "Spanish", "French", "German", "Chinese", "Other"]
}

Text Input

json

{
  "name": "occupation",
  "type": "text",
  "description": "What is your occupation?",
  "required": true
}

Number Input

json

{
  "name": "years_experience",
  "type": "number",
  "description": "Years of professional experience",
  "min": 0,
  "max": 50
}

Likert Scale

json

{
  "name": "familiarity",
  "type": "likert",
  "description": "How familiar are you with this topic?",
  "size": 5,
  "min_label": "Not familiar",
  "max_label": "Very familiar"
}

Dropdown Select

json

{
  "name": "country",
  "type": "select",
  "description": "Select your country",
  "labels": ["USA", "Canada", "UK", "Germany", "France", "Other"]
}
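Because survey questions are plain JSON, mistakes such as a misspelled type or a missing `labels` list are easy to make. The sketch below checks one question definition against the fields used in the examples on this page. It is an illustration only: `check_question` is a hypothetical helper, and the rules it enforces are inferred from the examples here rather than taken from Potato's own validation.

```python
# Question types described in this section.
KNOWN_TYPES = {"radio", "checkbox", "text", "number", "likert", "select"}
# Types whose examples carry a list of choices.
CHOICE_TYPES = {"radio", "checkbox", "select"}


def check_question(question):
    """Return a list of problems found in one survey question definition."""
    problems = []
    for field in ("name", "type", "description"):
        if not question.get(field):
            problems.append(f"missing '{field}'")
    qtype = question.get("type")
    if qtype and qtype not in KNOWN_TYPES:
        problems.append(f"unknown type '{qtype}'")
    # Choice questions need explicit labels or a built-in template name.
    if qtype in CHOICE_TYPES and not (question.get("labels") or question.get("template")):
        problems.append("choice question needs 'labels' or 'template'")
    if qtype == "likert" and not isinstance(question.get("size"), int):
        problems.append("likert question needs an integer 'size'")
    if qtype == "number" and "min" in question and "max" in question:
        if question["min"] > question["max"]:
            problems.append("'min' is greater than 'max'")
    return problems
```

An empty list means the question passed every check; otherwise each entry names one problem to fix before launching the study.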

Consent Phase

Collect informed consent before starting:

yaml

phases:
  consent:
    enabled: true
    data_file: "data/consent.json"

consent.json:

json

[
  {
    "name": "consent_agreement",
    "type": "radio",
    "description": "I have read and understood the research consent form and agree to participate.",
    "labels": ["I agree", "I do not agree"],
    "right_label": "I agree",
    "required": true
  }
]

The `right_label` field specifies the answer a participant must select in order to proceed.
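The gating rule can be sketched as a comparison between each participant answer and the question's `right_label`. This is a minimal illustration, assuming answers are available as a mapping from question name to the selected label; `may_proceed` is a hypothetical helper, not a Potato function.

```python
def may_proceed(questions, answers):
    """Return True when every question with a `right_label` has that answer.

    `questions` is the parsed list from a file such as consent.json and
    `answers` maps question names to the label the participant selected.
    """
    for question in questions:
        expected = question.get("right_label")
        if expected is None:
            continue  # this question does not gate progress
        if answers.get(question["name"]) != expected:
            return False
    return True


consent_questions = [
    {
        "name": "consent_agreement",
        "type": "radio",
        "labels": ["I agree", "I do not agree"],
        "right_label": "I agree",
        "required": True,
    }
]
print(may_proceed(consent_questions, {"consent_agreement": "I agree"}))  # prints True
```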

Pre-Study Surveys

Collect demographics or screening questions:

yaml

phases:
  prestudy:
    enabled: true
    data_file: "data/demographics.json"

demographics.json:

json

[
  {
    "name": "age_range",
    "type": "radio",
    "description": "What is your age range?",
    "labels": ["18-24", "25-34", "35-44", "45-54", "55+"],
    "required": true
  },
  {
    "name": "education",
    "type": "radio",
    "description": "Highest level of education completed",
    "labels": ["High school", "Bachelor's degree", "Master's degree", "Doctoral degree", "Other"],
    "required": true
  },
  {
    "name": "english_native",
    "type": "radio",
    "description": "Is English your native language?",
    "labels": ["Yes", "No"],
    "required": true
  }
]

Instructions Phase

Display task instructions:

yaml

phases:
  instructions:
    enabled: true
    content: "data/instructions.html"

Or use inline content:

yaml

phases:
  instructions:
    enabled: true
    inline_content: |
      <h2>Task Instructions</h2>
      <p>In this task, you will classify the sentiment of product reviews.</p>
      <ul>
        <li><strong>Positive:</strong> Expresses satisfaction or praise</li>
        <li><strong>Negative:</strong> Expresses dissatisfaction or criticism</li>
        <li><strong>Neutral:</strong> Factual or mixed sentiment</li>
      </ul>

Training Phase

Practice questions with feedback (see Training Phase for details):

yaml

phases:
  training:
    enabled: true
    data_file: "data/training.json"
    schema_name: sentiment
    passing_criteria:
      min_correct: 8
      total_questions: 10
    show_explanations: true
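As a rough illustration of how `passing_criteria` might be evaluated, the sketch below compares a participant's number of correct answers with `min_correct`, and treats `require_all_correct` (shown under Best Practices) as overriding it. `passes_training` is a hypothetical helper and the precedence between the two settings is an assumption.

```python
def passes_training(num_correct, num_questions, criteria):
    """Decide whether a training attempt meets the passing criteria.

    `criteria` is the parsed `passing_criteria` mapping.
    """
    # Assumption: require_all_correct, when set, takes precedence.
    if criteria.get("require_all_correct", False):
        return num_correct == num_questions
    return num_correct >= criteria.get("min_correct", 0)


criteria = {"min_correct": 8, "total_questions": 10}
print(passes_training(8, criteria["total_questions"], criteria))  # prints True
print(passes_training(7, criteria["total_questions"], criteria))  # prints False
```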

Post-Study Surveys

Collect feedback after annotation:

yaml

phases:
  poststudy:
    enabled: true
    data_file: "data/feedback.json"

feedback.json:

json

[
  {
    "name": "difficulty",
    "type": "likert",
    "description": "How difficult was this task?",
    "size": 5,
    "min_label": "Very easy",
    "max_label": "Very difficult"
  },
  {
    "name": "clarity",
    "type": "likert",
    "description": "How clear were the instructions?",
    "size": 5,
    "min_label": "Very unclear",
    "max_label": "Very clear"
  },
  {
    "name": "suggestions",
    "type": "text",
    "description": "Any suggestions for improvement?",
    "textarea": true,
    "required": false
  }
]

Built-in Templates

Potato includes predefined label sets for common survey questions:

Template	Labels
`countries`	List of countries
`languages`	Common languages
`ethnicity`	Ethnicity options
`religion`	Religion options

Use templates in your questions:

json

{
  "name": "country",
  "type": "select",
  "description": "Select your country",
  "template": "countries"
}

Free Response Fields

Add optional text input alongside structured questions:

json

{
  "name": "topics",
  "type": "checkbox",
  "description": "Which topics interest you?",
  "labels": ["Technology", "Sports", "Politics", "Entertainment"],
  "free_response": true,
  "free_response_label": "Other (please specify)"
}

Page Headers

Customize survey section headers:

json

{
  "page_header": "Demographics Survey",
  "questions": [
    {"name": "age", "type": "radio", ...},
    {"name": "gender", "type": "radio", ...}
  ]
}
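The examples on this page show survey content in two shapes: a bare list of questions, and an object with `page_header` and `questions`. If you write your own tooling around these files, a small normalizer keeps the rest of the code uniform. This is a sketch under the assumption that both shapes may appear in a survey data file; `normalize_survey` is a hypothetical name.

```python
def normalize_survey(data):
    """Return (page_header, questions) for either survey file shape.

    A bare list has no header, so the header is returned as None.
    """
    if isinstance(data, list):
        return None, data
    return data.get("page_header"), data.get("questions", [])


header, questions = normalize_survey(
    {"page_header": "Demographics Survey", "questions": [{"name": "age", "type": "radio"}]}
)
print(header)          # prints Demographics Survey
print(len(questions))  # prints 1
```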

Complete Example

yaml

task_name: "Sentiment Analysis Study"
task_dir: "."
port: 8000
 
# Data configuration
data_files:
  - "data/reviews.json"
 
item_properties:
  id_key: id
  text_key: text
 
# Annotation scheme
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment of this review?"
    labels:
      - Positive
      - Negative
      - Neutral
    sequential_key_binding: true
 
# Multi-phase workflow
phases:
  consent:
    enabled: true
    data_file: "data/consent.json"
 
  prestudy:
    enabled: true
    data_file: "data/demographics.json"
 
  instructions:
    enabled: true
    content: "data/instructions.html"
 
  training:
    enabled: true
    data_file: "data/training.json"
    schema_name: sentiment
    passing_criteria:
      min_correct: 8
      total_questions: 10
    retries:
      enabled: true
      max_retries: 2
    show_explanations: true
 
  # annotation phase is always enabled
 
  poststudy:
    enabled: true
    data_file: "data/feedback.json"
 
# Output
output_annotation_dir: "output/"
output_annotation_format: "json"
 
# User access
allow_all_users: true
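The `retries` block in the example above interacts with `passing_criteria`. One way to picture the combined behavior is sketched below, assuming `max_retries` counts additional attempts after the first one (this page does not state that explicitly). `training_outcome` is a hypothetical helper for reasoning about the settings, not Potato's implementation.

```python
def training_outcome(scores, min_correct, max_retries):
    """Summarize a participant's training status.

    `scores` lists the number of correct answers per attempt, in order.
    Returns a (status, attempts_used) tuple.
    """
    # Assumption: one initial attempt plus `max_retries` further attempts.
    allowed_attempts = 1 + max_retries
    counted = scores[:allowed_attempts]
    for attempt, score in enumerate(counted, start=1):
        if score >= min_correct:
            return ("passed", attempt)
    if len(counted) == allowed_attempts:
        return ("failed", allowed_attempts)
    return ("in_progress", len(counted))


# With min_correct: 8 and max_retries: 2, a third attempt can still pass.
print(training_outcome([6, 7, 9], min_correct=8, max_retries=2))  # prints ('passed', 3)
```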

Legacy Configuration

The older `surveyflow` configuration format is still supported for backward compatibility:

yaml

surveyflow:
  enabled: true
  phases:
    - name: pre_survey
      type: survey
      questions: survey_questions.json
    - name: main_annotation
      type: annotation

However, we recommend using the `phases` format for new projects.

Best Practices

1. Keep Surveys Concise

Long surveys reduce completion rates. Focus on essential questions only.

2. Use Training for Complex Tasks

Training phases improve annotation quality, especially for nuanced tasks.

3. Set Reasonable Passing Criteria

yaml

# Too strict - may exclude good annotators
passing_criteria:
  require_all_correct: true
 
# Better - allows for learning
passing_criteria:
  min_correct: 8
  total_questions: 10

4. Provide Clear Instructions

Include examples in your instructions phase to clarify expectations.

5. Test the Full Flow

Complete the entire workflow yourself before deployment to catch issues.
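A quick pre-flight check can catch one common problem before the manual walkthrough: an enabled phase that points at a file which does not exist. The sketch below is not a substitute for completing the flow yourself. `missing_phase_files` is a hypothetical helper, and it assumes that `data_file` and `content` paths are resolved relative to a base directory you supply.

```python
from pathlib import Path


def missing_phase_files(phases_config, base_dir="."):
    """Return (phase, path) pairs for enabled phases whose files are missing."""
    missing = []
    for name, settings in phases_config.items():
        if not settings.get("enabled", False):
            continue  # disabled phases are not checked
        for key in ("data_file", "content"):
            relative = settings.get(key)
            if relative and not (Path(base_dir) / relative).is_file():
                missing.append((name, relative))
    return missing
```

Call it with the parsed `phases` mapping and fix any reported paths before inviting annotators.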

6. Use Required Fields Wisely

Mark a question as required only if it is essential. Optional questions tend to yield higher-quality responses, because participants are not forced to answer.

Crowdsourcing Integration

For Prolific or MTurk, configure completion codes:

yaml

phases:
  poststudy:
    enabled: true
    data_file: "data/feedback.json"
    show_completion_code: true
    completion_code_format: "POTATO-{user_id}-{timestamp}"

See Crowdsourcing for more details.
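To preview what a code produced from `completion_code_format` could look like, the sketch below fills the two placeholders with Python's `str.format`. It assumes the placeholders are substituted by name and that `{timestamp}` is a Unix timestamp in seconds; neither detail is confirmed on this page, and `render_completion_code` is a hypothetical helper.

```python
import time


def render_completion_code(template, user_id, timestamp=None):
    """Fill the {user_id} and {timestamp} placeholders in a code template."""
    if timestamp is None:
        # Assumption: the timestamp is expressed in whole seconds since the epoch.
        timestamp = int(time.time())
    return template.format(user_id=user_id, timestamp=timestamp)


print(render_completion_code("POTATO-{user_id}-{timestamp}", "worker42", 1700000000))
# prints POTATO-worker42-1700000000
```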
