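Multi-Phase Workflows

Build multi-phase annotation workflows in Potato: combine training phases, annotation tasks, and custom survey pages with consent forms and conditional branching.

Potato 2.0 supports structured annotation workflows made up of several sequential phases, including consent, pre-study surveys, instructions, training, annotation, and post-study feedback.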

Figure: Survey flow annotation workflow. Sequential phases wrap the core annotation task; every phase except annotation is optional.

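Available Phases

Phase	Description
`consent`	Informed consent collection
`prestudy`	Pre-annotation survey (demographics, screening)
`instructions`	Task guidelines and information
`training`	Practice questions with feedback
`annotation`	Main annotation task (always required)
`poststudy`	Post-annotation survey and feedback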

Basic Configuration

Use the `phases` section in your configuration:

yaml

phases:
  consent:
    enabled: true
    data_file: "data/consent.json"
 
  prestudy:
    enabled: true
    data_file: "data/demographics.json"
 
  instructions:
    enabled: true
    content: "data/instructions.html"
 
  training:
    enabled: true
    data_file: "data/training.json"
    schema_name: sentiment
    passing_criteria:
      min_correct: 8
 
  # annotation phase is always enabled
 
  poststudy:
    enabled: true
    data_file: "data/feedback.json"
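
To make the ordering concrete, here is a minimal Python sketch of how a `phases` mapping like the one above could be resolved into the sequence a participant sees. It is an illustration, not Potato's actual implementation: `PHASE_ORDER` and `enabled_phases` are hypothetical names, and the sketch assumes that phases run in the order of the table of available phases, that an omitted phase counts as disabled, and that `annotation` is always included.

```python
# Hypothetical helper for illustration; not part of Potato's API.
PHASE_ORDER = ["consent", "prestudy", "instructions", "training", "annotation", "poststudy"]


def enabled_phases(phases: dict) -> list[str]:
    """Return the phases a participant would go through, in order."""
    sequence = []
    for name in PHASE_ORDER:
        if name == "annotation":
            # The annotation phase is always part of the workflow.
            sequence.append(name)
        elif phases.get(name, {}).get("enabled", False):
            # Assumption: a phase missing from the mapping is disabled.
            sequence.append(name)
    return sequence


config = {
    "consent": {"enabled": True, "data_file": "data/consent.json"},
    "training": {"enabled": False},
    "poststudy": {"enabled": True, "data_file": "data/feedback.json"},
}
print(enabled_phases(config))  # ['consent', 'annotation', 'poststudy']
```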

Survey Question Types

Survey phases support the following question types:

Radio (Single Choice)

json

{
  "name": "experience",
  "type": "radio",
  "description": "How much annotation experience do you have?",
  "labels": ["None", "Some (< 10 hours)", "Moderate", "Extensive"],
  "required": true
}

Checkbox (Multiple Choice)

json

{
  "name": "languages",
  "type": "checkbox",
  "description": "What languages do you speak fluently?",
  "labels": ["English", "Spanish", "French", "German", "Chinese", "Other"]
}

Text Input

json

{
  "name": "occupation",
  "type": "text",
  "description": "What is your occupation?",
  "required": true
}

Number Input

json

{
  "name": "years_experience",
  "type": "number",
  "description": "Years of professional experience",
  "min": 0,
  "max": 50
}

Likert Scale

json

{
  "name": "familiarity",
  "type": "likert",
  "description": "How familiar are you with this topic?",
  "size": 5,
  "min_label": "Not familiar",
  "max_label": "Very familiar"
}

Dropdown Select

json

{
  "name": "country",
  "type": "select",
  "description": "Select your country",
  "labels": ["USA", "Canada", "UK", "Germany", "France", "Other"]
}
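
Before deploying a survey, it can help to sanity-check the question file. The following is a minimal sketch that relies only on the fields and type names shown in the examples above. `validate_question` is a hypothetical helper, not a Potato function, and Potato's own validation rules may differ; in particular, requiring answer options for `radio`, `checkbox`, and `select` is an assumption.

```python
# Hypothetical pre-flight check for illustration; not part of Potato.
KNOWN_TYPES = {"radio", "checkbox", "text", "number", "likert", "select"}
# Types whose examples in this guide list explicit answer options (assumption:
# they need either a "labels" list or a "template" name).
TYPES_WITH_OPTIONS = {"radio", "checkbox", "select"}


def validate_question(question: dict) -> list[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for field in ("name", "type", "description"):
        if field not in question:
            problems.append(f"missing field: {field}")
    qtype = question.get("type")
    if qtype is not None and qtype not in KNOWN_TYPES:
        problems.append(f"unknown type: {qtype}")
    if qtype in TYPES_WITH_OPTIONS:
        if not (question.get("labels") or question.get("template")):
            problems.append(f"type {qtype} needs labels or a template")
    if qtype == "number" and "min" in question and "max" in question:
        if question["min"] > question["max"]:
            problems.append("min is greater than max")
    return problems


incomplete = {"name": "country", "type": "select", "description": "Select your country"}
print(validate_question(incomplete))  # ['type select needs labels or a template']
```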

Consent Phase

Collect informed consent before starting:

yaml

phases:
  consent:
    enabled: true
    data_file: "data/consent.json"

consent.json:

json

[
  {
    "name": "consent_agreement",
    "type": "radio",
    "description": "I have read and understood the research consent form and agree to participate.",
    "labels": ["I agree", "I do not agree"],
    "right_label": "I agree",
    "required": true
  }
]

The `right_label` field specifies the answer required to proceed.
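
Conceptually, `right_label` acts as a gate: the participant moves on only if their answer equals that label. A minimal sketch of this logic follows, assuming a single-answer radio question and exact string matching. `may_proceed` is a hypothetical name, and this guide does not show how Potato performs the check internally.

```python
import json

# The consent question from consent.json, inlined so the sketch is self-contained.
CONSENT_JSON = """
[
  {
    "name": "consent_agreement",
    "type": "radio",
    "labels": ["I agree", "I do not agree"],
    "right_label": "I agree",
    "required": true
  }
]
"""


def may_proceed(question: dict, answer: str) -> bool:
    """True if the question has no right_label or the answer matches it."""
    required_answer = question.get("right_label")
    return required_answer is None or answer == required_answer


consent_question = json.loads(CONSENT_JSON)[0]
print(may_proceed(consent_question, "I agree"))         # True
print(may_proceed(consent_question, "I do not agree"))  # False
```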

Pre-Study Survey

Collect demographics or screening questions:

yaml

phases:
  prestudy:
    enabled: true
    data_file: "data/demographics.json"

demographics.json:

json

[
  {
    "name": "age_range",
    "type": "radio",
    "description": "What is your age range?",
    "labels": ["18-24", "25-34", "35-44", "45-54", "55+"],
    "required": true
  },
  {
    "name": "education",
    "type": "radio",
    "description": "Highest level of education completed",
    "labels": ["High school", "Bachelor's degree", "Master's degree", "Doctoral degree", "Other"],
    "required": true
  },
  {
    "name": "english_native",
    "type": "radio",
    "description": "Is English your native language?",
    "labels": ["Yes", "No"],
    "required": true
  }
]

Instructions Phase

Display task guidelines:

yaml

phases:
  instructions:
    enabled: true
    content: "data/instructions.html"

Or use inline content:

yaml

phases:
  instructions:
    enabled: true
    inline_content: |
      <h2>Task Instructions</h2>
      <p>In this task, you will classify the sentiment of product reviews.</p>
      <ul>
        <li><strong>Positive:</strong> Expresses satisfaction or praise</li>
        <li><strong>Negative:</strong> Expresses dissatisfaction or criticism</li>
        <li><strong>Neutral:</strong> Factual or mixed sentiment</li>
      </ul>

Training Phase

Practice questions with feedback (see Training Phase for details):

yaml

phases:
  training:
    enabled: true
    data_file: "data/training.json"
    schema_name: sentiment
    passing_criteria:
      min_correct: 8
      total_questions: 10
    show_explanations: true

Post-Study Survey

Collect feedback after annotation:

yaml

phases:
  poststudy:
    enabled: true
    data_file: "data/feedback.json"

feedback.json:

json

[
  {
    "name": "difficulty",
    "type": "likert",
    "description": "How difficult was this task?",
    "size": 5,
    "min_label": "Very easy",
    "max_label": "Very difficult"
  },
  {
    "name": "clarity",
    "type": "likert",
    "description": "How clear were the instructions?",
    "size": 5,
    "min_label": "Very unclear",
    "max_label": "Very clear"
  },
  {
    "name": "suggestions",
    "type": "text",
    "description": "Any suggestions for improvement?",
    "textarea": true,
    "required": false
  }
]

Built-in Templates

Potato includes predefined label sets for common survey questions:

Template	Labels
`countries`	List of countries
`languages`	Common languages
`ethnicity`	Ethnicity options
`religion`	Religion options

Use a template in a question:

json

{
  "name": "country",
  "type": "select",
  "description": "Select your country",
  "template": "countries"
}

Free Response Fields

Add an optional text input alongside structured questions:

json

{
  "name": "topics",
  "type": "checkbox",
  "description": "Which topics interest you?",
  "labels": ["Technology", "Sports", "Politics", "Entertainment"],
  "free_response": true,
  "free_response_label": "Other (please specify)"
}

Page Headers

Customize survey section headers:

json

{
  "page_header": "Demographics Survey",
  "questions": [
    {"name": "age", "type": "radio", ...},
    {"name": "gender", "type": "radio", ...}
  ]
}
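
The survey files in this guide appear in two shapes: a bare list of questions (as in `consent.json`) and an object with `page_header` and `questions`. If you write your own tooling around these files, a small normalizer can hide the difference. This is a sketch for such external tooling only; `load_survey` is a hypothetical helper and not part of Potato.

```python
import json


def load_survey(raw: str) -> dict:
    """Parse survey JSON and return {"page_header": ..., "questions": [...]}."""
    data = json.loads(raw)
    if isinstance(data, list):
        # Bare list of questions: there is no custom header.
        return {"page_header": None, "questions": data}
    return {
        "page_header": data.get("page_header"),
        "questions": data.get("questions", []),
    }


# Illustrative inputs; the question content here is made up for the example.
bare = '[{"name": "age", "type": "radio", "labels": ["18-24", "25+"]}]'
with_header = '{"page_header": "Demographics Survey", "questions": []}'
print(load_survey(bare)["page_header"])         # None
print(load_survey(with_header)["page_header"])  # Demographics Survey
```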

Complete Example

yaml

task_name: "Sentiment Analysis Study"
task_dir: "."
port: 8000
 
# Data configuration
data_files:
  - "data/reviews.json"
 
item_properties:
  id_key: id
  text_key: text
 
# Annotation scheme
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment of this review?"
    labels:
      - Positive
      - Negative
      - Neutral
    sequential_key_binding: true
 
# Multi-phase workflow
phases:
  consent:
    enabled: true
    data_file: "data/consent.json"
 
  prestudy:
    enabled: true
    data_file: "data/demographics.json"
 
  instructions:
    enabled: true
    content: "data/instructions.html"
 
  training:
    enabled: true
    data_file: "data/training.json"
    schema_name: sentiment
    passing_criteria:
      min_correct: 8
      total_questions: 10
    retries:
      enabled: true
      max_retries: 2
    show_explanations: true
 
  # annotation phase is always enabled
 
  poststudy:
    enabled: true
    data_file: "data/feedback.json"
 
# Output
output_annotation_dir: "output/"
output_annotation_format: "json"
 
# User access
allow_all_users: true

Legacy Configuration

The older `surveyflow` configuration format is still supported for backward compatibility:

yaml

surveyflow:
  enabled: true
  phases:
    - name: pre_survey
      type: survey
      questions: survey_questions.json
    - name: main_annotation
      type: annotation

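However, we recommend migrating to the new `phases` format for new projects.

Best Practices

1. Keep surveys concise

Long surveys reduce completion rates. Focus only on essential questions.

2. Use training for complex tasks

Training phases improve annotation quality, especially for nuanced tasks.

3. Set reasonable passing criteria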

yaml

# Too strict - may exclude good annotators
passing_criteria:
  require_all_correct: true
 
# Better - allows for learning
passing_criteria:
  min_correct: 8
  total_questions: 10

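4. Provide clear instructions

Include examples in the instructions phase to make expectations clear.

5. Test the full flow

Complete the entire workflow yourself before deployment to catch problems.

6. Use required fields carefully

Mark questions as required only when necessary; optional questions yield better response quality.

Crowdsourcing Integration

For Prolific or MTurk, configure a completion code: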

yaml

phases:
  poststudy:
    enabled: true
    data_file: "data/feedback.json"
    show_completion_code: true
    completion_code_format: "POTATO-{user_id}-{timestamp}"
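
The `{user_id}` and `{timestamp}` placeholders suggest simple template substitution. As an illustration only, the following sketch fills the pattern with Python's `str.format`. How Potato actually expands the placeholders, and which timestamp representation it uses, is not specified here, so the Unix-seconds timestamp and the `render_completion_code` helper are assumptions.

```python
import time
from typing import Optional


def render_completion_code(pattern: str, user_id: str, timestamp: Optional[int] = None) -> str:
    """Fill {user_id} and {timestamp} in a completion-code pattern."""
    if timestamp is None:
        timestamp = int(time.time())  # Assumed format: Unix seconds.
    return pattern.format(user_id=user_id, timestamp=timestamp)


print(render_completion_code("POTATO-{user_id}-{timestamp}", "worker42", 1700000000))
# POTATO-worker42-1700000000
```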

See Crowdsourcing for details.