Task Assignment

Potato provides flexible task assignment strategies that control how annotation instances are distributed among annotators.

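Overview

Task assignment controls:

- Which items each annotator sees
- How many items each annotator completes
- How many annotations each item receives
- The order in which items are presented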

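Key Configuration Options

Option	Description	Default
`assignment_strategy`	Strategy used to assign items	`random`
`max_annotations_per_user`	Maximum number of items per annotator	Unlimited
`max_annotations_per_item`	Target number of annotations per item	3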
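The two limits together determine how many annotators a project needs. The following is a rough planning sketch, not part of Potato itself; the function name `annotators_needed` is hypothetical, and it assumes every annotator completes their full quota and labels each item at most once.

```python
import math

def annotators_needed(num_items, max_annotations_per_item, max_annotations_per_user):
    """Estimate the smallest annotator pool that can reach the per-item target.

    Hypothetical planning helper; assumes each annotator uses their full quota
    and never labels the same item twice.
    """
    total_annotations = num_items * max_annotations_per_item
    by_quota = math.ceil(total_annotations / max_annotations_per_user)
    # An item needs that many *distinct* annotators, whatever the quota is.
    return max(by_quota, max_annotations_per_item)

print(annotators_needed(100, 3, 20))  # 100 * 3 / 20 = 15
```

In practice some annotators stop early, so the real pool usually has to be larger than this lower bound.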

Assignment Strategies

Random Assignment

Assigns items to annotators at random, ensuring an unbiased distribution.

yaml

assignment_strategy: random
max_annotations_per_item: 3

Best for: general annotation tasks where order does not matter.
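As an illustration of the idea, random assignment under both limits could look like the sketch below. This is not Potato's implementation; `assign_random` and its parameters are hypothetical names chosen for the example.

```python
import random

def assign_random(item_ids, annotation_counts, max_per_item, max_per_user, seed=None):
    """Pick up to max_per_user items at random among those below max_per_item.

    Hypothetical helper: annotation_counts maps item id -> annotations so far.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    eligible = [i for i in item_ids if annotation_counts.get(i, 0) < max_per_item]
    rng.shuffle(eligible)
    return eligible[:max_per_user]

batch = assign_random(["a", "b", "c", "d"], {"a": 3}, max_per_item=3, max_per_user=2, seed=1234)
print(batch)  # two items drawn from b, c, d; "a" is already fully annotated
```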

Fixed-Order Assignment

Assigns items in the order they appear in the dataset.

yaml

assignment_strategy: fixed_order
max_annotations_per_item: 2

Best for: tasks where annotators need to see items in a specific order.
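A minimal sketch of fixed-order selection, assuming the dataset order is simply the order of a list of IDs. The function `next_in_fixed_order` is a hypothetical name and does not reflect Potato's internals.

```python
def next_in_fixed_order(item_ids, seen_by_user, annotation_counts, max_per_item):
    """Return the first item, in dataset order, that this annotator can still label."""
    for item_id in item_ids:
        if item_id in seen_by_user:
            continue  # this annotator already handled the item
        if annotation_counts.get(item_id, 0) >= max_per_item:
            continue  # the item already reached its annotation target
        return item_id
    return None  # nothing left for this annotator

print(next_in_fixed_order(["a", "b", "c"], {"a"}, {"b": 2}, max_per_item=2))  # c
```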

Least-Annotated First

Prioritizes items with the fewest existing annotations, ensuring an even distribution.

yaml

assignment_strategy: least_annotated
max_annotations_per_item: 5

Best for: ensuring every item receives adequate coverage before any single item accumulates too many annotations.
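The prioritization can be pictured as a sort by current annotation count. The sketch below is illustrative only; `least_annotated_first` is a hypothetical name, and ties are broken by dataset order as an assumption.

```python
def least_annotated_first(item_ids, annotation_counts, max_per_item):
    """Order eligible items by how few annotations they have so far."""
    eligible = [i for i in item_ids if annotation_counts.get(i, 0) < max_per_item]
    # sorted() is stable, so items with equal counts keep their dataset order.
    return sorted(eligible, key=lambda i: annotation_counts.get(i, 0))

counts = {"a": 2, "b": 0, "c": 5, "d": 1}
print(least_annotated_first(["a", "b", "c", "d"], counts, max_per_item=5))  # ['b', 'd', 'a']
```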

Maximum Disagreement First

Prioritizes items whose existing annotations disagree the most.

yaml

assignment_strategy: max_diversity
max_annotations_per_item: 4

Best for: quality control and resolving ambiguous items.
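The text above does not say how disagreement is measured. One common choice, used here purely as an assumption, is the Shannon entropy of the labels an item has received so far; `label_entropy` and `most_disputed_first` are hypothetical names.

```python
import math
from collections import Counter

def label_entropy(labels):
    """Shannon entropy (in bits) of a list of labels; 0 means full agreement."""
    if not labels:
        return 0.0
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def most_disputed_first(labels_by_item):
    """Order item ids from most to least disagreement (assumed metric: entropy)."""
    return sorted(labels_by_item, key=lambda i: label_entropy(labels_by_item[i]), reverse=True)

labels = {"a": ["pos", "pos", "pos"], "b": ["pos", "neg"], "c": ["pos", "pos", "neg"]}
print(most_disputed_first(labels))  # ['b', 'c', 'a']
```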

Active Learning Assignment

Uses machine learning to prioritize uncertain instances.

yaml

assignment_strategy: active_learning
 
active_learning:
  enabled: true
  schema_names: ["sentiment"]
  min_annotations_per_instance: 2
  min_instances_for_training: 20
  update_frequency: 10

See the Active Learning documentation for the full configuration.
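To give an intuition for "uncertain instances", the sketch below ranks items by least confidence, that is, by the smallest top-class probability from some trained classifier. Least-confidence sampling is an assumption made for illustration; the source does not state which uncertainty measure Potato uses, and `rank_by_uncertainty` is a hypothetical name.

```python
def rank_by_uncertainty(probabilities_by_item):
    """Order item ids so the least confident predictions come first.

    probabilities_by_item maps item id -> list of predicted class probabilities.
    """
    return sorted(probabilities_by_item, key=lambda i: max(probabilities_by_item[i]))

predictions = {"a": [0.9, 0.1], "b": [0.55, 0.45], "c": [0.7, 0.3]}
print(rank_by_uncertainty(predictions))  # ['b', 'c', 'a']
```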

Configuration

Modern Configuration (Recommended)

yaml

# Strategy selection
assignment_strategy: random
 
# Limits
max_annotations_per_user: 10    # -1 for unlimited
max_annotations_per_item: 3     # -1 for unlimited
 
# Optional: nested configuration
assignment:
  strategy: random
  max_annotations_per_item: 3
  random_seed: 1234

Legacy Configuration

The older `automatic_assignment` configuration is still supported:

yaml

automatic_assignment:
  on: true
  output_filename: task_assignment.json
  sampling_strategy: random    # 'random' or 'ordered'
  labels_per_instance: 3       # Annotations per item
  instance_per_annotator: 5    # Items per annotator
  test_question_per_annotator: 0

Test Questions

Insert attention-check questions into the annotation queue.

Defining Test Questions

Append `_testing` to the instance ID in the data file:

csv

text,id
"This is test question 1",0_testing
"Regular item",dkjfd

Or in JSON:

json

[
  {"id": "0_testing", "text": "This is a test question"},
  {"id": "regular_001", "text": "Normal annotation item"}
]
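To check a data file before launching a task, the naming convention above can be verified with a few lines of Python. This is a sketch under the assumption that the marker appears as a suffix of the ID, as in the examples; `split_test_questions` is a hypothetical helper, not a Potato function.

```python
import json

def split_test_questions(records, marker="_testing"):
    """Separate records whose id ends with the marker from regular items."""
    test, regular = [], []
    for record in records:
        target = test if str(record["id"]).endswith(marker) else regular
        target.append(record)
    return test, regular

records = json.loads(
    '[{"id": "0_testing", "text": "This is a test question"},'
    ' {"id": "regular_001", "text": "Normal annotation item"}]'
)
test_items, regular_items = split_test_questions(records)
print(len(test_items), len(regular_items))  # 1 1
```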

Configuration

yaml

automatic_assignment:
  on: true
  test_question_per_annotator: 2  # Insert 2 test questions per annotator

Example Configurations

Basic Random Assignment

yaml

annotation_task_name: "Sentiment Analysis"
assignment_strategy: random
max_annotations_per_user: 20
max_annotations_per_item: 3

Quality-Focused Assignment

yaml

annotation_task_name: "Quality Annotation"
assignment_strategy: max_diversity
max_annotations_per_item: 5
max_annotations_per_user: 50

Crowdsourcing Setup

yaml

annotation_task_name: "Crowdsourced Task"
assignment_strategy: random
max_annotations_per_user: 10
max_annotations_per_item: 3
 
# Crowdsourcing settings
hide_navbar: true
jumping_to_id_disabled: true
 
login:
  type: url_direct
  url_argument: workerId

Active Learning Setup

yaml

assignment_strategy: active_learning
 
active_learning:
  enabled: true
  schema_names: ["sentiment", "topic"]
  min_annotations_per_instance: 2
  min_instances_for_training: 20
  update_frequency: 10
  classifier_name: "sklearn.linear_model.LogisticRegression"
  vectorizer_name: "sklearn.feature_extraction.text.TfidfVectorizer"

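Admin Dashboard Integration

Monitor and adjust assignment settings through the admin dashboard:

1. Navigate to /admin
2. Go to the Configuration tab
3. Modify:
- Maximum annotations per user
- Maximum annotations per item
- Assignment strategy

Changes take effect immediately, with no server restart required.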
