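Pairwise Comparison

Compare pairs of items for preference and quality assessment.

Pairwise comparison lets annotators view two items side by side and indicate which one they prefer. Two modes are supported:

Binary mode (default): click the preferred tile (A or B); a tie button can optionally be shown
Scale mode: use a slider to rate how strongly one option is preferred over the other

Common use cases include comparing model outputs, RLHF preference learning, quality comparison of translations or summaries, and A/B testing.

Binary Mode

Binary mode displays two clickable tiles. Annotators click the one they prefer.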

yaml

annotation_schemes:
  - annotation_type: pairwise
    name: preference
    description: "Which response is better?"
    mode: binary
 
    # Data source - key in instance data containing items to compare
    items_key: "responses"
 
    # Display options
    show_labels: true
    labels:
      - "Response A"
      - "Response B"
 
    # Tie option
    allow_tie: true
    tie_label: "No preference"
 
    # Keyboard shortcuts
    sequential_key_binding: true
 
    # Validation
    label_requirement:
      required: true

Scale Mode

Scale mode displays a slider between the two items so annotators can indicate how strong their preference is.

yaml

annotation_schemes:
  - annotation_type: pairwise
    name: preference_scale
    description: "Rate how much better A is than B"
    mode: scale
 
    items_key: "responses"
 
    labels:
      - "Response A"
      - "Response B"
 
    # Scale configuration
    scale:
      min: -3           # Negative = prefer left item (A)
      max: 3            # Positive = prefer right item (B)
      step: 1
      default: 0
 
      # Endpoint labels
      labels:
        min: "A is much better"
        max: "B is much better"
        center: "Equal"
 
    label_requirement:
      required: true
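To make the scale settings concrete, the following minimal sketch lists the positions a slider would offer for a given `min`, `max`, and `step`. It assumes integer values and that the slider stops at every multiple of `step` starting from `min`; the function name `slider_values` is hypothetical and not part of the tool.

```python
def slider_values(minimum, maximum, step):
    """List the slider positions for integer min/max/step (assumed behavior)."""
    return list(range(minimum, maximum + 1, step))

# Settings from the scale example: min -3, max 3, step 1
positions = slider_values(-3, 3, 1)
print(positions)
print(len(positions))
```

With these settings the default of 0 sits in the middle of the range, which matches the "Equal" center label.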

Data Format

The scheme expects each instance to contain a list of items to compare:

json

{"id": "1", "responses": ["Response A text", "Response B text"]}
{"id": "2", "responses": ["First option here", "Second option here"]}

The items_key setting specifies which field holds the items to compare. That field should contain a list of at least 2 items.
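Before loading data, it can help to check that every instance meets this requirement. The sketch below is one possible pre-flight check over JSON Lines text; `find_invalid_instances` is a hypothetical helper written for illustration, not a function provided by the tool.

```python
import json

def find_invalid_instances(jsonl_text, items_key="responses"):
    """Return (line_number, reason) pairs for instances that cannot be compared."""
    problems = []
    for line_number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        instance = json.loads(line)
        items = instance.get(items_key)
        if not isinstance(items, list):
            problems.append((line_number, f"'{items_key}' is missing or not a list"))
        elif len(items) < 2:
            problems.append((line_number, f"'{items_key}' has fewer than 2 items"))
    return problems

sample = "\n".join([
    '{"id": "1", "responses": ["Response A text", "Response B text"]}',
    '{"id": "2", "responses": ["Only one option"]}',
    '{"id": "3", "text": "no responses field"}',
])
for line_number, reason in find_invalid_instances(sample):
    print(f"line {line_number}: {reason}")
```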

Keyboard Shortcuts

In binary mode with sequential_key_binding: true:

Key	Action
`1`	Select option A
`2`	Select option B
`0`	Select tie / no preference (if `allow_tie: true`)

Scale mode is operated with the slider.

Output Format

Binary Mode

json

{
  "preference": {
    "selection": "A"
  }
}

For a tie:

json

{
  "preference": {
    "selection": "tie"
  }
}

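Scale Mode

Negative values indicate a preference for A, positive values a preference for B, and zero means the two are rated equal: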

json

{
  "preference_scale": {
    "scale_value": "-2"
  }
}
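Downstream code usually needs a single answer regardless of mode. The following sketch reduces either output shape to `"A"`, `"B"`, or `"tie"`. It assumes that choosing the second tile is recorded as `"B"` (the examples above show only `"A"` and `"tie"`) and that `scale_value` arrives as a numeric string, as in the example; `preferred_item` is a hypothetical helper name.

```python
def preferred_item(annotation, scheme_name):
    """Reduce a binary or scale annotation to 'A', 'B', or 'tie'."""
    value = annotation[scheme_name]
    if "selection" in value:
        # Binary mode: the selection is already a label.
        return value["selection"]
    # Scale mode: the value is stored as a string, so convert it first.
    score = float(value["scale_value"])
    if score < 0:
        return "A"
    if score > 0:
        return "B"
    return "tie"

binary = {"preference": {"selection": "A"}}
scale = {"preference_scale": {"scale_value": "-2"}}
print(preferred_item(binary, "preference"))
print(preferred_item(scale, "preference_scale"))
```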

Examples

Basic Binary Comparison

yaml

annotation_schemes:
  - annotation_type: pairwise
    name: quality
    description: "Which text is higher quality?"
    labels: ["Text A", "Text B"]
    allow_tie: true

Multi-Aspect Comparison

Compare the items along several dimensions:

yaml

annotation_schemes:
  - annotation_type: pairwise
    name: fluency
    description: "Which response is more fluent?"
    labels: ["Response A", "Response B"]
 
  - annotation_type: pairwise
    name: relevance
    description: "Which response is more relevant?"
    labels: ["Response A", "Response B"]
 
  - annotation_type: pairwise
    name: overall
    description: "Which response is better overall?"
    labels: ["Response A", "Response B"]
    allow_tie: true
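One way to summarize a multi-aspect annotation is to count how often each item was preferred across the schemes. This is a sketch only: it assumes each scheme's result uses the binary output shape shown under Output Format, keyed by the scheme's `name`, and `tally_aspects` is a hypothetical helper.

```python
from collections import Counter

def tally_aspects(annotation, aspects=("fluency", "relevance", "overall")):
    """Count selections across the listed schemes, skipping any that are absent."""
    counts = Counter(
        annotation[name]["selection"] for name in aspects if name in annotation
    )
    return dict(counts)

annotation = {
    "fluency": {"selection": "A"},
    "relevance": {"selection": "B"},
    "overall": {"selection": "A"},
}
print(tally_aspects(annotation))
```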

Preference Scale with a Custom Range

yaml

annotation_schemes:
  - annotation_type: pairwise
    name: sentiment_comparison
    description: "Compare the sentiment of these two statements"
    mode: scale
    labels: ["Statement A", "Statement B"]
    scale:
      min: -5
      max: 5
      step: 1
      labels:
        min: "A is much more positive"
        max: "B is much more positive"
        center: "Equal sentiment"

RLHF Preference Collection

yaml

annotation_schemes:
  - annotation_type: pairwise
    name: overall
    description: "Overall, which response is better?"
    labels: ["Response A", "Response B"]
    allow_tie: true
    sequential_key_binding: true
 
  - annotation_type: multiselect
    name: criteria
    description: "What factors influenced your decision?"
    labels:
      - Accuracy
      - Helpfulness
      - Clarity
      - Safety
      - Completeness
 
  - annotation_type: text
    name: notes
    description: "Additional notes (optional)"
    textarea: true
    required: false
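Preference-learning pipelines often expect chosen/rejected pairs rather than raw selections. The sketch below joins an instance with its `overall` annotation to build such a pair. It assumes that `"A"` refers to the first item in the list and `"B"` to the second, and that ties are dropped; the `chosen`/`rejected` field names and the `to_preference_pair` helper are illustrative conventions, not part of the tool's output.

```python
def to_preference_pair(instance, annotation, items_key="responses", scheme_name="overall"):
    """Build a chosen/rejected pair, or return None for ties and unknown values."""
    selection = annotation[scheme_name]["selection"]
    first, second = instance[items_key][:2]
    if selection == "A":
        return {"chosen": first, "rejected": second}
    if selection == "B":
        return {"chosen": second, "rejected": first}
    return None  # a tie carries no ordering, so it yields no pair

instance = {"id": "1", "responses": ["Response A text", "Response B text"]}
print(to_preference_pair(instance, {"overall": {"selection": "B"}}))
print(to_preference_pair(instance, {"overall": {"selection": "tie"}}))
```

Whether to discard ties or keep them as a separate signal depends on the training setup; dropping them is only the simplest choice.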

Styling

Pairwise comparison annotation uses CSS variables from the theme system. Add custom CSS to change how the tiles look:

css

/* Make tiles taller */
.pairwise-tile {
  min-height: 200px;
}
 
/* Change selected tile highlight */
.pairwise-tile.selected {
  border-color: #10b981;
  background-color: rgba(16, 185, 129, 0.1);
}

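Best Practices

Use clear, distinct labels - annotators should understand the options immediately
Consider the tie option carefully - sometimes a forced choice is more appropriate
Use keyboard shortcuts - they speed up annotation considerably
Add a justification field - it helps you understand annotators' reasoning and improves data quality
Test with your own data - make sure the display works for the length of your content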
