Instance Display

Use the instance_display configuration block to separate content display from annotation.

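New in v2.1.0

Instance display separates what content is shown to annotators from what annotations are collected. This lets you display any combination of content types (image, video, audio, text) alongside any annotation scheme (radio buttons, checkboxes, spans, and so on).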

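Why Use Instance Display?

Previously, if you wanted to show radio buttons next to an image for classification, you had to add an image_annotation scheme and set min_annotations: 0 just to display the image. This workaround was confusing and semantically incorrect.

With instance_display, you state explicitly which content to display: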

yaml
# OLD (deprecated workaround)
annotation_schemes:
  - annotation_type: image_annotation
    name: image_display
    min_annotations: 0  # Just to show the image
    tools: [bbox]
    labels: [unused]
  - annotation_type: radio
    name: category
    labels: [A, B, C]
 
# NEW (recommended)
instance_display:
  fields:
    - key: image_url
      type: image
 
annotation_schemes:
  - annotation_type: radio
    name: category
    labels: [A, B, C]

Basic Configuration

Add an instance_display section to your YAML configuration:

yaml
instance_display:
  fields:
    - key: "image_url"           # Field name in your data JSON
      type: "image"              # Content type
      label: "Image to classify" # Optional header
      display_options:
        max_width: 600
        zoomable: true
 
  layout:
    direction: "vertical"        # vertical or horizontal
    gap: "20px"

Supported Display Types

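Type | Description | Span target
text | Plain text content |
html | Sanitized HTML content |
image | Image display with zoom |
video | Video player |
audio | Audio player |
dialogue | Conversation turns |
pairwise | Side-by-side comparison |
code | Syntax-highlighted source code |
spreadsheet | Tabular data (Excel/CSV) | Yes (row/cell)
document | Rich documents (Word, Markdown, HTML) |
pdf | PDF documents with page controls |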

Display Type Options

Text Display

yaml
- key: "text"
  type: "text"
  label: "Document"
  display_options:
    collapsible: false        # Make content collapsible
    max_height: 400           # Max height in pixels before scrolling
    preserve_whitespace: true # Preserve line breaks and spacing

Image Display

yaml
- key: "image_url"
  type: "image"
  label: "Image"
  display_options:
    max_width: 800            # Max width (number or CSS string)
    max_height: 600           # Max height
    zoomable: true            # Enable zoom controls
    alt_text: "Description"   # Alt text for accessibility
    object_fit: "contain"     # CSS object-fit property

Video Display

yaml
- key: "video_url"
  type: "video"
  label: "Video"
  display_options:
    max_width: 800
    max_height: 450
    controls: true            # Show video controls
    autoplay: false           # Auto-play on load
    loop: false               # Loop playback
    muted: false              # Start muted

Audio Display

yaml
- key: "audio_url"
  type: "audio"
  label: "Audio"
  display_options:
    controls: true            # Show audio controls
    autoplay: false
    loop: false
    show_waveform: false      # Show waveform visualization

Dialogue Display

yaml
- key: "conversation"
  type: "dialogue"
  label: "Conversation"
  display_options:
    alternating_shading: true    # Alternate background colors
    speaker_extraction: true     # Extract "Speaker:" from text
    show_turn_numbers: false     # Show turn numbers

Data format for dialogues (one instance per line of the JSONL file):

json
{"id": "conv_001", "conversation": ["Speaker A: Hello there!", "Speaker B: Hi, how are you?"]}

Or, using structured data:

json
{"id": "conv_001", "conversation": [{"speaker": "Alice", "text": "Hello there!"}, {"speaker": "Bob", "text": "Hi, how are you?"}]}
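
Because the same key can hold either plain "Speaker: text" strings or structured objects, downstream scripts may want to treat both forms uniformly. The following is a minimal sketch, not the tool's own parsing logic: normalize_turn and load_conversation are hypothetical helper names, and splitting on the first colon is an assumption about how a speaker prefix could be separated from the text.

```python
import json


def normalize_turn(turn):
    """Return a {"speaker", "text"} dict for either dialogue turn format."""
    if isinstance(turn, dict):
        # Structured form: already has explicit speaker and text fields.
        return {"speaker": turn["speaker"], "text": turn["text"]}
    # Plain-string form: split on the first colon only, so later colons
    # in the utterance are preserved.
    speaker, sep, text = turn.partition(":")
    if not sep:
        # No colon at all: treat the whole string as text with no speaker.
        return {"speaker": None, "text": turn.strip()}
    return {"speaker": speaker.strip(), "text": text.strip()}


def load_conversation(jsonl_line, key="conversation"):
    """Parse one JSONL line and normalize every turn stored under `key`."""
    record = json.loads(jsonl_line)
    return [normalize_turn(turn) for turn in record[key]]
```

A turn whose text contains a colon but no speaker prefix would be split incorrectly by this heuristic, so the structured form is the safer choice when speakers must be unambiguous.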

Pairwise Display

yaml
- key: "comparison"
  type: "pairwise"
  label: "Compare Options"
  display_options:
    cell_width: "50%"           # Width of each cell
    show_labels: true           # Show A/B labels
    labels: ["Option A", "Option B"]
    vertical_on_mobile: true    # Stack vertically on mobile

Layout Options

Control how multiple fields are arranged:

yaml
instance_display:
  layout:
    direction: horizontal  # horizontal or vertical
    gap: 24px              # Space between fields

Span Annotation Support

Text-based display types (text, dialogue) can serve as targets for span annotation:

yaml
instance_display:
  fields:
    - key: "document"
      type: "text"
      span_target: true  # Enable span annotation on this field
 
annotation_schemes:
  - annotation_type: span
    name: entities
    labels: [PERSON, LOCATION, ORG]

Multiple Span Targets

You can have several text fields that support span annotation:

yaml
instance_display:
  fields:
    - key: "source_text"
      type: "text"
      label: "Source Document"
      span_target: true
 
    - key: "summary"
      type: "text"
      label: "Summary"
      span_target: true
 
annotation_schemes:
  - annotation_type: span
    name: factual_errors
    labels: [contradiction, unsupported, fabrication]

When multiple span targets are used, annotations are stored keyed by the field they belong to:

json
{
  "factual_errors": {
    "source_text": [],
    "summary": [
      {"start": 45, "end": 67, "label": "unsupported"}
    ]
  }
}
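
To inspect what was actually highlighted, the stored offsets can be mapped back onto the field text. This is a minimal sketch under two assumptions that the format above does not state: start and end are character offsets into the field's text, and end is exclusive. The function name spans_with_text is hypothetical.

```python
def spans_with_text(instance, annotations, scheme_name):
    """Return (field, label, text) tuples for every stored span of one scheme.

    `instance` is the data record shown to the annotator and `annotations`
    is the stored annotation object, keyed by scheme name and then by field.
    """
    results = []
    for field, spans in annotations.get(scheme_name, {}).items():
        source = instance[field]
        for span in spans:
            # Assumption: character offsets, with `end` exclusive.
            snippet = source[span["start"]:span["end"]]
            results.append((field, span["label"], snippet))
    return results
```

Fields with an empty span list, such as source_text in the example, simply contribute no tuples.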

Linking Annotation Schemes to Display Fields

For media annotation schemes (image_annotation, video_annotation, audio_annotation), use source_field to link the scheme to a display field:

yaml
instance_display:
  fields:
    - key: "image_url"
      type: "image"
 
annotation_schemes:
  - annotation_type: image_annotation
    source_field: "image_url"  # Links to display field
    tools: [bbox]
    labels: [person, car]

Per-Turn Dialogue Ratings

Add inline rating widgets to individual dialogue turns:

yaml
instance_display:
  fields:
    - key: conversation
      type: dialogue
      label: "Conversation"
      display_options:
        show_turn_numbers: true
        per_turn_ratings:
          speakers: ["Agent"]          # Only show ratings for these speakers
          schema_name: "turn_quality"  # Name for the stored annotation data
          scheme:
            type: likert
            size: 5
            labels: ["Poor", "Excellent"]

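Rating circles appear inline beneath each turn from a matching speaker. The circles fill up to the selected value, and all per-turn ratings are stored together as a single JSON object: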

json
{
  "turn_quality": "{\"0\": 4, \"2\": 5, \"4\": 3}"
}
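
Note that in the example the value under turn_quality is itself a JSON string, so it needs a second decoding step before use. The sketch below shows one way to do that. It assumes the keys are turn indices and the values are integer scores, which matches the example but is not otherwise specified; decode_turn_ratings is a hypothetical helper name.

```python
import json


def decode_turn_ratings(annotation, schema_name="turn_quality"):
    """Return {turn_index: score} with integer keys and values."""
    raw = annotation[schema_name]
    # The stored value is a JSON-encoded string in the example above;
    # accept an already-decoded dict as well, to be tolerant.
    ratings = json.loads(raw) if isinstance(raw, str) else raw
    return {int(turn): int(score) for turn, score in ratings.items()}


def mean_rating(ratings):
    """Average score across rated turns, or None if nothing was rated."""
    if not ratings:
        return None
    return sum(ratings.values()) / len(ratings)
```

Turns that were not rated (for example, turns by speakers outside the configured speakers list) are simply absent from the decoded mapping.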

Example: Image Classification

yaml
annotation_task_name: "Image Classification"
 
data_files:
  - data/images.json
 
item_properties:
  id_key: id
  text_key: image_url
 
task_dir: .
output_annotation_dir: annotation_output
 
instance_display:
  fields:
    - key: image_url
      type: image
      label: "Image to Classify"
      display_options:
        max_width: 600
        zoomable: true
 
    - key: context
      type: text
      label: "Additional Context"
      display_options:
        collapsible: true
 
annotation_schemes:
  - annotation_type: radio
    name: category
    description: "What category best describes this image?"
    labels:
      - nature
      - urban
      - people
      - objects
 
user_config:
  allow_all_users: true

Example data file (data/images.json), in JSONL format:

json
{"id": "img_001", "image_url": "https://example.com/image1.jpg", "context": "Taken in summer 2023"}
{"id": "img_002", "image_url": "https://example.com/image2.jpg", "context": "Winter landscape"}
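
Each key listed under instance_display fields has to be present in the data records for the content to show up, so a quick pre-flight check can catch gaps before annotation starts. The following is a rough sketch of such a check; missing_display_keys is a hypothetical helper, not part of the tool, and it works on the JSONL text directly rather than reading the YAML configuration.

```python
import json


def missing_display_keys(jsonl_text, field_keys, id_key="id"):
    """Map each incomplete record's id to the display keys it lacks.

    Records that contain every key in `field_keys` are omitted from the
    result, so an empty dict means the data matches the configured fields.
    """
    problems = {}
    for line_no, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        record = json.loads(line)
        missing = [key for key in field_keys if key not in record]
        if missing:
            # Fall back to the line number when the record has no id.
            problems[record.get(id_key, f"line {line_no}")] = missing
    return problems
```

For the configuration above, field_keys would be ["image_url", "context"], and the text would come from reading data/images.json.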

Example: Multimodal Annotation

A video shown side by side with a transcript that supports span annotation:

yaml
annotation_task_name: "Video Analysis"
 
instance_display:
  layout:
    direction: horizontal
    gap: 24px
 
  fields:
    - key: video_url
      type: video
      label: "Video"
      display_options:
        max_width: "45%"
 
    - key: transcript
      type: text
      label: "Transcript"
      span_target: true
 
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    labels: [positive, neutral, negative]
 
  - annotation_type: span
    name: highlights
    labels:
      - key_point
      - question
      - supporting_evidence

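Backward Compatibility

  • Existing configurations without instance_display continue to work unchanged
  • text_key in item_properties is still used as a fallback
  • Legacy media detection through annotation schemes still works
  • A deprecation warning is logged when the old display-only pattern is used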

Migrating from the Display-Only Pattern

If you previously used an annotation scheme solely to display content:

Before (deprecated):

yaml
annotation_schemes:
  - annotation_type: image_annotation
    name: image_display
    min_annotations: 0
    tools: [bbox]
    labels: [unused]

After (recommended):

yaml
instance_display:
  fields:
    - key: image_url
      type: image
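
When migrating a larger project, it can help to list the schemes that look like display-only workarounds. The sketch below operates on a configuration that has already been loaded into a Python dict. The heuristic (a media annotation type with min_annotations set to 0) is an assumption based on the pattern shown above and may also flag schemes that are legitimately optional, so treat the result as a list of candidates to review. find_display_only_schemes is a hypothetical helper name.

```python
MEDIA_TYPES = {"image_annotation", "video_annotation", "audio_annotation"}


def find_display_only_schemes(config):
    """Return names of schemes that resemble the deprecated workaround.

    Heuristic (an assumption, not the tool's own detection logic):
    a media annotation scheme whose min_annotations is exactly 0.
    """
    candidates = []
    for scheme in config.get("annotation_schemes", []):
        is_media = scheme.get("annotation_type") in MEDIA_TYPES
        if is_media and scheme.get("min_annotations") == 0:
            candidates.append(scheme.get("name"))
    return candidates
```

Each candidate can then be replaced by an instance_display field of the matching type, as in the before and after snippets above.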

Further Reading

For implementation details, see the source documentation.