Instance Display
Use the instance_display configuration block to separate content display from annotation.
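New in v2.1.0
Instance display separates what content is shown to annotators from what annotations are collected. This lets you display any combination of content types (image, video, audio, text) alongside any annotation scheme (radio buttons, checkboxes, spans, and so on).
Why use instance display?
Previously, to show radio buttons next to an image for classification, you had to add an image_annotation scheme with min_annotations: 0 just to get the image displayed. That workaround was confusing and semantically incorrect.
With instance_display, you configure the displayed content explicitly: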
```yaml
# OLD (deprecated workaround)
annotation_schemes:
  - annotation_type: image_annotation
    name: image_display
    min_annotations: 0  # Just to show the image
    tools: [bbox]
    labels: [unused]
  - annotation_type: radio
    name: category
    labels: [A, B, C]

# NEW (recommended)
instance_display:
  fields:
    - key: image_url
      type: image

annotation_schemes:
  - annotation_type: radio
    name: category
    labels: [A, B, C]
```
Basic configuration
Add an instance_display section to your YAML configuration:
```yaml
instance_display:
  fields:
    - key: "image_url"             # Field name in your data JSON
      type: "image"                # Content type
      label: "Image to classify"   # Optional header
      display_options:
        max_width: 600
        zoomable: true
  layout:
    direction: "vertical"          # vertical or horizontal
    gap: "20px"
```
Supported display types
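| Type | Description | Span target |
|---|---|---|
| text | Plain text content | Yes |
| html | Sanitized HTML content | No |
| image | Image display with zoom | No |
| video | Video player | No |
| audio | Audio player | No |
| dialogue | Conversation turns | Yes |
| pairwise | Side-by-side comparison | No |
| code | Syntax-highlighted source code | Yes |
| spreadsheet | Tabular data (Excel/CSV) | Yes (row/cell) |
| document | Rich documents (Word, Markdown, HTML) | Yes |
| pdf | PDF documents with page controls | Yes |
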
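
The span target column above lends itself to a quick pre-flight check on a configuration. The Python sketch below hard-codes the table and reports any field that sets span_target: true on a type the table marks as unsupported. The helper name find_invalid_span_targets is hypothetical and not part of the tool, and the configuration is assumed to be already loaded into a dict.

```python
# Display types that the table above lists as valid span targets.
SPAN_TARGET_TYPES = {"text", "dialogue", "code", "spreadsheet", "document", "pdf"}


def find_invalid_span_targets(config: dict) -> list[str]:
    """Return keys of fields that set span_target on a non-span type.

    Hypothetical helper for illustration; not part of the tool itself.
    """
    fields = config.get("instance_display", {}).get("fields", [])
    return [
        field["key"]
        for field in fields
        if field.get("span_target") and field.get("type") not in SPAN_TARGET_TYPES
    ]


# Example config as a dict (for instance, the result of parsing the YAML file).
config = {
    "instance_display": {
        "fields": [
            {"key": "transcript", "type": "text", "span_target": True},
            {"key": "video_url", "type": "video", "span_target": True},
        ]
    }
}

print(find_invalid_span_targets(config))  # ['video_url']
```

Keeping the set next to the check means a change to the table only needs one edit in the sketch.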
Display type options
Text display
```yaml
- key: "text"
  type: "text"
  label: "Document"
  display_options:
    collapsible: false          # Make content collapsible
    max_height: 400             # Max height in pixels before scrolling
    preserve_whitespace: true   # Preserve line breaks and spacing
```
Image display
```yaml
- key: "image_url"
  type: "image"
  label: "Image"
  display_options:
    max_width: 800            # Max width (number or CSS string)
    max_height: 600           # Max height
    zoomable: true            # Enable zoom controls
    alt_text: "Description"   # Alt text for accessibility
    object_fit: "contain"     # CSS object-fit property
```
Video display
```yaml
- key: "video_url"
  type: "video"
  label: "Video"
  display_options:
    max_width: 800
    max_height: 450
    controls: true    # Show video controls
    autoplay: false   # Auto-play on load
    loop: false       # Loop playback
    muted: false      # Start muted
```
Audio display
```yaml
- key: "audio_url"
  type: "audio"
  label: "Audio"
  display_options:
    controls: true         # Show audio controls
    autoplay: false
    loop: false
    show_waveform: false   # Show waveform visualization
```
Dialogue display
```yaml
- key: "conversation"
  type: "dialogue"
  label: "Conversation"
  display_options:
    alternating_shading: true   # Alternate background colors
    speaker_extraction: true    # Extract "Speaker:" from text
    show_turn_numbers: false    # Show turn numbers
```
Data format for dialogue (one record per line in the JSONL file):
```json
{"id": "conv_001", "conversation": ["Speaker A: Hello there!", "Speaker B: Hi, how are you?"]}
```
Or use structured data:
```json
{"id": "conv_001", "conversation": [{"speaker": "Alice", "text": "Hello there!"}, {"speaker": "Bob", "text": "Hi, how are you?"}]}
```
Pairwise display
```yaml
- key: "comparison"
  type: "pairwise"
  label: "Compare Options"
  display_options:
    cell_width: "50%"                  # Width of each cell
    show_labels: true                  # Show A/B labels
    labels: ["Option A", "Option B"]
    vertical_on_mobile: true           # Stack vertically on mobile
```
Layout options
Control how multiple fields are arranged:
```yaml
instance_display:
  layout:
    direction: horizontal   # horizontal or vertical
    gap: 24px               # Space between fields
```
Span annotation support
Text-based display types such as text and dialogue can serve as targets for span annotation:
```yaml
instance_display:
  fields:
    - key: "document"
      type: "text"
      span_target: true   # Enable span annotation on this field

annotation_schemes:
  - annotation_type: span
    name: entities
    labels: [PERSON, LOCATION, ORG]
```
Multiple span targets
You can have more than one text field that supports span annotation:
```yaml
instance_display:
  fields:
    - key: "source_text"
      type: "text"
      label: "Source Document"
      span_target: true
    - key: "summary"
      type: "text"
      label: "Summary"
      span_target: true

annotation_schemes:
  - annotation_type: span
    name: factual_errors
    labels: [contradiction, unsupported, fabrication]
```
When multiple span targets are used, annotations are stored keyed by the field they belong to:
```json
{
  "factual_errors": {
    "source_text": [],
    "summary": [
      {"start": 45, "end": 67, "label": "unsupported"}
    ]
  }
}
```
Linking annotation schemes to display fields
For media annotation schemes (image_annotation, video_annotation, audio_annotation), use source_field to link the scheme to a display field:
```yaml
instance_display:
  fields:
    - key: "image_url"
      type: "image"

annotation_schemes:
  - annotation_type: image_annotation
    source_field: "image_url"   # Links to display field
    tools: [bbox]
    labels: [person, car]
```
Per-turn dialogue ratings
Add inline rating widgets to individual dialogue turns:
```yaml
instance_display:
  fields:
    - key: conversation
      type: dialogue
      label: "Conversation"
      display_options:
        show_turn_numbers: true
        per_turn_ratings:
          speakers: ["Agent"]            # Only show ratings for these speakers
          schema_name: "turn_quality"    # Name for the stored annotation data
          scheme:
            type: likert
            size: 5
            labels: ["Poor", "Excellent"]
```
Rating circles appear inline below each turn from a matching speaker. The circles fill up to the selected value, and all per-turn ratings are stored together as a single JSON object:
```json
{
  "turn_quality": "{\"0\": 4, \"2\": 5, \"4\": 3}"
}
```
Example: image classification
```yaml
annotation_task_name: "Image Classification"

data_files:
  - data/images.json

item_properties:
  id_key: id
  text_key: image_url

task_dir: .
output_annotation_dir: annotation_output

instance_display:
  fields:
    - key: image_url
      type: image
      label: "Image to Classify"
      display_options:
        max_width: 600
        zoomable: true
    - key: context
      type: text
      label: "Additional Context"
      display_options:
        collapsible: true

annotation_schemes:
  - annotation_type: radio
    name: category
    description: "What category best describes this image?"
    labels:
      - nature
      - urban
      - people
      - objects

user_config:
  allow_all_users: true
```
Example data file (data/images.json), in JSONL format:
```json
{"id": "img_001", "image_url": "https://example.com/image1.jpg", "context": "Taken in summer 2023"}
{"id": "img_002", "image_url": "https://example.com/image2.jpg", "context": "Winter landscape"}
```
Example: multimodal annotation
A video shown side by side with a transcript that supports span annotation:
```yaml
annotation_task_name: "Video Analysis"

instance_display:
  layout:
    direction: horizontal
    gap: 24px
  fields:
    - key: video_url
      type: video
      label: "Video"
      display_options:
        max_width: "45%"
    - key: transcript
      type: text
      label: "Transcript"
      span_target: true

annotation_schemes:
  - annotation_type: radio
    name: sentiment
    labels: [positive, neutral, negative]
  - annotation_type: span
    name: highlights
    labels:
      - key_point
      - question
      - supporting_evidence
```
Backward compatibility
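- Existing configurations without instance_display continue to work unchanged
- text_key in item_properties is still used as a fallback
- Legacy media detection through annotation schemes still works
- A deprecation warning appears in the logs when the old display-only pattern is used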
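
To find configurations that still rely on the display-only pattern from the last item, a heuristic check along these lines may help. It is only a sketch: it assumes the configuration is already loaded into a dict, and it treats any media annotation scheme with min_annotations: 0 as display-only, which may not match the tool's own detection logic. The function name find_display_only_schemes is hypothetical.

```python
# Media scheme types named in this document.
MEDIA_SCHEMES = {"image_annotation", "video_annotation", "audio_annotation"}


def find_display_only_schemes(config: dict) -> list[str]:
    """Return names of schemes that look like the legacy display-only pattern.

    Heuristic (an assumption, not the tool's actual rule): a media scheme
    whose min_annotations is 0 was probably added only to show content.
    """
    schemes = config.get("annotation_schemes", [])
    return [
        scheme.get("name", "<unnamed>")
        for scheme in schemes
        if scheme.get("annotation_type") in MEDIA_SCHEMES
        and scheme.get("min_annotations") == 0
    ]


# The deprecated workaround shown earlier, expressed as a dict.
legacy_config = {
    "annotation_schemes": [
        {
            "annotation_type": "image_annotation",
            "name": "image_display",
            "min_annotations": 0,
            "tools": ["bbox"],
            "labels": ["unused"],
        },
        {"annotation_type": "radio", "name": "category", "labels": ["A", "B", "C"]},
    ]
}

print(find_display_only_schemes(legacy_config))  # ['image_display']
```

Each name the check returns is a candidate for replacement with an instance_display field.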
Migrating from the display-only pattern
If you previously used an annotation scheme only to display content:
Before (deprecated):
```yaml
annotation_schemes:
  - annotation_type: image_annotation
    name: image_display
    min_annotations: 0
    tools: [bbox]
    labels: [unused]
```
After (recommended):
```yaml
instance_display:
  fields:
    - key: image_url
      type: image
```
Further reading
For implementation details, see the source documentation.