RT-2 - Robotic Action Annotation
Robotic manipulation task evaluation and action segmentation based on RT-2 (Brohan et al., CoRL 2023). Annotators evaluate task success, describe actions, rate execution quality, and segment video into action phases.
Configuration File: config.yaml
# RT-2 - Robotic Action Annotation
# Based on Brohan et al., CoRL 2023
# Paper: https://arxiv.org/abs/2307.15818
# Dataset: https://robotics-transformer2.github.io/
#
# This task evaluates robotic manipulation episodes from the RT-2 benchmark.
# Annotators watch a video of a robot performing a task, evaluate the success
# of the execution, describe the actions taken, rate overall quality, and
# segment the video into distinct action phases.
#
# Task Success:
# - Success: The robot completed the task as instructed
# - Partial Success: The robot made progress but did not fully complete the task
# - Failure: The robot failed to make meaningful progress on the task
#
# Action Phases (Video Annotation):
# - Reaching: Robot arm moving toward the target object
# - Grasping: Robot closing gripper on the object
# - Placing: Robot positioning the object at the target location
# - Moving: Robot transporting the object through space
# - Idle: Robot stationary or resetting
#
# Annotation Guidelines:
# 1. Read the task instruction
# 2. Watch the video carefully
# 3. Evaluate task success
# 4. Describe the actions taken by the robot
# 5. Rate the execution quality
# 6. Segment the video into action phases
annotation_task_name: "RT-2 - Robotic Action Annotation"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
port: 8000
server_name: localhost
annotation_schemes:
  - annotation_type: radio
    name: task_success
    description: "Did the robot successfully complete the instructed task?"
    labels:
      - "Success"
      - "Partial Success"
      - "Failure"
    keyboard_shortcuts:
      "Success": "1"
      "Partial Success": "2"
      "Failure": "3"
    tooltips:
      "Success": "The robot fully completed the task as described in the instruction"
      "Partial Success": "The robot made progress but did not fully complete the task"
      "Failure": "The robot failed to make meaningful progress on the task"
  - annotation_type: text
    name: action_description
    description: "Describe the sequence of actions the robot performed"
  - annotation_type: likert
    name: execution_quality
    description: "Rate the overall quality of the robot's execution"
    min_label: "Very Poor"
    max_label: "Excellent"
    size: 5
  - annotation_type: video_annotation
    name: action_phases
    description: "Segment the video into distinct action phases"
    mode: "segment"
    labels:
      - "Reaching"
      - "Grasping"
      - "Placing"
      - "Moving"
      - "Idle"
annotation_instructions: |
  You will be shown a task instruction and a video of a robot attempting to complete it.
  1. Read the task instruction carefully.
  2. Watch the full video of the robot's execution.
  3. Judge whether the task was a Success, Partial Success, or Failure.
  4. Describe the sequence of actions the robot performed.
  5. Rate the overall execution quality on a 5-point scale.
  6. Segment the video into action phases: Reaching, Grasping, Placing, Moving, or Idle.
html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #fefce8; border: 1px solid #fde68a; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #a16207;">Task Instruction:</strong>
      <p style="font-size: 16px; line-height: 1.6; margin: 8px 0 0 0;">{{text}}</p>
    </div>
    <div style="background: #1e293b; border-radius: 8px; padding: 16px; margin-bottom: 16px; text-align: center;">
      <video controls style="max-width: 100%; border-radius: 4px;">
        <source src="{{video_url}}" type="video/mp4">
        Your browser does not support the video tag.
      </video>
    </div>
  </div>
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
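The action_phases scheme above asks annotators to cut each video into labeled segments. A minimal Python sketch of a sanity check on such segments follows. The Segment record (start and end times in seconds) is hypothetical and is not Potato's actual output schema, and the no-overlap rule is an assumption for illustration, not something the config enforces.

```python
from dataclasses import dataclass

# Labels copied from the action_phases scheme in config.yaml.
PHASE_LABELS = {"Reaching", "Grasping", "Placing", "Moving", "Idle"}


@dataclass(frozen=True)
class Segment:
    """Hypothetical segment record; not Potato's real output format."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    label: str


def check_segments(segments):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    ordered = sorted(segments, key=lambda s: s.start)
    for seg in ordered:
        if seg.label not in PHASE_LABELS:
            problems.append(f"unknown label: {seg.label}")
        if seg.end <= seg.start:
            problems.append(f"empty or reversed segment: {seg.start}-{seg.end}")
    # Assumption: consecutive phases may touch but should not overlap.
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start < prev.end:
            problems.append(f"overlap: {prev.label} and {cur.label}")
    return problems
```

A check like this could run over exported annotations before analysis, for example `check_segments([Segment(0.0, 1.5, "Reaching"), Segment(1.5, 2.0, "Grasping")])`.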
Sample Data: sample-data.json
[
  {
    "id": "rt2_001",
    "text": "Pick up the red apple from the table and place it in the bowl.",
    "video_url": "videos/robot_episode_001.mp4"
  },
  {
    "id": "rt2_002",
    "text": "Move the blue cup to the left side of the counter.",
    "video_url": "videos/robot_episode_002.mp4"
  }
]
// ... and 8 more items
Get This Design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/multimodal/rt2-robotic-action-annotation
potato start config.yaml
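Before starting the server, it can help to confirm that every item in the data file carries the fields the config and layout refer to: "id" and "text" (from item_properties) and "video_url" (from the html_layout template). The following is a rough sketch only; the check_items helper is hypothetical and not part of Potato.

```python
import json
from pathlib import Path

# Keys referenced by item_properties and by the {{video_url}} placeholder.
REQUIRED_KEYS = ("id", "text", "video_url")


def check_items(items):
    """Return a list of problems found in a list of item dicts."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        for key in REQUIRED_KEYS:
            if not item.get(key):
                problems.append(f"item {index}: missing {key}")
        item_id = item.get("id")
        if item_id is not None:
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id {item_id}")
            seen_ids.add(item_id)
    return problems


if __name__ == "__main__":
    # Assumes the script runs from the directory that holds sample-data.json.
    path = Path("sample-data.json")
    if path.exists():
        issues = check_items(json.loads(path.read_text(encoding="utf-8")))
        print("\n".join(issues) if issues else "ok")
```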