advanced, text

Mind2Web: Web Agent Task Annotation

Web agent task annotation. Annotators describe web navigation tasks, identify target HTML elements for each action step, and label the action type (click, type, select) needed to complete the task.


Configuration file: config.yaml

# Mind2Web: Web Agent Task Annotation
# Based on "Mind2Web: Towards a Generalist Agent for the Web" (Deng et al., NeurIPS 2023)
# Task: Annotate web navigation actions - identify target elements and action types

annotation_task_name: "Mind2Web Web Agent Task Annotation"
task_dir: "."

# Data configuration
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"

# Output
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

# Display layout showing task description, URL, and HTML snippet
html_layout: |
  <div class="mind2web-container">
    <div class="task-section" style="background: #e8f5e9; padding: 15px; border-radius: 8px; margin-bottom: 15px;">
      <h3 style="margin-top: 0;">Task Description:</h3>
      <div class="task-text" style="font-size: 16px; font-weight: bold;">{{text}}</div>
    </div>
    <div class="url-section" style="background: #e8eaf6; padding: 10px; border-radius: 8px; margin-bottom: 15px;">
      <strong>Website URL:</strong> {{url}}
    </div>
    <div class="action-section" style="background: #fff3e0; padding: 10px; border-radius: 8px; margin-bottom: 15px;">
      <strong>Action Sequence Context:</strong> {{action_sequence}}
    </div>
    <div class="html-section" style="background: #f5f5f5; padding: 15px; border-radius: 8px; border: 2px solid #424242;">
      <h3 style="margin-top: 0;">HTML Snippet (identify the target element):</h3>
      <pre style="white-space: pre-wrap; font-family: monospace; font-size: 13px; line-height: 1.5; overflow-x: auto;">{{html_snippet}}</pre>
    </div>
  </div>

# Annotation schemes
annotation_schemes:
  # Span annotation for target element identification in HTML
  - name: "target_element"
    description: "Highlight the target HTML element that the agent should interact with to perform the action."
    annotation_type: span
    labels:
      - "Target Element"
      - "Secondary Element"
      - "Context Element"
    label_colors:
      "Target Element": "#f44336"
      "Secondary Element": "#ff9800"
      "Context Element": "#9e9e9e"

  # Action type
  - name: "action_type"
    description: "What type of action should the agent perform on the target element?"
    annotation_type: radio
    labels:
      - "Click"
      - "Type"
      - "Select (dropdown)"
      - "Scroll"
      - "Navigate"
      - "Hover"
    keyboard_shortcuts:
      "Click": "1"
      "Type": "2"
      "Select (dropdown)": "3"
      "Scroll": "4"
      "Navigate": "5"
      "Hover": "6"

  # Action parameter (e.g., text to type, option to select)
  - name: "action_value"
    description: "If the action is Type or Select, what value should be entered or selected? Leave blank for Click/Scroll/Navigate/Hover."
    annotation_type: text
    required: false
    placeholder: "e.g., 'New York' for a search field, or 'Economy' for a dropdown"

  # Step correctness
  - name: "step_completeness"
    description: "Does this action step make progress toward completing the task?"
    annotation_type: radio
    labels:
      - "Yes - directly advances the task"
      - "Partially - indirect but useful"
      - "No - does not help complete the task"
    keyboard_shortcuts:
      "Yes - directly advances the task": "a"
      "Partially - indirect but useful": "s"
      "No - does not help complete the task": "d"

# User configuration
allow_all_users: true

# Task assignment
instances_per_annotator: 100
annotation_per_instance: 2
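
The `action_value` description implies a consistency rule between two schemes: a value is expected for Type and Select (dropdown), and the field should stay blank for the other action types. A minimal post-processing sketch in Python is below. The flat record shape (`action_type` and `action_value` keys) and the function name `check_action_value` are assumptions made for illustration, and Hover is treated as taking no value, which the description does not state explicitly. Potato's actual output layout may differ, so adapt the field access to the files written under `annotation_output/`.

```python
from typing import Optional

# Labels copied from the action_type scheme in config.yaml.
VALUE_ACTIONS = {"Type", "Select (dropdown)"}
# Assumption: Hover takes no value, like Click/Scroll/Navigate.
NO_VALUE_ACTIONS = {"Click", "Scroll", "Navigate", "Hover"}


def check_action_value(record: dict) -> Optional[str]:
    """Return a problem description, or None if the record is consistent.

    `record` is a hypothetical flattened annotation, not Potato's native format.
    """
    action = record.get("action_type")
    value = (record.get("action_value") or "").strip()

    if action in VALUE_ACTIONS and not value:
        return f"{action!r} needs a value"
    if action in NO_VALUE_ACTIONS and value:
        return f"{action!r} should have no value, got {value!r}"
    if action not in VALUE_ACTIONS | NO_VALUE_ACTIONS:
        return f"unknown action type: {action!r}"
    return None


if __name__ == "__main__":
    examples = [
        {"action_type": "Click", "action_value": ""},
        {"action_type": "Type", "action_value": "New York"},
        {"action_type": "Select (dropdown)", "action_value": ""},
    ]
    for rec in examples:
        print(rec["action_type"], "->", check_action_value(rec) or "ok")
```

A check like this could run over exported annotations before computing agreement, so that empty Type or Select values are caught early.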

Sample data: sample-data.json

[
  {
    "id": "m2w_001",
    "text": "Book a one-way flight from New York to Los Angeles for December 15th",
    "html_snippet": "<div class=\"search-form\">\n  <div class=\"trip-type\">\n    <label><input type=\"radio\" name=\"trip\" value=\"roundtrip\" checked> Round Trip</label>\n    <label><input type=\"radio\" name=\"trip\" value=\"oneway\"> One Way</label>\n    <label><input type=\"radio\" name=\"trip\" value=\"multi\"> Multi-City</label>\n  </div>\n  <div class=\"search-fields\">\n    <input type=\"text\" id=\"origin\" placeholder=\"From\" value=\"\">\n    <input type=\"text\" id=\"destination\" placeholder=\"To\" value=\"\">\n    <input type=\"date\" id=\"depart-date\" placeholder=\"Departure Date\">\n    <button class=\"search-btn\">Search Flights</button>\n  </div>\n</div>",
    "url": "https://www.example-airline.com/flights",
    "action_sequence": "Step 1 of 4: Select trip type"
  },
  {
    "id": "m2w_002",
    "text": "Find a hotel in San Francisco with a pool for January 5-8 under $200/night",
    "html_snippet": "<div class=\"hotel-search\">\n  <input type=\"text\" id=\"location\" placeholder=\"Where are you going?\" class=\"location-input\">\n  <div class=\"date-picker\">\n    <input type=\"date\" id=\"checkin\" placeholder=\"Check-in\">\n    <input type=\"date\" id=\"checkout\" placeholder=\"Check-out\">\n  </div>\n  <div class=\"guests\">\n    <select id=\"rooms\">\n      <option value=\"1\">1 Room</option>\n      <option value=\"2\">2 Rooms</option>\n    </select>\n    <select id=\"adults\">\n      <option value=\"1\">1 Adult</option>\n      <option value=\"2\">2 Adults</option>\n    </select>\n  </div>\n  <button id=\"search-hotels\" class=\"btn-primary\">Search Hotels</button>\n</div>",
    "url": "https://www.example-hotels.com/search",
    "action_sequence": "Step 1 of 5: Enter destination"
  }
]

// ... and 8 more items
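
Each item has to supply every `{{placeholder}}` that `html_layout` references, plus the `id` key named in `item_properties`. A small Python sketch of that check follows. The helper names `placeholders` and `missing_fields` are hypothetical, the layout string is an abbreviated stand-in, and the simple `{{name}}` regex is an assumption about the template syntax rather than Potato's own parser.

```python
from __future__ import annotations

import re

# Abbreviated stand-in for the html_layout value in config.yaml.
LAYOUT = """
<div>{{text}}</div>
<div>{{url}}</div>
<div>{{action_sequence}}</div>
<pre>{{html_snippet}}</pre>
"""


def placeholders(layout: str) -> set[str]:
    """Collect the names that appear inside {{...}} markers."""
    return set(re.findall(r"\{\{\s*(\w+)\s*\}\}", layout))


def missing_fields(item: dict, layout: str = LAYOUT, id_key: str = "id") -> list[str]:
    """Return the sorted field names the layout needs but the item lacks."""
    required = placeholders(layout) | {id_key}
    return sorted(required - set(item))


if __name__ == "__main__":
    item = {
        "id": "m2w_001",
        "text": "Book a one-way flight from New York to Los Angeles for December 15th",
        "url": "https://www.example-airline.com/flights",
        "action_sequence": "Step 1 of 4: Select trip type",
    }
    # html_snippet is deliberately left out to show a failing item.
    print(missing_fields(item))  # ['html_snippet']
```

To run it against the real file, load `sample-data.json` with `json.load` and call `missing_fields` on each item.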

Get this design

View on GitHub

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/agentic/mind2web-web-agent-tasks
potato start config.yaml

Details

Annotation types

span, text, radio

Domain

Web Agents, HCI, Automation

Use cases

Web Navigation, Task Grounding, Agent Training

Tags

web-agent, mind2web, html, action-annotation, grounding, navigation

Found an issue or want to improve this design?

Open an issue