Annotation Schemes

Define what your annotators will label and how.

Annotation schemes define the labeling tasks for your annotators. Potato supports more than 30 annotation types, which can be combined to build complex annotation tasks.

Basic Structure

Each scheme is defined as an entry in the `annotation_schemes` array:

yaml

annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment?"
    labels:
      - Positive
      - Negative
      - Neutral

Required Fields

Field	Description
`annotation_type`	Annotation type (for example radio, multiselect, likert, span, text, number, slider, multirate)
`name`	Internal identifier (no spaces; used as the key in the output)
`description`	Instructions shown to annotators
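
As a quick sanity check before launching a task, the three required fields can be verified on a parsed configuration. This is a minimal sketch, not part of Potato: it assumes the `annotation_schemes` list has already been loaded into plain Python dictionaries, and the helper names `missing_required_fields` and `name_has_whitespace` are hypothetical.

```python
# Hypothetical pre-flight check; not a Potato API.
REQUIRED_FIELDS = ("annotation_type", "name", "description")


def missing_required_fields(scheme):
    """Return the required keys that are absent or empty in one scheme dict."""
    return [key for key in REQUIRED_FIELDS if not scheme.get(key)]


def name_has_whitespace(scheme):
    """True if the scheme's name contains any whitespace character."""
    return any(ch.isspace() for ch in str(scheme.get("name", "")))


schemes = [
    {
        "annotation_type": "radio",
        "name": "sentiment",
        "description": "What is the sentiment?",
        "labels": ["Positive", "Negative", "Neutral"],
    },
    # Deliberately broken: no description, and a space in the name.
    {"annotation_type": "radio", "name": "overall sentiment"},
]

for scheme in schemes:
    print(scheme["name"], missing_required_fields(scheme), name_has_whitespace(scheme))
```

Running a check like this on every scheme catches a forgotten `description` or a `name` with spaces before annotators ever see the task.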

Supported Annotation Types

1. Radio (Single Choice)

Select exactly one option from a list:

yaml

- annotation_type: radio
  name: sentiment
  description: "What is the sentiment of this text?"
  labels:
    - Positive
    - Negative
    - Neutral
 
  # Optional features
  keyboard_shortcuts:
    Positive: "1"
    Negative: "2"
    Neutral: "3"
 
  # Or, instead of keyboard_shortcuts, use sequential binding (1, 2, 3... automatically)
  sequential_key_binding: true
 
  # Horizontal layout instead of vertical
  horizontal: true

2. Likert Scale

Rating scales with labeled endpoints:

yaml

- annotation_type: likert
  name: agreement
  description: "How much do you agree with this statement?"
  size: 5  # Number of scale points
  min_label: "Strongly Disagree"
  max_label: "Strongly Agree"
 
  # Optional mid-point label
  mid_label: "Neutral"
 
  # Show numeric values
  show_numbers: true

3. Multiselect (Multiple Choice)

Select multiple options from a list:

yaml

- annotation_type: multiselect
  name: topics
  description: "Select all relevant topics"
  labels:
    - Politics
    - Technology
    - Sports
    - Entertainment
    - Science
 
  # Selection constraints
  min_selections: 1
  max_selections: 3
 
  # Allow free text response
  free_response: true
  free_response_label: "Other (specify)"
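
To make the intent of `min_selections` and `max_selections` concrete, the sketch below applies the same bounds to an already collected answer. It is an illustration only: how Potato enforces these limits in the interface is not shown in this page, and `selection_count_ok` is a hypothetical helper for auditing exported annotations.

```python
# Hypothetical audit helper; mirrors the min/max idea from the config above.
def selection_count_ok(selected, min_selections=None, max_selections=None):
    """Check that the number of selected labels falls within the optional bounds."""
    count = len(selected)
    if min_selections is not None and count < min_selections:
        return False
    if max_selections is not None and count > max_selections:
        return False
    return True


print(selection_count_ok(["Technology", "Science"], min_selections=1, max_selections=3))
print(selection_count_ok([], min_selections=1, max_selections=3))
```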

4. Span Annotation

Highlight and label spans of text:

yaml

- annotation_type: span
  name: entities
  description: "Highlight named entities in the text"
  labels:
    - PERSON
    - ORGANIZATION
    - LOCATION
    - DATE
 
  # Visual customization
  label_colors:
    PERSON: "#3b82f6"
    ORGANIZATION: "#10b981"
    LOCATION: "#f59e0b"
    DATE: "#8b5cf6"
 
  # Allow overlapping spans
  allow_overlapping: false
 
  # Keyboard shortcuts for labels
  sequential_key_binding: true

5. Slider

Continuous numeric range:

yaml

- annotation_type: slider
  name: confidence
  description: "How confident are you in your answer?"
  min: 0
  max: 100
  step: 1
  default: 50
 
  # Endpoint labels
  min_label: "Not confident"
  max_label: "Very confident"
 
  # Show current value
  show_value: true

6. Text Input

Free-text responses:

yaml

- annotation_type: text
  name: explanation
  description: "Explain your reasoning"
 
  # Multi-line input
  textarea: true
 
  # Character limits
  min_length: 10
  max_length: 500
 
  # Placeholder text
  placeholder: "Enter your explanation here..."
 
  # Disable paste (for transcription tasks)
  disable_paste: true

7. Numeric Input

Numeric input with constraints:

yaml

- annotation_type: number
  name: count
  description: "How many entities are mentioned?"
  min: 0
  max: 100
  step: 1
  default: 0

8. Multirate (Matrix Rating)

Rate multiple items on the same scale:

yaml

- annotation_type: multirate
  name: quality_aspects
  description: "Rate each aspect of the response"
  items:
    - Accuracy
    - Clarity
    - Completeness
    - Relevance
  size: 5  # Scale points
  min_label: "Poor"
  max_label: "Excellent"
 
  # Randomize item order
  randomize: true
 
  # Layout options
  compact: false

Common Options

Keyboard Shortcuts

Speed up annotation with key bindings:

yaml

# Manual shortcuts
keyboard_shortcuts:
  Positive: "1"
  Negative: "2"
  Neutral: "3"
 
# Or automatic sequential binding
sequential_key_binding: true  # Assigns 1, 2, 3...
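
The two styles can be compared in a few lines of Python. This is a sketch under stated assumptions: `sequential_bindings` only mimics the "1, 2, 3..." assignment described in the comment above (what happens past nine labels is not stated here, so the sketch simply keeps counting), and `conflicting_keys` is a hypothetical helper for spotting a key assigned twice in a manual mapping.

```python
# Illustrative helpers; neither function is part of Potato.
def sequential_bindings(labels):
    """Assign "1", "2", "3", ... to labels in their listed order."""
    return {label: str(position) for position, label in enumerate(labels, start=1)}


def conflicting_keys(shortcuts):
    """Return keys that a manual shortcut mapping assigns to more than one label."""
    labels_by_key = {}
    for label, key in shortcuts.items():
        labels_by_key.setdefault(key, []).append(label)
    return {key: labels for key, labels in labels_by_key.items() if len(labels) > 1}


print(sequential_bindings(["Positive", "Negative", "Neutral"]))
print(conflicting_keys({"Positive": "1", "Negative": "1", "Neutral": "3"}))
```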

Tooltips

Provide hints when hovering over labels:

yaml

tooltips:
  Positive: "Expresses happiness, approval, or satisfaction"
  Negative: "Expresses sadness, anger, or disappointment"
  Neutral: "No clear emotional content"

Label Colors

Custom colors for visual distinction:

yaml

label_colors:
  PERSON: "#3b82f6"
  LOCATION: "#10b981"
  ORGANIZATION: "#f59e0b"

Required Fields

Require a scheme to be completed before submission:

yaml

- annotation_type: radio
  name: sentiment
  description: "What is the sentiment?"
  required: true
  labels:
    - Positive
    - Negative

Multiple Schemes

Combine multiple annotation types per instance:

yaml

annotation_schemes:
  # Primary classification
  - annotation_type: radio
    name: sentiment
    description: "Overall sentiment"
    labels:
      - Positive
      - Negative
      - Neutral
    required: true
    sequential_key_binding: true
 
  # Confidence rating
  - annotation_type: likert
    name: confidence
    description: "How confident are you?"
    size: 5
    min_label: "Guessing"
    max_label: "Certain"
 
  # Topic tags
  - annotation_type: multiselect
    name: topics
    description: "Select all relevant topics"
    labels:
      - Politics
      - Technology
      - Sports
      - Entertainment
    free_response: true
 
  # Notes
  - annotation_type: text
    name: notes
    description: "Any additional observations?"
    textarea: true
    required: false
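
Because each scheme's `name` becomes a key in the output, two schemes sharing a name would be ambiguous in the saved annotations. The sketch below flags such collisions; it assumes the schemes are already loaded as Python dictionaries, and `duplicate_scheme_names` is a hypothetical helper rather than anything Potato provides.

```python
from collections import Counter


# Hypothetical check; not a Potato API.
def duplicate_scheme_names(schemes):
    """Return scheme names that appear more than once, sorted alphabetically."""
    counts = Counter(scheme["name"] for scheme in schemes)
    return sorted(name for name, seen in counts.items() if seen > 1)


combined = [
    {"annotation_type": "radio", "name": "sentiment"},
    {"annotation_type": "likert", "name": "confidence"},
    {"annotation_type": "multiselect", "name": "topics"},
    {"annotation_type": "text", "name": "notes"},
]
print(duplicate_scheme_names(combined))

# Adding a slider that reuses "confidence" creates a collision.
print(duplicate_scheme_names(combined + [{"annotation_type": "slider", "name": "confidence"}]))
```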

Advanced Features

Pairwise Comparison

Compare two items:

yaml

- annotation_type: pairwise
  name: preference
  description: "Which response is better?"
  options:
    - label: "Response A"
      value: "A"
    - label: "Response B"
      value: "B"
    - label: "Equal"
      value: "tie"
 
  # Allow tie selection
  allow_tie: true

Best-Worst Scaling

Rank items by selecting the best and the worst:

yaml

- annotation_type: best_worst
  name: ranking
  description: "Select the best and worst items"
  # Items come from the data file

Dropdown Selection

Space-efficient single selection:

yaml

- annotation_type: select
  name: category
  description: "Select a category"
  labels:
    - Category A
    - Category B
    - Category C
    - Category D
    - Category E
 
  # Default selection
  default: "Category A"

Data Format Reference

Input

Annotation schemes work with your data format:

json

{
  "id": "doc_1",
  "text": "This is the text to annotate."
}

Output

Annotations are saved with the scheme names as keys:

json

{
  "id": "doc_1",
  "annotations": {
    "sentiment": "Positive",
    "confidence": 4,
    "topics": ["Technology", "Science"],
    "entities": [
      {"start": 0, "end": 4, "label": "ORGANIZATION", "text": "This"}
    ],
    "notes": "Clear positive sentiment about technology."
  }
}
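
A record in this shape can be cross-checked against its input with the standard `json` module. The sketch below assumes that `start` and `end` are character offsets into the input `text`, with `end` exclusive, which is consistent with the example above (`0` to `4` yields "This") but is not stated explicitly; `mismatched_spans` is a hypothetical helper.

```python
import json

input_record = json.loads('{"id": "doc_1", "text": "This is the text to annotate."}')

output_record = json.loads("""
{
  "id": "doc_1",
  "annotations": {
    "sentiment": "Positive",
    "entities": [
      {"start": 0, "end": 4, "label": "ORGANIZATION", "text": "This"}
    ]
  }
}
""")


def mismatched_spans(text, spans):
    """Return spans whose stored text differs from text[start:end] (end exclusive, assumed)."""
    return [span for span in spans if text[span["start"]:span["end"]] != span["text"]]


# Annotations are looked up by scheme name ("sentiment", "entities").
annotations = output_record["annotations"]
print(annotations["sentiment"])
print(mismatched_spans(input_record["text"], annotations["entities"]))
```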

Best Practices

1. Clear Labels

Use unambiguous, distinct labels:

yaml

# Good
labels:
  - Strongly Positive
  - Somewhat Positive
  - Neutral
  - Somewhat Negative
  - Strongly Negative
 
# Avoid
labels:
  - Good
  - OK
  - Fine
  - Acceptable

2. Helpful Tooltips

Add tooltips for nuanced labels:

yaml

tooltips:
  Sarcasm: "The text says the opposite of what it means"
  Irony: "A mismatch between expectation and reality"

3. Keyboard Shortcuts

Enable shortcuts for high-volume tasks:

yaml

sequential_key_binding: true

4. Logical Order

Order labels consistently:

Most common first
Alphabetically
By intensity (lowest to highest)

5. Limit Options

Too many options slow down annotation:

Radio: 2-7 options
Multiselect: 5-15 options
Likert: 5-7 points
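
These ranges are guidelines, not hard limits, but they are easy to check mechanically. The following sketch flags schemes that fall outside them; it assumes schemes are plain dictionaries, counts `labels` for radio and multiselect and `size` for likert, and uses hypothetical helper names.

```python
# Recommended (low, high) option counts from the guideline above.
RECOMMENDED = {"radio": (2, 7), "multiselect": (5, 15), "likert": (5, 7)}


def option_count(scheme):
    """Number of choices: scale size for likert, number of labels otherwise."""
    if scheme["annotation_type"] == "likert":
        return scheme.get("size")
    return len(scheme.get("labels", []))


def outside_recommended_range(scheme):
    """True if the scheme's type has a guideline and its option count violates it."""
    bounds = RECOMMENDED.get(scheme["annotation_type"])
    count = option_count(scheme)
    if bounds is None or count is None:
        return False
    low, high = bounds
    return not (low <= count <= high)


radio = {"annotation_type": "radio", "labels": ["Positive", "Negative", "Neutral"]}
wide_likert = {"annotation_type": "likert", "size": 9}
print(outside_recommended_range(radio), outside_recommended_range(wide_likert))
```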

6. Test First

Annotate several examples yourself before deployment to catch:

Ambiguous labels
Missing categories
Unclear instructions