beginner, image

Visual Question Answering

Answer questions about images for VQA dataset creation.

Labels: outdoor, nature, urban, people, animal

Configuration file: config.yaml

annotation_task_name: "Visual Question Answering"
task_name: "Visual Question Answering"
task_description: "Answer the question about the image."
task_dir: "."
port: 8000

data_files:
  - "sample-data.json"

item_properties:
  id_key: "id"
  text_key: "image_url"
  image_key: "image_url"
  context_key: "question"

annotation_schemes:
  - annotation_type: text
    name: answer
    description: "Provide a concise answer to the question"
    required: true

  - annotation_type: radio
    name: confidence
    description: "How confident are you in your answer?"
    labels:
      - "Very confident"
      - "Somewhat confident"
      - "Not confident"
    required: true

output_annotation_dir: "output/"
output_annotation_format: "json"
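
The two annotation schemes above imply simple constraints on each response: a non-empty free-text answer and exactly one of three confidence labels. The sketch below illustrates those constraints on a hypothetical response dictionary. The function name `validate_response` and the response shape are assumptions for illustration only, not Potato's internal API or its output format.

```python
# Labels copied from the "confidence" radio scheme in config.yaml.
CONFIDENCE_LABELS = ("Very confident", "Somewhat confident", "Not confident")


def validate_response(response):
    """Check one hypothetical response dict against both schemes.

    Returns a list of error strings; an empty list means the response
    satisfies the `required: true` and label constraints.
    """
    errors = []

    # "answer" is a required free-text field, so blank strings are rejected.
    answer = response.get("answer")
    if not isinstance(answer, str) or not answer.strip():
        errors.append("answer is required")

    # "confidence" is a required radio field limited to the configured labels.
    confidence = response.get("confidence")
    if confidence is None:
        errors.append("confidence is required")
    elif confidence not in CONFIDENCE_LABELS:
        errors.append("confidence must be one of the configured labels")

    return errors


print(validate_response({"answer": "brown", "confidence": "Very confident"}))
```

A check like this could be run over exported annotations before building a VQA dataset, once the actual structure of the files in output/ has been confirmed.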

Sample data: sample-data.json

[
  {
    "id": "1",
    "image_url": "https://images.unsplash.com/photo-1560807707-8cc77767d783?w=640",
    "question": "What color is the dog?"
  },
  {
    "id": "2",
    "image_url": "https://images.unsplash.com/photo-1449824913935-59a10b8d2000?w=640",
    "question": "What time of day does this appear to be?"
  }
]
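
When extending the sample data with more images, it can help to check each item before starting the server. The following is a minimal sketch, assuming every item needs the three fields named in item_properties (id, image_url, question), unique ids, and an http(s) image URL. The helper `check_items` is hypothetical and not part of Potato.

```python
import json
from urllib.parse import urlparse

# Field names taken from item_properties in config.yaml.
REQUIRED_KEYS = ("id", "image_url", "question")


def check_items(items):
    """Return a list of problems found in the items; empty means none found."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        # Every configured key must be present and non-blank.
        for key in REQUIRED_KEYS:
            value = item.get(key)
            if value is None or not str(value).strip():
                problems.append(f"item {index}: missing or empty '{key}'")

        # Ids are assumed to be unique across the file.
        item_id = item.get("id")
        if item_id is not None:
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id {item_id!r}")
            seen_ids.add(item_id)

        # Only inspect the URL when one was actually supplied.
        image_url = item.get("image_url")
        if isinstance(image_url, str) and image_url.strip():
            parsed = urlparse(image_url)
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                problems.append(f"item {index}: image_url is not an http(s) URL")
    return problems


# The two items from sample-data.json, inlined so the sketch runs on its own.
SAMPLE = json.loads("""
[
  {"id": "1",
   "image_url": "https://images.unsplash.com/photo-1560807707-8cc77767d783?w=640",
   "question": "What color is the dog?"},
  {"id": "2",
   "image_url": "https://images.unsplash.com/photo-1449824913935-59a10b8d2000?w=640",
   "question": "What time of day does this appear to be?"}
]
""")

print(check_items(SAMPLE))
```

In practice the inlined list would be replaced by `json.load` on the real sample-data.json file.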

Get this design

View on GitHub

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/evaluation/visual-qa
potato start config.yaml

Details

Annotation types

radio, text

Domain

Computer Vision, NLP

Use cases

Visual QA, Multimodal

Tags

vqa, visual-qa, multimodal, image

Found a problem or want to improve this design?

Open an issue