beginner, image

Visual Question Answering

Annotators answer a question about each image and rate their confidence, producing question-answer pairs for a visual question answering (VQA) dataset.

Labels: outdoor, nature, urban, people, animal

Configuration File: config.yaml

annotation_task_name: "Visual Question Answering"
task_name: "Visual Question Answering"
task_description: "Answer the question about the image."
task_dir: "."
port: 8000

data_files:
  - "sample-data.json"

item_properties:
  id_key: "id"
  text_key: "image_url"
  image_key: "image_url"
  context_key: "question"

annotation_schemes:
  - annotation_type: text
    name: answer
    description: "Provide a concise answer to the question"
    required: true

  - annotation_type: radio
    name: confidence
    description: "How confident are you in your answer?"
    labels:
      - "Very confident"
      - "Somewhat confident"
      - "Not confident"
    required: true

output_annotation_dir: "output/"
output_annotation_format: "json"
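
The two annotation schemes above can be sanity-checked before launching the server. The following is a minimal sketch, not Potato's own validation: it mirrors the `annotation_schemes` list as plain Python data and applies two assumed rules (scheme names are unique, and a `radio` scheme has at least one label). The helper name `check_schemes` is hypothetical.

```python
# Mirror of annotation_schemes from config.yaml, written as plain Python data.
SCHEMES = [
    {
        "annotation_type": "text",
        "name": "answer",
        "description": "Provide a concise answer to the question",
        "required": True,
    },
    {
        "annotation_type": "radio",
        "name": "confidence",
        "description": "How confident are you in your answer?",
        "labels": ["Very confident", "Somewhat confident", "Not confident"],
        "required": True,
    },
]


def check_schemes(schemes):
    """Return a list of problems; an empty list means no issue was found.

    The rules here are assumptions for illustration, not Potato's rules.
    """
    problems = []
    seen_names = set()
    for position, scheme in enumerate(schemes):
        name = scheme.get("name")
        if not name:
            problems.append(f"scheme {position}: missing name")
        elif name in seen_names:
            problems.append(f"scheme {position}: duplicate name '{name}'")
        else:
            seen_names.add(name)
        # Assumed rule: a radio scheme needs something to choose from.
        if scheme.get("annotation_type") == "radio" and not scheme.get("labels"):
            problems.append(f"scheme {position}: radio scheme has no labels")
    return problems


problems = check_schemes(SCHEMES)
print(problems if problems else "schemes look consistent")
```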

Sample Data: sample-data.json

[
  {
    "id": "1",
    "image_url": "https://images.unsplash.com/photo-1560807707-8cc77767d783?w=640",
    "question": "What color is the dog?"
  },
  {
    "id": "2",
    "image_url": "https://images.unsplash.com/photo-1449824913935-59a10b8d2000?w=640",
    "question": "What time of day does this appear to be?"
  }
]
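
When extending the sample data, it can help to confirm that every record carries the keys named under `item_properties` in config.yaml (`id`, `image_url`, `question`). The sketch below uses only the Python standard library; the function name `validate_items` is hypothetical, and the checks (non-empty string values, unique ids) are assumptions rather than requirements documented by Potato.

```python
import json

# Keys referenced by item_properties in config.yaml.
REQUIRED_KEYS = ("id", "image_url", "question")

# Inline copy of sample-data.json so the sketch runs on its own.
SAMPLE = """
[
  {
    "id": "1",
    "image_url": "https://images.unsplash.com/photo-1560807707-8cc77767d783?w=640",
    "question": "What color is the dog?"
  },
  {
    "id": "2",
    "image_url": "https://images.unsplash.com/photo-1449824913935-59a10b8d2000?w=640",
    "question": "What time of day does this appear to be?"
  }
]
"""


def validate_items(items):
    """Return a list of problems; an empty list means no issue was found."""
    if not isinstance(items, list):
        return ["top-level JSON value is not a list"]
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"item {index}: not a JSON object")
            continue
        # Each required key must hold a non-empty string.
        for key in REQUIRED_KEYS:
            value = item.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"item {index}: missing or empty '{key}'")
        # Assumed rule: ids should be unique across the file.
        item_id = item.get("id")
        if isinstance(item_id, str):
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id '{item_id}'")
            seen_ids.add(item_id)
    return problems


problems = validate_items(json.loads(SAMPLE))
print(problems if problems else "sample data looks consistent")
```

To check a file on disk instead, the inline string can be swapped for `json.load(open("sample-data.json"))`.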

Get This Design

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/evaluation/visual-qa
potato start config.yaml

Details

Annotation Types

radio, text

Domain

Computer Vision, NLP

Use Cases

Visual QA, Multimodal

Tags

vqa, visual-qa, multimodal, image
