beginner, image
Visual Question Answering
Answer questions about images for VQA dataset creation.
Configuration file: config.yaml
annotation_task_name: "Visual Question Answering"
task_name: "Visual Question Answering"
task_description: "Answer the question about the image."
task_dir: "."
port: 8000
data_files:
  - "sample-data.json"
item_properties:
  id_key: "id"
  text_key: "image_url"
  image_key: "image_url"
  context_key: "question"
annotation_schemes:
  - annotation_type: text
    name: answer
    description: "Provide a concise answer to the question"
    required: true
  - annotation_type: radio
    name: confidence
    description: "How confident are you in your answer?"
    labels:
      - "Very confident"
      - "Somewhat confident"
      - "Not confident"
    required: true
output_annotation_dir: "output/"
output_annotation_format: "json"
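The item_properties block names the fields that each data item is expected to carry (an ID, an image URL, and the question shown as context). Before launching the task, it can help to confirm that every item in the data file actually has those fields. The following is a minimal sketch of such a check, not part of Potato itself: the helper names check_items and check_file are hypothetical, the ITEM_PROPERTIES dictionary is copied by hand from the config above rather than parsed from YAML, and the data file is assumed to be a JSON list of objects.

```python
import json

# Hand-copied from the item_properties block of config.yaml (assumption:
# kept in sync manually; this sketch does not parse the YAML file).
ITEM_PROPERTIES = {
    "id_key": "id",
    "text_key": "image_url",
    "image_key": "image_url",
    "context_key": "question",
}


def check_items(items, item_properties=ITEM_PROPERTIES):
    """Return a list of problems found; an empty list means no issues."""
    problems = []
    seen_ids = set()
    # Every field name referenced by the config must exist in each item.
    required_fields = sorted(set(item_properties.values()))
    id_field = item_properties["id_key"]
    for index, item in enumerate(items):
        for field in required_fields:
            if field not in item:
                problems.append(f"item {index}: missing field '{field}'")
        item_id = item.get(id_field)
        # Only report duplicates for items that actually have an ID.
        if item_id is not None:
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id '{item_id}'")
            seen_ids.add(item_id)
    return problems


def check_file(path, item_properties=ITEM_PROPERTIES):
    """Load a JSON list of items from path and run check_items on it."""
    with open(path, encoding="utf-8") as handle:
        return check_items(json.load(handle), item_properties)
```

Calling check_file("sample-data.json") from the task directory returns the list of problems, so an empty list suggests the data matches the field names used in the config.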
Sample data: sample-data.json
[
  {
    "id": "1",
    "image_url": "https://images.unsplash.com/photo-1560807707-8cc77767d783?w=640",
    "question": "What color is the dog?"
  },
  {
    "id": "2",
    "image_url": "https://images.unsplash.com/photo-1449824913935-59a10b8d2000?w=640",
    "question": "What time of day does this appear to be?"
  }
]
Get this design
View on GitHub
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/evaluation/visual-qa
potato start config.yaml
Details
Annotation types
radio, text
Domain
Computer Vision, NLP
Use cases
Visual QA, Multimodal
Tags
vqa, visual-qa, multimodal, image
Found a problem or want to improve this design?
Open an issue
Related designs
TextVQA - Reading Text in Images
Visual question answering that requires reading and reasoning about text present in images. Based on the TextVQA dataset (Singh et al., CVPR 2019), annotators answer questions about images where understanding scene text (signs, labels, menus, etc.) is essential.
text, radio
CUB-200-2011 Fine-Grained Bird Classification
Fine-grained visual categorization of 200 bird species (Wah et al., 2011). Annotate bird images with species labels, part locations, and attribute annotations.
multiselect, radio
EPIC-KITCHENS Egocentric Action Annotation
Annotate fine-grained actions in egocentric kitchen videos with verb-noun pairs. Identify cooking actions from a first-person perspective.
radio, text