beginner, image

Image Captioning Evaluation

Rate AI-generated image captions for accuracy, level of detail, and hallucinated content.

image annotation

Configuration File: config.yaml

task_name: "Image Captioning Evaluation"
task_description: "Rate the quality of the generated caption for the image."
task_dir: "."
port: 8000

data_files:
  - "sample-data.json"

item_properties:
  id_key: id
  image_key: image_url
  context_key: caption

annotation_schemes:
  - annotation_type: likert
    name: accuracy
    description: "Does the caption accurately describe what's in the image?"
    size: 5
    min_label: "Inaccurate"
    max_label: "Accurate"
    required: true

  - annotation_type: likert
    name: detail
    description: "How detailed is the caption?"
    size: 5
    min_label: "Too vague"
    max_label: "Appropriate detail"
    required: true

  - annotation_type: radio
    name: hallucination
    description: "Does the caption mention things not in the image?"
    labels:
      - "Yes, hallucinations present"
      - "No hallucinations"
    required: true

output_annotation_dir: "output/"
output_annotation_format: "json"
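
Once annotations are collected, the three schemes above lend themselves to a simple per-caption summary. The following is a minimal sketch, not part of the showcase: it assumes each annotation has already been loaded into a flat dictionary keyed by the scheme names (`accuracy`, `detail`, `hallucination`). That record shape and the `summarize` helper are hypothetical; the actual layout of the files written to `output/` may differ and would need to be mapped into this form first.

```python
from statistics import mean

# Label text copied from the "hallucination" radio scheme in config.yaml.
HALLUCINATION_YES = "Yes, hallucinations present"


def summarize(records):
    """Summarize a list of flat annotation records (hypothetical shape).

    Each record is assumed to look like:
        {"accuracy": 4, "detail": 3, "hallucination": "No hallucinations"}
    """
    if not records:
        raise ValueError("no records to summarize")
    flagged = sum(r["hallucination"] == HALLUCINATION_YES for r in records)
    return {
        "mean_accuracy": mean(r["accuracy"] for r in records),
        "mean_detail": mean(r["detail"] for r in records),
        "hallucination_rate": flagged / len(records),
    }


# Illustrative records only; these are not real annotation results.
example_records = [
    {"accuracy": 4, "detail": 3, "hallucination": "No hallucinations"},
    {"accuracy": 2, "detail": 5, "hallucination": "Yes, hallucinations present"},
]
print(summarize(example_records))
```

Keeping the hallucination label as a named constant ties the summary to the exact string in the config, so a change to the label text only needs updating in one place.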

Sample Data: sample-data.json

[
  {
    "id": "1",
    "image_url": "https://images.unsplash.com/photo-1543466835-00a7907e9de1?w=640",
    "caption": "A brown dog sitting on grass looking at the camera with its tongue out."
  },
  {
    "id": "2",
    "image_url": "https://images.unsplash.com/photo-1504208434309-cb69f4fe52b0?w=640",
    "caption": "A sunset over mountains with orange and purple clouds."
  }
]
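
Before starting the server with your own data, it can help to check that every item carries the fields named under `item_properties`. The sketch below uses only the Python standard library and the key names from the config above; the `validate_items` helper is a hypothetical name introduced for illustration and is not part of Potato.

```python
import json

# Field names taken from item_properties in config.yaml.
REQUIRED_KEYS = ("id", "image_url", "caption")


def validate_items(items):
    """Return a list of human-readable problems found in the item list."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        # Every required field must be a non-empty string.
        for key in REQUIRED_KEYS:
            value = item.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"item {index}: missing or empty '{key}'")
        # Ids must be unique across the file.
        item_id = item.get("id")
        if item_id is not None:
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id '{item_id}'")
            seen_ids.add(item_id)
    return problems


# The two sample items from sample-data.json, inlined so the sketch runs as-is.
SAMPLE = json.loads("""
[
  {
    "id": "1",
    "image_url": "https://images.unsplash.com/photo-1543466835-00a7907e9de1?w=640",
    "caption": "A brown dog sitting on grass looking at the camera with its tongue out."
  },
  {
    "id": "2",
    "image_url": "https://images.unsplash.com/photo-1504208434309-cb69f4fe52b0?w=640",
    "caption": "A sunset over mountains with orange and purple clouds."
  }
]
""")

print(validate_items(SAMPLE))
```

To check a file on disk instead, replace the inlined string with `json.load(open("sample-data.json"))`.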

Get This Design

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/image-captioning-eval
potato start config.yaml

Details

Annotation Types

likert, image

Domain

Computer Vision, NLP

Use Cases

Image Captioning, VLM Evaluation

Tags

captioning, image, evaluation, vlm
