Image Captioning Evaluation

Rate AI-generated image captions for accuracy, fluency, and detail.

Configuration File: config.yaml

annotation_task_name: "Image Captioning Evaluation"
task_name: "Image Captioning Evaluation"
task_description: "Rate the quality of the generated caption for the image."
task_dir: "."
port: 8000

data_files:
  - "sample-data.json"

item_properties:
  id_key: "id"
  text_key: "image_url"
  image_key: "image_url"
  context_key: "caption"

annotation_schemes:
  - annotation_type: likert
    name: accuracy
    description: "Does the caption accurately describe what's in the image?"
    size: 5
    min_label: "Inaccurate"
    max_label: "Accurate"
    required: true

  - annotation_type: likert
    name: detail
    description: "How detailed is the caption?"
    size: 5
    min_label: "Too vague"
    max_label: "Appropriate detail"
    required: true

  - annotation_type: radio
    name: hallucination
    description: "Does the caption mention things not in the image?"
    labels:
      - "Yes, hallucinations present"
      - "No hallucinations"
    required: true

output_annotation_dir: "output/"
output_annotation_format: "json"
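
Once annotations are collected, the three schemes lend themselves to a simple summary: mean Likert scores for accuracy and detail, plus the share of captions flagged for hallucination. The sketch below is illustrative only. It assumes a hypothetical flattened record shape (one dict per annotated item, keyed by scheme name); the actual layout of the files Potato writes under output/ may differ and would need to be mapped to this shape first. The function name `summarize` and the example records are made up for illustration.

```python
from statistics import mean

# Label text copied from the "hallucination" radio scheme in config.yaml.
HALLUCINATION_YES = "Yes, hallucinations present"


def summarize(records):
    """Summarize ratings from a list of flattened annotation records.

    Each record is assumed (hypothetically) to look like:
    {"id": ..., "accuracy": 1-5, "detail": 1-5, "hallucination": <label>}
    """
    if not records:
        raise ValueError("no annotation records to summarize")
    flagged = sum(r["hallucination"] == HALLUCINATION_YES for r in records)
    return {
        "mean_accuracy": mean(r["accuracy"] for r in records),
        "mean_detail": mean(r["detail"] for r in records),
        "hallucination_rate": flagged / len(records),
    }


# Hypothetical records, not real annotation results.
records = [
    {"id": "1", "accuracy": 5, "detail": 4, "hallucination": "No hallucinations"},
    {"id": "2", "accuracy": 3, "detail": 2, "hallucination": "Yes, hallucinations present"},
]

summary = summarize(records)
print(summary)
```

Keeping the hallucination label as a constant tied to the config text avoids silent mismatches if the label wording is later changed in config.yaml.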

Sample Data: sample-data.json

[
  {
    "id": "1",
    "image_url": "https://images.unsplash.com/photo-1543466835-00a7907e9de1?w=640",
    "caption": "A brown dog sitting on grass looking at the camera with its tongue out."
  },
  {
    "id": "2",
    "image_url": "https://images.unsplash.com/photo-1504208434309-cb69f4fe52b0?w=640",
    "caption": "A sunset over mountains with orange and purple clouds."
  }
]
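
Before launching the task, it can help to confirm that every item carries the fields that item_properties points at ("id", "image_url", "caption"). The following is a minimal sketch and not part of Potato itself; the helper name `find_missing_keys` is hypothetical, and the data is inlined as a string so the example runs on its own. In practice you would read sample-data.json from disk instead.

```python
import json

# Keys referenced by the item_properties section of config.yaml.
REQUIRED_KEYS = ("id", "image_url", "caption")

# Inline copy of sample-data.json so the sketch is self-contained.
SAMPLE_DATA = """
[
  {"id": "1",
   "image_url": "https://images.unsplash.com/photo-1543466835-00a7907e9de1?w=640",
   "caption": "A brown dog sitting on grass looking at the camera with its tongue out."},
  {"id": "2",
   "image_url": "https://images.unsplash.com/photo-1504208434309-cb69f4fe52b0?w=640",
   "caption": "A sunset over mountains with orange and purple clouds."}
]
"""


def find_missing_keys(items, required=REQUIRED_KEYS):
    """Return {item position: [missing or empty keys]} for items that fail the check."""
    problems = {}
    for index, item in enumerate(items):
        missing = [key for key in required if not item.get(key)]
        if missing:
            problems[index] = missing
    return problems


items = json.loads(SAMPLE_DATA)
problems = find_missing_keys(items)
print("all items valid" if not problems else problems)
```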

Get This Design

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/evaluation/image-captioning-eval
potato start config.yaml

Details

Annotation Types

likert, radio

Domain

Computer Vision, NLP

Use Cases

Image Captioning, VLM Evaluation

Tags

captioning, image, evaluation, vlm

Related Designs

Image Classification

Multi-class image classification with thumbnail preview and zoom controls.

likert, multiselect

T2I-CompBench Text-to-Image Evaluation

Compositional text-to-image generation evaluation based on T2I-CompBench (Huang et al., NeurIPS 2023). Annotators rate image quality on a Likert scale, classify the compositional challenge type, and compare pairs of generated images via pairwise preference.

likert, radio

AnnoMI Counselling Dialogue Annotation

Annotation of motivational interviewing counselling dialogues based on the AnnoMI dataset. Annotators label therapist and client utterances for MI techniques (open questions, reflections, affirmations) and client change talk (sustain talk, change talk), with quality ratings for therapeutic interactions.

radio, multiselect