intermediate, text

Natural Questions - Open-Domain Question Answering

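Open-domain question answering over Wikipedia passages, based on Google's Natural Questions dataset (Kwiatkowski et al., TACL 2019). Annotators decide whether a passage answers the question and, if it does, highlight the long answer span (paragraph-level) and the short answer span (phrase-level), then type the short answer.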

Configuration File: config.yaml

# Natural Questions - Open-Domain Question Answering
# Based on Kwiatkowski et al., TACL 2019
# Paper: https://aclanthology.org/Q19-1026/
# Dataset: https://ai.google.com/research/NaturalQuestions
#
# This task presents a real user question from Google Search along with
# a Wikipedia passage. Annotators identify short and long answer spans
# and determine whether the passage contains an answer.
#
# Answer Types:
# - Short Answer: The minimal span that directly answers the question
# - Long Answer: A paragraph or section containing the answer context
#
# Annotation Guidelines:
# 1. Read the question carefully
# 2. Read the Wikipedia passage
# 3. Determine if the passage answers the question
# 4. If yes, highlight the long answer span (paragraph-level)
# 5. Within that, highlight the short answer span (phrase-level)
# 6. Type the short answer text
# 7. If the passage does not contain the answer, select "No Answer"

annotation_task_name: "Natural Questions - Open-Domain Question Answering"
task_dir: "."

data_files:
  - sample-data.json

item_properties:
  id_key: "id"
  text_key: "text"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

port: 8000
server_name: localhost

annotation_schemes:
  # Step 1: Highlight answer spans
  - annotation_type: span
    name: answer_spans
    description: "Highlight the short answer span (exact answer) and the long answer span (surrounding context)"
    labels:
      - "Short Answer"
      - "Long Answer"
    label_colors:
      "Short Answer": "#3b82f6"
      "Long Answer": "#22c55e"

  # Step 2: Answerability judgment
  - annotation_type: radio
    name: answerability
    description: "Does this passage contain the answer to the question?"
    labels:
      - "Has Answer"
      - "No Answer"
    keyboard_shortcuts:
      "Has Answer": "1"
      "No Answer": "2"
    tooltips:
      "Has Answer": "The passage contains enough information to answer the question"
      "No Answer": "The passage does not contain the answer to the question"

  # Step 3: Type the short answer
  - annotation_type: text
    name: short_answer_text
    description: "Type the short answer to the question (if answerable)"

annotation_instructions: |
  You will be shown a question and a Wikipedia passage. Your task is to:
  1. Determine if the passage contains the answer to the question.
  2. If answerable, highlight the Long Answer (the relevant paragraph) and the Short Answer (the specific phrase or entity).
  3. Type the short answer in the text field.
  4. If the passage does not answer the question, select "No Answer."

  Short answers are typically entities, dates, numbers, or short phrases.
  Long answers are the paragraphs or sections that provide context for the short answer.

html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #fefce8; border: 1px solid #fde68a; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #a16207; font-size: 18px;">Question:</strong>
      <p style="font-size: 17px; line-height: 1.6; margin: 8px 0 0 0; font-weight: 500;">{{question}}</p>
    </div>
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #0369a1;">Wikipedia Passage:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
    </div>
  </div>

allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false

Sample Data: sample-data.json

[
  {
    "id": "nq_001",
    "text": "The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France. It is named after the engineer Gustave Eiffel, whose company designed and built the tower. Constructed from 1887 to 1889, it was initially criticized by some of France's leading artists and intellectuals, but it has become a global cultural icon of France and one of the most recognizable structures in the world. The tower is 330 metres tall and was the tallest man-made structure in the world until the Chrysler Building was completed in 1930.",
    "question": "How tall is the Eiffel Tower?"
  },
  {
    "id": "nq_002",
    "text": "Photosynthesis is a process used by plants and other organisms to convert light energy into chemical energy that can be stored and later released to fuel the organism's activities. This process involves the absorption of carbon dioxide and water, using sunlight as an energy source, to produce glucose and oxygen. The overall equation for photosynthesis is: 6CO2 + 6H2O + light energy = C6H12O6 + 6O2.",
    "question": "What are the products of photosynthesis?"
  }
]

// ... and 8 more items
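
For the first sample item, the passage states that the tower is 330 metres tall, so "330 metres" is a natural short answer. The sketch below shows one way to find the character offsets of such a phrase in a passage, for example to spot-check highlighted spans. `locate_answer` is a hypothetical helper, and character offsets are an assumed span representation; neither comes from Potato.

```python
import re


def locate_answer(passage, answer):
    """Return (start, end) offsets of every case-insensitive match of answer."""
    if not answer:
        return []
    # re.escape treats the answer as literal text, not as a regex pattern.
    pattern = re.compile(re.escape(answer), re.IGNORECASE)
    return [match.span() for match in pattern.finditer(passage)]


# Final sentence of the first sample passage.
passage = (
    "The tower is 330 metres tall and was the tallest man-made structure "
    "in the world until the Chrysler Building was completed in 1930."
)
for start, end in locate_answer(passage, "330 metres"):
    print(start, end, passage[start:end])
```

An empty result means the phrase does not occur verbatim, which can be a useful hint that an item is a "No Answer" candidate or that the typed answer was paraphrased.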

Get This Design

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/question-answering/natural-questions-qa
potato start config.yaml
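
Before starting the server on your own data, it can help to confirm that every item carries the fields the config refers to. This is a minimal sketch, assuming that each `{{name}}` placeholder in `html_layout` is filled from the item field of the same name; the config and sample data suggest this, but the page does not state it. `validate_items` is a hypothetical helper, and the key names are passed in by hand rather than parsed from `config.yaml` so that the example needs only the standard library.

```python
import json
import re

# Matches {{name}} placeholders such as {{question}} and {{text}}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def validate_items(items, id_key, text_key, html_layout):
    """Return a list of problems found in the data items; empty means none."""
    required = {id_key, text_key} | set(PLACEHOLDER.findall(html_layout))
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        missing = sorted(required - set(item))
        if missing:
            problems.append(f"item {index}: missing fields {missing}")
        item_id = item.get(id_key)
        if item_id is not None:
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id {item_id!r}")
            seen_ids.add(item_id)
    return problems


# Inline stand-in for the data file; in practice the items would be loaded
# with json.load from sample-data.json.
items = json.loads("""
[
  {"id": "nq_001", "text": "The tower is 330 metres tall.",
   "question": "How tall is the Eiffel Tower?"},
  {"id": "nq_002", "text": "Photosynthesis produces glucose and oxygen."}
]
""")
layout = "<p>{{question}}</p><p>{{text}}</p>"
for problem in validate_items(items, "id", "text", layout):
    print(problem)
```

Here the second item lacks a `question` field, so the check reports it instead of leaving the problem to surface in the annotation interface.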

Details

Annotation Types

span, radio, text

Domain

NLP, Question Answering

Use Cases

Open-Domain QA, Information Retrieval, Reading Comprehension

Tags

natural-questions, open-domain-qa, wikipedia, google, tacl2019
