NumEval - Numeral-Aware Language Understanding

A numeral-aware language understanding task in which annotators predict a numerical value from text, classify the numeral type, and explain how they derived the answer. Based on SemEval-2024 Task 7 (NumEval).

Configuration File: config.yaml

This Potato config reproduces the annotation task. Save it as config.yaml and run potato start config.yaml to try it.

yaml

# NumEval - Numeral-Aware Language Understanding
# Based on Chen et al., SemEval 2024
# Paper: https://aclanthology.org/volumes/2024.semeval-1/
# Dataset: https://github.com/SemEval/semeval-2024-task7
#
# This task asks annotators to predict numerical values from text context,
# classify the type of numeral, and provide an explanation of how they
# derived the answer.

annotation_task_name: "NumEval - Numeral-Aware Language Understanding"
task_dir: "."

data_files:
  - sample-data.json

item_properties:
  id_key: "id"
  text_key: "text"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

port: 8000
server_name: localhost

annotation_schemes:
  - annotation_type: number
    name: predicted_value
    description: "Predicted numerical value"

  - annotation_type: radio
    name: numeral_type
    description: "What type of numeral is the expected answer?"
    labels:
      - "Exact Number"
      - "Approximate"
      - "Range"
      - "Ordinal"
      - "Other"
    keyboard_shortcuts:
      "Exact Number": "1"
      "Approximate": "2"
      "Range": "3"
      "Ordinal": "4"
      "Other": "5"
    tooltips:
      "Exact Number": "A precise numerical value (e.g., 42, 3.14)"
      "Approximate": "An estimated or rounded value (e.g., about 100, roughly 50%)"
      "Range": "A range of values (e.g., between 10 and 20)"
      "Ordinal": "A position or rank (e.g., first, 3rd place)"
      "Other": "Other numerical expressions not covered above"

  - annotation_type: text
    name: explanation
    description: "Explain how you derived the numerical answer from the text."

annotation_instructions: |
  You will be shown a text that contains or implies a numerical value. Your task is to:
  1. Read the text carefully and identify the relevant numerical information.
  2. Enter the predicted numerical value.
  3. Classify the type of numeral (exact, approximate, range, ordinal, or other).
  4. Explain your reasoning for arriving at the predicted value.

html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #0369a1;">Text:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
    </div>
    <div style="background: #fefce8; border: 1px solid #fde68a; border-radius: 8px; padding: 12px;">
      <strong style="color: #a16207;">Expected Answer:</strong> <span>{{expected_answer}}</span>
    </div>
  </div>

allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
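
The numeral_type scheme above names each label in three places: labels, keyboard_shortcuts, and tooltips. The following is a minimal Python sketch of a consistency check across those three. It assumes the scheme has been copied by hand into a plain dict (NUMERAL_TYPE_SCHEME), and check_radio_scheme is a hypothetical helper written for illustration, not part of Potato.

```python
# Hand-copied mirror of the numeral_type scheme from config.yaml (an assumption:
# a real check would load the YAML file instead of duplicating it here).
NUMERAL_TYPE_SCHEME = {
    "name": "numeral_type",
    "labels": ["Exact Number", "Approximate", "Range", "Ordinal", "Other"],
    "keyboard_shortcuts": {
        "Exact Number": "1",
        "Approximate": "2",
        "Range": "3",
        "Ordinal": "4",
        "Other": "5",
    },
    "tooltips": {
        "Exact Number": "A precise numerical value (e.g., 42, 3.14)",
        "Approximate": "An estimated or rounded value (e.g., about 100, roughly 50%)",
        "Range": "A range of values (e.g., between 10 and 20)",
        "Ordinal": "A position or rank (e.g., first, 3rd place)",
        "Other": "Other numerical expressions not covered above",
    },
}


def check_radio_scheme(scheme):
    """Hypothetical helper: return a list of problems; an empty list means consistent."""
    problems = []
    labels = scheme["labels"]
    if len(set(labels)) != len(labels):
        problems.append("labels: duplicate label names")
    # Every label should have exactly one shortcut and one tooltip.
    for field in ("keyboard_shortcuts", "tooltips"):
        mapping = scheme.get(field, {})
        missing = [label for label in labels if label not in mapping]
        extra = [key for key in mapping if key not in labels]
        if missing:
            problems.append(f"{field}: no entry for {missing}")
        if extra:
            problems.append(f"{field}: unknown label {extra}")
    # Two labels bound to the same key would be ambiguous for annotators.
    keys = list(scheme.get("keyboard_shortcuts", {}).values())
    if len(set(keys)) != len(keys):
        problems.append("keyboard_shortcuts: the same key is bound twice")
    return problems


if __name__ == "__main__":
    print(check_radio_scheme(NUMERAL_TYPE_SCHEME))
```

A check like this is mainly useful when labels are renamed or added, since a label that is renamed in one of the three places but not the others is easy to miss by eye.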

Sample Data: sample-data.json

json

[
  {
    "id": "numeval_001",
    "text": "The company reported quarterly revenue of $2.3 billion, representing a 15% increase over the same period last year when revenue was [MASK].",
    "expected_answer": "2.0 billion"
  },
  {
    "id": "numeval_002",
    "text": "If a train travels at 120 km/h for 2.5 hours, the total distance covered is [MASK] kilometers.",
    "expected_answer": "300"
  }
]

// ... and 8 more items
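
The config refers to item fields in two places: item_properties (id, text) and the {{...}} placeholders in html_layout (text, expected_answer). The sketch below, in Python, checks that the two sample items shown above carry every referenced field and recomputes their expected answers. The helper names (layout_fields, missing_fields, leading_number) and the shortened HTML_LAYOUT string are assumptions made for illustration; none of them come from Potato or the NumEval dataset.

```python
import json
import re

# The two sample items shown above, copied verbatim.
SAMPLE_JSON = """
[
  {
    "id": "numeval_001",
    "text": "The company reported quarterly revenue of $2.3 billion, representing a 15% increase over the same period last year when revenue was [MASK].",
    "expected_answer": "2.0 billion"
  },
  {
    "id": "numeval_002",
    "text": "If a train travels at 120 km/h for 2.5 hours, the total distance covered is [MASK] kilometers.",
    "expected_answer": "300"
  }
]
"""

ID_KEY = "id"      # item_properties.id_key in config.yaml
TEXT_KEY = "text"  # item_properties.text_key in config.yaml
# Shortened stand-in for html_layout; only the placeholders matter here.
HTML_LAYOUT = "<p>{{text}}</p><span>{{expected_answer}}</span>"


def layout_fields(layout):
    """Return the set of field names used as {{field}} placeholders."""
    return set(re.findall(r"\{\{\s*(\w+)\s*\}\}", layout))


def missing_fields(item, layout):
    """Return the sorted field names the config needs but the item lacks."""
    required = {ID_KEY, TEXT_KEY} | layout_fields(layout)
    return sorted(required - set(item))


def leading_number(answer):
    """Parse the number at the start of an answer string such as '2.0 billion'."""
    match = re.match(r"\s*(-?\d+(?:\.\d+)?)", answer)
    if match is None:
        raise ValueError(f"no leading number in {answer!r}")
    return float(match.group(1))


if __name__ == "__main__":
    items = json.loads(SAMPLE_JSON)
    for item in items:
        print(item[ID_KEY], "missing fields:", missing_fields(item, HTML_LAYOUT))

    # numeval_001: a 15% increase to 2.3 billion implies a prior value of 2.3 / 1.15.
    prior_revenue = 2.3 / 1.15
    assert round(prior_revenue, 1) == leading_number(items[0]["expected_answer"])

    # numeval_002: distance = speed * time.
    distance_km = 120 * 2.5
    assert distance_km == leading_number(items[1]["expected_answer"])
```

Note that expected_answer is a string and may carry a scale word ("2.0 billion"), while the predicted_value scheme collects a bare number, so any comparison between the two needs an explicit convention for units and rounding.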

Get This Design

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/semeval/2024/task07-numeval
potato start config.yaml

Dataset & paper

Chen et al., SemEval 2024

Official dataset: https://github.com/SemEval/semeval-2024-task7
Paper: https://aclanthology.org/volumes/2024.semeval-1/

Citation (BibTeX)

bibtex

@inproceedings{chen-etal-2024-numeval,
    title = "{N}um{E}val: Numeral-Aware Language Understanding",
    author = "Chen, Chung-Chi and others",
    booktitle = "Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)",
    year = "2024",
    publisher = "Association for Computational Linguistics"
}

Details

Annotation Types

number, radio, text

Domain

SemEval, NLP, Numeracy, Language Understanding

Use Cases

Numerical Reasoning, Numeral Understanding, Quantitative NLP

Related Designs

Argument Reasoning in Civil Procedure

Legal argument reasoning task requiring annotators to answer multiple-choice questions about civil procedure by selecting the best answer and providing legal reasoning. Based on SemEval-2024 Task 5.

radio, text

BIG-Bench Task Evaluation

Evaluate language model responses on diverse reasoning tasks from the BIG-Bench benchmark. Annotators assess correctness, provide reasoning explanations, and rate confidence for model outputs across multiple task categories.

radio, text

BRAINTEASER - Commonsense-Defying QA

Lateral thinking and commonsense-defying question answering task requiring annotators to select answers to brain teasers that defy default commonsense assumptions and provide explanations. Based on SemEval-2024 Task 9 (BRAINTEASER).

radio, text
