MeasEval - Counts and Measurements

Extract and classify measurements, quantities, units, and measured entities from scientific text, based on SemEval-2021 Task 8 (Harper et al., 2021). Annotators highlight measurement components as spans, classify the type of the primary quantity, and enter its normalized numeric value.

Configuration File: config.yaml

This Potato config reproduces the annotation task. Save it as config.yaml and run potato start config.yaml to try it.

yaml

# MeasEval - Counts and Measurements
# Based on Harper et al., SemEval 2021
# Paper: https://aclanthology.org/2021.semeval-1.38/
# Dataset: https://github.com/harperco/MeasEval
#
# Annotators extract and classify measurement-related spans from scientific
# text, including quantities, units, measured entities, properties, and
# qualifiers. They also classify the quantity type and provide normalized values.

annotation_task_name: "MeasEval - Counts and Measurements"
task_dir: "."

data_files:
  - sample-data.json

item_properties:
  id_key: "id"
  text_key: "text"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

port: 8000
server_name: localhost

annotation_schemes:
  - annotation_type: span
    name: measurement_spans
    description: "Highlight measurement components in the text."
    labels:
      - "Quantity"
      - "Unit"
      - "Measured Entity"
      - "Measured Property"
      - "Qualifier"

  - annotation_type: radio
    name: quantity_type
    description: "Classify the type of the primary quantity in this text."
    labels:
      - "Count"
      - "Measurement"
      - "Approximate"
      - "Range"
    keyboard_shortcuts:
      "Count": "1"
      "Measurement": "2"
      "Approximate": "3"
      "Range": "4"
    tooltips:
      "Count": "A discrete count of items or occurrences"
      "Measurement": "A precise measurement with a specific value and unit"
      "Approximate": "An approximate or estimated value"
      "Range": "A range of values (e.g., 10-20, between X and Y)"

  - annotation_type: text
    name: normalized_value
    description: "Provide the normalized numeric value of the primary quantity (e.g., '2.5' for 'two and a half')."

annotation_instructions: |
  You will see a passage from a scientific text containing measurements, counts,
  or quantities. Your task is to:
  1. Highlight the relevant spans: quantities, units, measured entities, measured
     properties, and any qualifiers (e.g., "approximately", "more than").
  2. Classify the type of the primary quantity as Count, Measurement, Approximate,
     or Range.
  3. Provide a normalized numeric value for the primary quantity.

html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #0369a1;">Scientific Text:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
    </div>
  </div>

allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
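
The normalized_value field above asks annotators for a plain numeric string. As a rough illustration of what that normalization could look like in post-processing, here is a minimal Python sketch. The helper name normalize_quantity is hypothetical and is not part of Potato or the MeasEval tooling; it assumes the quantity span is written in digits (optionally with thousands separators) and does not handle spelled-out numbers such as "two and a half".

```python
import re


def normalize_quantity(raw: str) -> str:
    """Return a plain decimal string for a simple digit-based quantity span.

    Hypothetical helper for illustration only: it accepts digits with
    optional thousands separators and an optional decimal part.
    """
    cleaned = raw.strip().replace(",", "")
    if not re.fullmatch(r"[+-]?\d+(\.\d+)?", cleaned):
        raise ValueError(f"cannot normalize {raw!r}")
    value = float(cleaned)
    # Render whole numbers without a trailing ".0" so counts read as integers.
    return str(int(value)) if value.is_integer() else str(value)


print(normalize_quantity("1,523"))  # 1523
print(normalize_quantity("37"))     # 37
print(normalize_quantity("2.50"))   # 2.5
```

Raising an error on anything that is not a simple number keeps ambiguous spans (ranges, spelled-out values) visible for manual review instead of silently guessing.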

Sample Data: sample-data.json

json

[
  {
    "id": "meas_001",
    "text": "The reaction temperature was maintained at 37 degrees Celsius for approximately 24 hours to ensure complete enzyme activation."
  },
  {
    "id": "meas_002",
    "text": "A total of 1,523 participants were enrolled in the clinical trial across 12 medical centers in three countries."
  }
]

// ... and 8 more items
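
Before starting the server, it can help to confirm that every item carries the keys named under item_properties in the config ("id" and "text"). The following is a minimal sketch using only the Python standard library; check_items is a hypothetical helper, not a Potato function, and the inline sample reproduces only the two items shown above.

```python
import json

# The two items shown in the sample above; the remaining items are elided there.
SAMPLE = """[
  {"id": "meas_001", "text": "The reaction temperature was maintained at 37 degrees Celsius for approximately 24 hours to ensure complete enzyme activation."},
  {"id": "meas_002", "text": "A total of 1,523 participants were enrolled in the clinical trial across 12 medical centers in three countries."}
]"""


def check_items(items, id_key="id", text_key="text"):
    """Return a list of problems; an empty list means the items look usable."""
    problems = []
    seen = set()
    for index, item in enumerate(items):
        # Both keys must be present and hold non-empty strings.
        for key in (id_key, text_key):
            if not isinstance(item.get(key), str) or not item[key].strip():
                problems.append(f"item {index}: missing or empty {key!r}")
        # Ids should be unique so annotations can be matched back to items.
        item_id = item.get(id_key)
        if item_id is not None:
            if item_id in seen:
                problems.append(f"item {index}: duplicate id {item_id!r}")
            seen.add(item_id)
    return problems


items = json.loads(SAMPLE)
print(check_items(items))  # []
```

To check the real file, replace the inline string with the contents of sample-data.json, for example by passing an open file handle to json.load.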

Get This Design

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/semeval/2021/task08-measeval
potato start config.yaml

Dataset & paper

Harper et al., SemEval 2021

Official dataset: https://github.com/harperco/MeasEval
Read the paper: https://aclanthology.org/2021.semeval-1.38/

Citation (BibTeX)

bibtex

@inproceedings{harper-etal-2021-semeval,
    title = "{S}em{E}val-2021 {T}ask 8: {M}eas{E}val -- {E}xtracting {C}ounts and {M}easurements and their {R}elated {C}ontexts",
    author = "Harper, Corey  and Cox, Jessica  and Kohler, Curt  and Scerri, Antony  and Daniel Jr., Ron  and Groth, Paul",
    booktitle = "Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)",
    year = "2021",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.semeval-1.38"
}

Details

Annotation Types

span, radio, text

Domain

NLP, SemEval

Use Cases

Information Extraction, Measurement Extraction, Scientific Text Mining

Related Designs

Clickbait Spoiling

Classification and extraction of spoilers for clickbait posts, including spoiler type identification and span-level spoiler detection. Based on SemEval-2023 Task 5 (Hagen et al.).

text, radio

EA-MT - Entity-Aware Machine Translation

Entity-aware machine translation evaluation requiring annotators to identify entity spans, classify translation errors, and provide corrected translations. Based on SemEval-2025 Task 2.

span, radio

Check-COVID: Fact-Checking COVID-19 News Claims

Fact-checking COVID-19 news claims. Annotators verify claims against evidence, identify supporting/refuting spans, and provide verdicts with explanations. Based on the Check-COVID dataset targeting misinformation during the pandemic.

radio, span