Parsing Time Normalizations

Identification and normalization of time expressions in text, combining span annotation for time mentions with ISO-format temporal normalization. Based on SemEval-2018 Task 6.

Configuration file: config.yaml

# Parsing Time Normalizations
# Based on Laparra et al., SemEval 2018
# Paper: https://aclanthology.org/S18-1011/
# Dataset: https://github.com/bethard/anafora-annotations
#
# This task asks annotators to identify time expressions in text and
# normalize them to ISO 8601 format. Annotators first highlight time
# expression spans, then provide the normalized ISO representation.
#
# Span Labels:
# - Time Expression: Any expression referring to a point or period in time
#
# Normalization: ISO 8601 format (e.g., 2018-03-15, 2018-W12, PT2H30M)

annotation_task_name: "Parsing Time Normalizations"
task_dir: "."

data_files:
  - sample-data.json

item_properties:
  id_key: "id"
  text_key: "text"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

port: 8000
server_name: localhost

annotation_schemes:
  - annotation_type: span
    name: time_expression_spans
    description: "Highlight all time expressions in the text."
    labels:
      - "Time Expression"

  - annotation_type: text
    name: normalized_time
    description: "Provide the ISO 8601 normalized form of the highlighted time expressions."

annotation_instructions: |
  You will be shown a text passage along with the document date for reference.
  Your task is to:
  1. Highlight all time expressions in the text (dates, times, durations, frequencies).
  2. Provide the ISO 8601 normalized form for each time expression.
  Use the document date to resolve relative expressions like "yesterday" or "last week".

html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #fefce8; border: 1px solid #fde68a; border-radius: 8px; padding: 12px; margin-bottom: 12px;">
      <strong style="color: #a16207;">Document Date:</strong>
      <span style="font-size: 15px;">{{document_date}}</span>
    </div>
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #0369a1;">Text:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
    </div>
  </div>

allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
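The annotation instructions in the config ask annotators to resolve relative expressions against the document date. As a rough illustration of what that resolution involves, here is a minimal Python sketch using only the standard library. The helper names (`next_weekday`, `iso_week`, `iso_duration`) are hypothetical and are not part of Potato or the SemEval task tooling. Reading "next Tuesday" as the first Tuesday strictly after the document date is an assumption; annotation guidelines may define it differently.

```python
from datetime import date, timedelta


def next_weekday(doc_date: date, weekday: int) -> date:
    """Return the first date strictly after doc_date on the given weekday (Monday=0)."""
    days_ahead = (weekday - doc_date.weekday() - 1) % 7 + 1
    return doc_date + timedelta(days=days_ahead)


def iso_week(d: date) -> str:
    """Format the ISO 8601 week containing d, e.g. 2018-W12."""
    year, week, _ = d.isocalendar()
    return f"{year}-W{week:02d}"


def iso_duration(minutes: int) -> str:
    """Format a whole number of minutes as an ISO 8601 duration, e.g. PT2H30M."""
    hours, mins = divmod(minutes, 60)
    text = "PT"
    if hours:
        text += f"{hours}H"
    if mins or not hours:
        text += f"{mins}M"
    return text


# Document date of sample item timenorm_001 (a Monday).
doc_date = date(2018, 3, 12)

yesterday = (doc_date - timedelta(days=1)).isoformat()   # "yesterday"
next_tuesday = next_weekday(doc_date, 1).isoformat()     # "next Tuesday" (assumed reading)
last_week = iso_week(doc_date - timedelta(weeks=1))      # "last week"
early = iso_duration(15)                                 # "15 minutes"

print(yesterday, next_tuesday, last_week, early)
```

A real normalizer would also need to handle times of day, vague expressions and frequencies, which this sketch does not attempt.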

Sample data: sample-data.json

[
  {
    "id": "timenorm_001",
    "text": "The meeting is scheduled for next Tuesday at 3:30 PM. Please arrive 15 minutes early to set up the conference room.",
    "document_date": "2018-03-12"
  },
  {
    "id": "timenorm_002",
    "text": "The company was founded on January 15, 2005, and has been operating for over thirteen years. The annual review will take place in December.",
    "document_date": "2018-06-20"
  }
]

// ... and 8 more items
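Each item carries the fields named in `item_properties` plus `document_date`, which the `html_layout` template displays. Before starting the server it can be useful to confirm that every item has all of them. The sketch below is one way to do that in Python. It assumes that each `{{name}}` placeholder is filled from an item field of the same name. The function names are hypothetical, and the layout and item texts are shortened copies of the ones shown above.

```python
import json
import re

# Keys the config reads via item_properties (id_key and text_key).
CONFIG_KEYS = {"id", "text"}

# Shortened copy of html_layout; only the {{placeholders}} matter here.
HTML_LAYOUT = "<span>{{document_date}}</span><p>{{text}}</p>"

# Two items in the same shape as sample-data.json (texts shortened).
SAMPLE = json.loads("""
[
  {"id": "timenorm_001",
   "text": "The meeting is scheduled for next Tuesday at 3:30 PM.",
   "document_date": "2018-03-12"},
  {"id": "timenorm_002",
   "text": "The company was founded on January 15, 2005.",
   "document_date": "2018-06-20"}
]
""")


def required_fields(layout: str, config_keys: set) -> set:
    """Collect placeholder names from the layout plus the configured keys."""
    placeholders = set(re.findall(r"\{\{\s*(\w+)\s*\}\}", layout))
    return placeholders | set(config_keys)


def missing_fields(items: list, required: set) -> dict:
    """Map each incomplete item's id (or list index) to the fields it lacks."""
    problems = {}
    for index, item in enumerate(items):
        missing = sorted(required - set(item))
        if missing:
            problems[item.get("id", index)] = missing
    return problems


required = required_fields(HTML_LAYOUT, CONFIG_KEYS)
print(missing_fields(SAMPLE, required))
```

An empty result means every item provides the fields the layout and config refer to.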

Get this design

View on GitHub

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/semeval/2018/task06-time-normalization
potato start config.yaml

Details

Annotation types

span, text

Domain

SemEval, NLP, Temporal Processing, Information Extraction

Use cases

Time Normalization, Temporal Extraction, Information Extraction

Tags

semeval, semeval-2018, shared-task, time-normalization, temporal, parsing, information-extraction

Found an issue or want to improve this design?

Open an issue

Related designs

Clickbait Spoiling

Classification and extraction of spoilers for clickbait posts, including spoiler type identification and span-level spoiler detection. Based on SemEval-2023 Task 5 (Hagen et al.).

text, radio

Clinical TempEval - Temporal Information Extraction from Clinical Notes

Extraction of temporal information from clinical text, identifying time expressions, event mentions, and their temporal relations. Based on SemEval-2016 Task 12 (Clinical TempEval).

span, radio

CrossRE: Cross-Domain Relation Extraction

Cross-domain relation extraction across 6 domains (news, politics, science, music, literature, AI). Annotators identify entities and label 17 relation types between entity pairs, enabling study of domain transfer in relation extraction.

span, span_link