Table-Based Fact Verification (TabFact)
Verify textual claims against structured tabular data from Wikipedia. Based on TabFact (Chen et al., ICLR 2020). Annotators determine whether a natural language statement is entailed or refuted by the contents of a given table, highlight evidence cells, and provide reasoning.
Configuration File: config.yaml
# Table-Based Fact Verification (TabFact)
# Based on Chen et al., ICLR 2020
# Paper: https://arxiv.org/abs/1909.02164
# Dataset: https://github.com/wenhuchen/Table-Fact-Checking
#
# This task asks annotators to verify natural language claims against
# structured tabular data extracted from Wikipedia. The annotator must
# determine whether the table entails or refutes the claim, identify
# which cells provide evidence, and explain their reasoning.
#
# Verdict Labels:
# - ENTAILED: The table provides sufficient evidence that the claim is true
# - REFUTED: The table provides evidence that contradicts the claim
# - UNKNOWN: The table does not contain enough information to verify or refute
#
# Annotation Guidelines:
# 1. Read the claim carefully and identify the specific assertion being made
# 2. Examine the full table, paying attention to column headers and row labels
# 3. Look for cells that directly relate to the entities and values in the claim
# 4. Check for numerical comparisons (greater, less, equal, superlatives)
# 5. Verify whether the claim correctly describes relationships in the table
# 6. Highlight cells that serve as evidence for your verdict
# 7. Mark contradicting evidence if the claim gets a detail wrong
# 8. Provide a brief explanation of your reasoning
#
# Common Pitfalls:
# - Claims may reference aggregated values (totals, averages) not in the table
# - Watch for off-by-one errors in rankings and ordinal claims
# - Some claims involve multiple conditions that all must be verified
# - Distinguish between "not supported" and "actively contradicted"
annotation_task_name: "Table-Based Fact Verification (TabFact)"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "claim"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
annotation_schemes:
  # Step 1: Verdict on the claim
  - annotation_type: radio
    name: verdict
    description: "Based on the table, is the claim entailed (true), refuted (false), or unknown?"
    labels:
      - "entailed"
      - "refuted"
      - "unknown"
    keyboard_shortcuts:
      "entailed": "e"
      "refuted": "r"
      "unknown": "u"
    tooltips:
      "entailed": "The table provides sufficient evidence that the claim is true"
      "refuted": "The table provides evidence that contradicts the claim"
      "unknown": "The table does not contain enough information to determine truth"
  # Step 2: Evidence cell highlighting
  - annotation_type: span
    name: evidence_cells
    description: "Highlight the relevant parts of the table that support or contradict the claim"
    labels:
      - "supporting-evidence"
      - "contradicting-evidence"
      - "irrelevant"
    label_colors:
      "supporting-evidence": "#22c55e"
      "contradicting-evidence": "#ef4444"
      "irrelevant": "#9ca3af"
    tooltips:
      "supporting-evidence": "Cells or text that directly support the claim being true"
      "contradicting-evidence": "Cells or text that directly contradict the claim"
      "irrelevant": "Content that is related but does not serve as evidence"
    allow_overlapping: false
  # Step 3: Reasoning explanation
  - annotation_type: text
    name: reasoning
    description: "Explain why the statement is entailed or refuted by the table. Reference specific cells or values."
annotation_instructions: |
  You will be shown a claim and an HTML table from Wikipedia. Your task is to:
  1. Determine whether the claim is ENTAILED (supported by the table), REFUTED (contradicted by the table), or UNKNOWN (cannot be determined from the table).
  2. Highlight the specific cells or text that serve as evidence for your decision.
  3. Provide a brief explanation referencing the specific values or relationships you used.
  Tips:
  - Pay attention to numerical comparisons and rankings.
  - Some claims require combining information from multiple cells.
  - If the table simply does not contain the relevant information, choose "unknown".
html_layout: |
  <div style="padding: 15px; max-width: 900px; margin: auto;">
    <h3 style="color: #1e40af; margin-bottom: 10px;">Table: {{table_title}}</h3>
    <div style="overflow-x: auto; margin-bottom: 20px; border: 1px solid #e5e7eb; border-radius: 8px; padding: 10px; background: #f9fafb;">
      {{table_html}}
    </div>
    <div style="background: #fffbeb; border-left: 4px solid #f59e0b; padding: 12px 16px; border-radius: 4px;">
      <strong style="color: #92400e;">Claim:</strong>
      <span style="font-size: 16px; line-height: 1.6;">{{claim}}</span>
    </div>
  </div>
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
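Because `annotation_per_instance: 2` sends each claim to two annotators, verdict agreement can be measured once the output files are collected. Below is a minimal sketch of Cohen's kappa over paired verdicts; the `ann1`/`ann2` lists are hypothetical placeholders, not the actual structure of the JSON that Potato writes to `annotation_output/`, so adapt the loading step to your output files.

```python
from collections import Counter

def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Cohen's kappa for two annotators' labels on the same items."""
    assert len(a) == len(b) and a, "need two equal-length, non-empty label lists"
    n = len(a)
    # Observed agreement: fraction of items where the verdicts match.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[lab] * cb[lab] for lab in set(a) | set(b)) / (n * n)
    if expected == 1.0:  # both annotators used one identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical verdicts from two annotators on six claims.
ann1 = ["entailed", "refuted", "entailed", "unknown", "refuted", "entailed"]
ann2 = ["entailed", "refuted", "unknown", "unknown", "refuted", "refuted"]
print(round(cohens_kappa(ann1, ann2), 3))  # → 0.52
```

Kappa near or below 0 on the three-way verdict usually signals that the "unknown" vs. "refuted" boundary in the guidelines needs tightening before scaling up.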
Sample Data: sample-data.json
[
{
"id": "tabfact_001",
"claim": "Brazil has won the FIFA World Cup more times than Germany.",
"table_html": "<table style='border-collapse: collapse; width: 100%;'><tr style='background:#dbeafe;'><th style='border:1px solid #ccc;padding:6px;'>Country</th><th style='border:1px solid #ccc;padding:6px;'>Titles</th><th style='border:1px solid #ccc;padding:6px;'>Runner-up</th><th style='border:1px solid #ccc;padding:6px;'>Third Place</th><th style='border:1px solid #ccc;padding:6px;'>Years Won</th></tr><tr><td style='border:1px solid #ccc;padding:6px;'>Brazil</td><td style='border:1px solid #ccc;padding:6px;'>5</td><td style='border:1px solid #ccc;padding:6px;'>2</td><td style='border:1px solid #ccc;padding:6px;'>2</td><td style='border:1px solid #ccc;padding:6px;'>1958, 1962, 1970, 1994, 2002</td></tr><tr><td style='border:1px solid #ccc;padding:6px;'>Germany</td><td style='border:1px solid #ccc;padding:6px;'>4</td><td style='border:1px solid #ccc;padding:6px;'>4</td><td style='border:1px solid #ccc;padding:6px;'>4</td><td style='border:1px solid #ccc;padding:6px;'>1954, 1974, 1990, 2014</td></tr><tr><td style='border:1px solid #ccc;padding:6px;'>Italy</td><td style='border:1px solid #ccc;padding:6px;'>4</td><td style='border:1px solid #ccc;padding:6px;'>2</td><td style='border:1px solid #ccc;padding:6px;'>1</td><td style='border:1px solid #ccc;padding:6px;'>1934, 1938, 1982, 2006</td></tr><tr><td style='border:1px solid #ccc;padding:6px;'>Argentina</td><td style='border:1px solid #ccc;padding:6px;'>3</td><td style='border:1px solid #ccc;padding:6px;'>3</td><td style='border:1px solid #ccc;padding:6px;'>0</td><td style='border:1px solid #ccc;padding:6px;'>1978, 1986, 2022</td></tr><tr><td style='border:1px solid #ccc;padding:6px;'>France</td><td style='border:1px solid #ccc;padding:6px;'>2</td><td style='border:1px solid #ccc;padding:6px;'>2</td><td style='border:1px solid #ccc;padding:6px;'>2</td><td style='border:1px solid #ccc;padding:6px;'>1998, 2018</td></tr></table>",
"table_title": "FIFA World Cup All-Time Results",
"source": "Wikipedia"
},
{
"id": "tabfact_002",
"claim": "The population of Tokyo is less than that of Delhi.",
"table_html": "<table style='border-collapse: collapse; width: 100%;'><tr style='background:#dbeafe;'><th style='border:1px solid #ccc;padding:6px;'>City</th><th style='border:1px solid #ccc;padding:6px;'>Country</th><th style='border:1px solid #ccc;padding:6px;'>Population (millions)</th><th style='border:1px solid #ccc;padding:6px;'>Area (km2)</th><th style='border:1px solid #ccc;padding:6px;'>Density (/km2)</th></tr><tr><td style='border:1px solid #ccc;padding:6px;'>Tokyo</td><td style='border:1px solid #ccc;padding:6px;'>Japan</td><td style='border:1px solid #ccc;padding:6px;'>37.4</td><td style='border:1px solid #ccc;padding:6px;'>2,191</td><td style='border:1px solid #ccc;padding:6px;'>6,158</td></tr><tr><td style='border:1px solid #ccc;padding:6px;'>Delhi</td><td style='border:1px solid #ccc;padding:6px;'>India</td><td style='border:1px solid #ccc;padding:6px;'>32.9</td><td style='border:1px solid #ccc;padding:6px;'>1,484</td><td style='border:1px solid #ccc;padding:6px;'>11,312</td></tr><tr><td style='border:1px solid #ccc;padding:6px;'>Shanghai</td><td style='border:1px solid #ccc;padding:6px;'>China</td><td style='border:1px solid #ccc;padding:6px;'>29.2</td><td style='border:1px solid #ccc;padding:6px;'>6,341</td><td style='border:1px solid #ccc;padding:6px;'>3,854</td></tr><tr><td style='border:1px solid #ccc;padding:6px;'>Sao Paulo</td><td style='border:1px solid #ccc;padding:6px;'>Brazil</td><td style='border:1px solid #ccc;padding:6px;'>22.4</td><td style='border:1px solid #ccc;padding:6px;'>1,521</td><td style='border:1px solid #ccc;padding:6px;'>7,712</td></tr><tr><td style='border:1px solid #ccc;padding:6px;'>Mexico City</td><td style='border:1px solid #ccc;padding:6px;'>Mexico</td><td style='border:1px solid #ccc;padding:6px;'>21.8</td><td style='border:1px solid #ccc;padding:6px;'>1,485</td><td style='border:1px solid #ccc;padding:6px;'>9,209</td></tr></table>",
"table_title": "World's Largest Metropolitan Areas by Population",
"source": "Wikipedia"
}
]
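Each sample item stores its table as raw HTML in `table_html`, so cell values can be recovered with the standard-library parser when you want to spot-check claims programmatically. The sketch below uses a trimmed version of the first sample table (Country/Titles columns only) and checks the claim "Brazil has won the FIFA World Cup more times than Germany"; it assumes the simple `<tr>`/`<th>`/`<td>` layout shown in the sample data, not arbitrary Wikipedia markup.

```python
from html.parser import HTMLParser

class TableExtractor(HTMLParser):
    """Collect the text of <th>/<td> cells into a list of rows."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "tr" and self._row:
            self.rows.append(self._row)
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

# Trimmed version of the table in sample item tabfact_001.
TABLE_HTML = (
    "<table><tr><th>Country</th><th>Titles</th></tr>"
    "<tr><td>Brazil</td><td>5</td></tr>"
    "<tr><td>Germany</td><td>4</td></tr></table>"
)

parser = TableExtractor()
parser.feed(TABLE_HTML)
header, *body = parser.rows
titles = {row[0]: int(row[1]) for row in body}
verdict = "entailed" if titles["Brazil"] > titles["Germany"] else "refuted"
print(verdict)  # → entailed
```

Checks like this are useful for seeding gold questions: claims whose verdict follows from a single numeric comparison can be auto-labeled and used to monitor annotator accuracy.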
// ... and 8 more items
Get This Design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/tabular/tabfact-table-verification
potato start config.yaml
Related Designs
Structured Fact Verification (FEVEROUS)
Verify claims against Wikipedia pages containing both unstructured text and structured data (tables, lists, infoboxes). Based on FEVEROUS (Aly et al., NeurIPS 2021). Annotators assess verdict, identify evidence across different modalities, evaluate evidence sufficiency, and provide justification.
Check-COVID: Fact-Checking COVID-19 News Claims
Fact-checking COVID-19 news claims. Annotators verify claims against evidence, identify supporting/refuting spans, and provide verdicts with explanations. Based on the Check-COVID dataset targeting misinformation during the pandemic.
Clickbait Spoiling
Classification and extraction of spoilers for clickbait posts, including spoiler type identification and span-level spoiler detection. Based on SemEval-2023 Task 5 (Hagen et al.).