NLI with Explanations (e-SNLI)
Natural language inference with human explanations. Based on e-SNLI (Camburu et al., NeurIPS 2018). Classify entailment/contradiction/neutral and provide natural language justifications.
Configuration File: config.yaml
# NLI with Explanations (e-SNLI)
# Based on Camburu et al., NeurIPS 2018
# Paper: https://papers.nips.cc/paper/8163-e-snli
# Dataset: https://github.com/OanaMariaCamburu/e-SNLI
#
# Natural Language Inference (NLI) determines the relationship between
# a premise and hypothesis. e-SNLI adds human-written explanations.
#
# NLI Labels:
# - Entailment: The hypothesis is definitely TRUE given the premise
# - Contradiction: The hypothesis is definitely FALSE given the premise
# - Neutral: The hypothesis may or may not be true (can't determine)
#
# Explanation Guidelines:
# 1. First classify the relationship
# 2. Then highlight key words in premise and hypothesis
# 3. Write a natural language explanation justifying your label
# 4. Focus on NON-OBVIOUS elements that determine the relation
# 5. Don't just repeat what's identical in both sentences
#
# For Entailment: Explain why hypothesis must be true
# For Contradiction: Identify the conflicting information
# For Neutral: Explain what information is missing/uncertain
#
# Good explanations:
# - Are concise but complete
# - Reference specific words/phrases
# - Explain the reasoning rather than restating the sentences
annotation_task_name: "NLI with Explanations"
task_dir: "."
data_files:
- sample-data.json
item_properties:
id_key: "id"
text_key: "pair"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
annotation_schemes:
# Step 1: NLI label
- annotation_type: radio
name: label
description: "Given the PREMISE, what is the relationship with the HYPOTHESIS?"
labels:
- "Entailment"
- "Contradiction"
- "Neutral"
tooltips:
"Entailment": "If the premise is true, the hypothesis MUST be true"
"Contradiction": "If the premise is true, the hypothesis MUST be false"
"Neutral": "The premise doesn't determine whether the hypothesis is true or false"
# Step 2: Highlight key words in premise
- annotation_type: span
name: premise_highlights
description: "Highlight the words in the PREMISE that are most relevant to your decision"
labels:
- "Key Evidence"
label_colors:
"Key Evidence": "#3b82f6"
tooltips:
"Key Evidence": "Words or phrases that are crucial for determining the relationship"
allow_overlapping: false
# Step 3: Highlight key words in hypothesis
- annotation_type: span
name: hypothesis_highlights
description: "Highlight the words in the HYPOTHESIS that are most relevant to your decision"
labels:
- "Key Evidence"
label_colors:
"Key Evidence": "#22c55e"
tooltips:
"Key Evidence": "Words or phrases that are crucial for determining the relationship"
allow_overlapping: false
# Step 4: Confidence
- annotation_type: likert
name: confidence
description: "How confident are you in your classification?"
min_value: 1
max_value: 5
labels:
1: "Very uncertain"
2: "Somewhat uncertain"
3: "Moderately confident"
4: "Confident"
5: "Very confident"
allow_all_users: true
instances_per_annotator: 100
annotation_per_instance: 3
allow_skip: true
skip_reason_required: false
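Before launching the task, it can help to confirm that the data files match what the config expects. The sketch below is a hypothetical sanity check (not part of Potato itself): it verifies that every item in a data file carries the keys named by `id_key` ("id") and `text_key` ("pair") in the config above.

```python
import json

def check_items(path, id_key="id", text_key="pair"):
    """Verify each item in a JSON data file has the keys the config points at.

    Returns the number of items so the caller can confirm the file loaded."""
    with open(path) as f:
        items = json.load(f)
    for item in items:
        assert id_key in item, f"item missing {id_key!r}: {item}"
        assert text_key in item, f"item missing {text_key!r}: {item}"
    return len(items)
```

Running this against `sample-data.json` before `potato start` catches malformed items early, rather than at annotation time.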
Sample Data: sample-data.json
[
{
"id": "nli_001",
"premise": "A woman in a green jacket is sitting on a bench.",
"hypothesis": "A person is sitting outside.",
"pair": "PREMISE: A woman in a green jacket is sitting on a bench.\nHYPOTHESIS: A person is sitting outside."
},
{
"id": "nli_002",
"premise": "Two dogs are running through a snowy field.",
"hypothesis": "The dogs are sleeping inside a house.",
"pair": "PREMISE: Two dogs are running through a snowy field.\nHYPOTHESIS: The dogs are sleeping inside a house."
}
]
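The `pair` field shown above is just the premise and hypothesis joined into the single display string that the config's `text_key` points at. A minimal helper for generating it (a sketch, not part of the dataset tooling) might look like:

```python
def make_pair(premise: str, hypothesis: str) -> str:
    # Combine premise and hypothesis into the display string used as
    # the "pair" field (the config's text_key) in sample-data.json.
    return f"PREMISE: {premise}\nHYPOTHESIS: {hypothesis}"
```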
// ... and 6 more items
Get This Design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/explainability/nli-explanation
potato start config.yaml
Found an issue or want to improve this design?
Open an Issue
Related Designs
ESA: Error Span Annotation for Machine Translation
Error span annotation for machine translation output. Annotators identify error spans in translations, classify error types (accuracy, fluency, terminology, style), and rate severity.
LongEval: Faithfulness Evaluation for Long-form Summarization
Faithfulness evaluation of long-form summaries. Annotators identify atomic content units in summaries, check each against source documents for faithfulness, and rate overall summary quality.
News Headline Emotion Roles (GoodNewsEveryone)
Annotate emotions in news headlines with semantic roles. Based on Bostan et al., LREC 2020. Identify emotion, experiencer, cause, target, and textual cue.