Pairwise Preference with Rationale
Compare two AI responses and select the better one while providing a written justification. Used for reward model training with interpretable preference signals.
Configuration File (config.yaml)
# Pairwise Preference with Rationale Configuration
# Based on HelpSteer2-Preference / HelpSteer3-Preference methodology
# Task: Compare two responses, select the better one, and explain why
annotation_task_name: "Pairwise Preference with Rationale"
task_dir: "."
# Data configuration
data_files:
  - data.json
item_properties:
  id_key: "id"
  text_key: "prompt"
# Output
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
# Display layout showing prompt and both responses
html_layout: |
  <div class="comparison-container">
    <div class="prompt-section" style="background: #f5f5f5; padding: 15px; border-radius: 8px; margin-bottom: 20px;">
      <h3 style="margin-top: 0;">User Prompt:</h3>
      <div class="prompt-text">{{prompt}}</div>
    </div>
    <div class="responses-row" style="display: flex; gap: 20px;">
      <div class="response-a" style="flex: 1; background: #e3f2fd; padding: 15px; border-radius: 8px;">
        <h3 style="margin-top: 0; color: #1565c0;">Response A:</h3>
        <div class="response-text">{{response_a}}</div>
      </div>
      <div class="response-b" style="flex: 1; background: #fce4ec; padding: 15px; border-radius: 8px;">
        <h3 style="margin-top: 0; color: #c62828;">Response B:</h3>
        <div class="response-text">{{response_b}}</div>
      </div>
    </div>
  </div>
# Annotation schemes
annotation_schemes:
  - name: "preference"
    description: "Which response is better overall?"
    annotation_type: radio
    labels:
      - "Response A is significantly better"
      - "Response A is slightly better"
      - "About the same (tie)"
      - "Response B is slightly better"
      - "Response B is significantly better"
    keyboard_shortcuts:
      "Response A is significantly better": "1"
      "Response A is slightly better": "2"
      "About the same (tie)": "3"
      "Response B is slightly better": "4"
      "Response B is significantly better": "5"
  - name: "confidence"
    description: "How confident are you in your preference judgment?"
    annotation_type: radio
    labels:
      - "Very confident"
      - "Somewhat confident"
      - "Not very confident"
    keyboard_shortcuts:
      "Very confident": "q"
      "Somewhat confident": "w"
      "Not very confident": "e"
  - name: "rationale"
    description: |
      Explain why you prefer one response over the other.
      Consider: helpfulness, accuracy, clarity, completeness, and appropriateness.
    annotation_type: text
    min_length: 20
    max_length: 500
    placeholder: "Explain your preference (e.g., 'Response A is better because it provides more specific examples and addresses the question directly, while Response B is too vague and misses key points...')"
  - name: "key_differences"
    description: "What are the main factors that influenced your decision? (Select all that apply)"
    annotation_type: multiselect
    labels:
      - "Accuracy/Correctness"
      - "Helpfulness"
      - "Completeness"
      - "Clarity/Organization"
      - "Relevance"
      - "Appropriate length"
      - "Better examples"
      - "More engaging tone"
      - "Safety/Appropriateness"
# User configuration
allow_all_users: true
# Task assignment
instances_per_annotator: 80
annotation_per_instance: 2
# Instructions
annotation_instructions: |
  ## Pairwise Response Comparison Task

  Your goal is to compare two AI responses and determine which is better.

  ### Steps:

  1. Read the user prompt carefully
  2. Read both Response A and Response B thoroughly
  3. Select your preference (A better, B better, or tie)
  4. Rate your confidence in the judgment
  5. Write a brief rationale explaining your choice
  6. Select the key factors that influenced your decision

  ### Evaluation Criteria:

  **Primary Factors:**

  - **Helpfulness**: Does it address what the user actually needs?
  - **Accuracy**: Is the information correct and reliable?
  - **Completeness**: Does it fully answer the question?

  **Secondary Factors:**

  - **Clarity**: Is it well-organized and easy to understand?
  - **Relevance**: Does it stay on topic?
  - **Appropriateness**: Is the tone and length suitable?

  ### Writing Your Rationale:

  - Be specific about what makes one response better
  - Reference concrete examples from the responses
  - Keep it concise (2-4 sentences is typically sufficient)
  - Focus on the most important differences

  ### When to Choose "Tie":

  - Both responses are roughly equally good (or equally bad)
  - They have different strengths that balance out
  - Use sparingly: most pairs will have a clear winner

  ### Tips:

  - Read the prompt first to understand user intent
  - Don't let response length alone determine your preference
  - Consider what would be most useful for the user
  - Be consistent in your evaluation criteria
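Downstream of annotation, the 5-point preference labels above can be collapsed into chosen/rejected pairs for reward-model or DPO-style training. A minimal Python sketch, assuming each annotation output record carries the item `id` and the selected `preference` label (the exact output schema depends on your Potato version, so treat the record shape as an assumption):

```python
# Map the 5-point preference labels to a signed score:
# positive favors Response A, negative favors Response B.
LABEL_SCORES = {
    "Response A is significantly better": 2,
    "Response A is slightly better": 1,
    "About the same (tie)": 0,
    "Response B is slightly better": -1,
    "Response B is significantly better": -2,
}

def to_preference_pairs(annotations, items):
    """Turn annotation records into prompt/chosen/rejected triples.

    `annotations`: list of dicts with "id" and "preference" keys
    (an assumed output shape); `items`: item id -> source record.
    Ties are skipped since they carry no preference signal.
    """
    pairs = []
    for ann in annotations:
        score = LABEL_SCORES[ann["preference"]]
        if score == 0:
            continue  # tie: no usable signal
        item = items[ann["id"]]
        chosen, rejected = (
            (item["response_a"], item["response_b"])
            if score > 0
            else (item["response_b"], item["response_a"])
        )
        pairs.append(
            {"prompt": item["prompt"], "chosen": chosen, "rejected": rejected}
        )
    return pairs
```

Keeping the slight/significant distinction in `LABEL_SCORES` also lets you weight training pairs by preference strength later, rather than treating all wins equally.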
Sample Data (sample-data.json)
[
  {
    "id": "pref_001",
    "prompt": "How do I make a good cup of coffee at home?",
    "response_a": "To make good coffee: 1) Use fresh, quality beans. 2) Grind just before brewing. 3) Use proper water temperature (195-205°F). 4) Measure your coffee (1:16 ratio of coffee to water). 5) Use filtered water. The key is consistency - once you find what you like, stick with it.",
    "response_b": "Just put coffee in the machine and press start. Add cream and sugar if you want."
  },
  {
    "id": "pref_002",
    "prompt": "Explain machine learning in simple terms.",
    "response_a": "Machine learning is when computers learn patterns from examples. Like how you learned to recognize dogs by seeing many dogs, a computer can learn to recognize spam emails by seeing thousands of examples labeled 'spam' or 'not spam'. It finds patterns and uses them to make predictions on new data it hasn't seen before.",
    "response_b": "Machine learning is a subset of artificial intelligence that uses statistical techniques to enable computer systems to learn from data, identify patterns, and make decisions with minimal human intervention. It employs algorithms that iteratively learn from data, allowing computers to find hidden insights without being explicitly programmed where to look. The three main types are supervised learning, unsupervised learning, and reinforcement learning."
  }
]
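Before launching the task, it can be worth checking that every record in the data file carries the fields the `html_layout` template references. A small Python sketch (the required-key set simply mirrors the sample data above):

```python
import json

# Fields referenced by the html_layout template and item_properties.
REQUIRED_KEYS = {"id", "prompt", "response_a", "response_b"}

def load_items(path):
    """Load a pairwise-preference data file and verify each record
    has the keys the config expects. Returns a dict keyed by id."""
    with open(path) as f:
        items = json.load(f)
    for i, item in enumerate(items):
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(
                f"record {i} ({item.get('id', '?')}) is missing {sorted(missing)}"
            )
    return {item["id"]: item for item in items}
```

A record missing `response_b`, for example, would otherwise render as an empty pane in the comparison layout, which annotators tend to mistake for an intentionally blank response.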
// ... and 3 more items
Get This Design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/preference-learning/pairwise-preference-rationale
potato start config.yaml
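Since `annotation_per_instance: 2` routes each item to two annotators, a quick agreement check on the 5-point scale helps catch guideline drift early. A Python sketch treating the labels as an ordinal scale, so slight-vs-significant disagreements count as closer than full A-vs-B flips (the annotation record shape is an assumption):

```python
from collections import defaultdict

# Ordinal positions of the 5-point preference labels.
SCALE = [
    "Response A is significantly better",
    "Response A is slightly better",
    "About the same (tie)",
    "Response B is slightly better",
    "Response B is significantly better",
]
RANK = {label: i for i, label in enumerate(SCALE)}

def agreement_stats(annotations):
    """Group annotations by item id (two per item here) and report the
    share of exact label matches and the mean absolute rank distance.
    Assumes records with "id" and "preference" keys."""
    by_item = defaultdict(list)
    for ann in annotations:
        by_item[ann["id"]].append(RANK[ann["preference"]])
    pairs = [ranks for ranks in by_item.values() if len(ranks) == 2]
    if not pairs:
        return {"exact": 0.0, "mean_distance": 0.0}
    exact = sum(a == b for a, b in pairs) / len(pairs)
    dist = sum(abs(a - b) for a, b in pairs) / len(pairs)
    return {"exact": exact, "mean_distance": dist}
```

A large mean distance concentrated on full A-vs-B flips (distance 3 or more) usually signals a guideline problem rather than ordinary slight/significant disagreement.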
Related Designs
DPO Preference Data Collection
Pairwise preference annotation for Direct Preference Optimization, based on Rafailov et al., NeurIPS 2023. Annotators compare two model responses to a prompt, select a preference, rate alignment dimensions, and provide reasoning.
FLUTE: Figurative Language Understanding through Textual Explanations
Figurative language understanding via NLI. Annotators classify figurative sentences (sarcasm, simile, metaphor, idiom) and provide textual explanations of the figurative meaning. The task combines natural language inference with fine-grained figurative language type classification.
LexGLUE - Legal Document Understanding
Legal document understanding and classification based on the LexGLUE benchmark (Chalkidis et al., ACL 2022). Annotators assess the relevance of legal case summaries, classify applicable areas of law, and provide legal reasoning.