AlpacaFarm Preference Simulation
Simulate human preferences for instruction-following responses. Create preference data for efficient RLHF research and LLM evaluation.
Configuration File: config.yaml
# AlpacaFarm Preference Simulation Configuration
# Based on Dubois et al., NeurIPS 2023
# Task: Collect preferences for instruction-following
annotation_task_name: "AlpacaFarm Preference Simulation"
task_dir: "."
data_files:
  - data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
annotation_schemes:
  - name: "preference"
    description: |
      Which response better follows the instruction?
      Consider helpfulness, accuracy, and appropriateness.
    annotation_type: radio
    labels:
      - "Response 1 is much better"
      - "Response 1 is slightly better"
      - "Tie - both are equal"
      - "Response 2 is slightly better"
      - "Response 2 is much better"
  - name: "preference_reason"
    description: "Primary reason for your preference:"
    annotation_type: radio
    labels:
      - "More accurate/correct"
      - "More helpful/useful"
      - "Better formatted/organized"
      - "More appropriate tone"
      - "More complete"
      - "More concise"
      - "Followed instructions better"
      - "Both equally good/bad"
      - "Other"
  - name: "response1_quality"
    description: "Rate Response 1 quality (1-5):"
    annotation_type: likert
    min_label: "1 - Very poor"
    max_label: "5 - Excellent"
    size: 5
  - name: "response2_quality"
    description: "Rate Response 2 quality (1-5):"
    annotation_type: likert
    min_label: "1 - Very poor"
    max_label: "5 - Excellent"
    size: 5
  - name: "task_difficulty"
    description: "How difficult was this instruction to follow well?"
    annotation_type: radio
    labels:
      - "Very easy"
      - "Easy"
      - "Moderate"
      - "Difficult"
      - "Very difficult"
  - name: "annotation_confidence"
    description: "How confident are you in your preference?"
    annotation_type: radio
    labels:
      - "Very confident"
      - "Somewhat confident"
      - "Not very confident"
allow_all_users: true
instances_per_annotator: 100
annotation_per_instance: 2
annotation_instructions: |
  ## AlpacaFarm Preference Annotation

  Compare two AI responses to the same instruction and indicate your preference.

  ### Evaluation Criteria:

  **Instruction Following**
  - Did it do what was asked?
  - Did it follow any specific requirements?
  - Did it stay on topic?

  **Helpfulness**
  - Would this actually help the user?
  - Is the information useful?
  - Is it actionable?

  **Accuracy**
  - Is the information correct?
  - Are there factual errors?
  - Is it misleading?

  **Quality**
  - Is it well-written?
  - Is it appropriately detailed?
  - Is the tone appropriate?

  ### Preference Scale:
  - **Much better**: Clear, obvious winner
  - **Slightly better**: Marginal advantage
  - **Tie**: Genuinely indistinguishable

  ### Guidelines:
  - Read the instruction first
  - Consider what a typical user would want
  - Don't overthink - go with your gut
  - Ties are okay when responses are equal

  ### Common Reasons for Preference:
  - More accurate information
  - Better addresses the actual question
  - More helpful/actionable
  - Better organized/formatted
  - More appropriate length
  - Better tone for the context
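The config's `item_properties` expects each item to expose an `id` and a single `text` field, while the sample data below stores the instruction and the two responses in separate fields. A minimal preprocessing sketch in Python, assuming the items are combined into one display string (the exact layout shown to annotators is an assumption, not something the config above prescribes):

```python
import json

def build_display_text(item):
    # Combine instruction + both responses into the single "text"
    # field that item_properties (text_key: "text") points at.
    # The formatting here is an illustrative assumption.
    return (
        f"Instruction: {item['instruction']}\n\n"
        f"Response 1:\n{item['response_1']}\n\n"
        f"Response 2:\n{item['response_2']}"
    )

def prepare_data(in_path="sample-data.json", out_path="data.json"):
    # Read the raw items, add the combined "text" field, and write
    # the file that data_files in config.yaml expects.
    with open(in_path) as f:
        items = json.load(f)
    for item in items:
        item["text"] = build_display_text(item)
    with open(out_path, "w") as f:
        json.dump(items, f, indent=2)
```

Running `prepare_data()` over sample-data.json would produce a data.json in the shape the configuration consumes.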
Sample Data: sample-data.json
[
  {
    "id": "alpaca_001",
    "instruction": "Explain the concept of compound interest to a teenager.",
    "response_1": "Compound interest is when you earn interest on both your original money AND on the interest you've already earned. Imagine you put $100 in a savings account with 10% interest. After year 1, you have $110. In year 2, you earn 10% on $110 (not just $100), giving you $121. It's like a snowball rolling downhill, getting bigger and bigger!",
    "response_2": "Compound interest is the interest calculated on the initial principal and also on the accumulated interest from previous periods. The formula is A = P(1 + r/n)^(nt), where A is the final amount, P is principal, r is the annual interest rate, n is the number of times interest is compounded per year, and t is the number of years."
  },
  {
    "id": "alpaca_002",
    "instruction": "Write a short poem about coffee.",
    "response_1": "Morning's dark embrace,\nSteam rises, warmth fills the cup,\nEnergy awaits.",
    "response_2": "Coffee, oh coffee, so brown and so hot,\nI drink you each morning, I like you a lot.\nYou wake me up when I'm feeling so tired,\nYour caffeine boost is what I desired."
  }
]
Get This Design
View on GitHub
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/preference-learning/alpacafarm-simulation
potato start config.yaml
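Once annotation is complete, the 5-point preference labels can be collapsed into pairwise winners for reward-model training. A sketch, assuming the export is a list of records carrying the item `id` and the chosen `preference` label (Potato's actual JSON export layout may differ):

```python
# Map the 5-point labels from config.yaml onto a pairwise winner:
# 1 = Response 1 wins, 2 = Response 2 wins, 0 = tie.
PREFERENCE_TO_WINNER = {
    "Response 1 is much better": 1,
    "Response 1 is slightly better": 1,
    "Tie - both are equal": 0,
    "Response 2 is slightly better": 2,
    "Response 2 is much better": 2,
}

def to_pairwise(records):
    # Keep only decisive judgments; drop ties and unknown labels.
    pairs = []
    for rec in records:
        winner = PREFERENCE_TO_WINNER.get(rec["preference"], 0)
        if winner:
            pairs.append({"id": rec["id"], "winner": winner})
    return pairs
```

Dropping ties is one common choice for pairwise reward modeling; some setups instead keep ties as soft labels.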
Details
Annotation Types
likert, radio
Domain
Natural Language Processing, AI Alignment
Use Cases
RLHF, Preference Learning, Model Evaluation
Tags
preference, simulation, instruction, alpaca, rlhf, llm
Found an issue or want to improve this design?
Open an Issue
Related Designs
InstructGPT Instruction Following
Evaluate how well AI responses follow user instructions. Compare outputs on helpfulness, truthfulness, and harmlessness for RLHF training.
likert, radio
Constitutional AI Harmlessness Evaluation
Evaluate AI assistant responses for harmlessness and helpfulness based on the Constitutional AI framework by Anthropic. Annotators rate responses on a harmfulness scale, assess helpfulness, and provide explanations for their judgments.
radio, likert
OpenAssistant Conversation Quality
Rate AI assistant responses across multiple quality dimensions. Evaluate conversations for the OpenAssistant crowdsourced dataset.
likert, radio