AlpacaFarm Preference Simulation

Simulate human preferences for instruction-following responses. Create preference data for efficient RLHF research and LLM evaluation.

Configuration file: config.yaml

# AlpacaFarm Preference Simulation Configuration
# Based on Dubois et al., NeurIPS 2023
# Task: Collect preferences for instruction-following

annotation_task_name: "AlpacaFarm Preference Simulation"
task_dir: "."

data_files:
  - data.json
item_properties:
  id_key: "id"
  text_key: "text"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

annotation_schemes:
  - name: "preference"
    description: |
      Which response better follows the instruction?
      Consider helpfulness, accuracy, and appropriateness.
    annotation_type: radio
    labels:
      - "Response 1 is much better"
      - "Response 1 is slightly better"
      - "Tie - both are equal"
      - "Response 2 is slightly better"
      - "Response 2 is much better"

  - name: "preference_reason"
    description: "Primary reason for your preference:"
    annotation_type: radio
    labels:
      - "More accurate/correct"
      - "More helpful/useful"
      - "Better formatted/organized"
      - "More appropriate tone"
      - "More complete"
      - "More concise"
      - "Followed instructions better"
      - "Both equally good/bad"
      - "Other"

  - name: "response1_quality"
    description: "Rate Response 1 quality (1-5):"
    annotation_type: likert
    min_label: "1 - Very poor"
    max_label: "5 - Excellent"
    size: 5

  - name: "response2_quality"
    description: "Rate Response 2 quality (1-5):"
    annotation_type: likert
    min_label: "1 - Very poor"
    max_label: "5 - Excellent"
    size: 5

  - name: "task_difficulty"
    description: "How difficult was this instruction to follow well?"
    annotation_type: radio
    labels:
      - "Very easy"
      - "Easy"
      - "Moderate"
      - "Difficult"
      - "Very difficult"

  - name: "annotation_confidence"
    description: "How confident are you in your preference?"
    annotation_type: radio
    labels:
      - "Very confident"
      - "Somewhat confident"
      - "Not very confident"

allow_all_users: true
instances_per_annotator: 100
annotation_per_instance: 2

annotation_instructions: |
  ## AlpacaFarm Preference Annotation

  Compare two AI responses to the same instruction and indicate your preference.

  ### Evaluation Criteria:

  **Instruction Following**
  - Did it do what was asked?
  - Did it follow any specific requirements?
  - Did it stay on topic?

  **Helpfulness**
  - Would this actually help the user?
  - Is the information useful?
  - Is it actionable?

  **Accuracy**
  - Is the information correct?
  - Are there factual errors?
  - Is it misleading?

  **Quality**
  - Is it well-written?
  - Is it appropriately detailed?
  - Is the tone appropriate?

  ### Preference Scale:
  - **Much better**: Clear, obvious winner
  - **Slightly better**: Marginal advantage
  - **Tie**: Genuinely indistinguishable

  ### Guidelines:
  - Read the instruction first
  - Consider what a typical user would want
  - Don't overthink - go with your gut
  - Ties are okay when responses are equal

  ### Common Reasons for Preference:
  - More accurate information
  - Better addresses the actual question
  - More helpful/actionable
  - Better organized/formatted
  - More appropriate length
  - Better tone for the context
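
With `annotation_per_instance: 2`, each item receives two labels from the `preference` scheme, and they may disagree. The following is a minimal Python sketch of one way to combine them, not part of the configuration above. It assumes the labels for an item have already been read into a plain list of strings; the signed scoring convention and the function names `aggregate_preference` and `winner` are illustrative assumptions.

```python
from statistics import mean

# Signed scores for the five labels of the "preference" scheme.
# Convention (an assumption): negative favors Response 1, positive favors Response 2.
PREFERENCE_SCORES = {
    "Response 1 is much better": -2,
    "Response 1 is slightly better": -1,
    "Tie - both are equal": 0,
    "Response 2 is slightly better": 1,
    "Response 2 is much better": 2,
}


def aggregate_preference(labels):
    """Average the signed scores of all labels given to one item."""
    return mean(PREFERENCE_SCORES[label] for label in labels)


def winner(labels):
    """Return 'response_1', 'response_2', or 'tie' from the averaged score."""
    score = aggregate_preference(labels)
    if score < 0:
        return "response_1"
    if score > 0:
        return "response_2"
    return "tie"


# Two annotators disagree; the stronger preference decides the sign.
example = ["Response 1 is much better", "Response 2 is slightly better"]
print(aggregate_preference(example), winner(example))
```

Averaging keeps the strength of each preference, so a "much better" vote outweighs an opposing "slightly better" vote. A majority vote over the sign alone would treat that same pair as a tie.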

Sample data: sample-data.json

[
  {
    "id": "alpaca_001",
    "instruction": "Explain the concept of compound interest to a teenager.",
    "response_1": "Compound interest is when you earn interest on both your original money AND on the interest you've already earned. Imagine you put $100 in a savings account with 10% interest. After year 1, you have $110. In year 2, you earn 10% on $110 (not just $100), giving you $121. It's like a snowball rolling downhill, getting bigger and bigger!",
    "response_2": "Compound interest is the interest calculated on the initial principal and also on the accumulated interest from previous periods. The formula is A = P(1 + r/n)^(nt), where A is the final amount, P is principal, r is the annual interest rate, n is the number of times interest is compounded per year, and t is the number of years."
  },
  {
    "id": "alpaca_002",
    "instruction": "Write a short poem about coffee.",
    "response_1": "Morning's dark embrace,\nSteam rises, warmth fills the cup,\nEnergy awaits.",
    "response_2": "Coffee, oh coffee, so brown and so hot,\nI drink you each morning, I like you a lot.\nYou wake me up when I'm feeling so tired,\nYour caffeine boost is what I desired."
  }
]
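
Each sample item carries `id`, `instruction`, `response_1`, and `response_2`, while the configuration sets `text_key: "text"`, a field the sample items do not contain. The Python sketch below checks items for the four fields and shows one hypothetical way to build a single `text` value from them. The helper names `check_items` and `with_text` and the layout of the combined string are assumptions for illustration; how the tool itself displays the fields is not stated here.

```python
REQUIRED_KEYS = ("id", "instruction", "response_1", "response_2")


def check_items(items):
    """Return (id, missing_keys) pairs for items that lack a required key."""
    problems = []
    for item in items:
        missing = [key for key in REQUIRED_KEYS if key not in item]
        if missing:
            problems.append((item.get("id"), missing))
    return problems


def with_text(item):
    """Hypothetical helper: copy the item and add a combined "text" field."""
    combined = (
        f"Instruction: {item['instruction']}\n\n"
        f"Response 1: {item['response_1']}\n\n"
        f"Response 2: {item['response_2']}"
    )
    return {**item, "text": combined}


# One item taken from the sample data above.
items = [
    {
        "id": "alpaca_002",
        "instruction": "Write a short poem about coffee.",
        "response_1": "Morning's dark embrace,\nSteam rises, warmth fills the cup,\nEnergy awaits.",
        "response_2": "Coffee, oh coffee, so brown and so hot,\nI drink you each morning, I like you a lot.\nYou wake me up when I'm feeling so tired,\nYour caffeine boost is what I desired.",
    }
]

print(check_items(items))          # empty list when every item is complete
print(with_text(items[0])["text"])
```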

Download this design

View on GitHub

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/preference-learning/alpacafarm-simulation
potato start config.yaml

Details

Annotation types

likert, radio

Domain

Natural Language Processing, AI Alignment

Use cases

RLHF, Preference Learning, Model Evaluation

Tags

preference, simulation, instruction, alpaca, rlhf, llm

Related designs

InstructGPT Instruction Following

Constitutional AI Harmlessness Evaluation

OpenAssistant Conversation Quality