LLM Response Preference
Compare AI-generated responses to collect preference data for RLHF training.
Get This Design
This design is available in our showcase. Copy the configuration below to get started.
Quick Start:
# Create your project folder
mkdir pairwise-preference
cd pairwise-preference
# Copy config.yaml from above
potato start config.yaml
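The config.yaml referenced above defines the annotation task. A minimal sketch for a pairwise-preference setup is shown below, assuming Potato's standard configuration schema; the file names, label values, and scheme name here are illustrative placeholders, not the showcase's actual configuration.

```yaml
# Illustrative Potato config sketch -- adapt keys and values to your task.
annotation_task_name: "LLM Response Preference"

# Input data: one item per line, each containing a prompt and two responses.
data_files:
  - "data/response_pairs.jsonl"

item_properties:
  id_key: "id"
  text_key: "text"

# A single radio-button scheme asking which response the annotator prefers.
annotation_schemes:
  - annotation_type: "radio"
    name: "preference"
    description: "Which response is better?"
    labels:
      - "Response A"
      - "Response B"

output_annotation_dir: "annotation_output/"
output_annotation_format: "jsonl"
```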
Related Designs
DPO Preference Data Collection
Pairwise preference annotation for Direct Preference Optimization, based on Rafailov et al., NeurIPS 2023. Annotators compare two model responses to a prompt, select a preference, rate alignment dimensions, and provide reasoning.
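Annotations collected this way are typically post-processed into (prompt, chosen, rejected) triples for DPO training. A minimal sketch of that conversion follows; the field names (`prompt`, `response_a`, `response_b`, `preference`) are assumptions about the export format, not Potato's actual output schema.

```python
# Convert pairwise preference annotations into DPO-style training triples.
# Field names below are illustrative -- adapt them to your export format.

def to_dpo_triples(annotations):
    """Map each annotated pair to a {prompt, chosen, rejected} record."""
    triples = []
    for ann in annotations:
        if ann["preference"] == "A":
            chosen, rejected = ann["response_a"], ann["response_b"]
        else:
            chosen, rejected = ann["response_b"], ann["response_a"]
        triples.append({
            "prompt": ann["prompt"],
            "chosen": chosen,
            "rejected": rejected,
        })
    return triples


annotations = [
    {"prompt": "Explain RLHF briefly.",
     "response_a": "RLHF fine-tunes a model with human preference feedback.",
     "response_b": "RLHF is a kind of database.",
     "preference": "A"},
]
print(to_dpo_triples(annotations))
```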
Pairwise Preference with Rationale
Compare two AI responses and select the better one while providing a written justification. Used for reward model training with interpretable preference signals.
FLUTE: Figurative Language Understanding through Textual Explanations
Figurative language understanding via NLI. Annotators classify figurative sentences (sarcasm, simile, metaphor, idiom) and provide textual explanations of the figurative meaning. The task combines natural language inference with fine-grained figurative language type classification.