[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward (a minimal loss sketch follows this list)
[ACL 2024 Findings] Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering
Video Generation Benchmark
DPO-Shift: Shifting the Distribution of Direct Preference Optimization
Code for "ReSpace: Text-Driven 3D Indoor Scene Synthesis and Editing with Preference Alignment"
[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
Source code for "A Dense Reward View on Aligning Text-to-Image Diffusion with Preference" (ICML'24).
[ICCV 2025] Official repository of "Mitigating Object Hallucinations via Sentence-Level Early Intervention".
[ICLR 2025] Official code of "Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization"
[ICML 25] "Preference Optimization for Combinatorial Optimization Problems"
[ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
[ICML 2025] TGDPO: Harnessing Token-Level Reward Guidance for Enhancing Direct Preference Optimization
[NeurIPS 2025] Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback
LLM-Driven Preference Data Synthesis for Proactive Prediction of the User’s Next Utterance in Human–Machine Dialogue
Survey of preference alignment algorithms
Generate synthetic datasets for instruction tuning and preference alignment using tools like `distilabel` for efficient and scalable data creation.
Creating a GPT-2-Based Chatbot with Human Preferences
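For orientation, here is a minimal sketch of the reference-free objective named in the SimPO entry above: the reward for a response is its length-normalized log-likelihood under the policy itself (no reference model), and a Bradley-Terry loss pushes the chosen response's reward above the rejected one's by a target margin. The function name and the default values of `beta` and `gamma` are illustrative assumptions, not taken from the repository's code.

```python
import torch
import torch.nn.functional as F

def simpo_loss(chosen_logps: torch.Tensor,
               rejected_logps: torch.Tensor,
               chosen_lens: torch.Tensor,
               rejected_lens: torch.Tensor,
               beta: float = 2.0,
               gamma: float = 1.0) -> torch.Tensor:
    """Reference-free SimPO loss over a batch of preference pairs.

    chosen_logps / rejected_logps: summed token log-probs of each response under the policy.
    chosen_lens / rejected_lens: response lengths in tokens, used for length normalization.
    beta, gamma: reward scale and target reward margin (illustrative defaults).
    """
    # Reward = beta * average per-token log-likelihood (length-normalized, no reference model).
    chosen_reward = beta * chosen_logps / chosen_lens
    rejected_reward = beta * rejected_logps / rejected_lens
    # Bradley-Terry preference loss with a target margin gamma between chosen and rejected.
    return -F.logsigmoid(chosen_reward - rejected_reward - gamma).mean()
```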