Sequential Monte Carlo sampling brings training-free precision to diffusion model alignment.
A training-free method that uses Sequential Monte Carlo sampling to align diffusion models with specific reward objectives while preserving their versatility and avoiding over-optimization.
-----
https://arxiv.org/abs/2501.05803v1
🎯 Original Problem:
→ Current methods to align diffusion models either suffer from over-optimization through fine-tuning or under-optimization through guidance approaches
→ Fine-tuning methods lose model versatility and sample diversity, while guidance methods fail to effectively optimize target rewards
-----
🔬 Solution in this Paper:
→ Introduces Diffusion Alignment as Sampling (DAS), a training-free method using Sequential Monte Carlo sampling
→ Uses multiple candidate latents to average out errors in estimated corrections
→ Employs carefully designed tempering techniques for efficient sampling (see the sketch after this list)
→ Incorporates theoretical guarantees for asymptotic exactness and sample efficiency
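
A minimal sketch of the reward-tempered Sequential Monte Carlo idea, not the paper's implementation: the denoiser, reward function, tempering schedule, and all constants below are toy stand-ins chosen for illustration only.

```python
# Toy sketch: SMC over diffusion latents with a reward-tempered potential.
# Assumptions (not from the paper): a 2-D latent, a drift-toward-origin
# "denoiser", a quadratic reward, and a linear tempering schedule.
import numpy as np

rng = np.random.default_rng(0)
DIM, N_PARTICLES, N_STEPS, ALPHA = 2, 64, 50, 0.1  # ALPHA: reward temperature

def denoise_step(x, t):
    """Toy stand-in for one reverse-diffusion step."""
    return x * (1.0 - 1.0 / (t + 1)) + 0.05 * rng.standard_normal(x.shape)

def reward(x):
    """Toy reward: prefer samples near the point (1, 1)."""
    return -np.sum((x - 1.0) ** 2, axis=-1)

# Multiple candidate latents drawn from the prior.
particles = rng.standard_normal((N_PARTICLES, DIM))
log_w = np.zeros(N_PARTICLES)

for step in range(N_STEPS, 0, -1):
    # Tempering: turn the reward on gradually, so early steps stay close to
    # the pretrained model and later steps align with the reward.
    lam = (N_STEPS - step + 1) / N_STEPS       # current tempering level
    lam_prev = (N_STEPS - step) / N_STEPS      # previous tempering level

    new_particles = denoise_step(particles, step)
    # Incremental importance weights from the change in tempered reward potential.
    log_w += (lam * reward(new_particles) - lam_prev * reward(particles)) / ALPHA
    particles = new_particles

    # Resample when the effective sample size collapses; averaging over many
    # candidates smooths out errors in the estimated corrections.
    w = np.exp(log_w - log_w.max()); w /= w.sum()
    ess = 1.0 / np.sum(w ** 2)
    if ess < N_PARTICLES / 2:
        idx = rng.choice(N_PARTICLES, size=N_PARTICLES, p=w)
        particles, log_w = particles[idx], np.zeros(N_PARTICLES)

print("mean of aligned samples:", particles.mean(axis=0))
```

Because sampling replaces gradient updates to the model weights, nothing about the pretrained backbone changes, which is what keeps the method training-free.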
-----
⚡ Key Insights:
→ Training-free methods can achieve better results than fine-tuning approaches
→ Multiple candidate sampling with tempering is key to balancing reward optimization and diversity (formalized below)
→ Theoretical guarantees ensure reliable sampling from reward-aligned distributions
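
For intuition, this line of work typically aims to sample from a reward-tilted version of the pretrained distribution rather than to maximize the reward outright; the notation below is mine, given as a common formulation rather than a quote of the paper:

$$p_{\text{target}}(x_0) \;\propto\; p_{\text{pre}}(x_0)\,\exp\!\big(r(x_0)/\alpha\big),$$

with tempered intermediate targets $\pi_t(x) \propto p_{\text{pre}}(x)\,\exp\!\big(\lambda_t\, r(x)/\alpha\big)$, where $\lambda_t$ rises from 0 to 1 over the reverse process. Keeping the pretrained density as a factor is what preserves diversity, while the exponential reward term steers samples toward high-reward regions.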
-----
📊 Results:
→ Outperforms all fine-tuning baselines in aesthetic score and human preference metrics
→ Achieves 20% improvement in both target and unseen rewards
→ Maintains sample diversity and cross-reward generalization
→ Works effectively with different diffusion backbones like Stable Diffusion v1.5 and SDXL