Prune diffusion models by 20% without retraining, while keeping the same image quality
This paper introduces EcoDiff, a framework that prunes diffusion models without any retraining, cutting model size by 20% while maintaining image quality. It tackles the practical challenge of making large diffusion models cheaper to deploy and less resource-hungry.
-----
https://arxiv.org/abs/2412.02852
🎯 Original Problem:
→ Modern diffusion models like SDXL and FLUX have grown extremely large, demanding substantial GPU memory and compute
→ Existing pruning methods need extensive retraining, costing up to $1M on AWS for models like Stable Diffusion 2
-----
🔧 Solution in this Paper:
→ EcoDiff introduces a model-agnostic structural pruning framework using differentiable neuron masking
→ It employs an end-to-end pruning objective that preserves the final denoised latent across all denoising steps
→ A novel time-step gradient checkpointing technique reduces memory usage from 1400GB to under 30GB
→ The framework learns which neurons to remove through a continuous relaxation of discrete masking (see the sketch after this list)
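Here is a minimal PyTorch sketch of the masking idea, under stated assumptions: a toy FFN stands in for the model's real attention/FFN blocks, and `MaskedFFN`, `pruning_loss`, and `lam` are illustrative names rather than the paper's API. One sigmoid-gated logit per hidden neuron gives the continuous relaxation; the loss matches the unpruned model's final latent and adds a sparsity push toward zero.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedFFN(nn.Module):
    """Toy FFN with a learnable soft mask over its hidden neurons.

    Stand-in for the pretrained model's FFN/attention blocks, where
    the masks would actually be attached; names here are illustrative.
    """
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden)
        self.fc2 = nn.Linear(hidden, dim)
        # One logit per hidden neuron; sigmoid maps it into (0, 1) --
        # a continuous relaxation of the discrete keep/remove decision.
        self.mask_logits = nn.Parameter(torch.zeros(hidden))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate = torch.sigmoid(self.mask_logits)           # soft mask in (0, 1)
        return self.fc2(torch.relu(self.fc1(x)) * gate)  # gate each neuron

def pruning_loss(masked_latent, reference_latent, mask_logits, lam=1e-3):
    """End-to-end objective: match the unpruned model's *final* denoised
    latent, plus a sparsity penalty that pushes gates toward zero."""
    fidelity = F.mse_loss(masked_latent, reference_latent)
    sparsity = torch.sigmoid(mask_logits).mean()
    return fidelity + lam * sparsity

# Only the mask logits are trained; the pretrained weights stay frozen.
ffn = MaskedFFN(dim=64, hidden=256)
for p in ffn.parameters():
    p.requires_grad_(False)
ffn.mask_logits.requires_grad_(True)

x = torch.randn(4, 64)
ref = x.detach()  # stand-in for the original model's final latent
loss = pruning_loss(ffn(x), ref, ffn.mask_logits)
loss.backward()   # gradients reach only mask_logits
```

In a setup like this, gates that end up below a threshold would be binarized to zero after optimization, and the corresponding rows of fc1 and columns of fc2 can be physically sliced away, which is what makes the pruning structural rather than merely sparse.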
-----
💡 Key Insights:
→ Diffusion models have significant parameter redundancy that can be removed without retraining
→ End-to-end optimization is more effective than per-step pruning for maintaining image quality
→ Memory efficiency can be dramatically improved by checkpointing at the granularity of denoising time steps (sketched below)
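Below is a minimal sketch of the time-step checkpointing trick, assuming a toy one-line update rule in place of a real sampler; `denoise_step` and `run_with_step_checkpointing` are hypothetical names. Each step is wrapped in torch.utils.checkpoint, so the backward pass recomputes a step's internal activations on demand instead of storing every step's activations at once.

```python
import torch
from torch.utils.checkpoint import checkpoint

def denoise_step(latent, t, model):
    """One denoising step; `model` stands in for the masked UNet/DiT and
    the update rule is a toy placeholder, not a real sampler."""
    return latent - (0.01 + 0.001 * t) * model(latent)

def run_with_step_checkpointing(latent, model, num_steps=50):
    """Checkpoint each time step: the forward pass keeps only each step's
    input latent; activations inside a step are recomputed during backward,
    so peak memory scales with one step rather than with num_steps."""
    for t in range(num_steps):
        latent = checkpoint(denoise_step, latent, t, model,
                            use_reentrant=False)
    return latent

model = torch.nn.Linear(64, 64)                  # placeholder denoiser
latent = torch.randn(2, 64, requires_grad=True)
out = run_with_step_checkpointing(latent, model, num_steps=10)
out.sum().backward()  # each step is recomputed on demand during backward
```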
-----
📊 Results:
→ Successfully pruned 20% of parameters from SDXL and FLUX without quality loss
→ Reduced SDXL's deployment footprint to fit on an 8GB GPU
→ Maintained or improved FID scores compared to the original models
→ Compatible with existing efficiency techniques like time step distillation