Better ODE (ordinary differential equation) solving backfires in diffusion models: lower solver error can mean worse images.
When mathematical perfection meets practical imperfection in AI image generation.
This paper challenges the common belief that better ODE solving in Consistency Models (CMs) leads to better image generation. Through experiments with Direct CMs, the authors show that superior ODE solving accuracy can actually result in worse sample quality, questioning fundamental assumptions about diffusion model distillation.
-----
https://arxiv.org/abs/2411.08954
🤔 Original Problem:
→ Consistency Models are meant to speed up diffusion model sampling by learning to solve the Probability Flow ODE in a few network evaluations, but their theoretical foundation assumes that better ODE solving equals better image quality.
-----
🔧 Solution in this Paper:
→ The researchers introduce Direct CMs that minimize error directly against an ODE solver, unlike regular CMs which use indirect self-consistency training.
→ They compare Direct CMs with regular CMs using SDXL as the teacher model for controlled experiments.
→ Direct CMs achieve lower ODE solving error but surprisingly produce significantly worse sample quality.
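The contrast between the two objectives can be sketched on a toy 1-D problem. This is my own illustration, not the paper's code: the ODE `dx/dt = -x` stands in for the probability flow ODE, and `f(x, t)` plays the role of the consistency function that should map a state at time `t` to the trajectory's endpoint at `t = 0`. All function names here are hypothetical.

```python
import math

def ode_step(x, t, dt):
    """One Euler step of the toy ODE dx/dt = -x (stand-in for the PF-ODE)."""
    return x + dt * (-x)

def solve_to_zero(x, t, n_steps):
    """Integrate the toy ODE from time t down to 0 with n_steps Euler steps."""
    dt = -t / n_steps
    for i in range(n_steps):
        x = ode_step(x, t + i * dt, dt)
    return x

def direct_cm_loss(f, x, t, n_steps=100):
    """Direct CM (as described in the paper's setup): regress f straight
    onto the ODE solver's endpoint, so low loss = low ODE solving error."""
    return (f(x, t) - solve_to_zero(x, t, n_steps)) ** 2

def consistency_loss(f, x, t, dt=0.05):
    """Regular CM: only ask f to agree with itself one solver step earlier
    (self-consistency); the solver's endpoint is never a direct target."""
    x_prev = ode_step(x, t, -dt)  # one Euler step from t toward t - dt
    return (f(x, t) - f(x_prev, t - dt)) ** 2

# For this toy ODE the exact consistency function is f(x, t) = x * e^t,
# and both losses are near zero for it.
f_exact = lambda x, t: x * math.exp(t)
```

The paper's observation is that driving the direct objective down harder (lower ODE error) does not guarantee better samples; the looser self-consistency signal appears to act as a useful inductive bias.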
-----
💡 Key Insights:
→ Better ODE solving does not necessarily translate to improved sample quality
→ The success of CMs likely depends on factors beyond just ODE solving accuracy
→ The weak supervision in CM training might provide beneficial inductive bias
-----
📊 Results:
→ Direct CMs achieve 0.23-0.25 ODE error vs 0.29-0.30 for regular CMs
→ Regular CMs score better on all image metrics: FID 103.9 vs 158.6 (lower is better), CLIP score 0.21 vs 0.20 (higher is better)