SpotDiffusion enables fast, high-quality panorama generation by shifting non-overlapping denoising windows, eliminating redundant computations in existing methods.
📚 https://arxiv.org/pdf/2407.15507
Original Problem 🔍:
Generating high-resolution panoramas with diffusion models is computationally expensive because existing methods denoise many overlapping windows and average their predictions.
-----
Key Insights from this Paper 💡:
• Overlapping predictions in existing methods are often redundant
• Shifting non-overlapping windows over time corrects seams
• Uniform denoising across the image is achievable without overlaps
-----
Solution in this Paper 💡:
• Introduces SpotDiffusion: Uses non-overlapping denoising windows
• Shifts windows randomly over time to ensure uniform denoising
• Eliminates need for averaging multiple predictions
• Can replace MultiDiffusion in existing methods
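The shifted-window idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `toy_denoise` is a hypothetical stand-in for a real diffusion model step, and the circular wrap-around via `np.roll`, the latent shape, and the window width of 64 are all assumptions chosen for the example.

```python
import numpy as np

def toy_denoise(window, t):
    # Hypothetical stand-in for one denoising step of a diffusion model.
    return window * 0.9

def spot_step(latent, t, window, rng):
    """One denoising step over non-overlapping windows with a random shift."""
    width = latent.shape[-1]
    assert width % window == 0, "width must be a multiple of the window size"
    # Draw a fresh random offset so tile borders land elsewhere at each step.
    shift = int(rng.integers(0, window))
    # Assumption: the panorama wraps around, so a circular roll is valid.
    shifted = np.roll(latent, shift, axis=-1)
    # Each position falls in exactly one tile, so no averaging is needed.
    tiles = [toy_denoise(shifted[..., i:i + window], t)
             for i in range(0, width, window)]
    out = np.concatenate(tiles, axis=-1)
    # Undo the shift so the latent stays aligned across timesteps.
    return np.roll(out, -shift, axis=-1)

rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 64, 512))  # (channels, height, width), assumed
x = latent
for t in range(50, 0, -1):
    x = spot_step(x, t, 64, rng)
```

Because the tiles never overlap, every latent position is denoised exactly once per step, and the changing offset keeps any single border from persisting long enough to form a visible seam.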
-----
Results 📊:
• Matches or exceeds image quality of MultiDiffusion and SyncDiffusion
• 6x faster than MultiDiffusion (stride 16)
• 3x faster than SyncDiffusion
• FID: 3.59 vs 3.21 for MultiDiffusion (lower is better)
• CLIPScore: 31.67 (same as MultiDiffusion)
• ImageReward: 0.76 vs 0.75 (MultiDiffusion)
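The speedups come from running the model on fewer windows per denoising step. A rough count under assumed sizes shows the effect. The latent width of 512 and window width of 64 are illustrative assumptions, `count_windows` is a hypothetical helper, and the resulting ratio is not meant to reproduce the timings reported above, which also depend on batching and panorama size.

```python
def count_windows(width, window, stride):
    # Number of sliding windows of size `window` that fit along `width`
    # when stepping by `stride` (no wrap-around).
    return (width - window) // stride + 1

overlapping = count_windows(512, 64, 16)      # overlapping windows, stride 16
non_overlapping = count_windows(512, 64, 64)  # stride equal to window size
print(overlapping, non_overlapping)
```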