This paper proposes a simplified GAN architecture with improved stability and performance, challenging the notion that GANs are difficult to train.
https://arxiv.org/abs/2501.05441
🤔 Original Problem:
→ GANs are notoriously unstable and prone to mode collapse, leading to a reliance on ad-hoc tricks and outdated architectures.
→ The field has been held back by the belief that GANs are inherently tricky to train and don't scale well.
-----
💡 Solution in this Paper:
→ The authors introduce R3GAN, a new baseline GAN that combines a regularized relativistic loss with modern network architectures.
→ They augment the relativistic pairing GAN loss with zero-centered gradient penalties to improve stability and diversity (sketched after this list).
→ The well-behaved loss allows them to discard ad-hoc tricks and replace outdated backbones with modern architectures.
→ Key architectural choices include proper ResNet design, initialization, resampling, grouped convolution, and no normalization.
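
The sketch below illustrates how the regularized relativistic objective described above could be written in PyTorch. It is a minimal illustration, not the paper's code: the networks `D` and `G`, the penalty weight `gamma`, and the softplus form of the pairing loss are assumptions based on the paper's description of the RpGAN loss with zero-centered R1/R2 penalties.

```python
# Minimal sketch: relativistic pairing GAN loss with zero-centered
# R1 (on real data) and R2 (on generated data) gradient penalties.
# D, G, and gamma are placeholders; values are illustrative assumptions.
import torch
import torch.nn.functional as F

def zero_centered_penalty(D, x, gamma):
    """gamma/2 * E[ ||grad_x D(x)||^2 ], the zero-centered gradient penalty."""
    x = x.detach().requires_grad_(True)
    scores = D(x)
    grads, = torch.autograd.grad(scores.sum(), x, create_graph=True)
    return 0.5 * gamma * grads.flatten(1).pow(2).sum(dim=1).mean()

def d_loss(D, G, real, z, gamma=10.0):
    fake = G(z).detach()
    # Relativistic pairing loss: push D(real) above D(fake) for each pair.
    rel = F.softplus(D(fake) - D(real)).mean()
    # R1 regularizes D on real samples, R2 on generated samples.
    r1 = zero_centered_penalty(D, real, gamma)
    r2 = zero_centered_penalty(D, fake, gamma)
    return rel + r1 + r2

def g_loss(D, G, real, z):
    fake = G(z)
    # Generator tries to reverse the relativistic comparison.
    return F.softplus(D(real) - D(fake)).mean()
```

Because the penalties are zero-centered, the regularized objective stays well-behaved near the data manifold, which is what lets the authors drop the usual stabilization tricks.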
-----
🔑 Key Insights from this Paper:
→ Regularized relativistic GAN loss addresses mode dropping and non-convergence issues
→ Combining R1 and R2 gradient penalties is crucial for stable training
→ Modern ConvNet designs can significantly improve GAN performance (see the sketch after this list)
→ Normalization layers can be harmful in generative models
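
To make the architectural insights concrete, here is a rough sketch of a normalization-free residual block with grouped convolutions, in the spirit of the modern ConvNet design the paper advocates. The channel widths, group count, and fix-up-style zero initialization of the last layer are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of a normalization-free residual block with a grouped 3x3 conv.
# Widths, groups, and init choices are assumptions for illustration only.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, channels, groups=16):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 1)              # 1x1 mix
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1,
                               groups=groups)                      # grouped 3x3
        self.conv3 = nn.Conv2d(channels, channels, 1)              # 1x1 project
        self.act = nn.LeakyReLU(0.2)
        # No BatchNorm/InstanceNorm anywhere; instead, zero-init the last conv
        # so the residual branch starts as identity and depth stays stable.
        nn.init.zeros_(self.conv3.weight)
        nn.init.zeros_(self.conv3.bias)

    def forward(self, x):
        h = self.act(self.conv1(x))
        h = self.act(self.conv2(h))
        return x + self.conv3(h)
```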
-----
📊 Results:
→ R3GAN surpasses StyleGAN2 on FFHQ-256 (FID 2.75 vs 3.78)
→ Achieves full mode coverage on StackedMNIST (1000/1000 modes)
→ Outperforms many SOTA GANs and diffusion models on CIFAR-10 and ImageNet