
"The Pitfalls of Memorization: When Memorization Hurts Generalization"

Podcast on this paper generated with Google's Illuminate.

Teaching neural networks to forget what hurts and remember what matters.

This paper addresses the issue of neural networks learning spurious correlations and memorizing exceptions, leading to poor generalization.

-----

https://arxiv.org/abs/2412.07684

Original Problem 🧠:

Neural networks often learn a simple explanation for most of the data while memorizing the exceptions; when that simple explanation is a spurious correlation, generalization suffers.

-----

Solution in this Paper 💡:

→ The paper proposes Memorization-Aware Training (MAT), a novel approach to mitigate the negative effects of memorization and spurious correlations.

→ MAT uses held-out predictions as a signal of memorization to shift the model's logits during training (see the sketch after this list).

→ This shift encourages the model to learn robust patterns that are invariant across different data distributions.

→ MAT improves generalization under distribution shifts by guiding the learning process towards more meaningful patterns.
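A minimal PyTorch-style sketch of this logit shift, assuming the held-out model's log-probabilities are added to the training logits before the cross-entropy loss (in the spirit of logit-adjusted losses). The function `mat_loss`, the separate held-out model, and the scale `tau` are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of a memorization-aware loss: training logits are shifted by the
# log-probabilities of a model that never saw these examples, so examples the
# held-out model already explains exert less gradient pressure. Illustrative
# only; the exact form of the shift in the paper may differ.
import torch
import torch.nn.functional as F


def mat_loss(logits, heldout_logits, targets, tau=1.0):
    """Cross-entropy on logits shifted by held-out log-probabilities.

    logits:         (batch, classes) outputs of the model being trained
    heldout_logits: (batch, classes) outputs of a held-out model for the
                    same examples (the memorization signal)
    targets:        (batch,) integer class labels
    tau:            scale of the shift (illustrative hyperparameter)
    """
    heldout_logprobs = F.log_softmax(heldout_logits, dim=-1).detach()
    shifted = logits + tau * heldout_logprobs
    return F.cross_entropy(shifted, targets)


if __name__ == "__main__":
    torch.manual_seed(0)
    logits = torch.randn(8, 4, requires_grad=True)
    heldout_logits = torch.randn(8, 4)
    targets = torch.randint(0, 4, (8,))
    loss = mat_loss(logits, heldout_logits, targets)
    loss.backward()
    print(float(loss))
```

At evaluation time the model is used without the shift, so the shift only changes which patterns the training loss rewards.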

-----

Key Insights from this Paper 🔍:

→ Spurious correlations combined with memorization are particularly harmful to generalization.

→ Models can achieve zero training loss by relying on spurious features for most data and memorizing exceptions.

→ Memorization can be beneficial, harmful, or catastrophic depending on the nature of the data and learning dynamics.

→ MAT effectively reduces memorization, especially for minority groups, leading to improved generalization.

-----

Results 📊:

→ Evaluated on 4 datasets: Waterbirds, CelebA, MultiNLI, CivilComments

→ Memorization-Aware Training (MAT) showed improved worst-group accuracy compared to baselines

→ Analysis of memorization scores revealed that MAT reduced memorization, particularly for minority groups (one such score is sketched below)
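A minimal sketch of one way such a score can be computed, assuming it is the gap between the trained model's and a held-out model's probability on the true label. This is a generic held-out-prediction proxy for memorization, not necessarily the exact score analyzed in the paper.

```python
# Illustrative memorization score: how much more confident the trained model
# is on an example than a model that never saw it. A gap near 1 flags an
# example that is likely memorized rather than explained by a shared pattern.
import torch
import torch.nn.functional as F


@torch.no_grad()
def memorization_score(train_model, heldout_model, x, y):
    """Per-example gap in true-label probability between the two models."""
    p_train = F.softmax(train_model(x), dim=-1)
    p_heldout = F.softmax(heldout_model(x), dim=-1)
    idx = torch.arange(y.shape[0])
    return p_train[idx, y] - p_heldout[idx, y]  # in [-1, 1]
```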
