
"Measuring memorization through probabilistic discoverable extraction"

The podcast on this paper is generated with Google's Illuminate.

A probabilistic approach to better measure what LLMs actually remember from training data.

Paper from @GoogleDeepMind

A more realistic way to quantify what secrets LLMs might spill.

📚 https://arxiv.org/abs/2410.19482

🔍 Original Problem:

Current methods to measure memorization in LLMs rely on single-sequence greedy sampling, which underestimates true memorization rates and fails to reflect real-world user interactions where multiple attempts with different sampling strategies are possible.

-----

🛠️ Solution in this Paper:

→ Introduces (n,p)-discoverable extraction - a probabilistic relaxation of discoverable extraction

→ A target sequence is (n,p)-discoverably extractable if it appears among n sampled generations with probability at least p

→ Considers various sampling schemes (top-k, top-p, temperature) instead of just greedy sampling

→ No additional computational cost compared to traditional methods
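The definition above reduces to a simple closed form: if one sampled generation matches the target with probability q, then n independent attempts succeed with probability 1 - (1-q)^n. A minimal sketch (the function name and the example value of q are illustrative, not from the paper):

```python
def np_extractable(q: float, n: int, p: float) -> bool:
    """A target is (n, p)-discoverably extractable if the chance of
    producing it at least once in n independent samples is >= p.
    q is the per-attempt probability that one sampled generation
    matches the target; in practice it is the product of per-token
    probabilities under the chosen sampling scheme."""
    return 1.0 - (1.0 - q) ** n >= p

# A sequence greedy decoding misses can still be extractable:
# with q = 0.05 per attempt, 100 attempts succeed with ~99.4% chance.
print(np_extractable(0.05, 100, 0.9))  # True
```

Because q can be computed directly from the model's token probabilities, this check needs no extra generation passes, which is why the method adds no computational cost over the traditional approach.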

-----

💡 Key Insights:

→ Greedy sampling misses clear cases of memorization where target sequences have high generation likelihood

→ Larger models and repeated training data show higher memorization rates

→ Gap between greedy and probabilistic extraction rates increases with model size

→ Different sampling strategies significantly impact extraction success rates
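The sampling-strategy effect in the last insight can be seen by computing the per-attempt match probability q as the product of per-token probabilities under a temperature-scaled softmax. A toy sketch (the logits and the helper name are made up for illustration; a real run would use model outputs):

```python
import math

def seq_prob(logits_per_step, target_ids, temperature=1.0):
    """Probability that one temperature-sampled generation equals the
    target: the product of per-step softmax probabilities assigned to
    the target tokens."""
    q = 1.0
    for logits, tok in zip(logits_per_step, target_ids):
        scaled = [l / temperature for l in logits]
        m = max(scaled)                      # subtract max for stability
        exps = [math.exp(s - m) for s in scaled]
        q *= exps[tok] / sum(exps)
    return q

# Toy 3-token vocabulary, 2-step target [0, 0]:
logits = [[2.0, 0.5, 0.1], [1.5, 1.0, 0.2]]
print(seq_prob(logits, [0, 0], temperature=0.5) >
      seq_prob(logits, [0, 0], temperature=2.0))  # True
```

Lower temperature concentrates probability mass on the highest-logit tokens, so when the target is the model's preferred continuation, cooler sampling raises q and hence the (n,p) extraction rate.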

-----

📊 Results:

→ Even with a modest n=3 attempts and p=10%, extraction rates exceed the greedy-sampling baseline

→ Extraction rates on training data consistently higher than test data across all parameter settings

→ For the 12B-parameter model, only n=40 sampled sequences are needed to match the greedy extraction rate even at p=90%

→ Gap between greedy and probabilistic rates widens with model size (1B to 12B parameters)
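The n=40 figure follows from inverting the success formula: the smallest n with 1 - (1-q)^n >= p is ceil(log(1-p) / log(1-q)). A sketch (the function name and the per-attempt probability q below are illustrative assumptions, not values reported in the paper):

```python
import math

def min_attempts(q: float, p: float) -> int:
    """Smallest n such that 1 - (1-q)^n >= p: the number of sampled
    generations needed to extract a target whose per-attempt match
    probability is q with overall probability at least p."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - q))

# A per-attempt probability of about 5.6% already reaches p = 90%
# within n = 40 attempts:
print(min_attempts(0.056, 0.9))  # 40
```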
