Teaching small AI to catch big AI's memory leaks.
MemHunter introduces an automated system to detect when LLMs memorize training data, making privacy risk assessment scalable across large datasets.
-----
https://arxiv.org/abs/2412.07261
Original Problem 🔍:
→ Current methods to detect LLM memorization are inefficient, requiring per-sample optimization and manual intervention
→ Existing approaches can't effectively assess privacy risks across large datasets
→ Traditional methods only look for exact matches, missing partial memorization that could still leak sensitive information
-----
Solution in this Paper 🛠️:
→ MemHunter uses a tiny LLM to generate memory-inducing prompts automatically
→ The system scores partial matches with a Longest Common Substring metric rather than requiring exact reproduction (see the scoring sketch after this list)
→ It uses hypothesis testing to verify memorization at dataset scale (see the test sketch below)
→ The framework iteratively refines prompts through rejection sampling and fine-tuning (see the loop sketch below)
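A minimal sketch of what the partial-match scoring could look like. Whitespace tokenization and the score scale (fraction of the training sample reproduced) are illustrative assumptions, not the paper's exact implementation:

```python
# Minimal sketch of Longest Common Substring scoring for partial memorization.
def longest_common_substring(a: list[str], b: list[str]) -> int:
    """Length of the longest contiguous run of tokens shared by a and b."""
    # dp[i][j] = length of the common run ending at a[i-1] and b[j-1]
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
                best = max(best, dp[i][j])
    return best


def memorization_score(model_output: str, training_sample: str) -> float:
    """Fraction of the training sample reproduced as one contiguous span."""
    out_toks, ref_toks = model_output.split(), training_sample.split()
    if not ref_toks:
        return 0.0
    return longest_common_substring(out_toks, ref_toks) / len(ref_toks)


# A partial leak with no verbatim copy of the full record still scores high.
ref = "the patient john doe was admitted on march 3 with acute symptoms"
out = "records show john doe was admitted on march 3 after a referral"
print(memorization_score(out, ref))  # ~0.58 -> flagged as partial memorization
```

An exact-match detector would score this example zero, which is exactly the gap a partial-match metric is meant to close.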
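For dataset-level verification, one simple instantiation is a one-sided test comparing scores on the suspected training set against scores on held-out text the model never saw. The Mann-Whitney U statistic below is an assumed choice, not necessarily the paper's exact procedure:

```python
# Dataset-level verification sketch: are memorization scores on the suspected
# training set significantly higher than on held-out text? The one-sided
# Mann-Whitney U test is an assumption about the exact statistic used.
from scipy.stats import mannwhitneyu


def dataset_memorized(train_scores: list[float],
                      holdout_scores: list[float],
                      alpha: float = 0.05) -> bool:
    """Reject the null (no memorization) if train scores dominate hold-out scores."""
    _, p_value = mannwhitneyu(train_scores, holdout_scores, alternative="greater")
    return p_value < alpha
```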
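And a structural sketch of the refinement loop, reusing memorization_score from the first sketch. The three callables stand in for model interfaces specific to the paper's setup (small-LLM prompt proposal, target-LLM completion, small-LLM fine-tuning), so everything below is an assumption about the loop's shape rather than the actual implementation:

```python
# Structural sketch of prompt refinement via rejection sampling. The callables
# `propose`, `complete`, and `finetune` are hypothetical stand-ins.
def refine_prompts(training_samples, propose, complete, finetune,
                   rounds: int = 3, threshold: float = 0.5):
    accepted = []  # (training sample, prompt that induced it) pairs
    for _ in range(rounds):
        for sample in training_samples:
            for prompt in propose(sample, n=8):          # small LLM proposes candidates
                completion = complete(prompt)            # query the target LLM
                if memorization_score(completion, sample) >= threshold:
                    accepted.append((sample, prompt))    # rejection sampling: keep inducers
        finetune(accepted)                               # refine the small LLM on what worked
    return accepted
```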
-----
Key Insights 💡:
→ Memorization detection should consider partial matches, not just exact copies
→ Dataset-level verification is crucial for real-world privacy assessment
→ Using a smaller LLM for prompt generation makes the process scalable
-----
Results 📊:
→ Extracts 40% more training data than existing methods under time constraints
→ Reduces search time by 80% when used as a plug-in
→ Achieves up to 92% accuracy in memorization detection on Vicuna-7B
→ Successfully differentiates between trained and untrained models with 95% confidence