Ever wondered how to make your LLM forget specific things without starting from scratch? Here's your answer.
LLMEraser is a unified framework for efficient unlearning in LLMs: it selectively removes or corrects specific information by updating only parameter-efficient fine-tuning (PEFT) adapters, preserving overall model performance.
-----
https://arxiv.org/abs/2412.00383
🤔 Original Problem:
LLMs can inadvertently retain sensitive or incorrect information during fine-tuning. Current unlearning methods either require expensive retraining or struggle to remove targeted information precisely without degrading model performance.
-----
🔧 Solution in this Paper:
→ LLMEraser introduces three categories of instance-wise unlearning: Instance Removal, Query Modification, and Response Correction.
→ It leverages influence functions to directly calculate parameter changes in PEFT adapters without retraining.
→ The framework reformulates parameter changes as a finite-sum quadratic programming problem for efficient computation.
→ It updates only the necessary adapter parameters while preserving the original model structure.
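To make the influence-function idea concrete, here is a minimal NumPy sketch on a toy ridge-regression problem (an assumed setup for illustration, not the paper's actual PEFT formulation): removing one training instance is approximated by a single Hessian-inverse-weighted gradient step, which can be checked against exact retraining.

```python
import numpy as np

# Toy illustration (assumed setup, not the paper's code): influence-function
# unlearning on ridge regression, where the closed-form solution lets us
# compare the no-retraining update against actual retraining.
rng = np.random.default_rng(0)
n, d, lam = 50, 3, 1e-3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)

def fit(X, y):
    """Minimize 0.5*||Xw - y||^2 + 0.5*lam*||w||^2 in closed form."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

w_full = fit(X, y)

# "Unlearn" instance k without retraining: one Newton-style correction
#   w_unlearned ≈ w_full + H^{-1} ∇ℓ_k(w_full)
# the classic influence-function estimate of the leave-one-out solution.
k = 7
H = X.T @ X + lam * np.eye(d)            # Hessian of the full training loss
grad_k = X[k] * (X[k] @ w_full - y[k])   # gradient of instance k's loss
w_unlearned = w_full + np.linalg.solve(H, grad_k)

# Ground truth for comparison: actually retrain without instance k.
w_retrained = fit(np.delete(X, k, axis=0), np.delete(y, k))
print(np.linalg.norm(w_unlearned - w_retrained))  # small approximation error
```

LLMEraser applies this style of estimate to PEFT adapter parameters only, which is what keeps the computation tractable at LLM scale.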
-----
💡 Key Insights:
→ Instance-wise unlearning can target specific data points without affecting related concepts
→ Parameter-efficient methods can achieve unlearning without full model retraining
→ Influence functions can effectively estimate parameter changes for selective forgetting
-----
📊 Results:
→ Performance gap of only 0.6% compared to full retraining
→ 5.1% average improvement over corrupted baselines in query modification tasks
→ Successfully handles 40% label noise in response correction tasks