A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models
Why some post-trained LLM parameter edits succeed while others fail.
Riemann sums explain performance differences in parameter-edited LLMs after training
Original Problem 🔍:
Delta parameter editing in post-trained large-scale models has lacked a unified framework for systematically analyzing how different editing operations affect model performance.
Solution in this Paper 🧠:
• Proposes a unified view of delta parameter editing based on a Riemann sum approximation of the loss difference (sketched below)
• Categorizes existing methods into three classes according to their effect on performance: competitive, decreased, and improved
• Analyzes how different editing operations shape the approximation term and, through it, model performance
• Introduces extensions to existing techniques such as DARE and BitDelta
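A minimal sketch of the central quantity, using assumed notation consistent with the bullets above (θ_base: base weights, Δθ: the post-training delta, f: the editing operation applied to the delta):

```latex
% Assumed notation: theta_post = theta_base + Delta theta;
% editing the delta gives theta_edit = theta_base + f(Delta theta).
\[
\Delta \mathcal{L}
  = \mathcal{L}(\theta_{\mathrm{edit}}) - \mathcal{L}(\theta_{\mathrm{post}})
  = \int_0^1 \nabla \mathcal{L}(\theta_{\mathrm{post}} + t\,\delta)^{\top} \delta \,\mathrm{d}t,
\qquad \delta := f(\Delta\theta) - \Delta\theta,
\]
\[
\Delta \mathcal{L}
  \approx \frac{1}{K} \sum_{k=1}^{K} \nabla \mathcal{L}(\theta_{\mathrm{post}} + t_k\,\delta)^{\top} \delta
  \;\approx\; \nabla \mathcal{L}(\theta_{\mathrm{post}})^{\top} \bigl( f(\Delta\theta) - \Delta\theta \bigr).
\]
```

The sign of this approximation term yields the three classes above: near zero gives competitive performance (DARE), positive gives decreased performance (BitDelta, Twin-Merging), negative gives improved performance (EXPO).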
Key Insights from this Paper 💡:
• Changes in model performance after editing delta parameters can be analyzed through loss differences approximated by Riemann sums
• The overall distribution of delta parameter changes matters more than any individual parameter when analyzing post-training
• Extrapolation itself is not the key to EXPO-like methods; the sign of the approximation term dictates whether extrapolation or interpolation is appropriate (see the sketch after this list)
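A minimal sketch of the interpolation/extrapolation edit this insight refers to; the helper `scale_delta` and the toy state dicts are illustrative, not the paper's code:

```python
import torch

def scale_delta(base, post, alpha):
    """Uniformly rescale the delta parameters:
    theta_edit = theta_base + alpha * (theta_post - theta_base).
    alpha > 1 extrapolates beyond the post-trained model (EXPO-style);
    0 < alpha < 1 interpolates between base and post-trained weights.
    """
    return {name: base[name] + alpha * (post[name] - base[name]) for name in post}

# Toy illustration; in practice base/post would be model.state_dict() mappings.
base = {"w": torch.zeros(4)}
post = {"w": torch.ones(4)}
print(scale_delta(base, post, 1.5)["w"])  # tensor([1.5000, 1.5000, 1.5000, 1.5000])
```

Whether pushing alpha above or below 1 helps depends on the sign of the approximation term on the data of interest, which is exactly the point of the insight above.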
Results 📊:
• Extensive experiments on ViT, LLaMA 3, Qwen 2, and Mistral across multiple tasks support the theoretical analysis
• DARE maintains competitive performance by keeping the approximation term close to zero (see the sketch after this list)
• BitDelta and Twin-Merging show decreased performance due to a positive approximation term
• EXPO improves performance by producing negative loss changes on alignment data
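A minimal sketch contrasting the two edits named above, under simplifying assumptions (a per-tensor BitDelta scale, without the scale distillation used in the original method):

```python
import torch

def dare(delta: torch.Tensor, p: float) -> torch.Tensor:
    """DARE-style edit: drop each delta entry with probability p and
    rescale the survivors by 1/(1-p), so E[edited delta] = delta.
    This keeps the first-order approximation term near zero in expectation."""
    keep_mask = torch.bernoulli(torch.full_like(delta, 1.0 - p))
    return delta * keep_mask / (1.0 - p)

def bitdelta(delta: torch.Tensor) -> torch.Tensor:
    """BitDelta-style edit: 1-bit sign quantization with a per-tensor
    scale (mean absolute value here). The quantization error generally
    pushes the approximation term positive, degrading performance."""
    scale = delta.abs().mean()
    return scale * delta.sign()

delta = torch.randn(1000)
print(dare(delta, 0.9).mean(), delta.mean())  # unbiased: close in expectation
print(bitdelta(delta)[:5])                    # only sign and scale survive
```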
A unified framework for delta parameter editing in post-trained models deepens understanding of existing editing methods and broadens their applicability.