A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models
Why some post-trained LLM parameter edits succeed while others fail.
Riemann sums explain performance differences in parameter-edited LLMs after training
Original Problem 🔍:
Delta parameter editing in post-trained large-scale models has lacked a unified framework for systematically analyzing how different editing operations affect model performance.
Solution in this Paper 🧠:
• Proposes a unified view of delta parameter editing based on a Riemann sum approximation of the loss difference (sketched below)
• Categorizes existing methods into three classes according to their effect on performance: competitive, decreased, and improved
• Analyzes how different editing operations shape the approximation term and, through it, model performance
• Introduces extensions to existing techniques such as DARE and BitDelta
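A minimal sketch of the central quantity, using assumed notation consistent with the bullets above (θ_base: base weights, Δθ: the post-training delta, f: the editing operation applied to the delta):

```latex
% Assumed notation: theta_post = theta_base + Delta theta;
% editing the delta gives theta_edit = theta_base + f(Delta theta).
\[
\Delta \mathcal{L}
  = \mathcal{L}(\theta_{\mathrm{edit}}) - \mathcal{L}(\theta_{\mathrm{post}})
  = \int_0^1 \nabla \mathcal{L}(\theta_{\mathrm{post}} + t\,\delta)^{\top} \delta \,\mathrm{d}t,
\qquad \delta := f(\Delta\theta) - \Delta\theta,
\]
\[
\Delta \mathcal{L}
  \approx \frac{1}{K} \sum_{k=1}^{K} \nabla \mathcal{L}(\theta_{\mathrm{post}} + t_k\,\delta)^{\top} \delta
  \;\approx\; \nabla \mathcal{L}(\theta_{\mathrm{post}})^{\top} \bigl( f(\Delta\theta) - \Delta\theta \bigr).
\]
```

The sign of this approximation term yields the three classes above: near zero gives competitive performance (DARE), positive gives decreased performance (BitDelta, Twin-Merging), negative gives improved performance (EXPO).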
Key Insights from this Paper 💡:
• Changes in model performance after editing delta parameters can be analyzed through loss differences approximated by Riemann sums
• The overall distribution of delta parameter changes matters more than any individual parameter when analyzing post-training
• Extrapolation itself is not the key to EXPO-like methods; the sign of the approximation term dictates whether extrapolation or interpolation is appropriate (see the sketch after this list)
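A minimal sketch of the interpolation/extrapolation edit this insight refers to; the helper `scale_delta` and the toy state dicts are illustrative, not the paper's code:

```python
import torch

def scale_delta(base, post, alpha):
    """Uniformly rescale the delta parameters:
    theta_edit = theta_base + alpha * (theta_post - theta_base).
    alpha > 1 extrapolates beyond the post-trained model (EXPO-style);
    0 < alpha < 1 interpolates between base and post-trained weights.
    """
    return {name: base[name] + alpha * (post[name] - base[name]) for name in post}

# Toy illustration; in practice base/post would be model.state_dict() mappings.
base = {"w": torch.zeros(4)}
post = {"w": torch.ones(4)}
print(scale_delta(base, post, 1.5)["w"])  # tensor([1.5000, 1.5000, 1.5000, 1.5000])
```

Whether pushing alpha above or below 1 helps depends on the sign of the approximation term on the data of interest, which is exactly the point of the insight above.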
Results 📊:
• Extensive experiments on ViT, LLaMA 3, Qwen 2, and Mistral across multiple tasks support the theoretical analysis
• DARE maintains competitive performance by keeping the approximation term close to zero (see the sketch after this list)
• BitDelta and Twin-Merging show decreased performance due to a positive approximation term
• EXPO improves performance by producing negative loss changes on alignment data
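A minimal sketch contrasting the two edits named above, under simplifying assumptions (a per-tensor BitDelta scale, without the scale distillation used in the original method):

```python
import torch

def dare(delta: torch.Tensor, p: float) -> torch.Tensor:
    """DARE-style edit: drop each delta entry with probability p and
    rescale the survivors by 1/(1-p), so E[edited delta] = delta.
    This keeps the first-order approximation term near zero in expectation."""
    keep_mask = torch.bernoulli(torch.full_like(delta, 1.0 - p))
    return delta * keep_mask / (1.0 - p)

def bitdelta(delta: torch.Tensor) -> torch.Tensor:
    """BitDelta-style edit: 1-bit sign quantization with a per-tensor
    scale (mean absolute value here). The quantization error generally
    pushes the approximation term positive, degrading performance."""
    scale = delta.abs().mean()
    return scale * delta.sign()

delta = torch.randn(1000)
print(dare(delta, 0.9).mean(), delta.mean())  # unbiased: close in expectation
print(bitdelta(delta)[:5])                    # only sign and scale survive
```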
A unified framework for delta parameter editing in post-trained models deepens understanding of existing editing methods and broadens their applicability.