LiNeS: Post-Training Layer Scaling Prevents…
Fine-tuning often leads to catastrophic forgetting, where the model loses its ability to generalize across other tasks.