Freeze bottom LLM layers to prevent spurious forgetting in continual learning.
LLMs in continual learning suffer from performance declines on previous tasks when learning new ones. This isn’t true catastrophic forgetting, but rather a loss of task alignment. This paper proposes freezing bottom LLM layers during continual learning to maintain alignment and mitigate “spurious forgetting.”
-----
Paper - https://arxiv.org/abs/2501.13453
Original Problem 😟:
→ LLMs exhibit performance drops on prior tasks during continual learning.
→ This seems like catastrophic forgetting, but prior knowledge can be easily recovered.
-----
Solution in this Paper 💡:
→ This paper introduces the concept of “spurious forgetting”: in continual learning, the model loses task alignment rather than the underlying knowledge.
→ The paper proposes Freeze, a strategy that freezes the bottom n layers of the model.
→ This blocks updates to those layers when learning new tasks, preserving the alignment established on earlier tasks (a minimal sketch follows this list).
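Here is a minimal PyTorch sketch of the freezing step, assuming a Hugging Face LLaMA-style causal LM whose decoder blocks are exposed as model.model.layers; the model name, layer count, and optimizer settings are illustrative, not the paper's exact configuration:

```python
# Minimal sketch of the Freeze strategy: keep the bottom n transformer
# blocks fixed while fine-tuning on each new task, so gradient updates on
# new tasks cannot overwrite what the lower layers encode.
import torch
from transformers import AutoModelForCausalLM

def freeze_bottom_layers(model, n_frozen: int = 3):
    """Disable gradients for the first n_frozen decoder blocks (the bottom layers)."""
    for layer in model.model.layers[:n_frozen]:  # LLaMA-style attribute path
        for param in layer.parameters():
            param.requires_grad = False
    return model

# Illustrative checkpoint; any decoder-only LM with a .layers ModuleList works the same way.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = freeze_bottom_layers(model, n_frozen=3)

# Hand only the still-trainable parameters to the optimizer for the new task.
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable_params, lr=1e-5)
```

How many layers to freeze is a tunable trade-off between stability and plasticity; the results below report settings such as 1, 3, or 6 frozen layers depending on the benchmark.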
-----
Key Insights from this Paper 🤔:
→ Task alignment is more crucial than knowledge retention in continual learning.
→ Early optimization steps on new tasks can disrupt previously learned alignments.
→ Bottom LLM layers play a crucial role in task alignment.
-----
Results 💪:
→ Freeze improves task accuracy in sequential fine-tuning from 11% to 44% on a synthetic dataset.
→ Freeze improves continual learning performance across safety alignment, continual instruction tuning, continual knowledge editing, and instance incremental learning.
→ In safety alignment, the jailbreak rate drops from 99.80% to 79.61% with the bottom 3 layers frozen, and to 1.15% with the bottom 6 layers frozen.
→ In continual instruction tuning, the average test score across 8 tasks improves from 47.38 to 50.33 when the bottom 3 layers are frozen after the first task.
→ In continual knowledge editing, efficacy improves from 62.47 to 70.88 when the bottom layer is frozen after the first task.