LLM memory is like Schrödinger's cat: observable only when queried.
LLMs exhibit memory-like capabilities, similar to human cognition, not by retrieving stored records but by dynamically approximating outputs from inputs.
📚 https://arxiv.org/pdf/2409.10482
Original Problem 🤔:
Understanding the memory mechanism in LLMs is challenging. Existing research lacks a theoretical framework to explain how these models exhibit memory-like behavior or how that behavior differs from human memory.
-----
Solution in this Paper 🔧:
- Utilizes the Universal Approximation Theorem (UAT) to explain LLM memory.
- Proposes that LLMs' memory operates like "Schrödinger's memory," observable only when queried.
- Compares LLM memory with human memory, suggesting both rely on dynamic fitting of outputs based on inputs.
- Conducts experiments using CN Poems and ENG Poems datasets to assess memory capabilities.
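The UAT framing behind these bullets can be made concrete. Classically, the theorem says a one-hidden-layer network with nonlinearity σ can approximate any continuous function f on a compact set to arbitrary precision; this is the standard statement, while the "dynamic" reading below is the paper's interpretation:

```latex
% Classical Universal Approximation Theorem: for any continuous f on a
% compact set K and any eps > 0, there exist N, alpha_i, w_i, b_i with
\[
  \sup_{x \in K}\; \Bigl|\, f(x) - \sum_{i=1}^{N} \alpha_i\,
    \sigma\!\bigl(w_i^{\top} x + b_i\bigr) \Bigr| < \varepsilon .
\]
% The paper's reading: in a Transformer the effective parameters are
% input-dependent, so the model fits a different mapping per query --
% "memory" becomes visible only at observation time.
```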
-----
Key Insights from this Paper 💡:
- LLMs can dynamically approximate functions, exhibiting a form of memory.
- Memory in LLMs is not static but inferred from input cues.
- Longer texts challenge LLMs' memory capacity.
- Human and LLM memories share dynamic response mechanisms.
-----
Results 📊:
- Qwen2-1.5B-Instruct achieved 96.9% accuracy on CN Poems.
- Bloom-1b4-zh reached 96.6% accuracy on CN Poems.
- Nearly all models achieved 99.9% accuracy on ENG Poems.
- Memory performance decreases with longer text outputs.
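As a rough illustration of how such recall numbers could be scored, here is a minimal sketch that compares a model's generated text against the reference poem. The character-level similarity metric (difflib's ratio) and the toy strings are assumptions for illustration, not the paper's exact protocol.

```python
# Sketch of scoring verbatim recall: compare generated continuations
# against reference poem texts. The difflib similarity metric here is an
# assumption -- the paper reports accuracy without binding us to this formula.
from difflib import SequenceMatcher

def recall_accuracy(generated: str, reference: str) -> float:
    """Return a 0-1 similarity score between generated and reference text."""
    return SequenceMatcher(None, generated.strip(), reference.strip()).ratio()

def dataset_accuracy(pairs):
    """Average recall accuracy over (generated, reference) pairs."""
    return sum(recall_accuracy(g, r) for g, r in pairs) / len(pairs)

# Toy usage with hypothetical model outputs:
pairs = [
    ("Shall I compare thee to a summer's day?",
     "Shall I compare thee to a summer's day?"),  # perfect recall
    ("Shall I compare thee to a winter's day?",
     "Shall I compare thee to a summer's day?"),  # near miss
]
print(round(dataset_accuracy(pairs), 3))
```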