"Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics"

The podcast on this paper was generated with Google's Illuminate.

Your LLM isn't doing math; it's using clever pattern-matching tricks.

This paper proposes that LLMs perform arithmetic using neither robust algorithms nor pure memorization; instead, they rely on a “bag of heuristics”.

https://arxiv.org/abs/2410.21272

🤔 Original Problem:

Do LLMs solve reasoning tasks using actual algorithms, or do they just memorize training data? The answer shapes our understanding of how these models learn and generalize.

-----

🛠️ Solution in this Paper:

Using arithmetic as a test case, the researchers analyzed individual neurons in LLMs to understand the mechanism behind their answers. They identified a circuit of ~200 neurons per layer (about 1.5% of the layer) that handles arithmetic operations. Each neuron implements a simple pattern-matching rule, a "heuristic", such as firing when an operand falls within a certain range or matches a specific pattern.
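
To make "heuristic" concrete: one such rule might be "fire when the first operand lies in a certain range". Below is a minimal sketch, entirely my own construction rather than the paper's code, of how one could test a neuron for such a range heuristic given its activations over a grid of two-operand prompts; the 0-99 operand grid and both thresholds are illustrative assumptions.

```python
import numpy as np

def matches_range_heuristic(activations: np.ndarray, fire_thresh: float = 0.5,
                            contiguity: float = 0.9) -> bool:
    """Toy test for a 'first-operand range' heuristic.

    activations[i, j] holds the neuron's activation on the prompt "i + j =".
    Returns True if the first operands that make the neuron fire form a
    (roughly) contiguous range, e.g. 50 <= i <= 80.
    """
    fires = activations.max(axis=1) > fire_thresh  # does operand i ever fire it?
    idx = np.flatnonzero(fires)
    if idx.size == 0:
        return False
    span = idx[-1] - idx[0] + 1          # width of the firing band
    return idx.size / span >= contiguity  # band is nearly gap-free

# A synthetic neuron that fires whenever the first operand is in [50, 80]:
acts = np.zeros((100, 100))
acts[50:81, :] = 1.0
print(matches_range_heuristic(acts))  # True
```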

-----

💡 Key Insights:

→ LLMs don't use true mathematical algorithms or pure memorization

→ They combine multiple simple pattern-matching rules (heuristics) to solve arithmetic (see the sketch after this list)

→ Different neurons activate for different numerical patterns and boost the corresponding answer tokens

→ This explains both their capabilities and limitations in arithmetic reasoning

→ The mechanism appears early in training and remains consistent throughout
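
Here is a toy illustration of that combination step: three hand-written rules, none of which computes the sum on its own, jointly single out the correct answer token by adding to its logit. The rules and boost sizes are invented for this sketch; in the paper the heuristics are learned by MLP neurons, not written by hand.

```python
def bag_of_heuristics(a: int, b: int, vocab_size: int = 200) -> int:
    """Pick an answer token for 'a + b =' by summing weak heuristic boosts."""
    logits = [0.0] * vocab_size

    # Heuristic 1: last-digit rule -- boost every token whose units digit
    # matches the units digit implied by the operands' units digits.
    units = (a % 10 + b % 10) % 10
    for t in range(vocab_size):
        if t % 10 == units:
            logits[t] += 1.0

    # Heuristic 2: magnitude rule -- boost a coarse 20-wide band implied
    # by the operands' tens digits.
    base = 10 * (a // 10 + b // 10)
    for t in range(base, min(base + 20, vocab_size)):
        logits[t] += 1.0

    # Heuristic 3: carry rule -- if the units digits overflow, boost the
    # upper half of that band.
    if a % 10 + b % 10 >= 10:
        for t in range(base + 10, min(base + 20, vocab_size)):
            logits[t] += 1.0

    return max(range(vocab_size), key=lambda t: logits[t])

print(bag_of_heuristics(47, 28))  # 75 -- no single rule computed the sum
```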

-----

📊 Results:

→ The identified circuit achieves 96% faithfulness in reproducing the full model's behavior (a toy version of this measurement is sketched after this list)

→ Only 200 neurons per layer (1.5%) are needed for accurate arithmetic

→ 91% of top neurons implement identifiable heuristic patterns

→ Each heuristic causes a 29% accuracy drop when ablated
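
One simple way to operationalize faithfulness: mean-ablate every neuron outside the candidate circuit and count how often the model's answer is unchanged. The sketch below does this on a stand-in MLP with a forward hook; the tiny model, random "prompts", and answer-agreement metric are my assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins for a real LLM layer and a real prompt set.
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))
prompts = torch.randn(256, 16)

circuit = torch.zeros(64, dtype=torch.bool)
circuit[:8] = True  # pretend these neurons were identified as the circuit

with torch.no_grad():
    full_preds = model(prompts).argmax(dim=-1)          # full-model answers
    mean_act = model[1](model[0](prompts)).mean(dim=0)  # per-neuron mean activation

def mean_ablate(module, inputs, output):
    # Replace every neuron outside the circuit with its mean activation.
    out = output.clone()
    out[:, ~circuit] = mean_act[~circuit]
    return out

handle = model[1].register_forward_hook(mean_ablate)
with torch.no_grad():
    ablated_preds = model(prompts).argmax(dim=-1)       # circuit-only answers
handle.remove()

faithfulness = (ablated_preds == full_preds).float().mean().item()
print(f"faithfulness: {faithfulness:.0%}")  # fraction of unchanged answers
```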
