
"Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics"

The podcast for this paper was generated with Google's Illuminate.

Your LLM isn't doing math; it's using clever pattern-matching tricks.

LLMs perform arithmetic using neither robust algorithms nor memorization; rather, as this paper proposes, they rely on a "bag of heuristics".

https://arxiv.org/abs/2410.21272

πŸ€” Original Problem:

Do LLMs solve reasoning tasks with actual algorithms, or do they just memorize training data? This fundamental question bears on how these models truly learn and generalize.

-----

πŸ› οΈ Solution in this Paper:

Using arithmetic as a test case, the researchers analyzed individual neurons in LLMs to understand the mechanism behind their reasoning. They identified a circuit of ~200 neurons per layer (about 1.5%) that handles arithmetic operations. Each of these neurons implements a simple pattern-matching rule, a "heuristic", such as recognizing when operands fall within certain ranges or follow specific patterns.
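To make this concrete, here is a minimal Python sketch of the bag-of-heuristics idea (our illustration, not the paper's code; the ranges, digit rule, and weights are all hypothetical): each simulated neuron fires on a simple operand pattern and adds weight to the answer tokens it associates with that pattern.

```python
from collections import defaultdict

def make_range_neuron(lo, hi):
    """Fires when the first operand lies in [lo, hi); boosts the token a+b."""
    def neuron(a, b, logits):
        if lo <= a < hi:
            logits[a + b] += 1.0
    return neuron

def unit_digit_neuron(a, b, logits):
    """Weakly boosts every answer whose unit digit matches (a + b) mod 10."""
    target = (a + b) % 10
    for answer in range(200):
        if answer % 10 == target:
            logits[answer] += 0.1

# Hypothetical bag: range detectors and a digit rule vote together.
neurons = [make_range_neuron(0, 50), make_range_neuron(50, 100), unit_digit_neuron]

def predict(a, b):
    """Sum the boosts from every firing heuristic and pick the top token."""
    logits = defaultdict(float)
    for neuron in neurons:
        neuron(a, b, logits)
    return max(logits, key=logits.get)

print(predict(23, 45))  # 68: several weak rules agree on the correct sum
```

The design point mirrors the paper's claim: no single rule computes addition; the answer emerges from many weak, overlapping pattern detectors voting together.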

-----

πŸ’‘ Key Insights:

→ LLMs don't use true mathematical algorithms or pure memorization

→ They combine multiple simple pattern-matching rules (heuristics) to solve arithmetic

→ Different neurons activate for different numerical patterns and boost the corresponding answer tokens

→ This explains both their capabilities and their limitations in arithmetic reasoning (see the sketch after this list)

→ The mechanism appears early in training and remains consistent throughout
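A toy follow-up to the earlier sketch (same caveat: hypothetical ranges, not the paper's code) shows why this mechanism explains both capability and failure: operands covered by a learned pattern get answered correctly, while operands outside every pattern leave no heuristic to fire.

```python
def heuristic_answer(a, b, covered_ranges=((0, 50), (50, 100))):
    """Answer a+b only when some range heuristic covers the first operand."""
    for lo, hi in covered_ranges:
        if lo <= a < hi:
            return a + b   # a range-pattern neuron fires and boosts a+b
    return None            # no heuristic fires: the model has nothing to lean on

print(heuristic_answer(23, 45))  # 68   -> covered by a heuristic, correct
print(heuristic_answer(150, 7))  # None -> outside every learned pattern, fails
```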

-----

πŸ“Š Results:

→ The identified circuit achieves 96% faithfulness in reproducing the full model's behavior

→ Only ~200 neurons per layer (1.5%) are needed for accurate arithmetic

→ 91% of the top neurons implement identifiable heuristic patterns

→ Ablating the neurons behind each heuristic causes a 29% drop in accuracy
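For intuition about how a faithfulness number like this is measured, here is a hedged sketch on synthetic data (the paper works with real model activations; the sizes, the linear "readout", and the binary answer here are our simplifications): mean-ablate every neuron outside the candidate circuit and count how often the pruned computation still matches the full one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: activations per (prompt, neuron) and a linear "readout"
# that turns activations into a toy binary answer. All sizes are hypothetical.
n_prompts, n_neurons = 500, 1000
activations = rng.normal(size=(n_prompts, n_neurons))
readout = rng.normal(size=n_neurons)

# Candidate circuit: the top 1.5% of neurons by |readout weight|.
circuit = np.argsort(np.abs(readout))[-15:]
in_circuit = np.zeros(n_neurons, dtype=bool)
in_circuit[circuit] = True

# Mean ablation: neurons outside the circuit are frozen at their dataset mean.
mean_act = activations.mean(axis=0)
ablated = np.where(in_circuit, activations, mean_act)

# Faithfulness: fraction of prompts where the ablated computation still
# produces the same (toy) answer as the full computation.
full_answer = (activations @ readout) > 0
circuit_answer = (ablated @ readout) > 0
print(f"faithfulness: {(full_answer == circuit_answer).mean():.0%}")
```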
