"LLMs Do Not Think Step-by-step In Implicit Reasoning"

The podcast on this paper is generated with Google's Illuminate.

This paper adds to the evidence that LLMs behave like pattern matchers rather than step-by-step reasoners when they answer math problems directly.

The paper reveals that LLMs don't actually perform step-by-step reasoning when generating direct answers without showing their work. In probing experiments on arithmetic problems, the researchers found that models skip intermediate calculations and rely on pattern matching, which challenges assumptions about implicit reasoning capabilities.

-----

https://arxiv.org/abs/2411.15862

🤔 Original Problem:

Chain-of-Thought (CoT) prompting improves reasoning but slows down inference, so many researchers assumed LLMs could reason implicitly without showing steps. However, implicit reasoning consistently underperforms explicit CoT, raising questions about whether LLMs truly reason step-by-step internally.

-----

🔬 Solution in this Paper:

→ Researchers used Qwen2.5-72B to solve multi-step arithmetic problems without showing work.

→ They probed the model's hidden states using linear classifiers to detect intermediate calculation steps.

→ The team tested problem variations by reversing equation order and using decimal values to assess reasoning stability.
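The probing idea above can be sketched on synthetic data. This is a minimal illustration, not the paper's setup. The "hidden states" are random vectors, and a nearest-class-mean classifier (which is linear in its input) stands in for a trained linear probe. All function names, dimensions, and noise levels are assumptions made for the example.

```python
import random

def make_states(n, dim, num_classes, encoded, rng):
    """Synthetic stand-ins for hidden states, each labeled with an 'intermediate result'."""
    states, labels = [], []
    for _ in range(n):
        label = rng.randrange(num_classes)
        vec = [rng.gauss(0.0, 0.3) for _ in range(dim)]
        if encoded:
            vec[label] += 3.0  # make the label linearly readable from one coordinate
        states.append(vec)
        labels.append(label)
    return states, labels

def fit_probe(states, labels, num_classes):
    """Fit a nearest-class-mean classifier (assumed stand-in for a linear probe)."""
    dim = len(states[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for vec, label in zip(states, labels):
        counts[label] += 1
        for i, x in enumerate(vec):
            sums[label][i] += x
    means = [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]
    # score_c(x) = w_c . x + b_c, with w_c = mean_c and b_c = -|mean_c|^2 / 2
    biases = [-0.5 * sum(m * m for m in mean) for mean in means]
    return means, biases

def probe_accuracy(probe, states, labels):
    """Fraction of held-out states whose label the probe recovers."""
    weights, biases = probe
    correct = 0
    for vec, label in zip(states, labels):
        scores = [sum(w * x for w, x in zip(wc, vec)) + b
                  for wc, b in zip(weights, biases)]
        correct += scores.index(max(scores)) == label
    return correct / len(labels)

def run_probe(encoded, seed=0):
    """Train on one synthetic split and report accuracy on another."""
    rng = random.Random(seed)
    train = make_states(400, 16, 4, encoded, rng)
    test = make_states(200, 16, 4, encoded, rng)
    return probe_accuracy(fit_probe(*train, 4), *test)

if __name__ == "__main__":
    print("label encoded in states:", run_probe(True))
    print("label absent from states:", run_probe(False))
```

When the label is encoded, the probe should recover it almost perfectly. When it is absent, accuracy should sit near chance (about 0.25 for four classes). That contrast is how a probe is read: high accuracy suggests the intermediate result is present in the hidden state, and near-chance accuracy suggests it is not.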

-----

💡 Key Insights:

→ LLMs rarely compute intermediate results during implicit reasoning, despite often reaching correct final answers

→ Only first and final steps could be detected in hidden states, with middle steps largely absent

→ Performance drops significantly when problems are slightly modified, showing implicit reasoning's instability

→ Two-step reasoning might be possible implicitly, but longer chains fail without explicit steps
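The "slightly modified" problems above can be illustrated with a small generator for chained arithmetic problems and their reversed-order variant. This is a hedged sketch: the prompt wording, the variable naming, and the operand ranges are assumptions, not the paper's exact format, and the decimal variant is left out.

```python
import random

def make_chain(num_steps, seed=0):
    """Return (equations, intermediate_values) for a chained arithmetic problem."""
    rng = random.Random(seed)
    # Name variables so the last step defines "A", e.g. E, D, C, B, A for 5 steps.
    names = [chr(ord("A") + i) for i in reversed(range(num_steps))]
    start = rng.randint(1, 9)
    operand_text, running = str(start), start
    equations, values = [], []
    for name in names:
        op = rng.choice("+-")
        k = rng.randint(1, 9)
        running = running + k if op == "+" else running - k
        equations.append(f"{name} = {operand_text} {op} {k}")
        values.append(running)  # ground-truth result of this step
        operand_text = name     # the next step builds on this variable
    return equations, values

def render(equations, reverse=False):
    """Format the problem; reverse=True lists the same equations in reverse order."""
    ordered = equations[::-1] if reverse else equations
    return "; ".join(ordered) + ". What is the value of A? Answer directly."

if __name__ == "__main__":
    eqs, vals = make_chain(5, seed=1)
    print(render(eqs))
    print(render(eqs, reverse=True))
    print("intermediate values:", vals)
```

Both renderings describe the same computation with the same answer. Only the order in which the premises appear changes, which is why a large accuracy drop on the reversed form points to instability rather than genuine step-by-step reasoning.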

-----

📊 Results:

→ Implicit reasoning accuracy: 85% for 3-step and 54% for 5-step problems

→ Performance dropped to 13.71% for reversed 5-step problems

→ Explicit CoT maintained 100% accuracy across all variations
