"LLMs Think Too Fast To Explore Effectively"
The podcast below on this paper was generated with Google's Illuminate.
https://arxiv.org/abs/2501.18009
The paper asks whether LLMs can explore effectively in open-ended tasks the way humans do. Current LLMs excel in many areas, but their ability to discover new information through exploration remains under-examined.
This paper evaluates LLMs' exploration capabilities using the game "Little Alchemy 2", analyzing whether LLMs balance uncertainty and empowerment as humans do or rely on different strategies.
-----
📌 Sparse Autoencoders (SAEs) effectively dissect LLM internal representations. They reveal a critical flaw: LLMs process uncertainty in early layers, committing to decisions before empowerment has been considered.
📌 The transformer's fixed layer order is the bottleneck. Early layers capture immediate input signals, while the later layers needed for the more complex notion of 'empowerment' come too late to shape exploration.
📌 The game "Little Alchemy 2" is a novel, insightful benchmark: it exposes LLMs' limitations in open-ended exploration, a capability that standard benchmarks overlook.
----------
Methods Explored in this Paper 🔧:
→ The study used "Little Alchemy 2" as an open-ended exploration task. In this game, players combine elements to discover new elements.
→ Four LLMs were tested: gpt-4o, o1, LLaMA3.1-8B, and LLaMA3.1-70B. These models represent different sizes and architectures.
→ Human gameplay data from 29,493 participants served as a benchmark. Human performance was measured by the number of new elements discovered.
→ LLMs were prompted with the game rules, their current inventory, and their attempt history. Their actions were constrained to valid game combinations (a prompting sketch follows this list).
→ The researchers analyzed two exploration strategies: uncertainty-driven exploration, which prioritizes less-used elements, and empowerment, which favors combinations leading to more future discoveries.
→ Regression models quantified the influence of uncertainty and empowerment on the element choices of both humans and LLMs (see the regression sketch after this list).
→ Sparse Autoencoders (SAEs) were used to analyze the latent representations of uncertainty and empowerment within LLaMA3.1-70B, locating where these concepts are processed across the model's layers (an SAE sketch also follows below).
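To make the setup concrete, here is a minimal Python sketch of the prompting loop. The prompt wording, the 20-attempt history window, and the helper names (build_prompt, valid_combinations) are hypothetical illustrations, not the paper's exact template; only the constraint to pairs drawn from the current inventory comes from the paper.

```python
# Hypothetical prompting sketch: the template is illustrative,
# not the paper's exact wording.
from itertools import combinations_with_replacement

def build_prompt(inventory, history):
    rules = ("You are playing Little Alchemy 2. Combine two elements "
             "from your inventory to try to create a new element.")
    inv = "Inventory: " + ", ".join(sorted(inventory))
    past = "\n".join(f"{a} + {b} -> {r}" for a, b, r in history[-20:])
    return f"{rules}\n{inv}\nRecent attempts:\n{past}\nNext combination:"

def valid_combinations(inventory):
    # Actions are constrained to pairs of currently held elements.
    return list(combinations_with_replacement(sorted(inventory), 2))

print(build_prompt({"water", "fire"}, [("water", "fire", "steam")]))
```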
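The two strategy signals and the choice regression can be sketched as follows. The feature definitions here (an inverse use-count proxy for uncertainty, a count of reachable results for empowerment) and the plain logistic regression are simplified stand-ins for the paper's formulations, not its exact method.

```python
# Illustrative stand-ins for the paper's uncertainty/empowerment features.
import numpy as np
from sklearn.linear_model import LogisticRegression

def uncertainty(use_counts):
    # Less-used elements carry more uncertainty; 1/(1 + n) is a simple proxy.
    return {e: 1.0 / (1.0 + n) for e, n in use_counts.items()}

def empowerment(element, recipe_graph):
    # Proxy: how many distinct results this element can still unlock.
    return len(recipe_graph.get(element, ()))

# Toy choice data: each row holds [uncertainty, empowerment] for one
# candidate combination; y = 1 if the player actually picked it.
X = np.array([[0.50, 3.0], [0.25, 7.0], [0.10, 1.0], [0.33, 5.0]])
y = np.array([0, 1, 0, 1])
w = LogisticRegression().fit(X, y).coef_[0]
print({"uncertainty_weight": w[0], "empowerment_weight": w[1]})
```

A near-zero fitted empowerment weight, as the paper reports for most LLMs, would mean the model's choices are explained almost entirely by the uncertainty feature.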
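Finally, a minimal sparse-autoencoder sketch, assuming the common linear encoder/decoder design with an L1 sparsity penalty; the dimensions and the l1_coef value are toy choices, not the paper's configuration.

```python
# Minimal SAE sketch (assumed architecture, toy dimensions).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model, d_latent, l1_coef=1e-3):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_latent)
        self.decoder = nn.Linear(d_latent, d_model)
        self.l1_coef = l1_coef

    def forward(self, x):
        z = torch.relu(self.encoder(x))    # sparse latent code
        return self.decoder(z), z

    def loss(self, x):
        x_hat, z = self(x)
        recon = ((x - x_hat) ** 2).mean()  # reconstruction error
        return recon + self.l1_coef * z.abs().mean()  # L1 keeps latents sparse

# One SAE is trained per transformer block on that block's activations;
# each latent is then correlated with per-trial uncertainty/empowerment
# values to find the depth at which each concept is most strongly encoded.
sae = SparseAutoencoder(d_model=512, d_latent=2048)  # toy sizes
acts = torch.randn(64, 512)                          # stand-in activations
print(sae.loss(acts).item())
```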
-----
Key Insights 💡:
→ Most LLMs underperform humans at open-ended exploration in "Little Alchemy 2". The exception is the o1 model, which surpasses human performance.
→ Traditional LLMs rely primarily on uncertainty-driven exploration; unlike humans, they make little effective use of empowerment-based strategies.
→ Sparse Autoencoder analysis reveals that uncertainty and choice are represented in earlier transformer blocks of LLaMA3.1-70B. Empowerment is represented in later blocks.
→ This temporal gap suggests that LLMs "think too fast": they commit to decisions based on uncertainty before empowerment information is fully processed, and this premature decision-making hinders effective exploration.
-----
Results 📊:
→ o1 discovered 177 elements, outperforming humans, who averaged 42 elements over 500 trials (p < 0.001).
→ LLaMA3.1-8B discovered 9 elements, LLaMA3.1-70B discovered 25, and gpt-4o discovered 35, all significantly fewer than humans (p < 0.001).
→ Regression analysis showed near-zero empowerment weights for most LLMs, significantly lower than humans'. o1 showed the highest empowerment weight, exceeding even the human value.
→ Sparse Autoencoder analysis for LLaMA3.1-70B showed peak correlation of uncertainty values at layer 2 (r = 0.73) and empowerment values at layer 72 (r = 0.55).