LLMs need both correct mental-state attribution and an appropriate depth of mentalization for true Theory of Mind (ToM).
Current ToM benchmarks miss half the puzzle: knowing when to mentalize matters as much as how.
This paper identifies a critical gap in evaluating ToM capabilities in LLMs: benchmarks must assess both the depth of mentalization and the correctness of the resulting inference.
-----
https://arxiv.org/abs/2412.13631
🤔 Original Problem:
→ Current AI research focuses solely on testing whether LLMs can correctly attribute mental states, ignoring whether ToM should be invoked in the first place.
→ Existing benchmarks use static scenarios that don't properly evaluate interactive ToM capabilities.
-----
🔬 Solution in this Paper:
→ The paper proposes a two-step evaluation framework for ToM capabilities.
→ The first step determines whether to invoke ToM at all and, if so, at what depth of mentalization.
→ The second step checks whether the inference is correct given the chosen depth.
→ This decomposition distinguishes three types of ToM errors: Type A (unnecessary ToM use), Type B (insufficient depth), and Type C (incorrect reasoning), as sketched in the code below.
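A minimal sketch of how this two-step evaluation and error taxonomy could be operationalized. All names here (ToMJudgment, classify_error, required_depth, and the example values) are hypothetical illustrations of the idea, not the paper's actual implementation.

```python
# Hypothetical sketch of the two-step ToM error taxonomy (not the paper's code).
from dataclasses import dataclass
from enum import Enum


class ErrorType(Enum):
    NONE = "correct"
    A = "unnecessary ToM use"   # invoked ToM when none was needed
    B = "insufficient depth"    # mentalized, but not deeply enough
    C = "incorrect reasoning"   # right depth, wrong inference


@dataclass
class ToMJudgment:
    invoked_tom: bool   # Step 1a: did the model decide to mentalize?
    depth: int          # Step 1b: chosen depth (0 = none, 1 = first-order, ...)
    inference: str      # Step 2: the mental-state attribution it produced


def classify_error(judgment: ToMJudgment,
                   required_depth: int,
                   correct_inference: str) -> ErrorType:
    """Map a model's judgment onto the three error types."""
    if required_depth == 0:
        # The scenario needs no mentalizing: any ToM invocation is Type A.
        return ErrorType.A if judgment.invoked_tom else ErrorType.NONE
    if not judgment.invoked_tom or judgment.depth < required_depth:
        return ErrorType.B  # failed to mentalize deeply enough
    if judgment.inference != correct_inference:
        return ErrorType.C  # right depth, but the inference itself is wrong
    return ErrorType.NONE


# Example: a second-order belief task answered with only first-order reasoning.
judgment = ToMJudgment(invoked_tom=True, depth=1, inference="in the basket")
print(classify_error(judgment, required_depth=2,
                     correct_inference="in the box"))  # ErrorType.B
```

Separating the invocation/depth decision (step 1) from the inference check (step 2) is what lets an evaluator tell Type B apart from Type C, which is exactly what current single-step benchmarks cannot do.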
-----
🎯 Key Insights:
→ Biological agents adaptively choose ToM depth based on context and resource constraints
→ Linear probing methods cannot distinguish between different types of ToM failures
→ Interactive benchmarks are needed to test appropriate ToM invocation
-----
📊 Results:
→ Current benchmarks fail to distinguish between Type B (insufficient depth) and Type C (incorrect reasoning) errors
→ Static, vignette-based tasks preclude testing ToM as it is used in interaction, i.e., whether and how deeply a model should mentalize