Attention is all you need, even to catch LLM hallucinations.
Inconsistency in answers generated from attentive and non-attentive queries can indicate hallucinations.
This paper introduces a method that uses the LLM's own attention scores for zero-shot hallucination detection, improving accuracy while cutting computational cost compared to existing consistency-based methods.
-----
https://arxiv.org/abs/2501.09997
Original Problem 🧐:
→ Existing hallucination detection methods based on answer consistency are computationally expensive.
→ They rely on multiple LLM runs and may fail when LLMs are confidently wrong.
-----
Solution in this Paper 💡:
→ This paper proposes Attention-Guided Self-Reflection (AGSER).
→ AGSER uses attention scores to categorize query tokens into attentive and non-attentive sets.
→ Attentive queries contain the most important tokens based on attention contribution. Non-attentive queries contain the rest.
→ AGSER generates answers for both attentive and non-attentive queries separately.
→ It calculates consistency scores between these answers and the original answer.
→ The difference between attentive and non-attentive consistency scores estimates hallucination.
→ Lower attentive consistency and higher non-attentive consistency indicate higher hallucination probability.
→ AGSER needs only three LLM passes in total, far fewer than methods that resample many answers; a rough sketch of the flow follows below.
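Here is a minimal Python sketch of that flow, not the authors' implementation: llm_generate, token_attention, and consistency are hypothetical callables you would supply (an LLM call, per-token attention extraction, and an answer-similarity metric), and the 50/50 token split plus the sign convention of the final score are assumptions based on the description above.

```python
from typing import Callable, List, Tuple


def split_query(tokens: List[str], attn: List[float], top_frac: float = 0.5) -> Tuple[str, str]:
    """Split query tokens into attentive / non-attentive sub-queries by attention score."""
    k = max(1, int(len(tokens) * top_frac))  # assumed split ratio, not from the paper
    ranked = sorted(range(len(tokens)), key=lambda i: attn[i], reverse=True)
    attentive_idx = set(ranked[:k])
    # Preserve the original token order so both sub-queries stay readable.
    attentive = " ".join(t for i, t in enumerate(tokens) if i in attentive_idx)
    non_attentive = " ".join(t for i, t in enumerate(tokens) if i not in attentive_idx)
    return attentive, non_attentive


def agser_score(
    query: str,
    llm_generate: Callable[[str], str],             # one LLM call: prompt -> answer
    token_attention: Callable[[str], List[float]],  # attention contribution per query token
    consistency: Callable[[str, str], float],       # similarity between two answers
) -> float:
    """Return a hallucination score; higher means the answer is more likely hallucinated."""
    original_answer = llm_generate(query)                       # LLM pass 1
    tokens = query.split()
    attentive_q, non_attentive_q = split_query(tokens, token_attention(query))

    attentive_answer = llm_generate(attentive_q)                # LLM pass 2
    non_attentive_answer = llm_generate(non_attentive_q)        # LLM pass 3

    c_att = consistency(original_answer, attentive_answer)
    c_non = consistency(original_answer, non_attentive_answer)

    # Per the post: low attentive consistency plus high non-attentive consistency
    # suggests hallucination, so take their signed difference.
    return c_non - c_att
```

In practice, consistency could be any answer-similarity measure (e.g., an entailment or embedding score); the key point is that only three generations are needed, versus the many resamples required by consistency-only baselines.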