DecoPrompt tricks LLMs into telling the truth by rewording false-premise questions
A way to keep LLMs from hallucinating when a question's premise is false
LLMs often generate false information when faced with incorrect premises, even when they possess the correct knowledge. This paper introduces DecoPrompt, an algorithm that reduces hallucinations by measuring the entropy of false-premise prompts and selecting low-entropy paraphrases that are less likely to trigger false outputs.
-----
https://arxiv.org/abs/2411.07457
🤔 Original Problem:
LLMs struggle with false-premise questions, often generating hallucinated responses even when they hold the correct knowledge. Common mitigations like few-shot prompting or Chain-of-Thought often fail to help and can even increase hallucination rates.
-----
🔧 Solution in this Paper:
→ DecoPrompt analyzes the entropy of false-premise prompts to predict hallucination likelihood
→ The algorithm paraphrases the original prompt multiple times to generate alternative versions
→ It selects the version with the lowest entropy while preserving semantic meaning (see the sketch after this list)
→ Lower-entropy prompts are found to correlate with reduced hallucination rates
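Below is a minimal sketch of what this selection step could look like, assuming entropy is approximated as the mean token-level negative log-likelihood of the prompt under a small causal LM (here `gpt2` via Hugging Face `transformers`). The paper's exact entropy definition and paraphrasing procedure may differ, and the example question and rewordings are hypothetical.

```python
# Minimal DecoPrompt-style selection sketch (assumptions noted in comments).
# Entropy is approximated here as the mean token-level negative log-likelihood
# of the prompt under a causal LM -- the paper's exact definition may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # stand-in scorer model; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def prompt_entropy(prompt: str) -> float:
    """Approximate prompt entropy as the mean NLL of its tokens under the LM."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        # Passing labels=input_ids makes the model return mean cross-entropy loss
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return loss.item()

def select_low_entropy_paraphrase(original: str, paraphrases: list[str]) -> str:
    """Pick the semantically equivalent rewording with the lowest entropy."""
    candidates = [original] + paraphrases
    return min(candidates, key=prompt_entropy)

# Hypothetical false-premise question and rewordings (illustrative only)
question = "Why did Einstein fail his physics class in 1895?"
reworded = [
    "Einstein is said to have failed physics in 1895. Did that happen?",
    "Is it true that Einstein failed a physics class in 1895?",
]
print(select_low_entropy_paraphrase(question, reworded))
```

Because the scoring model only needs log-probabilities over the prompt, the selected low-entropy rewording can then be sent to any target LLM, which fits the cross-model transferability noted below.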
-----
💡 Key Insights:
→ Higher prompt entropy strongly correlates with increased hallucination likelihood (a toy check of this is sketched after this list)
→ The same question asked differently can significantly impact hallucination rates
→ DecoPrompt's approach transfers well across different LLM sizes and families
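The correlation insight can be illustrated with a toy check like the one below. It reuses `prompt_entropy` from the sketch above; the prompts and binary hallucination labels are invented for illustration, not taken from the paper.

```python
# Toy illustration of the entropy-hallucination correlation on hypothetical data.
from scipy.stats import pearsonr

prompts = [
    "Why did Einstein fail his physics class in 1895?",
    "Is it true that Einstein failed a physics class in 1895?",
    "Explain how the Great Wall of China is visible from the Moon.",
    "Can the Great Wall of China actually be seen from the Moon?",
]
# 1 = model answered as if the false premise were true, 0 = model pushed back
hallucinated = [1, 0, 1, 0]

entropies = [prompt_entropy(p) for p in prompts]  # from the sketch above
r, p_value = pearsonr(entropies, hallucinated)
print(f"Pearson r = {r:.2f} (p = {p_value:.3f})")
```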
-----
📊 Results:
→ Reduced hallucinations by up to 28.1% on the Fictitious dataset
→ Achieved 84% agreement with human evaluators on hallucination detection
→ Demonstrated strong cross-model transferability, working effectively across different LLMs