Memory-augmented LLMs that can pause, verify facts, and correct themselves on the fly
LLMs often generate incorrect facts. This paper introduces Explicit Working Memory (Ewe), which monitors generation in real time and corrects factual errors as they arise.
-----
https://arxiv.org/abs/2412.18069
🤔 Original Problem:
→ LLMs frequently hallucinate facts when generating long-form text
→ Standard retrieval-augmented generation (RAG) retrieves once before generation, so it cannot detect or fix factual errors that arise mid-generation
-----
🛠️ Solution in this Paper:
→ Ewe introduces a working memory system that stores knowledge from retrieved passages in multiple memory units
→ The system pauses generation periodically to check facts and refresh memory with new relevant information
→ When factual errors are detected, Ewe backtracks, updates memory with correct information, and regenerates text
→ Memory units process different passages in parallel, allowing flexible incorporation of various knowledge sources
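The pause-verify-backtrack loop described above can be sketched in a few lines. Every name here (`retrieve_correction`, `verify`, `generate_sentence`) is a hypothetical stand-in for illustration, not the paper's actual API; the real system verifies against retrieved evidence and conditions a language model on the memory units.

```python
# Illustrative sketch of Ewe's verify-and-backtrack generation loop.
# All functions below are toy stand-ins, not the paper's implementation.

REFRESH_INTERVAL = 3  # pause every N sentences; the paper tunes this interval

def retrieve_correction(flagged_sentence):
    """Stand-in retriever: fetch a passage correcting the flagged claim."""
    return f"correction for: {flagged_sentence}"

def verify(sentence):
    """Stand-in fact checker: flags sentences marked as wrong."""
    return "WRONG" not in sentence

def generate_sentence(step, memory):
    """Stand-in generator; the real model conditions on the memory units.
    Simulates a factual error at step 4 unless a correction is in memory."""
    if step == 4 and not any("WRONG claim" in m for m in memory):
        return "WRONG claim"
    return f"claim {step}"

def ewe_generate(n_sentences):
    memory = []   # working memory: one unit per retrieved passage
    output = []
    step = 0
    while step < n_sentences:
        output.append(generate_sentence(step, memory))
        step += 1
        # Pause periodically (and at the end) to verify what was generated.
        if step % REFRESH_INTERVAL == 0 or step == n_sentences:
            bad = [i for i, s in enumerate(output) if not verify(s)]
            if bad:
                # Backtrack to the first error, refresh memory, regenerate.
                memory.append(retrieve_correction(output[bad[0]]))
                output = output[:bad[0]]
                step = bad[0]
    return output
```

Running `ewe_generate(6)` produces the error at step 4 on the first pass, catches it at the next checkpoint, refreshes memory, and regenerates a correct sentence from that point onward.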
-----
💡 Key Insights:
→ Shorter memory units (128 tokens) perform better than longer ones
→ Supporting passages are more effective than instruction-based feedback
→ Combining C4 and Wikipedia knowledge improves factual accuracy
→ Refreshing memory at intermediate intervals (neither every token nor only once) gives the best results
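The first insight above, that shorter memory units work better, amounts to a simple chunking step over retrieved passages. A minimal sketch (using whitespace words as a stand-in for model tokens, which the paper actually counts):

```python
def chunk_into_units(passages, unit_len=128):
    """Split retrieved passages into fixed-size memory units.
    Word-based splitting here approximates the paper's token-based units."""
    units = []
    for passage in passages:
        tokens = passage.split()
        for i in range(0, len(tokens), unit_len):
            units.append(" ".join(tokens[i:i + unit_len]))
    return units
```

Each unit can then occupy its own memory slot, which is what lets the system attend to many passages in parallel rather than concatenating them into one long context.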
-----
📊 Results:
→ Improved VeriScore (factuality metric) by 2-10 points absolute across four datasets
→ Maintained helpfulness comparable to the base model
→ Outperformed all baseline methods including standard RAG and Chain-of-Verification