Intelligent sentence filtering: the key to faster, more accurate RAG.
EXIT improves RAG systems by intelligently compressing retrieved documents while preserving essential information, making question-answering faster and more accurate.
-----
https://arxiv.org/abs/2412.12559
🤔 Original Problem:
RAG systems struggle when retrievers fail to rank relevant documents well. Adding more documents hurts both speed and accuracy, because LLMs handle long contexts poorly and are easily distracted by irrelevant passages.
-----
🔧 Solution in this Paper:
→ EXIT splits retrieved documents into sentences and evaluates each one's relevance to the query
→ It uses parallel binary classification to determine if sentences contain answer-critical information
→ The system considers full document context when scoring each sentence, not just the sentence itself
→ Selected sentences are recombined in their original order to maintain coherence
→ The compression adapts dynamically based on query complexity and retrieval quality
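The pipeline above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the toy keyword-overlap heuristic in `is_relevant` stands in for EXIT's trained binary classifier, and the naive regex splitter stands in for its sentence segmentation. All function names here are assumptions for illustration.

```python
import re
from concurrent.futures import ThreadPoolExecutor

def split_sentences(document: str) -> list[str]:
    # Naive splitter; the paper's exact segmentation is not specified here.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

def is_relevant(query: str, sentence: str, full_document: str) -> bool:
    # Placeholder for the trained binary classifier. The real model scores
    # each sentence conditioned on the query AND the full document context;
    # this toy heuristic just checks for query-term overlap.
    query_terms = set(query.lower().split())
    sentence_terms = set(sentence.lower().split())
    return bool(query_terms & sentence_terms)

def compress(query: str, document: str, max_workers: int = 8) -> str:
    sentences = split_sentences(document)
    # Parallel binary classification: one independent call per sentence,
    # which is what makes the compression fast.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        keep = list(pool.map(lambda s: is_relevant(query, s, document), sentences))
    # Recombine selected sentences in their original order to keep coherence.
    return " ".join(s for s, k in zip(sentences, keep) if k)
```

For example, `compress("Where is the Eiffel Tower?", "Paris is the capital of France. It rains often. The Eiffel Tower is in Paris.")` drops the irrelevant middle sentence while preserving the order of the rest.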
-----
💡 Key Insights:
→ Context-aware sentence selection outperforms both abstractive and extractive baselines
→ Parallel processing enables fast compression without sacrificing accuracy
→ Preserving sentence order maintains document coherence
→ The framework works as a plug-and-play module for any RAG pipeline
-----
📊 Results:
→ Reduces processing time from several seconds to ~1 second
→ Achieves 86.8% token reduction while improving answer quality
→ Improves EM scores by 3.7 points with a 70B LLM
→ Maintains high accuracy across both single-hop and multi-hop QA tasks