Single-shot passage ranking with LLMs reduces API costs while improving accuracy
LLMs can now rank all passages in a single shot instead of using sliding windows, making search ranking faster and, with proper fine-tuning, more accurate.
-----
https://arxiv.org/abs/2412.14574v1
🔍 Original Problem:
Current LLM rerankers use a sliding-window strategy for passage ranking, repeatedly processing overlapping passages and driving up API costs and latency (a back-of-the-envelope sketch follows). This inefficiency stems from the limited input length of earlier models.
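A rough sketch of the call counts, assuming RankGPT-style defaults (window 20, stride 10); these parameters are illustrative, not taken from this paper:

```python
import math

def sliding_window_calls(num_passages: int, window: int = 20, step: int = 10) -> int:
    """LLM calls needed for one back-to-front sliding-window reranking pass."""
    if num_passages <= window:
        return 1
    # The window advances by `step`, so each call re-processes
    # `window - step` passages already scored by the previous call.
    return 1 + math.ceil((num_passages - window) / step)

def full_ranking_calls(num_passages: int) -> int:
    """A long-context LLM ranks every candidate in a single call."""
    return 1

for n in (20, 50, 100):
    print(f"{n} passages: {sliding_window_calls(n)} windowed calls vs "
          f"{full_ranking_calls(n)} full-ranking call")
# 100 passages -> 9 windowed calls vs 1 full-ranking call
```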
-----
🛠️ Solution in this Paper:
→ The paper introduces a full ranking strategy that uses long-context LLMs to process all passages simultaneously
→ A multi-pass sliding window approach generates complete ranking lists to serve as training labels
→ An importance-aware loss function weights passage IDs by their rank position, so top-ranked passages receive more attention (a sketch follows this list)
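A minimal PyTorch sketch of that idea, assuming a simple 1/rank weighting over the generated passage-ID tokens; the paper's exact weighting schedule and token handling may differ:

```python
import torch
import torch.nn.functional as F

def importance_aware_loss(logits: torch.Tensor,
                          target_ids: torch.Tensor) -> torch.Tensor:
    """Rank-weighted cross-entropy over a generated ranking list.

    logits:     (list_len, vocab_size) scores for each passage-ID token
    target_ids: (list_len,) gold passage IDs ordered best-first

    Assumption: weight 1/rank, so mistakes at the top of the list
    cost more than mistakes near the bottom.
    """
    list_len = target_ids.size(0)
    per_position = F.cross_entropy(logits, target_ids, reduction="none")
    # Rank 1 gets weight 1, rank 2 gets 1/2, ... top passages dominate.
    weights = 1.0 / torch.arange(1, list_len + 1, dtype=per_position.dtype)
    return (weights * per_position).sum() / weights.sum()

# Toy usage: rank 5 passages over a vocabulary of 100 passage-ID tokens.
logits = torch.randn(5, 100)
targets = torch.tensor([42, 7, 19, 3, 88])
print(importance_aware_loss(logits, targets))
```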
-----
💡 Key Insights:
→ Full ranking is more efficient but less effective in zero-shot settings
→ With supervised fine-tuning, full ranking outperforms sliding window approach
→ Initial passage order significantly impacts ranking performance
→ Increasing ranking iterations improves performance, but gains converge after 3-4 passes (see the multi-pass sketch below)
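A sketch of the multi-pass label-generation procedure under the same assumed window/stride; `rank_window` is a hypothetical stand-in for one LLM reranking call:

```python
from typing import Callable, List

def multi_pass_sliding_window(passages: List[str],
                              rank_window: Callable[[List[str]], List[str]],
                              window: int = 20, step: int = 10,
                              passes: int = 3) -> List[str]:
    """Refine a ranking with repeated back-to-front sliding-window passes.

    `rank_window` stands in for an LLM call that reorders one window of
    passages, best first. Gains converge after 3-4 passes, so 3 is the default.
    """
    order = list(passages)
    for _ in range(passes):
        # Slide from the tail toward the head so strong passages bubble up.
        for end in range(len(order), 0, -step):
            start = max(0, end - window)
            order[start:end] = rank_window(order[start:end])
            if start == 0:
                break  # window has reached the head of the list
    return order

# Toy usage: sorting by length is only a placeholder for an LLM reranker.
ranked = multi_pass_sliding_window([f"passage {i}" * i for i in range(1, 31)],
                                   rank_window=lambda w: sorted(w, key=len))
```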
-----
📊 Results:
→ 2.2 point improvement in NDCG@10 on TREC DL19
→ 29.3% reduction in latency compared to sliding window
→ 50% reduction in API costs
→ Consistent performance improvement across different numbers of candidate passages (20-100)
------
Are you into AI and LLMs❓ Join my daily AI newsletter. I will send you 7 emails a week analyzing the highest-signal AI developments. ↓↓
🎉 https://rohanpaul.substack.com/