"Towards Competitive Search Relevance For Inference-Free Learned Sparse Retrievers"

This podcast was generated from this paper with Google's Illuminate, a specialized tool that creates podcasts exclusively from arXiv papers.

This paper makes inference-free sparse retrievers competitive with dense models while maintaining efficiency, making lightweight search as smart as the heavy lifters, but way faster.

Smart token weighting makes sparse retrievers as good as dense ones, but faster

IDF-aware FLOPS and ensemble distillation bridge the gap between sparse and dense retrievers

https://arxiv.org/abs/2411.04403

🎯 Original Problem:

Inference-free sparse retrievers, while efficient, lag significantly behind both siamese sparse and dense models in search relevance, limiting their practical applications.

-----

🔧 Solution in this Paper:

→ Introduced an IDF-aware FLOPS loss that balances token penalties based on Inverse Document Frequency, preventing underestimation of semantically rich tokens (sketched in code after this list)

→ Developed heterogeneous ensemble knowledge distillation combining dense and sparse retrievers during pre-training

→ Implemented score normalization to prevent bias when combining the outputs of different retrievers (see the normalization sketch after this list)

→ Applied two-phase training: large-scale pre-training followed by MS MARCO dataset fine-tuning
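
Here is a minimal PyTorch sketch of what an IDF-aware FLOPS penalty could look like. This is not the paper's code; the exact weighting may differ. The idea it illustrates is to scale each token's FLOPS penalty by its IDF so that rare, semantically rich tokens are suppressed less than common ones:

```python
import torch

def idf_aware_flops_loss(token_weights: torch.Tensor, idf: torch.Tensor) -> torch.Tensor:
    """IDF-aware FLOPS regularizer (hedged sketch, not the paper's exact formulation).

    token_weights: (batch, vocab) non-negative sparse activations from the document encoder.
    idf: (vocab,) pre-computed inverse document frequencies.

    The standard FLOPS loss penalizes every token's squared mean activation equally;
    here each token's penalty is divided by its IDF, so informative tokens keep more weight.
    """
    mean_activation = token_weights.mean(dim=0)   # average weight of each vocab token over the batch
    flops_per_token = mean_activation.pow(2)      # standard FLOPS term per token
    return (flops_per_token / idf.clamp(min=1e-6)).sum()
```

And a hedged sketch of per-query score normalization before ensembling the dense and sparse teachers for distillation; the function names and the equal 0.5/0.5 teacher weighting are illustrative assumptions, not the paper's exact recipe:

```python
import torch

def min_max_normalize(scores: torch.Tensor) -> torch.Tensor:
    """Rescale each query's candidate scores to [0, 1] (per-query min-max)."""
    lo = scores.min(dim=1, keepdim=True).values
    hi = scores.max(dim=1, keepdim=True).values
    return (scores - lo) / (hi - lo).clamp(min=1e-6)

def ensemble_teacher_scores(dense_scores: torch.Tensor,
                            sparse_scores: torch.Tensor) -> torch.Tensor:
    """Combine scores from heterogeneous teachers (hedged sketch).

    Both inputs are (num_queries, num_candidates). Normalizing each teacher per query
    before averaging keeps the teacher with the larger raw score scale from dominating
    the distillation target.
    """
    return 0.5 * min_max_normalize(dense_scores) + 0.5 * min_max_normalize(sparse_scores)
```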
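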

-----

💡 Key Insights:

→ Uniform FLOPS regularization penalizes all tokens equally, hindering performance

→ The IDF-aware approach effectively reduces computational complexity while maintaining accuracy

→ Combining dense and sparse retrievers creates stronger supervisory signals

→ IDF values pre-computed from the training data show good generalization

-----

📊 Results:

→ Outperforms existing SOTA inference-free sparse models by 3.3 NDCG@10 points on the BEIR benchmark

→ Matches the performance of siamese sparse retrievers

→ Maintains client-side latency at only 1.1x that of BM25
