RARe (Retrieval Augmented Retrieval with In-Context Examples), proposed in this paper, teaches retrieval models to learn from in-context examples, much like LLMs do.
📚 https://arxiv.org/abs/2410.20088
Original Problem 🤔:
In-context learning works well for LLMs but has barely been explored for retrieval models. Simply prepending examples to queries doesn't help off-the-shelf retrievers, leaving a gap in example-based learning for information retrieval.
-----
Solution in this Paper 🛠️:
→ Introduces RARe (Retrieval Augmented Retrieval with In-Context Examples), which finetunes a pre-trained model on queries augmented with semantically similar query-document pairs as in-context examples
→ Uses BM25 to retrieve the relevant example pairs during both training and inference
→ Prepends a task instruction and the retrieved examples to the query in a fixed template (a minimal sketch follows this list)
→ Works with both decoder-only LLMs and existing retriever models
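To make the recipe concrete, here is a minimal sketch of the example-retrieval and query-augmentation step. The prompt template, variable names, and the use of the `rank_bm25` package are my assumptions for illustration; the paper's exact format may differ.

```python
# Minimal sketch of RARe-style query augmentation, assuming the
# `rank_bm25` package; template and names are illustrative,
# not the paper's exact format.
from rank_bm25 import BM25Okapi

# Pool of candidate in-context examples: (query, relevant document) pairs,
# e.g. drawn from the training data.
example_pool = [
    ("what causes rain", "Rain forms when water vapor condenses into droplets ..."),
    ("how do vaccines work", "Vaccines expose the immune system to a harmless ..."),
    ("why do leaves change color", "In autumn, chlorophyll breaks down and ..."),
]

# Index the example *queries* with BM25 so semantically similar
# pairs can be fetched for a new target query.
bm25 = BM25Okapi([q.split() for q, _ in example_pool])

def augment_query(instruction: str, query: str, k: int = 2) -> str:
    """Prepend a task instruction and the top-k retrieved example pairs."""
    demos = bm25.get_top_n(query.split(), example_pool, n=k)
    demo_text = "\n\n".join(f"Query: {q}\nDocument: {d}" for q, d in demos)
    return f"{instruction}\n\n{demo_text}\n\nQuery: {query}"

# The augmented string is what the retriever (a decoder-only LLM or an
# existing retriever model) embeds, during both finetuning and inference.
print(augment_query("Retrieve a passage that answers the question.",
                    "why is the sky blue"))
```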
-----
Key Insights 💡:
→ Retrieved (semantically similar) examples work better than randomly sampled ones (see the toy comparison after this list)
→ Performance improves as the number of examples grows (up to the 10 tested)
→ Including negative examples in the prompt doesn't improve performance
→ The relative latency overhead of example retrieval shrinks as corpus size grows
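A toy contrast of the two example-selection strategies behind the first insight, under the same assumptions as the sketch above (hypothetical pool, `rank_bm25`-based index; not the paper's code):

```python
# Retrieved vs. random example selection; the paper finds
# BM25-retrieved examples outperform randomly sampled ones.
import random

from rank_bm25 import BM25Okapi

pool = [
    ("what causes rain", "Rain forms when water vapor condenses ..."),
    ("how do vaccines work", "Vaccines expose the immune system ..."),
    ("why do leaves change color", "Chlorophyll breaks down in autumn ..."),
]
index = BM25Okapi([q.split() for q, _ in pool])

def select_examples(query: str, k: int = 1, strategy: str = "retrieved"):
    if strategy == "random":
        return random.sample(pool, k)                  # weaker baseline
    return index.get_top_n(query.split(), pool, n=k)   # RARe's choice
```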
-----
Results 📊:
→ Achieves +2.72% nDCG gains on reasoning-oriented retrieval tasks
→ Shows +1.41% nDCG@10 improvement on standard retrieval benchmarks
→ Demonstrates stronger out-of-domain generalization compared to baseline models
→ Maintains consistent improvements across different model architectures