Toward General Instruction-Following Alignment for Retrieval-Augmented Generation

Using automated verification and atomic instructions, VIF-RAG enables LLMs to follow complex rules during knowledge retrieval.

Nov 10, 2024


Combines atomic instructions with verification code to ensure LLMs follow rules while accessing external knowledge

Original Problem 🔍:

Existing instruction-following (IF) alignment methods for LLMs are less effective in Retrieval-Augmented Generation (RAG) scenarios, because retrieval introduces diverse knowledge into the model's input.

Solution in this Paper 🛠️:

• VIF-RAG: Automated, scalable, verifiable synthetic pipeline for IF alignment in RAG

• Starts with <100 atomic instructions and applies combination rules to build complex instructions

• Employs supervised models to rewrite instructions and to generate code that verifies them

• Integrates instructions with RAG and general data, scaling to >100K high-quality samples

• Introduces FollowRAG Benchmark: 3K test samples, 22 constraint types, 4 QA benchmarks
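
To make the first two bullets concrete, here is a minimal Python sketch of what a seed set of atomic instructions and a combination rule could look like. Everything in it is an assumption made for illustration: the `AtomicInstruction` class, the four example instructions, and the "at most one instruction per category" rule are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Iterator, Sequence, Tuple


@dataclass(frozen=True)
class AtomicInstruction:
    name: str                      # hypothetical identifier
    category: str                  # constraint family, used by the combination rule
    text: str                      # natural-language instruction shown to the model
    check: Callable[[str], bool]   # verifier applied to a model response


# A tiny hypothetical seed set; the real one is described only as "<100" instructions.
ATOMIC = [
    AtomicInstruction("max_50_words", "length", "Answer in at most 50 words.",
                      lambda r: len(r.split()) <= 50),
    AtomicInstruction("min_100_words", "length", "Answer in at least 100 words.",
                      lambda r: len(r.split()) >= 100),
    AtomicInstruction("all_lowercase", "case", "Use only lowercase letters.",
                      lambda r: r == r.lower()),
    AtomicInstruction("ends_with_period", "format", "End the answer with a period.",
                      lambda r: r.rstrip().endswith(".")),
]


def compatible(group: Sequence[AtomicInstruction]) -> bool:
    """Assumed combination rule: at most one instruction per category,
    so contradictory pairs such as two length limits are never merged."""
    categories = [ins.category for ins in group]
    return len(categories) == len(set(categories))


def combine(atomic: Sequence[AtomicInstruction],
            k: int) -> Iterator[Tuple[AtomicInstruction, ...]]:
    """Yield every compatible combination of k atomic instructions."""
    for group in combinations(atomic, k):
        if compatible(group):
            yield group


def render(group: Sequence[AtomicInstruction]) -> str:
    """Join the atomic texts into one complex instruction."""
    return " ".join(ins.text for ins in group)


def follows_all(group: Sequence[AtomicInstruction], response: str) -> bool:
    """A response follows a complex instruction only if every atomic check passes."""
    return all(ins.check(response) for ins in group)
```

With this toy seed set, `combine(ATOMIC, 2)` skips the pair of contradictory length limits and keeps the remaining pairs. Because each atomic instruction carries its own check, any combined instruction stays verifiable without extra work.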

Key Insights from this Paper 💡:

• First framework addressing IF alignment in RAG scenarios

• Automated verification at each step ensures high-quality data synthesis

• Seamless integration with various RAG benchmarks for comprehensive evaluation

• Balances IF alignment with preservation of LLM's foundational abilities

Results 📊:

• VIF-RAG outperforms all baselines in FollowRAG across multiple configurations

• >10% improvement on average accuracy compared to baselines

• Performance stays stable as the number of instructions increases (up to 4)

• Effectively preserves other foundational capabilities of LLMs

🧠 How does VIF-RAG generate high-quality instruction data?

VIF-RAG starts by manually crafting a minimal set of atomic instructions (<100) and developing combination rules to synthesize and verify complex instructions.

It then uses supervised models for instruction rewriting while simultaneously generating code to automate verification via a Python executor. Finally, it integrates these instructions with extensive RAG and general data samples, scaling up to over 100K high-quality samples through automated processes.
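
The executor-based check can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the paper's implementation: the convention that generated code defines a `check(response)` function, the idea of validating it against labelled test cases, and the names `load_check`, `verifier_is_consistent`, and `filter_samples` are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Case = Tuple[str, bool]  # (candidate response, whether it should pass)


def load_check(code: str) -> Optional[Callable[[str], bool]]:
    """Execute generated source and return its `check` function, if it defines one.

    NOTE: plain exec() provides no sandboxing; a real pipeline would need to
    isolate untrusted generated code (separate process, timeout, and so on).
    """
    namespace: dict = {}
    try:
        exec(code, namespace)
    except Exception:
        return None  # syntax errors and import-time failures reject the sample
    fn = namespace.get("check")
    return fn if callable(fn) else None


def verifier_is_consistent(code: str, cases: List[Case]) -> bool:
    """Accept a verifier only if it runs and agrees with every labelled case."""
    check = load_check(code)
    if check is None:
        return False
    for response, expected in cases:
        try:
            if bool(check(response)) != expected:
                return False
        except Exception:
            return False
    return True


def filter_samples(samples: List[dict]) -> List[dict]:
    """Keep only samples whose generated verifier passes the consistency test."""
    return [s for s in samples if verifier_is_consistent(s["code"], s["cases"])]


# Hypothetical generated samples for one instruction.
CASES: List[Case] = [
    ("Paris is the capital.", True),
    ("The capital city of France is Paris.", False),
]
good = {"instruction": "Answer in at most 5 words.",
        "code": "def check(response):\n    return len(response.split()) <= 5\n",
        "cases": CASES}
buggy = {"instruction": "Answer in at most 5 words.",  # counts characters, not words
         "code": "def check(response):\n    return len(response) <= 5\n",
         "cases": CASES}
broken = {"instruction": "Answer in at most 5 words.",  # does not even parse
          "code": "def check(response) return True",
          "cases": CASES}

kept = filter_samples([good, buggy, broken])
```

Here only `good` survives: the `buggy` verifier disagrees with the labelled cases and the `broken` one fails to execute. Filtering of this kind is one plausible way for automated verification to keep low-quality synthetic samples out of the final dataset.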

Rohan's Bytes
