Two LLMs working together: one fixes inputs, one polishes outputs, both improve results
RIRO (Reshaping Inputs, Refining Outputs) introduces a two-layer LLM architecture that reshapes inputs and refines outputs to improve performance in data-scarce environments, enabling better generalization and accuracy.
https://arxiv.org/abs/2412.15254
Original Problem 🤔:
→ LLMs struggle with accuracy and consistency when fine-tuned on small, domain-specific datasets
→ Current solutions like data augmentation add noise or require strict input formats, limiting real-world applicability
Solution in this Paper 🔧:
→ RIRO employs a novel two-layer architecture where the first layer reformulates inputs to match training data distribution
→ The second layer focuses on refining outputs to minimize inconsistencies
→ Uses Quantized Low-Rank Adaptation (QLoRA) for efficient fine-tuning while maintaining performance
→ Three model variants tested: Reshaping LLM (input focus), Refining LLM (output focus), and Stacked LLM (combined approach); a pipeline sketch follows below
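A minimal sketch of how the stacked two-layer flow could be wired up with Hugging Face pipelines. The checkpoint names and prompt templates are hypothetical placeholders, not the paper's exact setup; this only illustrates the reshape, generate, refine data flow.

```python
from transformers import pipeline

# Hypothetical fine-tuned checkpoints, one per layer (placeholders, not from the paper).
reshaper = pipeline("text-generation", model="your-org/phi-2-reshaper")  # layer 1: input reshaping
task_llm = pipeline("text-generation", model="your-org/phi-2-task")      # core task model
refiner  = pipeline("text-generation", model="your-org/phi-2-refiner")   # layer 2: output refining

def riro(raw_input: str) -> str:
    # Layer 1: rewrite the raw input so it matches the distribution
    # the task model was fine-tuned on.
    reshaped = reshaper(f"Rewrite this input into the expected format:\n{raw_input}",
                        max_new_tokens=256)[0]["generated_text"]

    # Core model produces a draft answer from the reshaped input.
    draft = task_llm(reshaped, max_new_tokens=256)[0]["generated_text"]

    # Layer 2: refine the draft to remove inconsistencies before returning it.
    return refiner(f"Refine and correct this output:\n{draft}",
                   max_new_tokens=256)[0]["generated_text"]
```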
Key Insights 💡:
→ Input reformulation significantly improves model generalization in data-scarce scenarios
→ Output refinement enhances consistency and accuracy of generated content
→ QLoRA enables efficient fine-tuning without compromising performance (a QLoRA setup is sketched after this list)
→ Combined input-output processing outperforms single-layer approaches
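A hedged sketch of what QLoRA fine-tuning of Phi-2 typically looks like with the Hugging Face `peft` and `bitsandbytes` stack; the rank, alpha, and target modules below are common defaults, assumptions rather than the paper's reported hyperparameters.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization keeps the frozen base weights small (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2", quantization_config=bnb_config, device_map="auto"
)
base = prepare_model_for_kbit_training(base)

# Low-rank adapters are the only trainable parameters (the "LoRA" part).
# Rank, alpha, and target modules here are common choices, not the paper's values.
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
)

model = get_peft_model(base, lora)
model.print_trainable_parameters()  # typically well under 1% of weights are trainable
```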
Results 📊:
→ RIRO achieved highest BLEU score (0.72) compared to baseline Phi-2 (0.55)
→ ROUGE-1 F1 score improved from 0.265 to 0.402
→ Cosine similarity increased from 0.816 to 0.891 (metric computation sketched below)
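For context, a rough sketch of how these three metrics can be computed with common libraries (`sacrebleu`, `rouge-score`, `sentence-transformers`); the embedding model and BLEU configuration are assumptions, so scores will not exactly reproduce the paper's evaluation.

```python
import sacrebleu
from rouge_score import rouge_scorer
from sentence_transformers import SentenceTransformer, util

def score(prediction: str, reference: str) -> dict:
    # Sentence-level BLEU via sacrebleu, rescaled from 0-100 to 0-1
    # (the paper's exact BLEU setup is not shown here).
    bleu = sacrebleu.sentence_bleu(prediction, [reference]).score / 100.0

    # ROUGE-1 F1: unigram overlap between prediction and reference.
    rouge1 = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True) \
        .score(reference, prediction)["rouge1"].fmeasure

    # Cosine similarity between sentence embeddings; the embedding model is an assumption.
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    emb = embedder.encode([prediction, reference], convert_to_tensor=True)
    cosine = util.cos_sim(emb[0], emb[1]).item()

    return {"bleu": bleu, "rouge1_f1": rouge1, "cosine": cosine}
```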