Fine-tuned 7B models trained on course textbooks beat 70B models on course questions.
Fine-tuning smaller LLMs on course textbook content enables them to outperform larger models on multiple-choice questions while requiring minimal compute.
https://arxiv.org/abs/2501.05891v1
🤔 Original Problem:
→ Educational institutions face challenges using LLMs due to high computational costs and poor performance on domain-specific questions.
-----
🔧 Solution in this Paper:
→ The researchers tested LLaMA-2 variants (7B, 13B, 70B) on 162 Programming Language MCQs.
→ They used LoRA and QLoRA techniques for efficient fine-tuning on course textbook content.
→ The process involved testing different learning rates, batch sizes, and epochs to optimize performance.
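The core idea behind LoRA can be sketched in a few lines of NumPy (a conceptual illustration, not the paper's training code): instead of updating a full weight matrix W, LoRA trains a low-rank update B·A, so only a tiny fraction of parameters need gradients — which is what makes consumer-GPU fine-tuning feasible.

```python
import numpy as np

# Sketch of the LoRA idea: adapt a frozen weight W (d_out x d_in) with a
# trainable low-rank update B @ A of rank r << min(d_out, d_in).
d_out, d_in, r = 4096, 4096, 8  # sizes typical of a 7B attention projection

W = np.random.randn(d_out, d_in).astype(np.float32)      # frozen pretrained weight
A = np.random.randn(r, d_in).astype(np.float32) * 0.01   # trainable
B = np.zeros((d_out, r), dtype=np.float32)               # trainable, init to zero

def lora_forward(x, alpha=16):
    # Adapted layer: W x + (alpha / r) * B A x
    return W @ x + (alpha / r) * (B @ (A @ x))

full_params = W.size              # 4096 * 4096 = 16,777,216
lora_params = A.size + B.size     # 8 * (4096 + 4096) = 65,536
print(f"trainable fraction: {lora_params / full_params:.4%}")
```

With B initialized to zero, the adapted layer starts out identical to the pretrained one; training then moves only A and B (~0.4% of the matrix's parameters here). QLoRA applies the same trick on top of a 4-bit-quantized base model.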
-----
💡 Key Insights:
→ Fine-tuned 7B and 13B models can run on consumer GPUs (24GB)
→ Single-chapter fine-tuning produced more stable results than using entire textbook
→ Quantized models showed minimal accuracy loss while reducing memory usage significantly
-----
📊 Results:
→ 13B quantized variants achieved 78% better performance than pre-trained versions
→ Fine-tuned 7B models required only 13GB memory vs 45GB for base models
→ Free tier Google Colab (15GB) can support inference and fine-tuning of 7B quantized models
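The memory figures above line up with simple back-of-envelope arithmetic (a sketch covering weights only; the paper's measured numbers also include activations, KV cache, and runtime overhead):

```python
# Rough weight-memory estimate for a 7B-parameter model at different precisions.
PARAMS_7B = 7e9

def weight_memory_gb(n_params, bits_per_param):
    # bits -> bytes -> gigabytes (decimal GB)
    return n_params * bits_per_param / 8 / 1e9

fp16_gb = weight_memory_gb(PARAMS_7B, 16)  # half-precision baseline
int4_gb = weight_memory_gb(PARAMS_7B, 4)   # 4-bit quantized (QLoRA-style)

print(f"7B fp16 weights: ~{fp16_gb:.1f} GB")   # ~14 GB
print(f"7B 4-bit weights: ~{int4_gb:.1f} GB")  # ~3.5 GB
```

At 4 bits, the weights alone drop to roughly 3.5 GB, which is why a 7B quantized model plus LoRA adapters and runtime overhead can squeeze into a free-tier 15 GB Colab GPU.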
------
Are you into AI and LLMs❓ Join my daily AI newsletter. I will send you 7 emails a week analyzing the highest-signal AI developments. ↓↓
🎉 https://rohanpaul.substack.com/