Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning

Nov 11, 2024

Montessori-Instruct, a novel data synthesis framework, uses student feedback to train teacher LLMs to generate better synthetic training data.

Original Problem 🔍:

Synthetic data for training LLMs often lacks quality and relevance, leading to ineffective student learning.

Solution in this Paper 🛠️:

• Montessori-Instruct: A framework that tailors the teacher's data synthesis to the student's learning preferences

• Uses local data influence to measure synthetic data utility for student learning

• Optimizes teacher model via Direct Preference Optimization (DPO)

• Iterative process: probing dataset → influence collection → preference dataset → teacher optimization
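To make the "influence collection → preference dataset" step concrete, here is a minimal Python sketch. It assumes each teacher-generated instruction has already been scored with an influence value, and it pairs the highest-scoring instruction with the lowest-scoring one for each seed prompt. The function name `build_preference_pairs`, the record layout, and the pairing rule are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict


def build_preference_pairs(records):
    """Turn influence-scored instructions into (prompt, chosen, rejected) triples.

    records: iterable of (seed_prompt, instruction, influence) tuples.
    Pairing rule (an assumption for illustration): for each seed prompt,
    the highest-influence instruction is "chosen" and the lowest is "rejected".
    """
    by_prompt = defaultdict(list)
    for prompt, instruction, influence in records:
        by_prompt[prompt].append((influence, instruction))

    pairs = []
    for prompt, scored in by_prompt.items():
        if len(scored) < 2:
            continue  # a preference needs at least two candidates
        best = max(scored, key=lambda s: s[0])
        worst = min(scored, key=lambda s: s[0])
        if best[0] > worst[0]:  # skip ties, which carry no preference signal
            pairs.append((prompt, best[1], worst[1]))
    return pairs


# Hypothetical probing results; positive influence means the student improved.
records = [
    ("seed A", "Explain recursion with an example.", 0.12),
    ("seed A", "Say something.", -0.05),
    ("seed A", "List three sorting algorithms.", 0.03),
    ("seed B", "Translate 'hello' into French.", 0.01),
]
pairs = build_preference_pairs(records)
```

In this toy run, "seed B" yields no pair because it has only one candidate; the resulting triples are the kind of input a DPO trainer would consume.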

Key Insights from this Paper 💡:

• Local data influence effectively captures student learning preferences

• Teacher optimization outperforms data bootstrapping or response optimization

• Synthetic data from optimized teacher generalizes well across different student models

• Multiple iterations of Montessori-Instruct continue to improve student performance

Results 📊:

• Outperforms Self-Instruct by a relative 18.35% (Alpaca Eval) and 46.24% (MT-Bench)

• Surpasses data synthesized by GPT-4o

• Improves performance on out-of-domain tasks (e.g., MMLU, GSM8K)

• Demonstrates robustness across different seed data, iterations, and student models

How Montessori-Instruct works

🔍 Montessori-Instruct utilizes local data influence of synthetic training data points on students to characterize students' learning preferences.
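As a rough illustration of local data influence, the toy sketch below trains a one-parameter "student" for a single gradient step on one synthetic data point and measures how much the loss on a reference set drops. The linear model, the squared-error loss, the learning rate, and the sign convention (positive means helpful) are all simplifying assumptions made here; with an actual LLM student, the same idea would apply to its training loss.

```python
def reference_loss(w, reference):
    """Mean squared error of the toy student y = w * x on the reference set."""
    return sum((w * x - y) ** 2 for x, y in reference) / len(reference)


def local_influence(w, point, reference, lr=0.1):
    """Reference-loss reduction after one gradient step on a single data point.

    Positive values mean training on `point` helped the student on the
    reference set; negative values mean it hurt.
    """
    x, y = point
    grad = 2 * (w * x - y) * x  # derivative of (w*x - y)^2 with respect to w
    w_after = w - lr * grad     # one step of gradient descent
    return reference_loss(w, reference) - reference_loss(w_after, reference)


reference = [(1.0, 2.0), (2.0, 4.0)]  # reference data follows y = 2x
w0 = 0.0                              # untrained toy student

helpful = local_influence(w0, (1.0, 2.0), reference)   # agrees with y = 2x
harmful = local_influence(w0, (1.0, -2.0), reference)  # contradicts y = 2x
```

Here the consistent point gets a positive score and the contradictory one a negative score, which is the signal used to rank synthetic data by how useful it is to the student.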

🔄 It then trains the teacher model with Direct Preference Optimization (DPO) to generate synthetic data tailored toward student learning preferences.
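For reference, the standard DPO objective for a single preference pair can be written in a few lines of Python. This is a sketch of the generic DPO loss, not the paper's training code; the inputs are assumed to be summed log-probabilities of the chosen and rejected outputs under the teacher being trained (the policy) and under a frozen reference copy, and `beta=0.1` is just a commonly used default.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) pair, given sequence log-probabilities.

    loss = -log(sigmoid(beta * [(policy_chosen - ref_chosen)
                                - (policy_rejected - ref_rejected)]))
    """
    chosen_reward = beta * (policy_chosen - ref_chosen)
    rejected_reward = beta * (policy_rejected - ref_rejected)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(m)) equals log(1 + exp(-m)); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical log-probabilities for one preference pair.
loss_before = dpo_loss(-12.0, -12.0, -12.0, -12.0)  # policy equals reference
loss_after = dpo_loss(-10.0, -14.0, -12.0, -12.0)   # policy now favors chosen
```

The loss falls as the teacher assigns relatively more probability to the chosen (high-influence) instruction than the reference model does, which is what pushes the teacher toward data the student learns from.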

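🧠 The process involves four steps: building a probing dataset, collecting local data influence, constructing a preference dataset, and optimizing the teacher.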

This process can be iterated multiple times to continually refine the teacher.

Rohan's Bytes