Train your small AI models with help from the big ones, privately
FedCoLLM is a framework that jointly improves a server-side LLM and client-side Small Language Models (SLMs) while preserving data privacy, combining parameter-efficient federated learning with mutual knowledge distillation.
-----
https://arxiv.org/abs/2411.11707
🤔 **Original Problem**:
→ Organizations can't directly share sensitive domain data with LLM providers for fine-tuning
→ Small companies lack resources to fine-tune large models locally
→ No existing solution for mutual knowledge transfer between server LLMs and client SLMs
-----
🛠️ **Solution in this Paper**:
→ FedCoLLM deploys lightweight LoRA adapters as bridges between clients and server
→ Runs mutual knowledge distillation between the server LLM and the aggregated client SLM over a public auxiliary dataset (see the sketch after this list)
→ Implements secure aggregation to protect privacy during knowledge transfer
→ Enables bidirectional knowledge flow while keeping raw data private
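
A minimal sketch of what such a mutual distillation step could look like, assuming both models expose next-token logits over a shared vocabulary. The toy stand-in models, temperature, and optimizer settings are illustrative assumptions, not the paper's exact objective or configuration:

```python
import torch
import torch.nn.functional as F
from torch import nn

def mutual_distillation_step(llm, slm, aux_tokens, opt_llm, opt_slm, temperature=2.0):
    """One bidirectional distillation step on a public auxiliary batch.

    Each model is nudged toward the other's detached output distribution,
    so knowledge flows both ways without exchanging any private client data.
    """
    llm_logits = llm(aux_tokens)   # [batch, vocab]
    slm_logits = slm(aux_tokens)

    # The aggregated client-side model learns from the server LLM's distribution ...
    loss_slm = F.kl_div(
        F.log_softmax(slm_logits / temperature, dim=-1),
        F.softmax(llm_logits.detach() / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    # ... and the server LLM learns from the aggregated client model.
    loss_llm = F.kl_div(
        F.log_softmax(llm_logits / temperature, dim=-1),
        F.softmax(slm_logits.detach() / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    opt_slm.zero_grad(); loss_slm.backward(); opt_slm.step()
    opt_llm.zero_grad(); loss_llm.backward(); opt_llm.step()
    return loss_llm.item(), loss_slm.item()

# Toy usage with stand-in models (a real run would use the LoRA-adapted LLM and SLM).
vocab, dim = 1000, 64
llm = nn.Sequential(nn.Embedding(vocab, dim), nn.Flatten(1), nn.Linear(dim * 8, vocab))
slm = nn.Sequential(nn.Embedding(vocab, dim), nn.Flatten(1), nn.Linear(dim * 8, vocab))
aux = torch.randint(0, vocab, (4, 8))      # a batch from the public auxiliary set
opt_l = torch.optim.AdamW(llm.parameters(), lr=1e-4)
opt_s = torch.optim.AdamW(slm.parameters(), lr=1e-4)
print(mutual_distillation_step(llm, slm, aux, opt_l, opt_s))
```

Because only logits on public auxiliary data are exchanged, neither side ever sees the other's raw training corpus.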
-----
💡 **Key Insights**:
→ Parameter-efficient adapters cut communication costs to just 0.23-0.29% of full-model fine-tuning (rough arithmetic after this list)
→ Mutual knowledge distillation enables effective knowledge transfer without raw data sharing
→ Federated framework with secure aggregation preserves client privacy
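
As a rough illustration of why adapter exchange is so cheap, the back-of-the-envelope arithmetic below counts LoRA parameters for a 7B-class model. The hidden size, layer count, wrapped projections, and rank are assumptions chosen for illustration, not the configuration reported in the paper:

```python
# Why LoRA adapters shrink the per-round payload (illustrative numbers only).
hidden = 4096          # transformer hidden size (assumed)
layers = 32            # number of transformer blocks (assumed)
proj_per_layer = 4     # attention projections wrapped with LoRA: q, k, v, o (assumed)
rank = 8               # LoRA rank (assumed)

full_params = 7e9                                            # full-model exchange
lora_params = layers * proj_per_layer * 2 * hidden * rank    # A (d x r) + B (r x d) per projection

print(f"LoRA payload: {lora_params / 1e6:.1f}M params "
      f"({lora_params / full_params:.2%} of a full-model exchange)")
# -> roughly 8.4M params, about 0.12% of 7B; wrapping more modules or using a
#    larger rank lands in the few-tenths-of-a-percent range the paper reports.
```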
-----
📊 **Results**:
→ Achieves a 41-66% improvement over zero-shot performance across different model combinations
→ Matches centralized fine-tuning performance while incurring only 0.23-0.29% of its communication cost
→ Shows consistent gains for both server LLMs and client SLMs across multiple NLP tasks