Training and storing a separate LoRA model for each task is inefficient in both storage and inference in multi-task scenarios.
Existing parameter generation methods fail to capture correlations between tasks.
The paper proposes ICM-LoRA, a method that uses a Conditional Variational Autoencoder (CVAE) to generate task-specific LoRA weights for efficient customization of LLMs.
-----
📌 CVAE as a LoRA Generator: Instead of storing separate Low-Rank Adaptation (LoRA) weights for every task, the Conditional Variational Autoencoder (CVAE) generates task-specific parameters on demand. This reduces storage needs while maintaining performance comparable to standard fine-tuned LoRA models (a minimal sketch of conditional generation follows this list).
📌 Task Vectors as Condition Inputs: Extracting task vectors from fine-tuned model hidden states allows CVAE to model inter-task relationships. This ensures task-aware LoRA weight synthesis, improving efficiency in multi-task adaptation.
📌 Meta-Learned Parameter Generation: In-context meta-learning enables CVAE to generalize across tasks, learning LoRA parameter distributions conditioned on task vectors. This eliminates the need for repetitive fine-tuning, making multi-task LoRA deployment more scalable.
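To make the generator idea concrete, here is a minimal PyTorch sketch (not the authors' implementation) of a conditional decoder that maps a latent sample plus a task vector to the flattened A and B matrices of one LoRA layer; the rank, layer dimensions, and network widths below are placeholder assumptions.

```python
# Minimal sketch (assumptions, not the paper's code): decode (z, task_vector)
# into the LoRA A and B matrices for a single adapted layer.
import torch
import torch.nn as nn

RANK, D_IN, D_OUT = 8, 4096, 4096        # LoRA rank and adapted-layer dims (assumed)
TASK_DIM, LATENT_DIM = 512, 128          # task-vector / latent sizes (assumed)
LORA_NUMEL = RANK * D_IN + D_OUT * RANK  # numel of A (r x d_in) plus B (d_out x r)

class LoRADecoder(nn.Module):
    """Map a latent sample and a task vector to flattened LoRA weights."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + TASK_DIM, 1024),
            nn.ReLU(),
            nn.Linear(1024, LORA_NUMEL),
        )

    def forward(self, z, task_vec):
        flat = self.net(torch.cat([z, task_vec], dim=-1))
        A = flat[..., : RANK * D_IN].reshape(-1, RANK, D_IN)
        B = flat[..., RANK * D_IN:].reshape(-1, D_OUT, RANK)
        return A, B

# Inference: sample a latent from the prior, condition on a task vector,
# and obtain the low-rank weight update for that task.
decoder = LoRADecoder()
task_vec = torch.randn(1, TASK_DIM)   # stand-in for a real task vector
z = torch.randn(1, LATENT_DIM)        # prior sample
A, B = decoder(z, task_vec)
delta_W = B @ A                       # low-rank update, shape (1, d_out, d_in)
print(delta_W.shape)
```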
-----
https://arxiv.org/abs/2501.17635
Methods explored in this Paper 💡:
→ ICM-LoRA uses a Conditional Variational Autoencoder (CVAE) as a generator.
→ The CVAE is conditioned on task vectors that encode each task's description/context, producing task-aware LoRA weights.
→ In-context meta-learning enhances CVAE to learn task-parameter distribution relationships.
→ Task vectors are extracted from the hidden states of a fine-tuned model.
→ The CVAE is trained on LoRA parameters and task vectors from multiple tasks (see the training sketch after this list).
→ Trained CVAE generates LoRA weights for specific tasks without further fine-tuning.
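A rough sketch of how such a CVAE could be trained on (LoRA parameters, task vector) pairs, assuming flattened LoRA weights, an MSE reconstruction term, and a standard reparameterized KL term; all sizes, the `beta` weighting, and the toy data below are hypothetical, not the paper's exact setup.

```python
# Minimal sketch (assumed setup): train a CVAE on pairs of
# (flattened LoRA parameters, task vector) collected from several tasks.
import torch
import torch.nn as nn
import torch.nn.functional as F

PARAM_DIM, TASK_DIM, LATENT_DIM = 65_536, 512, 128  # assumed sizes

class LoRACVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(PARAM_DIM + TASK_DIM, 1024), nn.ReLU(),
            nn.Linear(1024, 2 * LATENT_DIM),        # outputs (mu, logvar)
        )
        self.decoder = nn.Sequential(
            nn.Linear(LATENT_DIM + TASK_DIM, 1024), nn.ReLU(),
            nn.Linear(1024, PARAM_DIM),
        )

    def forward(self, lora_params, task_vec):
        mu, logvar = self.encoder(torch.cat([lora_params, task_vec], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        recon = self.decoder(torch.cat([z, task_vec], -1))
        return recon, mu, logvar

def cvae_loss(recon, target, mu, logvar, beta=1e-3):
    recon_loss = F.mse_loss(recon, target)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + beta * kl

model = LoRACVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# Toy stand-ins for fine-tuned LoRA weights and their task vectors.
lora_params = torch.randn(16, PARAM_DIM)
task_vecs = torch.randn(16, TASK_DIM)

for step in range(50):
    recon, mu, logvar = model(lora_params, task_vecs)
    loss = cvae_loss(recon, lora_params, mu, logvar)
    opt.zero_grad(); loss.backward(); opt.step()
```

After training, only the decoder is needed at deployment time: sampling a latent and conditioning on a new task vector yields LoRA weights without any further fine-tuning.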
-----
Key Insights from this Paper 🧐:
→ Task vectors from different categories form distinct clusters.
→ Task vectors can represent high-level features of different categories (a sketch of one way to extract them follows this list).
→ Task vectors can serve as condition vectors to guide CVAE generation.
→ In-context meta-learning helps CVAE understand task context for better LoRA parameter generation.
→ CVAE can learn the distribution of LoRA parameters conditioned on task vectors.
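One plausible way to obtain such task vectors, sketched below with a Hugging Face causal LM: mean-pool a chosen layer's hidden states over a few in-context examples per task. The model name, layer choice, and pooling are assumptions rather than the paper's exact extraction procedure.

```python
# Minimal sketch (assumed procedure): build a task vector by mean-pooling
# hidden states over a handful of in-context examples for the task.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder backbone, not the paper's fine-tuned model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)

@torch.no_grad()
def task_vector(examples, layer=-1):
    """Mean-pool one layer's hidden states over tokens, then over examples."""
    vecs = []
    for text in examples:
        ids = tok(text, return_tensors="pt")
        hidden = model(**ids).hidden_states[layer]  # (1, seq_len, hidden_dim)
        vecs.append(hidden.mean(dim=1).squeeze(0))  # pool over tokens
    return torch.stack(vecs).mean(dim=0)            # pool over examples

dog_vec = task_vector(["A photo of a dog.", "Detect the dog in the image."])
cat_vec = task_vector(["A photo of a cat.", "Detect the cat in the image."])
# Task vectors from different categories should separate (the paper observes
# distinct clusters); cosine similarity gives a quick sanity check.
print(torch.nn.functional.cosine_similarity(dog_vec, cat_vec, dim=0))
```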
-----
Results 📊:
→ ICM-LoRA achieves object detection mAP@50 of 0.96 and mAP@75 of 0.89 on the Dog task, matching the original LoRA.
→ ICM-LoRA achieves language modeling perplexity of 6.74 and bits-per-character (BPC) of 0.40 on the ArXiv subset, comparable to the original LoRA.
→ The ICM-LoRA generator is 283 MB, only about 1% of the storage required by the original LoRA weights and datasets.