Survey paper on Parameter-Efficient Fine-Tuning (PEFT).
Categorizes and reviews PEFT techniques across diverse Foundation Models (FMs), analyzing their core mechanisms, applications, and future directions.
-----
Paper - https://arxiv.org/abs/2501.13787
Methods discussed in this Paper 💡:
→ PEFT slashes training cost by updating only a small fraction of a model's parameters while aiming to match full fine-tuning performance on downstream tasks.
→ Key PEFT categories include: Selective (freezing or masking parameters), Additive (inserting adapter networks), Prompt (learning soft prompts), Reparameterization (modifying existing parameters via low-rank or structured updates), and Hybrid (combining multiple techniques).
→ The survey systematically analyzes each category, discussing its core mechanism and how it is applied to different FMs such as large language models (LLMs), vision foundation models (VFMs), and multimodal foundation models (MFMs).
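To make the reparameterization category concrete, here is a minimal NumPy sketch of a LoRA-style update (a toy illustration under my own assumptions, not code from the survey): the pretrained weight W stays frozen, and only two small low-rank factors A and B are trained, with B zero-initialized so the model starts out unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 768, 768, 8  # hypothetical layer sizes and LoRA rank

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, r))                   # zero-init: update starts at zero

def lora_forward(x, alpha=16.0):
    """Frozen path W @ x plus the scaled low-rank update B @ A @ x."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
full_params = W.size            # what full fine-tuning would train
lora_params = A.size + B.size   # what LoRA trains: r * (d_in + d_out)
print(f"trainable: {lora_params} vs full: {full_params} "
      f"({100 * lora_params / full_params:.2f}%)")
```

At rank 8 on a 768x768 layer, the trainable factors amount to only about 2% of the layer's parameters, which is the mechanism behind the large reductions reported below.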
-----
Key Insights from this Paper 🤔:
→ PEFT methods demonstrate remarkable growth and are successfully applied across diverse FMs and tasks.
→ LLMs and VFMs dominate current PEFT research, with vision-language models (VLMs) and visual generation models (VGMs) gaining traction, while MFMs remain relatively underexplored.
→ PEFT methods face challenges regarding reliability due to hyperparameter sensitivity and limited representation capacity.
→ Future directions include interdisciplinary research, continual PEFT, architecture-specific optimizations, and scaling law exploration.
-----
Results 💯:
→ LoRA reduces trainable parameters by over 99.97% compared to full fine-tuning of GPT-3, training only 4.7M or 37.7M parameters (depending on configuration) while achieving near full fine-tuning performance.
→ PASTA achieves a 90.8% F1 score on CoNLL2003 Named Entity Recognition, outperforming P-tuning v2 by 0.6 points with 20 times fewer trainable parameters.
→ AdapterDrop reduces memory costs by 69% when fine-tuning T5 and CLIP-T5, outperforming other methods that achieve only a 26% reduction under similar parameter usage.
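The LoRA reduction figure above is easy to sanity-check: against GPT-3's roughly 175B total parameters, even the larger 37.7M-parameter configuration updates well under 0.03% of the model.

```python
# Back-of-envelope check of the LoRA reduction claim,
# assuming GPT-3's ~175B total parameter count.
full_params = 175e9

for trainable in (4.7e6, 37.7e6):
    reduction = 100 * (1 - trainable / full_params)
    print(f"{trainable / 1e6:.1f}M trainable -> "
          f"{reduction:.3f}% fewer trained parameters")
```

Both configurations come out above a 99.97% reduction, consistent with the reported result.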