Fine-tuning large language models (LLMs) is computationally expensive, and Low-Rank Adaptation (LoRA) provides a cost-effective solution by approximating weight updates through low-rank matrices. In real-world scenarios, LLMs are fine-tuned on data from multiple domains to perform tasks across various fields, embodying multi-task learning (MTL). LoRA often underperforms in such complex scenarios. To enhance LoRA's capability in multi-task learning, we propose R-LoRA, which incorporates Multi-Head Randomization. Multi-Head Randomization diversifies the head matrices through Multi-Head Dropout and Multi-Head Random Initialization, enabling more efficient learning of task-specific features while maintaining shared knowledge representation. Our approach not only improves performance in MTL but also reduces GPU memory usage and training time. Experiments show that R-LoRA's gains stem from increased diversity in the head matrices, demonstrating its effectiveness for multi-task learning. The code is available at https://github.com/jinda-liu/R-LoRA
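The multi-head idea described above can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes a shared down-projection `A`, several head matrices `B_k` that are each randomly initialized, and an independent dropout mask per head applied to the shared low-rank activation. The class name `MultiHeadLoRASketch`, the Gaussian initialization scale, the uniform averaging of heads, and the omission of the frozen base weight and of any routing mechanism are all assumptions made for illustration.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, scale, rng):
    """Gaussian random matrix; the scale is an illustrative assumption."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def dropout(vector, p, rng):
    """Inverted dropout: zero each entry with probability p, rescale the rest."""
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in vector]


class MultiHeadLoRASketch:
    """Hypothetical multi-head low-rank adapter (illustration only)."""

    def __init__(self, d_in, d_out, rank, n_heads, p_drop=0.1, seed=0):
        self.rng = random.Random(seed)
        self.d_out = d_out
        self.p_drop = p_drop
        # Shared low-rank down-projection (rank x d_in).
        self.A = rand_matrix(rank, d_in, 0.02, self.rng)
        # Each head gets its own random initialization, so heads start distinct.
        self.heads = [rand_matrix(d_out, rank, 0.02, self.rng)
                      for _ in range(n_heads)]

    def delta(self, x, train=True):
        """Low-rank update for input x, averaged over heads (assumed mixing)."""
        z = matvec(self.A, x)
        out = [0.0] * self.d_out
        for B in self.heads:
            # Each head sees an independently sampled dropout mask in training.
            z_k = dropout(z, self.p_drop, self.rng) if train else z
            y = matvec(B, z_k)
            out = [o + y_i / len(self.heads) for o, y_i in zip(out, y)]
        return out


layer = MultiHeadLoRASketch(d_in=4, d_out=3, rank=2, n_heads=3)
update = layer.delta([1.0, 0.5, -0.5, 2.0])
```

In a full adapter, `update` would be added to the output of the frozen pretrained layer. Because every head draws its own initialization and its own dropout mask, the heads receive different inputs and gradients from the first step, which is the kind of diversity the abstract attributes the gains to.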
Jinda Liu
Yi Chang
Yuan Chieh Wu
Liu et al. studied this question.
www.synapsesocial.com/papers/68e6a0f4718ef0a556b33e58 — DOI: https://doi.org/10.48550/arxiv.2502.15455