Traditionally, developing new language models (LMs) capable of addressing multiple tasks involves fine-tuning pre-trained LMs on a wide collection of datasets, a process that often incurs significant computational expense. Model merging emerges as a cost-effective alternative: it integrates existing models fine-tuned on different tasks into a single model that performs well across all tasks, eliminating the need for additional training. In this paper, we propose RankMean, an algorithm for merging fine-tuned LMs without requiring any downstream data. RankMean determines merging coefficients based on the relative rankings of weight change magnitudes and applies these coefficients for module-wise integration of various fine-tuned models. Our experimental results demonstrate that RankMean outperforms existing baseline methods on multiple benchmarks. The code is available at github.com/VITA-Group/RankMean.
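The abstract gives only a one-sentence description of the merging rule, so the following is a minimal sketch of one possible reading of it, not the authors' implementation (see the linked repository for that). It assumes that each model is a dictionary mapping module names to NumPy arrays, that "weight change magnitude" is the L2 norm of the difference from the pre-trained weights, that ranking is done across models within each module, and that coefficients are ranks normalized to sum to one. The function name `rank_based_merge` and all variable names are hypothetical.

```python
import numpy as np

def rank_based_merge(base, finetuned):
    """Merge fine-tuned models module by module using rank-derived coefficients.

    base: dict of module name -> pre-trained weights (np.ndarray)
    finetuned: list of dicts with the same keys and shapes as `base`
    """
    merged = {}
    for name, base_w in base.items():
        # Assumption: magnitude of weight change = L2 norm of (fine-tuned - base).
        mags = np.array([np.linalg.norm(ft[name] - base_w) for ft in finetuned])
        # Rank models within this module: 1 = smallest change, k = largest.
        ranks = np.argsort(np.argsort(mags, kind="stable"), kind="stable") + 1
        # Assumption: coefficients are the ranks normalized to sum to 1.
        coeffs = ranks / ranks.sum()
        merged[name] = sum(c * ft[name] for c, ft in zip(coeffs, finetuned))
    return merged

# Toy usage with a single two-element "module" and two fine-tuned models.
base = {"w": np.zeros(2)}
model_a = {"w": np.array([1.0, 0.0])}  # small change from base
model_b = {"w": np.array([0.0, 3.0])}  # larger change from base
merged = rank_based_merge(base, [model_a, model_b])
```

No downstream data is touched here: the coefficients depend only on the weights themselves, which is the property the abstract emphasizes. Whether larger or smaller changes should receive the larger coefficient is a design choice this sketch simply fixes one way.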
Gabriel Perin
Xuxi Chen
Shusen Liu
Perin et al. studied this question.
www.synapsesocial.com/papers/6a08faf00465d979db9d0a87 — DOI: https://doi.org/10.18653/v1/2024.findings-acl.104