March 1, 2024

Enhancing Neural Network Efficiency with Streamlined Pruned Linear Adapters

Abstract

The growing number of parameters in recent machine learning models has increased the computational cost and resource demands of fine-tuning. In response, parameter-efficient transfer learning has attracted significant attention. Among the available solutions, the Adapter method stands out for reducing the number of fine-tuned parameters while preserving model performance. In this paper, we introduce a new adapter design combined with a model pruning technique. Unlike conventional adapters built from neural network structures, our adapter uses linear transformations, which lowers the fine-tuning parameter count relative to existing approaches. In addition, our method prunes less important adapters from the transformer module during both training and inference. Extensive experiments on the GLUE benchmark show that our approach reduces the parameter count by 25K in BERT-large and 102K in RoBERTa-large compared with non-pruning methods, while achieving average accuracies of 81.3% and 89.5%, respectively.
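To make the parameter argument in the abstract concrete, here is a minimal sketch in plain Python. The abstract does not specify the adapter's exact form, so the element-wise transform `y = scale * x + shift`, the bottleneck width of 64, the use of one adapter per layer, and every function name below are assumptions made for illustration, not the authors' implementation. The hidden size of 1024 and the 24 layers are the standard BERT-large configuration.

```python
# Illustrative sketch only: the adapter form and all names are assumptions,
# not the design described in the paper.

def bottleneck_adapter_params(hidden: int, bottleneck: int) -> int:
    """Parameters in a conventional bottleneck adapter:
    down-projection (weights + bias) followed by up-projection (weights + bias)."""
    down = hidden * bottleneck + bottleneck
    up = bottleneck * hidden + hidden
    return down + up


def linear_adapter_params(hidden: int) -> int:
    """Parameters in an assumed element-wise linear adapter y = scale * x + shift."""
    return 2 * hidden


def apply_linear_adapter(x, scale, shift, keep=True):
    """Apply the assumed adapter to one hidden vector.
    A pruned adapter (keep=False) acts as the identity."""
    if not keep:
        return list(x)
    return [s * xi + b for xi, s, b in zip(x, scale, shift)]


def total_adapter_params(per_adapter: int, keep_mask) -> int:
    """Total trainable adapter parameters once pruned adapters are dropped."""
    return per_adapter * sum(1 for kept in keep_mask if kept)


if __name__ == "__main__":
    hidden, layers = 1024, 24            # standard BERT-large sizes
    bottleneck = 64                      # assumed bottleneck width
    keep_mask = [True] * 18 + [False] * 6  # hypothetical: prune 6 of 24 adapters

    print("bottleneck adapter:", bottleneck_adapter_params(hidden, bottleneck))
    print("linear adapter:    ", linear_adapter_params(hidden))
    print("after pruning:     ", total_adapter_params(linear_adapter_params(hidden), keep_mask))
```

Under these assumptions, a single element-wise adapter carries 2,048 parameters against 132,160 for the bottleneck variant, and each pruned adapter removes its parameters from both training and inference. These figures are products of the assumed sizes only and are not meant to reproduce the 25K and 102K reductions reported in the abstract.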

Cite This Study

Mao et al. (2024) studied this question.

synapsesocial.com/papers/68e76bd8b6db6435876e1c3c https://doi.org/10.1109/icaace61206.2024.10549729
