June 16, 2024

Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs

Abstract

This paper addresses the high computational complexity of Large Language Models (LLMs), a significant challenge in both natural language processing (NLP) and multi-modal tasks. We propose Low-Rank Approximation for Sparse Attention (LoRA-Sparse), an approach that reduces this complexity. LoRA-Sparse introduces low-rank linear projection layers for sparse attention approximation and uses an order-mimic training methodology, which is crucial for efficiently approximating the self-attention mechanism in LLMs. We show empirically that sparse attention not only reduces computational demands but also improves model performance in both NLP and multi-modal tasks, which suggests that redundant attention in LLMs may not be beneficial. We validate LoRA-Sparse through empirical studies on both NLP and multi-modal tasks, demonstrating its effectiveness and general applicability. Based on LLaMA and LLaVA models, our method can eliminate more than half of the self-attention computation while performing even better than full-attention baselines.
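The abstract does not spell out the architecture, so the following is only a minimal sketch of one way low-rank projections could drive sparse attention, not the authors' implementation. It assumes that cheap scores computed in a small r-dimensional space are used to pick a few keys per query, and that exact attention is then computed over only those keys. The names `w_q`, `w_k`, and `keep` are hypothetical, the causal mask is an assumption, and random weights stand in for projections that the paper would train with its order-mimic method (not shown here).

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(xs):
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_attention(q, k, v, w_q, w_k, keep):
    """Attend to at most `keep` keys per query.

    q, k, v:  (n, d) matrices as lists of rows.
    w_q, w_k: (d, r) low-rank projections with r much smaller than d
              (hypothetical names; assumed to be trained elsewhere).
    """
    n, d = len(q), len(q[0])
    # Cheap approximate scores, computed in the r-dimensional space.
    approx = matmul(matmul(q, w_q), transpose(matmul(k, w_k)))
    out = []
    for i in range(n):
        # Assumed causal mask: query i may only see keys 0..i.
        visible = range(i + 1)
        # Only the ordering of the approximate scores matters here.
        chosen = sorted(visible, key=lambda j: approx[i][j], reverse=True)[:keep]
        # Exact scaled dot-product scores, but only for the chosen keys.
        scores = [sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d) for j in chosen]
        weights = softmax(scores)
        out.append([sum(w * v[j][c] for w, j in zip(weights, chosen)) for c in range(d)])
    return out


# Usage with random stand-in data (illustrative values only).
random.seed(0)
n, d, r = 6, 8, 2


def rand(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


q, k, v = rand(n, d), rand(n, d), rand(n, d)
w_q, w_k = rand(d, r), rand(d, r)
out = sparse_attention(q, k, v, w_q, w_k, keep=3)
```

In this sketch the exact dot products and the softmax run over `keep` keys instead of all visible keys, which is where a saving of the kind the abstract reports would come from. Because key selection depends only on the ranking of the approximate scores, a training objective that preserves the ordering of the full attention scores would be a natural fit, though the abstract gives no details of the order-mimic procedure.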

Authors

Song Lin

Yukang Chen

Shuai Yang

Institutions

University of Hong Kong

Tencent (China)

Cite this study

Lin et al. (2024) studied this question.

www.synapsesocial.com/papers/69dabcc9a6045d71bfa3e000 — DOI: https://doi.org/10.1109/cvpr52733.2024.01306

Also consider

Synapse has enriched 2 closely related papers on similar questions. Consider them for comparative context:

LLaMA: Open and Efficient Foundation Language Models · 2023 · 3,887 citations
Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic · 2023 · 71 citations
