Mixture of Experts (MoE) has become a key architectural paradigm for efficiently scaling Large Language Models (LLMs) by selectively activating a subset of parameters for each input token. However, standard MoE architectures face significant challenges, including high memory consumption and communication overhead during distributed training. In this paper, we introduce Mixture of Latent Experts (MoLAE), a novel parameterization that addresses these limitations by reformulating expert operations as a shared projection into a lower-dimensional latent space followed by expert-specific transformations. This factorized approach substantially reduces parameter count and computational cost, particularly in existing LLMs whose hidden dimension significantly exceeds the MoE intermediate dimension. We provide a rigorous mathematical framework for transforming pre-trained MoE models into the MoLAE architecture, characterize the conditions for optimal factorization, and develop a systematic two-step algorithm for this conversion. Our theoretical analysis demonstrates that MoLAE significantly improves efficiency across multiple dimensions while preserving model capabilities. Experimental results confirm that MoLAE achieves performance comparable to standard MoE with substantially reduced resource requirements.
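To make the factorization concrete, here is a minimal PyTorch sketch of a MoLAE-style layer, written under stated assumptions rather than as the paper's exact parameterization: the class name, tensor shapes, SiLU activation, and top-k softmax routing are illustrative choices, and the expert-specific step is modeled as a small two-matrix transform applied in the shared latent space.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoLAELayer(nn.Module):
    """Illustrative Mixture-of-Latent-Experts layer (a sketch, not the
    paper's exact design). Tokens are projected once through a shared
    map into an r-dimensional latent space (r < hidden_dim); only the
    latent-space transforms are expert-specific."""

    def __init__(self, hidden_dim, inter_dim, latent_dim, num_experts, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(hidden_dim, num_experts, bias=False)
        # Shared projections between the hidden and latent spaces,
        # amortized across all experts.
        self.down_shared = nn.Linear(hidden_dim, latent_dim, bias=False)
        self.up_shared = nn.Linear(latent_dim, hidden_dim, bias=False)
        # Expert-specific transformations, applied entirely in the
        # latent space (assumed form: two matrices per expert).
        self.expert_in = nn.Parameter(torch.randn(num_experts, latent_dim, inter_dim) * 0.02)
        self.expert_out = nn.Parameter(torch.randn(num_experts, inter_dim, latent_dim) * 0.02)

    def forward(self, x):  # x: (tokens, hidden_dim)
        # Standard top-k softmax routing over experts.
        weights, idx = torch.topk(F.softmax(self.router(x), dim=-1), self.top_k, dim=-1)
        # Project every token into the latent space exactly once.
        z = self.down_shared(x)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(self.expert_in.shape[0]):
                mask = idx[:, k] == e
                if mask.any():
                    # Expert-specific transform in the latent space.
                    h = F.silu(z[mask] @ self.expert_in[e]) @ self.expert_out[e]
                    # Shared up-projection back to the hidden space.
                    out[mask] += weights[mask, k].unsqueeze(-1) * self.up_shared(h)
        return out
```

Under these assumptions, the per-expert parameter cost falls from O(h·m) to O(r·m) (h the hidden dimension, m the intermediate dimension, r the latent dimension), while the two shared h×r projections are paid once for all experts; this is where the savings arise when h significantly exceeds r.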
Zehua Liu
Han Wu
Ruifeng She
DOI: https://doi.org/10.48550/arxiv.2503.23100