November 24, 2025 | Open Access

Towards a flexible and unified architecture for speech enhancement

Abstract

Deploying neural networks across devices with vastly different computational budgets is critical for realizing AI Flow at the network edge. This paper contributes to cooperative family-model systems by proposing a single network that can be dynamically sliced into subnetworks of varying sizes, enabling seamless adaptation to heterogeneous resource constraints across the device-edge-cloud continuum. To scale broadly, we make both the width and the depth of the network flexible. For width scaling, we introduce FlexAttention, which supports a variable number of attention heads so that the computational load can be adjusted adaptively. We also propose FlexRMSNorm, a normalization layer that dynamically adapts to different network widths. Combined with early-exit strategies, these components form a network that scales in both width and depth. Built from these flexible modules, we present SEFlow, a causal and sampling-rate-agnostic model that handles a wide range of speech enhancement tasks, including denoising, dereverberation, declipping, and packet loss concealment. Experimental results demonstrate that SEFlow performs comparably to state-of-the-art task-specific models across multiple speech enhancement tasks. Remarkably, even subnetworks as small as 1% of the full network remain effective in low-resource scenarios. Our demonstrations are available on the project homepage.
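The abstract does not spell out how slicing works, so the following is only a toy sketch of the general idea: a subnetwork uses the first few attention heads (width) and the first few layers (depth), and normalization statistics are computed over the active channels only. All names here (`sliced_rms_norm`, `active_width`, `run_sliced`) and the choice to normalize over the active slice are illustrative assumptions, not the paper's FlexAttention or FlexRMSNorm definitions.

```python
import math

def sliced_rms_norm(x, gain, width, eps=1e-8):
    """RMS-normalize only the first `width` channels of x.

    Assumption (not from the paper): the RMS statistic is taken over
    the active channels only, so a narrow subnetwork is not affected
    by channels it does not use.
    """
    active = x[:width]
    rms = math.sqrt(sum(v * v for v in active) / width + eps)
    return [g * v / rms for g, v in zip(gain[:width], active)]

def active_width(active_heads, head_dim):
    """Channel count when only the first `active_heads` heads are used."""
    return active_heads * head_dim

def run_sliced(x, layer_gains, depth, active_heads, head_dim):
    """Run the first `depth` layers (early exit) at a reduced width.

    Each "layer" is just a gain vector followed by sliced RMS
    normalization; a real model would also apply attention and
    feed-forward blocks at this point.
    """
    width = active_width(active_heads, head_dim)
    h = x[:width]
    for gain in layer_gains[:depth]:
        h = sliced_rms_norm(h, gain, width)
    return h

# Hypothetical full network: 2 heads of dimension 2, three layers.
features = [3.0, 4.0, 100.0, 100.0]
gains = [[1.0] * 4, [2.0] * 4, [1.0] * 4]

# Subnetwork: one head (2 of 4 channels), exit after two layers.
small = run_sliced(features, gains, depth=2, active_heads=1, head_dim=2)
print(small)
```

In this sketch the large values in the unused channels never enter the statistic, which is the property a width-adaptive normalization layer needs; whether the paper achieves it this way is not stated in the abstract.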

Cite This Study

Linfeng Feng studied this question.

synapsesocial.com/papers/69403bab2d562116f290ce87 https://doi.org/10.1007/s44336-025-00022-z
