What question did this study set out to answer?

How can the Mamba architecture's ability to model complex temporal dynamics in multivariate time series forecasting be enhanced?

February 5, 2026 | Open Access

C-T-Mamba: Temporal Convolutional Block for Improving Mamba in Multivariate Time Series Forecasting

Key Points

The study aims to enhance the Mamba architecture's ability to model complex temporal dynamics in multivariate time series forecasting.
The authors developed C-T-Mamba, a hybrid framework that integrates a Mamba block, channel attention, and a temporal convolution block.
They conducted extensive experiments on five public benchmark datasets to evaluate performance.
They analyzed inference scaling in terms of GPU memory usage and latency.
C-T-Mamba achieves average reductions of 4.3–18.5% in Mean Squared Error (MSE) and 3.9–16.2% in Mean Absolute Error (MAE) compared to representative Transformer-based and CNN-based models.
At a horizon of 1536, it achieves an 8.8× reduction in GPU memory and over 10× speedup in inference time compared to standard Transformers.
At 2048 steps, its latency remains as low as 8.9 ms, demonstrating superior linear scaling.
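
The MSE and MAE reductions quoted above are relative improvements over a baseline's error. A minimal sketch of how such figures could be computed is shown below; the function names and the toy values are invented for illustration and are not taken from the paper.

```python
def mse(y_true, y_pred):
    """Mean squared error over paired target and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error over paired target and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Toy numbers, invented for illustration only (not results from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
baseline_pred = [1.5, 2.5, 2.5, 4.5]      # every prediction off by 0.5
model_pred = [1.25, 2.25, 2.75, 4.25]     # every prediction off by 0.25

mse_reduction = relative_reduction(mse(y_true, baseline_pred), mse(y_true, model_pred))
mae_reduction = relative_reduction(mae(y_true, baseline_pred), mae(y_true, model_pred))
print(f"MSE reduction: {mse_reduction:.1f}%, MAE reduction: {mae_reduction:.1f}%")
```

Because MSE squares each error, halving every error in this toy case cuts MSE by more than it cuts MAE, which is one reason the two metrics are usually reported side by side.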

Abstract

In recent years, Transformer-based methods have demonstrated proficiency in capturing complex patterns for time series forecasting. However, their quadratic complexity relative to input sequence length poses a significant bottleneck for scalability and real-world deployment. Recently, the Mamba architecture has emerged as a compelling alternative by mitigating the prohibitive computational overhead and latency inherent in Transformers. Nevertheless, a vanilla Mamba backbone often struggles to adequately characterize intricate temporal dynamics, particularly long-term trend shifts and non-stationary behaviors. To bridge the gap between Mamba’s global scanning and local dependency modeling, we propose C-T-Mamba, a hybrid framework that synergistically integrates a Mamba block, channel attention, and a temporal convolution block. Specifically, the Mamba block is leveraged to capture long-range temporal dependencies with linear scaling, the channel attention mechanism filters redundant information, and the temporal convolution block extracts multi-scale local and global features. Extensive experiments on five public benchmarks demonstrate that C-T-Mamba consistently outperforms state-of-the-art (SOTA) baselines (e.g., PatchTST and iTransformer), achieving average reductions of 4.3–18.5% in MSE and 3.9–16.2% in MAE compared to representative Transformer-based and CNN-based models. Inference scaling analysis reveals that C-T-Mamba effectively breaks the computational bottleneck; at a horizon of 1536, it achieves an 8.8× reduction in GPU memory and over 10× speedup compared to standard Transformers. At 2048 steps, its latency remains as low as 8.9 ms, demonstrating superior linear scaling. These results underscore that C-T-Mamba achieves SOTA accuracy while maintaining a minimal computational footprint, making it highly effective for long-term multivariate time series forecasting.
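
The three components named in the abstract can be pictured with a heavily simplified, dependency-free sketch. Everything below is an assumption made for illustration: the fixed-coefficient recurrence stands in for Mamba's learned, input-dependent state-space scan, the mean-based sigmoid gate stands in for learned channel attention, the averaging kernels stand in for learned temporal convolutions, and the ordering of the stages is a guess. It is not the authors' implementation.

```python
import math


def linear_scan(x, decay=0.9, gain=0.1):
    """Recurrence h_t = decay * h_{t-1} + gain * x_t, computed in one O(L) pass.

    A fixed-coefficient stand-in (assumption) for a selective state-space scan.
    """
    h, out = 0.0, []
    for value in x:
        h = decay * h + gain * value
        out.append(h)
    return out


def channel_attention(channels):
    """Scale each channel by the sigmoid of its mean value.

    A parameter-free stand-in (assumption) for learned channel gating.
    """
    gated = []
    for series in channels:
        mean = sum(series) / len(series)
        weight = 1.0 / (1.0 + math.exp(-mean))
        gated.append([weight * v for v in series])
    return gated


def causal_conv(x, kernel):
    """Causal 1D filter with zero left-padding; output length equals input length.

    kernel[-1] multiplies the current step, kernel[0] the oldest step in the window.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)
    return [sum(kernel[j] * padded[t + j] for j in range(k)) for t in range(len(x))]


def multi_scale_conv(x, kernel_sizes=(2, 4)):
    """Average the outputs of causal moving-average filters of several widths."""
    outputs = [causal_conv(x, [1.0 / k] * k) for k in kernel_sizes]
    return [sum(vals) / len(outputs) for vals in zip(*outputs)]


def forward(channels):
    """Apply scan, channel gating, then multi-scale convolution (assumed order)."""
    scanned = [linear_scan(series) for series in channels]
    gated = channel_attention(scanned)
    return [multi_scale_conv(series) for series in gated]


# Two toy channels of length 8: a constant signal and an all-zero signal.
result = forward([[1.0] * 8, [0.0] * 8])
print(len(result), len(result[0]))
```

Each stage touches every time step a bounded number of times, so the total work grows linearly with sequence length, in contrast to the pairwise comparisons of self-attention, which grow quadratically.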

Cite This Study

Liu et al. studied this question.

synapsesocial.com/papers/698436a5f1d9ada3c1fb5bc8 https://doi.org/10.3390/electronics15030657
