What question did this study set out to answer?

Do the sLSTM and mLSTM units forecast financial time series more effectively than the full xLSTM architecture and the standard LSTM?

April 15, 2026 | Open Access

Beyond xLSTM: A Comparative Study of sLSTM and mLSTM for Short-Term Financial Forecasting

Key Points

Evaluated the effectiveness of sLSTM and mLSTM against xLSTM and the standard LSTM in financial forecasting.
Tested three LSTM variants: sLSTM, mLSTM, and xLSTM.
Conducted experiments across six financial datasets, including U.S. equities and Chinese A-shares.
Evaluated performance across four historical window lengths, with particular attention to the 10-day window.
sLSTM and mLSTM showed better forecasting performance than standard LSTM and TimesNet.
Both units also consistently outperformed the more structurally complex xLSTM.
The advantage of sLSTM and mLSTM remained robust across asset types, indicators, and window lengths.

Abstract

In financial forecasting, the complexity–accuracy paradigm (the assumption that more complex models yield superior performance) is frequently challenged by market noise and non-stationarity. This study tests this paradigm by evaluating advanced LSTM variants: the core Long Short-Term Memory (LSTM) unit (sLSTM), the matrix LSTM unit (mLSTM), and the extended LSTM architecture (xLSTM), which integrates these units into stacked residual blocks. We systematically benchmark these variants against the standard LSTM and an advanced benchmark model, TimesNet. Extensive experiments span six diverse financial datasets (comprising mature U.S. equities, a macro index, and high-volatility Chinese A-shares) and four historical window lengths. Results demonstrate that the core sLSTM and mLSTM units consistently deliver superior forecasting performance. Crucially, sLSTM and mLSTM each outperform the standard LSTM and TimesNet benchmarks and also surpass the more structurally complex xLSTM configuration. This advantage remains robust across asset types, indicators, and window lengths, and is most pronounced at the 10-day window. The study thus provides strong counterevidence to the complexity–accuracy paradigm in this field and suggests a data-driven direction for practical trading systems: prioritizing efficient, high-performance core model innovations over generalized architectural complexity.
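
To make the notion of a historical window concrete, the following is a minimal sketch of how a price series might be cut into fixed-length input windows with a next-step target. It is an illustration only, not the authors' preprocessing: the function name `make_windows`, the one-step-ahead target, and the synthetic series are all assumptions, and only the 10-day window length is taken from the abstract.

```python
import numpy as np


def make_windows(series, window=10):
    """Split a 1-D series into (inputs, targets) for one-step-ahead forecasting.

    Hypothetical helper for illustration; the one-step horizon is an assumption.
    """
    series = np.asarray(series, dtype=float)
    if series.ndim != 1 or len(series) <= window:
        raise ValueError("need a 1-D series longer than the window")
    # Each row of X holds `window` consecutive observations; the final
    # window is dropped because no observation follows it.
    X = np.lib.stride_tricks.sliding_window_view(series, window)[:-1]
    # Each target is the observation immediately after its window.
    y = series[window:]
    return X, y


# Synthetic placeholder values, not data from the study.
prices = np.arange(15.0)
X, y = make_windows(prices, window=10)
```

Each row of `X` could then be fed to any of the compared models as one input sequence, with the matching entry of `y` as the value to predict. Changing `window` would produce the other window lengths, whose values are not listed here.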
