We investigate whether sophisticated deep learning architectures justify their computational cost for short-term cryptocurrency price forecasting. Our study evaluates a 2.1M-parameter wavelet-enhanced transformer (M denotes millions, so 2.1M = 2,100,000 parameters) that decomposes the Fear and Greed Index (FGI) into multiple timescales before integrating these signals with technical indicators. All RMSE values are reported in USD. Using Diebold–Mariano (DM) tests with HAC-corrected variance, we find that none of the evaluated models (our wavelet–transformer, ARIMA, XGBoost, LSTM, and a vanilla Transformer) significantly outperforms the O(1) naive persistence baseline at the 1-day horizon (DM statistic = +19.13, p < 0.001, favoring the naive baseline). Our model achieves an RMSE of USD 2005 versus USD 1986 for the naive baseline (ratio 1.010), while requiring 3909× the inference time (2.43 ms vs. 0.0006 ms) for statistically worse performance. These results provide strong empirical support for the Efficient Market Hypothesis in cryptocurrency markets: even sophisticated multi-scale architectures combining wavelet decomposition, cross-attention, and auxiliary technical indicators cannot extract profitable short-term signals. Through systematic ablation, we identify positional encoding as the only critical architectural component: its removal causes a 30% RMSE degradation. Our findings carry three implications: (1) short-term crypto forecasting faces fundamental predictability limits, (2) architectural complexity provides negative ROI in efficient markets, and (3) rigorous statistical validation reveals that apparent improvements often represent noise rather than signal.
Jay et al. studied this question.
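To make the statistical comparison concrete, the following is a minimal sketch of a Diebold–Mariano test with a Newey–West (Bartlett-kernel) HAC variance, which is one common way to apply the HAC correction. It is not the study's own code: the function name `diebold_mariano`, the squared-error loss, the default truncation lag of `h - 1`, the normal approximation for the p-value, and the synthetic data are all assumptions made for illustration. The sign convention is chosen so that a positive statistic means the naive baseline has the lower loss, matching the "naive preferred" reading of a positive DM statistic.

```python
import math
import numpy as np


def diebold_mariano(e_model, e_naive, h=1, lag=None):
    """DM test on squared-error loss (illustrative sketch, not the paper's code).

    e_model, e_naive: forecast errors of the candidate model and the naive baseline.
    h: forecast horizon; the HAC truncation lag defaults to h - 1 (an assumption).
    Returns (statistic, two-sided p-value under a normal approximation).
    A positive statistic means the naive baseline has the lower average loss.
    """
    e_model = np.asarray(e_model, dtype=float)
    e_naive = np.asarray(e_naive, dtype=float)
    d = e_model ** 2 - e_naive ** 2      # loss differential per time step
    n = d.size
    if lag is None:
        lag = h - 1
    d_bar = d.mean()
    dc = d - d_bar                       # centered differential

    # Newey-West long-run variance with Bartlett weights
    lrv = np.dot(dc, dc) / n
    for k in range(1, lag + 1):
        gamma_k = np.dot(dc[k:], dc[:-k]) / n
        lrv += 2.0 * (1.0 - k / (lag + 1)) * gamma_k

    stat = d_bar / math.sqrt(lrv / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|stat|))
    return stat, p_value


# Synthetic errors only; these are not the study's data.
rng = np.random.default_rng(0)
naive_errors = rng.normal(0.0, 1.0, size=500)
model_errors = rng.normal(0.0, 2.0, size=500)
stat, p_value = diebold_mariano(model_errors, naive_errors, h=1)
print(f"DM statistic = {stat:+.2f}, p = {p_value:.3g}")
```

For multi-step horizons, a larger `lag` lets the variance estimate account for the serial correlation that overlapping forecast errors induce in the loss differential; at the 1-day horizon the default reduces to the plain sample variance.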