We propose Torq, a novel geometric modification to Stochastic Gradient Descent (SGD) with momentum. Standard optimizers often struggle with high-frequency noise and chaotic oscillations in high-dimensional, fractal loss landscapes. Torq stabilizes the optimization path by decoupling local exploration from global trajectory guidance. By rotating the momentum vector towards a stable, low-frequency gradient trend while preserving its kinetic energy, Torq filters out stochastic noise and prevents overshooting. Empirical results show that Torq effectively navigates non-linear landscapes where standard momentum fails to capture long-term trends, significantly improving generalization.
Mikhail Gorokhov (Tue,) studied this question.
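The abstract describes the rotation only in words, so the following is a minimal sketch of one way such a step could look, not the author's actual update rule. It assumes the "low-frequency gradient trend" is a slow exponential moving average of the gradient, and that "preserving kinetic energy" means keeping the norm of the momentum vector fixed during the rotation. The function names `rotate_towards` and `torq_step`, and the hyperparameters `trend_beta` and `alpha`, are hypothetical and chosen here for illustration.

```python
import numpy as np


def rotate_towards(v, target, alpha):
    """Rotate v toward the direction of target by a fraction alpha of the
    angle between them, keeping the norm of v unchanged."""
    v = np.asarray(v, dtype=float)
    target = np.asarray(target, dtype=float)
    v_norm = np.linalg.norm(v)
    t_norm = np.linalg.norm(target)
    if v_norm == 0.0 or t_norm == 0.0:
        return v.copy()  # nothing to rotate, or no direction to rotate toward

    u = v / v_norm                     # unit vector along the momentum
    t = target / t_norm                # unit vector along the trend
    cos_theta = np.clip(np.dot(u, t), -1.0, 1.0)
    theta = np.arccos(cos_theta)       # angle between momentum and trend

    # Unit vector orthogonal to u, lying in the plane spanned by u and t.
    w = t - cos_theta * u
    w_norm = np.linalg.norm(w)
    if w_norm < 1e-12:
        return v.copy()  # parallel or anti-parallel: rotation plane undefined
    w = w / w_norm

    phi = alpha * theta                # partial rotation angle
    return v_norm * (np.cos(phi) * u + np.sin(phi) * w)


def torq_step(params, grad, velocity, trend,
              lr=0.01, beta=0.9, trend_beta=0.99, alpha=0.1):
    """One illustrative update (all hyperparameters are assumptions):
    heavy-ball momentum, followed by a norm-preserving rotation of the
    momentum toward a slow moving average of the gradient."""
    velocity = beta * velocity + grad                       # SGD with momentum
    trend = trend_beta * trend + (1.0 - trend_beta) * grad  # low-frequency trend (assumed EMA)
    velocity = rotate_towards(velocity, trend, alpha)       # direction changes, norm does not
    params = params - lr * velocity
    return params, velocity, trend
```

With `alpha=0` the sketch reduces to ordinary SGD with momentum, and with `alpha=1` the momentum is fully aligned with the trend direction at every step. Intermediate values trade responsiveness to the current gradient against adherence to the long-term direction.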