Seismic ground-motion simulations provide high-fidelity predictions but are computationally prohibitive for large-scale scenario analyses. Surrogate models based on Multi-Layer Perceptrons (MLPs) or Fourier Neural Operators (FNOs) have been studied, yet each has limitations: MLPs fail to capture spatial correlations, while FNOs incur high costs from repeated Fourier transforms on full-resolution grids. To overcome these issues, we propose a surrogate model based on the MLP-Mixer architecture that operates on a patch grid, enabling efficient extraction of global spatial correlations. In addition, we introduce a multi-stream design with source and geology inputs fused through a learnable element-wise multi-modal mixer, allowing period-dependent, data-driven fusion of modalities. Experiments on Nankai Trough simulations demonstrate that the proposed method, referred to as Multi-MLP-Mixer, achieves accuracy comparable to state-of-the-art surrogate models while reducing training and inference time, thereby balancing predictive performance with computational efficiency.
Hachiya et al. studied this question.
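The architecture summarized in the abstract can be illustrated with a minimal NumPy sketch of the forward pass. It is not the authors' implementation: the abstract does not specify layer sizes, the patch size, or the exact form of the multi-modal mixer, so every name here (`patchify`, `mixer_block`, `fuse`, `init_block`) and every dimension is hypothetical. In particular, the fusion is assumed to be a sigmoid gate with one learnable logit per period, patch, and channel, which is only one plausible reading of "learnable element-wise" fusion. Weights are random and no training is shown.

```python
import numpy as np

def patchify(grid, patch):
    """Split an (H, W) grid into non-overlapping patches -> (n_patches, patch*patch)."""
    h, w = grid.shape
    assert h % patch == 0 and w % patch == 0, "grid must tile evenly"
    g = grid.reshape(h // patch, patch, w // patch, patch).transpose(0, 2, 1, 3)
    return g.reshape((h // patch) * (w // patch), patch * patch)

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, eps=1e-6):
    # normalization over the last axis, without learnable scale/shift (simplification)
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def mlp(x, w1, w2):
    return gelu(x @ w1) @ w2

def init_block(rng, n_patches, n_channels, hidden, scale=0.02):
    # hypothetical parameter layout for one mixer block
    return {
        "tok_w1": rng.normal(0.0, scale, (n_patches, hidden)),
        "tok_w2": rng.normal(0.0, scale, (hidden, n_patches)),
        "ch_w1": rng.normal(0.0, scale, (n_channels, hidden)),
        "ch_w2": rng.normal(0.0, scale, (hidden, n_channels)),
    }

def mixer_block(x, p):
    """One MLP-Mixer block on x of shape (n_patches, n_channels)."""
    # token mixing: an MLP applied across the patch axis, which lets every
    # patch exchange information with every other patch (global spatial mixing)
    x = x + mlp(layer_norm(x).T, p["tok_w1"], p["tok_w2"]).T
    # channel mixing: an MLP applied independently to each patch
    x = x + mlp(layer_norm(x), p["ch_w1"], p["ch_w2"])
    return x

def fuse(source_feat, geology_feat, gate_logits):
    """ASSUMED fusion: element-wise sigmoid gate, one logit per (period, patch, channel)."""
    g = 1.0 / (1.0 + np.exp(-gate_logits))
    return g * source_feat[None] + (1.0 - g) * geology_feat[None]

# Toy example with made-up sizes: 16x16 grids, 4x4 patches, 8 channels, 3 periods.
rng = np.random.default_rng(0)
patch, channels, hidden, n_periods = 4, 8, 32, 3
source_grid = rng.normal(size=(16, 16))    # stand-in for a source-parameter map
geology_grid = rng.normal(size=(16, 16))   # stand-in for a subsurface-structure map

src = patchify(source_grid, patch)         # (16, 16): 16 patches, 16 values each
geo = patchify(geology_grid, patch)
n_patches = src.shape[0]

# one linear patch embedding and one mixer block per stream
embed_src = rng.normal(0.0, 0.02, (patch * patch, channels))
embed_geo = rng.normal(0.0, 0.02, (patch * patch, channels))
src_feat = mixer_block(src @ embed_src, init_block(rng, n_patches, channels, hidden))
geo_feat = mixer_block(geo @ embed_geo, init_block(rng, n_patches, channels, hidden))

# zero logits give a gate of 0.5 everywhere, i.e. a plain average of the streams
gate_logits = np.zeros((n_periods, n_patches, channels))
fused = fuse(src_feat, geo_feat, gate_logits)   # (n_periods, n_patches, channels)
```

The cost argument in the abstract is visible in the shapes: token mixing acts on the patch axis (16 tokens here) rather than on the full 16×16 grid, so the global mixing step scales with the number of patches and not the number of grid points. In a trained model the gate logits would be learned, letting each period weight the source and geology streams differently.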