What type of study is this?

This is a systematic review.

What question did this study set out to answer?

This survey aims to categorize multimodal models used for financial forecasting and identify their trends and challenges.

February 26, 2026 | Open Access

Learning Across Modalities: A Systematic Survey of Multimodal Models for Financial Analysis

Key Points

This survey aims to categorize multimodal models used for financial forecasting and identify their trends and challenges.
Conducted a systematic review of 35 papers from 2018 to 2025.
Developed a unified taxonomy based on input modalities, modeling architectures, fusion strategies, and tasks.
Analyzed data integration challenges such as temporal misalignment and modality imbalance.
Identified persistent challenges including noisy data and limited cross-market generalization.
Highlighted trends such as adaptive fusion techniques and the use of large language models.
Outlined how architectural and fusion design choices affect interpretability and deployability.

Abstract

• Taxonomy for multimodal forecasting by modality, model, fusion type, and task.
• Review of 35 papers (2018–2025) analyzed through the proposed unified taxonomy.
• Challenges: misalignment, modality imbalance, noisy inputs and generalization.
• Trends: adaptive fusion, missing modality learning, LLMs, temporal GNN models.

Multimodal learning has recently emerged as a powerful paradigm for financial forecasting, enabling the integration of heterogeneous data sources such as market time series, textual news, and relational graphs. This survey presents a unified taxonomy for multimodal financial forecasting models, structured along four key dimensions: input modalities, modelling architectures, fusion strategies, and predictive tasks. Using this taxonomy, we conduct a systematic review of 35 representative works published between 2018 and 2025, highlighting methodological trends, design choices, and performance patterns. Our analysis identifies persistent challenges, including temporal misalignment, modality imbalance, missing or noisy data, and limited cross-market generalization. We also discuss emerging trends and promising research directions, such as adaptive fusion, incomplete modality learning, and the integration of large language models and temporal graph neural networks, and analyse how architectural and fusion design choices impact practical considerations such as interpretability and deployability, aiming to bridge methodological innovation with domain-specific requirements.
