March 3, 2026 | Open Access

Longitudinal Forecasting of Retinal Structure and Function Using a Multimodal StyleGAN-Based Architecture

Key Points

Forecasts retinal morphology with a structural similarity index measure (SSIM) of 0.9264 and a peak signal-to-noise ratio (PSNR) of 38.1 dB.
Utilizes a multimodal GAN model with super-resolution modules and temporal attention mechanisms for accurate OCT image synthesis.
Employs a hybrid deep-shallow long short-term memory (LSTM) model to predict best-corrected visual acuity (BCVA) values, and an EfficientNet-based classifier to predict 16 retinal biomarkers from the synthesized scans.
Highlights the importance of effective forecasting in guiding personalized treatment strategies and optimizing clinical decision-making.
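
For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of how the metric is conventionally computed. It is not the study's evaluation code: the function name psnr, the flat-list image representation, and the assumed intensity range of [0, 1] (max_value=1.0) are illustrative assumptions not stated in the source.

```python
import math


def psnr(reference, forecast, max_value=1.0):
    """Peak signal-to-noise ratio (dB) between two equally sized images.

    Both images are flat sequences of pixel intensities. max_value is the
    largest possible intensity; 1.0 is an assumption, not taken from the study.
    """
    if len(reference) != len(forecast) or not reference:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - f) ** 2 for r, f in zip(reference, forecast)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] scale gives MSE = 0.01, i.e. 20 dB.
example = psnr([0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1])
```

Under this definition, a PSNR of 38.1 dB corresponds to a root-mean-square pixel error of roughly 1.2% of the intensity range, whatever that range is.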

Abstract

Generative adversarial networks (GANs) have emerged as powerful tools for medical image synthesis and clinical outcome prediction. In ophthalmology, accurate forecasting of optical coherence tomography (OCT) images and best-corrected visual acuity (BCVA) values can significantly enhance patient monitoring and personalized treatment planning. We introduce a multimodal GAN inspired by the StyleGAN architecture, featuring super-resolution modules, a multi-scale patch discriminator, and temporal attention mechanisms. To predict BCVA, expressed as logMAR values, a hybrid deep-shallow long short-term memory (LSTM) model was trained jointly with the image pipeline. Synthesized scans were passed through an EfficientNet-based classifier to predict 16 retinal biomarkers. To ensure subject independence, we employed 3-fold patient-level cross-validation. The proposed multimodal GAN achieved a structural similarity index measure (SSIM) of 0.9264, a Fréchet inception distance (FID) of 11.9, and a peak signal-to-noise ratio (PSNR) of 38.1 dB for OCT forecasting. The logMAR module achieved a mean absolute error (MAE) of 0.052, and the biomarker classifier attained a macro-F1 score of 0.81. Based on the forecast change in logMAR, patients were further categorized into Winner, Stabilizer, and Loser outcome groups using a threshold of Δ = 0.05, with an overall F1 score of 0.84. Our approach effectively forecasts retinal morphology and functional outcomes, providing predictive insights for proactive clinical decision-making in retinal health management.
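
The outcome grouping described in the abstract can be illustrated with a short sketch. The abstract gives only the group names and the threshold Δ = 0.05; the rest is assumed here for illustration: the function name outcome_group is hypothetical, the change is taken as forecast minus baseline, a decrease in logMAR is treated as improvement (lower logMAR means better acuity), and changes exactly at the threshold are counted as Winner or Loser rather than Stabilizer.

```python
def outcome_group(baseline_logmar, forecast_logmar, delta=0.05):
    """Assign an outcome group from the forecast change in logMAR.

    Assumptions (not specified in the source): change is forecast minus
    baseline, and the threshold is inclusive on both sides.
    """
    change = forecast_logmar - baseline_logmar
    if change <= -delta:
        return "Winner"      # logMAR fell by at least delta: acuity improved
    if change >= delta:
        return "Loser"       # logMAR rose by at least delta: acuity worsened
    return "Stabilizer"      # change stayed within the threshold band


# Three hypothetical patients: (baseline logMAR, forecast logMAR).
patients = [(0.40, 0.30), (0.30, 0.32), (0.20, 0.40)]
groups = [outcome_group(b, f) for b, f in patients]
```

With these hypothetical values the three patients fall into the Winner, Stabilizer, and Loser groups respectively. A different sign convention or boundary rule would shift cases near ±0.05, so the exact definition should be checked against the full paper.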
