Generative Adversarial Networks (GANs) have emerged as powerful tools for medical image synthesis and clinical outcome prediction. In ophthalmology, accurate forecasting of Optical Coherence Tomography (OCT) images and best-corrected visual acuity (BCVA) values can significantly enhance patient monitoring and personalized treatment planning. We introduce a multimodal GAN inspired by the StyleGAN architecture, featuring super-resolution modules, a multi-scale patch discriminator, and temporal attention mechanisms. To predict BCVA as logMAR values, we jointly trained a hybrid deep-shallow LSTM model alongside the image pipeline. Synthesized scans were passed through an EfficientNet-based classifier to predict 16 retinal biomarkers. To ensure subject independence, we used 3-fold patient-level cross-validation. The proposed multimodal GAN achieved an SSIM of 0.9264, an FID of 11.9, and a PSNR of 38.1 dB for OCT forecasting. The logMAR module achieved an MAE of 0.052, and the biomarker classifier attained a macro-F1 score of 0.81. Based on the forecast change in logMAR, patients were further categorized into Winner, Stabilizer, and Loser outcome groups using a threshold of Δ=0.05; this categorization achieved an overall F1 score of 0.84. These results indicate that the approach can forecast both retinal morphology and functional outcomes, supporting proactive clinical decision-making in retinal health management.
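The patient-level cross-validation mentioned above can be illustrated with a minimal sketch. The abstract does not describe how folds were built, so the function name `patient_level_folds`, the greedy balancing rule, and the example patient IDs are all assumptions made for illustration. The only property taken from the text is that every scan from a given patient lands in the same fold, so no patient appears in both training and test data.

```python
from collections import defaultdict


def patient_level_folds(patient_ids, n_folds=3):
    """Split sample indices into folds without splitting any patient.

    Hypothetical helper: the balancing strategy below is an assumption,
    not the procedure used in the paper.
    """
    # Group sample indices by the patient they belong to.
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)

    folds = [[] for _ in range(n_folds)]
    # Greedy balancing: place patients with the most samples first,
    # each into the fold that currently holds the fewest samples.
    ordered = sorted(by_patient, key=lambda p: (-len(by_patient[p]), str(p)))
    for pid in ordered:
        smallest = min(range(n_folds), key=lambda k: len(folds[k]))
        folds[smallest].extend(by_patient[pid])
    return folds


if __name__ == "__main__":
    # Illustrative IDs only: one entry per scan.
    ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p4"]
    for k, test_idx in enumerate(patient_level_folds(ids)):
        print(f"fold {k}: test indices {sorted(test_idx)}")
```

Each fold serves once as the held-out test set while the remaining folds form the training set. A library routine such as scikit-learn's `GroupKFold` offers the same grouping guarantee and would be a reasonable substitute.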
Sampathkumar et al. studied this question.
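The outcome grouping described in the abstract can be sketched as a simple thresholding rule on the forecast change in logMAR. Only the three group names and the threshold Δ=0.05 come from the text. The function name `categorize_outcome`, the mapping of Winner to a decrease in logMAR (lower logMAR means better acuity), and the choice to treat changes of exactly ±Δ as Stabilizer are assumptions for illustration.

```python
from typing import Literal

Outcome = Literal["Winner", "Stabilizer", "Loser"]


def categorize_outcome(
    baseline_logmar: float,
    forecast_logmar: float,
    delta: float = 0.05,
) -> Outcome:
    """Assign an outcome group from the forecast change in logMAR.

    Assumed convention (not stated in the abstract): a drop in logMAR
    larger than delta is an improvement (Winner), a rise larger than
    delta is a deterioration (Loser), anything else is Stabilizer.
    """
    change = forecast_logmar - baseline_logmar
    if change < -delta:
        return "Winner"
    if change > delta:
        return "Loser"
    return "Stabilizer"


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(categorize_outcome(0.50, 0.30))
    print(categorize_outcome(0.30, 0.32))
    print(categorize_outcome(0.20, 0.40))
```

Because the reported logMAR MAE (0.052) is close to the threshold, predictions near ±Δ are the ones most likely to be assigned to the wrong group, which is worth keeping in mind when reading the group-level F1 score.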