What question did this study set out to answer?

The central aim is to improve diagnostic efficiency for rare eye diseases using a generative model.

March 26, 2026 | Open Access

Boosting foundation models for rare eye disease diagnosis via a multimodal text-to-image generative framework

Key Points

Introduced EyeDiff, a generative foundation model for ophthalmic image synthesis.
Utilized textual descriptions to generate lesion-preserving images.
Augmented minority classes across 11 datasets to reduce data imbalance.
Evaluated model performance using objective metrics and expert human evaluations.
EyeDiff produced high-fidelity images representing diverse retinal conditions.
Augmenting training data with EyeDiff images consistently improved diagnostic accuracy for both rare and common eye diseases.
The accuracy gains held across modality-specific, multimodal, and vision-language foundation models.

Abstract

The rising prevalence of vision-threatening retinal diseases poses a significant burden on global healthcare systems. Though deep learning (DL) techniques offer promising avenues for improving diagnostic efficiency, data scarcity and imbalance issues persist in training robust diagnostic models, particularly for rare eye diseases. Here, we introduce EyeDiff, a generative foundation model capable of synthesizing lesion-preserving ophthalmic images from textual descriptions. Both objective metrics and expert human evaluations confirmed EyeDiff's ability to generate high-fidelity images across multiple imaging modalities, accurately reflecting textual descriptions of diverse retinal diseases and lesion types. By augmenting minority classes across 11 globally sourced datasets, EyeDiff consistently boosted the diagnostic accuracy for both common and rare eye diseases across different foundation model types, including modality-specific, multimodal and vision-language foundation models trained solely on real data. These results underscore EyeDiff's potential as a general-purpose text-to-image foundation model, offering a scalable and flexible approach to generating balanced, disease-relevant data for advancing retinal disease diagnosis.

Cite This Study

Chen et al. studied this question.

synapsesocial.com/papers/69c4cc69fdc3bde4489179ac https://doi.org/10.1038/s41746-026-02560-2
