Background/Objectives: Early and accurate diagnosis of chest diseases is a critical challenge in clinical practice, particularly when multiple pathologies coexist. Deep learning-based medical image analysis has shown promising results, but most existing studies rely on unimodal data and fixed-scale datasets, which limits their generalizability and clinical relevance. In this study, we present a comparative analysis of unimodal and multimodal deep learning models for multi-label chest disease classification using chest X-ray images and associated clinical metadata.

Methods: Twelve models were developed from three widely used convolutional neural network architectures (ResNet50, EfficientNetB3, and DenseNet121), each in a unimodal (image-only) and a multimodal (image + clinical data) configuration. To systematically investigate the impact of data scale, experiments were conducted on two versions of the data: the Random Sample of NIH Chest X-ray Dataset and the NIH Chest X-ray Dataset, containing 5606 and 121,120 samples, respectively. Model performance was evaluated using label-based Area Under the Receiver Operating Characteristic Curve (AUROC) metrics.

Results: Multimodal fusion consistently outperforms unimodal approaches across all architectures and data scales, with more pronounced improvements in the large-scale setting. Furthermore, increasing data volume improves generalization and reduces performance variance, particularly for rare pathologies.

Conclusions: These findings highlight the effectiveness of multimodal, multi-label learning in enhancing diagnostic accuracy and support the development of robust clinical decision support systems for chest disease assessment.
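As an illustration of the label-based AUROC evaluation named in the Methods, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names `binary_auroc` and `label_based_auroc` are hypothetical, and macro averaging over labels is an assumption, since the abstract does not state how per-label scores are aggregated. Each label's AUROC is computed as the fraction of positive/negative pairs that the model ranks correctly (the Mann-Whitney formulation), with ties counted as half.

```python
from typing import List, Optional, Sequence, Tuple


def binary_auroc(y_true: Sequence[int], y_score: Sequence[float]) -> Optional[float]:
    """AUROC for a single label via pairwise comparison (Mann-Whitney U).

    Returns None when the label has no positives or no negatives,
    because the ROC curve is undefined in that case.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        return None
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def label_based_auroc(
    y_true: Sequence[Sequence[int]],
    y_score: Sequence[Sequence[float]],
) -> Tuple[List[Optional[float]], Optional[float]]:
    """Per-label AUROC plus the macro average (an assumed aggregation).

    y_true and y_score are (n_samples x n_labels) nested sequences.
    Labels with an undefined AUROC are excluded from the average.
    """
    n_labels = len(y_true[0])
    per_label: List[Optional[float]] = []
    for j in range(n_labels):
        col_true = [row[j] for row in y_true]
        col_score = [row[j] for row in y_score]
        per_label.append(binary_auroc(col_true, col_score))
    defined = [a for a in per_label if a is not None]
    macro = sum(defined) / len(defined) if defined else None
    return per_label, macro


# Toy example: 4 samples, 2 labels (values are illustrative, not study data).
toy_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_score = [[0.9, 0.2], [0.1, 0.8], [0.8, 0.3], [0.3, 0.4]]
toy_per_label, toy_macro = label_based_auroc(toy_true, toy_score)
print(toy_per_label, toy_macro)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a small illustration; at the scale of the full dataset a rank-based implementation from an established library would be the more practical choice.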
Diğdem Orhan
Fırat University
Murat Uçan
Dicle University
R. Alhajj
University of Calgary
Orhan et al. studied this question.
synapsesocial.com/papers/69a7cc8ed48f933b5eed836b — DOI: https://doi.org/10.3390/diagnostics16050734