The growing volume of medical imaging data in modern healthcare systems creates significant challenges for efficient, accurate, and scalable diagnostic decision support. Chest X-ray (CXR) imaging, one of the most widely used radiological modalities, calls for analytical frameworks that can integrate heterogeneous visual patterns while remaining computationally efficient and interpretable. In this study, we present an informatics-driven hybrid deep learning framework, termed VD-MHA Net, designed for scalable pulmonary disease classification from CXR images. The proposed architecture combines a Vision Transformer (ViT) branch for global contextual representation with a DenseNet121 backbone for localized structural feature extraction. To combine these heterogeneous representations, a lightweight multi-head attention-based fusion module adaptively prioritizes features within a unified decision-making pipeline. The framework was evaluated on a curated multi-source dataset of chest X-ray images spanning four diagnostic categories: normal, COVID-19, pneumonia, and lung opacity. Experimental results demonstrate that VD-MHA Net achieves robust diagnostic performance in terms of mean AUC, balanced accuracy, F1-score, and Matthews correlation coefficient (MCC), while exhibiting stable convergence and computational scalability. Furthermore, SHAP-based explainability analysis indicates that the proposed system focuses on clinically relevant pulmonary regions, enhancing transparency and trust in clinical decision-support environments. Overall, this study highlights the potential of hybrid attention-driven architectures as practical medical informatics solutions for large-scale radiographic analysis and supports their integration into data-driven clinical workflows.
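For readers less familiar with the label-based metrics named in the abstract, the following is a minimal sketch of how balanced accuracy, F1-score, and MCC can be computed for a four-class problem. It is illustrative only and not the evaluation code of the study: the integer label encoding (0 to 3), the function names, the choice of macro averaging for F1, and the toy labels are all assumptions. AUC is omitted because it requires predicted scores rather than hard labels.

```python
from math import sqrt


def confusion_counts(y_true, y_pred, n_classes):
    """Build a confusion matrix: rows are true labels, columns are predictions."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def balanced_accuracy(cm):
    """Mean per-class recall, skipping classes that have no true samples."""
    recalls = [row[k] / sum(row) for k, row in enumerate(cm) if sum(row) > 0]
    return sum(recalls) / len(recalls)


def macro_f1(cm):
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp
        fp = sum(cm[r][k] for r in range(n)) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


def multiclass_mcc(cm):
    """Multi-class Matthews correlation coefficient computed from the confusion matrix."""
    n = len(cm)
    s = sum(sum(row) for row in cm)                         # total samples
    c = sum(cm[k][k] for k in range(n))                     # correctly classified samples
    t = [sum(cm[k]) for k in range(n)]                      # true count per class
    p = [sum(cm[r][k] for r in range(n)) for k in range(n)]  # predicted count per class
    num = c * s - sum(pk * tk for pk, tk in zip(p, t))
    den = sqrt((s * s - sum(pk * pk for pk in p)) * (s * s - sum(tk * tk for tk in t)))
    return num / den if den else 0.0


# Toy labels (hypothetical): 0=normal, 1=COVID-19, 2=pneumonia, 3=lung opacity
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 0]
cm = confusion_counts(y_true, y_pred, n_classes=4)
print(balanced_accuracy(cm), macro_f1(cm), multiclass_mcc(cm))
```

Balanced accuracy and MCC are often reported alongside plain accuracy on multi-source CXR datasets because class sizes tend to be uneven, and both metrics are less sensitive to a dominant class.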
Nejat et al. (Wed,) studied this question.