Objectives: To develop and rigorously evaluate a Hybrid Multi-Path Attention Convolutional Neural Network (HMPA-CNN) for the classification of kidney diseases across heterogeneous institutional datasets and imaging modalities.

Materials and Methods: The proposed HMPA-CNN employs dual parallel pathways to disentangle spatial (3 × 3 convolutions) and textural (5 × 5 convolutions) representations, followed by attention-based feature recalibration and gated fusion. Performance was assessed on five geographically distinct datasets comprising 29,148 CT and MRI images collected from Turkey, Bangladesh, Jordan, Iraq, and publicly available international sources. The evaluation framework included three-class tumor discrimination, four-class renal pathology classification, six-class tumor subtyping, binary kidney stone detection, and chronic kidney disease (CKD) assessment under cross-modality conditions.

Results: The model achieved 99.76% overall accuracy on the KidneyNeXt three-class dataset, 99.96% on the four-class multi-institutional CT dataset, and 99.74% on the independent Jordan cohort under a four-class configuration. In the more granular six-class tumor subtyping task, overall accuracy was 96.36%. The same architecture achieved 93.85% overall accuracy on the MRI-based CKD classification task, suggesting that the framework can be adapted to a different imaging modality. Across most classification tasks, specificity exceeded 99%, with benign–malignant misclassification remaining below 2%. Performance declined to 91.96% for kidney stone detection, reflecting the intrinsic difficulty of small-object localization in axial CT images.

Conclusions: The dual-path architecture consistently preserved high discriminative performance across institutions, diagnostic granularities, and imaging modalities. Its stable specificity and low benign–malignant confusion suggest potential utility as a supportive tool within renal imaging workflows, particularly for screening and structured diagnostic assistance. Clinically, benign–malignant misclassification is the most critical error, as it may delay oncologic evaluation or lead to unnecessary follow-up. Therefore, the model should be used as a decision-support tool rather than an autonomous diagnostic system. Further prospective validation is required to determine its impact in routine clinical practice.
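
As a rough illustration of how figures such as per-class specificity and benign–malignant misclassification can be derived from a confusion matrix, consider the minimal sketch below. The matrix values, the class order (normal, benign, malignant), and the function names are hypothetical and are not taken from the study. The cross-confusion definition used here (benign predicted as malignant plus malignant predicted as benign, divided by all benign and malignant samples) is an assumption, since the abstract does not state the exact formula.

```python
def overall_accuracy(cm):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_specificity(cm):
    """One-vs-rest specificity TN / (TN + FP) for each class."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    result = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true k, predicted elsewhere
        fp = sum(cm[i][k] for i in range(n)) - tp   # other classes predicted as k
        tn = total - tp - fn - fp
        result.append(tn / (tn + fp))
    return result


def cross_confusion_rate(cm, group_a, group_b):
    """Share of samples in either group that were predicted into the other group.

    Assumed definition; the source does not specify how the rate is computed.
    """
    crossed = sum(cm[i][j] for i in group_a for j in group_b)
    crossed += sum(cm[i][j] for i in group_b for j in group_a)
    total = sum(sum(cm[i]) for i in list(group_a) + list(group_b))
    return crossed / total


# Hypothetical three-class confusion matrix (rows: true, columns: predicted).
# Class order: 0 = normal, 1 = benign, 2 = malignant. Not data from the study.
example_cm = [
    [500, 0, 0],
    [1, 295, 4],
    [0, 2, 198],
]

if __name__ == "__main__":
    print("accuracy:", overall_accuracy(example_cm))
    print("specificity per class:", per_class_specificity(example_cm))
    print("benign-malignant confusion:", cross_confusion_rate(example_cm, [1], [2]))
```

Reporting the benign–malignant rate separately from overall accuracy matters because, as the conclusions note, this error type carries the highest clinical cost and can be hidden by strong performance on the remaining classes.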
Karatepe et al. studied this question.