This study evaluates a deep learning framework combined with a cross-replication strategy for predicting Alzheimer’s disease (AD) from structural magnetic resonance imaging (MRI). EfficientNetV2-B0 was selected for its favorable accuracy-efficiency trade-off. The workflow consisted of two stages: (i) clustering-based relabeling of the full dataset into five clinically meaningful categories, and (ii) training a classifier on the relabeled data. To assess the stability of the proposed approach, the model was trained with multiple random initializations on a fixed train/validation/test split. Class-wise Average Precision and macro- and micro-averaged Precision-Recall Area Under the Curve (PR–AUC) and Receiver Operating Characteristic Area Under the Curve (ROC–AUC) were reported, with 95% confidence intervals estimated by bootstrap resampling. The cross-replication strategy improved stability across initializations, with a mean test accuracy of 0.95 compared with 0.94 for the single-run baseline, along with consistently higher PR–AUC and ROC–AUC values. These findings suggest that cross-replication enhances the reliability of AD stage prediction by reducing the performance variability caused by stochastic initialization, although further evaluation with alternative data partitions or external validation cohorts is warranted.
Mustafa Cosar (Fri,) studied this question.
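The abstract reports class-wise Average Precision, macro-averaged scores, and 95% bootstrap confidence intervals, but does not describe how they were computed. The following is a minimal sketch of one common way to obtain such quantities, not the authors' implementation. It assumes a one-vs-rest treatment of the five classes and a percentile bootstrap over test samples. The function names (`average_precision`, `macro_ap`, `bootstrap_ci`), the number of resamples, and the synthetic scores are all hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def average_precision(labels, scores):
    """Average precision for one binary problem.

    Computed as the mean of precision@k over the ranks k at which a
    positive sample appears. Tied scores are ranked in input order, which
    can differ slightly from library implementations that group ties.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precisions.append(hits / rank)
    return mean(precisions) if precisions else float("nan")


def macro_ap(y_true, y_score, n_classes):
    """Unweighted mean of one-vs-rest average precision over classes.

    Classes with no positive sample (possible inside a bootstrap resample)
    are skipped rather than counted as zero. This is an assumption.
    """
    aps = []
    for c in range(n_classes):
        labels = [1 if y == c else 0 for y in y_true]
        if sum(labels) == 0:
            continue
        aps.append(average_precision(labels, [row[c] for row in y_score]))
    return mean(aps)


def bootstrap_ci(metric, y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample test items with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx],
                            [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


def make_synthetic(n=200, n_classes=5, seed=42):
    """Synthetic labels and scores (NOT data from the study)."""
    rng = random.Random(seed)
    y_true = [rng.randrange(n_classes) for _ in range(n)]
    # The true class receives a boost so the ranking is informative.
    y_score = [[rng.random() + (1.0 if c == y else 0.0)
                for c in range(n_classes)] for y in y_true]
    return y_true, y_score


if __name__ == "__main__":
    y_true, y_score = make_synthetic()
    metric = lambda yt, ys: macro_ap(yt, ys, 5)
    point = metric(y_true, y_score)
    lower, upper = bootstrap_ci(metric, y_true, y_score)
    print(f"macro AP = {point:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

The same `bootstrap_ci` helper accepts any metric with the signature `metric(y_true, y_score)`, so a ROC–AUC or micro-averaged variant could be substituted. Note that resampling test items captures sampling variability in the test set only. Variability across random initializations, which the cross-replication strategy targets, would have to be summarized separately over the trained models.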