Alzheimer’s disease (AD) is a progressive neurodegenerative disorder for which early detection and reliable monitoring remain major clinical challenges. Electroencephalography (EEG) combined with machine learning has attracted growing interest as a scalable and non-invasive approach to AD detection, yet reported classification accuracies vary widely across studies and are rarely comparable or clinically translatable. One important reason is that the analytical pipeline—from data acquisition to model validation—involves numerous methodological choices whose inter-stage dependencies and reproducibility implications are rarely made explicit. In this narrative review, we adopt a methodological chain framework to make these dependencies explicit, organizing EEG-based AD research into five sequential stages: data acquisition, preprocessing, feature representation, modeling, and validation. Choices at each stage can shape downstream analyses, inflate reported performance, and reduce cross-study comparability in ways that are difficult to detect when stages are assessed independently. These effects are particularly consequential in EEG-based AD research, where cohorts are typically small and biomarkers are subtle. We make three primary contributions: (1) we describe inter-stage methodological dependencies that may contribute to reproducibility problems and performance inflation; (2) we synthesize major sources of methodological variability across representative EEG–AD studies and evaluate their differential impact on spectral, connectivity, and complexity features; and (3) we provide practical, stage-aligned recommendations culminating in a minimum reporting checklist.
Wang et al. studied this question.