Nearest-neighbor classifiers are accurate and easy to deploy, but their memory footprint and inference time grow with the size of the reference set. This paper studies an evolutionary prototype selection strategy for k-nearest neighbor (K-NN) classification aimed at extreme, class-balanced reduction. A compact genetic algorithm (GA) evolves a fixed number of prototype indices per class drawn from a disjoint design partition; the selected prototypes are then used by a 1-NN classifier, with fitness defined as the number of correctly classified test instances. To address concerns about generality and baseline strength, we evaluate an experimental suite including synthetic 2D Gaussians (σ = 0.5 and σ = 1.0) and a 3D three-moons geometry, as well as public benchmarks spanning binary and multi-class settings and higher-dimensional data (Breast Cancer Wisconsin, Wine, Reduced MNIST/Digits 8×8, Forest CoverType with seven classes, and a 10D five-class spiral benchmark). We compare against K-NN baselines with k ∈ {1, 3, 5, 7} using all design samples, and include GA operator ablations (GA1/GA2/GA3). Each scenario is repeated over 30 independent runs, reporting mean ± std, min/max, per-run distributions, win/tie/loss counts, and non-parametric significance tests (paired Wilcoxon with Holm correction; Friedman where applicable). Across datasets, the GA-selected prototype banks, often orders of magnitude smaller than the full design set, match or improve accuracy, with frequent statistically supported wins against strong K-NN baselines, and in the hardest cases provide substantial compression with no loss relative to the best baseline. These results establish a reproducible baseline for extreme, class-balanced prototype reduction suitable for memory- and latency-constrained deployments and for fair comparison against more elaborate prototype selection methods.
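The selection loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the operators (uniform crossover, same-class index mutation, elitist truncation), the population size, the mutation rate, and all function names are assumptions chosen for brevity, and the GA1/GA2/GA3 variants are not reproduced. Only the chromosome layout (a fixed number of design indices per class) and the fitness (count of instances correctly classified by 1-NN over the selected prototypes) follow the abstract.

```python
import random

def nearest_label(x, prototypes):
    """1-NN: label of the prototype closest to x (squared Euclidean distance)."""
    best = min(prototypes, key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p[0])))
    return best[1]

def fitness(chromosome, design, eval_set):
    """Number of eval_set instances correctly classified by the selected prototypes."""
    prototypes = [design[i] for i in chromosome]
    return sum(nearest_label(x, prototypes) == y for x, y in eval_set)

def random_chromosome(class_indices, m, rng):
    """m design indices per class, laid out class by class (class-balanced)."""
    chrom = []
    for label in sorted(class_indices):
        chrom.extend(rng.sample(class_indices[label], m))
    return chrom

def crossover(a, b, rng):
    """Uniform crossover (assumed operator); gene positions stay class-aligned."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mutate(chrom, class_indices, m, rng, rate=0.1):
    """With probability `rate`, swap a gene for another index of the same class."""
    labels = sorted(class_indices)
    out = list(chrom)
    for g in range(len(out)):
        if rng.random() < rate:
            out[g] = rng.choice(class_indices[labels[g // m]])
    return out

def select_prototypes(design, eval_set, m=2, pop_size=20, generations=30, seed=0):
    """Evolve a class-balanced bank of m prototype indices per class."""
    rng = random.Random(seed)
    class_indices = {}
    for i, (_, y) in enumerate(design):
        class_indices.setdefault(y, []).append(i)
    pop = [random_chromosome(class_indices, m, rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist truncation (assumed): keep the better half, refill with offspring.
        pop.sort(key=lambda c: fitness(c, design, eval_set), reverse=True)
        elite = pop[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng),
                   class_indices, m, rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=lambda c: fitness(c, design, eval_set))

def make_blobs(n_per_class, rng, sigma=0.5):
    """Toy two-class 2D Gaussian data (illustrative only, not the paper's suite)."""
    centers = {0: (0.0, 0.0), 1: (3.0, 3.0)}
    return [((rng.gauss(cx, sigma), rng.gauss(cy, sigma)), y)
            for y, (cx, cy) in centers.items() for _ in range(n_per_class)]
```

A call such as `select_prototypes(design, eval_set, m=2)` returns a list of design-set indices, two per class in this case; passing those prototypes to `nearest_label` gives the reduced 1-NN classifier. This sketch allows the same index to appear twice within a class after crossover or mutation, which a fuller implementation might want to prevent.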
Ayala-Ramírez et al. studied this question.