Gene expression data present significant challenges because of their high dimensionality, so effective gene selection methods are needed for accurate analysis and biomarker discovery. In this paper, we conducted a comprehensive comparative study of nine filter-based gene selection techniques: Information Gain, Mutual Information, Correlation-based Feature Selection (CFS), Relief-F, T-Test, Wilcoxon, Chi2, Pearson correlation, and Gini index. The methods were evaluated on a breast cancer microarray dataset in terms of classification accuracy, computational efficiency, and stability of the selected gene subsets. Most methods achieve high predictive accuracy and perfect stability but differ in computational cost. This study aims to provide practical insights for choosing an appropriate filtering method based on its balance of performance and efficiency in gene expression analysis.
Elwaer et al. studied this question.
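To make the evaluation criteria concrete, the following minimal sketch shows how one filter method could be scored for stability and run time. It is not the authors' pipeline. It assumes a Welch t-statistic as the filter score (one plausible reading of the T-Test method), measures stability as the mean pairwise Jaccard index over stratified subsamples (the abstract does not say which stability measure was used), and runs on synthetic data from a hypothetical `make_toy_data` helper rather than the breast cancer dataset. Classification accuracy is omitted to keep the example dependency-free.

```python
import random
import time
from itertools import combinations
from statistics import mean, variance


def welch_t_scores(X, y):
    """Absolute Welch t-statistic per gene (column) for binary labels y."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
        scores.append(abs(mean(a) - mean(b)) / se if se > 0 else 0.0)
    return scores


def top_k(scores, k):
    """Indices of the k highest-scoring genes, as a set."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(order[:k])


def jaccard_stability(subsets):
    """Mean pairwise Jaccard index; 1.0 means every resample chose the same genes."""
    pairs = list(combinations(subsets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


def make_toy_data(n_samples=60, n_genes=200, n_informative=5, seed=0):
    """Hypothetical synthetic data: the first n_informative genes are shifted in class 1."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    X = []
    for label in y:
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for j in range(n_informative):
            row[j] += 4.0 * label
        X.append(row)
    return X, y


def evaluate_filter(X, y, k=5, n_resamples=10, seed=1):
    """Select top-k genes on stratified 80% subsamples; return subsets, stability, seconds."""
    rng = random.Random(seed)
    idx0 = [i for i, label in enumerate(y) if label == 0]
    idx1 = [i for i, label in enumerate(y) if label == 1]
    subsets = []
    start = time.perf_counter()
    for _ in range(n_resamples):
        keep = rng.sample(idx0, int(0.8 * len(idx0))) + rng.sample(idx1, int(0.8 * len(idx1)))
        scores = welch_t_scores([X[i] for i in keep], [y[i] for i in keep])
        subsets.append(top_k(scores, k))
    elapsed = time.perf_counter() - start
    return subsets, jaccard_stability(subsets), elapsed


if __name__ == "__main__":
    X, y = make_toy_data()
    subsets, stability, elapsed = evaluate_filter(X, y)
    print(f"stability={stability:.3f}, selection time={elapsed:.4f}s")
```

Swapping `welch_t_scores` for another scoring function with the same signature would let several filters be compared under identical resampling, which is the kind of side-by-side comparison the abstract describes.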