We introduce a novel variable selection method for high-dimensional generalized linear models, called Randomized Smart Subset Selection (RS3). RS3 combines randomized block-wise model fitting with importance-weighted stochastic subset sampling to efficiently identify active predictors. Two extensions are proposed: (i) RS3+SCAD, which applies penalized regression with the SCAD penalty to a reduced set of candidate variables, improving computational efficiency while adaptively determining model size; and (ii) RS3+FPCS, a subsampling-based procedure for false positive control based on coefficient sign stability. Simulation studies demonstrate that RS3 and its variants achieve high true positive rates with low false discovery rates, even under moderate predictor correlation. Applications to gene expression data show that RS3+SCAD consistently identifies sparse, biologically interpretable gene sets, outperforming existing methods in terms of prediction accuracy and stability.

MSC Codes: 62F40, 62J12, 62H30
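For readers unfamiliar with the SCAD penalty mentioned above, the following is a minimal sketch of its standard definition (Fan and Li, 2001) for a single coefficient. The abstract does not state which parameterization or tuning values RS3+SCAD uses, so the function name `scad_penalty` and the default `a = 3.7` (the conventional choice in the literature) are assumptions made for illustration only.

```python
def scad_penalty(beta: float, lam: float, a: float = 3.7) -> float:
    """Standard SCAD penalty for one coefficient.

    beta: coefficient value
    lam:  regularization level (lambda >= 0)
    a:    shape parameter, must exceed 2; 3.7 is the conventional
          default and is an assumption here, not taken from the paper
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(beta)
    if t <= lam:
        # Small coefficients: same linear penalty as the lasso.
        return lam * t
    if t <= a * lam:
        # Intermediate coefficients: quadratic taper, so the
        # penalty grows more slowly as |beta| increases.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so no further shrinkage.
    return lam * lam * (a + 1) / 2
```

The three branches join continuously at `|beta| = lam` and `|beta| = a * lam`. The flat third branch is what lets SCAD leave large coefficients nearly unbiased while still setting small ones to zero.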
Rios et al. (Wed,) studied this question.