Random forests remain among the most popular off-the-shelf supervised machine learning tools, with a well-established track record of predictive accuracy in regression and classification settings. Despite their empirical success, as well as a bevy of recent work investigating their statistical properties, a full and satisfying explanation for their success has yet to be put forth. Here we aim to take a step forward in this direction by demonstrating that the randomness injected into individual trees serves as a form of regularization, making random forests an ideal model in low signal-to-noise ratio (SNR) settings. Specifically, from a model-complexity perspective, we show that the mtry parameter in random forests serves much the same purpose as the shrinkage penalty in explicitly regularized regression procedures like lasso and ridge regression. To highlight this point, we design a randomized linear-model-based forward selection procedure intended as an analogue to tree-based random forests and demonstrate its surprisingly strong performance. Numerous demonstrations on both real and synthetic data are provided.
Mentch et al. studied this question.
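To make the idea of a randomized, linear-model-based forward selection procedure more concrete, the following is a minimal sketch and not the authors' exact algorithm. It assumes that each base learner is fit on a bootstrap sample, that each forward step may only choose among a random subset of `mtry` not-yet-selected features (mirroring the role of mtry in a random forest), and that the ensemble averages the coefficient vectors. The function names `randomized_forward_selection`, `fit_randomized_fs_ensemble`, and `predict`, along with the `depth` and `n_models` parameters, are hypothetical and introduced here only for illustration.

```python
import numpy as np


def randomized_forward_selection(X, y, depth, mtry, rng):
    """One base learner (hypothetical helper): greedy forward selection in
    which every step considers only a random subset of `mtry` candidates."""
    n, p = X.shape
    selected = []
    for _ in range(min(depth, p)):
        remaining = [j for j in range(p) if j not in selected]
        # Restrict the search to a random subset, as mtry does at a tree split.
        k = min(mtry, len(remaining))
        candidates = rng.choice(remaining, size=k, replace=False)
        best_j, best_rss = None, np.inf
        for j in candidates:
            cols = selected + [int(j)]
            A = np.column_stack([np.ones(n), X[:, cols]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ coef) ** 2))
            if rss < best_rss:
                best_j, best_rss = int(j), rss
        selected.append(best_j)

    # Refit ordinary least squares on the selected features only.
    A = np.column_stack([np.ones(n), X[:, selected]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    # Embed the fit in a length p + 1 vector (intercept first, zeros elsewhere).
    beta = np.zeros(p + 1)
    beta[0] = coef[0]
    beta[1 + np.array(selected, dtype=int)] = coef[1:]
    return beta


def fit_randomized_fs_ensemble(X, y, n_models=100, depth=5, mtry=3, seed=0):
    """Average the coefficient vectors of many randomized base learners,
    each fit on a bootstrap sample (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    betas = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # bootstrap resample of the rows
        betas.append(randomized_forward_selection(X[idx], y[idx], depth, mtry, rng))
    return np.mean(betas, axis=0)


def predict(beta, X):
    """Linear prediction from an averaged coefficient vector."""
    return beta[0] + X @ beta[1:]
```

In this sketch, setting `mtry` equal to the number of features recovers bagged ordinary forward selection. Smaller values of `mtry` spread selections across more features, and because unselected features contribute zeros to the average, the averaged coefficients tend to be pulled toward zero, which is the shrinkage-like behavior the abstract attributes to mtry. Comparing held-out error across several `mtry` values on low-SNR synthetic data is one way to explore that claim.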