Random forests (RFs) have a long history: they were originally defined by Leo Breiman in 2001 but have antecedents in bagging methods introduced in 1996. They have become one of the most widely adopted machine learning tools thanks to their computational efficiency, relative insensitivity to tuning parameters, built-in cross-validation, and interpretation tools. Despite their popularity, mathematical theory about the fundamental properties of RFs has been slow to emerge. Nonetheless, the past decade has seen significant advances in our understanding and analysis of these algorithms. In this review article, we describe several variants of RFs and show how the rates of consistency established for these variants highlight the impact of different RF mechanisms on performance. We then survey a second line of research, which establishes central limit theorems and confidence intervals for RFs. Finally, we describe recent analyses of variable importance measures computed with RFs.
Scornet et al. studied this question.