April 20, 2016

Ensembles of Overfit and Overconfident Forecasts

Abstract

Firms today average forecasts collected from multiple experts and models. Because of cognitive biases, strategic incentives, or the structure of machine-learning algorithms, these forecasts are often overfit to sample data and are overconfident. Little is known about the challenges associated with aggregating such forecasts. We introduce a theoretical model to examine the combined effect of overfitting and overconfidence on the average forecast. Their combined effect is that the mean and median probability forecasts are poorly calibrated with hit rates of their prediction intervals too high and too low, respectively. Consequently, we prescribe the use of a trimmed average, or trimmed opinion pool, to achieve better calibration. We identify the random forest, a leading machine-learning algorithm that pools hundreds of overfit and overconfident regression trees, as an ideal environment for trimming probabilities. Using several known data sets, we demonstrate that trimmed ensembles can significantly improve the random forest’s predictive accuracy. This paper was accepted by James Smith, decision analysis.
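To illustrate the kind of trimmed average the abstract prescribes, here is a minimal sketch that compares a plain mean with a symmetrically trimmed mean of probability forecasts. It is not the authors' implementation: the function name `trimmed_pool`, the `trim` parameter, the choice to trim pointwise at each threshold, and the forecast values are all assumptions made for illustration.

```python
import numpy as np

def trimmed_pool(probs, trim=0.2):
    """Symmetric trimmed mean across forecasters (rows), taken column by column.

    probs: array of shape (n_forecasters, n_thresholds); each entry is one
           forecaster's probability that the outcome falls at or below a threshold.
    trim:  fraction of forecasts dropped from EACH tail (hypothetical parameter).
    """
    probs = np.sort(np.asarray(probs, dtype=float), axis=0)  # sort each column
    k = probs.shape[0]
    cut = int(np.floor(trim * k))  # forecasts removed from each tail
    if 2 * cut >= k:
        raise ValueError("trim removes every forecast")
    return probs[cut:k - cut].mean(axis=0)

# Made-up probabilities from five forecasters at two thresholds.
forecasts = np.array([
    [0.05, 0.60],
    [0.10, 0.70],
    [0.20, 0.80],
    [0.30, 0.90],
    [0.90, 0.99],
])

pooled_mean = forecasts.mean(axis=0)                 # plain average
pooled_trimmed = trimmed_pool(forecasts, trim=0.2)   # drops lowest and highest row value per column
```

With five forecasters and `trim=0.2`, one forecast is dropped from each tail at every threshold, so a single extreme forecast (such as 0.90 in the first column) no longer pulls the pooled probability toward it.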

Authors

Yael Grushka‐Cockayne

University of Virginia

Victor Richmond R. Jose

Georgetown University

Kenneth C. Lichtendahl

Google (United States)

Journals

Management Science

Institutions

University of Virginia

Georgetown University
