BACKGROUND: Many measures of prediction accuracy have been developed. However, the most popular ones in typical medical outcome prediction settings require additional investigation of calibration.

METHODS: We show how rescaling the Brier score produces a measure that combines discrimination and calibration in one value and improves interpretability by adjusting for a benchmark model. We have called this measure the index of prediction accuracy (IPA). The IPA permits a common interpretation across binary, time-to-event, and competing-risks outcomes. We illustrate this measure using example datasets.

RESULTS: The IPA is simple to compute, and example code is provided. The values of the IPA appear very interpretable.

CONCLUSIONS: IPA should be a prominent measure reported in studies of medical prediction model performance. However, IPA is only a measure of average performance and, by default, does not measure the utility of a medical decision.
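The rescaling described in the methods can be illustrated for the binary case with a minimal sketch. It assumes the benchmark is a null model that assigns every subject the observed event prevalence, so that IPA = 1 - Brier(model) / Brier(null). The function names `brier_score` and `ipa_binary` are hypothetical and are not taken from the paper's example code. Time-to-event and competing-risks outcomes would additionally need a censoring-adjusted Brier score, which this sketch does not cover.

```python
from typing import Sequence


def brier_score(outcomes: Sequence[int], predictions: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the observed 0/1 outcome."""
    if len(outcomes) == 0 or len(outcomes) != len(predictions):
        raise ValueError("outcomes and predictions must be non-empty and equal in length")
    return sum((p - y) ** 2 for y, p in zip(outcomes, predictions)) / len(outcomes)


def ipa_binary(outcomes: Sequence[int], predictions: Sequence[float]) -> float:
    """Rescaled Brier score: 1 - Brier(model) / Brier(null).

    Assumption: the null (benchmark) model predicts the observed event
    prevalence for every subject.
    """
    prevalence = sum(outcomes) / len(outcomes)
    null_brier = brier_score(outcomes, [prevalence] * len(outcomes))
    if null_brier == 0:
        raise ValueError("IPA is undefined when all outcomes are identical")
    return 1.0 - brier_score(outcomes, predictions) / null_brier


if __name__ == "__main__":
    # Made-up illustrative data, not from the paper.
    observed = [1, 0, 1, 0]
    predicted = [0.8, 0.2, 0.6, 0.4]
    print(f"Brier score: {brier_score(observed, predicted):.3f}")
    print(f"IPA:         {ipa_binary(observed, predicted):.3f}")
```

Under this definition, a model that does no better than the null model scores 0, perfect predictions score 1, and a model that does worse than the null model scores below 0.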
Kattan et al. studied this question.