What type of study is this?

This is a quantitative study.

October 20, 2025 | Open Access

Truthfulness of Decision-Theoretic Calibration Measures

Key Points

The new measure, subsampled step calibration (StepCE^sub), is both decision-theoretic and truthful, whereas existing measures satisfy at most one of these properties.
On any product distribution, subsampled step calibration is truthful up to an O(1) factor, unlike prior decision-theoretic measures.
In smoothed settings, where each conditional probability is perturbed by noise of magnitude c > 0, it remains truthful up to an O(sqrt(log(1/c))) factor.
An impossibility result shows that any complete and decision-theoretic calibration measure must be discontinuous and non-truthful in the non-smoothed setting.

Abstract

Calibration measures quantify how much a forecaster's predictions violate calibration, which requires that forecasts are unbiased conditioned on the forecasted probabilities. Two important desiderata for a calibration measure are its decision-theoretic implications (i.e., downstream decision-makers that best-respond to the forecasts are always no-regret) and its truthfulness (i.e., a forecaster approximately minimizes error by always reporting the true probabilities). Existing measures satisfy at most one of these properties, not both. We introduce a new calibration measure termed subsampled step calibration, StepCE^sub, that is both decision-theoretic and truthful. In particular, on any product distribution, StepCE^sub is truthful up to an O(1) factor, whereas prior decision-theoretic calibration measures suffer from an e^(-Ω(T))-Ω(sqrt(T)) truthfulness gap. Moreover, in any smoothed setting where the conditional probability of each event is perturbed by a noise of magnitude c > 0, StepCE^sub is truthful up to an O(sqrt(log(1/c))) factor, while prior decision-theoretic measures have an e^(-Ω(T))-Ω(T^(1/3)) truthfulness gap. We also prove a general impossibility result for truthful decision-theoretic forecasting: any complete and decision-theoretic calibration measure must be discontinuous and non-truthful in the non-smoothed setting.
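
To make the notion of "unbiased conditioned on the forecasted probabilities" concrete, the following is a minimal Python sketch, not the paper's method. It computes the classic calibration error (the total absolute bias within each group of identical forecasts) and, for contrast, a threshold-based quantity that takes the largest absolute bias over sets of the form {t : p_t <= alpha}. The function names `calibration_error` and `threshold_bias` are hypothetical, and the threshold quantity is only an illustrative assumption in the spirit of a step-style measure. The exact definitions of step calibration and StepCE^sub are not given in this summary and are not reproduced here.

```python
from collections import defaultdict


def _bias_by_forecast(forecasts, outcomes):
    """Sum (outcome - forecast) separately for each distinct forecast value."""
    bias = defaultdict(float)
    for p, x in zip(forecasts, outcomes):
        bias[p] += x - p
    return bias


def calibration_error(forecasts, outcomes):
    """Classic calibration error: sum over forecast values p of
    |sum over rounds with p_t == p of (x_t - p_t)|."""
    bias = _bias_by_forecast(forecasts, outcomes)
    return sum(abs(b) for b in bias.values())


def threshold_bias(forecasts, outcomes):
    """Illustrative assumption, not the paper's definition: the largest
    absolute cumulative bias over the sets {t : p_t <= alpha}."""
    bias = _bias_by_forecast(forecasts, outcomes)
    running, worst = 0.0, 0.0
    for p in sorted(bias):  # sweep the threshold alpha upward
        running += bias[p]
        worst = max(worst, abs(running))
    return worst


if __name__ == "__main__":
    # Forecasting 0.5 on a sequence with half ones is perfectly calibrated.
    print(calibration_error([0.5] * 4, [1, 0, 1, 0]))
    # Opposite biases at 0.2 and 0.8 add up in the classic measure but
    # partially cancel once the threshold covers both forecast values.
    print(calibration_error([0.2, 0.8], [1, 0]))
    print(threshold_bias([0.2, 0.8], [1, 0]))
```

Both quantities are zero for a perfectly calibrated sequence. They differ in how biases at different forecast values combine, which is one reason different calibration measures can reward different reporting strategies.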
