A Neural Network (NN) may exhibit overconfidence about wrong hypotheses, especially for Out-Of-Domain (OOD) inputs. A Gaussian process (GP), by contrast, has an explainable distributional uncertainty behaviour: it predicts hypotheses with greater uncertainty for query inputs further from the training data. Previous work has shown that an NN can learn to emulate the behaviour of a GP on in-domain data. This paper expands upon this by proposing to train an NN student to emulate the GP teacher's distributional uncertainty behaviour on OOD data. This avoids the computational cost of using a GP at run-time, while improving the OOD confidence calibration of the NN. More accurate confidence calibration may better inform how the system should give feedback to the user. Experiments on the SEP-28k-E stutter detection dataset suggest that distillation of such knowledge is feasible between these models.
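The GP behaviour described above, where predictive uncertainty grows as a query moves away from the training data, can be illustrated with a minimal sketch. This is not the paper's model: the squared-exponential kernel, the length scale, the noise level, the one-dimensional toy inputs, and all function names are assumptions chosen only for illustration.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def forward_solve(L, b):
    """Solve L y = b for lower-triangular L."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y

def gp_predictive_variance(x_train, x_query, noise=1e-2):
    """Posterior variance k(x*, x*) - k*^T (K + noise I)^{-1} k* at one query."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    L = cholesky(K)
    k_star = [rbf(a, x_query) for a in x_train]
    v = forward_solve(L, k_star)  # v = L^{-1} k*, so v.v = k*^T K^{-1} k*
    return rbf(x_query, x_query) - sum(t * t for t in v)

# Hypothetical toy data: training inputs clustered around the origin.
x_train = [-1.0, -0.5, 0.0, 0.5, 1.0]
var_near = gp_predictive_variance(x_train, 0.25)  # query inside the training range
var_far = gp_predictive_variance(x_train, 5.0)    # query far from the training data
print(var_near, var_far)
```

In this toy setting the variance at the far query approaches the prior variance of the kernel, while the variance at the near query stays small. That distance-dependent uncertainty is the kind of behaviour an NN student would be trained to reproduce.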
Wong et al. studied this question.