A Neural Network (NN) may be overconfident in wrong hypotheses, especially for Out-Of-Domain (OOD) inputs. A Gaussian Process (GP), by contrast, has explainable distributional uncertainty behaviour: it predicts hypotheses with greater uncertainty for query inputs that lie further from the training data. Previous work has shown that a NN can learn to emulate the behaviour of a GP on in-domain data. This paper extends that work by proposing to train a NN student to emulate the GP teacher's distributional uncertainty behaviour on OOD data. This avoids the computational cost of using a GP at run-time, while improving the OOD confidence calibration of the NN. More accurate confidence calibration may better inform how the system should give feedback to the user. Experiments on the SEP-28k-E stutter detection dataset suggest that distillation of such knowledge between these models is feasible.
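The teacher behaviour described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a 1-D GP regression with a squared-exponential kernel (the paper addresses stutter detection, a classification task), and the function names, toy data, and the Gaussian KL distillation loss are all hypothetical choices made for illustration. It shows that the GP's predictive variance grows for a query far from the training data, and how that variance could serve as a target for a student.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def gp_predict(x_train, y_train, x_query, noise=1e-2):
    """Posterior mean and latent variance of a zero-mean GP at one query point."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    k_star = [rbf(x_query, a) for a in x_train]
    alpha = solve(K, y_train)   # K^{-1} y
    v = solve(K, k_star)        # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_query, x_query) - sum(k * w for k, w in zip(k_star, v))
    return mean, var

def kl_gauss(m_t, v_t, m_s, v_s):
    """KL(teacher || student) between two 1-D Gaussians.

    Hypothetical distillation loss; the abstract does not state the loss used.
    """
    return 0.5 * (math.log(v_s / v_t) + (v_t + (m_t - m_s) ** 2) / v_s - 1.0)

# Toy training data (illustrative only): samples of sin(x) on [-1, 1].
x_train = [-1.0, -0.5, 0.0, 0.5, 1.0]
y_train = [math.sin(x) for x in x_train]

mean_in, var_in = gp_predict(x_train, y_train, 0.25)    # in-domain query
mean_ood, var_ood = gp_predict(x_train, y_train, 5.0)   # far from training data

# A student that reports a small variance on the OOD query is penalised
# more heavily than one that matches the teacher's variance.
loss_overconfident = kl_gauss(mean_ood, var_ood, mean_ood, 0.01)
loss_matched = kl_gauss(mean_ood, var_ood, mean_ood, var_ood)
```

In this sketch the teacher's variance at the distant query approaches the prior variance of the kernel, so a loss of this kind would push a student towards lower confidence on such inputs; once trained, the student would be queried on its own, without the GP's matrix solves at run-time.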
Wong et al. studied this question.