When approximating one probability density with another, it is desirable to minimize the information loss of the approximation, as quantified by, e.g., the Kullback–Leibler divergence (KLD). It is well known that, when the approximating density is Gaussian, matching the first two moments of the original density yields the optimal approximation in the sense of minimizing the KLD. In this paper, we show that a similar property can be proven for certain hyperspherical probability distributions, namely the von Mises–Fisher and the Watson distributions. This result has profound implications for moment-based filtering on the unit hypersphere, as it shows that moment-based approaches are optimal in the information-theoretic sense.
Kurz et al. studied this question.
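
The Gaussian moment-matching property mentioned in the abstract can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes a hypothetical original density p (a two-component Gaussian mixture with made-up weights, means, and variances) and the direction KL(p || q), where q is the Gaussian approximation. Because KL(p || q) equals the cross-entropy E_p[-log q] minus the entropy of p, and the entropy of p does not depend on q, comparing cross-entropies is enough to compare KLD values.

```python
import math


def mixture_moments(weights, means, variances):
    """Mean and variance of a 1-D Gaussian mixture (closed form)."""
    mean = sum(w * m for w, m in zip(weights, means))
    second = sum(w * (v + m * m) for w, m, v in zip(weights, means, variances))
    return mean, second - mean * mean


def cross_entropy(p_mean, p_var, q_mean, q_var):
    """E_p[-log q] for a Gaussian q.

    Equals KL(p || q) plus the entropy of p, so it ranks candidate
    approximations q in the same order as the KLD does. It depends on p
    only through its first two moments.
    """
    expected_sq_dev = p_var + (p_mean - q_mean) ** 2
    return 0.5 * math.log(2.0 * math.pi * q_var) + expected_sq_dev / (2.0 * q_var)


# Hypothetical original density p: a two-component Gaussian mixture.
p_mean, p_var = mixture_moments([0.3, 0.7], [-1.0, 2.0], [0.5, 1.5])

# Moment-matched Gaussian: same mean and variance as p.
best = cross_entropy(p_mean, p_var, p_mean, p_var)

# Gaussians with a shifted mean and/or rescaled variance.
perturbed = [
    cross_entropy(p_mean, p_var, p_mean + dm, p_var * scale)
    for dm in (-0.5, 0.0, 0.5)
    for scale in (0.5, 1.0, 2.0)
    if (dm, scale) != (0.0, 1.0)
]

print("moment-matched:", best)
print("smallest perturbed value:", min(perturbed))
```

Every perturbed candidate yields a larger cross-entropy, and hence a larger KLD, than the moment-matched Gaussian. This only illustrates the Gaussian case on the real line; the hyperspherical case treated in the paper is not covered by this sketch.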