Spherical-Homoscedastic Distributions: The Equivalency of Spherical and Normal Distributions in Classification

Key Points

Key points are not available for this paper at this time.

Abstract

Many feature representations, as in genomics, describe directional data where all feature vectors share a common norm. In other cases, as in computer vision, a norm or variance normalization step, where all feature vectors are normalized to a common length, is generally used. These representations and pre-processing step map the original data from R p to the surface of a hypersphere S p−1. Such representations should then be modeled using spherical distributions. However, the difficulty associated with such spherical representations has prompted researchers to model their spherical data using Gaussian distributions instead—as if the data were represented in R p rather than S p−1. This opens the question to whether the classification results calculated with the Gaussian approximation are the same as those obtained when using the original spherical distributions. In this paper, we show that in some particular cases (which we named spherical-homoscedastic) the answer to this question is positive. In the more general case however, the answer is negative. For this reason, we further investigate the additional error added by the Gaussian modeling. We conclude that the more the data deviates from spherical-homoscedastic, the less advisable it is to employ the Gaussian approximation. We then show how our derivations can be used to define

Bookmark

Spherical-Homoscedastic Distributions: The Equivalency of Spherical and Normal Distributions in Classification

Key Points

Abstract

Cite This Study