We study the problem of learning mixtures of distributions, a natural formalization of clustering. A mixture of distributions is a collection of distributions D = {D_1, ..., D_T} together with mixing weights w_1, ..., w_T such that ∑_i w_i = 1. A sample from the mixture is generated by choosing an index i with probability w_i and then drawing a sample from distribution D_i. The problem of learning the mixture is that of finding the parameters of the distributions comprising D, given only the ability to sample from the mixture. In this paper, we restrict ourselves to learning mixtures of product distributions. The key to learning such mixtures is to find a few vectors such that points from different distributions are sharply separated when projected onto these vectors. Previous techniques use the vectors corresponding to the top few directions of highest variance of the mixture. Unfortunately, these may be directions of high noise rather than directions along which the distributions are separated. Further, skewed mixing weights amplify the effects of noise, and as a result, previous techniques work only when the separation between the input distributions is large relative to the imbalance in the mixing weights. In this paper, we show an algorithm that successfully learns mixtures of distributions under a separation condition that depends only logarithmically on the skew in the mixing weights. In particular, it succeeds for a separation between the centers that is Θ(σ √(T log Λ)), where σ is the maximum directional standard deviation of any distribution in the mixture, T is the number of distributions, and Λ is polynomial in T, σ, log n and the imbalance in the mixing
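The sampling process described in the abstract, and the observation that a highest-variance direction can be a noise direction, can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes axis-aligned Gaussian components (one simple kind of product distribution), and the function names, weights, means, and standard deviations are all hypothetical values chosen for illustration.

```python
import random

def sample_mixture(weights, means, stds, n, seed=0):
    """Draw n points from a mixture of axis-aligned Gaussians.

    Assumption: each component is a product distribution whose
    coordinates are independent Gaussians (illustrative choice only).
    """
    rng = random.Random(seed)
    points, labels = [], []
    for _ in range(n):
        # Choose component i with probability w_i.
        i = rng.choices(range(len(weights)), weights=weights)[0]
        # Draw each coordinate independently from component D_i.
        x = [rng.gauss(m, s) for m, s in zip(means[i], stds[i])]
        points.append(x)
        labels.append(i)
    return points, labels

def coordinate_variances(points):
    """Empirical variance of the mixture along each coordinate axis."""
    n = len(points)
    dim = len(points[0])
    variances = []
    for j in range(dim):
        column = [p[j] for p in points]
        mean_j = sum(column) / n
        variances.append(sum((v - mean_j) ** 2 for v in column) / n)
    return variances

# Hypothetical skewed mixture: the centers differ only in coordinate 0,
# while coordinate 1 carries noise with a larger standard deviation.
weights = [0.95, 0.05]
means = [[0.0, 0.0], [4.0, 0.0]]
stds = [[1.0, 3.0], [1.0, 3.0]]

points, labels = sample_mixture(weights, means, stds, n=20000)
variances = coordinate_variances(points)
print("variance per coordinate:", variances)
```

In this toy setup the mixture's variance along coordinate 1 (pure noise) exceeds its variance along coordinate 0 (where the centers are separated), because the small weight on the second component contributes little between-component variance. A method that projects onto the highest-variance direction would therefore pick the noise coordinate here.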
Chaudhuri et al. studied this question.