We consider the problem of learning co-occurrence information between two word categories or, more generally, between two discrete random variables taking values in a hierarchically classified domain. In particular, we consider the problem of learning the 'association norm' defined by A(x, y) = p(x, y) / (p(x) * p(y)), where p(x, y) is the joint distribution of x and y, and p(x) and p(y) are the marginal distributions induced by p(x, y). We formulate this problem as a sub-task of learning the conditional distribution p(x|y), by exploiting the identity p(x|y) = A(x, y) * p(x). We propose a two-step estimation method based on the MDL (minimum description length) principle, which works as follows: it first estimates p(x) as p1 using MDL, and then estimates p(x|y) for a fixed y by applying MDL to the hypothesis class {A * p1 | A ∈ B}, for some given class B of representations for the association norm. The estimate of A is therefore obtained as a side effect of a near-optimal estimation of p(x|y). We then apply this general framework to the problem of acquiring case-frame patterns. We assume that both p(x) and A(x, y), for a given y, are representable by a model based on a classification that exists within an existing thesaurus tree as a 'cut,' and hence p(x|y) is represented as the product of a pair of 'tree cut models.' We then devise an efficient algorithm that implements our general strategy. We tested our method by using it to acquire case-frame patterns and conducted disambiguation experiments using the acquired knowledge. The experimental results show that our method improves upon existing methods.
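The definition of the association norm and the identity p(x|y) = A(x, y) * p(x) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: it uses plain relative-frequency estimates rather than the MDL-based two-step estimation and tree cut models described above, and the function name `association_norm` and the (noun, verb) toy pairs are hypothetical, not taken from the paper.

```python
from collections import Counter


def association_norm(pairs):
    """Estimate A(x, y) = p(x, y) / (p(x) * p(y)) from observed (x, y) pairs.

    Assumption: probabilities are plain relative frequencies. This is NOT
    the MDL estimator or the tree cut model of the abstract.
    """
    n = len(pairs)
    p_xy = {xy: c / n for xy, c in Counter(pairs).items()}
    p_x = {x: c / n for x, c in Counter(x for x, _ in pairs).items()}
    p_y = {y: c / n for y, c in Counter(y for _, y in pairs).items()}
    A = {(x, y): p / (p_x[x] * p_y[y]) for (x, y), p in p_xy.items()}
    return A, p_x, p_y, p_xy


# Hypothetical toy data: (noun, verb) co-occurrences.
pairs = (
    [("bird", "fly")] * 3
    + [("plane", "fly")] * 1
    + [("bird", "eat")] * 2
    + [("person", "eat")] * 4
)

A, p_x, p_y, p_xy = association_norm(pairs)

# Check the identity p(x|y) = A(x, y) * p(x) for every observed pair.
for (x, y), a in A.items():
    p_x_given_y = p_xy[(x, y)] / p_y[y]
    assert abs(p_x_given_y - a * p_x[x]) < 1e-12
```

A value of A(x, y) above 1 means that x co-occurs with y more often than independence would predict, and a value below 1 means less often. This is why estimating p(x|y) well for a fixed y, given an estimate of p(x), yields A as a by-product.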
Abe et al. studied this question.