December 23, 2002

A variable-length category-based n-gram language model

Abstract

A language model based on word-category n-grams and ambiguous category membership is presented, with n increased selectively to trade compactness for performance. The use of categories leads intrinsically to a compact model that can generalise to unseen word sequences, and it reduces the sparseness of the training data, thereby making larger n feasible. The language model implicitly involves a statistical tagging operation, which may be used explicitly to assign categories to untagged text. Experiments on the LOB corpus show that the optimal model-building strategy yields improved results with respect to conventional n-gram methods, and when used as a tagger, the model performs well in relation to a standard benchmark.
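
To make the idea of category n-grams with ambiguous membership concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: it uses a fixed category bigram rather than the variable-length n described above, estimates probabilities by unsmoothed relative frequency, and assumes a tagged training corpus. All names (`train`, `step`, `next_word_prob`, `TOY_CORPUS`) and the toy data are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy corpus of (word, category) pairs; "runs" belongs to two categories.
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("runs", "NOUN"), ("end", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]

def train(tagged_sentences):
    """Estimate P(category | previous category) and P(word | category)
    by relative frequency (no smoothing; an assumption of this sketch)."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo-category
        for word, cat in sentence:
            trans[prev][cat] += 1
            emit[cat][word] += 1
            prev = cat

    def normalise(table):
        return {
            key: {x: n / sum(row.values()) for x, n in row.items()}
            for key, row in table.items()
        }

    return normalise(trans), normalise(emit)

def step(belief, word, trans, emit):
    """Advance a distribution over the previous category by one observed word,
    summing over every category the word may belong to."""
    new = defaultdict(float)
    for prev, p_prev in belief.items():
        for cat, p_trans in trans.get(prev, {}).items():
            p_emit = emit.get(cat, {}).get(word, 0.0)
            if p_emit:
                new[cat] += p_prev * p_trans * p_emit
    return dict(new)

def next_word_prob(word, history, trans, emit):
    """P(word | history) under the category bigram model."""
    belief = {"<s>": 1.0}
    for w in history:
        belief = step(belief, w, trans, emit)
        total = sum(belief.values())
        if total == 0.0:
            return 0.0  # history is impossible under the unsmoothed model
        belief = {c: p / total for c, p in belief.items()}
    return sum(step(belief, word, trans, emit).values())

trans, emit = train(TOY_CORPUS)
# "a dog" never occurs in the toy corpus, yet the model still scores what follows it.
p_unseen = next_word_prob("sleeps", ["a", "dog"], trans, emit)
```

The intermediate `belief` is the implicit statistical tagging step: after each word it holds a normalised distribution over the categories that word might have, and the same quantities could be used with a Viterbi search to tag untagged text.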

Cite This Study

Niesler et al. studied this question.

synapsesocial.com/papers/6a20769721cb8130ca780371 https://doi.org/10.1109/icassp.1996.540316