March 1, 1992

An estimate of an upper bound for the entropy of English

Abstract

We present an estimate of an upper bound of 1.75 bits for the entropy of characters in printed English, obtained by constructing a word trigram model and then computing the cross-entropy between this model and a balanced sample of English text. We suggest the well-known and widely available Brown Corpus of printed English as a standard against which to measure progress in language modeling and offer our bound as the first of what we hope will be a series of steadily decreasing bounds.
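The measurement described in the abstract, scoring held-out text with a word trigram model and reporting the cost in bits per character, can be illustrated with a toy sketch. This is not the authors' model: the fixed interpolation weight `LAMBDA`, the uniform spelling cost charged to unknown words, and the function names `train_trigram`, `word_prob`, and `bits_per_char` are all assumptions made for illustration. The sketch also assumes lowercase text with words separated by single spaces.

```python
import math
from collections import Counter

LAMBDA = 0.5  # assumed fixed interpolation weight; not taken from the paper


def train_trigram(tokens):
    """Collect the n-gram and context counts needed for a trigram model."""
    vocab = set(tokens) | {"<unk>"}
    seq = ["<s>", "<s>"] + list(tokens)
    model = {"vocab": vocab, "total": len(tokens), "uni": Counter(),
             "bi": Counter(), "tri": Counter(),
             "ctx1": Counter(), "ctx2": Counter()}
    for i in range(2, len(seq)):
        w2, w1, w = seq[i - 2], seq[i - 1], seq[i]
        model["uni"][w] += 1
        model["bi"][(w1, w)] += 1
        model["tri"][(w2, w1, w)] += 1
        model["ctx1"][w1] += 1
        model["ctx2"][(w2, w1)] += 1
    return model


def word_prob(model, w2, w1, w):
    """Interpolate uniform, unigram, bigram, and trigram estimates of P(w | w2, w1)."""
    p = 1.0 / len(model["vocab"])
    p = (1 - LAMBDA) * p + LAMBDA * model["uni"][w] / model["total"]
    # Higher-order estimates are mixed in only when their context was observed,
    # so the result stays a properly normalized distribution over the vocabulary.
    if model["ctx1"][w1]:
        p = (1 - LAMBDA) * p + LAMBDA * model["bi"][(w1, w)] / model["ctx1"][w1]
    if model["ctx2"][(w2, w1)]:
        p = (1 - LAMBDA) * p + LAMBDA * model["tri"][(w2, w1, w)] / model["ctx2"][(w2, w1)]
    return p


def bits_per_char(model, text):
    """Cross-entropy of the model on `text`, in bits per character (spaces included)."""
    w2, w1 = "<s>", "<s>"
    bits = 0.0
    for tok in text.split():
        w = tok if tok in model["vocab"] else "<unk>"
        bits -= math.log2(word_prob(model, w2, w1, w))
        if w == "<unk>":
            # Crude stand-in for a spelling model (an assumption): each letter,
            # plus an end marker, is drawn uniformly from 27 symbols.
            bits += (len(tok) + 1) * math.log2(27)
        w2, w1 = w1, w
    return bits / len(text)


training_text = "the cat sat on the mat and the dog sat on the rug"
model = train_trigram(training_text.split())
print(bits_per_char(model, "the cat sat on the rug"))
```

Dividing the total word-level cost by the character count is what turns a word model into a per-character figure. A better model assigns higher probability to the test text and so yields a lower value, which is the sense in which successive bounds can decrease.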

Authors

Peter F. Brown

IBM (United States)

Vincent J. Della Pietra

IBM (United States)

Robert L. Mercer

IBM (United States)

Journals

Computational Linguistics

Institutions

IBM (United States)

Cite this study

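Brown, P. F., Della Pietra, V. J., and Mercer, R. L. (1992). An estimate of an upper bound for the entropy of English. Computational Linguistics.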

synapsesocial.com/papers/6a0f8a05d13714ec96fe4652 — DOI: https://doi.org/10.5555/146680.146685

Also consider

Five closely related works, for comparative context:

Variations on a theme by Ziv and Lempel · 1985 · 85 citations
Interpolated estimation of Markov source parameters from sparse data · 1980 · 825 citations
Elements of Information Theory · 2001 · 37,906 citations
Text Compression · 1990 · 988 citations
A Sandwich Proof of the Shannon-McMillan-Breiman Theorem · 1988 · 178 citations
