March 12, 2025

Autoregressive Language Model with Historical Context Re-encoding

Abstract

Current large language model applications are built on generative language models, which typically generate tokens autoregressively. This approach has two key limitations: the unidirectional causal attention mechanism restricts semantic expressiveness, and the deep decoder slows decoding. To address these issues, we introduce the autoregressive language model with historical context re-encoding (HCR). Our method improves the encoding of historical tokens by periodically re-encoding newly generated tokens. The model pairs a history encoder with a relatively shallow decoder that handles short-segment decoding. This architecture improves generation quality, accelerates decoding, and operates efficiently in both generation and comprehension modes. Comprehensive experiments show that HCR significantly outperforms standard autoregressive models across a range of language comprehension and generation tasks, with an average performance gain of over 2.3% and a 1.3x improvement in decoding speed.


Cite This Study

Yimeng Zhuang studied this question.

synapsesocial.com/papers/6a11f09316bc81049accbfe6 https://doi.org/10.1109/icassp49660.2025.10890165
