March 14, 2024 · Open Access

Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training

Abstract

We explore the training dynamics of neural networks in a structured non-IID setting where documents are presented cyclically in a fixed, repeated sequence. Typically, networks suffer from catastrophic interference when training on a sequence of documents; however, we discover a curious and remarkable property of LLMs fine-tuned sequentially in this setting: they exhibit anticipatory behavior, recovering from the forgetting on documents before encountering them again. The behavior emerges and becomes more robust as the architecture scales up its number of parameters. Through comprehensive experiments and visualizations, we uncover new insights into training over-parameterized networks in structured environments.
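The cyclic setting described in the abstract can be sketched as a simple training loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `cyclic_schedule`, `track_probe_loss`, `train_step`, and `eval_loss` are hypothetical, and the model update and loss computation are left to caller-supplied functions.

```python
from typing import Any, Callable, Iterator, List, Sequence, Tuple


def cyclic_schedule(num_docs: int, num_epochs: int) -> Iterator[Tuple[int, int]]:
    """Yield (epoch, doc_index) pairs, visiting documents in the same fixed order each epoch."""
    for epoch in range(num_epochs):
        for doc_index in range(num_docs):
            yield epoch, doc_index


def track_probe_loss(
    docs: Sequence[Any],
    num_epochs: int,
    train_step: Callable[[Any], None],   # hypothetical: one fine-tuning update on a document
    eval_loss: Callable[[Any], float],   # hypothetical: current loss on a document
    probe_index: int = 0,
) -> List[float]:
    """Fine-tune on documents cyclically, recording the probe document's loss before every step."""
    history: List[float] = []
    for _epoch, doc_index in cyclic_schedule(len(docs), num_epochs):
        # Measure the probe document first, then train on the current document.
        history.append(eval_loss(docs[probe_index]))
        train_step(docs[doc_index])
    return history
```

In a loop like this, catastrophic interference would appear as the probe loss rising after the probe document is trained on and other documents follow. The anticipatory recovery reported in the abstract would appear as that loss falling again before the probe document comes back around in the cycle.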

Cite this study

Yang et al. (2024) studied this question.

www.synapsesocial.com/papers/68e7420ab6db6435876bb819 — DOI: https://doi.org/10.48550/arxiv.2403.09613

Authors

Yanlai Yang

Matt Jones

Michael C. Mozer
