Large, pre-trained language models (PLMs) such as BERT and GPT have drastically changed the Natural Language Processing (NLP) field. For numerous NLP tasks, approaches leveraging PLMs have achieved state-of-the-art performance. The key idea is to learn a generic, latent representation of language from a generic task once, then share it across disparate NLP tasks. Language modeling serves as the generic task, one with abundant self-supervised text available for extensive training. This article presents the key fundamental concepts of PLM architectures and a comprehensive view of the shift to PLM-driven NLP techniques. It surveys work applying the pre-training then fine-tuning, prompting, and text generation approaches. In addition, it discusses PLM limitations and suggested directions for future research.
Bonan Min
Hayley Ross
Elior Sulem
ACM Computing Surveys
Harvard University Press
University of Oregon
University of the Basque Country
Min et al. studied this question.
www.synapsesocial.com/papers/69d61b88bcbb69330b88b439 — DOI: https://doi.org/10.1145/3605943