Abstract

Large Language Models (LLMs) exhibit In-Context Learning (ICL), which enables a model to perform new tasks by conditioning only on the examples provided in the context, without updating its weights. While ICL offers fast adaptation across natural language tasks and domains, its emergence is less straightforward for modalities beyond text. In this work, we systematically uncover properties present in LLMs that support the emergence of ICL in autoregressive models across various modalities by promoting the learning of the mechanisms that ICL requires. We identify exact token repetitions in the training data sequences as an important factor for ICL. Such repetitions further improve stability and reduce transiency in ICL performance. We analyse in detail the training dynamics of such data sequences and explain how token repetitions enhance the ICL learning mechanisms. Moreover, we emphasise the importance of training task difficulty for the emergence of ICL. Finally, by applying our insights on ICL emergence, we unlock ICL capabilities across various visual datasets used for few-shot classification, and confirm that these insights generalise to much harder real-world examples of large-scale object classification and to a more challenging EEG classification task. Code is available at https://github.com/jelenab98/unlocking_icl
Bratulić et al. studied this question.
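To make the notion of "exact token repetitions in the training data sequences" concrete, the following is a minimal sketch of how a few-shot training sequence with and without repeated exemplars might be constructed. It is not the authors' data pipeline: the function names `build_sequence` and `repetition_rate`, the integer-id encoding of exemplars, and the choice to repeat one exemplar per class inside the context are all assumptions made purely for illustration.

```python
import random


def build_sequence(num_classes, shots, repeat, rng):
    """Build one toy few-shot sequence: shuffled (exemplar, label) pairs plus a query.

    Exemplars are integer ids standing in for tokens. As an illustrative
    convention (an assumption, not taken from the paper), class c owns the
    ids c*1000 .. c*1000+999.
    """
    context = []
    for c in range(num_classes):
        if repeat:
            # One exemplar per class, copied `shots` times: exact repetition.
            exemplar = c * 1000 + rng.randrange(1000)
            exemplars = [exemplar] * shots
        else:
            # `shots` distinct exemplars drawn from the same class.
            exemplars = [c * 1000 + i for i in rng.sample(range(1000), shots)]
        context.extend((e, c) for e in exemplars)
    rng.shuffle(context)

    # The query is a fresh draw from one of the classes present in the context.
    query_class = rng.randrange(num_classes)
    query = (query_class * 1000 + rng.randrange(1000), query_class)
    return context, query


def repetition_rate(context):
    """Fraction of context exemplars that duplicate another exemplar."""
    tokens = [e for e, _ in context]
    return 1 - len(set(tokens)) / len(tokens)


rng = random.Random(0)
repeated_ctx, _ = build_sequence(num_classes=2, shots=4, repeat=True, rng=rng)
distinct_ctx, _ = build_sequence(num_classes=2, shots=4, repeat=False, rng=rng)
print(repetition_rate(repeated_ctx), repetition_rate(distinct_ctx))
```

In this toy setup the `repeat` flag is the only difference between the two sequences, which gives a simple way to compare training runs that differ only in how often exact copies appear in the context.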