Foundational models have emerged as a powerful paradigm in deep learning. Their strength lies in learning robust representations from large-scale datasets and generalizing to diverse downstream applications, such as classification. In this paper, we present Astromer 2, a foundational model for extracting light curve embeddings and an enhanced iteration of our self-supervised model for light curve analysis. We highlight the advantages of its pretrained embeddings, compare its performance with that of its predecessor, Astromer 1, and provide a detailed empirical analysis of its capabilities, offering deeper insight into the model's representations. Astromer 2 is pretrained on 1.5 million single-band light curves from the MACHO survey using a self-supervised task that predicts randomly masked observations within each sequence. Fine-tuning on a smaller labeled dataset allows us to assess its performance in classification tasks. We measure the quality of the embeddings by the F1 score of a multilayer perceptron (MLP) classifier trained on Astromer-generated embeddings. Our results demonstrate that Astromer 2 significantly outperforms Astromer 1 across all evaluated scenarios, including limited datasets of 20, 100, and 500 samples per class. The use of weighted per-sample embeddings, which integrate intermediate representations from Astromer's attention blocks, is particularly impactful. Notably, Astromer 2 achieves a 15% improvement in F1 score on the ATLAS dataset compared to prior models, showcasing robust generalization to new datasets. This enhanced performance, especially with minimal labeled data, underscores the potential of Astromer 2 for more efficient and scalable light curve analysis.
Donoso-Oliva et al. (Wed,) studied this question.
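To make the two central ideas of the abstract concrete, the following is a minimal sketch of (i) hiding randomly chosen observations and scoring a reconstruction only on the hidden points, and (ii) combining per-block representations into a single weighted embedding. It is an illustration under stated assumptions, not the Astromer 2 implementation: the function names, the masking fraction, the zero fill value, the RMSE loss, and the softmax weighting are all hypothetical choices made here for clarity.

```python
import math
import random


def mask_observations(magnitudes, mask_fraction=0.5, seed=None):
    """Hide a random subset of observations in one non-empty light curve.

    mask_fraction and the 0.0 fill value are illustrative assumptions,
    not values taken from the paper.
    """
    rng = random.Random(seed)
    n = len(magnitudes)
    n_masked = max(1, round(mask_fraction * n))
    masked = set(rng.sample(range(n), n_masked))
    is_masked = [i in masked for i in range(n)]
    model_input = [0.0 if hidden else m
                   for m, hidden in zip(magnitudes, is_masked)]
    return model_input, is_masked


def masked_rmse(predictions, targets, is_masked):
    """Root-mean-square error over the hidden observations only (assumed loss)."""
    errors = [(p - t) ** 2
              for p, t, hidden in zip(predictions, targets, is_masked)
              if hidden]
    return math.sqrt(sum(errors) / len(errors))


def weighted_embedding(block_embeddings, scores):
    """Combine one vector per attention block into a single embedding.

    scores are raw per-block weights, normalised here with a softmax
    (an assumption about how the weighting could be done).
    """
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(block_embeddings[0])
    return [sum(w * vec[d] for w, vec in zip(weights, block_embeddings))
            for d in range(dim)]
```

In this sketch, a model would receive `model_input` and be trained to minimise `masked_rmse` against the original magnitudes. The vector returned by `weighted_embedding` would then serve as the per-sample input to a downstream classifier such as the MLP mentioned above.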