Symbiotic stars are interacting binary systems composed of a red giant transferring material to a hot compact star, typically a white dwarf. These systems are crucial for studying stellar evolution, accretion processes, mass transfer, and a variety of complex astrophysical phenomena. However, there is a significant discrepancy between the number of confirmed symbiotic stars and the estimated population in the Milky Way (1.2 × 10³ to 1.5 × 10⁴), suggesting that a large fraction remains undetected. To address this issue, we propose the identification of new symbiotic stars through the application of machine-learning techniques. Our approach combines multiband photometric data from Gaia DR3, 2MASS, and WISE, together with parallax measurements and the pseudo-equivalent width of Hα, to effectively distinguish symbiotic candidates from other stellar populations. We trained a random forest model using a sample of 166 confirmed S-type symbiotic stars and a control sample of 1,600 nonsymbiotic stars. To mitigate class imbalance and improve the classification performance, we applied the synthetic minority oversampling technique (SMOTE). The model achieved an F₁ score of 89% for the symbiotic class. We applied our model to a catalog of approximately 2.5 million stars selected based on photometric colors consistent with those of S-type symbiotic stars. We identified 990 candidates in this sample with a classification probability of at least 70%. To refine the selection, we applied statistically and physically motivated cuts based on effective temperature, surface gravity, and metallicity, and complemented these cuts with SkyMapper photometry. This process yielded 12 high-confidence candidates, characterized by cool temperatures, low surface gravities, solar-like metallicity, Hα emission, luminosities ranging from moderate to high, and ultraviolet excesses consistent with the properties of S-type symbiotic systems.
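The oversampling step mentioned above can be illustrated with a minimal, standard-library sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the segment joining a real minority sample to one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, the neighbour count `k`, the seed, and the toy feature vectors are assumptions made purely for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority samples.

    Hypothetical helper: `minority` is a list of equal-length feature
    vectors (for example, photometric colours) of the rare class.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")

    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # fractional position along the segment base -> mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Toy minority class with two made-up features per star (illustrative values).
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_synthetic = smote(toy_minority, n_synthetic=6, k=2, seed=1)
print(len(toy_synthetic), "synthetic samples generated")
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class. How many synthetic samples to generate is a design choice that the text does not specify; for instance, fully balancing 166 symbiotic stars against 1,600 control stars would require 1,434 synthetic samples, but that target ratio is an assumption here.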
To evaluate the model performance, we applied it to a validation set of symbiotic stars recently confirmed in the literature. We recovered 92.3% of them. This result supports the effectiveness and generalizability of our classification approach.
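The quantities quoted above (an F₁ score for the symbiotic class, a 70% probability cut, and a recovery fraction on known systems) can be made concrete with a short standard-library sketch. The function names, the toy labels, and the toy probabilities are hypothetical and are not taken from the paper. The recovery fraction is assumed here to be the recall computed on a set that contains only confirmed symbiotic stars.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def select_candidates(probabilities, threshold=0.70):
    """Indices of sources whose class probability is at least `threshold`."""
    return [i for i, p in enumerate(probabilities) if p >= threshold]


# Toy labels: 1 = symbiotic, 0 = nonsymbiotic (made-up values).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(toy_true, toy_pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")

# Toy classifier probabilities for four unlabeled sources (made-up values).
toy_probs = [0.91, 0.70, 0.69, 0.12]
print("candidate indices:", select_candidates(toy_probs))
```

F₁ is the harmonic mean of precision and recall, so it penalizes a classifier that reaches high recall only by flagging many nonsymbiotic stars, which matters when the positive class is rare.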
Rojas et al. (Mon,) studied this question.