Classifying variable stars such as Cepheids and RR Lyrae variables is a vital part of exploring the universe, because these stars serve as cosmic distance indicators through their period-luminosity relationship. Using machine learning, we aim to classify Type I Cepheids, Type II Cepheids, and RR Lyrae variables based on redshift, color indices, galactic coordinates, and metallicity. We first gathered a dataset from the SIMBAD database of approximately 6000 RR Lyrae stars and 2300 Cepheids, labeled in two classes (Cepheid and RR Lyrae). We then randomized the sample to avoid bias and plotted the data to identify useful relationships. The plots showed differences between the classes in metallicity, galactic coordinates, and redshift, so we included only these three features in our classification model. The Random Forest model reached an accuracy of 92%, and the Gradient Boosting model reached 89%. These results suggest strong links between these parameters and the star types they indicate. This study aims to enhance existing classification models by adding parameters that could improve their accuracy, and to provide more statistical insight into the information each parameter carries about star type. We recognize the limitations of our method, including the lack of hyperparameter tuning and the absence of confidence intervals on the reported accuracies.
Anand et al. (Wed,) studied this question.
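Because the two classes are unbalanced, it can help to compare the reported accuracies against a trivial baseline. The following is a minimal sketch, not the authors' pipeline: it uses only the approximate class counts quoted above (6000 RR Lyrae, 2300 Cepheids), shuffles stand-in labels, and measures how often a classifier that always predicts the majority class would be correct. The 80/20 split, the random seed, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter

# Approximate class counts quoted in the abstract.
N_RR_LYRAE = 6000
N_CEPHEID = 2300


def shuffled_labels(seed=0):
    """Return class labels in random order (stand-in for real catalogue rows)."""
    labels = ["RR Lyrae"] * N_RR_LYRAE + ["Cepheid"] * N_CEPHEID
    rng = random.Random(seed)  # fixed seed (assumed) for reproducibility
    rng.shuffle(labels)
    return labels


def train_test_split(labels, test_fraction=0.2):
    """Split an already shuffled list; the 80/20 ratio is an assumption."""
    n_test = int(len(labels) * test_fraction)
    return labels[n_test:], labels[:n_test]


def majority_baseline_accuracy(train, test):
    """Accuracy obtained by always predicting the most common training class."""
    majority_class, _ = Counter(train).most_common(1)[0]
    return sum(1 for y in test if y == majority_class) / len(test)


labels = shuffled_labels()
train, test = train_test_split(labels)
baseline = majority_baseline_accuracy(train, test)
print(f"majority-class baseline accuracy: {baseline:.3f}")
```

With these counts, always predicting RR Lyrae is correct for roughly 72% of the full sample (6000/8300), so the 92% and 89% figures are best read as improvements over that floor rather than over 50%.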