Designing carbon dots (CDs) with tailored optical properties is a significant challenge in materials science, often hindered by complex synthetic protocols and nonlinear synthesis-property relationships. To accelerate this process, we present a data-driven approach that uses machine learning to predict the photoluminescent emission of CDs from their synthesis parameters. We used a systematic experimental data set of 407 CD syntheses prepared from p-benzoquinone and ethylenediamine in different solvents. We performed a comparative analysis of three ensemble learning algorithms: Random Forest, XGBoost, and CatBoost. The results demonstrate that CatBoost provides the highest predictive accuracy, achieving a mean cross-validation coefficient of determination (R²) of approximately 0.98 and outperforming both Random Forest and XGBoost. These findings highlight the efficacy of gradient boosting algorithms, particularly CatBoost, in modeling systematic chemical data and provide a validated computational tool to guide the efficient, on-demand synthesis of functional nanomaterials.
Duman et al. studied this question.
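The model comparison rests on a mean cross-validation R², so a minimal sketch of how that metric can be computed may help. The sketch below is an illustration under stated assumptions, not the authors' pipeline: the function names (`r2_score`, `fit_line`, `cross_val_r2`) are hypothetical, the data are synthetic, the fold count of 5 is assumed because the text does not state it, and a single-feature least-squares line stands in for the ensemble regressors so that the example needs only the Python standard library.

```python
import random
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Stand-in model (assumption): one-feature ordinary least squares.

    Returns a function that predicts y for a single x.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def cross_val_r2(xs, ys, fit, k=5, seed=0):
    """Return the R² score of each of k held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)          # shuffle once, reproducibly
    folds = [idx[i::k] for i in range(k)]     # k disjoint folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        y_true = [ys[i] for i in held_out]
        y_pred = [predict(xs[i]) for i in held_out]
        scores.append(r2_score(y_true, y_pred))
    return scores


# Synthetic data for illustration only; not the 407-synthesis data set.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [450 + 12 * x + rng.gauss(0, 2) for x in xs]

scores = cross_val_r2(xs, ys, fit_line, k=5)
mean_r2 = mean(scores)
```

Each candidate model would be passed through the same folds in place of `fit_line`, and the resulting `mean_r2` values compared. Scoring only on held-out folds is what makes the figure an estimate of predictive accuracy on unseen syntheses and not a measure of fit to the training data.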