Discovering new superconductors through traditional trial-and-error experiments is both challenging and costly. A data-driven strategy can speed up this time-consuming exploration and may also reveal new insights into the complex correlations at play. Several material databases have been created for predicting the superconducting critical temperature (Tc), but they are confined solely to superconducting materials, which limits their predictive capability. To address this, we compiled a comprehensive open-source dataset encompassing both superconductors and non-superconductors. Our dataset contains 13 415 superconductors and 13 425 non-superconductors, each characterized by 212 features. Novel features, namely CuO layers, ionic radius, heat of vaporization, cohesive energy, and thermal conductivity, were included with respect to every element in the material. A further feature, "material type", was introduced to classify materials as low-temperature superconductors, high-temperature superconductors, or non-superconductors. Various machine learning techniques, including boosting and bagging methods, were employed to predict Tc. The light gradient boosting model emerged as the most effective, achieving a coefficient of determination of 93% and a mean absolute error of 2.93 K. This novel, comprehensive material dataset is made available to enrich future research, and a web application has also been developed that predicts the Tc of any material in relation to its material type, based on the best of the trained models.
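For readers unfamiliar with the two reported metrics, the following minimal sketch shows how a coefficient of determination and a mean absolute error are conventionally computed from measured and predicted Tc values. It is illustrative only: the function names and the Tc values are made up for this example and are not taken from the paper's dataset or code.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical Tc values in kelvin (assumed for illustration, not real data).
# A non-superconductor is represented here with Tc = 0 K, which is an
# assumption of this sketch rather than a convention confirmed by the paper.
tc_measured = [0.0, 4.2, 9.3, 39.0, 92.0]
tc_predicted = [1.0, 3.2, 11.3, 36.0, 95.0]

mae = mean_absolute_error(tc_measured, tc_predicted)
r2 = r2_score(tc_measured, tc_predicted)
print(f"MAE = {mae:.2f} K, R^2 = {r2:.4f}")
```

A coefficient of determination reported as 93% corresponds to a value of 0.93 on this scale, and the mean absolute error carries the same unit as Tc (kelvin).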
Nawoda et al. studied this question.