• Created a large-scale experimental dataset of 608 TPSC device structures
• Analysed 150 synthesis and material features for performance enhancement
• Random Forest model achieved a training R² of 0.987
• SHAP, PDP, and ICE plots were used to identify the key fabrication parameters

Tin-based perovskite solar cells (TPSCs) have shown significant potential as lead-free perovskite solar cells (PSCs). However, TPSCs currently struggle to reach high power conversion efficiency (PCE) because of rapid crystallisation and the oxidation of Sn²⁺ to Sn⁴⁺. This study presents a novel machine learning (ML) framework that analyses a large-scale, manually extracted experimental dataset and optimises fabrication parameters to enhance TPSC performance. The dataset is compiled from nearly 1,200 research articles and includes 150 selected features and four target labels: PCE, open-circuit voltage (V_OC), short-circuit current (J_SC), and fill factor (FF). The framework includes a benchmarking step that compares multiple ML models to select the most suitable one. The Random Forest model outperformed the other ML models, achieving a training coefficient of determination (R²) of 0.987. Five-fold cross-validation is included in the framework to assess the model's generalisation. SHAP (Shapley Additive Explanations), Partial Dependence Plots (PDPs), and Individual Conditional Expectation (ICE) plots identified anti-solvent, electron transport layer (ETL) spin speed, and annealing temperature as the dominant features. The framework provides recommendations for enhancing TPSC performance, helping to bridge the gap towards commercialising lead-free, high-performance PSCs.
Muppana et al. (Sun,) studied this question.
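The abstract reports a training R² and, separately, a five-fold cross-validation to assess generalisation. The sketch below illustrates why the two are kept apart. It is not the authors' pipeline: it uses synthetic one-feature data, and a 1-nearest-neighbour regressor stands in for the Random Forest so that the example needs only the Python standard library. All function names (`r2_score`, `k_fold_indices`, `predict_1nn`, `cross_val_r2`) are hypothetical helpers written for this illustration.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def predict_1nn(train_x, train_y, x):
    """Stand-in model (assumption): return the target of the nearest training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

def cross_val_r2(xs, ys, k=5, seed=0):
    """Mean R² over k folds; each fold is scored by a model that never saw it."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        tx = [xs[i] for i in train_idx]
        ty = [ys[i] for i in train_idx]
        preds = [predict_1nn(tx, ty, xs[i]) for i in held_out]
        scores.append(r2_score([ys[i] for i in held_out], preds))
    return sum(scores) / k

# Synthetic data (assumption): one feature, a linear trend plus Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [0.5 * x + rng.gauss(0, 1.0) for x in xs]

# Training R²: the model is scored on the same points it memorised.
train_r2 = r2_score(ys, [predict_1nn(xs, ys, x) for x in xs])
# Cross-validated R²: the model is scored only on held-out points.
cv_r2 = cross_val_r2(xs, ys, k=5)

print(f"train R2 = {train_r2:.3f}, 5-fold CV R2 = {cv_r2:.3f}")
```

A model that memorises its training set scores perfectly on it yet noticeably worse on held-out folds, so a high training R² alone says little about generalisation; the cross-validated score is the more informative figure.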