Soiling in photovoltaic (PV) systems is a recurring problem that reduces energy generation and demands efficient operation and maintenance (O&M) strategies. This paper proposes a machine learning-based approach that identifies soiling levels and generates cleaning alerts from operational and weather data. The models were first evaluated with decision thresholds ranging from 0.5 to 0.7, using only operational features. Weather features were then added, which improved performance and enabled the selection of the best models for the exhaustive feature search step: Extra Trees, Histogram-based Gradient Boosting, Extreme Gradient Boosting, and Random Forest. The exhaustive search further improved model performance, as indicated by global metrics and ROC curves. The Extra Trees model with a threshold of 0.5 performed best and was selected as the final configuration, achieving an accuracy of 0.9884 and an AUC-ROC of 0.9957. Finally, the selected model was applied to determine daily soiling levels and trigger alerts based on temporal persistence, indicating its potential to support predictive O&M decisions and cleaning actions in PV systems.
Hammerschmitt et al. studied this question.
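The last step of the abstract, turning per-day classifier outputs into cleaning alerts through a decision threshold and temporal persistence, can be illustrated with a minimal sketch. The abstract does not give the exact persistence rule, so the rule used here (raise an alert once a soiled state has lasted a fixed number of consecutive days), the window of three days, the function names, and the sample probabilities are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def soiling_flags(probabilities: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Mark a day as soiled when the predicted probability reaches the threshold.

    The 0.5 default mirrors the threshold reported for the final model;
    the source evaluated thresholds from 0.5 to 0.7.
    """
    return [p >= threshold for p in probabilities]


def persistence_alerts(flags: Sequence[bool], min_consecutive_days: int = 3) -> List[int]:
    """Return the day indices at which a cleaning alert is raised.

    Assumed rule (not from the source): an alert fires on the day a run of
    consecutive soiled days first reaches `min_consecutive_days`.
    """
    alerts: List[int] = []
    run = 0
    for day, flagged in enumerate(flags):
        run = run + 1 if flagged else 0  # a clean day resets the run
        if run == min_consecutive_days:  # fire once per run
            alerts.append(day)
    return alerts


# Hypothetical daily soiling probabilities, e.g. from a classifier's predict_proba.
daily_probs = [0.2, 0.6, 0.7, 0.4, 0.8, 0.9, 0.75, 0.95]
flags = soiling_flags(daily_probs, threshold=0.5)
alerts = persistence_alerts(flags, min_consecutive_days=3)
print(alerts)
```

Requiring persistence before alerting trades a short delay for fewer false alarms from isolated misclassified days; the appropriate window would depend on the site and the cost of cleaning.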