Summary
Machine learning (ML)-based methods have been widely used for predicting cementing quality, but only a few studies have explored feature interactions and ensemble models. In this paper, we propose a voting-based ensemble ML method, apply it to the identification and prediction of cementing quality, and evaluate its effectiveness. The results show that the ensemble model outperforms the individual models. Among the single models, random forest (RF) and extreme gradient boosting (XGBoost) achieved the highest prediction accuracy, 90.74%, whereas the ensemble model reached 91.67%. To determine how many base models to combine, we conducted ablation experiments on the ensemble size, which indicated that integrating four base models gives better results. After optimization, the ensemble model achieves a prediction accuracy of 94.44%, an increase of 4.63 percentage points over the initial model’s 89.81%. We also applied SHapley Additive exPlanations (SHAP) value analysis to selected features to identify the core factors affecting cementing quality. In addition, we evaluated the model on three field wells and found that it has good predictive ability and interpretability. This study not only verifies the ability of the SHAP method to evaluate cementing quality but also builds a bridge from the model to field application. The model can provide clear and interpretable guidance for optimizing on-site construction parameters, enabling precise decision-making and ultimately increasing the cementing success rate and reducing accident risks.
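As a rough illustration of the voting idea, the sketch below combines the class predictions of four base models by majority (hard) voting. The summary does not state whether hard or soft voting is used, so hard voting is an assumption here. The labels, the predictions, and the names `majority_vote`, `accuracy`, and `model_a` to `model_d` are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by hard voting."""
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")
    combined = []
    for sample_votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the most votes; on a tie,
        # the label that was encountered first (earliest-listed model) wins.
        label, _ = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: 1 = qualified cementing, 0 = unqualified cementing.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
model_a = [1, 0, 1, 0, 0, 1, 0, 1]
model_b = [1, 0, 0, 1, 0, 1, 1, 0]
model_c = [1, 1, 1, 1, 0, 0, 0, 0]
model_d = [0, 0, 1, 1, 0, 1, 0, 0]

ensemble = majority_vote([model_a, model_b, model_c, model_d])

for name, pred in [("A", model_a), ("B", model_b), ("C", model_c), ("D", model_d)]:
    print(f"model {name}: accuracy = {accuracy(y_true, pred):.3f}")
print(f"ensemble: accuracy = {accuracy(y_true, ensemble):.3f}")
```

In this toy data the base models err on different samples, so the vote corrects each individual mistake. That is the mechanism by which an ensemble can exceed its best member, although the size of the gain on real cementing data depends on how correlated the base models' errors are.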
Sun et al. (Sun,) studied this question.