This study compares the performance of five machine learning algorithms (Linear Regression, Decision Tree, Random Forest, Gradient Boosting, and Support Vector Machine) in predicting sugarcane growth from soil nutrients (NPK), pH, temperature, humidity, and soil type. The dataset was preprocessed through outlier removal and missing value imputation to ensure data quality. Models were trained and validated using distinct data splits, and their performance was evaluated using R², Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). The results indicate that both the Random Forest and Gradient Boosting models achieved highly accurate predictions, with high R² values and low error rates across the training, validation, and testing sets. Random Forest achieved an R² of 0.99 on the training set, 0.94 on the validation set, and 0.93 on the testing set. In comparison, Gradient Boosting had a slightly lower training R² of 0.98 and reached a validation R² of 0.94 and a testing R² of 0.93, the same as Random Forest at the reported two-decimal precision. Random Forest may therefore be slightly more prone to overfitting, since its R² drops further between training and the validation and testing sets (0.99 to 0.93, against 0.98 to 0.93 for Gradient Boosting). Gradient Boosting, on the other hand, showed the smaller gap between training and unseen data, which indicates better generalization. These findings suggest that the Gradient Boosting model is an efficient and reliable choice for developing a prediction tool for real-world applications, particularly sugarcane growth simulation and fertilizer optimization for improved crop management.
ALEGIA et al. (Wed,) studied this question.
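The evaluation metrics named in the abstract above, and the train-to-test gap used there to argue about overfitting, can be illustrated with a minimal sketch. It uses the standard textbook definitions of R², MSE, RMSE, MAE, and MAPE; the study does not state its exact implementation, so the function names (`regression_metrics`, `generalization_gap`) and the sample values `y_true` and `y_pred` are hypothetical. Only the R² figures in `reported_r2` are taken from the abstract.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MSE, RMSE, MAE and MAPE (in percent) for paired values.

    Assumes no true value is zero (MAPE is undefined otherwise) and that
    y_true is not constant (R2 is undefined otherwise).
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    mse = ss_res / n
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


def generalization_gap(scores):
    """Training R2 minus the lower of the validation and testing R2."""
    return round(scores["train"] - min(scores["val"], scores["test"]), 2)


# Hypothetical growth values, for illustration only (not data from the study).
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(y_true, y_pred)

# R2 values as reported in the abstract.
reported_r2 = {
    "Random Forest": {"train": 0.99, "val": 0.94, "test": 0.93},
    "Gradient Boosting": {"train": 0.98, "val": 0.94, "test": 0.93},
}
gaps = {name: generalization_gap(s) for name, s in reported_r2.items()}

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
    for name, gap in gaps.items():
        print(f"{name}: train-to-held-out R2 gap = {gap}")
```

A smaller gap means training performance carries over to unseen data more closely. At two-decimal precision the difference between the two models is small, so the gap alone is a rough indicator rather than a formal test of overfitting.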