Accurate effort estimation and early risk detection are critical to the success of software projects, because inaccurate forecasts can lead to schedule overruns, inefficient resource allocation, and unmet requirements. This study investigates the use of machine learning techniques to support task-level effort prediction and proactive risk identification in software project management. An applied case study is conducted on a simulated dataset of 500 software development tasks, each described by planning, technical, and team-related features. Two ensemble-based regression models, Gradient Boosting and Random Forest, are evaluated for predicting actual task duration. Model performance is assessed using standard metrics: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the coefficient of determination (R²). To enable early risk detection, prediction errors are transformed into deviation-based indicators, and threshold-based classifiers are used to identify tasks with moderate (>20%) and severe (>30%) schedule overruns. Confusion matrices and classification metrics are used to evaluate the effectiveness of the proposed alerting mechanism, and the distribution of high-risk tasks across sprint quantiles is analyzed to support managerial decision-making.
Catana et al. studied this question.
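The evaluation and alerting steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `regression_metrics` and `risk_level` are hypothetical, the sample durations are made up for illustration, and the deviation is assumed to be the relative gap between actual and predicted duration, which the abstract does not specify. Only the 20% and 30% thresholds come from the text.

```python
import math


def regression_metrics(actual, predicted):
    """Return (MAE, RMSE, R^2) for paired lists of task durations."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


def risk_level(actual, predicted, moderate=0.20, severe=0.30):
    """Classify a task by relative deviation (assumed definition).

    deviation = (actual - predicted) / predicted
    """
    deviation = (actual - predicted) / predicted
    if deviation > severe:
        return "severe"
    if deviation > moderate:
        return "moderate"
    return "normal"


# Illustrative (made-up) durations in hours, not data from the study.
actual_hours = [10.0, 12.5, 13.5, 8.0]
predicted_hours = [10.0, 10.0, 10.0, 10.0]

mae, rmse, r2 = regression_metrics(actual_hours, predicted_hours)
levels = [risk_level(a, p) for a, p in zip(actual_hours, predicted_hours)]
print(f"MAE={mae:.2f} RMSE={rmse:.2f} R2={r2:.2f}")
print(levels)
```

In practice the predictions would come from the fitted Gradient Boosting or Random Forest regressor, and the resulting labels would be compared against observed overruns in a confusion matrix.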