7519 Background: Multiple myeloma (MM) is a malignant plasma cell disorder that often presents with non-specific symptoms, resulting in delayed diagnosis and poorer outcomes. While artificial intelligence (AI) models using routine laboratory data have shown promise for MM detection, their ability to distinguish monoclonal gammopathy of undetermined significance (MGUS) from early MM and to identify patients at risk of progression remains limited. Methods: We conducted a retrospective study using routinely collected clinical and laboratory data from 89 patients evaluated at the Lincolnshire Hospital Group between 2003 and 2025. The cohort included confirmed MM cases, MGUS patients, and non-myeloma controls. Predictive features included hemoglobin, serum creatinine, calcium, albumin, immunoglobulin levels, paraprotein concentration, and derived laboratory indices. After rigorous data cleaning and preprocessing, eight supervised models were trained and evaluated using a train-test split: six machine learning algorithms (Lasso, gradient boosting, random forest, elastic net, ridge regression, and support vector machine) and two baseline algorithms (Naïve (Mean) and Seasonal-Naïve). The training set included 69 patients (78%), and the test set included 20 patients (22%). Model performance was evaluated using precision, recall, and F1 score, together with the regression metrics R² (R-squared) and MAE (mean absolute error), with a specific focus on MGUS–MM differentiation. Results: The developed AI models demonstrated reliable performance in distinguishing MM from non-myeloma controls and showed improved discriminatory ability between MGUS and MM compared with standard laboratory threshold-based assessment. Specific laboratory feature patterns were associated with progression-consistent phenotypes, enabling risk stratification within the MGUS population.
The best-performing model, Lasso regression, achieved an MAE of 0.01 months and an R² of 1.00, a nearly perfect score. The model's best prediction had an error of 0.47 months, i.e., about 14 days from the actual transformation date. Gradient boosting and random forest achieved MAEs of 1.04 and 1.15 months and R² values of 0.90 and 0.95, respectively. The models were implemented as a prototype clinical decision-support tool capable of generating rapid risk predictions using routinely available laboratory data. Conclusions: This study advances existing AI-based diagnostic approaches by addressing the clinically important challenge of early MM detection and MGUS risk stratification. AI-driven analysis of routine laboratory data may support earlier identification of patients at risk of progression, enabling improved monitoring, timely referral, and enhanced clinical decision-making.
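For readers unfamiliar with the regression metrics reported above, the following is a minimal sketch of how MAE and R² might be computed for a model and for a naïve mean baseline. It is not the study's code: all function names and all numeric values are hypothetical and chosen only for illustration, and the 30.44 days-per-month conversion factor is an assumption rather than something stated in the abstract.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def naive_mean_forecast(y_train, n):
    """Baseline: predict the training-set mean for every test case."""
    return [mean(y_train)] * n


def months_to_days(months, days_per_month=30.44):
    """Convert months to days (assumed average month length)."""
    return months * days_per_month


# Hypothetical time-to-transformation values in months (not study data).
y_train = [12.0, 18.0, 24.0, 30.0]
y_test = [10.0, 20.0, 30.0]
model_pred = [10.5, 19.5, 30.5]  # hypothetical model output
baseline_pred = naive_mean_forecast(y_train, len(y_test))

print("model    MAE:", mean_absolute_error(y_test, model_pred))
print("model    R2 :", r_squared(y_test, model_pred))
print("baseline MAE:", mean_absolute_error(y_test, baseline_pred))
print("baseline R2 :", r_squared(y_test, baseline_pred))
print("0.47 months in days:", round(months_to_days(0.47)))
```

On a held-out set, a naïve mean baseline typically yields an R² near or below zero, which is why comparing each model against such a baseline helps show whether it has learned anything beyond the average.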
Rinaldi et al. (Thu,) studied this question.