Abstract

Background: Intestinal stricturing and penetrating complications are major drivers of unfavorable outcomes in Crohn’s disease (CD), and more than half of patients develop at least one of these complications during long-term follow-up. However, conventional clinical tools have limited ability to predict progression of disease behavior, and reliable early-warning markers have not been established. This study aimed to develop an AI-based multimodal model that integrates easily accessible clinical characteristics and longitudinal blood-test data to predict disease-behavior transition in long-standing CD.

Methods: We retrospectively included CD patients with ≥5 years of continuous follow-up between 2015 and 2025. A total of 36 clinical variables and 50 blood-test features (24 routine blood test, 24 biochemical, and 2 inflammatory-marker features) were collected. After preprocessing, 289 eligible patients were used for model development. We benchmarked classical machine-learning algorithms (Logistic Regression, Random Forest, XGBoost, LightGBM, CatBoost, Naive Bayes) and further developed a variational autoencoder (VAE)-based deep-learning model that jointly optimized reconstruction loss, KL divergence, and classification loss to learn compact latent patient representations. Model interpretability was assessed using SHAP values. External validation was performed using two independent datasets.

Results: Among 289 patients, 162 were classified as Montreal B1 at baseline (91 remained B1 during follow-up, 71 transitioned), and 127 were classified as B2 at baseline (58 remained stable, 69 transitioned). Models combining clinical and blood-test features consistently outperformed models using either feature set alone.
The VAE-based model achieved the highest predictive performance. For prediction from B1 baseline: AUC 0.8690, accuracy (Acc) 0.8438, sensitivity (Sn) 0.8571, specificity (Sp) 0.8333, PPV 0.8000, NPV 0.8824, MCC 0.6900, F1 score 0.8276. For prediction from B2 baseline: AUC 0.8512, Acc 0.8077, Sn 0.9286, Sp 0.6667, PPV 0.7647, NPV 0.8889, MCC 0.6000, F1 score 0.8387. SHAP analysis showed that biologic therapy exposure, markers of systemic inflammation, and disease-activity indices were the strongest contributors to model predictions. Performance remained robust across two external validation cohorts.

Conclusion: We developed an AI-driven multimodal model that predicts Crohn’s disease behavior progression up to one year before clinical transition using routine clinical and blood-test data. This approach may support early risk identification, proactive therapeutic adjustment, and personalized disease management in patients with long-standing CD.

Conflict of interest: Mr. Dai, Hanqiao: no conflict of interest; Zhang, Zhe: no conflict of interest; Wan, Zhaoman: no conflict of interest; Bai, Xiaoyin: no conflict of interest; Zhang, Peng: no conflict of interest; Yang, Hong: no conflict of interest.
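The joint objective named in the Methods (reconstruction loss, KL divergence, and classification loss) can be sketched for a single patient as below. This is a minimal illustration, not the authors' implementation: the abstract does not state the form of each term or how they are weighted, so the mean-squared-error reconstruction term, the standard-normal prior, the binary cross-entropy term, and the weights `beta` and `gamma` are all assumptions, and every function name is hypothetical.

```python
import math


def reconstruction_loss(x, x_hat):
    """Mean squared error between input features and their reconstruction (assumed form)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def kl_divergence(mu, logvar):
    """Closed-form KL(N(mu, diag(exp(logvar))) || N(0, I)) for a diagonal Gaussian."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


def classification_loss(y_true, p_pred, eps=1e-12):
    """Binary cross-entropy for one label (1 = behavior transition, 0 = stable)."""
    p = min(max(p_pred, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def joint_loss(x, x_hat, mu, logvar, y_true, p_pred, beta=1.0, gamma=1.0):
    """Weighted sum of the three terms; beta and gamma are hypothetical weights."""
    return (
        reconstruction_loss(x, x_hat)
        + beta * kl_divergence(mu, logvar)
        + gamma * classification_loss(y_true, p_pred)
    )
```

In a trained model, `x_hat`, `mu`, `logvar`, and `p_pred` would come from the decoder, the encoder, and a classifier head on the latent code; here they are plain lists and floats so that each term can be inspected in isolation.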
Dai et al. studied this question.
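The threshold-based metrics reported in the Results (Acc, Sn, Sp, PPV, NPV, MCC, F1) all derive from a single 2×2 confusion matrix. The sketch below shows the standard definitions. The counts in the usage example are hypothetical and are not taken from the study, which does not report its confusion matrices in the abstract. AUC is omitted because it requires the ranked prediction scores rather than thresholded counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Acc": (tp + tn) / total,            # overall accuracy
        "Sn": tp / (tp + fn),                # sensitivity (recall)
        "Sp": tn / (tn + fp),                # specificity
        "PPV": tp / (tp + fp),               # positive predictive value (precision)
        "NPV": tn / (tn + fn),               # negative predictive value
        "F1": 2 * tp / (2 * tp + fp + fn),   # harmonic mean of PPV and Sn
        "MCC": (tp * tn - fp * fn) / mcc_denominator,  # Matthews correlation coefficient
    }


# Hypothetical counts for illustration only (not from the study).
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because MCC uses all four cells of the matrix, it can remain moderate even when sensitivity is high, which is why it is often reported alongside accuracy and F1 for modest or imbalanced test sets.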