Dear Editor,
We have read with great interest the recent article by Zhu et al.1 titled “Development and validation of an explainable machine learning model for predicting occult lymph node metastasis in early-stage oral tongue squamous cell carcinoma: A multi-center study.” The authors addressed a clinically significant challenge in the management of early-stage oral tongue squamous cell carcinoma (OTSCC) – the prediction of occult lymph node metastasis (OLNM) – by developing and validating an explainable machine learning (ML) model using multicenter data. Their work represents an important advancement in the application of artificial intelligence (AI) to surgical oncology, particularly through the integration of SHapley Additive exPlanations (SHAP) for model interpretability and the development of a user-friendly online prediction platform. We commend the authors for their rigorous methodology, comprehensive validation, and commitment to clinical translation. We have been assured that the article complies with the TITAN Guidelines 2025 governing the declaration and use of AI2.
The study’s strengths are numerous. The use of multicenter data enhances the generalizability of the findings, and the external validation performed on an independent cohort from a different institution adds robustness to the results. The random forest (RF) model demonstrated exceptional performance, with area under the curve (AUC) values exceeding 0.9 in both internal and external validation sets, outperforming other ML models and traditional predictors such as depth of invasion (DOI) and tumor budding (TB). The application of SHAP analysis addresses the “black-box” nature of ML algorithms, providing both global and local interpretations that are invaluable for clinical adoption3. Furthermore, the deployment of an open-access online tool allows clinicians to use the model in real time, bridging the gap between research and practice.
Despite these commendable achievements, several aspects warrant further discussion to enhance the impact and applicability of this promising model.
First, while the RF model exhibited superior predictive performance, RF is itself a bagging ensemble, and exploring other ensemble or hybrid modeling approaches – such as stacking or boosting – might have offered even greater accuracy and stability. Such techniques often mitigate the limitations of individual algorithms and could potentially improve performance in heterogeneous patient populations. Future iterations of the model could consider incorporating these approaches to maximize predictive power.
Another consideration is the generalizability of the model across diverse ethnic and geographic populations. The study cohorts were derived from Chinese medical centers, which may limit the applicability of the model to regions with different genetic backgrounds, lifestyle factors, and healthcare practices. Validation in international cohorts, such as those from North American or European institutions, would strengthen the model’s global relevance and utility. Additionally, including socioeconomic and environmental variables could further refine the model’s adaptability to various clinical settings.
The reliance on postoperative pathological features – such as tumor budding and perineural invasion – represents a limitation for preoperative decision-making. While these variables are strong predictors, they are not available prior to surgery. Integrating preoperative data, such as imaging biomarkers from MRI or CT radiomics, could enhance the model’s utility in the preoperative setting4,5. For example, recent advances in radiomics have shown promise in predicting OLNM noninvasively6. Combining clinicopathological variables with imaging features might yield a more comprehensive tool for use before treatment planning.
Moreover, the study did not extensively discuss the potential impact of neoadjuvant therapies or treatment modifications on model performance. As treatment paradigms evolve – for instance, with the increasing use of immunotherapy or targeted therapies – the biological behavior of tumors and their metastatic potential may change7,8. Future studies could explore the model’s performance in patients receiving multimodal treatment and assess its adaptability to evolving clinical contexts.
Finally, the ethical implications of using ML models in clinical decision-making deserve attention. While the SHAP framework improves transparency, ensuring that clinicians understand and trust the model’s predictions is crucial for its adoption. Educational initiatives and guidelines on the interpretation of ML-based tools may be necessary to foster confidence among healthcare providers. Furthermore, patient perspectives on the use of such technologies should be considered to ensure that their values and preferences are respected in shared decision-making.
In conclusion, Zhu et al. have made a valuable contribution to the field of surgical oncology by developing a highly accurate and interpretable ML model for predicting OLNM in early-stage OTSCC. Their work lays a strong foundation for future research and clinical application. By addressing the points raised above – exploring other ensemble methods, expanding validation to diverse cohorts, incorporating preoperative data, accounting for evolving treatment paradigms, and supporting clinician and patient understanding – the model could evolve into an indispensable tool for personalized treatment planning. We congratulate the authors on their excellent study and look forward to seeing further advancements in this important area.