February 26, 2026 · Open Access

From outcome-based validation to data-driven evaluation in robotic surgery

Key Points

This study aims to improve the evaluation methods of robotic surgical systems by integrating process-level metrics and analyzing intrinsic robotic data.
Prospective evaluation of the WG-NST600S robotic system during gastrointestinal surgeries.
Assessment of traditional perioperative outcomes alongside high-frequency kinematic data.
Modeling of surgeon-system interactions to distinguish performance factors.
The WG-NST600S shows feasibility and safety in gastrointestinal surgery.
Traditional outcome metrics alone insufficiently capture technical performance nuances.
Integrating robotic data enhances understanding of surgery execution and system capabilities.

Abstract

Dear Editor,

We read with great interest the prospective single-center study [1] entitled “Robotic gastrointestinal surgery using the Weigao surgical robot system: a single-center prospective analysis,” which reports the clinical application of the WG-NST600S surgical robot system in gastrointestinal surgery. As one of the first clinical evaluations of a domestically developed robotic platform, the study provides valuable evidence supporting its feasibility and short-term safety. At the same time, it raises a broader methodological question that is increasingly central to robotic surgery research: how such systems should be evaluated, given their inherent capacity to generate high-resolution intraoperative data [2].

Limitations: evaluating a data-generating system with outcome-centered metrics

Despite being a cyber–physical platform that continuously records intraoperative kinematic and control data, the WG-NST600S system was assessed primarily through conventional perioperative outcomes, including operative time, blood loss, and postoperative complications. While indispensable, these metrics are relatively coarse and offer limited insight into system-level technical performance. In early-phase platform evaluation, clinically meaningful differences between robotic systems may not manifest as divergent complication rates, but rather as differences in execution efficiency, stability, and consistency during key operative tasks. Consequently, reliance on aggregate outcomes constrains causal interpretation of how specific system features contribute to surgical performance. This limitation is particularly relevant when claims regarding technical advantages or non-inferiority are made without direct measurement of operative process characteristics [3].
Recommendations I: integrating process-level robotic data into clinical evaluation

A first step toward addressing this limitation is the systematic incorporation of intrinsic robotic data into the existing prospective evaluation framework. Robotic systems routinely capture high-frequency kinematic and control-level signals during surgery, which can be synchronized with operative timelines and analyzed alongside traditional clinical endpoints. These data provide objective descriptors of surgical execution and allow performance to be assessed at the process level rather than solely through downstream outcomes [4]. Methodologically, structuring analyses around standardized operative phases enables performance comparison within technically homogeneous contexts, thereby reducing confounding from procedural variability. Incorporating process-level metrics into statistical models, without displacing clinical endpoints, allows evaluation frameworks to reflect both what outcomes are achieved and how they are achieved. Importantly, such integration can be implemented through retrospective analysis of prospectively collected robotic data, preserving study feasibility and clinical workflow.

Recommendations II: separating system performance from surgeon adaptation using analytical modeling

A second, complementary recommendation is the explicit modeling of surgeon–system interaction over time. In evaluations of newly introduced robotic platforms, observed performance reflects both intrinsic system characteristics and surgeon adaptation. Without analytically separating these effects, system-level conclusions risk being conflated with learning-curve phenomena. This challenge can be addressed by embedding learning-curve-adjusted analyses into evaluation protocols, using case sequence and surgeon identity as structured analytical variables. By modeling performance trajectories longitudinally, it becomes possible to distinguish stable system-related performance features from transient adaptation effects. When combined with data-driven pattern analysis, such modeling enables identification of consistent technical signatures attributable to the platform itself rather than to operator familiarity [5]. Together, these approaches reposition the robotic system not only as a surgical tool, but as a quantifiable technical environment in which surgical performance can be objectively characterized and compared.

Conclusion

The WG-NST600S system demonstrates encouraging feasibility and safety in gastrointestinal surgery. To more fully substantiate its technological contribution, future evaluations should move beyond outcome-centered validation and incorporate structured analyses of intrinsic robotic data. Integrating process-level metrics and learning-curve-adjusted modeling would strengthen causal interpretation, enhance methodological rigor, and align robotic surgery evaluation with the broader trajectory of data-driven and AI-enabled surgical innovation.
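
To make the two recommendations more concrete, the following Python sketch shows one way such an analysis could look. It is an illustration only, not the method of the original study or of this letter. Everything in it is an assumption: the function names `path_length` and `fit_learning_curve`, the choice of instrument path length as a process-level metric, the log-linear form of the learning curve, and the use of a per-surgeon intercept with a shared slope. A real evaluation would more likely use a mixed-effects model with uncertainty estimates.

```python
import math
from collections import defaultdict


def path_length(positions):
    """Process-level metric (assumed): total distance travelled by an
    instrument tip, summed over consecutive (x, y, z) samples recorded
    within one operative phase."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def fit_learning_curve(cases):
    """Fit metric = a_surgeon + b * log(case_number) by least squares.

    `cases` is an iterable of (surgeon_id, case_number, metric) tuples,
    with case_number >= 1 giving each surgeon's case sequence.
    Surgeon identity enters as a per-surgeon intercept; case sequence
    enters through a slope `b` shared by all surgeons (an assumption).
    Returns (b, {surgeon_id: a_surgeon}).
    """
    by_surgeon = defaultdict(list)
    for surgeon, case_number, metric in cases:
        by_surgeon[surgeon].append((math.log(case_number), metric))

    # Per-surgeon means of log(case_number) and of the metric.
    means = {
        s: (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
        for s, pts in by_surgeon.items()
    }

    # Within-surgeon (demeaned) least squares for the shared slope.
    sxy = sxx = 0.0
    for s, pts in by_surgeon.items():
        mean_x, mean_y = means[s]
        for x, y in pts:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    if sxx == 0.0:
        raise ValueError("need at least two distinct case numbers for some surgeon")

    slope = sxy / sxx
    intercepts = {s: mean_y - slope * mean_x for s, (mean_x, mean_y) in means.items()}
    return slope, intercepts
```

In this sketch the shared slope summarizes how the metric changes with case sequence (the adaptation effect), while the per-surgeon intercepts absorb baseline differences between operators. The variation left over after both adjustments is a candidate for system-related behavior, subject to the usual caveats about unmeasured case complexity.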
