September 26, 2025

STANet: A Surgical Gesture Recognition Method Based on Spatiotemporal Fusion

Key Points

STANet efficiently models surgical gesture action sequences by combining spatial and temporal features.
Our model achieved exceptional performance on the JIGSAWS and RARP-45 datasets, outperforming reported benchmark models.
A temporal module and a spatial module extract their respective features, which are then fused and refined with a temporal adaptive convolution strategy.
This method may support surgical quality evaluation and intelligent recognition assistance in robotic surgery.

Abstract

In robotic surgery, surgical gesture recognition is important for surgical quality evaluation and intelligent recognition assistance. Currently, deep learning models such as recurrent neural networks and temporal convolutional networks are mainly used to model action sequences and capture the temporal dependencies between them. However, some of these methods ignore the fusion of spatial and temporal features, and hence can neither capture long‐term relationships effectively nor model action sequences efficiently. To overcome these limitations, we propose a spatiotemporal adaptive network (STANet) that fuses spatiotemporal features. Specifically, we designed a temporal module and a spatial module to extract their respective features. These features were then fused and further refined through temporal modeling using a temporal adaptive convolution strategy, which integrates both the long‐term and short‐term characteristics of surgical gesture sequences. The combined temporal and spatial modules were inserted into the backbone network to form STANet, which efficiently modeled the action sequences. Our approach was validated on the publicly available surgical gesture datasets JIGSAWS and RARP‐45, where it achieved very good results and demonstrated exceptional performance compared with other reported benchmark models. It can be used in surgical robots, visual feedback systems, and computer‐assisted surgery.
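The abstract does not give the internals of the modules, so the following is only a toy sketch of the general idea: extract a spatial feature per frame, extract a short-term temporal feature, fuse them, then refine the result with a temporal convolution whose kernel depends on the input sequence. Every function name, the frame-difference temporal feature, the additive fusion, and the rule that generates the kernel are assumptions made for illustration; none of them is taken from the paper.

```python
import math


def spatial_pool(frame):
    """Assumed spatial feature: average an H x W grid into one scalar."""
    values = [v for row in frame for v in row]
    return sum(values) / len(values)


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_kernel(sequence, kernel_size=3):
    """Hypothetical rule: derive the temporal kernel from the sequence itself.

    A real model would learn this mapping; here the logits depend only on the
    sequence mean and on the distance of each tap from the kernel centre.
    """
    mean = sum(sequence) / len(sequence)
    half = kernel_size // 2
    logits = [-abs(k - half) * abs(mean) for k in range(kernel_size)]
    return softmax(logits)


def temporal_conv(sequence, kernel):
    """1D convolution along time with replicate padding (same output length)."""
    half = len(kernel) // 2
    last = len(sequence) - 1
    out = []
    for t in range(len(sequence)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(t + k - half, 0), last)
            acc += w * sequence[idx]
        out.append(acc)
    return out


def fuse_and_refine(video, kernel_size=3):
    """Assumed pipeline: spatial feature + frame difference, then adaptive conv."""
    spatial = [spatial_pool(frame) for frame in video]
    # Assumed short-term temporal feature: difference between adjacent frames.
    temporal = [0.0] + [spatial[t] - spatial[t - 1] for t in range(1, len(spatial))]
    fused = [s + d for s, d in zip(spatial, temporal)]  # assumed additive fusion
    kernel = adaptive_kernel(fused, kernel_size)
    return temporal_conv(fused, kernel)
```

In this sketch the kernel weights always sum to 1, so a static clip passes through unchanged, while a clip with motion is smoothed by an amount that depends on its own statistics. That input dependence is the only sense in which the convolution here is "adaptive".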

Authors

Boqiang Jia

Wenjie Wang

Xin Tian

Journals

Annals of the New York Academy of Sciences

Institutions

Xi'an Polytechnic University
