March 3, 2026 | Open Access

OVGrasp: Open-Vocabulary Intent Detection for Grasping Assistance using ExoGlove

Key Points

OVGrasp achieves a grasping ability score (GAS) of 87.00% with ten participants, surpassing existing baselines.
The system incorporates an open-vocabulary mechanism for zero-shot detection of previously unseen objects without retraining.
Multimodal decision making combines spatial and linguistic cues to accurately infer user intents.
The framework operates within a wearable exoskeleton featuring RGB-D vision capabilities.

Abstract

Grasping assistance is essential for restoring autonomy in individuals with motor impairments, particularly in unstructured environments where object categories and user intentions are diverse and unpredictable. We present OVGrasp, a hierarchical control framework for grasp assistance that integrates RGB-D vision, open-vocabulary prompts, and voice commands to enable robust multimodal interaction. To enhance generalisation in open environments, OVGrasp incorporates a vision-language foundation model with an open-vocabulary mechanism, which enables zero-shot detection of previously unseen objects without retraining. A multimodal decision maker further fuses spatial and linguistic cues to infer user intent, such as grasp or release, in situations involving multiple objects. We deploy the complete framework on a custom egocentric-view wearable exoskeleton and conduct systematic evaluations on fifteen objects across three grasp types. Experimental results with ten participants show that OVGrasp achieves a grasping ability score (GAS) of 87.00%, surpassing existing baselines and providing improved kinematic alignment with natural hand movement.

• OVGrasp: a hierarchical framework for grasp assistance.
• Open-vocabulary detection enables zero-shot generalisation to unseen objects.
• Multimodal decision-making fuses vision, depth, and speech for intent detection.
• Integrated in a soft hand exoskeleton with egocentric RGB-D sensing.
• Achieves superior grasping ability score and improved joint kinematics in tests.
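The abstract does not specify how the multimodal decision maker combines its inputs, so the following is only an illustrative sketch of one way spatial and linguistic cues could be fused to choose between grasp and release when several objects are in view. Everything in it is an assumption made for illustration: the `Detection` fields, the `spatial_score` and `linguistic_score` helpers, the 0.6 m reach, the weight `w_lang`, and the decision threshold are hypothetical and are not taken from OVGrasp.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Detection:
    """One object reported by an open-vocabulary detector (hypothetical schema)."""
    label: str         # class name matched from the open-vocabulary prompt
    confidence: float  # detector confidence in [0, 1]
    cx: float          # bounding-box centre x, normalised to [0, 1]
    cy: float          # bounding-box centre y, normalised to [0, 1]
    depth_m: float     # object depth from the depth image, in metres


def spatial_score(det: Detection, max_reach_m: float = 0.6) -> float:
    """Higher when the object is near the image centre and within reach (assumed heuristic)."""
    offset = ((det.cx - 0.5) ** 2 + (det.cy - 0.5) ** 2) ** 0.5
    centred = max(0.0, 1.0 - offset / 0.7071)          # 0.7071 is the centre-to-corner distance
    reachable = max(0.0, 1.0 - det.depth_m / max_reach_m)
    return centred * reachable


def linguistic_score(det: Detection, command: str) -> float:
    """1.0 if the spoken command mentions the object's label, else 0.0 (assumed heuristic)."""
    return 1.0 if det.label.lower() in command.lower() else 0.0


def infer_intent(
    detections: List[Detection],
    command: str,
    hand_closed: bool,
    w_lang: float = 0.6,     # assumed weight on the linguistic cue
    threshold: float = 0.3,  # assumed minimum fused score to trigger a grasp
) -> Tuple[str, Optional[Detection]]:
    """Return ("grasp", target), ("release", None), or ("idle", None)."""
    # A release command only makes sense while the hand is holding something.
    if "release" in command.lower() and hand_closed:
        return ("release", None)

    best, best_score = None, 0.0
    for det in detections:
        fused = (1.0 - w_lang) * spatial_score(det) + w_lang * linguistic_score(det, command)
        score = det.confidence * fused
        if score > best_score:
            best, best_score = det, score

    if best is not None and best_score >= threshold and not hand_closed:
        return ("grasp", best)
    return ("idle", None)


scene = [
    Detection("cup", 0.9, 0.50, 0.50, 0.30),
    Detection("apple", 0.9, 0.52, 0.50, 0.25),
]
intent, target = infer_intent(scene, "grasp the apple", hand_closed=False)
print(intent, target.label if target else None)
```

In this sketch the spoken label disambiguates between two objects that are both central and within reach; with no command, neither object clears the assumed threshold and the function returns idle rather than guessing.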
