To address the shortcomings of traditional vocational skill evaluation methods, this paper proposes a model for detecting and evaluating vocational skill operation behavior based on multi-sensor fusion and image processing. The model integrates data from inertial measurement units (IMUs), infrared positioning, a depth camera, and a microphone, and combines them with image processing to analyze operation behavior at a fine-grained level. A spatial-temporal synchronization framework and Kalman filtering are adopted to align the multi-source data with sub-millisecond accuracy and to reduce noise interference. An operational behavior map is constructed to transform the multimodal data into a hierarchical skill representation, enabling semantic understanding and structural modeling of operation behavior. A dynamic weighting model automatically adjusts the evaluation weights according to the operation stage, enabling fine-grained evaluation of vocational skill operation. Experimental results on CNC machine tool operation show that the model significantly improves the consistency of evaluation results and reduces the false detection rate, providing a quantitative and interpretable intelligent tool for vocational skill certification.
Peng et al. (Sun,) studied this question.
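To make the noise-reduction step mentioned in the abstract concrete, the following is a minimal sketch of a scalar Kalman filter applied to a single sensor channel. It is an illustration only, not the paper's implementation: the constant-state (random-walk) model, the function name `kalman_1d`, and the values of `process_var` and `meas_var` are all assumptions chosen for the example.

```python
def kalman_1d(measurements, process_var=1e-4, meas_var=0.05 ** 2, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a constant-state (random-walk) model.

    All parameter names and default values are hypothetical and serve
    only to illustrate the predict/update cycle.
    """
    x, p = x0, p0            # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, so only the
        # uncertainty grows by the process noise variance.
        p = p + process_var
        # Update: the Kalman gain weighs the measurement against the prediction.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    # A reading that alternates around a true value of 1.0.
    raw = [1.1 if i % 2 == 0 else 0.9 for i in range(50)]
    smoothed = kalman_1d(raw)
    print(smoothed[-5:])
```

In a multi-sensor setting, each stream (for example an IMU axis or an infrared position coordinate) would typically use a state model suited to its dynamics. The scalar case above only shows how the gain balances prediction against measurement.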