Accurate motion capture is critical for biomechanics research, clinical gait analysis, human–computer interaction, and the entertainment industry. Two dominant paradigms exist: wearable-sensor systems, which use inertial measurement units (IMUs) affixed to the body, and marker-less, vision-based systems, which infer kinematics directly from camera data using deep-learning techniques. While both have matured rapidly, a systematic, head-to-head evaluation under identical experimental conditions is still lacking. This study benchmarks the spatiotemporal accuracy, robustness, and practical usability of state-of-the-art IMU and vision-based pipelines across a representative set of motor tasks (walking, running, squatting, and upper-limb reaches) performed by five healthy adults. IMU joint angles were reconstructed via complementary-filter fusion of accelerometer, gyroscope, and magnetometer signals. The vision-based pipelines used state-of-the-art methods that take RGB camera images as input and reconstruct pose with convolutional neural networks. Two depth-based Kinect v2 cameras were also included for comparison. Both the IMU and vision-based pipelines estimated lower-limb joint angles with generally acceptable differences, yet the vision approach showed larger deviations, especially during out-of-plane motion and brief self-occlusions. The IMU pipeline, by contrast, remained more consistent in those scenarios but exhibited gradual drift during prolonged recordings when magnetic updates were absent. Although the camera-only system offered quicker setup and greater participant comfort, a streamlined calibration routine narrowed the preparation gap for IMUs. Vision hardware was less expensive, but its higher computational demands offset that advantage.
Taken together, the study highlights a trade-off: marker-less vision prioritizes plug-and-play usability, whereas inertial measurement units deliver steadier, higher-precision tracking.
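The complementary-filter fusion mentioned above can be illustrated with a minimal single-axis sketch. The study does not report its filter design, so everything here is an assumption for illustration: the function name `complementary_filter`, the blending gain `alpha = 0.98`, the restriction to pitch (gyroscope plus accelerometer only), and the sign convention for the accelerometer tilt. A full implementation would also blend in a magnetometer heading to correct yaw, which is the update whose absence leads to the drift noted above.

```python
import math


def complementary_filter(gyro_rates, accel_samples, dt, alpha=0.98):
    """Estimate pitch (rad) by blending gyroscope and accelerometer data.

    gyro_rates    : pitch angular velocities in rad/s (hypothetical input)
    accel_samples : (ax, ay, az) tuples in m/s^2 (hypothetical input)
    dt            : sample period in seconds
    alpha         : weight on the integrated gyroscope path (assumed value)
    """
    def accel_pitch(ax, ay, az):
        # Tilt implied by the gravity direction; noisy but drift-free.
        return math.atan2(-ax, math.sqrt(ay * ay + az * az))

    # Start from the accelerometer-only estimate of the first sample.
    pitch = accel_pitch(*accel_samples[0])
    estimates = []
    for rate, (ax, ay, az) in zip(gyro_rates, accel_samples):
        # High-pass path: integrate the gyroscope rate (smooth, but drifts).
        gyro_pitch = pitch + rate * dt
        # Low-pass path: pull the estimate toward the accelerometer tilt.
        pitch = alpha * gyro_pitch + (1.0 - alpha) * accel_pitch(ax, ay, az)
        estimates.append(pitch)
    return estimates


# Stationary sensor with a constant gyroscope bias of 0.01 rad/s.
# Integrating the gyroscope alone would drift by 0.1 rad over 10 s;
# the filtered estimate stays bounded near zero instead.
biased = complementary_filter([0.01] * 1000, [(0.0, 0.0, 9.81)] * 1000, dt=0.01)
print(f"final pitch with biased gyro: {biased[-1]:.4f} rad")
```

A larger `alpha` trusts the gyroscope more, giving smoother output but slower correction of drift; a smaller value follows the accelerometer more closely and passes more of its noise through.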
Minas Aslanyan (Mon,) studied this question.