Established construction workload metrics do not account for workers’ physical postures, leading to inaccurate assessments of both ergonomic risk and productivity. To address this gap, this research introduces a data-driven framework that combines computer vision and deep learning to analyze worker postures. The posture-hour is proposed as a new metric intended to objectively quantify physical workload and overcome the limitations of conventional labor-hour measurements. The framework is a two-stage process: skeletal key-point extraction using a pretrained YOLO-Pose model, followed by posture classification via a custom architecture combining a convolutional neural network (CNN), a bidirectional long short-term memory (BiLSTM) network, and a multihead attention mechanism. The prototype system identified five fundamental postures (walking, standing, bending, squatting, and arm raising) with 85.3% accuracy in a controlled rebar-tying experiment. Validation on surveillance footage of concrete pouring demonstrated the framework’s effectiveness in multiworker workload quantification. The results revealed distinct, task-specific postural demands: rebar tying was characterized by prolonged bending and squatting at low structural nodes, whereas concrete pumping involved sustained arm raising. The originality of this work lies in its integration of vision-based posture analysis with duration-based metrics to establish a direct, quantifiable link between physical workload and productivity. This lays the foundation for human-centric, AI-enhanced productivity studies and work planning. Limitations to be addressed in future research include the reliance on manual annotations for model training, the need for task breakdown definitions that depend on practical know-how, and the current absence of an industry-wide posture classification taxonomy.
Qi et al. studied this question.
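To make the posture-hour idea concrete, the following is a minimal sketch of how per-frame posture labels could be aggregated into hours per posture. It assumes that a posture-hour is simply the time a worker spends in a given posture, expressed in hours, and that the classifier emits one label per video frame at a known frame rate. The abstract does not give the exact formula, so this definition, the function names `posture_hours` and `crew_posture_hours`, and the label strings are illustrative assumptions rather than the authors' implementation.

```python
from collections import Counter

# The five postures named in the abstract; the label strings are assumed.
POSTURES = ("walking", "standing", "bending", "squatting", "arm_raising")


def posture_hours(frame_labels, fps):
    """Convert one worker's per-frame posture labels into hours per posture.

    Assumption: one label per frame, so duration = frame count / fps.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    counts = Counter(frame_labels)
    unknown = set(counts) - set(POSTURES)
    if unknown:
        raise ValueError(f"unknown posture labels: {sorted(unknown)}")
    # frames -> seconds -> hours
    return {p: counts.get(p, 0) / fps / 3600.0 for p in POSTURES}


def crew_posture_hours(labels_by_worker, fps):
    """Sum posture-hours over several workers (multiworker quantification)."""
    totals = {p: 0.0 for p in POSTURES}
    for labels in labels_by_worker.values():
        for posture, hours in posture_hours(labels, fps).items():
            totals[posture] += hours
    return totals


if __name__ == "__main__":
    # Hypothetical data: two workers filmed at 10 frames per second.
    crew = {
        "worker_a": ["bending"] * 18000 + ["standing"] * 36000,
        "worker_b": ["squatting"] * 9000 + ["bending"] * 18000,
    }
    for posture, hours in crew_posture_hours(crew, fps=10).items():
        print(f"{posture}: {hours:.2f} posture-hours")
```

Under this assumed definition, the posture-hours of one worker sum to that worker's observed labor-hours, so the metric can be read as a breakdown of conventional labor-hours by posture.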