This article proposes HOG-MAE, a novel self-supervised learning framework for efficient person identification from Distributed Acoustic Sensing (DAS) spatiotemporal gait signals. To mitigate the high cost of manual annotation in DAS deployments, we develop a masked autoencoder whose reconstruction target is Histogram of Oriented Gradients (HOG) features rather than raw pixels. This target steers the model toward essential semantic attributes, specifically the spatial-structural geometry of strides and rhythmic frequency patterns, while remaining inherently invariant to non-essential variations such as background noise and signal intensity fluctuations caused by diverse footwear or walking speeds. In addition, by exploiting the asymmetric distribution of spatial and temporal information in DAS signals, we further compress the gait semantic features, which yields a lightweight model and a shorter pretraining time. Experimental results show that HOG-MAE achieves a classification accuracy of 96.86% and reduces pretraining time by 22.08% compared with the standard MAE under the same epoch settings.
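To make the idea of a HOG reconstruction target concrete, the following is a minimal NumPy sketch, not the HOG-MAE implementation. It computes a normalized gradient-orientation histogram per patch, masks a random subset of patches, and scores a prediction only on the masked patches. The function names, the 8×8 patch size, the 9 orientation bins, the 75% mask ratio, and the mean-squared-error loss are all illustrative assumptions that the abstract does not specify; the encoder and decoder are omitted.

```python
import numpy as np

def hog_per_patch(x, patch=8, bins=9, eps=1e-6):
    """Unsigned gradient-orientation histogram for each non-overlapping patch.

    x is a 2-D array (assumed layout: time x sensing channel).
    Returns an array of shape (num_patches, bins), L2-normalized per patch.
    """
    gy, gx = np.gradient(x.astype(np.float64))      # derivatives along axis 0 and axis 1
    mag = np.hypot(gx, gy)                           # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # unsigned orientation in [0, pi)
    idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    rows, cols = x.shape[0] // patch, x.shape[1] // patch
    feats = np.zeros((rows * cols, bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * patch, (r + 1) * patch),
                  slice(c * patch, (c + 1) * patch))
            # Magnitude-weighted vote of every pixel into its orientation bin.
            feats[r * cols + c] = np.bincount(
                idx[sl].ravel(), weights=mag[sl].ravel(), minlength=bins)
    # Per-patch L2 normalization removes the overall amplitude scale.
    return feats / (np.linalg.norm(feats, axis=1, keepdims=True) + eps)

def random_mask(num_patches, ratio, rng):
    """Boolean mask with round(ratio * num_patches) entries set to True."""
    mask = np.zeros(num_patches, dtype=bool)
    n_masked = int(round(num_patches * ratio))
    mask[rng.permutation(num_patches)[:n_masked]] = True
    return mask

def masked_hog_loss(pred, target, mask):
    """Mean squared error over masked patches only (assumed loss)."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))

# Toy stand-in for a DAS gait segment: 64 time steps x 32 channels (hypothetical sizes).
rng = np.random.default_rng(0)
signal = rng.standard_normal((64, 32))
target = hog_per_patch(signal)                 # one HOG vector per patch
mask = random_mask(len(target), 0.75, rng)     # 0.75 is an assumed mask ratio
# A trivial all-zero "prediction" stands in for the decoder output.
loss = masked_hog_loss(np.zeros_like(target), target, mask)
```

In this sketch the gradient step discards any constant offset and the per-patch normalization discards the amplitude scale, so `hog_per_patch(3.0 * signal + 5.0)` is approximately equal to `target`. This illustrates, in a simplified setting, the kind of intensity invariance the HOG target is meant to provide.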
Shi et al. (Thu,) studied this question.