March 3, 2026 | Open Access

A Self-Supervised Transformer Approach for Human Activity Recognition From Accelerometer Signals

Key Points

Fine-tuning the pre-trained model improves human activity recognition, raising the F1-score by an average of 18.1% compared with training from scratch.
The model achieves this with fewer than 100,000 parameters and 30 million FLOPs, making it suitable for wearable devices.
The method uses self-supervised pre-training with a masked reconstruction pretext task, enhanced by controlled noise injection.
The results suggest reduced dependence on large labeled datasets and reasonable generalization across sensor placements.

Abstract

Human Activity Recognition (HAR) based on wearable accelerometers is a key technology for activity tracking and health monitoring. However, the need for large labeled datasets and the variability in sensor placement pose practical difficulties [1,2]. To learn meaningful representations from unlabeled accelerometer data, this work presents a self-supervised learning (SSL) technique based on the Transformer architecture [3]. The method uses a masked reconstruction pretext task enhanced by controlled noise injection. The model is pre-trained on the unlabeled Capture24 dataset and then fine-tuned on three labeled datasets: WISDM, REALWORLD, and OPPORTUNITY. The results show that comprehensive fine-tuning of the pre-trained model improves the F1-score by an average of 18.1% compared with training from scratch. Additionally, the model generalizes reasonably well across sensor locations, particularly when adapted using a small quantity of labeled data from a new location. Despite its small size (fewer than 100,000 parameters and 30 million FLOPs), the model delivers dependable performance, making it suitable for deployment on resource-constrained wearable devices. These results suggest that Transformer-based SSL can retain strong performance across a range of users and sensor setups while greatly reducing reliance on labeled data.
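The masked reconstruction pretext task with noise injection can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 15% mask ratio, the noise level, the choice to zero out masked samples, and the choice to add Gaussian noise only to the visible samples are all assumptions made for illustration. The sketch shows how one accelerometer window might be corrupted before being fed to an encoder, and how a reconstruction loss restricted to the masked time steps could be computed.

```python
import random


def corrupt_window(window, mask_ratio=0.15, noise_std=0.05, seed=None):
    """Corrupt one accelerometer window for a masked-reconstruction task.

    window: list of (x, y, z) samples.
    mask_ratio and noise_std are hypothetical values, not from the paper.
    Returns (corrupted_window, sorted list of masked time indices).
    """
    rng = random.Random(seed)
    n_steps = len(window)
    n_masked = max(1, round(mask_ratio * n_steps))
    masked = set(rng.sample(range(n_steps), n_masked))

    corrupted = []
    for t, sample in enumerate(window):
        if t in masked:
            # Assumption: masked time steps are replaced by zeros.
            corrupted.append((0.0, 0.0, 0.0))
        else:
            # Assumption: Gaussian noise is injected into visible samples.
            corrupted.append(tuple(v + rng.gauss(0.0, noise_std) for v in sample))
    return corrupted, sorted(masked)


def masked_mse(prediction, target, masked_indices):
    """Mean squared error computed over the masked time steps only."""
    errors = [
        (p - q) ** 2
        for t in masked_indices
        for p, q in zip(prediction[t], target[t])
    ]
    return sum(errors) / len(errors)
```

In a pre-training loop under these assumptions, the corrupted window would be passed through the Transformer encoder and a reconstruction head, and `masked_mse` would compare the model's output with the original, uncorrupted window at the masked positions.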
