Background: The increasing global energy demand necessitates reliable techniques for appliance-level consumption analysis. Non-Intrusive Load Monitoring (NILM) enables device-level disaggregation from aggregated smart-meter measurements, supporting energy efficiency, demand management, and user awareness. However, the accurate classification of appliances from low-frequency (1 Hz) power signals remains challenging.

Methods: This study introduces a hybrid NILM framework that integrates time–frequency analysis, texture-based feature extraction, and deep–shallow hybrid classification. Power signals from the publicly available High-Resolution Household Appliances dataset were transformed into 256 × 256 Short-Time Fourier Transform (STFT) spectrograms, contrast-enhanced, and encoded using Local Phase Quantization (LPQ). The resulting 1 × 256 feature vectors were reorganized into multiple layouts, with the 4 × 64 structure providing the most discriminative representation. Classification was performed using (i) a Long Short-Term Memory (LSTM) network, and (ii) Support Vector Machine (SVM) and k-nearest neighbors (kNN) models trained on deep features extracted from the LSTM’s fully connected layer.

Results: Across the 18-appliance dataset, the proposed LPQ–LSTM–SVM pipeline achieved the highest accuracy (98.69%), followed by kNN (98.25%) and standalone LSTM (96.94%). Macro-F1 and macro-AUC metrics further validated the robustness of the approach, with most appliances achieving AUC values above 0.99. A 5-fold cross-validation analysis confirmed the stability of the results. Appliances with highly overlapping temporal signatures, such as dishwashers and hair dryers, showed marginally reduced separability.

Discussion: The findings demonstrate that combining STFT-based spectral imaging with LPQ descriptors and hybrid deep–shallow classifiers offers a powerful and computationally efficient solution for 1 Hz NILM.
Compared with recent convolutional neural network (CNN), autoencoder, and Transformer-based NILM models, the proposed approach provides competitive performance without requiring high-frequency data or heavy neural architectures. A t-SNE visualization of the learned feature space further illustrates the interpretability of the model. This framework offers a practical basis for future work on multimodal fusion and cross-household generalization.
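
The reorganization of the 1 × 256 LPQ vector into a 4 × 64 layout can be illustrated with a minimal sketch. It is not the authors' implementation: the helper name `reshape_features`, the row-major ordering, the placeholder vector, and every candidate layout other than 4 × 64 are assumptions made only for illustration. Each row of the result would plausibly act as one time step of 64 features when the layout is fed to an LSTM.

```python
def reshape_features(vector, rows, cols):
    """Split a flat feature vector into `rows` consecutive chunks of `cols` values.

    Row-major ordering is an assumption; the source does not state the ordering used.
    """
    if rows * cols != len(vector):
        raise ValueError(f"cannot reshape {len(vector)} values into {rows} x {cols}")
    return [list(vector[r * cols:(r + 1) * cols]) for r in range(rows)]


# Placeholder standing in for one 1 x 256 LPQ feature vector (not real data).
lpq_vector = [i / 255 for i in range(256)]

# Hypothetical candidate layouts; only 4 x 64 is named in the text.
layouts = [(1, 256), (4, 64), (8, 32), (16, 16)]
sequences = {shape: reshape_features(lpq_vector, *shape) for shape in layouts}

# The 4 x 64 layout: 4 rows (time steps) of 64 features each.
best = sequences[(4, 64)]
print(len(best), len(best[0]))  # 4 64
```

Under these assumptions, the layout changes only how the same 256 values are presented to the sequence model; no values are added or dropped.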
Berna Gurler Ari studied this question.