To address limited sensor installation options, severe background noise, and inefficient model deployment in rolling bearing fault diagnosis under industrial conditions, this paper proposes a lightweight diagnostic framework that combines progressive fusion of vibration and sound signals with knowledge distillation. First, because vibration and sound signals differ in their physical characteristics, a modality-aware heterogeneous feature-extraction network is designed: the vibration branch uses a large-kernel one-dimensional convolutional neural network, the sound branch uses a stack of small-kernel two-dimensional convolutional layers, and depthwise separable convolutions are introduced to make the network lightweight. Second, an attention-gated progressive feedback fusion strategy is proposed: learnable gating units weight the fused features by confidence and feed them back to the original inputs as residuals, which suppresses noise accumulation and improves fusion quality. Finally, a cross-architecture knowledge-distillation scheme transfers the fault-discrimination ability of the deep heterogeneous fusion network (the teacher, GAF-Net) to a lightweight LightGBM model (the student, Distilled-LGB). Combined with a mechanism that aligns the statistical features of normal samples, the student model performs end-to-end fault diagnosis on its own using only handcrafted features that can be extracted online, reaching microsecond-level model-only inference while preserving diagnostic accuracy and meeting the requirements of industrial edge deployment. Experiments on a self-built industrial dataset and the public UOEMD-VAFCVS dataset show that GAF-Net achieves accuracies of 97.89% (A → B) and 96.72% (15 Hz → 30 Hz).
Distilled-LGB achieves an inference time of 21 ms and a model size of 4.2 MB with less than 1% accuracy loss, demonstrating noise robustness, cross-condition generalization, and suitability for edge deployment.
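As a rough illustration of the gated feedback idea described above, the following minimal sketch shows one fusion step in plain Python. It is not the GAF-Net implementation: the fusion operator (an element-wise mean), the per-element sigmoid gate, and the names `gated_residual_fusion`, `gate_w`, and `gate_b` are all assumptions made for this example, and in a real network the gate parameters would be learned.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_residual_fusion(vib, snd, gate_w, gate_b):
    """One illustrative fusion step: fuse, gate element-wise, feed back as residuals.

    vib, snd: equal-length feature vectors from the vibration and sound branches.
    gate_w, gate_b: hypothetical per-element gate parameters (learned in practice).
    """
    if not (len(vib) == len(snd) == len(gate_w) == len(gate_b)):
        raise ValueError("all inputs must have the same length")
    # Assumed fusion operator: element-wise mean of the two modalities.
    fused = [(v + s) / 2.0 for v, s in zip(vib, snd)]
    # Each gate value in (0, 1) controls how much of a fused element is passed on.
    gate = [sigmoid(w * f + b) for w, f, b in zip(gate_w, fused, gate_b)]
    gated = [g * f for g, f in zip(gate, fused)]
    # Residual feedback: each branch keeps its own features plus the gated fusion.
    vib_out = [v + x for v, x in zip(vib, gated)]
    snd_out = [s + x for s, x in zip(snd, gated)]
    return vib_out, snd_out, gate


# With zero gate parameters every gate value is 0.5, so half of the fused
# signal is added back to each branch.
vib_out, snd_out, gate = gated_residual_fusion(
    [1.0, -2.0], [3.0, 2.0], [0.0, 0.0], [0.0, 0.0]
)
```

In this sketch a strongly negative gate bias drives the gate toward zero, so a fused feature judged unreliable leaves both branches almost unchanged; this is one way a gate can limit noise accumulation across fusion stages.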
Huang et al. (Tue,) studied this question.