Predictive maintenance (PdM) has emerged as a paradigm-shifting approach to asset lifecycle management in process industries, promising significant reductions in unplanned downtime, maintenance costs, and safety-related incidents relative to conventional schedule-based and condition-based strategies. The proliferation of industrial Internet of Things (IIoT) sensors, edge computing platforms, and cloud-based data pipelines has made large-scale vibration, temperature, acoustic emission, and operational parameter data routinely available for machine learning (ML) model development. However, translating raw time-series sensor data into actionable fault prognostics remains a non-trivial engineering challenge, encompassing signal preprocessing, feature extraction in time, frequency, and time-frequency domains, class imbalance handling, cross-machine generalisation, and operational deployment under real-time latency constraints. This study presents a comprehensive ML-based predictive maintenance framework evaluated on rotating equipment — specifically centrifugal pumps, induction motors, and gearboxes — instrumented with triaxial accelerometers, thermocouples, and current sensors across three process plant sites in Northern India. Five ML algorithms are benchmarked: Random Forest (RF), Gradient Boosting Machines (GBM), Support Vector Machine (SVM), Long Short-Term Memory (LSTM) networks, and a proposed hybrid CNN-LSTM architecture. Feature importance analysis using SHAP values identifies the ten most predictive features across fault types. A cost-benefit model quantifies maintenance savings relative to a baseline reactive maintenance regime. The CNN-LSTM hybrid achieves the highest macro-F1 score of 0.923 across seven fault classes — including bearing inner/outer race faults, impeller cavitation, rotor imbalance, and gear tooth spalling — outperforming standalone LSTM (0.891) and Random Forest (0.874). 
SHAP analysis reveals that RMS acceleration in the 1-3× running speed band, kurtosis of the envelope spectrum, and thermal gradient rate are the three dominant predictive features across all equipment types. Deployment on an edge computing node (Raspberry Pi 4 with NVIDIA Jetson inference) achieves an inference latency of 47 ms per prediction cycle, within the 100 ms real-time threshold required by the plant SCADA system. The cost-benefit model projects annual savings of ₹18.6 lakhs per monitored asset, yielding a payback period of 14 months on the sensor and software investment.
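To make two of the dominant features named above concrete, the following is a minimal sketch of how band-limited RMS acceleration and the kurtosis of the envelope spectrum might be computed from a single accelerometer channel. It is not the authors' pipeline: the function names, the 10 kHz sampling rate, the 25 Hz shaft speed, the FFT-based Hilbert envelope, and the choice of non-excess (Pearson) kurtosis are all assumptions made for illustration.

```python
import numpy as np


def band_rms(signal, fs, f_lo, f_hi):
    """RMS of the signal content between f_lo and f_hi (Hz), via Parseval's theorem."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # One-sided spectrum: every bin counts twice except DC and (even n) Nyquist.
    weights = np.full(freqs.shape, 2.0)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[-1] = 1.0
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    power = np.sum(weights[mask] * np.abs(spectrum[mask]) ** 2) / n**2
    return np.sqrt(power)


def envelope(signal):
    """Amplitude envelope from the analytic signal (FFT-based Hilbert transform)."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    # Zeroing negative frequencies and doubling positive ones gives the analytic signal.
    return np.abs(np.fft.ifft(np.fft.fft(signal) * gain))


def envelope_spectrum_kurtosis(signal):
    """Non-excess kurtosis of the envelope amplitude spectrum (DC bin dropped).

    Undefined (division by zero) if the envelope spectrum is perfectly flat.
    """
    env = envelope(signal)
    amp = np.abs(np.fft.rfft(env - env.mean()))[1:]
    centred = amp - amp.mean()
    return np.mean(centred**4) / np.mean(centred**2) ** 2


# Hypothetical 1 s record: a 2x shaft-speed tone plus a 1 kHz carrier
# amplitude-modulated at 10 Hz (a stand-in for a repetitive impact pattern).
fs = 10_000          # assumed sampling rate, Hz
shaft_hz = 25.0      # assumed running speed, Hz
t = np.arange(fs) / fs
vib = (2.0 * np.sin(2 * np.pi * 2 * shaft_hz * t)
       + (1 + 0.5 * np.cos(2 * np.pi * 10 * t)) * np.sin(2 * np.pi * 1000 * t))

features = {
    "rms_1x_3x": band_rms(vib, fs, 1 * shaft_hz, 3 * shaft_hz),
    "env_spec_kurtosis": envelope_spectrum_kurtosis(vib),
}
print(features)
```

In this sketch a high envelope-spectrum kurtosis indicates that the envelope energy is concentrated in a few spectral lines, which is the pattern periodic bearing impacts tend to produce, whereas broadband noise gives a much flatter envelope spectrum and a low value. A production pipeline would normally band-pass filter around a resonance before taking the envelope; that step is omitted here for brevity.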
Dr. K. Sujatha studied this question.