What question did this study set out to answer?

This research aims to develop an AI-enabled framework for diagnosing and self-healing internal control deficiencies in enterprise process automation.

May 7, 2026

AI-Enabled Diagnosis and Autonomous Self-Healing Framework for Resolving Internal Control Deficiencies in Enterprise Process Automation

Key Points

Develops an AI-enabled framework for diagnosing and self-healing internal control deficiencies in enterprise process automation.
Employs bidirectional long short-term memory (Bi-LSTM) networks for fault prediction.
Uses SHapley Additive exPlanations (SHAP) for interpretability-based root cause analysis.
Integrates rule-based logic and reinforcement learning (RL) for self-healing operations.
Trained and tested on the Aliyun Cloud Fault Dataset after preprocessing with KNN imputation, Min–Max scaling, and one-hot encoding.
Achieves fault detection accuracy of 98%, precision of 97%, recall of 99%, and healing success rate of 90%.
The RL agent demonstrates rapid convergence and generalization across various fault episodes.
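
For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, precision, and recall are conventionally derived from confusion-matrix counts. The counts used are hypothetical and are not taken from the study. The `healing_success_rate` helper assumes the metric means the fraction of attempted healing actions that resolved the fault; the abstract does not define it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical confusion-matrix counts for a binary fault detector."""
    tp: int  # faults correctly flagged
    fp: int  # normal windows wrongly flagged as faults
    fn: int  # faults that were missed
    tn: int  # normal windows correctly passed


def accuracy(c: ConfusionCounts) -> float:
    # Share of all windows that were classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def precision(c: ConfusionCounts) -> float:
    # Share of raised alarms that were real faults.
    return c.tp / (c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    # Share of real faults that raised an alarm.
    return c.tp / (c.tp + c.fn)


def healing_success_rate(healed: int, attempted: int) -> float:
    # Assumed definition: resolved faults divided by attempted healing actions.
    return healed / attempted


# Illustrative counts only; they do not reproduce the study's figures.
example = ConfusionCounts(tp=90, fp=10, fn=5, tn=95)
```

With these example counts, `precision(example)` is 0.9, because 10 of the 100 alarms were false, while `recall(example)` is 90/95, because 5 real faults were missed.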

Abstract

Enterprise process automation systems are increasingly exposed to internal control deficiencies caused by misconfigurations, resource bottlenecks, and software anomalies. Traditional fault detection and recovery systems are neither scalable nor interpretable, and they cannot respond in real time, so they fail to meet the requirements of current cloud-based environments. The proposed methodology consists of three components: fault prediction with bidirectional long short-term memory (Bi-LSTM) networks, interpretable root cause analysis (RCA) based on SHapley Additive exPlanations (SHAP), and a hybrid self-healing engine that combines rule-based logic with reinforcement learning (RL). The whole setup is trained and tested on the Aliyun Cloud Fault Dataset, which provides detailed temporal and structural fault traces from large-scale enterprise cloud clusters. Preprocessing comprises KNN imputation, Min–Max scaling, and one-hot encoding, followed by feature extraction along statistical, temporal, event-pattern, and graph-based dimensions. The Bi-LSTM model captures both forward and backward temporal dependencies, which supports precise fault classification and severity scoring. SHAP then ranks the features by their contribution, and the self-healing engine runs corrective actions automatically while learning about new fault conditions through RL feedback loops. Experiments show a fault detection accuracy of 98%, precision of 97%, recall of 99%, and a healing success rate of 90%. The RL agent converges rapidly and generalizes across different fault episodes. The system offers an auditable, adaptive, and scalable enterprise fault management solution that significantly reduces downtime and human effort.
Future work will extend the solution to multi-cloud environments and implement TinyML agents for edge deployment.
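
The three preprocessing steps named in the abstract can be illustrated with a small standard-library sketch. This is not the authors' pipeline: the function names, the toy rows, and the particular KNN variant (root-mean-square distance over columns present in both rows, unweighted mean of the k nearest values) are assumptions chosen for brevity.

```python
import math
from typing import List, Optional, Sequence, Tuple


def knn_impute(rows: Sequence[Sequence[Optional[float]]], k: int = 2) -> List[List[Optional[float]]]:
    """Fill each None with the mean of that column over the k nearest rows.

    Distance is measured only on columns observed in both rows (an assumed,
    simplified variant). A value stays None if no neighbour has that column.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


def min_max_scale(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Rescale every column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in rows]


def one_hot(labels: Sequence[str]) -> Tuple[List[str], List[List[int]]]:
    """Return the sorted category list and one indicator vector per label."""
    categories = sorted(set(labels))
    return categories, [[1 if lab == cat else 0 for cat in categories]
                        for lab in labels]


# Toy rows with two hypothetical numeric metrics; one value is missing.
raw = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [9.0, 50.0]]
imputed = knn_impute(raw, k=2)      # the gap is filled from the two closest rows
scaled = min_max_scale(imputed)     # every column now lies in [0, 1]
categories, encoded = one_hot(["cpu", "disk", "cpu"])  # hypothetical fault types
```

Imputation runs first so that scaling sees no missing values. In practice a library implementation such as scikit-learn's would likely replace these hand-written helpers.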

Cite This Study

Wang et al. (Wed,) studied this question.

synapsesocial.com/papers/69fbef86164b5133a91a372d https://doi.org/10.1142/s1469026826410099
