Efficient data pipeline management is critical for cloud-native architectures, where data velocity, volume, and variety challenge traditional orchestration methods. This study proposes a Reinforcement Learning (RL)-based framework for autonomous optimization of data pipelines, enabling dynamic task scheduling, resource allocation, and failure recovery without human intervention. The framework models pipeline operations as a sequential decision-making problem, where an RL agent learns optimal policies to maximize throughput, minimize latency, and reduce operational costs. Experiments conducted on simulated and real-world cloud-native workloads demonstrate that the RL-optimized pipelines achieve significant performance improvements compared to conventional static and heuristic-based scheduling strategies. This approach highlights the potential of intelligent, self-adaptive data pipelines for scalable, resilient, and cost-efficient cloud-native data processing.
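To make the sequential decision-making formulation concrete, the following is a minimal sketch of how a scheduling decision could be cast as a small Markov decision process and solved with tabular Q-learning. It is not the framework evaluated in the study: the state (queue length), the action (number of workers), the fixed arrival rate, the reward weights, and the names `step`, `train_q_table`, and `greedy_policy` are all illustrative assumptions, and failure recovery is not modeled. The reward simply combines the three objectives named above: throughput is rewarded, while queued tasks (a proxy for latency) and allocated workers (a proxy for cost) are penalized.

```python
import random

# Toy model: every constant below is an illustrative assumption,
# not a value taken from the study.
MAX_QUEUE = 5          # state: number of queued tasks, 0..MAX_QUEUE
WORKERS = (1, 2, 3)    # action: number of workers to allocate
ARRIVALS = 2           # tasks arriving per step (fixed for simplicity)
LATENCY_WEIGHT = 0.5   # penalty per task left waiting after the step
COST_WEIGHT = 0.3      # penalty per allocated worker


def step(queue, workers):
    """Apply one scheduling decision and return (next_queue, reward)."""
    available = queue + ARRIVALS
    processed = min(available, workers)            # each worker handles one task
    next_queue = min(MAX_QUEUE, available - processed)
    reward = processed - LATENCY_WEIGHT * next_queue - COST_WEIGHT * workers
    return next_queue, reward


def train_q_table(updates=20000, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with uniformly sampled (state, action) pairs."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(MAX_QUEUE + 1) for a in WORKERS}
    for _ in range(updates):
        s = rng.randrange(MAX_QUEUE + 1)
        a = rng.choice(WORKERS)
        s_next, r = step(s, a)
        # Standard Q-learning target: reward plus discounted best next value.
        target = r + gamma * max(q[(s_next, b)] for b in WORKERS)
        q[(s, a)] += alpha * (target - q[(s, a)])
    return q


def greedy_policy(q):
    """Pick, for each queue length, the worker count with the highest Q-value."""
    return {s: max(WORKERS, key=lambda a: q[(s, a)]) for s in range(MAX_QUEUE + 1)}


if __name__ == "__main__":
    print(greedy_policy(train_q_table()))
```

Under these toy weights the learned policy should allocate two workers when the queue is empty and three once a backlog builds, which illustrates the kind of load-dependent scheduling a static rule cannot express. A realistic setting would need stochastic arrivals, a richer state, and function approximation in place of a lookup table.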
Kumar et al. studied this question.