
September 10, 2025

Efficient Big Data Processing in Financial Sector Applications: The Evaluation of Distributed Computing Architectures

Key Points

The 60-node prototype processes 1.2 million events per second with p99 latency under 0.7 seconds.
A containerized hybrid Lambda-Kappa model deployed on Kubernetes scales nearly linearly (R² = 0.99) and reduces operational costs by approximately 25%.
The study examines batch, micro-batch, and streaming paradigms tailored for financial applications, emphasizing real-time processing.
Future research will focus on serverless orchestration and privacy-aware analytics, critical for evolving financial systems.

Abstract

Financial institutions are increasingly challenged by high-volume, high-velocity, and heterogeneous data streams, including transaction records and real-time market feeds. Conventional ETL pipelines and monolithic data warehouse systems fall short of delivering the low-latency responses, scalable throughput, and precise processing guarantees required for critical operations such as fraud detection, algorithmic trading, and real-time risk management. This paper presents a detailed examination of distributed computing paradigms (batch, micro-batch, and streaming) and of architectural patterns such as Lambda, Kappa, and hybrid frameworks, specifically adapted for financial applications. We introduce a containerized hybrid Lambda-Kappa model deployed on Kubernetes [12], integrating Apache Kafka [6] for event ingestion, Apache Flink [5] (augmented with GPU-powered processing) for real-time processing, and Apache Spark [2], [13] for batch computation. Our 60-node prototype achieves 1.2 million events per second with p99 latency under 0.7 seconds and demonstrates nearly linear scalability (R² = 0.99), reducing operational costs by approximately 25%. The paper also discusses system resilience, compliance, and security considerations, and outlines future research directions in serverless orchestration [13], adaptive autoscaling, and privacy-aware analytics.

Keywords: Big data, distributed computing, financial analytics, real-time streaming, Lambda architecture, Kubernetes, GPU acceleration.
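The two headline metrics in the abstract, p99 latency and the R² of throughput against node count, can be made concrete with a short sketch. The code below is an illustration only and is not taken from the paper: the function names `percentile` and `r_squared` are hypothetical, the nearest-rank percentile definition is an assumption (the paper does not state which definition it uses), and every sample value except the 60-node, 1.2 million events per second data point is a placeholder rather than a measurement.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile: the smallest sample such that at least
    q percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def r_squared(xs, ys):
    """Coefficient of determination for an ordinary least-squares
    straight-line fit of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Placeholder end-to-end latencies in seconds (NOT from the paper).
    latencies = [0.12, 0.15, 0.18, 0.22, 0.31, 0.35, 0.41, 0.48, 0.55, 0.66]
    print("p99 latency (s):", percentile(latencies, 99))

    # Placeholder scaling series in millions of events per second.
    # Only the last point (60 nodes, 1.2) is reported in the abstract.
    nodes = [10, 20, 30, 40, 50, 60]
    throughput = [0.21, 0.40, 0.61, 0.79, 1.01, 1.20]
    print("R^2 of throughput vs. nodes:", r_squared(nodes, throughput))
```

An R² close to 1 for this fit means throughput grows almost proportionally with the number of nodes, which is how the "nearly linear scalability" claim is read here. A p99 figure is more informative than a mean for fraud detection or trading workloads because it bounds the latency of all but the slowest 1% of events.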
