As enterprises increasingly rely on cloud services for scalable data processing, optimizing the cost and efficiency of handling large datasets has become a priority. This paper explores the use of AWS Lambda for large-scale batch processing of call center transcripts stored in partitioned S3 buckets. We design a fault-tolerant, cost-effective architecture in which Lambda functions process these datasets during off-peak hours, taking advantage of AWS’s pay-as-you-go pricing model. The design includes retry logic for handling failures, which keeps the system robust when individual invocations fail. The processed output, consisting of AI-generated call transcripts, is written back to S3. Through extensive experimentation, we demonstrate the efficiency of this method in terms of both cost and performance, making it a viable solution for large-scale data processing tasks in cloud environments.
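The abstract does not describe how the retry logic is implemented. As a minimal sketch of one common pattern, the Python below retries each partition with exponential backoff and records which partitions succeeded or failed. The names `process_with_retry` and `run_batch`, the default attempt count and delay, and the partition keys are all hypothetical and not taken from the paper. In a real Lambda function, the `process` callable would presumably read a partition from S3, transform it, and write the result back.

```python
import time
from typing import Callable, Dict, Iterable, List


def process_with_retry(
    key: str,
    process: Callable[[str], None],
    max_attempts: int = 3,      # assumed default, not from the paper
    base_delay: float = 1.0,    # assumed default, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call process(key), retrying with exponential backoff.

    Returns True if any attempt succeeds, False once attempts are exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            process(key)
            return True
        except Exception:
            if attempt == max_attempts:
                return False
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    return False


def run_batch(
    keys: Iterable[str],
    process: Callable[[str], None],
    **retry_kwargs,
) -> Dict[str, List[str]]:
    """Process every partition key and report successes and failures."""
    result: Dict[str, List[str]] = {"succeeded": [], "failed": []}
    for key in keys:
        ok = process_with_retry(key, process, **retry_kwargs)
        result["succeeded" if ok else "failed"].append(key)
    return result
```

Passing `sleep` as a parameter keeps the backoff testable without real delays. The list of failed keys can be fed into a later run, so one bad partition does not block the rest of the batch.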
Himanshu Gupta studied this question.