Abstract
Efficient data ingestion and online data augmentation remain challenges in deep learning workflows, particularly for datasets in non-standard formats or with massive multidimensional arrays that natively optimised loading functions cannot fully handle. This work presents a parallel framework that integrates multiple CPU workers and global shared memory through a ring buffer architecture, enabling high-throughput data loading and flexible on-the-fly augmentation. The framework decouples data production from consumption: multiple CPU workers load and preprocess batches in parallel, bypassing the Python GIL and avoiding memory bottlenecks. The framework supports both CPU-side and GPU-side augmentation, depending on whether complex conditional transformations or framework-native operations are required. The approach was validated on two representative tasks: (i) sign language recognition from human pose CSV sequences, and (ii) hyperspectral image classification using massive arrays. Relative to standard sequential baselines, the framework achieved up to 27× acceleration in isolated data ingestion and up to 28× in end-to-end training. Even against natively optimised parallel TensorFlow and PyTorch pipelines, it delivered up to 8× faster data loading and up to 7× faster full training in memory-intensive scenarios. Overall, the framework provides a scalable, multi-GPU compatible solution for deep learning pipelines, with robust performance across both I/O-bound and memory-constrained scenarios in TensorFlow and PyTorch, while alleviating memory fragmentation and allocation constraints.
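The abstract gives no implementation details, so the following Python sketch is only a rough illustration of the general idea: producer processes write batches into a ring of slots held in shared memory, and a consumer reads them out. The class name `SharedRingBuffer`, the slot layout, the semaphore-based synchronisation, and the random "loading" and flip "augmentation" steps are all assumptions made for this example, not the authors' code.

```python
import numpy as np
from multiprocessing import get_context, shared_memory


class SharedRingBuffer:
    """Ring of fixed-size batch slots in one shared-memory block (illustrative)."""

    def __init__(self, n_slots, batch_shape, dtype=np.float32, ctx=None):
        ctx = ctx or get_context()
        self.n_slots = n_slots
        self.batch_shape = tuple(batch_shape)
        self.dtype = np.dtype(dtype)
        slot_bytes = int(np.prod(self.batch_shape)) * self.dtype.itemsize
        self.shm = shared_memory.SharedMemory(create=True, size=n_slots * slot_bytes)
        self.free = ctx.Semaphore(n_slots)         # slots available to producers
        self.filled = ctx.Semaphore(0)             # slots ready for the consumer
        self.lock = ctx.Lock()                     # guards head/tail and slot copies
        self.head = ctx.Value("i", 0, lock=False)  # next slot to write
        self.tail = ctx.Value("i", 0, lock=False)  # next slot to read

    def _slots(self):
        # NumPy view over the shared block; creating the view copies no data.
        return np.ndarray((self.n_slots, *self.batch_shape), self.dtype,
                          buffer=self.shm.buf)

    def put(self, batch):
        self.free.acquire()                        # block while the ring is full
        with self.lock:
            self._slots()[self.head.value] = batch
            self.head.value = (self.head.value + 1) % self.n_slots
        self.filled.release()

    def get(self):
        self.filled.acquire()                      # block while the ring is empty
        with self.lock:
            batch = self._slots()[self.tail.value].copy()
            self.tail.value = (self.tail.value + 1) % self.n_slots
        self.free.release()
        return batch

    def close(self, unlink=False):
        self.shm.close()
        if unlink:
            self.shm.unlink()


def producer(ring, worker_id, n_batches):
    """Hypothetical worker: random data stands in for file loading."""
    rng = np.random.default_rng(worker_id)
    for _ in range(n_batches):
        batch = rng.random(ring.batch_shape, dtype=np.float32)
        ring.put(batch[..., ::-1])                 # stand-in augmentation: flip
    ring.close()


def run_demo(n_workers=2, n_batches=4):
    """Start producers, consume every batch in this process, then clean up."""
    ctx = get_context()
    ring = SharedRingBuffer(n_slots=4, batch_shape=(8, 16), ctx=ctx)
    workers = [ctx.Process(target=producer, args=(ring, w, n_batches))
               for w in range(n_workers)]
    for p in workers:
        p.start()
    received = [ring.get() for _ in range(n_workers * n_batches)]
    for p in workers:
        p.join()
    ring.close(unlink=True)
    return len(received)
```

To try the multi-process path, call `run_demo()` from a script under an `if __name__ == "__main__":` guard. In this sketch the expensive work (loading and augmentation) runs in separate processes outside the lock, so it is not serialised by the GIL; only the copy into a slot is. The consumer copies each batch out of its slot for simplicity, which a real pipeline would likely avoid.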
Toro-Castro et al. studied this question.
synapsesocial.com/papers/69fd7ee0bfa21ec5bbf073af — DOI: https://doi.org/10.1007/s11227-026-08543-0
Antonio De Toro-Castro
University of Almería
Marcos Lupión
University of Ulster
Vicente González-Ruíz
University of Almería
The Journal of Supercomputing