Bulk data management, including the long-term archiving of massive datasets, is critical for advancing high-energy gamma-ray astrophysics research because it ensures data accessibility and scientific reproducibility. Within the Cherenkov Telescope Array Observatory (CTAO), managing and preserving petabyte-scale data poses unique challenges. To address these challenges, we present our prototyping efforts for the Bulk Data Management System (BDMS), a key sub-system of CTAO's Data Processing and Preservation System (DPPS) designed for long-term preservation. BDMS builds on Rucio, an open-source data management system developed at CERN. BDMS manages the ingestion of data products on-site, replicates data between CTAO data centers, ensures their long-term preservation, and provides an interface to ingest, query, and retrieve data. We describe the BDMS architecture and its main functional blocks, namely: Ingest (including replication), Data Management (track preservation, and monitoring), Archival Storage, File Query and Access, and BDMS Administration. Our prototyping contributions include a containerized deployment using Helm charts, continuous integration tests on a Kubernetes (K8s) cluster provided by the DESY Computing/Data center, and metadata handling, implemented as a setup that extracts and stores metadata from raw (DL0: data level 0) data products and thereby enables high-level dataset queries. Finally, we report on the current status and outline our future plans.
Hasan et al. (Wed,) studied this question.