The global demand for plasmid DNA (pDNA) is rapidly increasing due to its key role in the manufacture of mRNA vaccines and advanced therapy medicinal products (ATMPs). However, the reliance on suboptimal batch and/or fed-batch processes, together with the diversity and increasing complexity of plasmid constructs, poses significant challenges during process development. In fact, the diversity of manufactured plasmids and host cell lines diminishes the benefits of traditional model-based approaches for expediting process development, even when sufficient data are available. To address these limitations, an ensemble hybrid modeling approach was developed that integrates mechanistic dynamic mass balances with kinetic rates derived from data-driven deep learning. Briefly, each rate term was replaced with a fully connected artificial neural network (ANN) whose inputs comprised the vector of all state variables at time point t, H historical vectors of all first-order derivatives from time points t-H to t-1, and one-hot-encoded categorical variables (host, plasmid, media, and mode of operation). Hybrid models were trained and validated using an 80/20 split on a literature-derived dataset comprising 18 unique E. coli fermentations spanning different media formulations, plasmids, and culture modes. Subsequently, a progressive transfer learning framework was employed to assess how well a pretrained model adapts to unseen E. coli strains and culture modes (from batch to fed-batch). This approach exploited historical data to substantially reduce model recalibration costs, highlighting the potential of hybrid and transfer learning methods to accelerate pDNA bioprocess development.
Stratis et al. studied this question.
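
The hybrid structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The three-state batch balance (biomass X, substrate S, plasmid P), the category levels, the network sizes, the explicit Euler step, and the zero-filled derivative history before t = 0 are all assumptions made for illustration. Only the overall layout follows the text: one small fully connected network per rate term, fed with the state at t, the H previous derivative vectors, and one-hot-encoded categorical variables.

```python
import numpy as np

# Hypothetical category levels: the abstract names the four factors, not their values.
CATEGORIES = {
    "host": ["strain_A", "strain_B"],
    "plasmid": ["plasmid_A", "plasmid_B"],
    "media": ["defined", "complex"],
    "mode": ["batch", "fed-batch"],
}
RATE_NAMES = ["mu", "qS", "qP"]  # assumed rate terms: growth, uptake, plasmid formation


def one_hot(levels, value):
    vec = np.zeros(len(levels))
    vec[levels.index(value)] = 1.0
    return vec


def build_input(state, deriv_history, run_info):
    """Concatenate state at t, H derivative vectors (t-H .. t-1), and one-hot categories."""
    cats = [one_hot(CATEGORIES[k], run_info[k]) for k in CATEGORIES]
    return np.concatenate([state, np.ravel(deriv_history), *cats])


def init_mlp(sizes, rng):
    """Random weights for a fully connected network with the given layer sizes."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def mlp(params, x):
    """Forward pass: tanh hidden layers, linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b


def mass_balance(state, rates):
    """Assumed batch mass balances driven by ANN-supplied specific rates."""
    X, S, P = state
    mu, qS, qP = rates
    return np.array([mu * X, -qS * X, qP * X])


def simulate(nets, state0, run_info, H=3, dt=0.1, steps=20):
    state = np.array(state0, dtype=float)
    history = np.zeros((H, state.size))  # oldest first; zeros before t = 0 (assumption)
    trajectory = [state.copy()]
    for _ in range(steps):
        x = build_input(state, history, run_info)
        rates = [mlp(nets[name], x)[0] for name in RATE_NAMES]  # one ANN per rate term
        deriv = mass_balance(state, rates)
        state = state + dt * deriv                 # explicit Euler step
        history = np.vstack([history[1:], deriv])  # slide the H-step window
        trajectory.append(state.copy())
    return np.array(trajectory)


rng = np.random.default_rng(0)
H, n_states = 3, 3
n_cat = sum(len(levels) for levels in CATEGORIES.values())
n_inputs = n_states + H * n_states + n_cat
nets = {name: init_mlp([n_inputs, 8, 1], rng) for name in RATE_NAMES}
run_info = {"host": "strain_A", "plasmid": "plasmid_A",
            "media": "defined", "mode": "batch"}
trajectory = simulate(nets, [0.1, 10.0, 0.0], run_info, H=H)
```

The weights here are random and untrained, so the resulting trajectory carries no biological meaning. In a trained hybrid model, the network weights would be fitted so that the integrated mass balances reproduce measured concentration profiles.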