Batch crystallization is a key operation in the pharmaceutical industry, yet optimizing product properties such as crystal size distribution, aspect ratio, and yield remains challenging. The difficulty arises from the complexity of the crystallization process and the high cost of collecting sufficient experimental data. This paper proposes a data-driven framework that integrates an efficient design of experiments (EDoE) with an integrated neural network (INN) model to enable systematic multi-objective optimization (MOO) of such processes. The INN combines a convolutional neural network (CNN) with a recurrent neural network (RNN) to map the relationship between the primary operating conditions (including the initial supersaturation and cooling rate) and the two-dimensional (2-D) crystal size distribution (CSD). The EDoE generates informative modeling data while reducing the number of experiments, using a sensitivity analysis of the product CSD with respect to the primary operating conditions. A comprehensive loss function related to the prediction errors on the product CSD is introduced to optimize the INN model hyperparameters. Building on this model, three alternative MOO programs are offered to optimize these operating conditions for a good trade-off between product yield, the concentration of product CSD, and the aspect ratio (AR) of crystals. Simulation studies and experiments on β-form L-glutamic acid demonstrate that the proposed approach improves predictive accuracy, reduces experimental effort, and identifies superior operating strategies.

• INN-based modeling maps the nonlinear relationship between primary operating conditions and 2-D crystal size distribution.
• Efficient DoE via sensitivity analysis yields informative INN modeling data while minimizing the number of experiments.
• Three MOO programs optimize operating conditions to balance product yield, CSD concentration, and crystal aspect ratio (AR).
• A comprehensive loss function related to product CSD prediction errors optimizes the INN model hyperparameters.
• L-glutamic acid experiments validate enhanced MOO and reduce modeling experiments by about 50% compared to traditional DoE.
Song et al. studied this question.
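
The trade-off between product yield, CSD concentration, and aspect ratio can be illustrated with a minimal Pareto-filter sketch. It does not reproduce any of the three MOO programs from the paper. The function `toy_surrogate` is a hypothetical stand-in for a trained predictor (not the INN), its coefficients are invented for illustration, and the objective directions (maximize yield, minimize CSD spread, minimize aspect ratio) are assumptions.

```python
from itertools import product


def toy_surrogate(supersaturation, cooling_rate):
    """Hypothetical stand-in for a trained predictor (NOT the paper's INN).

    Returns (yield, csd_spread, aspect_ratio) from made-up linear trends.
    """
    product_yield = 0.5 + 0.3 * supersaturation - 0.1 * cooling_rate
    csd_spread = 0.2 + 0.5 * cooling_rate + 0.1 * supersaturation
    aspect_ratio = 2.0 + 1.5 * supersaturation - 0.5 * cooling_rate
    return product_yield, csd_spread, aspect_ratio


def to_costs(objectives):
    """Turn every objective into a cost to minimize (assumed directions)."""
    product_yield, csd_spread, aspect_ratio = objectives
    return (-product_yield, csd_spread, aspect_ratio)


def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates, costs):
    """Keep the candidates whose cost vector is not dominated by any other."""
    return [
        cand
        for cand, cost in zip(candidates, costs)
        if not any(dominates(other, cost) for other in costs)
    ]


# Illustrative grid of operating conditions (values and units are assumptions).
supersaturations = [1.0, 1.5, 2.0]
cooling_rates = [0.1, 0.3, 0.5]
candidates = list(product(supersaturations, cooling_rates))
costs = [to_costs(toy_surrogate(s, r)) for s, r in candidates]
front = pareto_front(candidates, costs)

for s, r in front:
    print(f"supersaturation={s}, cooling_rate={r}")
```

Each printed pair is a non-dominated operating condition under the toy model: no other grid point improves one objective without worsening another. Choosing a single operating point from this set still requires a preference rule, such as objective weights or constraints.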