Gas hydrate formation in CO2/CH4 binary mixtures poses a critical flow assurance challenge in natural gas pipelines operating at high pressure and low temperature. Current experimental and thermodynamic modeling approaches for predicting hydrate equilibrium temperatures are resource-intensive, time-consuming, and computationally expensive, which restricts their use in real-time pipeline monitoring and inhibitor screening. To overcome these drawbacks, a Support Vector Machine (SVM) regression model was developed using 946 experimental data points covering 63 ionic liquid (IL) inhibitors. To quantify how the choice of input feature scaling affects predictive accuracy, seven scaling methods were systematically tested: L1-Normalization, Robust Scaler, Standard Scaler, Power Transformation, MaxAbs Scaler, MinMax Scaler, and Quantile Transformation. Quantile Transformation gave the best predictive accuracy, with R2 = 0.9156, RMSE = 1.0067, and MAE = 0.7125. L1-Normalization, in contrast, performed worst, with results on par with the unscaled baseline (R2 of about 0.27). The MinMax and MaxAbs scalers gave intermediate results on the test set. This is the first systematic study to show that input feature scaling is not a mere preprocessing formality but a critical determinant of SVM accuracy in predicting CO2/CH4 hydrate inhibition by ionic liquids. The results show that an appropriate choice of scaling method can raise R2 from 0.27 (no scaling) to 0.92 (Quantile Transformation), making machine learning a potential real-time decision-support tool for flow assurance engineering.
Kumar et al. (Mon,) studied this question.