Data compatibility remains a major challenge in metabolomics, because commonly used measures of biological material (such as sample weight or cell count) are often poorly reproducible. Here, we systematically evaluated practical normalization strategies for GC × GC-MS-based metabolomic profiling of two widely used model cell lines: human hepatoblastoma (HepG2) and mesenchymal stromal cells (MSCs). We compared orthogonal biomass estimates, including total protein and double-stranded DNA quantified either directly in aliquots of the cell suspension lysate or in the post-extraction cell precipitate, alongside normalization based on extracted ion current (XIC). We also assessed three widely used extraction mixtures for metabolome coverage and normalization robustness: methanol/chloroform/water (7:2:1), methanol/water (8:2), and acetonitrile/isopropanol/water (3:3:2). Under realistic biological variability, signal-to-biomass dependencies were moderate. In contrast, under strictly controlled conditions, DNA- and protein-based normalization yielded near-linear relationships with metabolite abundances (R² > 0.90), indicating that biological variability, rather than technical factors, is the dominant source of dispersion. The methanol/chloroform/water system provided the broadest metabolome coverage and the strongest correlation with injected biomass. Based on these findings, we recommend normalization to total precipitate protein or DNA using the methanol/chloroform/water extraction protocol, with XIC as a complementary quality control metric.
Kurbatov et al. (Thu,) studied this question.
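
The biomass-based normalization and the linearity check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `normalize_to_biomass` and `r_squared`, and all numeric values, are hypothetical and chosen only for illustration. It assumes one peak area and one biomass estimate (for example, total precipitate protein or DNA) per sample.

```python
from statistics import mean


def normalize_to_biomass(peak_areas, biomass):
    """Divide each sample's peak area by its biomass estimate.

    `biomass` may be total protein or DNA per sample; units are
    whatever the assay reports (an assumption of this sketch).
    """
    if len(peak_areas) != len(biomass):
        raise ValueError("peak_areas and biomass must have the same length")
    return [area / b for area, b in zip(peak_areas, biomass)]


def r_squared(x, y):
    """Coefficient of determination for a simple least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # For simple linear regression, R^2 equals the squared Pearson correlation.
    return (sxy ** 2) / (sxx * syy)


# Hypothetical values: protein per sample and the peak area of one metabolite.
protein = [10.0, 20.0, 30.0, 40.0]
peak_area = [105.0, 198.0, 310.0, 395.0]

normalized = normalize_to_biomass(peak_area, protein)
linearity = r_squared(protein, peak_area)
print(normalized)
print(f"R^2 = {linearity:.3f}")
```

A near-linear signal-to-biomass relationship gives an R² close to 1, and the normalized values are then roughly constant across samples. The same check could be repeated with XIC in place of protein or DNA as a quality control comparison.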