• Integrating physiological knowledge with machine learning improves GPP estimation.
• Process models convert environmental data to meaningful inputs for neural networks.
• With abundant training data, the model achieved higher accuracy than baseline models.
• Physiological features prevented overfitting under limited training data conditions.
• Our hybrid approach shows potential for adaptation to unknown climate conditions.

Estimating gross primary production (GPP) in croplands is essential for understanding terrestrial carbon cycles and crop productivity. Current machine learning (ML) methods for GPP estimation require extensive training data and exhibit poor generalization performance beyond training conditions. To address these limitations, we developed a hybrid model (ANNh) that combines process-based feature engineering with artificial neural networks. This approach converts environmental variables into leaf-level photosynthetic rates using physiological process models (C4/C3 photosynthesis, energy balance, and stomatal conductance models), employing these rates alongside the enhanced vegetation index as inputs. Using FLUXNET2015 and ICOS data from eight maize and six wheat sites across diverse climates, we compared ANNh with a conventional neural network (ANNs) using environmental variables directly, alongside baseline light use efficiency models. When abundant and diverse training data were available, ANNh leveraged ML strengths and achieved higher accuracy than the baseline models. However, under cross-site validation with limited training data, ANNh demonstrated robust performance across both crops, whereas ANNs showed substantial degradation in accuracy, particularly under extreme drought conditions with large soil water potential fluctuations.
ANNh is effective because its physiologically meaningful inputs capture fundamental photosynthetic processes through temperature and water stress constraints, functioning as an effective regularization mechanism that suppresses overfitting. However, its reliance on only two variables limits its representation of complex canopy dynamics. These findings suggest that incorporating physiological understanding into ML frameworks can considerably improve model generalizability and robustness even with limited training data, which is crucial for large-scale crop GPP assessments under increasingly extreme climate conditions.
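The two-input design described above (a process-derived leaf-level rate plus the enhanced vegetation index) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, parameter values, and synthetic inputs are all assumptions for illustration, and the coupled C4/C3 photosynthesis, energy balance, and stomatal conductance models are replaced by a much simpler stand-in (a rectangular-hyperbola light response scaled by a TEM-style temperature response and a generic water stress scalar). The network weights are random and untrained; only the data flow is shown.

```python
import numpy as np

def temperature_scalar(t_air, t_min=0.0, t_opt=28.0, t_max=45.0):
    """TEM-style temperature response: 1 at t_opt, 0 at or beyond t_min/t_max.
    The cardinal temperatures are illustrative assumptions."""
    t = np.asarray(t_air, dtype=float)
    num = (t - t_min) * (t - t_max)
    den = num - (t - t_opt) ** 2
    safe_den = np.where(den != 0, den, 1.0)  # avoid division by zero
    scalar = np.where(den != 0, num / safe_den, 0.0)
    scalar = np.where((t < t_min) | (t > t_max), 0.0, scalar)
    return np.clip(scalar, 0.0, 1.0)

def leaf_rate_proxy(par, t_air, water_scalar, a_max=40.0, k_par=600.0):
    """Hypothetical stand-in for a process-based leaf photosynthetic rate.
    Light response (rectangular hyperbola) x temperature x water stress."""
    par = np.asarray(par, dtype=float)
    light = a_max * par / (par + k_par)
    return light * temperature_scalar(t_air) * np.asarray(water_scalar, dtype=float)

def mlp_forward(features, w1, b1, w2, b2):
    """Forward pass of a one-hidden-layer network (tanh activation)."""
    hidden = np.tanh(features @ w1 + b1)
    return hidden @ w2 + b2

# Synthetic, made-up inputs (not FLUXNET2015 or ICOS data).
par = np.array([200.0, 800.0, 1500.0, 1800.0])   # light, arbitrary units
t_air = np.array([12.0, 24.0, 28.0, 38.0])       # air temperature, deg C
water_scalar = np.array([1.0, 0.9, 0.6, 0.3])    # 1 = no stress, 0 = full stress
evi = np.array([0.25, 0.55, 0.70, 0.60])         # enhanced vegetation index

# The hybrid idea: the network sees only two physiologically meaningful inputs.
features = np.column_stack([leaf_rate_proxy(par, t_air, water_scalar), evi])

rng = np.random.default_rng(0)
w1, b1 = rng.normal(scale=0.1, size=(2, 8)), np.zeros(8)
w2, b2 = rng.normal(scale=0.1, size=8), 0.0
gpp_hat = mlp_forward(features, w1, b1, w2, b2)  # untrained, illustrative only
```

Because temperature and water stress are folded into the first feature before the network sees it, the network has fewer degrees of freedom with which to fit site-specific noise, which is one way to read the regularization effect described above.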
Horikoshi et al. (Thu,) studied this question.