Helicobacter pylori infection affects approximately 44% of the global population and is causally linked to gastric ulceration and cancer. Standard diagnostic methods detect the pathogen directly and therefore require sufficient bacterial load to produce a positive result. This study evaluates whether the structure of the surrounding microbial community retains predictive information after direct pathogen-associated features are removed, a question with implications for community-based infection risk stratification. Using 16S rRNA amplicon sequencing data from 75 gastric microbiome samples, we compare five classifiers across four feature representations: ASV-level taxonomic profiles and PICRUSt2-predicted functional pathway profiles, each evaluated with and without Helicobacter-associated features. Taxonomic features without Helicobacter retained most of the predictive performance of the full feature set (F1 = 0.693 ± 0.032 versus F1 = 0.711 ± 0.050). Pathway features achieved the strongest overall performance (F1 = 0.721 ± 0.088). SHAP analysis identified PWY-7373 as the dominant predictive pathway, consistent with known H. pylori sucrose metabolism. Counterfactual perturbation using DiCE identified ARGSYN-PWY and UDPNAGSYN-PWY as candidate metabolic targets, which we present as computational hypotheses for further experimental investigation.
Hajar Khattabi studied this question.
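The ablation described in the abstract (evaluating features with and without Helicobacter-associated entries, then comparing F1) can be illustrated with a minimal sketch. This is not the study's pipeline: the function names, the toy table, and the rule of matching the string "Helicobacter" in a taxonomy annotation are all assumptions made for illustration, and the abstract does not state how Helicobacter-associated features were identified.

```python
from typing import Dict, List, Sequence, Tuple

# sample ID -> {feature ID -> abundance}
Table = Dict[str, Dict[str, float]]


def drop_features_by_taxon(
    table: Table, taxonomy: Dict[str, str], keyword: str = "Helicobacter"
) -> Tuple[Table, List[str]]:
    """Remove every feature whose taxonomy string mentions `keyword`.

    Assumption: a feature counts as pathogen-associated when its lineage
    annotation contains the keyword (case-insensitive).
    """
    kept = [f for f, lineage in taxonomy.items()
            if keyword.lower() not in lineage.lower()]
    filtered = {
        sample: {f: counts.get(f, 0.0) for f in kept}
        for sample, counts in table.items()
    }
    return filtered, kept


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, not taken from the study.
taxonomy = {
    "ASV1": "g__Helicobacter; s__pylori",
    "ASV2": "g__Streptococcus",
    "ASV3": "g__Prevotella",
}
table = {
    "S1": {"ASV1": 120.0, "ASV2": 30.0, "ASV3": 5.0},
    "S2": {"ASV1": 0.0, "ASV2": 45.0, "ASV3": 60.0},
}

ablated, kept = drop_features_by_taxon(table, taxonomy)
print(kept)                                   # ['ASV2', 'ASV3']
print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))   # 0.5
```

In practice the same classifier would be trained once on `table` and once on `ablated` under identical cross-validation splits, so that any difference in F1 reflects only the removed features.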