Background: Machine learning models for antimicrobial resistance (AMR) prediction are trained predominantly on data from high-income countries, yet resistance prevalence varies dramatically across geographic regions. While algorithmic fairness frameworks have matured around race, sex, and age, geography has been examined in only 2.4% of medical AI fairness studies. We investigated whether geographic heterogeneity creates fundamental barriers to algorithmic fairness, a phenomenon we term the "Calibration Paradox."

Methods: We conducted a two-cohort study (total N=77,548 isolates) using the BV-BRC database. The Primary Cohort (n=39,859 E. coli isolates from 132 countries) quantified regional resistance prevalence and demonstrated through simulation the mathematical consequences of applying any single classification threshold to populations with heterogeneous base rates. The Genomic Validation Cohort (n=37,689 E. coli isolates with fluoroquinolone resistance gene annotations) tested whether models using actual genomic predictors could avoid the threshold problem.

Results: Ciprofloxacin resistance prevalence ranged from 16.8% in North America to 44.1% in Asia, a 27.3 percentage point gap (OR 3.90; 95% CI: 3.67-4.15; p<0.001). Simulation analysis demonstrated that when a well-calibrated model outputs regional prevalence as predictions, any single global threshold partitions regions into discrete classification groups. At a 30% threshold, 44.3% of resistant isolates from below-threshold regions would be missed. Genomic validation confirmed that a model trained on genomic features alone still produced regionally varying prediction scores, resulting in 19.6 percentage point sensitivity disparities at uniform thresholds.

Conclusions: Geographic prevalence heterogeneity creates unavoidable fairness-accuracy trade-offs (the Calibration Paradox) for any globally deployed AMR prediction model. No single threshold can achieve equitable performance across regions with different base rates. These findings demonstrate the need for region-specific models, mandatory geographic stratification in model evaluation, and recognition of geography as a protected attribute in medical AI fairness frameworks.
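
The threshold argument can be illustrated with a minimal Python sketch. It is not the study's simulation code. It uses only the two regional prevalences quoted in the Results and assumes the limiting case described there: a model that scores every isolate with its region's prevalence. All function names are hypothetical. The odds ratio is recomputed from the rounded percentages, so it matches the reported 3.90 only approximately.

```python
# Illustrative sketch, not the study's code. The two prevalences come from the
# abstract; the model (score = regional prevalence) is the assumed limiting case.
PREVALENCE = {"North America": 0.168, "Asia": 0.441}


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def odds_ratio(p_high, p_low):
    """Odds ratio between two prevalences (point estimate only, no CI)."""
    return odds(p_high) / odds(p_low)


def flagged_regions(prevalence, threshold):
    """Every isolate in a region gets the same score (the regional prevalence),
    so a single threshold flags either all isolates in a region or none."""
    return {region: p >= threshold for region, p in prevalence.items()}


def regional_sensitivity(prevalence, threshold):
    """Sensitivity is 1.0 where the whole region is flagged resistant and 0.0
    where no isolate is flagged (every resistant isolate there is missed)."""
    flagged = flagged_regions(prevalence, threshold)
    return {region: 1.0 if flagged[region] else 0.0 for region in prevalence}


gap_pp = (PREVALENCE["Asia"] - PREVALENCE["North America"]) * 100
or_estimate = odds_ratio(PREVALENCE["Asia"], PREVALENCE["North America"])

print(f"Prevalence gap: {gap_pp:.1f} percentage points")
print(f"Odds ratio from rounded prevalences: {or_estimate:.2f}")

# Sweep a few global thresholds to show how regions fall into discrete groups.
for threshold in (0.10, 0.30, 0.50):
    print(threshold, regional_sensitivity(PREVALENCE, threshold))
```

Under these assumptions, a 30% threshold flags every isolate in Asia and none in North America. Equal sensitivity appears only at thresholds that flag both regions or neither, which is the partitioning effect the Results describe. A model with informative features would produce a spread of scores within each region, so the disparity would be smaller than this all-or-nothing case.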
Hayden Luke Farquhar
Hayden Luke Farquhar (Fri,) studied this question.
synapsesocial.com/papers/696c789ceb60fb80d1396cc0 — DOI: https://doi.org/10.5281/zenodo.18266559