What question did this study set out to answer?

To establish a statistical testing framework for identifying algorithmic bias in machine learning models.

March 13, 2026 | Open Access

A Statistical Framework for Detecting Algorithmic Bias in Machine Learning Models Using Hypothesis Testing

Key Points

Aimed to establish a statistical testing framework for identifying algorithmic bias in machine learning models.
Developed a hypothesis testing framework for detecting algorithmic bias.
Used classical statistical tests, including the two-proportion z-test and chi-square test.
Applied the framework to a synthetic dataset for demonstration.
Detected statistically significant bias even with balanced overall model performance.
Highlighted the necessity of assessing uncertainty and statistical significance in fairness evaluations.

Abstract

As machine learning (ML) systems become deeply embedded in contemporary decision-making processes, concerns regarding algorithmic bias have attracted increasing scholarly and societal attention. Automated models are now widely used in high-impact domains including recruitment, credit approval, education, and the criminal justice system, where biased outcomes may reinforce existing inequalities. Consequently, fairness has become a fundamental requirement rather than an optional design goal. Many fairness-aware ML approaches rely on descriptive performance metrics such as accuracy, precision, recall, or selection rates, but these measures alone cannot determine whether observed disparities between demographic groups are statistically meaningful or merely due to random variation. This paper proposes a simple yet rigorous statistical hypothesis testing framework for detecting algorithmic bias by formally comparing model outcomes across protected groups. The framework employs classical statistical tools, including the two-proportion z-test and the chi-square test of independence, to evaluate group-level differences in decision outcomes. A small synthetic dataset is used to demonstrate the proposed methodology in a transparent and interpretable manner. The results illustrate that statistically significant bias can be detected even when overall model performance appears balanced. The study emphasizes the importance of incorporating uncertainty and statistical significance into fairness assessments.
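The two tests named in the abstract can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the study's implementation: the function names `two_proportion_z_test` and `chi_square_2x2` are hypothetical, and the counts in the usage note (60 of 100 favorable decisions for one group, 40 of 100 for the other) are invented for illustration and are not the paper's synthetic dataset. The sketch assumes a pooled standard error for the z-test and no continuity correction for the chi-square test.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test of H0: both groups share one favorable-outcome rate.

    x1, x2 are counts of favorable decisions; n1, n2 are group sizes.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def chi_square_2x2(x1, n1, x2, n2):
    """Pearson chi-square test of independence on the 2x2 group-by-outcome table.

    Assumes no continuity correction; the table has 1 degree of freedom.
    """
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Calling `two_proportion_z_test(60, 100, 40, 100)` and `chi_square_2x2(60, 100, 40, 100)` on the hypothetical counts gives the same p-value from both tests, because on a 2x2 table the uncorrected chi-square statistic equals the square of the pooled z statistic. Comparing that p-value against a chosen significance level (0.05 is a common convention, assumed here) is what separates a statistically meaningful disparity from one that could plausibly be random variation.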
