What question did this study set out to answer?

The research aims to evaluate bias in continuous predictions of standardized test scores and propose methods to reduce it.

March 29, 2026 · Open Access

Beyond binary outcomes: Evaluating and mitigating bias in national standardized test score prediction

Key Points

The research aims to evaluate bias in continuous predictions of standardized test scores and propose methods to reduce it.
Introduced a fairness evaluation framework based on statistical distance measures.
Developed two adapted metrics, AreaPDF and AreaCDF, to assess group disparities.
Implemented five reweighting-based data-balancing methods to mitigate bias in predictions.
Conducted empirical analyses on standardized test scores in early childhood education.
The new metrics, AreaPDF and AreaCDF, highlighted biases missed by traditional measures.
Reweighting strategies significantly improved fairness in predictions.
Bias in continuous outcomes results from complex demographic interactions and score distributions.
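
To make the distributional idea concrete, here is a minimal sketch of what area-based disparity measures could look like. This page does not give the paper's exact definitions, so the formulas are assumptions: `area_cdf` is taken to be the area between the two groups' empirical CDFs, and `area_pdf` the area between histogram density estimates on shared bins. The function names, `grid_size`, and `bins` are hypothetical.

```python
import numpy as np


def area_cdf(a, b, grid_size=512):
    """Area between the empirical CDFs of two samples (assumed reading of AreaCDF)."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # Evaluate both step-function CDFs on a shared grid spanning both samples.
    grid = np.linspace(min(a[0], b[0]), max(a[-1], b[-1]), grid_size)
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    gap = np.abs(cdf_a - cdf_b)
    # Trapezoidal rule written out by hand to integrate the absolute gap.
    return float(np.sum((gap[1:] + gap[:-1]) / 2.0 * np.diff(grid)))


def area_pdf(a, b, bins=30):
    """Area between histogram density estimates (assumed reading of AreaPDF)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Shared bin edges so the two densities are comparable bin by bin.
    edges = np.histogram_bin_edges(np.concatenate([a, b]), bins=bins)
    dens_a, _ = np.histogram(a, bins=edges, density=True)
    dens_b, _ = np.histogram(b, bins=edges, density=True)
    # Ranges from 0 (identical histograms) to 2 (no overlapping bins).
    return float(np.sum(np.abs(dens_a - dens_b) * np.diff(edges)))


# Illustrative synthetic predictions only: equal means, different spread.
pred_group_a = np.array([-1.0, 1.0] * 50)
pred_group_b = np.array([-3.0, 3.0] * 50)
mean_gap = abs(pred_group_a.mean() - pred_group_b.mean())
cdf_gap = area_cdf(pred_group_a, pred_group_b)
```

In this synthetic case the mean difference is zero while the CDF-based area is clearly positive, which is the kind of disparity a mean-based metric would miss.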

Abstract

Predictive analytics in education holds promise for supporting student learning but poses risks of bias that may disadvantage certain groups based on demographic attributes such as gender. Prior work on fairness in educational prediction has largely focused on binary outcomes in higher education, leaving continuous predictions, such as standardized test scores, comparatively understudied, especially in early schooling, where consequences can be long-lasting. Although a few fairness metrics exist for continuous outcomes, they often rely on mean differences, obscuring distributional disparities that may disadvantage specific students. Even fewer studies have attempted to address bias in continuous predictions. This study addresses these gaps by introducing a fairness evaluation framework grounded in statistical distance measures, with two adapted metrics, AreaPDF and AreaCDF, that capture group disparities across the full distribution of continuous predicted outcomes. We evaluate these metrics in the context of predicting national standardized test scores in early childhood education and further develop five reweighting-based data-balancing methods to mitigate bias in continuous prediction tasks. Empirical analyses show that the proposed metrics reveal biases overlooked by traditional measures and that the reweighting strategies substantially improve fairness, underscoring the importance of aligning debiasing methods with demographic and score-distribution characteristics.

• Measurement of bias in continuous educational predictions should go beyond simple group averages and account for differences between groups across the entire distribution, as the proposed AreaCDF and AreaPDF metrics do. This offers a finer-grained view of disparities in predictions and gives stakeholders greater flexibility to act on these insights.
• In continuous educational predictions, bias more likely arises from the complex interaction between demographic group representation and score distributions than from demographic factors alone. Importantly, achieving balanced demographic group representation does not guarantee fair predictive performance across groups.
• There is no one-size-fits-all data-balancing method for improving predictive fairness. Fairness interventions should be adapted to the specific data characteristics and predictive models at hand, taking into account whether demographic imbalances are global, local, or intertwined with skewed score distributions.
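
The abstract does not spell out the five reweighting methods, so the following is only a generic sketch of the reweighting idea, not one of the paper's methods. It assumes a simple inverse-frequency scheme over joint (group, score-bin) cells, so that sparse combinations of demographic group and score range receive more weight during training. The function name `balance_weights` and the parameter `n_bins` are hypothetical.

```python
import numpy as np


def balance_weights(groups, scores, n_bins=5):
    """Inverse-frequency weights over (group, score-bin) cells.

    Hypothetical illustration: each occupied cell ends up with the same
    total weight, and the weights are rescaled to sum to the sample size.
    """
    groups = np.asarray(groups)
    scores = np.asarray(scores, dtype=float)
    # Interior quantile edges split the scores into n_bins roughly equal bins.
    edges = np.quantile(scores, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    score_bin = np.digitize(scores, edges)
    # Encode each (group, bin) pair as a single integer cell id.
    _, group_idx = np.unique(groups, return_inverse=True)
    cell = group_idx.ravel() * n_bins + score_bin
    _, cell_idx, counts = np.unique(cell, return_inverse=True, return_counts=True)
    weights = 1.0 / counts[cell_idx.ravel()]
    # Rescale so the average weight is 1.
    return weights * weights.size / weights.sum()


# Illustrative synthetic data only: group "b" is under-represented.
example_groups = np.array(["a"] * 6 + ["b"] * 2)
example_scores = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 1.0, 6.0])
example_weights = balance_weights(example_groups, example_scores, n_bins=2)
```

Weights of this kind are typically passed to a model's fitting routine as per-sample weights. Which cells to balance (groups only, score bins only, or both jointly) is exactly the kind of choice the highlights suggest should depend on the data at hand.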

Authors

L. Li

Monash University

Namrata Srivatava

Jia Rong

Monash University

Journals

Computers and Education: Artificial Intelligence

Institutions

Vanderbilt University

Monash University

Jinan University
