Objective: High-stakes examinations, such as those used for board certification, must be valid and fair across demographic groups. The American Board of Emergency Medicine (ABEM) developed a structured bias and fairness assessment process to identify and refine potentially biased examination items.
Methods: ABEM implemented a three-phase innovation: (1) statistical flagging of potentially biased items using differential item functioning (DIF) analysis; (2) qualitative review by an expert panel; and (3) holistic content review by the editorial team.
Results: Over an 8-year period, 3736 items were analyzed. DIF analysis flagged 597 items (16.0%) for review. The expert Bias and Fairness Panel recommended deleting 62 items (10.4% of flagged items) because of construct-irrelevant bias, most often racial bias (53.2% of items recommended for deletion), followed by regional jargon or practice variation (43.5%). The process has been applied consistently and is being extended to new examination formats.
Conclusion: A structured, theory-informed bias and fairness assessment process can reduce construct-irrelevant variance in high-stakes learner assessments. This process can serve as a replicable model for other certifying bodies and medical educators seeking to strengthen their approach to assessment.
Joldersma et al. studied this question.
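The abstract does not say which DIF statistic ABEM used for the flagging phase. As an illustration only, the sketch below assumes the Mantel-Haenszel procedure, a common choice for dichotomously scored items. Examinees are grouped into strata of comparable overall ability, and within each stratum the odds of a correct answer are compared between a reference group and a focal group. The function names, the example counts, and the 1.5 flagging threshold are all hypothetical and are not taken from the source.

```python
import math
from typing import Sequence, Tuple

# One ability stratum for a single item, as counts:
# (reference correct, reference incorrect, focal correct, focal incorrect)
Stratum = Tuple[int, int, int, int]


def mantel_haenszel_odds_ratio(strata: Sequence[Stratum]) -> float:
    """Pooled (common) odds ratio across ability strata."""
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        num += a * d / n
        den += b * c / n
    if num == 0 or den == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return num / den


def mh_d_dif(strata: Sequence[Stratum]) -> float:
    """Odds ratio rescaled to the ETS delta metric: -2.35 * ln(alpha)."""
    return -2.35 * math.log(mantel_haenszel_odds_ratio(strata))


def flag_item(strata: Sequence[Stratum], threshold: float = 1.5) -> bool:
    """Flag an item for panel review (threshold is an assumed value)."""
    return abs(mh_d_dif(strata)) >= threshold


# Hypothetical counts: equal odds in both groups versus a large gap.
balanced_item = [(30, 10, 30, 10), (45, 5, 45, 5)]
skewed_item = [(40, 10, 20, 30)]
print(flag_item(balanced_item), flag_item(skewed_item))
```

On this scale a negative value means the item favored the reference group after matching on ability. A statistical flag is only a screening signal, which is why the process described above passes flagged items to an expert panel rather than deleting them automatically.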