Introduction: Examiner variability in medical theory assessments can compromise fairness, reliability, and learner trust. This study evaluated the effect of a structured faculty development program (FDP) on reducing examiner variability and improving scoring consistency in undergraduate medical theory assessments using short essay questions (SEQs) and multiple-choice questions (MCQs).
Materials and methods: This quasi-experimental study involved 32 experienced physiology faculty members from medical colleges affiliated with a state health university in India. Participants were divided into a trained group (n = 16) and an untrained control group (n = 16). All examiners first scored identical anonymized answer scripts (one SEQ and ten MCQs) under standard conditions without rubrics or answer keys. Following a three-week washout period, the trained group participated in a structured FDP focused on assessment literacy, rubric design, examiner calibration, and reflective practices. Both groups then re-scored the same scripts. Intra-rater consistency, inter-rater reliability (intraclass correlation coefficient [ICC] for SEQs and Fleiss’ kappa for MCQs), and scoring accuracy were compared pre- and post-intervention.
Results: The structured FDP significantly reduced examiner variability in undergraduate medical theory assessments. Trained examiners showed marked improvements in intra-rater consistency, inter-rater reliability (SEQ ICC from 0.79 to 0.91; MCQ Fleiss’ kappa from 0.16 to 0.33), and MCQ accuracy (48% to 88%). Variability decreased substantially in the trained group, while the untrained group exhibited minimal change or a slight decline. A significant group × time interaction (p = 0.013 for SEQs) and a significant group main effect (p = 0.043 for MCQs) confirmed the intervention’s effectiveness. These findings highlight the value of targeted FDPs in enhancing assessment fairness and reliability in medical education.
Conclusion: A targeted, short-duration FDP effectively reduced examiner variability and enhanced reliability in undergraduate theory assessments. Implementing such programs can promote fairer, more defensible evaluations in medical education, particularly in resource-constrained settings.
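For readers unfamiliar with the agreement statistic reported for the MCQs, the following is a minimal sketch of how Fleiss’ kappa can be computed from a table of rating counts. It is not the authors’ analysis code: the function name `fleiss_kappa`, the table layout, and the example ratings are all illustrative assumptions, and the study’s actual data and software are not described in the abstract.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j.

    Assumes every item was rated by the same number of raters (>= 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]

    # Observed agreement per item: fraction of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_i) / n_items      # mean observed agreement
    p_e = sum(p * p for p in p_j)   # agreement expected by chance

    if p_e == 1.0:
        # All ratings fall in one category; kappa is undefined here.
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical data: 5 MCQ responses, 4 examiners, 2 categories
    # (column 0 = marked correct, column 1 = marked incorrect).
    illustrative_counts = [
        [4, 0],
        [3, 1],
        [2, 2],
        [0, 4],
        [1, 3],
    ]
    print(f"Fleiss' kappa: {fleiss_kappa(illustrative_counts):.3f}")
```

Kappa is 1 when all examiners agree on every item and falls to 0 or below when agreement is no better than chance, which is why the reported rise from 0.16 to 0.33 indicates improved, though still modest, agreement.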
Sukanti Bhattacharyya
Alka Rawekar
Arkaprabha Sau
Cureus
Bhattacharyya et al. studied this question.
www.synapsesocial.com/papers/69c8c277de0f0f753b39cd35 — DOI: https://doi.org/10.7759/cureus.105946