Introduction
Assessment format may influence the extent to which student learning approaches are reflected in performance, yet whether constructed-response descriptive assessments (DAs) and selected-response multiple-choice question assessments (MCQAs) differ in their sensitivity to variation in learning approach within formative physiology education has not been directly examined. This study tested whether baseline deep and surface learning approaches were differentially associated with performance in DAs and MCQAs within a longitudinal formative undergraduate medical physiology programme.

Methods
This three-month longitudinal observational study was conducted at a single medical college in South India. Among 150 invited first-year medical students, 109 completed the baseline Revised Two-Factor Study Process Questionnaire (R-SPQ-2F) and contributed to the study. Eight physiology topics, selected through a modified Delphi process, were taught sequentially and assessed on a rolling basis. For each topic, students completed both a DA and an MCQA in the same sitting, with items matched on construct and revised Bloom's taxonomy level. A linear mixed-effects model was used to test whether the association between learning approach and marks differed by assessment format, with a student-level random intercept to account for repeated observations. A post hoc inter-rater reliability audit was conducted on a randomly selected subset of DA scripts. Sensitivity analyses included a random-slopes model and a Deep-minus-Surface composite parameterisation.

Results
The association between learning approach and marks differed by assessment format, with significant format-by-deep-learning and format-by-surface-learning interactions (both p < 0.001). Within DAs, higher deep-learning scores were associated with higher marks (β = -0.038 for surface learning aside, β = 0.032, p = 0.022), whereas higher surface-learning scores were associated with lower marks (β = -0.038, p = 0.003). Within MCQAs, neither learning-approach dimension was significantly associated with marks. The negative association between surface learning and DA performance remained robust across model specifications, whereas the positive association between deep learning and DA performance was attenuated in the random-slopes model (p = 0.072). The Deep-minus-Surface sensitivity analysis supported the same overall format-dependent pattern. A post hoc inter-rater reliability audit on 25% of DA scripts yielded strong agreement with an intraclass correlation coefficient (ICC) of 0.87.

Conclusions
Assessment format moderated the association between learning approach and assessment performance in this cohort: DAs showed clearer differentiation by learning approach than MCQAs, with poorer DA performance at higher surface-learning scores. These findings do not show that MCQAs reward surface learning but suggest that DAs may provide more discriminating information about variation in learning approach than MCQAs within a programmatic assessment framework.
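The random-intercept model described in the Methods can be written out as a minimal sketch. The notation below is an assumption introduced for illustration and is not taken from the study: student i, topic t, format f, with the format indicator coded 1 for DA and 0 for MCQA. Any topic effects or other covariates the authors may have included are omitted because the abstract does not state them.

```latex
% Illustrative specification only; symbols and coding are assumptions.
% D_f = 1 for a descriptive assessment (DA), D_f = 0 for an MCQA.
\begin{align}
\text{Marks}_{itf} &= \beta_0 + \beta_1 D_f
  + \beta_2\,\text{Deep}_i + \beta_3\,\text{Surface}_i \nonumber \\
&\quad + \beta_4\,(D_f \times \text{Deep}_i)
  + \beta_5\,(D_f \times \text{Surface}_i)
  + u_i + \varepsilon_{itf}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{itf} \sim \mathcal{N}(0, \sigma^2).
\end{align}

% Format-specific (simple) slopes implied by this coding:
\begin{align}
\text{MCQA: } & \frac{\partial\,\text{Marks}}{\partial\,\text{Deep}} = \beta_2,
 & \frac{\partial\,\text{Marks}}{\partial\,\text{Surface}} &= \beta_3, \\
\text{DA: }   & \frac{\partial\,\text{Marks}}{\partial\,\text{Deep}} = \beta_2 + \beta_4,
 & \frac{\partial\,\text{Marks}}{\partial\,\text{Surface}} &= \beta_3 + \beta_5.
\end{align}
```

Under this coding, the format-by-deep-learning and format-by-surface-learning interactions correspond to tests of β₄ = 0 and β₅ = 0, and the within-format associations reported in the Results correspond to the simple slopes. A random-slopes variant of the kind named among the sensitivity analyses would additionally let one or more of these slopes vary by student; which slopes were allowed to vary is not stated here.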
Ismail-Khan et al. (Mon,) studied this question.