What question did this study set out to answer?

The aim is to assess agreement between two examiners in scoring nursing students' technical and communication skills during OSCEs.

February 2, 2026 | Open Access

Inter-Rater Reliability in a Pre-Graduation Nursing Objective Structured Clinical Examination: A Kappa and Prevalence-Adjusted Bias-Adjusted Kappa Comparison of Technical Skill and Communication Items

Key Points

90 nursing students participated in two OSCE stations: one for technical procedures and one for communication.
Each performance was scored by two faculty examiners using binary checklists.
Cohen's kappa and prevalence- and bias-adjusted kappa were computed for inter-rater agreement.
Higher inter-rater agreement was observed for technical skills compared to communication skills.
Agreement improved for technical performances across successive attempts, indicating examiner calibration.
Communication station scores remained consistently low, especially for empathy-related items.

Abstract

Background: Objective structured clinical examinations (OSCEs) are widely used to assess nursing students’ clinical skills. However, ensuring consistent scoring between examiners remains difficult, particularly for subjective areas such as communication.

Purpose: This study aimed to evaluate inter-rater agreement between two faculty examiners in a pre-graduation OSCE recommended for final-year pre-registration nursing students in Japan, comparing technical skill stations with communication-focused stations.

Methods: A total of 90 final-year nursing students completed two OSCE stations: one assessing technical procedures, the other assessing communication and patient education. Two examiners independently scored each aspect of performance using binary checklists. Inter-rater agreement was calculated using Cohen's kappa and prevalence- and bias-adjusted kappa.

Results: Higher inter-rater agreement was found for psychomotor items (e.g., auscultation) than for verbal or empathy-based items. In the technical station, agreement improved across successive circuits, suggesting examiner calibration. In contrast, in the communication station, agreement remained consistently low. Empathy-related items showed the greatest discrepancy between kappa and prevalence- and bias-adjusted kappa, highlighting challenges in evaluating subjective skills.

Conclusions: OSCE inter-rater reliability was higher for objective technical skills than for subjective communication skills and empathy-related behaviors, among pre-registration nursing students.

Implications for Practice: Improving checklist clarity and providing targeted examiner training for communication and empathy-related items may enhance the reliability of OSCE scoring in nursing education.
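To make the two agreement statistics concrete, here is a minimal Python sketch of Cohen's kappa and prevalence- and bias-adjusted kappa (PABAK) for a single binary checklist item scored by two examiners. It uses the standard textbook formulas, not necessarily the authors' exact procedure or software. The function name `agreement_stats` and the ratings are hypothetical and invented purely for illustration; they are not data from the study.

```python
def agreement_stats(rater_a, rater_b):
    """Return (observed agreement, Cohen's kappa, PABAK) for two binary raters.

    rater_a, rater_b: equal-length lists of 0/1 scores for the same students.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(rater_a)
    # Observed proportion of students on whom the two examiners agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Each examiner's marginal proportion of "1" (item performed) scores.
    p_a1 = sum(rater_a) / n
    p_b1 = sum(rater_b) / n

    # Agreement expected by chance, given those marginals.
    p_e = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)

    # Cohen's kappa is undefined when chance agreement is already 1.
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else float("nan")

    # For two categories, PABAK depends only on observed agreement.
    pabak = 2 * p_o - 1
    return p_o, kappa, pabak


# Hypothetical item that almost every student "passes" (high prevalence):
# 17 joint passes, 2 where only examiner A gave credit, 1 where only B did.
examiner_a = [1] * 17 + [1, 1] + [0]
examiner_b = [1] * 17 + [0, 0] + [1]

p_o, kappa, pabak = agreement_stats(examiner_a, examiner_b)
print(f"observed agreement = {p_o:.2f}, kappa = {kappa:.2f}, PABAK = {pabak:.2f}")
```

In this invented example the examiners agree on 85% of students, yet kappa is slightly negative (about -0.07) because nearly all scores fall in one category, while PABAK is 0.70. A gap of this kind between the two statistics is what the abstract reports as largest for empathy-related items, although the sketch says nothing about why it arose in the study's data.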

Cite This Study

Yayama et al. studied this question.

synapsesocial.com/papers/6980fe9bc1c9540dea810cef https://doi.org/10.1177/23779608261417794
