The authors present a study comparing an ordinal scale with an interval scale in workplace-based assessment instruments used to evaluate the competency of 60 general surgeons on 3 complex trauma procedures performed on fresh cadavers. Their findings indicated that the ordinal scale was more likely than the interval scale to overestimate the surgeons' competence on the 3 procedures observed: resuscitative thoracotomy, Cattell-Braasch maneuver (right-to-left visceral medial rotation), and supraclavicular exposure of the subclavian artery.¹ One goal of measuring competence in medical practice is to protect patient safety by providing formative feedback and a roadmap for learning based on an accurate description of what competence in practice looks like. This happens frequently in competency-based training and assessment paradigms, such as entrustable professional activities (EPAs). However, these techniques are most often used with trainees such as medical students, residents, and fellows. While we need to continually review our implementation of EPAs and associated measurement systems, this study raises another issue: assessments of surgeons in practice happen much less frequently than assessments of trainees. An argument can be made that this is detrimental to both patient safety and surgeons' well-being. For example, the authors discuss assessing critical performance components, which they define as the most impactful competency areas, those with the greatest potential to harm the patient, the surgical team, or the surgeon if not performed up to standard. Accurate description, the right level of nuance, and comparison to a "gold standard" of practice are important. Reliance on a "gold standard," or an established criterion standard of performance, enhances the validity and reliability of competency-based assessments.
These standards, often developed through expert consensus and grounded in evidence-based practice, serve as a robust benchmark against which trainees' performance can be measured. This not only ensures consistency in evaluation but also aligns assessments with the expectations of the broader surgical community. We also need to be able to check for potential bias while managing the roles of subjectivity and unpredictability when conducting assessments in clinical settings in which standardization is not possible. Continuous learning and improvement are part of being a surgeon, and this learning depends on reliable and accurate assessment and feedback. However, questions remain about what this looks like for practicing surgeons. These include: Who administers these assessments? When and in what contexts do they occur? How are they combined with other metrics of performance, such as patient outcomes? What are the consequences of performance below the standard deemed competent? And how can this approach be developed as an extension of the EPA framework already in use in training? As we grapple with these questions, we need to keep in mind that, regardless of approach, assessment of competence should be viewed as an opportunity for continued growth for practicing surgeons, one that benefits them as much as their patients. Questions about assessment practices and the use of assessment data for making decisions will only continue to grow as we move toward competency-based training and practice, and it is encouraging that we are taking this on as a surgical community. Understanding what types of decisions different types of scales support is essential for the accurate and fair use of data and evaluation of competence. An interval scale is ideal for criterion-referenced assessments involving high-stakes decisions such as certification, graduation, and promotion.
Conversely, for norm-referenced assessments aimed at providing feedback and guiding learning, a less precise scale, such as an ordinal scale, may suffice. However, it is essential to acknowledge that interval scales, while valuable, are not without challenges. The interpretation of scale scores can be influenced by factors such as rater subjectivity and the complexity of the observed behaviors. Additionally, calibrating interval scales to reflect meaningful differences in performance levels requires careful attention to detail. We must question our confidence in any assessment paradigm and consider its potential drawbacks. The findings of this study should be taken into consideration as we continue to work as a community of surgical practice to understand and assess competence in ways that are accurate and fair with respect to the decisions made from the assessment data. Finally, it is important to think about how these assessment data can be combined with interventions, such as coaching, that can help facilitate performance improvement in a precision education approach.
Hee Soo Jung
Ting Sun
Annals of Surgery Open
University of Wisconsin–Madison
University of Utah
www.synapsesocial.com/papers/68e6a612b6db643587628b8d — DOI: https://doi.org/10.1097/as9.0000000000000410