
Measuring the Effect of Examiner Variability in a Multiple-Circuit Objective Structured Clinical Examination (OSCE)

Yeates, Peter; Moult, Alice; Cope, Natalie; McCray, Gareth; Xilas, Eleftheria; Lovelock, Tom; Vaughan, Nicholas; Daw, Dan; Fuller, Richard; McKinley, Robert K. (Bob)


Purpose: Ensuring that examiners in different parallel circuits of objective structured clinical examinations (OSCEs) judge to the same standard is critical to the chain of validity. Recent work suggests that examiner-cohort (i.e., the particular group of examiners) could significantly alter outcomes for some candidates. Despite this, examiner-cohort effects are rarely examined, since fully nested data (i.e., no crossover between the students judged by different examiner groups) limit comparisons. In this study, the authors aim to replicate and further develop a novel method called Video-based Examiner Score Comparison and Adjustment (VESCA), so that it can be used to enhance quality assurance of distributed or national OSCEs.

Method: In 2019, 6 volunteer students were filmed on 12 stations in a summative OSCE. In addition to examining live student performances, examiners from 8 separate examiner-cohorts scored the pool of video performances. Examiners scored videos specific to their station. Video scores linked otherwise fully nested data, enabling comparisons by Many Facet Rasch Modeling. The authors compared and adjusted for examiner-cohort effects. They also compared examiners' scores when videos were embedded (interspersed between live students during the OSCE) or judged later via the Internet.

Results: Having accounted for differences in students' ability, different examiner-cohort scores for the same ability of student ranged from 18.57 of 27 (68.8%) to 20.49 (75.9%), Cohen's d = 1.3. Score adjustment changed the pass/fail classification for up to 16% of students, depending on the modeled cut score. Internet and embedded video scoring showed no difference in mean scores or variability. Examiners' accuracy did not deteriorate over the 3-week Internet scoring period.

Conclusions: Examiner-cohorts produced a replicable, significant influence on OSCE scores that was unaccounted for by typical assessment psychometrics. VESCA offers a promising means to enhance validity and fairness in distributed OSCEs or national exams. Internet-based scoring may enhance VESCA's feasibility.
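The headline figures in the Results can be reproduced with simple arithmetic. The sketch below (illustrative only, not the authors' analysis code) converts the reported cohort means to percentages of the 27-point maximum and back-calculates the pooled standard deviation implied by the reported Cohen's d of 1.3; the variable names are mine, not from the paper.

```python
# Illustrative arithmetic from the Results section (not the authors' code).
# Examiner-cohort mean scores for a same-ability student ranged from
# 18.57 to 20.49 out of a 27-point maximum.

MAX_SCORE = 27.0
low_mean, high_mean = 18.57, 20.49

# Convert raw means to percentages, as reported in the abstract.
low_pct = 100 * low_mean / MAX_SCORE    # ~68.8%
high_pct = 100 * high_mean / MAX_SCORE  # ~75.9%

# Cohen's d expresses a mean difference in pooled-SD units:
#   d = (mean_high - mean_low) / sd_pooled
# so the reported d = 1.3 implies a pooled SD of roughly:
reported_d = 1.3
implied_sd = (high_mean - low_mean) / reported_d  # ~1.48 points

print(f"low cohort:  {low_pct:.1f}%")
print(f"high cohort: {high_pct:.1f}%")
print(f"implied pooled SD: {implied_sd:.2f} points")
```

Note that a d of 1.3 is conventionally a "very large" effect, which is why a cohort-level difference of under 2 raw points can still move up to 16% of candidates across a pass/fail boundary.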


Yeates, P., Moult, A., Cope, N., McCray, G., Xilas, E., Lovelock, T., … McKinley, R. K. (2021). Measuring the Effect of Examiner Variability in a Multiple-Circuit Objective Structured Clinical Examination (OSCE). Academic Medicine, 96(8), 1189-1196.

Journal Article Type Article
Acceptance Date Sep 29, 2020
Online Publication Date Mar 2, 2021
Publication Date 2021-08
Publicly Available Date May 26, 2023
Journal Academic Medicine
Print ISSN 1040-2446
Publisher Lippincott Williams & Wilkins
Peer Reviewed Peer Reviewed
Volume 96
Issue 8
Pages 1189-1196
Keywords Education, General Medicine
Public URL
Publisher URL