Richard D. Riley
Minimum sample size for external validation of a clinical prediction model with a binary outcome
Riley, Richard D.; Debray, Thomas P. A.; Collins, Gary S.; Archer, Lucinda; Ensor, Joie; van Smeden, Maarten; Snell, Kym I. E.
Abstract
In prediction model research, external validation is needed to examine an existing model's performance using data independent of those used for model development. Current external validation studies often suffer from small sample sizes and consequently imprecise estimates of predictive performance. To address this, we propose how to determine the minimum sample size needed for a new external validation study of a prediction model for a binary outcome. Our calculations aim to precisely estimate calibration (Observed/Expected and calibration slope), discrimination (C-statistic), and clinical utility (net benefit). For each measure, we propose closed-form and iterative solutions for calculating the minimum sample size required. These require specifying: (i) target standard errors (confidence interval widths) for each estimate of interest, (ii) the anticipated outcome event proportion in the validation population, (iii) the prediction model's anticipated (mis)calibration and the variance of linear predictor values in the validation population, and (iv) potential risk thresholds for clinical decision-making. The calculations can also be used to assess whether the sample size of an existing (already collected) dataset is adequate for external validation. We illustrate our proposal for external validation of a prediction model for mechanical heart valve failure with an expected outcome event proportion of 0.018. Calculations suggest that at least 9835 participants (177 events) are required to precisely estimate the calibration and discrimination measures. This number is driven by the calibration slope criterion, which we anticipate will often be the case. In addition, 6443 participants (116 events) are required to precisely estimate net benefit at a risk threshold of 8%. Software code is provided.
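As a rough illustration of the closed-form style of calculation described in the abstract, the sketch below covers only the Observed/Expected (O/E) criterion. It assumes the standard delta-method approximation SE(ln(O/E)) ≈ sqrt((1 − φ)/(nφ)), where φ is the anticipated outcome event proportion and the expected count is treated as fixed. The function names and the target standard error of 0.25 are hypothetical choices made for this example; they are not values taken from the record above. The calibration slope, C-statistic, and net benefit criteria need further inputs and are not attempted here.

```python
import math


def n_for_oe(phi: float, target_se_ln_oe: float) -> int:
    """Smallest n such that sqrt((1 - phi) / (n * phi)) <= target_se_ln_oe.

    Assumes the delta-method approximation for the standard error of
    ln(O/E); phi is the anticipated outcome event proportion.
    """
    if not 0.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between 0 and 1")
    if target_se_ln_oe <= 0.0:
        raise ValueError("target_se_ln_oe must be positive")
    return math.ceil((1.0 - phi) / (phi * target_se_ln_oe ** 2))


def expected_events(n: int, phi: float) -> float:
    """Expected number of outcome events among n participants."""
    return n * phi


phi = 0.018  # event proportion quoted in the abstract
n_oe = n_for_oe(phi, target_se_ln_oe=0.25)  # 0.25 is a hypothetical target
print(n_oe, round(expected_events(n_oe, phi)))

# Consistency check on the participant and event counts quoted in the abstract
print(round(expected_events(9835, phi)), round(expected_events(6443, phi)))
```

The last line simply multiplies the quoted sample sizes by the event proportion, which reproduces the 177 and 116 events reported in the abstract.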
Citation
Riley, R. D., Debray, T. P. A., Collins, G. S., Archer, L., Ensor, J., van Smeden, M., & Snell, K. I. E. (2021). Minimum sample size for external validation of a clinical prediction model with a binary outcome. Statistics in Medicine, 40(19), 4230-4251. https://doi.org/10.1002/sim.9025
Journal Article Type | Article |
---|---|
Acceptance Date | Mar 22, 2021 |
Online Publication Date | May 24, 2021 |
Publication Date | Aug 30, 2021 |
Publicly Available Date | May 30, 2023 |
Journal | Statistics in Medicine |
Print ISSN | 0277-6715 |
Publisher | Wiley |
Volume | 40 |
Issue | 19 |
Pages | 4230-4251 |
DOI | https://doi.org/10.1002/sim.9025 |
Keywords | Statistics and Probability, Epidemiology |
Publisher URL | https://onlinelibrary.wiley.com/doi/10.1002/sim.9025 |
Files
sim.9025.pdf
(2.2 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/