Developing clinical prediction models when adhering to minimum sample size recommendations: The importance of quantifying bootstrap variability in tuning parameters and predictive performance
Authors
Riley, Richard D
Abstract
Recent minimum sample size formulae (Riley et al.) for developing clinical prediction models help ensure that development datasets are of sufficient size to minimise overfitting. While these criteria are known to avoid excessive overfitting on average, the extent of variability in overfitting at recommended sample sizes is unknown. We investigated this through a simulation study and an empirical example, developing logistic regression clinical prediction models using unpenalised maximum likelihood estimation and various post-estimation shrinkage or penalisation methods. While the mean calibration slope was close to the ideal value of one for all methods, penalisation further reduced the level of overfitting, on average, compared with unpenalised methods. This came at the cost of higher variability in predictive performance for penalisation methods in external data. We recommend that penalisation methods are used in data that meet, or surpass, minimum sample size requirements to further mitigate overfitting, and that the variability in predictive performance and in any tuning parameters should always be examined as part of the model development process, since this provides additional information over average (optimism-adjusted) performance alone. Lower variability would give reassurance that the developed clinical prediction model will perform well in new individuals from the same population as was used for model development.
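To make the idea of examining bootstrap variability concrete, the following is a minimal sketch and not the authors' code. It assumes a single-predictor logistic model fitted by unpenalised maximum likelihood on simulated data, refits the model in each bootstrap sample, and records the calibration slope of each refitted model in the original data. All function names, the sample size, the true coefficients and the number of bootstrap samples are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def fit_logistic(x, y, iterations=15):
    """Fit logit P(y=1) = a + b*x by Newton-Raphson (unpenalised ML)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            z = max(-30.0, min(30.0, a + b * xi))  # guard against overflow
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def simulate(n, rng, intercept=-1.0, slope=1.0):
    """Simulated development data (assumed values, illustration only)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(intercept + slope * xi)))
         else 0 for xi in x]
    return x, y


def calibration_slope(a, b, x, y):
    """Slope from regressing the outcome on the model's linear predictor."""
    lp = [a + b * xi for xi in x]
    return fit_logistic(lp, y)[1]


def bootstrap_calibration_slopes(x, y, n_boot, rng):
    """Refit in each bootstrap sample; evaluate in the original data."""
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        a, b = fit_logistic([x[i] for i in idx], [y[i] for i in idx])
        slopes.append(calibration_slope(a, b, x, y))
    return slopes


rng = random.Random(2021)
x, y = simulate(300, rng)
slopes = bootstrap_calibration_slopes(x, y, 100, rng)
ordered = sorted(slopes)
mean_slope = statistics.mean(slopes)
# Approximate 2.5th and 97.5th percentiles of the bootstrap distribution
lower = ordered[int(0.025 * len(ordered))]
upper = ordered[int(0.975 * len(ordered)) - 1]
print(f"mean calibration slope: {mean_slope:.3f}")
print(f"approximate 95% range:  {lower:.3f} to {upper:.3f}")
```

Under these assumptions, the width of the reported range, and not only the mean, is the quantity of interest: a mean slope close to one with a wide range would suggest that any single developed model could still be noticeably miscalibrated. A penalised fit would additionally need its tuning parameter recorded in each bootstrap sample, which this sketch does not attempt.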
Citation
Martin, G. P., Riley, R. D., Collins, G. S., & Sperrin, M. (2021). Developing clinical prediction models when adhering to minimum sample size recommendations: The importance of quantifying bootstrap variability in tuning parameters and predictive performance. Statistical Methods in Medical Research, 30(12), 2545-2561. https://doi.org/10.1177/09622802211046388
Journal Article Type | Article |
---|---|
Acceptance Date | Oct 8, 2021 |
Online Publication Date | Oct 8, 2021 |
Publication Date | Oct 8, 2021 |
Publicly Available Date | May 30, 2023 |
Journal | Statistical Methods in Medical Research |
Print ISSN | 0962-2802 |
Publisher | SAGE Publications |
Volume | 30 |
Issue | 12 |
Pages | 2545-2561 |
DOI | https://doi.org/10.1177/09622802211046388 |
Keywords | Clinical prediction model, penalisation, shrinkage, validation, overfitting |
Publisher URL | https://journals.sagepub.com/doi/10.1177/09622802211046388?url_ver=Z39.88-2003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub%20%200pubmed |
Files
09622802211046388.pdf
(4.8 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/