Penalisation and shrinkage methods can produce unreliable clinical prediction models especially when sample size is small

doi:10.1016/j.jclinepi.2020.12.005

Abstract

Objectives
When developing a clinical prediction model, penalization techniques are recommended to address overfitting, as they shrink predictor effect estimates toward the null and reduce mean-square prediction error in new individuals. However, shrinkage and penalty terms (‘tuning parameters’) are estimated with uncertainty from the development data set. We examined the magnitude of this uncertainty and the subsequent impact on prediction model performance.

Study Design and Setting
This study comprises applied examples and a simulation study of the following methods: uniform shrinkage (estimated via a closed-form solution or bootstrapping), ridge regression, the lasso, and elastic net.

Results
In a particular model development data set, penalization methods can be unreliable because tuning parameters are estimated with large uncertainty. This is of most concern when development data sets have a small effective sample size and the model's Cox-Snell R² is low. The problem can lead to considerable miscalibration of model predictions in new individuals.
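
As a rough illustration of why small samples and a low Cox-Snell R² go together with heavy shrinkage, the sketch below evaluates the heuristic closed-form uniform shrinkage factor, S = 1 - p/LR, with the likelihood ratio statistic written as LR = -n ln(1 - R²). This formula is not stated in the abstract; it is an assumption brought in from the wider shrinkage and sample-size literature. The function name, the choice of 10 predictor parameters, and R² = 0.1 are all hypothetical. The sketch gives only the expected shrinkage, not the uncertainty around it that the study examines.

```python
import math


def expected_uniform_shrinkage(n, p, r2_cs):
    """Heuristic closed-form uniform shrinkage factor (illustrative only).

    Assumes the model likelihood ratio statistic can be written as
    LR = -n * ln(1 - R2_CS), so that S = 1 - p / LR.
    n     : number of participants in the development data set
    p     : number of predictor parameters
    r2_cs : Cox-Snell R-squared of the model (strictly between 0 and 1)
    """
    if n <= 0 or p < 0:
        raise ValueError("n must be positive and p non-negative")
    if not 0.0 < r2_cs < 1.0:
        raise ValueError("r2_cs must lie strictly between 0 and 1")
    likelihood_ratio = -n * math.log(1.0 - r2_cs)
    return 1.0 - p / likelihood_ratio


if __name__ == "__main__":
    # Hypothetical scenario: 10 predictor parameters, Cox-Snell R2 of 0.1.
    for n in (100, 500, 2000):
        s = expected_uniform_shrinkage(n, p=10, r2_cs=0.1)
        print(f"n = {n:5d}  expected shrinkage factor = {s:.3f}")
```

Under these assumed inputs the factor moves toward 1 as n grows, which is consistent with the recommendation to apply penalization with large effective sample sizes.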

Conclusion
Penalization methods are not a ‘carte blanche’; they do not guarantee a reliable prediction model is developed. They are more unreliable when needed most (i.e., when overfitting may be large). We recommend they are best applied with large effective sample sizes, as identified from recent sample size calculations that aim to minimize the potential for model overfitting and precisely estimate key parameters.

Citation

(2020). Penalisation and shrinkage methods can produce unreliable clinical prediction models especially when sample size is small. Journal of Clinical Epidemiology, 88-96. https://doi.org/10.1016/j.jclinepi.2020.12.005

Acceptance Date	Dec 2, 2020
Publication Date	Dec 8, 2020
Journal	Journal of Clinical Epidemiology
Print ISSN	0895-4356
Publisher	Elsevier
Pages	88-96
DOI	https://doi.org/10.1016/j.jclinepi.2020.12.005
Keywords	Risk prediction models, Penalization, Shrinkage, Overfitting, Sample size
Publisher URL	https://doi.org/10.1016/j.jclinepi.2020.12.005

Publisher Licence URL
https://creativecommons.org/licenses/by-nc-nd/4.0/
