Constanza L. Andaur Navarro
Systematic review identifies the design and methodological conduct of studies on machine learning-based prediction models
Navarro, Constanza L. Andaur; Damen, Johanna A. A.; van Smeden, Maarten; Takada, Toshihiko; Nijman, Steven W. J.; Dhiman, Paula; Ma, Jie; Collins, Gary S.; Bajpai, Ram; Riley, Richard D.; Moons, Karel G. M.; Hooft, Lotty
Authors
Johanna A. A. Damen
Maarten van Smeden
Toshihiko Takada
Steven W. J. Nijman
Paula Dhiman
Jie Ma
Gary S. Collins
Ram Bajpai (r.bajpai@keele.ac.uk)
Richard D. Riley
Karel G. M. Moons
Lotty Hooft
Abstract
Background and Objectives
We sought to summarize the study design, modelling strategies, and performance measures reported in studies on clinical prediction models developed using machine learning techniques.
Methods
We searched PubMed for articles published between 01/01/2018 and 31/12/2019 describing the development, or the development with external validation, of a multivariable prediction model using any supervised machine learning technique. We applied no restrictions on study design, data source, or predicted patient-related health outcomes.
Results
We included 152 studies: 58 (38.2% [95% CI 30.8–46.1]) were diagnostic and 94 (61.8% [95% CI 53.9–69.2]) were prognostic. Most studies reported only the development of a prediction model (n = 133, 87.5% [95% CI 81.3–91.8]), focused on binary outcomes (n = 131, 86.2% [95% CI 79.8–90.8]), and did not report a sample size calculation (n = 125, 82.2% [95% CI 75.4–87.5]). The most commonly used algorithms were support vector machines (n = 86/522, 16.5% [95% CI 13.5–19.9]) and random forests (n = 73/522, 14% [95% CI 11.3–17.2]). Reported values for the area under the receiver operating characteristic curve ranged from 0.45 to 1.00. Calibration metrics were often missing (n = 494/522, 94.6% [95% CI 92.4–96.3]).
Conclusion
Our review revealed that greater attention is needed to the handling of missing values, the methods used for internal validation, and the reporting of calibration to improve the methodological conduct of studies on machine learning-based prediction models.
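As context for the performance measures discussed above, the following is a minimal, illustrative Python sketch of how a study might report both discrimination (area under the ROC curve) and calibration (calibration intercept and slope) for a binary prediction model. It is not taken from the article or the reviewed studies; the use of scikit-learn and statsmodels, the function name, and the simulated data are assumptions for illustration only.

```python
# Illustrative sketch (not the authors' code): report discrimination (AUC)
# and calibration (intercept and slope) for a binary prediction model.
import numpy as np
from sklearn.metrics import roc_auc_score
import statsmodels.api as sm

def discrimination_and_calibration(y_true, y_prob):
    """Return AUC, calibration intercept, and calibration slope.

    y_true: observed binary outcomes (0/1); y_prob: predicted probabilities.
    """
    # Discrimination: area under the receiver operating characteristic curve.
    auc = roc_auc_score(y_true, y_prob)

    # Calibration: logistic regression of the outcome on the logit of the
    # predicted probabilities. A well-calibrated model has an intercept
    # close to 0 and a slope close to 1.
    eps = 1e-12
    p = np.clip(y_prob, eps, 1 - eps)
    logit_p = np.log(p / (1 - p))
    fit = sm.Logit(y_true, sm.add_constant(logit_p)).fit(disp=0)
    intercept, slope = fit.params
    return auc, intercept, slope

# Example with simulated data (purely for demonstration)
rng = np.random.default_rng(0)
p_sim = rng.uniform(0.05, 0.95, size=500)
y_sim = rng.binomial(1, p_sim)
print(discrimination_and_calibration(y_sim, p_sim))
```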
Citation
Navarro, C. L. A., Damen, J. A. A., van Smeden, M., Takada, T., Nijman, S. W. J., Dhiman, P., …Hooft, L. (2023). Systematic review identifies the design and methodological conduct of studies on machine learning-based prediction models. Journal of Clinical Epidemiology, 154, 8-22. https://doi.org/10.1016/j.jclinepi.2022.11.015
| Journal Article Type | Review |
| --- | --- |
| Acceptance Date | Nov 22, 2022 |
| Online Publication Date | Nov 25, 2022 |
| Publication Date | 2023-02 |
| Deposit Date | Jun 28, 2023 |
| Journal | Journal of Clinical Epidemiology |
| Print ISSN | 0895-4356 |
| Publisher | Elsevier |
| Peer Reviewed | Peer Reviewed |
| Volume | 154 |
| Pages | 8-22 |
| DOI | https://doi.org/10.1016/j.jclinepi.2022.11.015 |
| Public URL | https://keele-repository.worktribe.com/output/509479 |
| Publisher URL | https://www.sciencedirect.com/science/article/pii/S0895435622003006?via%3Dihub |