A Comprehensive Performance Comparison Study of Various Statistical Models that Accommodate Challenges of the Gut Microbiome Data

Hajihosseini, Morteza; Amini, Payam; Saidi-Mehrabad, Alireza; Hajizadeh, Nastaran; Kozyrskyj, Anita L.; Dinu, Irina

Authors

Morteza Hajihosseini

Payam Amini

Alireza Saidi-Mehrabad

Nastaran Hajizadeh

Anita L. Kozyrskyj

Irina Dinu



Abstract

The human gut microbiome refers to the trillions of symbiotic bacteria that colonize the human gut after birth and play an essential role in maintaining human health. Various factors can influence the human microbiome, delaying the normal maturation of the gut microbiota and leading to the onset of various diseases. Studying gut microbiome composition therefore offers evidence for early disease detection and opportunities for intervention. Stool samples analyzed by high-throughput sequencing of 16S ribosomal RNA usually yield a count table (number of reads) of detected species per sample, in the form of amplicon sequence variants (ASVs). ASV count data pose several inherent challenges, such as over-dispersion, within-sample correlation, and a large number of zeros. Appropriate statistical methods are needed to measure the effect of important factors on the gut microbial community while addressing these challenges. This paper compared, in a comprehensive simulation study, the behavior of the most common statistical methods that accommodate the challenges of gut microbiome data. In sixty-seven percent of our simulation scenarios, the zero-inflated negative binomial model had a lower mean squared error than the other methods, and the zero-inflated Gaussian mixture model had better statistical power. The real data application on the SKOT Cohorts dataset showed the effect of maternal obesity on the taxon abundance of infants at the 9- and 18-month assessments. Our study showed that some of the more recent methods could adequately accommodate the challenges in gut microbiome data without requiring data transformation or normalization.
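As a rough illustration of two of the challenges named in the abstract (over-dispersion and excess zeros), the following minimal sketch writes out a zero-inflated negative binomial probability mass function in the common mean/dispersion parameterization. The function names and the parameter values (`mu`, `theta`, `pi`) are hypothetical choices made for this example; they are not taken from the paper, and the paper's own parameterization and software may differ.

```python
import math


def nb_pmf(y, mu, theta):
    """Negative binomial pmf with mean mu and dispersion (size) theta.

    Variance is mu + mu**2 / theta, so small theta means strong over-dispersion.
    """
    log_p = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, theta, pi):
    """Zero-inflated NB: a structural zero with probability pi, else an NB count."""
    base = (1.0 - pi) * nb_pmf(y, mu, theta)
    return pi + base if y == 0 else base


def zinb_moments(mu, theta, pi):
    """Closed-form mean and variance of the zero-inflated NB distribution."""
    mean = (1.0 - pi) * mu
    var = (1.0 - pi) * mu * (1.0 + mu * (pi + 1.0 / theta))
    return mean, var


# Illustrative values only (assumptions, not estimates from the study).
mu, theta, pi = 5.0, 0.5, 0.3
mean, var = zinb_moments(mu, theta, pi)
p_zero = zinb_pmf(0, mu, theta, pi)

print(f"mean = {mean:.2f}, variance = {var:.2f}")   # variance far exceeds the mean
print(f"P(count = 0) = {p_zero:.3f}")               # exceeds pi: NB zeros add to structural zeros
```

Under these assumed values the variance is much larger than the mean and more than half of the probability mass sits at zero, which is the pattern that plain Poisson models cannot capture.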

Citation

Hajihosseini, M., Amini, P., Saidi-Mehrabad, A., Hajizadeh, N., Kozyrskyj, A. L., & Dinu, I. (in press). A Comprehensive Performance Comparison Study of Various Statistical Models that Accommodate Challenges of the Gut Microbiome Data. Statistics in Biosciences, https://doi.org/10.1007/s12561-024-09435-8

Journal Article Type: Article
Acceptance Date: May 3, 2024
Online Publication Date: May 31, 2024
Deposit Date: Jul 12, 2024
Journal: Statistics in Biosciences
Print ISSN: 1867-1764
Electronic ISSN: 1867-1772
Publisher: Springer Verlag
Peer Reviewed: Peer Reviewed
DOI: https://doi.org/10.1007/s12561-024-09435-8
Public URL: https://keele-repository.worktribe.com/output/875152
Additional Information: Received: 21 November 2023; Revised: 24 April 2024; Accepted: 3 May 2024; First Online: 31 May 2024. The authors declare that there is no conflict of interest.