Summerlee Science Complex Room 1511
CANDIDATE: SIYU CHEN
ABSTRACT:
Gut microbial dysbiosis contributes to the risk of colorectal cancer, so it is important to study the gut mucosal microbiome. Gut bacterial microbiome count data exhibit excess zeros and overdispersion, which limit the usefulness of traditional Poisson regression models. We propose the generalized zero-inflated Poisson (GZIP) regression mixture model for analyzing such data. Fitting a mixture model requires specifying the number of components in the population, which is unknown in practice. In this thesis, we choose the number of components using the Bayesian information criterion (BIC). The EM algorithm is used to estimate the parameters, and the performance of the models is assessed through simulation studies. The GZIP mixture model is then applied to gut bacterial microbiome data from a colorectal cancer study. We consider only the carcinoma and healthy groups, which define a health-state covariate, and for each bacterial genus we select the best-fitting GZIP model from among models with two, three, or four components. Some special cases in which the proposed methods could not be applied are also discussed.
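The BIC-based choice of the number of components can be illustrated with a minimal sketch. It assumes the standard definition BIC = -2 log L + p ln(n), with smaller values preferred, and it takes the maximized log-likelihoods as given (for example, from EM fits). The function names, the parameter counts, and the numeric values below are hypothetical and are not taken from the thesis.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Standard BIC: -2 * logL + p * ln(n). Smaller values are preferred."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_n_components(fits, n_obs):
    """Pick the number of components with the smallest BIC.

    `fits` maps a candidate number of components to a tuple of
    (maximized log-likelihood, number of free parameters).
    """
    scores = {k: bic(ll, p, n_obs) for k, (ll, p) in fits.items()}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Hypothetical log-likelihoods and parameter counts for models with
# two, three, and four components; illustration only, not thesis results.
example_fits = {2: (-520.0, 7), 3: (-505.0, 11), 4: (-503.5, 15)}
best_k, scores = select_n_components(example_fits, n_obs=200)
print(best_k, scores)
```

With these made-up inputs, the small gain in log-likelihood from three to four components does not offset the penalty for the extra parameters, which is the trade-off BIC is meant to capture.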
Advisory Committee
- Z. Feng, Advisor
- G. Darlington
Examining Committee
- J. Balka, Chair
- Z. Feng
- G. Darlington
- L. Deeth