Sanjeena Dang (Subedi) (Carleton University)
Date
Friday, November 8, 2024, 2:30 pm - 3:30 pm
Location
234 JEFFERY HALL
Math & Stats Department Colloquium
Friday, November 8th, 2024
Time: 2:30 p.m. Place: Jeffery Hall, Room 234
Speaker: Sanjeena Dang (Subedi) (Carleton University)
Title: Clustering microbiome data using a mixture of logistic normal multinomial distributions
Abstract: The human microbiome plays an important role in human health and disease status. Next-generation sequencing technologies allow for quantifying the composition of the human microbiome. Clustering these microbiome data can provide valuable information by identifying underlying patterns across samples. However, clustering these datasets is challenging. Taxa count data in microbiome studies are typically high-dimensional, over-dispersed, and can only reveal relative abundance, and are therefore often treated as compositional. Analysing such compositional data presents many challenges because they are restricted to a simplex. I will present recent advances in clustering microbiome data using a mixture of logistic normal multinomial (LNM) models. In a logistic normal multinomial model, the relative abundance of the microbiome is mapped from a simplex to a latent variable in real Euclidean space using the additive log-ratio transformation. While a logistic normal multinomial approach brings flexibility for modelling the data, it comes with a heavy computational cost, as the parameter estimation typically relies on Bayesian techniques. In our work, we utilize an efficient framework for parameter estimation using variational Gaussian approximations (VGA). Adopting a variational Gaussian approximation for the posterior of the latent variable reduces the computational overhead substantially. Some other recent and ongoing developments using extensions of the LNM distribution to cluster microbiome data will be discussed.
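For readers unfamiliar with the additive log-ratio transformation mentioned in the abstract, the following is a minimal sketch of how it maps a composition on the simplex to real Euclidean space and back, and of how counts might be simulated from a single logistic normal multinomial component. It is an illustration only and not the speaker's implementation: the function names `alr` and `alr_inverse`, the choice of the last taxon as the reference, and the values of `mu`, `sigma`, and the sequencing depth of 1000 are all assumptions made for the example.

```python
import numpy as np

def alr(p):
    """Additive log-ratio: map a (K+1)-part composition to R^K.

    Assumption for this sketch: the last part is the reference taxon.
    """
    p = np.asarray(p, dtype=float)
    return np.log(p[..., :-1] / p[..., -1:])

def alr_inverse(y):
    """Map a point in R^K back to a (K+1)-part composition on the simplex."""
    y = np.asarray(y, dtype=float)
    # Append 0 for the reference part, since log(p_ref / p_ref) = 0.
    z = np.concatenate([y, np.zeros(y.shape[:-1] + (1,))], axis=-1)
    # Subtract the maximum before exponentiating for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Simulate one sample from a single logistic normal multinomial component.
# mu, sigma, and the depth of 1000 reads are hypothetical values.
rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0, 0.0])
sigma = 0.25 * np.eye(3)

y = rng.multivariate_normal(mu, sigma)  # latent Gaussian variable in R^3
p = alr_inverse(y)                      # relative abundances of 4 taxa
counts = rng.multinomial(1000, p)       # observed taxa counts
```

In a mixture model, each cluster would have its own `mu` and `sigma`, and the latent `y` is unobserved, which is why its posterior has to be approximated.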