Zhi Zhao: Multivariate Bayesian variable selection in high-dimensional settings


Precision cancer medicine aims to determine the optimal treatment for each patient. In-vitro cancer drug sensitivity screens combined with multi-omics characterization of the cancer cells have become an important tool for achieving this aim. Analyzing such pharmacogenomic studies requires flexible and efficient joint statistical models for associating drug sensitivity with high-dimensional multi-omics data. We propose a structured multivariate Bayesian variable selection modelling framework for sparse identification of omics features associated with multiple correlated drug responses. We provide an efficient implementation of a class of models in the BayesSUR R package (https://CRAN.R-project.org/package=BayesSUR). BayesSUR allows the models to be specified in a modular way: the user chooses among three priors for variable selection and, separately, among three priors for covariance selection. Since many anti-cancer drugs are designed for specific molecular targets, our approach can make use of known structure between responses and predictors, e.g. molecular pathways and related omics features targeted by specific drugs, via a Markov-random-field (MRF) prior for the latent variable selection indicators of the coefficient matrix in sparse seemingly unrelated regression (SUR). The structure information included in the MRF prior can improve model performance, i.e. variable selection and response prediction, compared to other common priors. The proposed approach is validated in simulation studies and applied to data from the Genomics of Drug Sensitivity in Cancer database, which includes pharmacological profiling and multi-omics characterization of a large set of heterogeneous cell lines. Finally, as an alternative to the SUR setup of the Bayesian models, we also suggest Gaussian copula models for multivariate responses of diverse types, for identifying important variables among high-dimensional covariates.
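To illustrate how an MRF prior can encode known structure among the latent selection indicators, here is a minimal sketch. It assumes a commonly used form of the prior, log p(gamma) = d * sum(gamma) + e * gamma' G gamma + const, where gamma is the vectorized binary indicator matrix and G is a symmetric adjacency matrix over indicators. The function name, the edge-list representation and the hyperparameter values are hypothetical and chosen only for illustration. This is not the BayesSUR API.

```python
def mrf_log_prior(gamma, edges, d=-2.0, e=0.5):
    """Unnormalized log-prior of binary indicators under an assumed MRF form.

    gamma : list of 0/1 selection indicators (vectorized indicator matrix)
    edges : list of (i, j) pairs, the undirected links of the graph G
    d     : sparsity parameter (negative values favour fewer selections)
    e     : strength of the structure term (positive values reward
            selecting linked indicators together)
    """
    n_selected = sum(gamma)
    # Count edges whose two endpoints are both selected.
    n_linked = sum(1 for i, j in edges if gamma[i] and gamma[j])
    # With a symmetric 0/1 adjacency matrix, gamma' G gamma counts each
    # undirected edge twice, hence the factor 2.
    return d * n_selected + e * 2 * n_linked


# Toy graph: indicators 0 and 1 are linked (e.g. two features in the
# same pathway targeted by one drug); indicators 2 and 3 are isolated.
edges = [(0, 1)]

linked = mrf_log_prior([1, 1, 0, 0], edges)    # both linked features selected
unlinked = mrf_log_prior([1, 0, 1, 0], edges)  # same count, no shared edge
```

Both configurations select two indicators, but the one that respects the graph gets the higher prior mass. This is the mechanism by which pathway or drug-target knowledge can guide variable selection.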


About the speaker:

Zhi Zhao received a PhD in Biostatistics from the University of Oslo in 2020. Since then, he has been a postdoc at Radiumhospitalet, Oslo University Hospital. His work focuses on developing statistical methods for personalized cancer therapy. His research interests include sparse Bayesian models, penalized likelihood methods, mixed models, survival analysis, and modelling of complex and high-dimensional data.

Published Apr. 7, 2022 5:02 PM - Last modified May 19, 2022 3:01 PM