Statistical and Computational Methods
Single-Cell Genomics
(Under Construction)
Gene Regulation
 
The GEO database hosts a massive amount of publicly available gene expression data, while chromatin accessibility data (e.g. DNase-seq) remain quite limited. This study aims to build a scalable and computationally efficient regression model from matched gene expression data and regulatory profiles, to generalize the model for prediction across multiple platforms, and ultimately to construct a comprehensive regulome landscape.
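To illustrate the general idea of predicting regulatory profiles from matched expression data, here is a minimal sketch using closed-form ridge regression on synthetic data. The choice of ridge regression, the function names (`fit_ridge`, `predict`), the penalty `lam`, and the simulated matrices are all assumptions made for illustration; they are not the model used in this study.

```python
import numpy as np

def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge fit: B = (X'X + lam * I)^{-1} X'Y.

    X: samples x genes expression matrix.
    Y: samples x sites accessibility matrix (matched samples).
    """
    n_features = X.shape[1]
    gram = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

def predict(X, B):
    """Predict accessibility at every site from expression."""
    return X @ B

# Synthetic stand-in for matched expression / accessibility data (hypothetical).
rng = np.random.default_rng(0)
n_samples, n_genes, n_sites = 50, 20, 5
X = rng.normal(size=(n_samples, n_genes))
B_true = rng.normal(size=(n_genes, n_sites))
Y = X @ B_true + 0.1 * rng.normal(size=(n_samples, n_sites))

B_hat = fit_ridge(X, Y, lam=0.1)
Y_pred = predict(X, B_hat)
```

In practice the number of genes and sites is far larger than in this toy setup, which is why scalability and computational efficiency matter for the real model.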
Microbiome and Metagenomics
 
Analyzing human microbiome data from an individual study may yield inconsistent results due to technical variability. To address potential biases from sequencing protocols, we developed a kernel regression framework for the integrative analysis of multiple microbiome datasets that accounts for between-study dissimilarity, and proposed a score-based statistic for jointly testing the common effect and heterogeneity.
Biomedical Application and Collaboration Projects
Cancer Immunology
Immune checkpoint inhibitors are immunotherapy drugs that block checkpoint proteins from binding to their partner proteins (e.g. PD-1 and PD-L1). We are interested in detecting such ligand-receptor pairs and their interactions between different cell subpopulations based on scRNA-seq data.
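One simple way to score a candidate ligand-receptor interaction between two cell subpopulations is to multiply the mean ligand expression in the sending population by the mean receptor expression in the receiving population. The sketch below shows that heuristic on a toy matrix. The function `lr_score`, the cluster labels, and the expression values are hypothetical, and this scoring rule is an assumption for illustration, not necessarily the approach taken in this project.

```python
import numpy as np

def lr_score(expr, labels, genes, ligand, receptor, sender, receiver):
    """Mean ligand expression in `sender` cells times
    mean receptor expression in `receiver` cells."""
    gene_index = {g: i for i, g in enumerate(genes)}
    ligand_mean = expr[labels == sender, gene_index[ligand]].mean()
    receptor_mean = expr[labels == receiver, gene_index[receptor]].mean()
    return ligand_mean * receptor_mean

# Toy cells x genes matrix (made-up values).
# CD274 encodes PD-L1 and PDCD1 encodes PD-1.
genes = ["CD274", "PDCD1"]
expr = np.array([
    [4.0, 0.0],  # tumor cell
    [2.0, 0.0],  # tumor cell
    [0.0, 3.0],  # T cell
    [0.0, 1.0],  # T cell
])
labels = np.array(["tumor", "tumor", "tcell", "tcell"])

score = lr_score(expr, labels, genes, "CD274", "PDCD1", "tumor", "tcell")
reverse = lr_score(expr, labels, genes, "CD274", "PDCD1", "tcell", "tumor")
print(score, reverse)
```

The score is directional: swapping sender and receiver generally gives a different value, which matters because the ligand and the receptor are typically expressed by different cell types.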
Human Microbiome Project Data
Human Microbiome Project data can be accessed via the database of Genotypes and Phenotypes (dbGaP). Our contribution is an open-source R toolkit to download, decrypt, and analyze controlled-access data for users with a dbGaP repository key.
Course Projects
- BARTlearner: A unified software package for heterogeneous treatment effect estimation in observational studies via BART. [link]
  Special Studies and Research Biostatistics | Advisor: Ravi Varadhan | PH.140.840
- A Bayesian hierarchical model for PANSS score trajectory prediction. [link]
  Advanced Topics in Bayesian Hierarchical Models | PH.140.850
- NHANES data presentation. [link]
  Advanced Topics in Wearable Computing | PH.140.850
- 2019-2020 NBA Playoffs Prediction. [link]
  Advanced Data Science | PH.140.712