Statistical and Computational Methods

Single-Cell Genomics

(Under Construction)

Gene Regulation

The GEO database hosts a massive amount of publicly available gene expression data, whereas chromatin accessibility data (e.g. DNase-seq) remain quite limited. This study aims to build a scalable and computationally efficient regression model from matched gene expression data and regulatory profiles, to generalize the model for prediction across multiple platforms, and ultimately to construct a comprehensive regulome landscape.
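As a rough illustration of the idea of predicting regulatory profiles from expression, the sketch below fits a multi-output ridge regression on synthetic data. Ridge regression is an assumption chosen for simplicity, not necessarily the model used in this study; the function names (`fit_ridge`, `predict`), the penalty `lam`, and the simulated matrices are all hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge fit with an unpenalized intercept.

    X: (samples x genes) expression matrix.
    Y: (samples x sites) accessibility matrix.
    Returns the coefficient matrix B (genes x sites) and the intercept vector.
    """
    x_mean = X.mean(axis=0)
    y_mean = Y.mean(axis=0)
    Xc = X - x_mean  # center predictors so the intercept is not penalized
    Yc = Y - y_mean
    n_genes = X.shape[1]
    # Solve (Xc'Xc + lam * I) B = Xc'Yc for all sites at once.
    B = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_genes), Xc.T @ Yc)
    intercept = y_mean - x_mean @ B
    return B, intercept

def predict(X_new, B, intercept):
    """Predict accessibility for new expression samples."""
    return X_new @ B + intercept

# Synthetic stand-in for matched expression / accessibility data (hypothetical).
rng = np.random.default_rng(0)
n_samples, n_genes, n_sites = 50, 10, 3
X = rng.normal(size=(n_samples, n_genes))
B_true = rng.normal(size=(n_genes, n_sites))
Y = X @ B_true + 0.1 * rng.normal(size=(n_samples, n_sites))

B_hat, b0 = fit_ridge(X, Y, lam=0.1)
Y_hat = predict(X, B_hat, b0)
mse = float(np.mean((Y - Y_hat) ** 2))
```

Because all sites share the same left-hand matrix, a single linear solve covers every output, which is one reason a penalized linear model of this kind scales to many regulatory sites.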

Microbiome and Metagenomics

Analyzing human microbiome data from an individual study may yield inconsistent results due to technical variability. To address potential biases from sequencing protocols, we developed a kernel regression framework for the integrative analysis of multiple microbiome datasets that accounts for between-study dissimilarity, and proposed a score-based statistic for jointly testing the common effect and heterogeneity.


Biomedical Application and Collaboration Projects

Cancer Immunology

Immune checkpoint inhibitors are immunotherapy drugs that block checkpoint proteins from binding to their partner proteins (e.g. PD-1 and PD-L1). We are interested in detecting such ligand-receptor pairs and their interactions between different cell subpopulations based on scRNA-seq data.
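One simple way to score a candidate ligand-receptor interaction between two cell subpopulations is to multiply the mean ligand expression in the sending population by the mean receptor expression in the receiving population. The sketch below shows that heuristic on toy values; it is an illustrative assumption, not the method used in this project, and the function name `interaction_scores`, the cluster labels, and the expression values are hypothetical. The gene symbols CD274 and PDCD1 encode PD-L1 and PD-1.

```python
from statistics import mean

def interaction_scores(expr, clusters, pairs):
    """Score ligand-receptor pairs between every ordered pair of clusters.

    expr:     dict mapping gene -> list of per-cell expression values.
    clusters: list of cluster labels, one per cell (same order as expr values).
    pairs:    list of (ligand_gene, receptor_gene) tuples.
    Returns a dict keyed by (ligand, receptor, sender, receiver).
    """
    labels = sorted(set(clusters))
    # Mean expression of each gene within each cluster.
    mean_expr = {
        gene: {
            c: mean(v for v, lab in zip(values, clusters) if lab == c)
            for c in labels
        }
        for gene, values in expr.items()
    }
    scores = {}
    for ligand, receptor in pairs:
        for sender in labels:
            for receiver in labels:
                scores[(ligand, receptor, sender, receiver)] = (
                    mean_expr[ligand][sender] * mean_expr[receptor][receiver]
                )
    return scores

# Toy data (hypothetical): two tumor cells followed by two T cells.
expr = {
    "CD274": [4.0, 6.0, 0.0, 0.0],  # PD-L1, the ligand
    "PDCD1": [0.0, 0.0, 3.0, 5.0],  # PD-1, the receptor
}
clusters = ["tumor", "tumor", "tcell", "tcell"]
scores = interaction_scores(expr, clusters, [("CD274", "PDCD1")])
top = max(scores, key=scores.get)
```

In practice a raw product like this would need a significance assessment, for example by permuting cluster labels, before any pair is reported as an interaction.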

Human Microbiome Project Data

Human Microbiome Project data can be accessed via the database of Genotypes and Phenotypes (dbGaP). Our contribution is an open-source R toolkit to download, decrypt, and analyze controlled-access data for users with a dbGaP repository key.


Course Projects

  • BARTlearner: A unified software package for heterogeneous treatment effect estimation in observational studies via BART. [link]
    Special Studies and Research Biostatistics | Advisor: Ravi Varadhan | PH.140.840

  • A Bayesian hierarchical model for PANSS score trajectory prediction. [link]
    Advanced Topics in Bayesian Hierarchical Models | PH.140.850

  • NHANES data presentation. [link]
    Advanced Topics in Wearable Computing | PH.140.850

  • 2019-2020 NBA Playoffs Prediction. [link]
    Advanced Data Science | PH.140.712