Jiacheng Miao    jiacheng.miao at wisc.edu

Jiacheng is a fourth-year Ph.D. candidate in Biomedical Data Science at UW-Madison, where he works with Prof. Qiongshi Lu and Prof. Lauren Schmitz. He previously interned at the Regeneron Genetics Center. He received his B.S. in Statistics from Nanjing University.

CV  /  Twitter  /  Google Scholar

News

  • Jan 4, 2024: The POP-GWAS preprint is now on medRxiv! Try it to boost your GWAS power with machine learning.

Research

My recent research has focused on

  • Machine learning (ML)-assisted inference
  • Heterogeneous treatment effect, gene-environment interactions, and gene-gene interactions
  • Transfer learning and portability of cross-ancestry polygenic risk scores

In general, I am interested in using rigorous statistics and interpretable machine learning to answer scientific questions, especially in human genetics.

Below are my publications and preprints ('*' denotes equal authorship; representative papers are highlighted).

Lead-authored
[16] Valid inference for machine learning-assisted GWAS.
Miao J., Wu Y., Sun Z., Miao X., Lu T., Zhao J., Lu Q. (2024).
Submitted. (preprint available on medRxiv)
Preprint / Software

--We report the pervasive risk of false-positive associations in conventional GWAS on outcomes predicted by machine learning (ML). We introduce POP-GWAS, a novel statistical framework that reimagines GWAS on ML-imputed outcomes. POP-GWAS provides valid and optimal statistical inference irrespective of the quality of imputation or of the variables and algorithms used for imputation. It requires only GWAS summary statistics as input and is optimized for the characteristics of GWAS data.

[15] Assumption-lean and data-adaptive post-prediction inference.
Miao J.*, Miao X.*, Wu Y., Zhao J., Lu Q. (2023).
Submitted. (preprint available on arXiv)
Preprint / Software

--We introduce an assumption-lean and data-adaptive Post-Prediction Inference (POP-Inf) procedure that allows valid and powerful inference based on ML-predicted outcomes. Its "assumption-lean" property guarantees reliable statistical inference without assumptions on the ML prediction, for a wide range of statistical quantities. Its "data-adaptive" feature guarantees an efficiency gain over existing post-prediction inference methods, regardless of the accuracy of the ML prediction.

[14] Statistical Methods for Gene-environment Interaction Analysis.
Miao J., Wu Y., Lu Q. (2023).
WIREs Computational Statistics (Review)
Journal

--We provide a comprehensive review of the evolution of statistical methods for GxE interaction analysis from the pre-GWAS era to the present day, which is characterized by meta-analyses conducted by large genetics consortia, sharing of summary association statistics, and statistical analyses that require only summary data as input.

[13] Reimagining Gene-Environment Interaction Analysis for Human Complex Traits.
Miao J., Song G., Wu Y., Hu J., Wu Y., Basu S., Andrews J., Schaumberg K., Fletcher J., Schmitz L., Lu Q. (2022).
Submitted. (preprint available on bioRxiv)
Preprint / Software

--We present a unified theory and framework, called PIGEON, for modeling polygenic GxE effects for complex traits. It allows us to define the estimands of interest and to establish the connections and distinctions between different methods in GxE inference. Motivated by our theory, we have also developed an innovative approach to estimate GxE interactions using genome-wide summary data. It is unbiased, computationally efficient, and robust to sample overlap, heteroscedasticity, and gene-environment correlation.

[12] Quantifying portable genetic effects and improving cross-ancestry genetic prediction with GWAS summary statistics.
Miao J.*, Guo H.*, Song G., Zhao Z., Hou L., Lu Q. (2023).
Nature Communications, 14, 832
Journal / Preprint / Software / Poster

--We introduce a cross-population genetic risk prediction framework, called X-Wing, that (1) quantifies cross-population local genetic correlation, (2) incorporates it as a prior into a Bayesian framework that amplifies correlated SNP effects between populations, and (3) uses summary statistics-based ensemble learning to further improve prediction accuracy.

[11] Identifying genetic loci associated with complex trait variability.
Miao J., Lu Q. (2022).
Handbook of Statistical Bioinformatics (2nd Edition). Springer. (Book Chapter)
Book Chapter

--We provide a comprehensive review of the evolution of statistical methods for identifying genetic loci associated with complex trait variability. It covers the history of the methods' development, the assumptions they make, and the pros and cons of each method.

[10] A quantile integral linear model to quantify genetic effects on phenotypic variability.
Miao J., Lin Y., Wu Y., Zheng B., Schmitz L., Fletcher J., Lu Q. (2022).
Proceedings of the National Academy of Sciences (PNAS), 119(39): e2212959119.
Journal / Preprint / Software / Slides

    - Winner of the Distinguished Student Paper Award from the Section on Statistical Genomics and Genetics of the American Statistical Association (ASA), 2022.
    - Winner of the Reviewers' Choice Award from the American Society of Human Genetics (ASHG) Meeting, 2021. (Top 10%)

--We propose a unified statistical framework, called QUAIL, for estimating genetic effects on the variability of quantitative traits and for prioritizing genetic variants involved in interactions. QUAIL constructs a quantile-integral phenotype that aggregates information from all quantile levels, makes no assumptions about the distribution of the phenotype, and accounts for confounding effects on both the trait level and trait variability.

Collaborative & co-authored
[9] Controlling for polygenic genetic confounding in epidemiologic association studies.
Zhao Z., Yang X., Miao J., Dorn S., Barcellos S., Fletcher J., Lu Q. (2024).
Submitted. (preprint available on bioRxiv)
Preprint
[8] Pervasive biases in proxy GWAS based on parental history of Alzheimer's disease.
Wu Y.*, Sun Z.*, Zheng Q., Miao J., Dorn S., Mukherjee S., Fletcher J., Lu Q. (2023).
Submitted. (preprint available on bioRxiv)
Preprint
[7] The impact of genomic variation on function (IGVF) consortium.
IGVF Consortium (2023).
Submitted. (preprint available on arXiv)
Preprint
[6] Neurogenetic Mechanisms of Risk for ADHD: Examining Associations of Functionally-Annotated Polygenic Scores and Brain Volumes in a Population Cohort.
He Q., Keding T., Zhang Q., Miao J., Herringa R., Lu Q., Travers B., Li J. J. (2022).
Submitted. (preprint available on medRxiv)
Preprint
[5] Optimizing and benchmarking polygenic risk scores with GWAS summary statistics.
Zhao Z., Gruenloh T., Wu Y., Sun Z., Miao J., Wu Y., Song J., Lu Q. (2022).
Submitted. (preprint available on bioRxiv)
Preprint
[4] Neuropathology-based APOE genetic risk score better quantifies Alzheimer's risk.
Deming Y., Vasiljevic E., Morrow A., Miao J., Van Hulle C., Jonaitis E., Ma Y., Whitenack V., Kollmorgen G., Wild N., Suridjan I., Shaw L., Asthana S., Carlsson C., Johnson S., Zetterberg H., Blennow K., Bendlin B., Lu Q., Engelman C., the Alzheimer's Disease Neuroimaging Initiative (2023).
Alzheimer's & Dementia
Preprint
[3] Decomposing heritability and genetic covariance by direct and indirect effect paths.
Song J., Zou Y., Wu Y., Miao J., Yu Z., Fletcher J., Lu Q. (2023).
PLOS Genetics, 19(1): e1010620.
Journal
[2] The socioeconomic gradient in epigenetic aging clocks: evidence from the Multi-ethnic Study of Atherosclerosis and the Health and Retirement Study.
Schmitz L., Zhao W., Ratliff S., Goodwin J., Miao J., Lu Q., Guo X., Taylor K., Ding J., Liu Y., Levine M., Smith J. (2021).
Epigenetics
Journal / Preprint
[1] The impact of late-career job loss and genetic risk on body mass index: evidence from variance polygenic scores.
Schmitz L., Goodwin J., Miao J., Lu Q., Conley D. (2021).
Scientific Reports
Journal / Preprint


Selected Awards

- Yixuan, an undergraduate researcher I mentored, was a semifinalist for the 2023 ASHG/Charles J. Epstein Trainee Award for Excellence in Human Genetics Research.

- Distinguished Student Paper Award from the American Statistical Association (ASA) Section on Statistical Genomics and Genetics, 2022 [WiscBMI News]

- Reviewers' Choice Award from the American Society of Human Genetics (ASHG) Meeting (Top 10%), 2021


Software

- POPInf, a toolbox for statistical inference with variables predicted by machine learning.

- POP-TOOLS, a comprehensive toolbox for genetic association analysis on outcomes predicted by machine learning.

- PIGEON for inference of polygenic gene-environment (GxE) interactions.

- X-Wing for improving genetic risk prediction in ancestrally diverse populations.

- QUAIL for estimating genetic effects on the variance of quantitative traits.


Mentoring

I find great fulfillment in mentoring students, witnessing their growth, and helping them achieve their goals. Below is a list of students I have had the privilege of mentoring:

- Gefei Song, Undergraduate Researcher at UW-Madison '22, Winner of the Hilldale Undergraduate/Faculty Research Award, now a Ph.D. student in Biostatistics at the University of California, Berkeley.

- Yixuan Wu, Undergraduate Researcher at UW-Madison '24, Winner of the Hilldale Undergraduate/Faculty Research Award, Semifinalist for the 2023 ASHG/Charles J. Epstein Trainee Award for Excellence in Human Genetics Research.


Notes

I use an epsilon-greedy algorithm to manage the exploration-exploitation tradeoff. Below are my notes from the exploration side:

- Causal Inference.

- An Owner’s Guide to the Human Genome.

- Patterns, Predictions, and Actions.


Quote
  • "Everything should be made as simple as possible, but no simpler" - Albert Einstein

Template from Jon Barron. Big thanks!