Selected papers are below. Most of the papers described below are accompanied by open-source software and publicly released datasets. For a full list of papers, please see Google Scholar.
AI/ML for genetic and scientific discovery
AI/ML is widely used in many scientific disciplines. I am working to make the scientific conclusions drawn from AI/ML more reliable. I have developed theory, methods, and software for machine-learning-assisted statistical inference, which enables reliable genetic and scientific discoveries from unreliable AI/ML predictions.
Valid inference for machine learning-assisted genome-wide association studies
Heterogeneous treatment effect and gene-environment interactions
Genetic effects on complex traits may depend on contexts, such as age, sex, genetic background, or social settings. Similarly, treatment effects may depend on genetic variations. I am developing theory, methods, and software for detecting and interpreting genetic effects across diverse contexts, as well as understanding how genetic variations modify policy or treatment effects.
Reimagining gene-environment interaction analysis for human complex traits
Miao J., Song G., Wu Y., Hu J., Wu Y., Basu S., Andrews J., Schaumberg K., Fletcher J., Schmitz L., Lu Q. (2022).
--Featured by Center for Genomic Science Innovation, UW-Madison
Nature Human Behaviour, in press
[Preprint]
[Software: PIGEON]
[Summary]
A quantile integral linear model to quantify genetic effects on phenotypic variability
Miao J., Lin Y., Wu Y., Zheng B., Schmitz L., Fletcher J., Lu Q. (2022).
--Winner of the 2022 ASA Section on Statistics in Genomics and Genetics Student Paper Award
Proceedings of the National Academy of Sciences (PNAS)
[Journal]
[Preprint]
[Software: QUAIL]
[Summary]
Transcriptome-level interpretation of gene-by-sex interactions for human complex traits
Miao J., Wu Y., Yang X., Schmitz L., Lu Q. (2024).
--Semifinalist for the 2023 Charles J. Epstein Trainee Award for Excellence in Human Genetics Research
Genetic risk prediction in ancestrally diverse populations
The vast majority of GWAS participants are of European descent. As a result, genetic risk prediction models such as polygenic risk scores (PRS) are more accurate in Europeans and substantially less accurate in other populations, which severely restricts their clinical utility. To address this, I have developed a transfer learning algorithm to improve the generalizability of genetic risk prediction in ancestrally diverse populations.
Quantifying portable genetic effects and improving cross-ancestry genetic prediction with GWAS summary statistics
Miao J.*, Guo H.*, Song G., Zhao Z., Hou L.†, Lu Q.† (2023).
--Featured by Department of Statistics and Data Science, Tsinghua University (in Chinese)
Nature Communications
[Journal]
[Preprint]
[Software: X-Wing]
[Summary]