Selected papers are listed below. Most of them are accompanied by open-source software and publicly released datasets. For a full list of papers, please see Google Scholar.
AI/ML for genetic and scientific discovery
AI/ML is widely used across scientific disciplines. I am working to make the scientific conclusions drawn from AI/ML more reliable. I have developed theory, methods, and software for machine-learning-assisted statistical inference, which enables reliable genetic and scientific discoveries from unreliable AI/ML predictions.
Valid inference for machine learning-assisted genome-wide association studies
Heterogeneous treatment effects, gene-environment interactions, and genetic modifiers
Genetic effects on complex traits may depend on context, such as age, sex, genetic background, or social setting. Similarly, treatment effects may depend on genetic variation. I am developing theory, methods, and software for detecting and interpreting genetic effects across diverse contexts, as well as for understanding how genetic variations modify policy or treatment effects.
Transcriptome-Level Interpretation of Gene-by-Sex Interactions for Human Complex Traits
Miao J.*, Wu Y.*, Yang X., Schmitz L., Lu Q. (2024).
--Semifinalist for the 2023 Charles J. Epstein Trainee Award for Excellence in Human Genetics Research
Reimagining gene-environment interaction analysis for human complex traits
Miao J., Song G., Wu Y., Hu J., Wu Y., Basu S., Andrews J., Schaumberg K., Fletcher J., Schmitz L., Lu Q. (2022).
Submitted
[Preprint]
[Software: PIGEON]
[Summary]
A quantile integral linear model to quantify genetic effects on phenotypic variability
Miao J., Lin Y., Wu Y., Zheng B., Schmitz L., Fletcher J., Lu Q. (2022).
--Winner of the 2022 ASA Section on Statistics in Genomics and Genetics Student Paper Award
Proceedings of the National Academy of Sciences (PNAS)
[Journal]
[Preprint]
[Software: QUAIL]
[Summary]
Generalizability of genetic risk prediction in ancestrally diverse populations
The vast majority of GWAS participants are of European descent. As a result, current genetic risk prediction models, such as polygenic risk scores (PRS), are more effective in Europeans and have substantially reduced accuracy in other populations, which severely limits their clinical utility. I have developed a transfer learning algorithm to improve the generalizability of genetic risk prediction in ancestrally diverse populations.
Quantifying portable genetic effects and improving cross-ancestry genetic prediction with GWAS summary statistics
Miao J.*, Guo H.*, Song G., Zhao Z., Hou L.†, Lu Q.† (2023).
Nature Communications
[Journal]
[Preprint]
[Software: X-Wing]
[Summary]