Causal inference for multiple risk factors and diseases from genomics data
Machnik NN, Mahmoudi SM, Borczyk M, Krätschmer I, Bauer MJ, Robinson MR. 2024. Causal inference for multiple risk factors and diseases from genomics data. bioRxiv, 10.1101/2023.12.06.570392.
Download (ext.)
https://doi.org/10.1101/2023.12.06.570392
Preprint | Published | English
Author
Machnik, Nick N (ISTA);
Mahmoudi, Seyed Mahdi (ISTA);
Borczyk, Malgorzata;
Krätschmer, Ilse (ISTA);
Bauer, Markus J.;
Robinson, Matthew Richard (ISTA)
Corresponding author has ISTA affiliation
Abstract
Statistical causal learning in genomics relies on the instrumental variable method of Mendelian Randomization (MR). Currently, an overwhelming number of MR studies purport to show causal relationships among a wide range of risk factors and outcomes. Here, we show that selecting instrumental variables from genome-wide association study estimates leads to high false discovery rates for many MR approaches, which can be greatly reduced by employing a graphical inference approach that: (i) explicitly tests instrumental variable assumptions; (ii) distinguishes direct from indirect factors in very high-dimensional data; (iii) discriminates pleiotropic from trait-specific markers, controlling for LD genome-wide; (iv) accommodates rare variants and binary outcomes in a principled way; and (v) identifies potential unobserved latent confounding. For 17 traits and 8.4M variants recorded for 458,747 individuals in the UK Biobank, we show that standard MR analysis gives an abundance of findings that disappear under stringent assumption checks, with many relationships reflecting potential unmeasured confounding. This implies that mixtures of temporal precedence and potential for reverse causality prohibit understanding the underlying nature of phenotypic and genetic correlations in biobank data. We propose that well-curated longitudinal records are likely needed and that our approach provides a first step toward robust, principled screening for potential causal links.
Publishing Year
2024
Date Published
2024-08-10
Journal Title
bioRxiv
Acknowledgement
We thank Zoltan Kutalik and members of the Robinson group at ISTA for their comments, which improved this manuscript. This work was funded by a research collaboration agreement between Boehringer Ingelheim and the research group of MRR at the Institute of Science and Technology Austria. Additional funding was provided by an SNSF Eccellenza Grant to MRR (PCEGP3-181181) and by core funding from the Institute of Science and Technology Austria. We would like to acknowledge the participants and investigators of the UK Biobank study. High-performance computing was supported by the Scientific Service Units (SSU) of IST Austria through resources provided by Scientific Computing (SciComp).
Cite this
Machnik NN, Mahmoudi SM, Borczyk M, Krätschmer I, Bauer MJ, Robinson MR. Causal inference for multiple risk factors and diseases from genomics data. bioRxiv. 2024. doi:10.1101/2023.12.06.570392
Machnik, N. N., Mahmoudi, S. M., Borczyk, M., Krätschmer, I., Bauer, M. J., & Robinson, M. R. (2024). Causal inference for multiple risk factors and diseases from genomics data. bioRxiv. https://doi.org/10.1101/2023.12.06.570392
Machnik, Nick N, Seyed Mahdi Mahmoudi, Malgorzata Borczyk, Ilse Krätschmer, Markus J. Bauer, and Matthew Richard Robinson. “Causal Inference for Multiple Risk Factors and Diseases from Genomics Data.” BioRxiv, 2024. https://doi.org/10.1101/2023.12.06.570392.
N. N. Machnik, S. M. Mahmoudi, M. Borczyk, I. Krätschmer, M. J. Bauer, and M. R. Robinson, “Causal inference for multiple risk factors and diseases from genomics data,” bioRxiv. 2024.
Machnik NN, Mahmoudi SM, Borczyk M, Krätschmer I, Bauer MJ, Robinson MR. 2024. Causal inference for multiple risk factors and diseases from genomics data. bioRxiv, 10.1101/2023.12.06.570392.
Machnik, Nick N., et al. “Causal Inference for Multiple Risk Factors and Diseases from Genomics Data.” BioRxiv, 2024, doi:10.1101/2023.12.06.570392.
All files available under the following license(s):
Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Link(s) to Main File(s)
Access Level
Open Access
Material in ISTA:
Dissertation containing ISTA record