{"_id":"18648","article_processing_charge":"No","citation":{"chicago":"Machnik, Nick N, Seyed Mahdi Mahmoudi, Malgorzata Borczyk, Ilse Krätschmer, Markus J. Bauer, and Matthew Richard Robinson. “Causal Inference for Multiple Risk Factors and Diseases from Genomics Data.” BioRxiv, 2024. https://doi.org/10.1101/2023.12.06.570392.","apa":"Machnik, N. N., Mahmoudi, S. M., Borczyk, M., Krätschmer, I., Bauer, M. J., & Robinson, M. R. (2024). Causal inference for multiple risk factors and diseases from genomics data. bioRxiv. https://doi.org/10.1101/2023.12.06.570392","mla":"Machnik, Nick N., et al. “Causal Inference for Multiple Risk Factors and Diseases from Genomics Data.” BioRxiv, 2024, doi:10.1101/2023.12.06.570392.","ieee":"N. N. Machnik, S. M. Mahmoudi, M. Borczyk, I. Krätschmer, M. J. Bauer, and M. R. Robinson, “Causal inference for multiple risk factors and diseases from genomics data,” bioRxiv. 2024.","ama":"Machnik NN, Mahmoudi SM, Borczyk M, Krätschmer I, Bauer MJ, Robinson MR. Causal inference for multiple risk factors and diseases from genomics data. bioRxiv. 2024. doi:10.1101/2023.12.06.570392","short":"N.N. Machnik, S.M. Mahmoudi, M. Borczyk, I. Krätschmer, M.J. Bauer, M.R. Robinson, BioRxiv (2024).","ista":"Machnik NN, Mahmoudi SM, Borczyk M, Krätschmer I, Bauer MJ, Robinson MR. 2024. Causal inference for multiple risk factors and diseases from genomics data. 
bioRxiv, 10.1101/2023.12.06.570392."},"user_id":"8b945eb4-e2f2-11eb-945a-df72226e66a9","oa_version":"Preprint","OA_place":"repository","author":[{"last_name":"Machnik","id":"3591A0AA-F248-11E8-B48F-1D18A9856A87","orcid":"0000-0001-6617-9742","first_name":"Nick N","full_name":"Machnik, Nick N"},{"full_name":"Mahmoudi, Seyed Mahdi","first_name":"Seyed Mahdi","last_name":"Mahmoudi","id":"b9f6d5ef-7774-11eb-a47f-df2c75c02ee7"},{"last_name":"Borczyk","first_name":"Malgorzata","full_name":"Borczyk, Malgorzata"},{"full_name":"Krätschmer, Ilse","orcid":"0000-0002-5636-9259","first_name":"Ilse","id":"30d4014e-7753-11eb-b44b-db6d61112e73","last_name":"Krätschmer"},{"full_name":"Bauer, Markus J.","first_name":"Markus J.","last_name":"Bauer"},{"full_name":"Robinson, Matthew Richard","orcid":"0000-0001-8982-8813","first_name":"Matthew Richard","last_name":"Robinson","id":"E5D42276-F5DA-11E9-8E24-6303E6697425"}],"corr_author":"1","date_updated":"2024-12-19T14:37:45Z","date_created":"2024-12-11T10:42:59Z","type":"preprint","day":"10","department":[{"_id":"MaRo"}],"status":"public","main_file_link":[{"url":"https://doi.org/10.1101/2023.12.06.570392","open_access":"1"}],"abstract":[{"lang":"eng","text":"Statistical causal learning in genomics relies on the instrumental variable method of Mendelian Randomization (MR). 
Currently, an overwhelming number of MR studies purport to show causal relationships among a wide range of risk factors and outcomes. Here, we show that selecting instrument variables from genome-wide association study estimates leads to high false discovery rates for many MR approaches, which can be greatly reduced by employing a graphical inference approach which: (i) explicitly tests instrumental variable assumptions; (ii) distinguishes direct from indirect factors in very high-dimensional data; (iii) discriminates pleiotropic from trait-specific markers, controlling for LD genome-wide; (iv) accommodates rare variants and binary outcomes in a principled way; and (v) identifies potential unobserved latent confounding. For 17 traits and 8.4M variants recorded for 458,747 individuals in the UK Biobank, we show that standard MR analysis gives an abundance of findings that disappear under stringent assumption checks, with many relationships reflecting potential unmeasured confounding. This implies that mixtures of temporal precedence and potential for reverse-causality prohibit understanding the underlying nature of phenotypic and genetic correlations in biobank data. 
We propose that well-curated longitudinal records are likely needed and that our approach provides a first-step toward robust principled screening for potential causal links."}],"title":"Causal inference for multiple risk factors and diseases from genomics data","date_published":"2024-08-10T00:00:00Z","project":[{"grant_number":"PCEGP3_181181","name":"Improving estimation and prediction of common complex disease risk","_id":"9B8D11D6-BA93-11EA-9121-9846C619BF3A"},{"_id":"bd936e6f-d553-11ed-ba76-a82299f63e8c","name":"Advanced statistical modelling to facilitate more accurate characterisation of disease phenotypes, improved genetic mapping, and effective therapeutic hypothesis generation","grant_number":"590359"}],"publication":"bioRxiv","language":[{"iso":"eng"}],"publication_status":"published","year":"2024","acknowledged_ssus":[{"_id":"ScienComp"}],"oa":1,"related_material":{"record":[{"status":"public","relation":"dissertation_contains","id":"18642"}]},"doi":"10.1101/2023.12.06.570392","acknowledgement":"We thank Zoltan Kutalik and members of the Robinson group at ISTA for their comments, which improved this manuscript. This work was funded by a research collaboration agreement between Boehringer Ingelheim and the research group of MRR at the Institute of Science and Technology Austria. Additional funding was also provided by an SNSF Eccellenza Grant to MRR (PCEGP3-181181), and by core funding from the Institute of Science and Technology Austria. We would like to acknowledge the participants and investigators of the UK Biobank study. High-performance computing was supported by the Scientific Service Units (SSU) of IST Austria through resources provided by Scientific Computing (SciComp). 
","tmp":{"short":"CC BY-NC (4.0)","image":"/images/cc_by_nc.png","legal_code_url":"https://creativecommons.org/licenses/by-nc/4.0/legalcode","name":"Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)"},"month":"08","OA_type":"free access"}