DOI: 10.1002/gepi.22535

Inference of causal metabolite networks in the presence of invalid instrumental variables with GWAS summary data

Siyi Chen, Zhaotong Lin, Xiaotong Shen, Ling Li, Wei Pan
  • Genetics (clinical)
  • Epidemiology


We propose structural equation models (SEMs) as a general framework to infer causal networks for metabolites and other complex traits. Traditionally, SEMs are used only for individual‐level data under the assumption that all instrumental variables (IVs) are valid. To overcome these limitations, we propose both one‐ and two‐sample approaches for causal network inference based on SEMs that can: (1) perform causal analysis and discover causal relationships among multiple traits; (2) account for the possible presence of some invalid IVs; (3) allow for data analysis using only genome‐wide association studies (GWAS) summary statistics when individual‐level data are not available; and (4) consider the possibility of bidirectional relationships between traits. Our method employs a simple stepwise selection to identify invalid IVs, thus avoiding false positives while possibly increasing true discoveries based on two‐stage least squares (2SLS). We use both real GWAS data and simulated data to demonstrate the superior performance of our method over the standard 2SLS/SEMs. For real data analysis, our proposed approach is applied to a human blood metabolite GWAS summary data set to uncover putative causal relationships among the metabolites; we also identify some metabolites that are putatively causal to Alzheimer's disease (AD), which, along with the inferred causal metabolite network, suggest some possible pathways of metabolites involved in AD.
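For readers unfamiliar with the 2SLS baseline mentioned above, the following is a minimal sketch of standard individual‐level 2SLS on simulated data in which every IV is valid. It is not the authors' method: it performs no stepwise selection of invalid IVs and uses no GWAS summary statistics. The function names, sample size, effect sizes, and allele frequency are all illustrative assumptions.

```python
import numpy as np


def two_stage_least_squares(G, x, y):
    """Standard 2SLS slope of y on x, using the columns of G as instruments."""
    n = len(x)
    G1 = np.column_stack([np.ones(n), G])
    # Stage 1: regress the exposure on the instruments and keep the fitted values.
    coef1, *_ = np.linalg.lstsq(G1, x, rcond=None)
    x_hat = G1 @ coef1
    # Stage 2: regress the outcome on the fitted exposure.
    X2 = np.column_stack([np.ones(n), x_hat])
    coef2, *_ = np.linalg.lstsq(X2, y, rcond=None)
    return coef2[1]


def ols_slope(x, y):
    """Naive OLS slope of y on x, which is biased when x and y share a confounder."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


def simulate(n=20000, p=10, beta=0.5, seed=0):
    """Hypothetical data: p valid SNP instruments, one unmeasured confounder u."""
    rng = np.random.default_rng(seed)
    G = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # genotype counts 0/1/2
    u = rng.normal(size=n)                               # unmeasured confounder
    x = G @ np.full(p, 0.3) + u + rng.normal(size=n)     # exposure trait
    y = beta * x + u + rng.normal(size=n)                # outcome trait
    return G, x, y


G, x, y = simulate()
beta_2sls = two_stage_least_squares(G, x, y)
beta_ols = ols_slope(x, y)
print(f"true effect 0.5, 2SLS estimate {beta_2sls:.3f}, OLS estimate {beta_ols:.3f}")
```

In this setup the OLS slope is inflated by the shared confounder, while the 2SLS estimate should land near the simulated effect. If some columns of G also affected y directly (invalid IVs), plain 2SLS would be biased as well, which is the situation the abstract's stepwise selection is meant to address.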
