10.80027/ABCDE
Gieger Christian
0000-0001-6986-9554
Prehn Cornelia
0000-0002-1274-4715
Kastenmüller Gabi
0000-0002-2368-7322
Krumsiek Jan
0000-0003-4734-3791
Adamski Jerzy
0000-0001-9259-0199
Strauch Konstantin
Zaytseva O Olga
0000-0003-3960-5157
Wang-Sattler Rui
0000-0002-8794-8229
Sharapov Z Sodbo
0000-0003-0279-4900
Tsepilov A Yakov
0000-0002-4931-6052
Aulchenko S Yurii
0000-0002-7899-1575
Supporting data for "A network-based conditional genetic association analysis of the human metabolome"
GigaScience Database
2018
Metabolomic
genome-wide association study
multivariate model
metabolomics
conditional analysis
pleiotropy
2018-10-18
en-US
GigaDB Dataset
10.1093/gigascience/giy137
European Union FP7
unknown
602736
Ministry of Education and Science of the Russian Federation
http://dx.doi.org/10.13039/501100003443
Federal Agency of Scientific Organisations via the Institute of Cytology and Genetics
unknown
0324-2018-0017
85.66 MB
CC0 1.0 Universal
Genome-wide association studies have identified hundreds of loci that influence a wide variety of complex human traits; however, little is known regarding the biological mechanism of action of these loci. The recent accumulation of functional genomics ("omics") data, including metabolomics data, has created new opportunities for studying the functional role of specific changes in the genome. Functional genomic data are characterized by their high dimensionality, the presence of (strong) statistical dependency between traits, and potentially complex genetic control. Therefore, the analysis of such data requires specific statistical genetics methods. To facilitate our understanding of the genetic control of omics phenotypes, we propose a trait-centered, network-based conditional genetic association (cGAS) approach for identifying the direct effects of genetic variants on omics-based traits. For each trait of interest, we selected from a biological network a set of other traits to be used as covariates in the cGAS. The network can be reconstructed either from biological pathway databases (a mechanistic approach) or directly from the data, using a Gaussian Graphical Model applied to the metabolome (a data-driven approach). We derived mathematical expressions that allow the power of univariate analyses to be compared with that of conditional genetic association analyses. We then tested our approach using data from the population-based KORA study (n=1784 subjects, 1.7 million SNPs) with measured data for 151 metabolites. We found that, compared to single-trait analysis, a genetic association analysis that includes biologically relevant covariates can either gain or lose power, depending on the specific pleiotropic scenario, for which we provide empirical examples. For the metabolomics data analyzed, the mechanistic network approach had more power than the data-driven approach.
Nevertheless, our analysis suggests that neither a prior-knowledge-only approach nor a phenotypic-data-only approach is optimal, and we discuss possibilities for improvement.
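The contrast between a univariate and a conditional association test described in the abstract can be illustrated with a minimal simulated sketch. It is not the authors' code and does not use the KORA data: the function name `snp_t_statistic`, the effect size, the allele frequency and the noise structure are all assumptions chosen for illustration. It shows only one scenario, in which the covariate trait shares non-genetic variance with the trait of interest and is not itself affected by the SNP, so that conditioning is expected to increase the test statistic.

```python
import numpy as np


def snp_t_statistic(y, g, covariates=None):
    """OLS t-statistic for the SNP effect on trait y.

    covariates: optional (n, k) array of other traits to condition on.
    (Hypothetical helper written for this illustration.)
    """
    n = len(y)
    columns = [np.ones(n), g]          # intercept and SNP dosage
    if covariates is not None:
        columns.append(covariates)     # conditional model adds other traits
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov_beta[1, 1])


# Simulated data (all values are assumptions, not taken from the study).
rng = np.random.default_rng(0)
n = 2000
g = rng.binomial(2, 0.3, size=n)       # SNP coded as 0/1/2 allele counts
shared = rng.normal(size=n)            # non-genetic factor shared by both traits
y2 = shared + 0.3 * rng.normal(size=n)             # covariate trait, no SNP effect
y1 = 0.2 * g + shared + 0.3 * rng.normal(size=n)   # trait of interest

t_uni = snp_t_statistic(y1, g)                     # single-trait analysis
t_cond = snp_t_statistic(y1, g, y2.reshape(-1, 1)) # conditional on y2

print(f"univariate t = {t_uni:.2f}, conditional t = {t_cond:.2f}")
```

Under other pleiotropic scenarios, for example when the SNP also affects the covariate trait in the same direction, the same conditional model can reduce rather than increase the test statistic, which is the trade-off the abstract refers to.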