265739 Test for Interactions between a SNP-set and Environment in Generalized Linear Models

Tuesday, October 30, 2012 : 12:30 PM - 12:50 PM

Xinyi (Cindy) Lin, AM, Department of Biostatistics, Harvard School of Public Health, Harvard University, Boston, MA
Seunggeun Lee, PhD, Department of Biostatistics, Harvard School of Public Health, Harvard University, Boston, MA
David Christiani, MD, MPH, Department of Environmental Health and Department of Epidemiology, Harvard School of Public Health, Harvard University, Boston, MA
Xihong Lin, PhD, Department of Biostatistics, Harvard School of Public Health, Harvard University, Boston, MA
Identification of gene-environment interactions has significant consequences for public health intervention and for understanding complex disease etiology. Knowledge of how genes and environment interact can provide guidelines on how a genetically predisposed person can reduce environmental exposure, thereby reducing disease risk. Standard single-marker tests are problematic for large genome-wide association studies because of the large number of tests conducted: multiple testing corrections that do not sufficiently account for the correlation among the tests often result in low power. Existing multi-marker tests also suffer from low power and may give unstable results because of the large models fitted. In this paper, we propose a computationally efficient and powerful gene-environment set association test, called GESAT, which allows simultaneous testing of gene-environment interactions under a generalized linear model framework. We first group single nucleotide polymorphisms (SNPs) based on biologically meaningful criteria, and then test the grouped SNPs jointly for gene-environment interactions. GESAT accounts for correlation between the terms tested, leading to reduced degrees of freedom and increased power. The development of GESAT is motivated by a problem in lung cancer, in which we want to investigate whether the effect of variant(s) in the 15q24-25.1 region on lung cancer risk is moderated by smoking. Using simulated SNP data in the 15q24-25.1 region based on the HapMap CEU population, we show that GESAT performs well. Lastly, we apply GESAT to real lung cancer data to investigate the gene-environment interactions between SNPs in the 15q24-25.1 region and smoking.
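
The idea of grouping SNPs and testing their interactions with an environmental exposure jointly can be illustrated with a rough sketch. This is not the authors' GESAT implementation, which the abstract does not specify in detail. The ridge-penalized null fit, the squared-norm score statistic, and the parametric-bootstrap p-value below are assumptions chosen for illustration, all function names are hypothetical, and the data are simulated rather than taken from the lung cancer study.

```python
import numpy as np

def fit_logistic_ridge(X, y, lam=1.0, n_iter=50):
    """Newton-Raphson fit of a logistic model with a ridge penalty.
    Column 0 of X is assumed to be the intercept and is left unpenalized."""
    p = X.shape[1]
    beta = np.zeros(p)
    penalty = lam * np.eye(p)
    penalty[0, 0] = 0.0
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        w = mu * (1.0 - mu)
        grad = X.T @ (y - mu) - penalty @ beta
        hess = X.T @ (X * w[:, None]) + penalty
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

def interaction_score_stat(G, E, y, lam=1.0):
    """Fit the null model (intercept, E, SNP main effects; no interactions)
    and return Q = ||S^T (y - mu_hat)||^2, where S holds the G x E products."""
    n = len(y)
    X_null = np.column_stack([np.ones(n), E, G])
    S = G * E[:, None]                      # one interaction column per SNP
    beta = fit_logistic_ridge(X_null, y, lam)
    mu = 1.0 / (1.0 + np.exp(-X_null @ beta))
    q = float(np.sum((S.T @ (y - mu)) ** 2))
    return q, mu

def bootstrap_pvalue(G, E, y, n_boot=200, lam=1.0, seed=0):
    """Approximate the null distribution of Q by simulating outcomes
    from the fitted null model (parametric bootstrap)."""
    rng = np.random.default_rng(seed)
    q_obs, mu = interaction_score_stat(G, E, y, lam)
    q_null = np.empty(n_boot)
    for b in range(n_boot):
        y_b = rng.binomial(1, mu).astype(float)
        q_null[b], _ = interaction_score_stat(G, E, y_b, lam)
    p_value = (1 + np.sum(q_null >= q_obs)) / (n_boot + 1)
    return q_obs, float(p_value)

# Simulated example: 5 SNPs in one set, a binary exposure, no true interaction.
rng = np.random.default_rng(1)
n, n_snps = 300, 5
G = rng.binomial(2, 0.3, size=(n, n_snps)).astype(float)   # minor-allele counts
E = rng.binomial(1, 0.5, size=n).astype(float)             # e.g. exposed or not
eta = -0.5 + 0.3 * E + 0.2 * G[:, 0]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta))).astype(float)

q, pval = bootstrap_pvalue(G, E, y, n_boot=100)
print(f"Q = {q:.2f}, bootstrap p-value = {pval:.3f}")
```

The sketch yields a single test per SNP set rather than one test per SNP, which is the property the abstract emphasizes. The bootstrap is used here only to keep the example short and self-contained.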

Learning Areas:
Biostatistics, economics
Epidemiology
Public health biology

Learning Objectives:
Describe a method to analyze data to identify gene-environment interactions in large-scale genetic studies.

Keywords: Genetics, Biostatistics

Presenting author's disclosure statement:

Qualified on the content I am responsible for because: I am a PhD student in Biostatistics and have been actively involved in developing statistical methods.
Any relevant financial relationships? No

I agree to comply with the American Public Health Association Conflict of Interest and Commercial Support Guidelines, and to disclose to the participants any off-label or experimental uses of a commercial product or service discussed in my presentation.

Back to: 4248.0: Student Award Presentation