Online Program

Data Fusion: Assessing the Feasibility of Combining Alcohol Data Collected via Probability and Non-probability Samples

Monday, November 2, 2015 : 12:50 p.m. - 1:10 p.m.

Randy ZuWallack, MS, ICF International, Burlington, VT
Thomas K. Greenfield, PhD, Alcohol Research Group, Public Health Institute, Emeryville, CA
James Dayton, MBA, ICF International, Burlington, VT
Katherine J. Karriker-Jaffe, PhD, Alcohol Research Group, Public Health Institute, Emeryville, CA
Naomi Freedner-Maguire, MPH, ICF International, Burlington, VT
We assess the feasibility of using data fusion to combine telephone interview data collected from a dual-frame random digit dialing (RDD) sample with data collected from an opt-in web panel. While RDD telephone samples are based on probability sampling, they are challenged by decreasing response rates and increasing costs. Data collected via an opt-in web panel are less expensive than a telephone sample but raise concerns about population representativeness.

We hypothesize that a hybrid telephone/web panel approach will offer probability-based prevalence estimates for drinking behaviors (e.g., drank wine in past month) from the telephone sample, while collecting detailed data (e.g., alcohol expenditures, volume and brands consumed) via the web panel.

Our research has three specific aims:

  • Identify linking variables to fuse the data and assess conditional independence
  • Compare the results of estimates from the telephone and web samples
  • Assess data fusion as a method for a) imputing responses for mid-survey terminates, and b) producing accurate population-based estimates

In the first aim, we use the telephone survey to evaluate the linking variables that provide conditional independence between data from the two sources, a critical assumption for data fusion. In the second aim, we compare responses from web and telephone data, controlling for the linking variables. These comparisons will increase our understanding of whether web panel responses will produce estimates comparable to telephone estimates, given that there could be mode effects and/or differences due to population representativeness. In the third aim, we assess the degree to which web data can be integrated with telephone data.
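
As an illustration only, the sketch below shows one common form of data fusion: random hot-deck matching within cells defined by the linking variables. It is not the study's actual procedure. The function name `fuse`, the variable names (`age_group`, `sex`, `drank_wine`, `weekly_spend`), and all records are hypothetical, and the sketch assumes the conditional independence described above holds given the linking variables.

```python
import random
from collections import defaultdict


def fuse(recipients, donors, link_vars, target_vars, seed=0):
    """Copy target_vars from a randomly chosen donor in the same linking cell.

    Hypothetical helper: recipients stand in for telephone records,
    donors for web panel records.
    """
    rng = random.Random(seed)

    # Group donor records by their values on the linking variables.
    cells = defaultdict(list)
    for donor in donors:
        cells[tuple(donor[v] for v in link_vars)].append(donor)

    fused = []
    for rec in recipients:
        row = dict(rec)  # keep the recipient's own answers unchanged
        pool = cells.get(tuple(rec[v] for v in link_vars))
        chosen = rng.choice(pool) if pool else None
        for t in target_vars:
            # No donor in the cell: leave the detailed variable missing.
            row[t] = chosen[t] if chosen is not None else None
        fused.append(row)
    return fused


# Hypothetical telephone (probability sample) records: no expenditure detail.
telephone = [
    {"age_group": "18-34", "sex": "F", "drank_wine": True},
    {"age_group": "35-54", "sex": "M", "drank_wine": False},
    {"age_group": "55+", "sex": "F", "drank_wine": True},
]

# Hypothetical web panel records: same linking variables plus detail.
web = [
    {"age_group": "18-34", "sex": "F", "drank_wine": True, "weekly_spend": 12.0},
    {"age_group": "18-34", "sex": "F", "drank_wine": True, "weekly_spend": 20.0},
    {"age_group": "35-54", "sex": "M", "drank_wine": False, "weekly_spend": 0.0},
]

fused = fuse(
    telephone,
    web,
    link_vars=["age_group", "sex", "drank_wine"],
    target_vars=["weekly_spend"],
)
```

In this sketch, a telephone record with no web donor in its cell keeps a missing value for the detailed variable; in practice, cells would need to be coarsened or another matching rule applied, and the choice of linking variables determines how plausible the fused values are.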

Learning Areas:

Public health or related research

Learning Objectives:
  • Identify the advantages and disadvantages associated with collecting data via RDD telephone and opt-in web panel
  • Describe how data fusion can be used to produce accurate population-based estimates
  • Evaluate the results of estimates from the telephone and web samples

Keyword(s): Alcohol Use, Data Collection and Surveillance

Presenting author's disclosure statement:

Qualified on the content I am responsible for because: I have worked in the public health survey research field for 15 years, providing support to Federal, state, and local government agencies on survey, evaluation, and research projects. I lead the development of study designs and oversee data collection for many state, local and national health surveys, including the National Alcohol Survey, on which this feasibility study is based.
Any relevant financial relationships? No

I agree to comply with the American Public Health Association Conflict of Interest and Commercial Support Guidelines, and to disclose to the participants any off-label or experimental uses of a commercial product or service discussed in my presentation.