Online Program

Exploring E-Cigarette Twitter Data: Understanding Trends and Implementing Surveillance

Sunday, November 1, 2015

Jillian Pugatch, MPH, ICF International, Rockville, MD
Heather Cole-Lewis, PhD, ICF International, Rockville, MD
Arun Varghese, MPA, MEng, Health, Research, Informatics and Technology, ICF International, Durham, NC
Amy Sanders, MA, ICF International, Rockville, MD
Mary Schwarz, Digital Strategy; Marketing, Interactive & Technology Division, ICF International, Rockville, MD
Iva Stoyneva, ICF International, Rockville, MD
Erik Augustson, PhD, MPH, Tobacco Control Research Branch, Behavioral Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Rockville, MD
Background: E-cigarettes are a growing topic among Twitter users. The ability to analyze conversations about e-cigarettes in real time can provide important insight into trends and subsequently guide public health interventions. Automating this analysis process with machine learning and replicating it over time can serve as an important surveillance tool for e-cigarette discourse and other health topics on Twitter.

Methods: Manual content analysis of e-cigarette-related tweets from May 2013 to May 2014 guided the development of a supervised machine learning model to automate the analysis process. This automated process is being repeated for e-cigarette tweets from June 2014 to May 2015 to analyze trends over time.
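
As an illustration of the general approach described in the Methods (hand-coded tweets used to train a supervised classifier that then labels new tweets automatically), the following is a minimal sketch only. The abstract does not name the algorithm, features, or coding categories used in the study; the multinomial naive Bayes model, the `marketing` and `personal` labels, and the example tweets below are all assumptions invented for illustration and are not study data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z0-9#@'-]+", text.lower())


class NaiveBayesTweetClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    Hypothetical stand-in for the study's model, which is not specified.
    """

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)      # tweets per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen during training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented, hand-labeled examples standing in for manually coded tweets.
texts = [
    "Save 20% on vape starter kits today, free shipping",
    "New e-cig flavors in stock, order now",
    "Buy one e-cigarette get one free this weekend",
    "I switched to my e-cig and I feel so much better",
    "my roommate keeps vaping in the kitchen and I hate it",
    "thinking about trying an e-cigarette to quit smoking",
]
labels = ["marketing"] * 3 + ["personal"] * 3

model = NaiveBayesTweetClassifier().fit(texts, labels)
new_label = model.predict("free shipping on all e-cig orders today")
```

Once trained on the manually coded year, a model of this kind could be applied unchanged to a later year's tweets, which is the replication step the Methods describe.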

Results: Predictive performance scores for machine learning classification models indicated that the models correctly labeled the tweets from 2013-2014. Replication of the automated process for tweets from 2014-2015 has been yielding promising results that document the continuing and changing trends in Twitter-based e-cigarette discourse over the past two years.
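
The abstract does not state which predictive performance scores were reported. As a sketch under that caveat, the snippet below computes precision, recall, and F1 for one category by comparing automated labels against manual codes; these metrics are a common choice for classifier evaluation, and the function name, labels, and example values are hypothetical.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Compute precision, recall, and F1 for one category of interest."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Invented labels: manual codes versus a model's automated output.
manual = ["marketing", "marketing", "personal", "personal", "marketing"]
automated = ["marketing", "personal", "personal", "marketing", "marketing"]

precision, recall, f1 = precision_recall_f1(manual, automated, "marketing")
```

Scores like these are typically computed on held-out tweets that were manually coded but not used for training, so that they reflect how the model would perform on unseen data.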

Conclusions: Automating the analysis of social media data can uncover long-term trends in personal sentiment, knowledge, attitudes, and behavior. Turning content analysis into computational methodologies can enhance and extend research and practice. This work was successful in automating a complex content analysis of e-cigarette-related content on Twitter using machine learning techniques, and then replicating the process for a second year. The study details how e-cigarette conversations on Twitter change over time, and provides a model for continuous surveillance of Twitter data for other health topics and applications.

Learning Areas:

Communication and informatics
Public health or related research

Learning Objectives:
Evaluate the use of automated content analysis of e-cigarette-related Twitter data over time
Discuss trends in the public’s knowledge, attitudes, and beliefs surrounding e-cigarettes, and explore implications for implementing prevention and cessation efforts
Discuss the application of machine learning and automated surveillance to other health topics

Keyword(s): Social Media, Tobacco Control

Presenting author's disclosure statement:

Qualified on the content I am responsible for because: I have been working with the research team for the past two years (first as a Health Communication Fellow with the NCI Tobacco Control Research Branch, and now as an ICF International employee). I have been involved with this project since the beginning of this year.
Any relevant financial relationships? No

I agree to comply with the American Public Health Association Conflict of Interest and Commercial Support Guidelines, and to disclose to the participants any off-label or experimental uses of a commercial product or service discussed in my presentation.