Towards Real-Time Measurement of Public Epidemic Awareness

Similar documents
CHALLENGES IN INFLUENZA FORECASTING AND OPPORTUNITIES FOR SOCIAL MEDIA MICHAEL J. PAUL, MARK DREDZE, DAVID BRONIATOWSKI

Asthma Surveillance Using Social Media Data

Regional Level Influenza Study with Geo-Tagged Twitter Data

Worldwide Influenza Surveillance through Twitter

USING SOCIAL MEDIA TO STUDY PUBLIC HEALTH MICHAEL PAUL JOHNS HOPKINS UNIVERSITY MAY 29, 2014

Forecasting Influenza Levels using Real-Time Social Media Streams

Tom Hilbert. Influenza Detection from Twitter

A Framework for Public Health Surveillance

Effects of Sampling on Twitter Trend Detection

Analysing Twitter and web queries for flu trend prediction

Cloud- based Electronic Health Records for Real- time, Region- specific

Large Scale Analysis of Health Communications on the Social Web. Michael J. Paul Johns Hopkins University

Examining Patterns of Influenza Vaccination in Social Media

Multivariate Linear Regression of Symptoms-related Tweets for Infectious Gastroenteritis Scale Estimation

Searching for flu. Detecting influenza epidemics using search engine query data. Monday, February 18, 13

arxiv: v1 [q-bio.pe] 9 Apr School of Mathematical and Statistical Sciences, Arizona State University, Tempe, AZ

Real Time Early-stage Influenza Detection with Emotion Factors from Sina Microblog

Available online at ScienceDirect. Procedia Computer Science 70 (2015 ) Vinay Kumar Jain a, Shishir Kumar b

Lecture 20: CS 5306 / INFO 5306: Crowdsourcing and Human Computation

#Swineflu: Twitter Predicts Swine Flu Outbreak in 2009

Brendan O Connor,

Ongoing Influenza Activity Inference with Real-time Digital Surveillance Data

Assessing Google Correlate Queries for Influenza H1N1 Surveillance in Asian Developing Countries

Public Health Resources: Core Capacities to Address the Threat of Communicable Diseases

Triaging Mental Health Forum Posts

International Co-circulation of 2009 H1N1 and Seasonal Influenza (As of September 4, 2009; posted September 11, 2009, 6:00 PM ET)

2009 H1N1 (Pandemic) virus IPMA September 30, 2009 Anthony A Marfin

City, University of London Institutional Repository

Influenza forecast optimization when using different surveillance data types and geographic scale

Austin Public Health Epidemiology and Disease Surveillance Unit. Travis County Influenza Surveillance

Abstract. But what happens when we apply GIS techniques to health data?

Data Driven Methods for Disease Forecasting

H1N1 Flu Virus (Human Swine Influenza) Information Resources (last updated May 6th, 2009, 5 pm)

Advances in Viral Immunity Stemming from the 1918 Flu Pandemic

Outline General overview of the project Last 6 months Next 6 months. Tracking trends on the web using novel Machine Learning methods.

arxiv: v1 [cs.si] 16 Apr 2015

Carmen: A Twitter Geolocation System with Applications to Public Health

Predicting Popularity of Microblogs in Emerging Disease Event

Ohio Department of Health Seasonal Influenza Activity Summary MMWR Week 3 January 13-19, 2013

EpiDash A GI Case Study

Seasonal Influenza Report

Ohio Department of Health Seasonal Influenza Activity Summary MMWR Week 7 February 10-16, 2013

How you can help support the Beat Flu campaign

Ohio Department of Health Seasonal Influenza Activity Summary MMWR Week 14 April 1-7, 2012

How you can help support the Beat Flu campaign

Tracking the flu pandemic by monitoring the Social Web

Detecting influenza outbreaks by analyzing Twitter messages

Using Internet data to learn in the health domain

Tracking Disease Outbreaks using Geotargeted Social Media and Big Data

Identifying and Categorizing Disaster-Related Tweets

RESPONDING TO AN INFLUENZA OUTBREAK

Influenza Surveillance Summary: Denton County Hospitals and Providers 17.51% 15.95% 10.94% 7.41% 9.01% 8.08%

Pandemic Influenza. Bradford H. Lee, MD Nevada State Health Officer. Public Health: Working for a Safer and Healthier Nevada

Pandemic Influenza Planning:

Georgia Weekly Influenza Report

How is theflu vaccine determined each year

Georgia Weekly Influenza Report

Influenza RN.ORG, S.A., RN.ORG, LLC

Seasonal Influenza Report

Advances in epidemic forecasting & implications for vaccination

Temporal Topic Modeling to Assess Associations between News Trends and Infectious Disease Outbreaks

Disease Propagation in Social Networks: A Novel Study of Infection Genesis and Spread on Twitter

Monitoring of Epidemic Using Spatial Technology

Final Results Report. National Foundation for Infectious Diseases (NFID) 2015 Influenza/Pneumococcal News Conference

Identifying and Categorizing Disaster-Related Tweets

Automated Social Network Epidemic Data Collector

Buy The Complete Version of This Book at Booklocker.com:

H1N1 Flu Virus (Human Swine Influenza) Information Resources (last updated May 7th, 2009, 12pm)

Georgia Weekly Influenza Report

U.S. Human Cases of Swine Flu Infection (As of April 29, 2009, 11:00 AM ET)

Enhancing disease surveillance with novel data streams: challenges and opportunities

Predicting Prevalence of Influenza-Like Illness From Geo-Tagged Tweets

Influenza Surveillance Report

The Impact of Pandemic Influenza on Public Health

Promoting Public Health Security Addressing Avian Influenza A Viruses and Other Emerging Diseases

Seasonal Influenza Report

Coming Flu By J.L. Greger READ ONLINE

H1N1 Response and Vaccination Campaign

Overview of the NTCIR-13: MedWeb Task

Running head: INFLUENZA VIRUS SEASON PREPAREDNESS AND RESPONSE 1

FLU TREND PREDICTION IN THE WORLD OF BIG DATA. A new way to prevent influenza outbreaks

On Infectious Intestinal Disease Surveillance using Social Media Content

Infodemiology, an emerging area of research at the

Communication Challenges and Opportunities in the Influenza Season: Vaccine Effectiveness and Options

Ohio Department of Health Seasonal Influenza Activity Summary MMWR Week 14 April 3 9, 2011

Enhancing Disease Surveillance with Novel Data Streams: Challenges and Opportunities

Epidemiology Treatment and control Sniffles and Sneezes Mortality Spanish flu Asian flu Hong Kong flu The Swine flu scare

Assessing the Impact of a Health Intervention via User-generated Internet Content

County of San Diego April 30, 2014 Week #17, ending 4/26/14 INFLUENZA WATCH

Highlights. NEW YORK CITY DEPARTMENT OF HEALTH AND MENTAL HYGIENE Influenza Surveillance Report Week ending January 28, 2017 (Week 4)

Applying GIS and Machine Learning Methods to Twitter Data for Multiscale Surveillance of Influenza

A Smartphone-Driven Thermometer Application for Real-time Population- and Individual-Level Influenza Surveillance

Weekly Influenza & Respiratory Illness Activity Report

Minnesota Influenza Geographic Spread

Influenza-Associated Hospitalization and Death Surveillance: Dallas County

Surveillance Overview

H1N1 Update. What is a flu pandemic? 12/7/2009. PCAST - President s Council of Advisors On Science And Technology

Transcription:

Towards Real-Time Measurement of Public Epidemic Awareness Monitoring Influenza Awareness through Twitter Michael C. Smith David A. Broniatowski The George Washington University Michael J. Paul University of Colorado, Boulder Mark Dredze The Johns Hopkins University

Goal of Research Characterize influenza awareness signal (public concern) to determine which factors most influence disease awareness in a population Influenza incidence rate Infection surveillance using Twitter News media regarding influenza 1

Agenda Introduction / Literature Review: Why disease awareness? How disease awareness? Data Operationalize! Analysis What drives disease awareness? Discussion and Conclusions 2

Why Disease Awareness? Studies have shown that the public s awareness and their reaction to it may affect disease spread Funk et al. (2009);; Jones and Salathe (2009);; Granell, Gomez, and Arenas (2013) Public health officials often manage and monitor disease awareness during epidemics or threats Specific context: influenza Problem: no effective method exists for efficient, current awareness surveillance Such systems exist for flu incidence We don t know what drives awareness! 3

Agenda Introduction / Literature Review: Why disease awareness? How disease awareness? Data Operationalize! Analysis What drives disease awareness? Discussion and Conclusions 4

How Disease Awareness? Web and social media data! Numerous studies on disease surveillance showing benefits: cheap, real-time Search queries: Google Flu Trends (GFT): Ginsberg et al. (2009) Yuan et al. (2013);; Santillana et al. (2014);; Preis and Moat (2014) Social media Culotta (2010);; Aramaki, Maskawa, and Morita (2011);; Lampos and Cristianini (2012) Combining multiple sources - state-of-the-art Santillana et al. (2015) Gov t surveillance systems (tracking hospital visits) may not capture awareness (nor might other systems built to emulate) Social media might: people discuss and share concern 5

How Disease Awareness? Previous work separated infection from awareness Lamb, Paul, and Dredze (2013);; Broniatowski, Paul, Dredze (2013) I have the flu I m sick with the flu vs. tired of hearing about the flu vs. it s flu season don t get sick My kids gave me the flu vs. wash your hands, don t get the flu 6

Twitter flu prediction Our current system uses a cascade of 3 MaxEnt classifiers: about health vs not about health about flu vs not about flu Training data: flu infection vs flu awareness 11,900 labeled tweets collected through MTurk Estimated weekly flu rate: # tweets about flu infection that week # of all tweets that week Slide credit: Paul (2015) - Making Sense of the Web for Public Health using NLP 7

Twitter flu prediction Features: Stylometry Retweets, user mentions, URLs, emoticons 8 manually created word classes Infection Disease Concern Treatment/Preve ntion getting, got, recovered, have, having, had, has, catching, catch, cured, infected bird, flu, sick, epidemic afraid, worried, scared, fear, worry, nervous, dread, dreaded, terrified vaccine, vaccines, shot, shots, mist, tamiflu, jab, nasal spray Slide credit: Paul (2015) - Making Sense of the Web for Public Health using NLP 8

Twitter flu prediction Features: Part of speech templates (subject,verb,object) tuples always a good feature, IMO numeric references 100 more cases of swine flu whether flu is a noun or adjective tired of the flu vs tired of the flu hype whether flu is the subject or object I have the flu vs the flu is going around and others Slide credit: Paul (2015) - Making Sense of the Web for Public Health using NLP 9

How Disease Awareness? Previously focused on social media infection signal Now focus on social media awareness signal 2012-2013 flu season Our work builds upon this by analyzing awareness News media potential driver, can lead to overestimates in infection 12-13 GFT Studying flu awareness: Yields trends for public health officials Determines drivers for awareness in a population 10

Agenda Introduction / Literature Review: Why disease awareness? How disease awareness? Data Operationalize! Analysis What drives disease awareness? Discussion and Conclusions 11

Data 2012-2013 flu season September 30, 2012 to May 25, 2013 National and regional levels Map of Health and Human Services regions Source: 12

Data Twitter Awareness Healthtweets.org (Dredze et al. 2014) Lamb et al. (2013);; Broniatowski et al. (2013) Normalized by public tweet counts US tweets, and state level tweets Also downloaded Twitter infection signal Example trends from HealthTweets.org: weekly United States influenza counts 13

Data Government Flu Incidence Center for Disease Control and Prevention (CDC) US Outpatient Influenza-like Illness Surveillance Network (ILINet) yields Influenza-like Illness rates (ILI) Publicly available on the CDC s flu dashboard: Screenshot of the CDC s flu dashboard. Source: http://gis.cdc.gov/grasp/fluview/fluportaldashboard.html 14

Data News Media Example search of NewsLibrary.com. Source: http://nl.newsbank.com/nlsearch/we/archives/?p_action=customized&s_search_type=customized&p_product=newslibrary&p_theme=newsli brary2&d_sources=location&d_place=united%20states&p_nbid=& 15

Data Summary Twitter awareness Infection signals: Twitter infection CDC ILI News Media 16

Agenda Introduction / Literature Review: Why disease awareness? How disease awareness? Data Operationalize! Analysis What drives disease awareness? Discussion and Conclusions 17

Analysis Overview Compare ILI to Twitter awareness, Twitter infection via Pearson correlations How related are they? Plot weekly regional awareness, Twitter infection How similar are the trends? Plot weekly awareness, ILI, infection, media What can we glean? Regressions: drivers of awareness Infection signals, news media Regional, national 18

Analysis Compare to Gold Standard Compare ILI to both awareness, Twitter infection Awareness significantly lower than Twitter infection (p=.029) nationally 19

Analysis Awareness Signal Awareness more similar than Twitter infection Differences in peak activity, off-peak activity 20

Analysis Awareness Signal 21

Analysis Effect of News Media Awareness Correlation with: Mean Regional Correlation Media.905 (SD.044).940 Twitter Infection.907 (SD.026).900 National Correlation: Can we better define the relationship? 22

Analysis Effect of News Media Bivariate linear regression model at the regional level: 23

Analysis National / Regional Media What about national media? Similar regression model: Weekly regional awareness as regional media and national media Generated 10 regional models with coefficients for regional media and national media Mean Regional Media: Mean National Media: -.088 (SD.582) 1.026 (SD.561) Mean regression coefficients across all ten HHS regions National media explains much more than regional media! 24

Agenda Introduction / Literature Review: Why disease awareness? How disease awareness? Data Operationalize! Analysis What drives disease awareness? Discussion and Conclusions 25

Discussion Distinction between awareness and infection important for flu surveillance Awareness more a function of media than infection National media more than regional media Little variation in regional awareness Awareness does not rise until the flu becomes severe Drops sharply after peak, though infection still high 26

Conclusions Awareness more function of media than infection Opportunity: target only certain national distribution channels National media levels contribute more than regional media Additional study: Relationship more complex than regression model News media Future work needed: generalize to other flu seasons Thoughts? Feedback? mikesmith@gwu.edu 27

References Aramaki, Eiji, Sachiko Maskawa, and Mizuki Morita. 2011. Twitter Catches the Flu: Detecting Influenza Epidemics Using Twitter. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 1568 76. EMNLP 11. Stroudsburg, PA, USA: Association for Computational Linguistics. http://dl.acm.org/citation.cfm?id=2145432.2145600. Broniatowski, David A., Michael J. Paul, and Mark Dredze. 2013. National and Local Influenza Surveillance through Twitter: An Analysis of the 2012-2013 Influenza Epidemic. PLOS ONE 8 (12): e83672. doi:10.1371/journal.pone.0083672. Culotta, Aron. 2010. Towards Detecting Influenza Epidemics by Analyzing Twitter Messages. In Proceedings of the First Workshop on Social Media Analytics, 115 22. SOMA 10. New York, NY, USA: ACM. doi:10.1145/1964858.1964874. Dredze, Mark, Renyuan Cheng, Michael J. Paul, and David Broniatowski. 2014. HealthTweets.org: A Platform for Public Health Surveillance Using Twitter. In. Quebec City. http://www.cs.jhu.edu/~mdredze/publications/2014_w3phi_healthtweets.pdf. Funk, Sebastian, Erez Gilad, Chris Watkins, and Vincent A. A. Jansen. 2009. The Spread of Awareness and Its Impact on Epidemic Outbreaks. Proceedings of the National Academy of Sciences 106 (16): 6872 77. doi:10.1073/pnas.0810762106. Granell, Clara, Sergio Gomez, and Alex Arenas. 2013. Dynamical Interplay between Awareness and Epidemic Spreading in Multiplex Networks. Physical Review Letters 111 (12). doi:10.1103/physrevlett.111.128701. Jones, James Holland, and Marcel Salathé. 2009. Early Assessment of Anxiety and Behavioral Response to Novel Swine-Origin Influenza A(H1N1). PLOS ONE 4 (12): e8032. doi:10.1371/journal.pone.0008032. Lamb, Alex, Michael J. Paul, and Mark Dredze. 2013. Separating Fact from Fear: Tracking Flu Infections on Twitter. In HLT-NAACL, 789 95. http://www.aclweb.org/anthology/n/n13/n13-1097.pdf. Lampos, Vasileios, and Nello Cristianini. 2012. Nowcasting Events from the Social Web with Statistical Learning. ACM Trans. Intell. Syst. Technol. 3 (4): 72:1 72:22. doi:10.1145/2337542.2337557. Paul, Michael J. 2015. Making Sense of the Web for Public Health Using NLP. National Cancer Institute, January 21. http://cmci.colorado.edu/~mpaul/files/slides_nih_1-21-15.pdf. Preis, Tobias, and Helen Susannah Moat. 2014. Adaptive Nowcasting of Influenza Outbreaks Using Google Searches. Royal Society Open Science 1 (2): 140095. doi:10.1098/rsos.140095. Santillana, Mauricio, André T. Nguyen, Mark Dredze, Michael J. Paul, Elaine O. Nsoesie, and John S. Brownstein. 2015. Combining Search, Social Media, and Traditional Data Sources to Improve Influenza Surveillance. PLOS Comput Biol 11 (10): e1004513. doi:10.1371/journal.pcbi.1004513. Santillana, Mauricio, D. Wendong Zhang, Benjamin M. Althouse, and John W. Ayers. 2014. What Can Digital Disease Detection Learn from (an External Revision To) Google Flu Trends? American Journal of Preventive Medicine 47 (3): 341 47. doi:10.1016/j.amepre.2014.05.020. Yuan, Qingyu, Elaine O. Nsoesie, Benfu Lv, Geng Peng, Rumi Chunara, and John S. Brownstein. 2013. Monitoring Influenza Epidemics in China with Search Query from Baidu. PLOS ONE 8 (5): e64323. doi:10.1371/journal.pone.0064323. 28