Towards Real Time Epidemic Vigilance through Online Social Networks

Similar documents
Vision: Towards Real Time Epidemic Vigilance through Online Social Networks

Case Studies in Ecology and Evolution. 10 The population biology of infectious disease

A Spatio-Temporal Approach to the Discovery of Online Social Trends

Mathematical Modelling of Effectiveness of H1N1

Automated Social Network Epidemic Data Collector

#Swineflu: Twitter Predicts Swine Flu Outbreak in 2009

Tom Hilbert. Influenza Detection from Twitter

Global Health Security: Preparedness and Response: can we do better and stay safe?

Mathematical Modeling of Trending Topics on Twitter

SARS Outbreaks in Ontario, Hong Kong and Singapore

Online Social Networks Flu Trend Tracker - A Novel Sensory Approach to Predict Flu Trends

Stochastic Modelling of the Spatial Spread of Influenza in Germany

Management of Pandemic Influenza Outbreaks. Bryan K Breland Director, Emergency Management University of Alabama at Birmingham

City, University of London Institutional Repository

4.3.9 Pandemic Disease

Pandemic Preparedness

Searching for flu. Detecting influenza epidemics using search engine query data. Monday, February 18, 13

Modern Epidemiology A New Computational Science

CHALLENGES IN INFLUENZA FORECASTING AND OPPORTUNITIES FOR SOCIAL MEDIA MICHAEL J. PAUL, MARK DREDZE, DAVID BRONIATOWSKI

University of Colorado Denver. Pandemic Preparedness and Response Plan. April 30, 2009

What do epidemiologists expect with containment, mitigation, business-as-usual strategies for swine-origin human influenza A?

Tracking Disease Outbreaks using Geotargeted Social Media and Big Data

Introduction to Reproduction number estimation and disease modeling

Network Science: Principles and Applications

Running head: INFLUENZA VIRUS SEASON PREPAREDNESS AND RESPONSE 1

TJHSST Computer Systems Lab Senior Research Project

FORECASTING THE DEMAND OF INFLUENZA VACCINES AND SOLVING TRANSPORTATION PROBLEM USING LINEAR PROGRAMMING

Influenza; tracking an emerging pathogen by popularity of Google Searches

Agent-Based Models. Maksudul Alam, Wei Wang

USING SOCIAL MEDIA TO STUDY PUBLIC HEALTH MICHAEL PAUL JOHNS HOPKINS UNIVERSITY MAY 29, 2014

Pandemic H1N1 2009: The Public Health Perspective. Massachusetts Department of Public Health November, 2009

Global updates on avian influenza and tools to assess pandemic potential. Julia Fitnzer Global Influenza Programme WHO Geneva

Pandemic Influenza: Impact and Challenges. Massachusetts Department of Public Health February, 2007

Influenza: The Threat of a Pandemic

Interim WHO guidance for the surveillance of human infection with swine influenza A(H1N1) virus

Conflict of Interest and Disclosures. Research funding from GSK, Biofire

Statistical modeling for prospective surveillance: paradigm, approach, and methods

Pandemic Influenza Planning:

Ongoing Influenza Activity Inference with Real-time Digital Surveillance Data

Devon Community Resilience. Influenza Pandemics. Richard Clarke Emergency Preparedness Manager Public Health England South West Centre

A step towards real-time analysis of major disaster events based on tweets

Peterborough County-City Health Unit Pandemic Influenza Plan Section 1: Background

Religious Festivals and Influenza. Hong Kong (SAR) China. *Correspondence to:

arxiv: v1 [cs.si] 29 Jan 2018

2009 H1N1 (Pandemic) virus IPMA September 30, 2009 Anthony A Marfin

Exercises on SIR Epidemic Modelling

Promoting Public Health Security Addressing Avian Influenza A Viruses and Other Emerging Diseases

= Λ μs βs I N, (1) (μ + d + r)i, (2)

H1N1 Influenza. Faculty/Staff Meeting Presentation Minnesota State College Southeast Technical September 11, 2009

Mobile Health Surveillance: The Development of Software Tools for. Monitoring the Spread of Disease

Core 3: Epidemiology and Risk Analysis

2017 DISEASE DETECTIVES (B,C) KAREN LANCOUR National Bio Rules Committee Chairman

Ralph KY Lee Honorary Secretary HKIOEH

Type and quantity of data needed for an early estimate of transmissibility when an infectious disease emerges

Abstract. But what happens when we apply GIS techniques to health data?

Pandemic Influenza. Continuity of Operations (COOP) Training for Behavioral Health Service Providers

Public Health Wales CDSC Weekly Influenza Surveillance Report Wednesday 21 st January 2015 (covering week )

Infectious disease modeling

Avian influenza Avian influenza ("bird flu") and the significance of its transmission to humans

2016 DISEASE DETECTIVES (B,C) KAREN LANCOUR National Bio Rules Committee Chairman

Time series analyses and transmission models for influenza

Telehealth Data for Syndromic Surveillance

Towards an open-source, unified platform for disease outbreak analysis using

Inference Methods for First Few Hundred Studies

H1N1 Flu Virus (Human Swine Influenza) Information Resources (last updated May 6th, 2009, 5 pm)

MODELLING THE SPREAD OF PNEUMONIA IN THE PHILIPPINES USING SUSCEPTIBLE-INFECTED-RECOVERED (SIR) MODEL WITH DEMOGRAPHIC CHANGES

Matt Smith. Avian Flu Model

Influenza. Paul K. S. Chan Department of Microbiology The Chinese University of Hong Kong

SURVEILLANCE & EPIDEMIOLOGIC INVESTIGATION NC Department of Health and Human Services, Division of Public Health

Jonathan D. Sugimoto, PhD Lecture Website:

Influenza : What is going on? How can Community Health Centers help their patients?

Influenza. Gwen Clutario, Terry Chhour, Karen Lee

Middle East respiratory syndrome coronavirus (MERS-CoV) and Avian Influenza A (H7N9) update

H1N1 Pandemic The medical background. Marita Mike MD, JD Center for Health and Homeland Security

Durham Region Influenza Bulletin: 2017/18 Influenza Season

Communication Challenges and Opportunities in the Influenza Season: Vaccine Effectiveness and Options

Pandemic Influenza Preparedness

MAE 298, Lecture 10 May 4, Percolation and Epidemiology on Networks

Swine flu deaths expected to rise

How Math (and Vaccines) Keep You Safe From the Flu

STARK COUNTY INFLUENZA SNAPSHOT, WEEK 15 Week ending 18 April, With updates through 04/26/2009.

Pandemic Influenza Preparedness and Response

Machine Learning for Population Health and Disease Surveillance

Article Epidemic Analysis and Mathematical Modelling of H1N1 (A) with Vaccination

Forecasting Influenza Levels using Real-Time Social Media Streams

Global infectious disease surveillance through automated multi lingual georeferencing of Internet media reports

TABLE OF CONTENTS. Peterborough County-City Health Unit Pandemic Influenza Plan Section 1: Introduction

Year 9 Science STUDY GUIDE: Unit 2 Human Coordination

Influenza A(H1N1)2009 pandemic Chronology of the events in Belgium

Modelling global epidemics: theory and simulations

Step 1: Learning Objectives

Outbreak Response/Epidemiology Influenza Weekly Report Arkansas

FUNNEL: Automatic Mining of Spatially Coevolving Epidemics

TJHSST Computer Systems Lab Senior Research Project

Epidemiology Treatment and control Sniffles and Sneezes Mortality Spanish flu Asian flu Hong Kong flu The Swine flu scare

Applicable Sectors: Public Health, Emergency Services.

Strategies for containing an emerging influenza pandemic in South East Asia 1

Influenza B viruses are not divided into subtypes, but can be further broken down into different strains.

Avian Influenza and Other Communicable Diseases: Implications for Port Biosecurity

Incidence of Seasonal Influenza

Transcription:

Towards Real Time Epidemic Vigilance through Online Social Networks SNEFT Social Network Enabled Flu Trends Lingji Chen [1] Harshavardhan Achrekar [2] Benyuan Liu [2] Ross Lazarus [3] MobiSys 2010, San Francisco, CA, USA [1] Scientific Systems Company Inc, Woburn, MA [2] Computer Science Department, University of Massachusetts Lowell [3] Department of Population Medicine - Harvard Medical School

Outline Background Related Work Our Approach SNEFT System Architecture Detection and Prediction Initial Stage Results Conclusion

Seasonal flu Influenza (flu) is contagious respiratory illness caused by influenza viruses. Seasonal - wave occurrence pattern. 5 to 20 % of population gets flu 200,000 people are hospitalized from flu related complications. 36,000 people die from flu every year in USA. worldwide death toll is 250,000 to 500,000. Epidemiologists use early detection of disease outbreak to reduce no. of people affected.

Historical Background Historical Data Flu Pandemic /1918 Spanish flu SARS Swine Flu/H1N1 Cause overreaction of body s immune system SARS coronavirus Swine Influenza Virus Origin USA & France before getting to Spain. Guangdong, China USA and Mexico Infected Masses/Areas predominant in healthy young adults as opposed to juvenile,elderly or weak. 37 countries including USA 207 countries Timeline Infected cases Deaths Mar 1918 - Jun 1920 {World war I} 500 million {1/3 of world s population} 50 million (3% of world s population) {1.6 billion at that time} Nov 2002 - Jul 2003 Aug 2009 onwards 8,273 622,482 so far 775 15,174 so far

Related Work :- Google Flu Trends Certain Web Search terms are good Indicators of flu activity. Google Trend uses Aggregated search data on flu indicators. Estimate current flu activity around the world in real time. Accuracy of data {not every person who searches for Flu is sick} Link:- www.google.com/flutrends CDC stands for Center for Disease Control

Our Approach OSN emerged as popular platform for people to make connections,share information and interact. OSN represent a previously untapped data source for detecting onset of an epidemic and predicting its spread. { i am down with flu, get well soon } msg exchange between users provide early,robust predictions. Twitter/Facebook mobile users tweet/ posts updates with their geo-location updates. helps in carrying out refined analysis. User demographics like age, gender, location, affiliated networks.,etc can be inferred from data. snapshot of current epidemic condition and preview on what to expect next on daily or hourly bases. User Population (in millions) FaceBook:- 400, Myspace:- 200, Twitter:- 80

System Architecture of SNEFT Data Collection Engine downloader ILI Data ARMA Model ILI Prediction Internet crawler OSN Data Novelty Detector Flu Warning OSN models Filter / Predictor State Estimate Math models ILI stands for Influenza-Like Illness

Components of SNEFT Architecture Data Collection Downloader :- stores CDC ILI data/reports into ILI Database. Crawler :- collect publicly available data from online social networking sites. choose a list of keywords that are likely to be of significant. use OSN public search interfaces to collect relative keyword frequencies. store relevant information in a OSN spatio-temporal database. Novelty Detection Detecting transition from "normal" baseline situation to a pandemic in real time by monitoring volume and content of OSN data. provide timely {early stage} warning to public health authorities for investigations.

Components of SNEFT Architecture ILI prediction / ARMA [Auto-regressive Moving Average] Model build ARMA model to predict ILI incidence as a linear function of current and past OSN data and past ILI data. provide valuable preview of ILI cases well ahead of CDC reports. Integration with mathematical models Mathematical models to understand dynamics of influenza spread & effects of intervention. parameters are obtained by fitting historical data. build an "OSN sensor model" which describes what would be observed on OSN if the population is infected as such and such." integrate real time OSN data with the prediction of mathematical models, to obtain a posterior estimate of the " infected state" of the population. possible parameter values not consistent with OSN observations are weighted less, while those consistent are weighted more. OSN data "sharpen" the prediction of mathematical models.

OSN Data Collection Facebook Search API Result Set (Public Posts) containing Keywords HTML Content Scrapper Content, Timestamp Database Profile Id Facebook Profile Scan Engine Individual Users Organizations Profile Info, Location Details Community Design of the Facebook data collection engine / Crawler

Facebook Data Collection / Crawler Facebook Search Engine sign-in with a valid account. enter keyword to search with "Post by everyone" option to retrieve status updates and posts of users containing the keyword. Result Set containing Keyword Privacy settings :- user can publish his post/update to friends, group, or everyone. The "everyone" option (default setting) makes corresponding updates available to public and searchable by Facebook search engine. Results are available for public viewing for limited time span.

Facebook Data Collection / Crawler HTML Content Scrapper a screen scrapper for web pages. extract useful information out of posts that are returned as result set from the keyword search. Search response HTML content is input onto DOM Parser/Regular expression matcher and techniques of pattern matching are applied. retrieve profile ID time-stamp of the post post content {with story_id}.

Facebook Data Collection / Crawler Facebook Profile Scan Engine Given a profile ID, we will retrieve the detailed information of the profile name gender age affiliations (school, work, region) birthday location education history friends count. Profile last update time. profile may belong to an individual user, an organization, or a community.

Constraints in OSN Data Collection Search Rate Limit Return Result Limit User Activity Pattern disparities in user activities different hours of the day days of a week special holidays. Continuous Data Collection/ prevent Data Loss schedule search time to guarantee complete set of blog posts containing the keywords, no gap in the collected data.

Mitigation Search Rate Limit Constraint Resolution launch multiple concurrent search sessions from different IP addresses. to coordinate among themselves and collect data at different time intervals so that each session is within the search rate limit. Return Result Limit continuous http request and store response. Continuous Data Collection mechanism

Exponentially Weighted Moving Average (EWMA) scheme in OSN Data Collection EWMA Scheduling Mechanism to prevent data Loss volume of returned search results determine no. of active search sessions. Denote the estimated average and current search result volume at search round k by v(k) and u(k), respectively, α is the smoothing factor that reflects the weight of the previous estimate. EWMA(search result volume) is computed as follows: v(k) = αv(k-1) + (1 - α)u(k) If the required rate exceeds the rate limit, new search sessions will be triggered to share the load. When the search result volume becomes lighter, the number of active search sessions will be reduced.

Detection and Prediction (SIR Model) Susceptible-Infectious-Removed (SIR) model where the dynamics of the population in each compartment is described by ds = -βsi; dt Susceptible di = βsi -ϒI; dt Loss of immunity Infection R = N - S - I; N being the total population, β the transmission rate, ϒ the recovery rate. Removed Recovery Infectious let x(t) be the "state" of the population, which in this case is given by x = [S,I] T. θ be the parameter vector used in model, which is given by θ= [β,ϒ] T. (or death) Transition Probability of disease spread Prob(x(t+1) x(t), θ)

Initial Stage Results

Conclusion and Future Work achieve faster and near real time detection. predict emergence and spread of influenza epidemic. presented the design of a system called SNEFT, for collecting and aggregating OSN data, extracting information from it, and integrating it with mathematical models of influenza. OSN data - individually noisy but collectively revealing. potential use - disaster relief, supply chain management, epidemic vigilance.

Thank You