UNLOCKING VALUE WITH DATA SCIENCE BAYES APPROACH: MAKING DATA WORK HARDER

Similar documents
THE MAKING OF MEMORIES. November 2016

MS&E 226: Small Data

The Century of Bayes

Bayes theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event.

Patrick Breheny. January 28

THE GOOD, THE BAD, & THE UGLY: WHAT WE KNOW TODAY ABOUT LCA WITH DISTAL OUTCOMES. Bethany C. Bray, Ph.D.

Probability II. Patrick Breheny. February 15. Advanced rules Summary

Artificial Intelligence Programming Probability

Response to the ASA s statement on p-values: context, process, and purpose

Probability Models for Sampling

Bayesian and Frequentist Approaches

OCW Epidemiology and Biostatistics, 2010 David Tybor, MS, MPH and Kenneth Chui, PhD Tufts University School of Medicine October 27, 2010

Bayes Theorem Application: Estimating Outcomes in Terms of Probability

Exploring Experiential Learning: Simulations and Experiential Exercises, Volume 5, 1978 THE USE OF PROGRAM BAYAUD IN THE TEACHING OF AUDIT SAMPLING

Hypothesis-Driven Research

Supplementary notes for lecture 8: Computational modeling of cognitive development

Bayesians methods in system identification: equivalences, differences, and misunderstandings

August 29, Introduction and Overview

BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT QUADRATIC (U-SHAPED) REGRESSION MODELS

Evaluating you relationships

CASE STUDY 2: VOCATIONAL TRAINING FOR DISADVANTAGED YOUTH

Bayesian performance

Bradford District Community Advice Network (CAN)

AUTISM PROFESSIONALS AWARDS 2019

The Human Side of Science: I ll Take That Bet! Balancing Risk and Benefit. Uncertainty, Risk and Probability: Fundamental Definitions and Concepts

Horizon Research. Public Trust and Confidence in Charities

Biostatistics Lecture April 28, 2001 Nate Ritchey, Ph.D. Chair, Department of Mathematics and Statistics Youngstown State University

Chapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc.

ADHD. What you need to know

Problem Set and Review Questions 2

ST440/550: Applied Bayesian Statistics. (10) Frequentist Properties of Bayesian Methods

Attention and Concentration Problems Following Traumatic Brain Injury. Patient Information Booklet. Talis Consulting Limited

Enumerative and Analytic Studies. Description versus prediction

ACT Aspire. Science Test Prep

JUNIOR SEMINAR 3: MYERS-BRIGGS TYPE INDICATOR MARC TUCKER

The Next 32 Days. By James FitzGerald

FOREVER FREE STOP SMOKING FOR GOOD B O O K L E T. StopSmoking. For Good. What If You Have A Cigarette?

Lecture 9 Internal Validity

Sawtooth Software. MaxDiff Analysis: Simple Counting, Individual-Level Logit, and HB RESEARCH PAPER SERIES. Bryan Orme, Sawtooth Software, Inc.

ADD Overdiagnosis. It s not very common to hear about a disease being overdiagnosed, particularly one

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data

Using Health Economics to Inform the Development of Medical Devices. Matthew Allsop MATCH / BITECIC

PSYC1024 Clinical Perspectives on Anxiety, Mood and Stress

Model calibration and Bayesian methods for probabilistic projections

Understanding Statistics for Research Staff!

Statistical Hocus Pocus? Assessing the Accuracy of a Diagnostic Screening Test When You Don t Even Know Who Has the Disease

CHAPTER 5: PRODUCING DATA

Coherence and calibration: comments on subjectivity and objectivity in Bayesian analysis (Comment on Articles by Berger and by Goldstein)

Chapter 8 Estimating with Confidence. Lesson 2: Estimating a Population Proportion

Sheila Barron Statistics Outreach Center 2/8/2011

Managing Your Emotions

Chapter 8 Estimating with Confidence. Lesson 2: Estimating a Population Proportion

How to use the Lafayette ESS Report to obtain a probability of deception or truth-telling

CREATIVE EMPATHETIC PLANFUL. Presented in Partnership With

People s Panel today. You can use your views and experiences to help us help other young people.

Sensory specific satiation: using Bayesian networks to combine data from related studies

Lesson 87 Bayes Theorem

Nature of Science Review

Bayesian Adjustments for Misclassified Data. Lawrence Joseph

Chapter 1 Review Questions

Bayesian Tailored Testing and the Influence

Strategy Strategic delivery: Setting standards Increasing and informing choice. Details: Output: Demonstrating efficiency economy and value

Doing Thousands of Hypothesis Tests at the Same Time. Bradley Efron Stanford University

Chapter 8 Estimating with Confidence

Bayesian Adjustments for Misclassified Data. Lawrence Joseph

Chapter 19. Confidence Intervals for Proportions. Copyright 2010 Pearson Education, Inc.

Chapter 12. The One- Sample

15.301/310, Managerial Psychology Prof. Dan Ariely Recitation 8: T test and ANOVA

Case study. The Management of Mental Health at Work at Brentwood Community Print

The first section of this booklet will help you think about what alcohol can do to your health.

Unraveling Recent Cervical Cancer Screening Updates and the Impact on Your Practice

"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION"

NEUROSCIENCE + CREATIVE DEVELOPMENT UNLOCK YOUR HUMAN POTENTIAL

Introduction to Research Methods

Econ 270: Theoretical Modeling 1

CPS331 Lecture: Coping with Uncertainty; Discussion of Dreyfus Reading

Anne Lise Kjaer, Kjaer Global Society Trends

An Introduction to Bayesian Statistics

Analysis of Variance (ANOVA)

MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks

Bayesian Analysis by Simulation

Biomedical Machine Learning

Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments

CSE 258 Lecture 1.5. Web Mining and Recommender Systems. Supervised learning Regression

Usually we answer these questions by talking about the talent of top performers.

How Food Moves Through The Body

Paul Figueroa. Washington Municipal Clerks Association ANNUAL CONFERENCE. Workplace Bullying: Solutions and Prevention. for

How will being a member of Physio First benefit you?

Study Reveals Consumer Knowledge of Government Regulations

Insight Assessment Measuring Thinking Worldwide

Empowered by Psychometrics The Fundamentals of Psychometrics. Jim Wollack University of Wisconsin Madison

Optimization and Experimentation. The rest of the story

ONCOLOGY: WHEN EXPERTISE, EXPERIENCE AND DATA MATTER. KANTAR HEALTH ONCOLOGY SOLUTIONS: FOCUSED I DEDICATED I HERITAGE

With Marie-Claire Ross The SELLSAFE Communication Mentor

Reliability and Validity

Statistics. Dr. Carmen Bruni. October 12th, Centre for Education in Mathematics and Computing University of Waterloo

How Does Analysis of Competing Hypotheses (ACH) Improve Intelligence Analysis?

Introduction: Statistics and Engineering

Choosing the right Office Chair

Choosing Life: Empowerment, Action, Results! CLEAR Menu Sessions. Substance Use Risk 2: What Are My External Drug and Alcohol Triggers?

Transcription:

UNLOCKING VALUE WITH DATA SCIENCE BAYES APPROACH: MAKING DATA WORK HARDER 2016

DELIVERING VALUE WITH DATA SCIENCE BAYES APPROACH - MAKING DATA WORK HARDER The Ipsos MORI Data Science team increasingly use Bayesian techniques to help our clients make the most of their data. This approach is particularly useful for: BUT WHO IS BAYES AND WHAT IS A BAYESIAN APPROACH? It may surprise you to learn that the side project of a Presbyterian minister in Tunbridge Wells in the mid-18th century is still influencing how we look at big data sets and make predictions today. Reverend Thomas Bayes gave his name to a very simple piece of mathematics which links together information from two stages in a process, or from two sources, to assess probability, or make a prediction this is Bayes Theorem or Rule. BAYES THEOREM EXPLAINED In its simplest form Bayes Theorem involves working backwards from an observation to estimate the probability of how it happened. In the classic formula below, A and B are events, with A happening (or not) before B, P() means the probability of the event in the brackets and the bar mean given that :- making more accurate predictions improving the veracity of complex models compensating where data is sparse Thomas Bayes 1701 1761 One example involves the accuracy of medical tests and their use for screening. Suppose there is a disease for which the test is 99% accurate (ie 1 time out of 100 it gives the wrong result). Now suppose I take the test and it is positive. What is the chance that I have the disease? A Bayes approach says that it depends what my chances of having the disease were before I took the test. So if the disease only affects 1 person in 10,000, then my positive test result is much more likely to be a false positive. This is why a doctor will often only test for a disease if there are other reasons (such as symptoms ) which increase your chances of having it. P (B A) P (A) P (A B) = P (B) So: The Probability of A having happened, given that B has happened is equal to the probability of B happening given that A happened, multiplied by the probability of A happening, divided by the probability of B happening.

To put this into practice, an example from my morning The equation below shows the two ways that we can commute might help. I catch a train to work from a get to the outcome train was late. small station, where just two different drivers are used during the week. Talking to the station staff I discover: Andy works three days, Monday to Friday, while Bob works the other two, but Bob s work days are completely random. When Andy drives, the train is late 50% of the time, but when Bob drives it is late 80% of the time. This morning the train was late so who is more likely to have been the driver? Well of course either driver could have driven the train. One argument is that it is always more likely WE CAN WRITE THE PROBABILITY THAT ANDY WAS THE DRIVER OF THE LATE TRAIN AS: Prob (Andy was the driver, GIVEN THAT the train was late). And using Bayes Theorem, this = { Prob (Andy being the driver) x Prob (train was late) Andy works three days, Monday to Friday, while Bob works the other two, but Bob s work days are completely random. When Andy drives, the train is late 50% of the time, but when Bob drives it is late 80% of the time. This morning the train was late so who is more likely to have been the driver? to modify that assumption so that we get both an estimated value for this thing, as well as an indication of the reliability of that estimate. The reliability will depend on the variation in the data, the sample size, but also now on the veracity of the initial assumption so if the approach is to be beneficial then we need our initial assumption to be good! However, if we assume a flat initial assumption, whereby we have no prior assumptions (statisticians refer to this as a uniform prior ) then the Bayesian } result and the traditional result are numerically the same. So a Bayesian approach should be no worse to have been Andy (because he works more days and in some situations can give us a better result. each week), while another is that Bob is more likely The key to being Bayesian is describing the world in because he is more likely to be late. Bayes Theorem combines those two elements together to give an exact answer. And y or? b o B { Prob (train was late regardless of driver) } terms of probabilities and then using data to adjust those probabilities to get a tighter range of likely = { 0.6 x 0.5 } / { 0.6 x 0.5 + 0.4 x 0.8 } BAYES APPROACH IN A NUTSHELL = 48% We can use Bayes Theorem to improve the accuracy of our predictions and modelling, thanks to Bayesian So it is slightly more likely (52% chance) that Bob drove the train this morning. inference, where Bayes Theorem is used to update the probability for a hypothesis as more evidence or information becomes available. A Bayes approach starts with an assumption about the thing we are interested in and uses observed data outcomes. WHY IS BAYES IMPORTANT? Although Bayes Theorem is described in terms of discrete variables, the rule can be extended to continuous variables, and that is often how we apply it in market research. The key principle is that we describe a situation in terms of probabilities and use data to adjust our description.

HOW IPSOS MORI APPLIES BAYESIAN TECHNIQUES TO ADD VALUE 1. Getting more accurate estimates using Bayesian Estimation we can combine results from a survey with previous knowledge or data. For example, let us suppose that based on our previous surveys on drinking behaviour, we felt that the proportion of the population who regularly drink is between 30% and 40%, with 35% the most likely value. We then survey 2,000 people and find that 38% regularly drink. Now we can combine the new observation (the survey) with the a priori distribution (a probability distribution built from our previous expectations) to create a posterior distribution, which is our new probability distribution for the underlying proportion of the population. set of probabilities of taking different values according to the value of B. This can produce more flexible models (non-linear), but requires more input from the analyst in terms of making and testing assumptions. 3. Optimising combinations - Hierarchical Bayesian models are used in choice exercises (conjoint, max diff) where we give each respondent a limited selection of all possible choices. The models are described as Bayesian, because we take a distribution of parameters, apply the observed data and use it to refine our estimated distribution. We repeat this process a number of times so the estimation process is an iterative method which starts from an initial assumed distribution and converges towards a stable solution. This technique, often used for product, price and feature optimisation allows segmentation at a more individual level based on underlying needs. SO WHAT ARE THE BENEFITS FOR AN END USER? The key benefit is the efficient use of all the data that we have both survey and other sources to obtain a more robust measurement. However there are 3 key additional benefits: 1. Where we have strong confidence in our prior information then we can quote a narrower interval of reliability. 2. We can now run multilevel modelling even with very low numbers of groups. 3. Respondent-level estimates (including conjoint utilities) mean that we can provide more intriguingly titled MAD Bayes to investigate what benefits this may bring to our analysis of big data for our clients. For example, a recent paper by Bradley Efrom (RSS Journal 2015) has shown a way to give the frequentist accuracy of a Bayesian estimate which should enable us to provide confidence intervals for a wider variety of modelling work. Finally - the health warning. Like all analytical work, we should end with a warning about limitations and reliability. There are situations where a Bayesian approach does not make sense when a population is changing for example and/or we are measuring that change. Also a warning to beware of two claims we have seen recently: 1. We don t need as many data points if we use a Bayesian approach 2. Improving key driver analysis using Bayesian Modelling and Bayesian Networks we can better understand the relationship between two variables. This might be, for example, a satisfaction score (A) alongside a performance rating for a key attribute (B). Typical modelling work attempts to find some linear relationship (eg satisfaction score = 4.1 + 0.3 * delivery performance + random error ) whereas Bayesian Networks allow for the relationship between A and B to be driven by statistical distributions, so that A has a 4. Greater granularity Small Area Estimation is where we again apply Hierarchical Bayesian approaches to build a series of separate models for different areas covered within a larger survey in order to achieve greater granularity, for example for implementation of local or regional results. sophisticated simulation tools to clients. These give better estimates of, for example, the source of volume from product changes, which we use to measure the risk to overall profitability. NEW BAYESIAN TECHNIQUES We are constantly exploring applications of new Bayesian methods such as Approximate Bayesian Computation, Variational Bayesian Methods and the 2. We don t need as robust a sample if we use a Bayesian approach. Although there may be a small element of truth behind these statements, we would in general disagree with their widespread application as a panacea to poorly designed surveys. Remember - if you are using less data or less accurate data, then you must be relying on something else. Ensure first that your assumption about the prior is robust.

ABOUT IPSOS MORI CONTACT Ipsos MORI, part of the Ipsos group, is one of the UK s largest and most innovative research agencies, working for a wide range of global businesses, the FTSE100 and many government departments and public bodies. We specialise in solving a range of challenges for our clients, whether related to business, consumers, brands or society. In the field of data science, we have a large and diverse team of experts including mathematicians, statisticians, data scientists and behavioural economists. We are constantly seeking to break new ground in the understanding and application of large and complex data sets. We are passionately curious about people, markets, brands and society. We deliver information, and analysis that makes our complex world easier and faster to navigate and inspires our clients to make smarter decisions. Clive Frostick Head of Analytics, Ipsos MORI T: +44 (0)20 8861 8755 E: clive.frostick@ipsos.com www.ipsos-mori.com @ipsosmori