Statistical Issues in Road Safety Part I: Uncertainty, Variability, Sampling

Similar documents
Handout 16: Opinion Polls, Sampling, and Margin of Error

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

SECONDARY DATA ANALYSIS: Its Uses and Limitations. Aria Kekalih

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

ST440/550: Applied Bayesian Statistics. (10) Frequentist Properties of Bayesian Methods

Chapter 1: The Nature of Probability and Statistics

Outline. Chapter 3: Random Sampling, Probability, and the Binomial Distribution. Some Data: The Value of Statistical Consulting

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F

Sta 309 (Statistics And Probability for Engineers)

PREVENTING DISTRACTED DRIVING. Maintaining Focus Behind the Wheel of a School Bus

Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14

Assessment Schedule 2015 Mathematics and Statistics (Statistics): Evaluate statistically based reports (91584)

To the Point of Distraction

Journal of Exclusive Management Science November Vol 6 Issue 11 ISSN

Still important ideas

Review and Wrap-up! ESP 178 Applied Research Methods Calvin Thigpen 3/14/17 Adapted from presentation by Prof. Susan Handy

Math 124: Modules 3 and 4. Sampling. Designing. Studies. Studies. Experimental Studies Surveys. Math 124: Modules 3 and 4. Sampling.

The Severity of Pedestrian Injuries in Alcohol-Related Collisions

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Chapter 3. Producing Data

STATS8: Introduction to Biostatistics. Overview. Babak Shahbaba Department of Statistics, UCI

Review of Pre-crash Behaviour in Fatal Road Collisions Report 1: Alcohol

South Australian Alcohol and Other Drug Strategy

Lec 02: Estimation & Hypothesis Testing in Animal Ecology

Serious Road Traffic Injuries a perspective for Europe

Unit 1 Exploring and Understanding Data

Still important ideas

Crash Risk Analysis of Distracted Driving Behavior: Influence of Secondary Task Engagement and Driver Characteristics

STA 291 Lecture 4 Jan 26, 2010

Impaired Driving. Javier Romero MD FACS. Co-Director Department of Trauma Services Ventura County Medical Center

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method

Distributions and Samples. Clicker Question. Review

Lecture 4: Research Approaches

You can t fix by analysis what you bungled by design. Fancy analysis can t fix a poorly designed study.

Chapter 1: Exploring Data

Lab 2: The Scientific Method. Summary

Lecture Outline Biost 517 Applied Biostatistics I. Statistical Goals of Studies Role of Statistical Inference

CHAPTER 5: PRODUCING DATA

The Four Pillars of Traffic Safety

Understandable Statistics

Lecture Outline Biost 517 Applied Biostatistics I

Homo heuristicus and the bias/variance dilemma

Chapter 3. Producing Data

Drunk Driving in Injury and Fatal Accidents in France in Interaction between Alcohol and other Offences

Design, Sampling, and Probability

How Safe Are Our Roads? 2016 Checkpoint Strikeforce campaign poster celebrating real area cab drivers as being Beautiful designated sober drivers.

Study Methodology: Tricks and Traps

The Human Side of Science: I ll Take That Bet! Balancing Risk and Benefit. Uncertainty, Risk and Probability: Fundamental Definitions and Concepts

Alcohol Addiction. Peer Pressure. Handling Social Pressures. Peer Pressure 2/15/2012. Alcohol's Effect on One s Health and Future.

Survey Research Centre. An Introduction to Survey Research

I just didn t see it It s all about hazard perception. Dr. Robert B. Isler Associate Professor School of Psychology University of Waikato Hamilton

STA Module 9 Confidence Intervals for One Population Mean

Welcome to this third module in a three-part series focused on epidemiologic measures of association and impact.

Distracted Driving. Stephanie Bonne, MD

UNDER-REPORTING OF ACCIDENTS

Validity and reliability of measurements

Data collection, summarizing data (organization and analysis of data) The drawing of inferences about a population from a sample taken from

Psychological. Influences on Personal Probability. Chapter 17. Copyright 2005 Brooks/Cole, a division of Thomson Learning, Inc.

Expert judgements in risk and reliability analysis

Distraction, Cognition, Behaviour and Driving Analysis of a large data set

Welcome to OSA Training Statistics Part II

Vocabulary. Bias. Blinding. Block. Cluster sample

Canadian Attitudes Toward Random Breath Testing (RBT) Conducted by Ipsos Reid April, 2010

Chapter 2 Doing Sociology: Research Methods

investigate. educate. inform.

Math 124: Module 3 and Module 4

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Recreational marijuana and collision claim frequencies

A situation analysis of health services. Rachel Jewkes, MRC Gender & Health Research Unit, Pretoria, South Africa

Biost 590: Statistical Consulting

CASE STUDY 2: VOCATIONAL TRAINING FOR DISADVANTAGED YOUTH

Math 140 Introductory Statistics

County-Level Small Area Estimation using the National Health Interview Survey (NHIS) and the Behavioral Risk Factor Surveillance System (BRFSS)

Subject: The Case for a Provincial 0.00% BAC Limit for All Drivers Under the Age of 21

REVIEW FOR THE PREVIOUS LECTURE

MS&E 226: Small Data

Statistics: A Brief Overview Part I. Katherine Shaver, M.S. Biostatistician Carilion Clinic

Analysis of TB prevalence surveys

Current New Zealand BAC Limit. BAC (mg/100ml)

Funnelling Used to describe a process of narrowing down of focus within a literature review. So, the writer begins with a broad discussion providing b

Profiling Drivers Risky Behaviour Towards All Road Users

Pooling Subjective Confidence Intervals

WDHS Curriculum Map Probability and Statistics. What is Statistics and how does it relate to you?

Strategic Sampling for Better Research Practice Survey Peer Network November 18, 2011

INVESTIGATION SLEEPLESS DRIVERS

Defining and Measuring Recent infection

Crash Risk of Alcohol Impaired Driving

The Logic of Data Analysis Using Statistical Techniques M. E. Swisher, 2016

Objectives. Data Collection 8/25/2017. Section 1-3. Identify the five basic sample techniques

Elaboration for p. 20:

First Problem Set: Answers, Discussion and Background

Chapter 2 Psychological Research, Methods and Statistics

Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha

Running head: FALSE MEMORY AND EYEWITNESS TESTIMONIAL Gomez 1

FMEA and RCA: the mantras * ; of modern risk management. FMEA and RCA really do work to improve patient safety. COMMENTARY FMEA and RCA.

Introduction to biostatistics & Levels of measurement

Objectives. Distinguish between primary and secondary studies. Discuss ways to assess methodological quality. Review limitations of all studies.

Ethics of Connected and Automated Vehicles

Measuring cancer survival in populations: relative survival vs cancer-specific survival

Examining Relationships Least-squares regression. Sections 2.3

Transcription:

Statistical Issues in Road Safety Part I: Uncertainty, Variability, Sampling Shrikant I. Bangdiwala, PhD McMaster University Hamilton, Canada 1

Why statistics in road safety research? Our questions are not simple: When and how accidents occur? Understanding a situation observe & estimate Why accidents occur? Understanding relationships observe & estimate; association models Exposure What can affect occurrence of accidents? Outcome Evaluation of actions experimental studies; intervene and then observe & estimate; Action Outcome test effectiveness 2

Why statistics in road safety research? Road safety and traffic issues are complex: When and how accidents occur? Multiple inter-connected factors Why accidents occur? Multiple factors may be associated; but causal relationship? What is the risk of occurrence? probability, chance, usually not 0 or 1 What can affect occurrence of accidents? Variability in exposures and in probabilities of occurrence Need proper experimental designs Uncertainties 3

Statistics definitions I & II Old definition measurements of the state: stat & ics Summarized description of the population Still used today: Census e.g. injury surveillance, FARS, IRTAD Counts: Police records of reported crashes e.g. FIR All hospitalizations due to trauma All insurance claims for injuries/deaths Definition based on how to misuse/abuse information A way to cheat and lie A way to find results Over-emphasis on significance and p-values 4

Statistics definition III Scientific definition measurements on a sample from the population 5

Role of statistics in addressing our questions Addressing our research questions in the face of uncertainty Inherent variability in what we are studying Incompleteness of information from sampling Role of chance Statistics is the methodological science that allows for the understanding of quantitative information in the midst of uncertainty 6 Quantify it, Understand it, Reduce it, Control it Probability (risk) models Descriptive analyses Controlled studies Regression models

Modeling risks We want to understand risks We need to control uncertainty in the estimation of the risks Risk model of a trend in a given locality mathematical functions Risk models in multiple individuals or localities statistical models Statistical methods are concerned with ways to control uncertainty reduce variability reduce sampling uncertainty to understand estimates of risks or relationships among quantitative factors and risks in a population 7

Probabilities are not well understood A probability is a theoretical mathematical concept Derived from theoretical postulates updated with data [Bayes] Estimated from data frequency approach Properties 0 1 A Not A A Pr(A B) B Not A or B Pr(A) + Pr(Not A) = 1 Pr(A B) = Pr(A) + Pr(B) Pr(A B) Pr(A B) + Pr(Not A or B) = 1 8

Probabilities are not well understood A probability is a prediction in the future, it does not provide a certainty What is the probability of electrocution? Is the probability of rain wrong? 9

Probabilities are not well understood Relative risks of driving under different scenarios against not using phone Talking on a handheld phone Talking on a hands-free phone Drunk with BAC=0.10% Texting or reading email Talking with an adult passenger 10

Probabilities are not well understood Probabilities of being in a crash are low 100% Proportion willing to take preventive action 0% 0 1 Probability of being in a crash But the expected loss is HIGH: E(L) = Pr(crash) * L(per crash) * Exposure(t) 11

Uncertainty When we estimate risks as a probability we do it with uncertainty!! Instantaneous conditional risk hazard function h(t X) Number of accidents/victims distribution function (e.g. Poisson model, Negative binomial model, ) Example: Delhi pedestrian risks from individual to collective Individual risk is very low ~ 0.00007 = 7 * 10-5 [how obtained?] Collective risk is high since exposure is high 13,000,000 exposed [who is exposed?] expect 910 pedestrian fatalities 12

Trends in road fatalities Provides estimates of risk of dying in a crash https://community.data.gov.in/stateut-wise-number-of-persons-killed-in-road-accidents-per-lakh-population-during-2009-2012/ 13

Uncertainty When we estimate risks we do it with uncertainty!! Addressing our research questions in the face of uncertainty Inherent variability in what we are studying Incompleteness of information from sampling Role of chance Also: measurement error in everything we study! Estimating numerator: outcomes Estimating denominator: exposures 14

The study of variability Every crash is so particularly, uniquely different Statisticians do NOT study individual crashes or persons, but study groups of crashes or persons The behavior of the group is called the distribution of the behavior Researchers focus on the central tendency (mean, median, mode) Statisticians focus on the variability (variance, range) Abbreviated Injury Score (AIS) in emergency department patients 15

Incompleteness Uncertainty In order to understand a situation must study several occurrences HOW MANY? Since we cannot usually study ALL situations, we study an incomplete subset A sample is never complete, leading to uncertainty How representative is it of the complete set? 16

Why do we have uncertainty? Uncertainty from variability & incompleteness Assume we want to study a population POPULATION 17

Why do we have uncertainty? Uncertainty from variability & incompleteness POPULATION If all in a population are exactly the same, then we need to study 18

Why do we have uncertainty? Uncertainty from variability & incompleteness POPULATION Subjects in a population are NOT exactly the same, so then we need to study 19

Why do we have uncertainty? Uncertainty from variability & incompleteness POPULATION We sample a few We have observed an incomplete part of the population Q1: Is the sample representative? Q2: Is the sample size adequate? 20

Why do we have uncertainty? Uncertainty from chance We sample a few Chance gave us the following sample Q1: Is the sample representative? POPULATION 21

Why do we have uncertainty? Uncertainty from chance We sample a few Chance gave us the following sample Q1: Is the sample representative? POPULATION 22

Why do we have uncertainty? Uncertainty from sampling 20 possible samples of size 3 all equally likely to happen POPULATION We usually take only 1 sample Chance gives 1 of many possible The one we get is the luck of the draw!! We use it to guess at the population, but we are never certain! 23

Uncertainty How can we eliminate the uncertainty? Reduce: stratified sampling Eliminate: study the entire population! Census; all medical records; all car crashes, POPULATION There is no need for statistics, except for summarizing information but, $$$ and often impractical or impossible! 24

Sampling process How do we select the sample? Criteria Sample should be like the population representative Sample should be selected without introducing personal biases objective Sample should provide a correct estimate of the population parameter unbiased Sample should provide a precise estimate of the population parameter adequate size Probability sample = we know the probability of selection of each person in the population 25

Sampling process Probability samples Simple random sample Systematic random sample Stratified random sample Cluster random sample Area random sample Complex multi-stage probability sample What about purposively selected sample? Convenience sample = garbage sample internet sample? What about not sampling and studying the entire population? 26

What about BIG data? Large, fast computers can handle HUGE datasets Data mining methodologies permit finding trends BUT, if the HUGE dataset is biased, the bias is NOT gone 27

Other sources of uncertainty Imprecision Systematic errors biases Systematic measurement errors Recall bias Observer (instrument) bias Data sources have different quality classification bias Systematic sampling errors Selection biases Data sources different coverage Non-response bias missing data Random errors Variation due to measurement Variation due to sampling chance! 28

How can statistics help us? Statistics helps understand the behavior of quantitative data in GROUPS In a population, we want to know: Behavior of a single variable at a given time point risks Behavior of single variable over time - trends Behavior of multiple variables relationships In a sample from the population, we are able to obtain: Behavior of a single variable at a given time point estimation Behavior of single variable over time time series analyses Behavior of multiple variables regression models 29

Research questions in Road safety What are the effects on risks of doing X? X = decisions in engineering, planning, regulation & policy; education, Examine links between variables/factors and safety risks Themes Accident analysis and prevention Behavioral and social issues Trauma care services Legal and compliance issues relationships 30

Unique issues in injury research Non-constant exposure impact on appropriateness of indicators Counting rare events impact on demonstrating effects and distributional models Multiple factors complexities Intervening on the extreme cases regression to the mean Study design options observational vs experimental 31

Exercise Research Question: Do lower speeds lead to safer roads? How do we answer this question? What type of study? How we define lower? How do we define safer? Who or what do we study? How many? Who or what do we compare results to? How many? What data do we collect? How do we measure it? When do we measure? For how long do we measure? What is a meaningful relationship? How can we know if what we observe could have been due to chance? 32