
Lauren Porter, Gloria Yeomans-Maldonado, Ann A. O'Connell

Not All Zeros Are Created Equal: Zero-Inflated and Hurdle Models for Counts with Excess Zeros

Background: School absenteeism has been shown to be associated with a variety of individual, family, and school-based factors (Kearney, 2008). In the general student population, the distribution of counts of student absences is usually characterized by a marked positive skew due to a high frequency of zero absences. Studies examining predictors of absenteeism have used varied approaches to address the distribution of student absences and the preponderance of zeros, including dichotomization of the outcome (e.g., normal- vs. high-absence), ordinal grouping (e.g., no/low/high absences), or analyses of subgroups of students that omit non- or low-absentee individuals. Our aim in this paper is to contribute to the methodological literature for examining absenteeism and to clarify the benefits and challenges of using count models for data with excess zeros. In particular, we distinguish between zero-inflated and hurdle models, for which the generation of zeros is differentially conceived, and demonstrate their application using data from the National Longitudinal Survey of Youth.

Traditional regression-based methods applied to count data, such as ordinary least squares models, assume a normal distribution of the residuals and risk understating the influence of excess zeros (Hu, Pavlicova, & Nunes, 2011). Poisson regression is a model specifically for counts that is free of the assumption of normal residuals, but it carries its own assumption about the shape of the distribution; namely, that the mean and variance of the count distribution are equal. This assumption is rarely met in practice, particularly when excess zeros are present.
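The equal mean and variance assumption can be checked informally before any model is fit. The following is a minimal sketch, not part of the original analysis: the function name `dispersion_summary` and the ten absence counts are hypothetical, and the sample is assumed to have a nonzero mean. It compares the sample variance with the sample mean and the observed share of zeros with the share a Poisson distribution with the same mean would imply.

```python
import math
from statistics import mean, variance


def dispersion_summary(counts):
    """Compare a count sample with what a Poisson of equal mean implies.

    Assumes the sample mean is nonzero (otherwise the ratio is undefined).
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    observed_zero_share = sum(1 for c in counts if c == 0) / len(counts)
    return {
        "mean": m,
        "variance": v,
        # A ratio well above 1 suggests overdispersion relative to Poisson.
        "dispersion_ratio": v / m,
        "observed_zero_share": observed_zero_share,
        # Under Poisson(m), P(Y = 0) = exp(-m).
        "poisson_zero_share": math.exp(-m),
    }


# Hypothetical absence counts for ten students (not NLSY79 data).
absences = [0, 0, 0, 0, 0, 0, 1, 2, 3, 14]
summary = dispersion_summary(absences)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this made-up sample the variance is several times the mean and the observed share of zeros is far above the Poisson-implied share, which is the pattern that motivates the models discussed next.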
The Poisson distribution is a discrete probability distribution expressing the chances of a given number of events occurring in a fixed interval of time (Haight, 1967) and can be considered a special case of the Negative Binomial (NB) distribution. The NB distribution is also a discrete probability distribution for the number of cases falling into one of two outcome categories, without the Poisson constraint that the mean and variance be identical. A limitation of NB regression is the difficulty of telling whether apparent overdispersion reflects actual overdispersion or an error in the systematic part of the model (Berk & MacDonald, 2008).

As an alternative to Poisson models for count data, with or without adjustment for overdispersion, zero-inflated models provide a way of treating data with a preponderance of zeros. These models consist of two distributions: one with only zeros, and another with both zeros and non-zeros (Lambert, 1992). The zero-inflated Poisson model is essentially a mixture of a Bernoulli distribution and a Poisson distribution, in which the count component has a nonzero probability of generating zeros. Limitations of this approach include the assumption that the zeros all came from the same place (SAS, 2015).

Hurdle regression, originating with Cragg (1971) and later popularized by Mullahy (1986), comprises two separate processes in one analysis: a binary process and a count process. A Bernoulli probability governs the binary outcome and, for non-zero cases, the conditional distribution of the positives is governed by a truncated-at-zero count data model. One instance of hurdle models in the educational literature assesses misreported binary outcomes in randomized controlled trials (Schochet, 2013). An additional example uses zero-inflated and overdispersed count models to explore school suspensions (Desjardins, 2016). We will demonstrate the utility of these approaches for absenteeism data.
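The difference between the two ways of generating zeros can be made concrete with the probability mass functions. The sketch below is an illustration under stated assumptions, using the Poisson versions of both models: the function names and the parameter values (`pi`, `p_zero`, `lam`) are hypothetical and are not estimates from this study. In the zero-inflated form a zero can come from either component, whereas in the hurdle form every zero comes from the binary part and the count part is truncated at zero.

```python
import math


def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson distribution with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: with probability pi the outcome is a
    structural zero; otherwise it is drawn from Poisson(lam), which can
    itself produce a zero."""
    from_count_part = (1 - pi) * poisson_pmf(k, lam)
    return pi + from_count_part if k == 0 else from_count_part


def hurdle_poisson_pmf(k, p_zero, lam):
    """Hurdle Poisson: the binary part alone decides zero vs. positive;
    positives follow a Poisson(lam) truncated at zero."""
    if k == 0:
        return p_zero
    return (1 - p_zero) * poisson_pmf(k, lam) / (1 - math.exp(-lam))


# Hypothetical parameter values, chosen only for illustration.
for k in range(4):
    print(k, round(zip_pmf(k, 0.3, 2.0), 4), round(hurdle_poisson_pmf(k, 0.6, 2.0), 4))
```

Both functions define proper distributions, so their probabilities sum to one over the non-negative integers; they differ only in where the zeros are allowed to originate.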
Purpose: We focus on the defining features of the Zero-Inflated and Hurdle models as applied to educational data, with both Poisson and Negative Binomial assumptions. Our empirical illustration will showcase the utility of models designed to incorporate zeros from multiple generating processes. We will examine predictors of absences as a vehicle to highlight the usefulness and applicability of these approaches.

Sample and Data Collection: Pre-existing public-use data were used for this research. Administrative records documenting 9th grade absences from the National Longitudinal Survey of Youth (NLSY79) were analyzed, consisting of a nationally representative sample of 12,686 individuals … years old initially interviewed in …. Cases with missing data for 9th grade absences were excluded from analysis.

Research Design and Analysis: Data were analyzed through the following regression methods: Poisson, Negative Binomial, Zero-Inflated Poisson, Zero-Inflated Negative Binomial, and Hurdle Negative Binomial. The results of these analyses will serve as a demonstration of how these models can best be applied in education research on absenteeism. Finite Mixture Modeling (FMM) in SAS 9.4 was used to analyze the data. Variables include the dependent variable (number of student absences) and four predictors: a measure of general health, enrollment in remedial English, household income, and how many times the student was charged with an illegal act (Charlton et al., 1991; Havik, Bru, & Ertesvåg, 2015). For space considerations, individual- and family-context predictors are demonstrated here; our presentation will also examine school-context predictors.

Results: Figure 1 displays the distribution of the dependent variable, which contains a substantial number of zeros. Table 1 presents parameter estimates for the models. The Poisson Hurdle model experienced convergence errors; these errors will be addressed leading up to the presentation. One variable, the number of times a student was charged with illegal activity, is a significant predictor across all models. Controlling for all other variables in the model, each additional charge of illegal activity was associated with an expected increase in absences.
More telling is the difference in the mixing probabilities between the Zero-Inflated and Hurdle models. The Negative Binomial Hurdle model indicates that 25.72% of students' individual position on this distribution is accounted for by either the binary or truncated negative binomial processes. The Zero-Inflated Negative Binomial model indicates that 98.25% of students' individual position on this distribution is accounted for by either the binary or truncated negative binomial processes.

Conclusions: The utility of the Zero-Inflated and Hurdle models lies in their ability to handle zeros in ways that are useful to educational researchers. The Hurdle model assumes multiple data-generating processes are at play, which is not an assumption that can typically be made without first-hand knowledge of the data collection process and outcome measure. Additional features of both approaches will be highlighted during the presentation, including the importance of data collection knowledge, a potential source of bias in handling cases with excess zeros.
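To show how a mixing probability arises in a zero-inflated model, here is a minimal sketch of an intercept-only zero-inflated Poisson fit by expectation-maximization. It is an illustration under simplifying assumptions (no predictors, Poisson rather than Negative Binomial counts, made-up data) and is not the SAS FMM analysis reported above; the function name `fit_zip_em` and the sample are hypothetical.

```python
import math


def fit_zip_em(counts, iterations=200):
    """Fit an intercept-only zero-inflated Poisson by EM.

    Returns (pi, lam): pi is the mixing probability of the zero-only
    component and lam is the Poisson rate of the count component.
    """
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    pi, lam = 0.5, max(total / n, 1e-8)  # crude starting values
    for _ in range(iterations):
        # E-step: chance that an observed zero is a structural zero.
        z = pi / (pi + (1 - pi) * math.exp(-lam))
        structural = n_zero * z
        # M-step: update the mixing probability and the Poisson rate,
        # crediting the remaining cases to the count component.
        pi = structural / n
        lam = total / (n - structural)
    return pi, lam


# Hypothetical sample of 100 absence counts with 60 zeros (not NLSY79 data).
counts = [0] * 60 + [1] * 5 + [2] * 10 + [3] * 12 + [4] * 8 + [5] * 5
pi, lam = fit_zip_em(counts)
zip_zero_prob = pi + (1 - pi) * math.exp(-lam)
hurdle_zero_prob = counts.count(0) / len(counts)
print(f"pi = {pi:.3f}, lambda = {lam:.3f}")
print(f"fitted P(0): ZIP = {zip_zero_prob:.3f}, hurdle = {hurdle_zero_prob:.3f}")
```

In this sketch the estimated `pi` is smaller than the observed share of zeros because some zeros are credited to the count component, while an intercept-only hurdle model assigns every zero to its binary part. This is one reason component probabilities reported for the two model families are not directly comparable.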

References

Berk, R., & MacDonald, J. M. (2008). Overdispersion and Poisson regression. Journal of Quantitative Criminology, 24(3).

Charlton, A., Larcombe, I. J., Meller, S. T., Jones, P. H. M., Mott, M. G., Potton, M. W., Walker, J. J. P. (1991). Absence from school related to cancer and other chronic conditions, (36).

Cragg, J. G. (1971). Some Statistical Models for Limited Dependent Variables with Application to the Demand for Durable Goods. Econometrica, 39(5).

Desjardins, C. D. (2016). Modeling Zero-Inflated and Overdispersed Count Data: An Empirical Study of School Suspensions, 973(July).

Haight, F. A. (1967). Handbook of the Poisson distribution. New York: Wiley.

Havik, T., Bru, E., & Ertesvåg, S. K. (2015). truancy-related reasons for school non-attendance, (123).

Hu, M.-C., Pavlicova, M., & Nunes, E. V. (2011). Zero-Inflated and Hurdle Models of Count Data with Extra Zeros: Examples from an HIV-Risk Reduction Intervention Trial. The American Journal of Drug and Alcohol Abuse, 37(5).

Lambert, D. (1992). Zero-Inflated Poisson Regression, With an Application to Defects in Manufacturing. Technometrics, 34(1).

Mullahy, J. (1986). Specification and testing of some modified count data models. Journal of Econometrics, 33.

National Longitudinal Survey of Youth. The NLSY79 Sample: An Introduction. Obtained August 31, from sample-introduction.

SAS. Usage Note 48506: Fitting Hurdle Models. Obtained Jan. 15, 2015 from

Schochet, P. Z. (2013). A Statistical Model for Misreported Binary Outcomes in Clustered RCTs of Education Interventions. Journal of Educational and Behavioral Statistics, 38(5).

Figure 1: NLSY Data Histogram

Table 1: Model Parameter Estimates
Columns: Model, Estimate, SE, p-value, Absences Mixing Probabilities.
Rows under each model: Intercept, General Health, Remedial English, Activities, K).

Model                              Mixing Probabilities
Poisson                            N/A
Zero-Inflated Poisson              Component 1 = .9325, Component 2 = …
Negative Binomial                  N/A
Zero-Inflated Negative Binomial    Component 1 = .9825, Component 2 = …
Hurdle Negative Binomial           Component 1 = .2572, Component 2 = …

Lecture Outline. Biost 517 Applied Biostatistics I. Purpose of Descriptive Statistics. Purpose of Descriptive Statistics

Lecture Outline. Biost 517 Applied Biostatistics I. Purpose of Descriptive Statistics. Purpose of Descriptive Statistics Biost 517 Applied Biostatistics I Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture 3: Overview of Descriptive Statistics October 3, 2005 Lecture Outline Purpose

More information

2.75: 84% 2.5: 80% 2.25: 78% 2: 74% 1.75: 70% 1.5: 66% 1.25: 64% 1.0: 60% 0.5: 50% 0.25: 25% 0: 0%

2.75: 84% 2.5: 80% 2.25: 78% 2: 74% 1.75: 70% 1.5: 66% 1.25: 64% 1.0: 60% 0.5: 50% 0.25: 25% 0: 0% Capstone Test (will consist of FOUR quizzes and the FINAL test grade will be an average of the four quizzes). Capstone #1: Review of Chapters 1-3 Capstone #2: Review of Chapter 4 Capstone #3: Review of

More information

Design, Sampling, and Probability

Design, Sampling, and Probability STAT 269 Design, Sampling, and Probability Three ways to classify data Quantitative vs. Qualitative Quantitative Data: data that represents counts or measurements, answers the questions how much? or how

More information

Correlation and regression

Correlation and regression PG Dip in High Intensity Psychological Interventions Correlation and regression Martin Bland Professor of Health Statistics University of York http://martinbland.co.uk/ Correlation Example: Muscle strength

More information

Statistical Models for Dental Caries Data

Statistical Models for Dental Caries Data Statistical Models for Dental Caries Data David Todem Division of Biostatistics, Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI, USA 5 1. Introduction Tooth decay

More information

Design and Analysis Plan Quantitative Synthesis of Federally-Funded Teen Pregnancy Prevention Programs HHS Contract #HHSP I 5/2/2016

Design and Analysis Plan Quantitative Synthesis of Federally-Funded Teen Pregnancy Prevention Programs HHS Contract #HHSP I 5/2/2016 Design and Analysis Plan Quantitative Synthesis of Federally-Funded Teen Pregnancy Prevention Programs HHS Contract #HHSP233201500069I 5/2/2016 Overview The goal of the meta-analysis is to assess the effects

More information

George B. Ploubidis. The role of sensitivity analysis in the estimation of causal pathways from observational data. Improving health worldwide

George B. Ploubidis. The role of sensitivity analysis in the estimation of causal pathways from observational data. Improving health worldwide George B. Ploubidis The role of sensitivity analysis in the estimation of causal pathways from observational data Improving health worldwide www.lshtm.ac.uk Outline Sensitivity analysis Causal Mediation

More information

11/24/2017. Do not imply a cause-and-effect relationship

11/24/2017. Do not imply a cause-and-effect relationship Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection

More information

IAPT: Regression. Regression analyses

IAPT: Regression. Regression analyses Regression analyses IAPT: Regression Regression is the rather strange name given to a set of methods for predicting one variable from another. The data shown in Table 1 and come from a student project

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

Multiple Approaches to Absenteeism Analysis

Multiple Approaches to Absenteeism Analysis Cornell University ILR School DigitalCommons@ILR CAHRS Working Paper Series Center for Advanced Human Resource Studies (CAHRS) 3-1-1996 Multiple Approaches to Absenteeism Analysis Michael C. Sturman Cornell

More information

How do we combine two treatment arm trials with multiple arms trials in IPD metaanalysis? An Illustration with College Drinking Interventions

How do we combine two treatment arm trials with multiple arms trials in IPD metaanalysis? An Illustration with College Drinking Interventions 1/29 How do we combine two treatment arm trials with multiple arms trials in IPD metaanalysis? An Illustration with College Drinking Interventions David Huh, PhD 1, Eun-Young Mun, PhD 2, & David C. Atkins,

More information

How to analyze correlated and longitudinal data?

How to analyze correlated and longitudinal data? How to analyze correlated and longitudinal data? Niloofar Ramezani, University of Northern Colorado, Greeley, Colorado ABSTRACT Longitudinal and correlated data are extensively used across disciplines

More information

Analysis of Rheumatoid Arthritis Data using Logistic Regression and Penalized Approach

Analysis of Rheumatoid Arthritis Data using Logistic Regression and Penalized Approach University of South Florida Scholar Commons Graduate Theses and Dissertations Graduate School November 2015 Analysis of Rheumatoid Arthritis Data using Logistic Regression and Penalized Approach Wei Chen

More information

The Scholarly Commons. Cornell University School of Hotel Administration

The Scholarly Commons. Cornell University School of Hotel Administration Cornell University School of Hotel Administration The Scholarly Commons Articles and Chapters School of Hotel Administration Collection 1999 Multiple Approaches to Analyzing Count Data in Studies of Individual

More information

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS)

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS) Chapter : Advanced Remedial Measures Weighted Least Squares (WLS) When the error variance appears nonconstant, a transformation (of Y and/or X) is a quick remedy. But it may not solve the problem, or it

More information

A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY

A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY Lingqi Tang 1, Thomas R. Belin 2, and Juwon Song 2 1 Center for Health Services Research,

More information

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES Correlational Research Correlational Designs Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are

More information

PLEASE SCROLL DOWN FOR ARTICLE

PLEASE SCROLL DOWN FOR ARTICLE This article was downloaded by: [CDL Journals Account] On: 27 January 2009 Access details: Access Details: [subscription number 785022368] Publisher Informa Healthcare Informa Ltd Registered in England

More information

Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm

Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm Journal of Social and Development Sciences Vol. 4, No. 4, pp. 93-97, Apr 203 (ISSN 222-52) Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm Henry De-Graft Acquah University

More information

Business Statistics Probability

Business Statistics Probability Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

Results. NeuRA Motor dysfunction April 2016

Results. NeuRA Motor dysfunction April 2016 Introduction Subtle deviations in various developmental trajectories during childhood and adolescence may foreshadow the later development of schizophrenia. Studies exploring these deviations (antecedents)

More information

Today: Binomial response variable with an explanatory variable on an ordinal (rank) scale.

Today: Binomial response variable with an explanatory variable on an ordinal (rank) scale. Model Based Statistics in Biology. Part V. The Generalized Linear Model. Single Explanatory Variable on an Ordinal Scale ReCap. Part I (Chapters 1,2,3,4), Part II (Ch 5, 6, 7) ReCap Part III (Ch 9, 10,

More information

The Effects of Maternal Alcohol Use and Smoking on Children s Mental Health: Evidence from the National Longitudinal Survey of Children and Youth

The Effects of Maternal Alcohol Use and Smoking on Children s Mental Health: Evidence from the National Longitudinal Survey of Children and Youth 1 The Effects of Maternal Alcohol Use and Smoking on Children s Mental Health: Evidence from the National Longitudinal Survey of Children and Youth Madeleine Benjamin, MA Policy Research, Economics and

More information

Meta-Analysis. Zifei Liu. Biological and Agricultural Engineering

Meta-Analysis. Zifei Liu. Biological and Agricultural Engineering Meta-Analysis Zifei Liu What is a meta-analysis; why perform a metaanalysis? How a meta-analysis work some basic concepts and principles Steps of Meta-analysis Cautions on meta-analysis 2 What is Meta-analysis

More information

Motivation Empirical models Data and methodology Results Discussion. University of York. University of York

Motivation Empirical models Data and methodology Results Discussion. University of York. University of York Healthcare Cost Regressions: Going Beyond the Mean to Estimate the Full Distribution A. M. Jones 1 J. Lomas 2 N. Rice 1,2 1 Department of Economics and Related Studies University of York 2 Centre for Health

More information

Models for HSV shedding must account for two levels of overdispersion

Models for HSV shedding must account for two levels of overdispersion UW Biostatistics Working Paper Series 1-20-2016 Models for HSV shedding must account for two levels of overdispersion Amalia Magaret University of Washington - Seattle Campus, amag@uw.edu Suggested Citation

More information

You must answer question 1.

You must answer question 1. Research Methods and Statistics Specialty Area Exam October 28, 2015 Part I: Statistics Committee: Richard Williams (Chair), Elizabeth McClintock, Sarah Mustillo You must answer question 1. 1. Suppose

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

Understandable Statistics

Understandable Statistics Understandable Statistics correlated to the Advanced Placement Program Course Description for Statistics Prepared for Alabama CC2 6/2003 2003 Understandable Statistics 2003 correlated to the Advanced Placement

More information

Score Tests of Normality in Bivariate Probit Models

Score Tests of Normality in Bivariate Probit Models Score Tests of Normality in Bivariate Probit Models Anthony Murphy Nuffield College, Oxford OX1 1NF, UK Abstract: A relatively simple and convenient score test of normality in the bivariate probit model

More information

Propensity Score Analysis Shenyang Guo, Ph.D.

Propensity Score Analysis Shenyang Guo, Ph.D. Propensity Score Analysis Shenyang Guo, Ph.D. Upcoming Seminar: April 7-8, 2017, Philadelphia, Pennsylvania Propensity Score Analysis 1. Overview 1.1 Observational studies and challenges 1.2 Why and when

More information

Ex-post Livestock Diseases, and Pastoralists' Averting Decisions in Tanzania

Ex-post Livestock Diseases, and Pastoralists' Averting Decisions in Tanzania Ex-post Livestock Diseases, and Pastoralists' Averting Decisions in Tanzania Mazbahul Ahamad Department of Agricultural Economics University of Nebraska-Lincoln mahamad@huskers.unl.edu mg.ahamad@gmail.com

More information

Regression Models for Count Data: Illustrations using Longitudinal Predictors of Childhood Injury*

Regression Models for Count Data: Illustrations using Longitudinal Predictors of Childhood Injury* Regression Models for Count Data: Illustrations using Longitudinal Predictors of Childhood Injury* Bryan T. Karazsia, MA and Manfred H. M. van Dulmen, PHD Kent State University Objective To offer a practical

More information

Results. NeuRA Worldwide incidence April 2016

Results. NeuRA Worldwide incidence April 2016 Introduction The incidence of schizophrenia refers to how many new cases there are per population in a specified time period. It is different from prevalence, which refers to how many existing cases there

More information

Problem solving therapy

Problem solving therapy Introduction People with severe mental illnesses such as schizophrenia may show impairments in problem-solving ability. Remediation interventions such as problem solving skills training can help people

More information

Application of Local Control Strategy in analyses of the effects of Radon on Lung Cancer Mortality for 2,881 US Counties

Application of Local Control Strategy in analyses of the effects of Radon on Lung Cancer Mortality for 2,881 US Counties Application of Local Control Strategy in analyses of the effects of Radon on Lung Cancer Mortality for 2,881 US Counties Bob Obenchain, Risk Benefit Statistics, August 2015 Our motivation for using a Cut-Point

More information

FACULTY OF SCIENCES Master of Statistics: Biostatistics

FACULTY OF SCIENCES Master of Statistics: Biostatistics FACULTY OF SCIENCES Master of Statistics: Biostatistics 2012 2013 Masterproef Sick leave and presenteeism in Ankylosing Spondylitis patients under treatment with Tumor Necrosis Factor (TNF) inhibitor Promotor

More information

Use of GEEs in STATA

Use of GEEs in STATA Use of GEEs in STATA 1. When generalised estimating equations are used and example 2. Stata commands and options for GEEs 3. Results from Stata (and SAS!) 4. Another use of GEEs Use of GEEs GEEs are one

More information

bivariate analysis: The statistical analysis of the relationship between two variables.

bivariate analysis: The statistical analysis of the relationship between two variables. bivariate analysis: The statistical analysis of the relationship between two variables. cell frequency: The number of cases in a cell of a cross-tabulation (contingency table). chi-square (χ 2 ) test for

More information

Using statistical models to assess medical cost of hepatitis C virus

Using statistical models to assess medical cost of hepatitis C virus 0Gastroenterology and Hepatology From Bed to Bench 2012 RIGLD, Research Institute for Gastroenterology and Liver Diseases ORIGINAL ARTICLE Using statistical models to assess medical cost of hepatitis C

More information

Index. Springer International Publishing Switzerland 2017 T.J. Cleophas, A.H. Zwinderman, Modern Meta-Analysis, DOI /

Index. Springer International Publishing Switzerland 2017 T.J. Cleophas, A.H. Zwinderman, Modern Meta-Analysis, DOI / Index A Adjusted Heterogeneity without Overdispersion, 63 Agenda-driven bias, 40 Agenda-Driven Meta-Analyses, 306 307 Alternative Methods for diagnostic meta-analyses, 133 Antihypertensive effect of potassium,

More information

Review and Wrap-up! ESP 178 Applied Research Methods Calvin Thigpen 3/14/17 Adapted from presentation by Prof. Susan Handy

Review and Wrap-up! ESP 178 Applied Research Methods Calvin Thigpen 3/14/17 Adapted from presentation by Prof. Susan Handy Review and Wrap-up! ESP 178 Applied Research Methods Calvin Thigpen 3/14/17 Adapted from presentation by Prof. Susan Handy Final Proposals Read instructions carefully! Check Canvas for our comments on

More information

WELCOME! Lecture 11 Thommy Perlinger

WELCOME! Lecture 11 Thommy Perlinger Quantitative Methods II WELCOME! Lecture 11 Thommy Perlinger Regression based on violated assumptions If any of the assumptions are violated, potential inaccuracies may be present in the estimated regression

More information

Development of OTU Analysis in NutriGen

Development of OTU Analysis in NutriGen Development of OTU Analysis in NutriGen Integrating OTU data with other NutriGen Data Mateen Shaikh and Joseph Beyene McMaster University December 19 2014 Mateen and Joseph (McMaster) Development of OTU

More information

Technical appendix Strengthening accountability through media in Bangladesh: final evaluation

Technical appendix Strengthening accountability through media in Bangladesh: final evaluation Technical appendix Strengthening accountability through media in Bangladesh: final evaluation July 2017 Research and Learning Contents Introduction... 3 1. Survey sampling methodology... 4 2. Regression

More information

List of Figures. List of Tables. Preface to the Second Edition. Preface to the First Edition

List of Figures. List of Tables. Preface to the Second Edition. Preface to the First Edition List of Figures List of Tables Preface to the Second Edition Preface to the First Edition xv xxv xxix xxxi 1 What Is R? 1 1.1 Introduction to R................................ 1 1.2 Downloading and Installing

More information

SPSS output for 420 midterm study

SPSS output for 420 midterm study Ψ Psy Midterm Part In lab (5 points total) Your professor decides that he wants to find out how much impact amount of study time has on the first midterm. He randomly assigns students to study for hours,

More information

investigate. educate. inform.

investigate. educate. inform. investigate. educate. inform. Research Design What drives your research design? The battle between Qualitative and Quantitative is over Think before you leap What SHOULD drive your research design. Advanced

More information

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n. University of Groningen Latent instrumental variables Ebbes, P. IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document

More information

Statistics as a Tool. A set of tools for collecting, organizing, presenting and analyzing numerical facts or observations.

Statistics as a Tool. A set of tools for collecting, organizing, presenting and analyzing numerical facts or observations. Statistics as a Tool A set of tools for collecting, organizing, presenting and analyzing numerical facts or observations. Descriptive Statistics Numerical facts or observations that are organized describe

More information

South Australian Research and Development Institute. Positive lot sampling for E. coli O157

South Australian Research and Development Institute. Positive lot sampling for E. coli O157 final report Project code: Prepared by: A.MFS.0158 Andreas Kiermeier Date submitted: June 2009 South Australian Research and Development Institute PUBLISHED BY Meat & Livestock Australia Limited Locked

More information

The University of North Carolina at Chapel Hill School of Social Work

The University of North Carolina at Chapel Hill School of Social Work The University of North Carolina at Chapel Hill School of Social Work SOWO 918: Applied Regression Analysis and Generalized Linear Models Spring Semester, 2014 Instructor Shenyang Guo, Ph.D., Room 524j,

More information

Data Analysis Using Regression and Multilevel/Hierarchical Models

Data Analysis Using Regression and Multilevel/Hierarchical Models Data Analysis Using Regression and Multilevel/Hierarchical Models ANDREW GELMAN Columbia University JENNIFER HILL Columbia University CAMBRIDGE UNIVERSITY PRESS Contents List of examples V a 9 e xv " Preface
