Graphical Exploration of Statistical Interactions. Nick Jackson University of Southern California Department of Psychology 10/25/2013



2 Overview
- What is an Interaction?
- 2-Way Interactions
  - Categorical X Categorical
  - Continuous X Categorical
  - Continuous X Continuous
- 3-Way Interactions
  - Categorical X Continuous X Continuous
  - Continuous X Continuous X Continuous
  - Time in a Three-Way Interaction
- 4-Way and beyond

3 What is an Interaction?
Equivalent statements:
- The relationship between X and Y depends on the level of a third variable Z.
- Z modifies the effect of X on Y.
- X and Y's relationship is different at differing levels of Z.
Also called Moderation or Effect Modification.
Moderation is a stupid term.
- Moderation (n): The avoidance of excess or extremes.
- Moderate (v): To make or become less extreme or intense.
Those are kinda the opposite of what we mean when we say moderation in a statistical sense.
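The equivalent statements above can be written as one regression equation. This is a sketch under the usual linear-model assumption; the symbols \beta_0 to \beta_3 and \varepsilon are notation introduced here, not taken from the slide.

```latex
% Two-way interaction model (notation assumed for illustration)
Y = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 (X \times Z) + \varepsilon

% The effect of X on Y is no longer a single number: it depends on Z
\frac{\partial E[Y]}{\partial X} = \beta_1 + \beta_3 Z
```

When \beta_3 = 0 the effect of X is the same at every level of Z, which is the no-interaction case.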

4 What is an Interaction?
As SEM diagrams:
[Figure: two path diagrams, one with nodes X, Y, and Z, and one with nodes X, Z, X*Z, and Y.]

5 What is an Interaction?
[Figure: two panels of Y against X. Panel titles: "Z Modifies the effect of X on Y" (separate lines for Z=1 and Z=0) and "Effect of X on Y if we ignore Z".]

6 Types of Interaction
[Figure: two panels of Y at X=0 and X=1, shown for Z=0 and Z=1. Panel titles: "Quantitative Interaction Only" and "Qualitative Interaction". Annotation: X*Z, p<0.05.]
Quantitative Interaction: The difference between X(0) and X(1) is significantly different between Z(0) and Z(1), though these differences are not qualitatively different (visually they look about the same). This occurs as a result of substantial power.
Qualitative Interaction: The difference between X(0) and X(1) may or may not be significantly different between Z(0) and Z(1); however, these differences are qualitatively different (i.e., it really does look like an interaction).

7 Graphing the Interaction
Why graph? Interpreting the interaction coefficient(s) is not always intuitive.
Two ways to graph:
1) Look at observed means/values
- Represents your actual data
- Very easy to do in any package
- Does not represent the statistical model being used
2) Look at marginal (predicted) means/values from the regression equation
- A direct representation of the statistical model you are using
- For interactions with continuous variables, it allows you to see where the interaction is occurring

8 Graphing the Interaction
More about marginal (predicted) means/values from the regression equation.
The general idea:
- Take the regression equation and predict values for the different levels of your variables X and Z.
- For any covariates, use their mean levels.
An example:
Blood_Press = β0 + β1*Diabetes + β2*Gender + β3*(Diabetes x Gender)
Blood_Press = 75 + 20.5*Diabetes + 15*Gender + 10.5*(Diabetes x Gender)
Find the predicted means:
- Diabetes=1, Gender=1: 75 + 20.5(1) + 15(1) + 10.5(1*1) = 121
- Diabetes=0, Gender=1: 75 + 20.5(0) + 15(1) + 10.5(0*1) = 90
- Diabetes=1, Gender=0: 75 + 20.5(1) + 15(0) + 10.5(1*0) = 95.5
- Diabetes=0, Gender=0: 75 + 20.5(0) + 15(0) + 10.5(0*0) = 75
Can get standard errors of the predictions, though that is a bit more difficult.
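The arithmetic on this slide can be sketched in a few lines of Python. The coefficient values below are back-calculated from the four predicted means shown (75, 90, 95.5, 121), so treat them as illustrative; the function and variable names are made up for this example.

```python
# Coefficients implied by the four predicted means on the slide
# (75, 90, 95.5, 121). Illustrative values, not a fitted model.
B0, B_DIABETES, B_GENDER, B_INTERACTION = 75.0, 20.5, 15.0, 10.5


def predicted_bp(diabetes: int, gender: int) -> float:
    """Predicted blood pressure for one combination of the two 0/1 indicators."""
    return (B0
            + B_DIABETES * diabetes
            + B_GENDER * gender
            + B_INTERACTION * diabetes * gender)


# One predicted mean per cell of the 2 x 2 design.
cell_means = {(d, g): predicted_bp(d, g) for d in (0, 1) for g in (0, 1)}

for (d, g), bp in sorted(cell_means.items()):
    print(f"Diabetes={d}, Gender={g}: {bp}")

# The interaction coefficient is a difference in differences:
# (diabetes effect when Gender=1) minus (diabetes effect when Gender=0).
diff_in_diff = ((cell_means[(1, 1)] - cell_means[(0, 1)])
                - (cell_means[(1, 0)] - cell_means[(0, 0)]))
print("Difference in differences:", diff_in_diff)
```

Plotting the four values in `cell_means` as bars or connected points gives the marginal-means graph for a 2 X 2 interaction.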

9 Graphing the Interaction (Marginal Estimates)
Available in most software packages:
- Stata: margins and marginsplot commands
- R: lsmeans and effects packages; predict and predict.lm functions
- Some good ways to look at interactions in R.
- SAS: Least-Squares Means (LSMEANS), Slicing, Contrasts, Estimate
- SPSS: GLM (emmeans), estimated marginal means

10 Two-Way Interactions
Categorical X Categorical Interaction
- Use bar graphs.
- 2 X 2: Below are equivalent representations of the same interaction, so which is it?
[Figure: two bar graphs of Blood Pressure for Asian and White, Male and Female.]
Among Whites, females have a higher blood pressure than males. Among Asians, females have a lower blood pressure than males.
Among males, Asians have a higher blood pressure than Whites. Among females, Asians have a lower blood pressure than Whites.

11 Two-Way Interactions
Continuous X Categorical Interaction
- Could make the continuous variable categorical and use a bar graph.
- Better idea: use scatter plots/linear predictions for each category.
[Figure: "Adjusted Predictions of gender with 95% CIs". Blood Press against body mass index (k/m-sq), with separate lines for male and female.]
We can see that as BMI increases, blood pressure increases more sharply in men than in women.
By looking at the confidence intervals we can start to get an idea of where the genders diverge (statistically) in their effects.

12 Two-Way Interactions
Continuous X Categorical Interaction
Look at how the slope of Gender (the difference between men and women) changes across varying levels of BMI. We can use the 95% CI to see when these differences become significant.
[Figure: "Conditional Marginal Effects of 2.gender with 95% CIs". Difference in Blood Press against body mass index (k/m-sq).]
The differences in mean blood pressure between men and women become more pronounced at higher BMIs, such that women have a lower BP than men as BMI increases. These differences are statistically significant (95% CI of the difference does not include 0) past a BMI of around 35.
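A minimal sketch of what such a conditional-effects plot computes at each BMI value. Every number below (coefficients, variances, covariance) is hypothetical and chosen only for illustration, not taken from the slides, and the interval uses a normal approximation.

```python
import math

# Hypothetical estimates (NOT from the slides) for the model
#   BP = b0 + b1*BMI + b2*female + b3*(BMI*female)
B_FEMALE = 8.0          # b2: female minus male difference at BMI = 0
B_BMI_X_FEMALE = -0.40  # b3: change in that difference per unit of BMI
VAR_B2 = 16.0           # Var(b2)
VAR_B3 = 0.02           # Var(b3)
COV_B2_B3 = -0.55       # Cov(b2, b3)
Z_95 = 1.96             # normal approximation for a 95% interval


def gender_difference(bmi: float):
    """Return (estimate, lower, upper) for the female-male difference at one BMI."""
    estimate = B_FEMALE + B_BMI_X_FEMALE * bmi
    # Variance of a linear combination b2 + bmi*b3.
    variance = VAR_B2 + bmi ** 2 * VAR_B3 + 2 * bmi * COV_B2_B3
    se = math.sqrt(variance)
    return estimate, estimate - Z_95 * se, estimate + Z_95 * se


for bmi in range(20, 45, 5):
    est, lower, upper = gender_difference(bmi)
    excludes_zero = lower > 0 or upper < 0
    print(f"BMI={bmi}: diff={est:.1f}, 95% CI=({lower:.1f}, {upper:.1f}), "
          f"excludes 0: {excludes_zero}")
```

The BMI values where the interval excludes 0 are the region where the plot would show a statistically significant difference between the groups.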

13 Two-Way Interactions
Continuous X Categorical Interaction, with a categorical variable of more than two groups

14 Two-Way Interactions
Continuous X Categorical Interaction, with a categorical variable of more than two groups
- Same as before, just plotting the differences relative to the reference group.
- Works the same with non-linear continuous variables.
[Figure: "95% Confidence Intervals of the Difference in BMI between Sleep Duration Groups (Referenced to 7-8 Hours) across Age". Panels: <=4 Hours of Sleep, 5-6 Hours of Sleep, >=9 Hours of Sleep. Each panel plots Difference in BMI against Age.]

15 Two-Way Interactions
Continuous X Continuous Interaction
Traditional method:
- Discretize one of the continuous variables, making it categorical, and do the usual procedures for categorical X continuous interactions.
- Usually +1 and -1 SD (this method sucks).
- Can miss where the interaction occurs.
Newer method: predict values at percentiles of the continuous variables.
- Generally avoid the extremes of the percentiles (<5 or >95), as the variability is greater at the extremes.
Newer method: use 3-D graphing (surface/mesh plots).
- Same idea as predicting values at the percentiles, but utilizing 3D modeling software.
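The percentile approach can be sketched as follows. The data are simulated and the coefficients are hypothetical stand-ins (nothing here comes from the slides' dataset); the point is only the mechanics: pick percentiles of one variable, then predict across a grid of the other.

```python
import random
import statistics

random.seed(0)
# Simulated cholesterol values standing in for a real variable (assumption).
cholesterol = [random.gauss(5.0, 1.0) for _ in range(500)]

# Hypothetical coefficients for
#   BP = b0 + b1*BMI + b2*Chol + b3*(BMI*Chol)
B0, B_BMI, B_CHOL, B_BMI_X_CHOL = 120.0, -1.0, -6.0, 0.3

PERCENTILES = (5, 10, 25, 50, 75, 90, 95)
cuts = statistics.quantiles(cholesterol, n=100)  # 99 cut points: 1st..99th
chol_at = {p: cuts[p - 1] for p in PERCENTILES}


def predicted_bp(bmi: float, chol: float) -> float:
    """Predicted blood pressure from the hypothetical interaction model."""
    return B0 + B_BMI * bmi + B_CHOL * chol + B_BMI_X_CHOL * bmi * chol


def bmi_slope(chol: float) -> float:
    """Slope of BP on BMI at a fixed cholesterol value."""
    return B_BMI + B_BMI_X_CHOL * chol


bmi_grid = range(20, 45, 5)
# One predicted line (list of BP values over the BMI grid) per percentile.
lines = {p: [predicted_bp(b, c) for b in bmi_grid] for p, c in chol_at.items()}

for p in PERCENTILES:
    print(f"{p}th percentile of cholesterol: BMI slope = {bmi_slope(chol_at[p]):.2f}")
```

Each entry of `lines` is one line on the interaction plot; fanning lines indicate that the BMI slope changes with cholesterol.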

16 Two-Way Interactions
Continuous X Continuous Interaction: Predicted values at percentiles
[Figure: "Effect Modification of bp_sys1_bl vs bmi by cholest_bl". bp_sys1_bl against bmi, with lines at the 1%, 5, 10, 25, 50, 75, 90, 95%, and 99% percentiles of cholest_bl. bmi*cholest_bl Interaction p=]

17 Two-Way Interactions
Continuous X Continuous Interaction: Which way we graph it is fairly arbitrary
[Figure: two panels. "Effect Modification of bp_sys1_bl vs bmi by cholest_bl": bp_sys1_bl against bmi, with lines at the 1%, 5, 10, 25, 50, 75, 90, 95%, and 99% percentiles of cholest_bl; bmi*cholest_bl Interaction p=. "Effect Modification of bp_sys1_bl vs cholest_bl by bmi": bp_sys1_bl against cholest_bl, with lines at the same percentiles of bmi; cholest_bl*bmi Interaction p=]
Graphed against BMI: We can see that the nature of the relationship changes at around a BMI of 30. We could say that BMI has a positive association with blood pressure, and that this relationship is strongest among those with high cholesterol. Those with low cholesterol do not see a relationship of BMI with blood pressure.
Graphed against cholesterol: We can see that the nature of the relationship changes at around a cholesterol value of 3.5. We could say that cholesterol has a positive association with blood pressure, and that this relationship is strongest among those with high BMI. Those with low BMI have a negative or no relationship of cholesterol with blood pressure.

18 Two-Way Interactions
Continuous X Continuous Interaction: Another way to interpret: 4-Corners Method
- Low Chol, Low BMI = 133
- Low Chol, High BMI = 125
- High Chol, Low BMI = 130
- High Chol, High BMI = 155
The combination of being obese (BMI >30) and having high cholesterol results in high BP.
[Figure: "Effect Modification of bp_sys1_bl vs bmi by cholest_bl". bp_sys1_bl against bmi, with lines at the 1%, 5, 10, 25, 50, 75, 90, 95%, and 99% percentiles of cholest_bl. bmi*cholest_bl Interaction p=]
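The four corner values on this slide are enough to read off the interaction numerically. A small sketch using only those four numbers (the dictionary layout and variable names are illustrative):

```python
# Predicted systolic BP at the four corners, keyed by (cholesterol, BMI).
corners = {
    ("low", "low"): 133,
    ("low", "high"): 125,
    ("high", "low"): 130,
    ("high", "high"): 155,
}

# Change in BP going from low to high BMI, within each cholesterol level.
bmi_effect_low_chol = corners[("low", "high")] - corners[("low", "low")]
bmi_effect_high_chol = corners[("high", "high")] - corners[("high", "low")]

# How much the BMI effect itself differs between cholesterol levels.
interaction_contrast = bmi_effect_high_chol - bmi_effect_low_chol

print("BMI effect at low cholesterol:", bmi_effect_low_chol)
print("BMI effect at high cholesterol:", bmi_effect_high_chol)
print("Difference between the two BMI effects:", interaction_contrast)
```

Opposite signs for the two BMI effects are what make the lines on the plot cross or fan out rather than run parallel.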

19 Two-Way Interactions
Continuous X Continuous Interaction: 3D Mesh Plots (MATLAB, SigmaPlot, R)
Same data as before, same interpretation. Use 4-Corners.
[Figure: two 3D surface plots of Blood Pressure over BMI and Cholesterol: "Observed Data" and "Marginal Estimates Data".]
Why we generally don't use observed data: not smooth.

20 Two-Way Interactions
Continuous X Continuous Interaction: Useful for non-linear continuous interactions (Response Surface Model)

21 Three-Way Interactions
Now things get complicated. Variables W*X*Z are used to predict Y.
- The interaction of X*Z is different at differing levels of W,
- or X*W is different at differing levels of Z,
- or Z*W is different at differing levels of X,
- or the relationship of X and Y is different according to the levels of W and Z, etc.
Substantially easier when one of X, W, or Z is categorical.
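Written out as an equation, the equivalent readings above all come from the same model. This is a sketch assuming a linear model with all lower-order terms included; the \beta numbering is notation introduced here.

```latex
% Three-way interaction model (coefficient numbering assumed for illustration)
E[Y] = \beta_0 + \beta_1 X + \beta_2 W + \beta_3 Z
     + \beta_4 XW + \beta_5 XZ + \beta_6 WZ + \beta_7 XWZ

% Effect of X on Y depends on both W and Z
\frac{\partial E[Y]}{\partial X} = \beta_1 + \beta_4 W + \beta_5 Z + \beta_7 WZ

% The X*Z interaction itself changes with W
\frac{\partial^2 E[Y]}{\partial X \, \partial Z} = \beta_5 + \beta_7 W
```

If \beta_7 = 0, the X*Z interaction is the same at every level of W, so there is no three-way interaction.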

22 Three-Way Interactions
- Substantially easier when one of X, W, or Z is categorical, so we pick a small range of values for one of the variables and predict over them, treating it as semi-discrete (quartiles?).
- Often Time is the third variable: we are interested in whether the interaction of X*Z changes over Time (W).

23 Three-Way Interactions
Categorical X Continuous X Continuous Interaction: Sleep Medication (Y/N) * BMI * Pulse
Stratify on the categorical variable (Sleep Meds).
[Figure: "Predictive Margins". Panels: med_sleep=0 and med_sleep=1. Labels: Apnea Index, body mass index (k/m-sq), Pulse.]
The interaction of BMI and Pulse exists only for those on sleep medications.

24 Three-Way Interactions
Another way to look at this is how the difference in Apnea between those on sleep medications and those not on them changes depending on pulse and BMI.
[Figure: "Conditional Marginal Effects of 1.med_sleep". Labels: Apnea Index, body mass index (k/m-sq), Pulse.]

25 Three-Way Interactions
Continuous X Continuous X Continuous Interaction: Glucose Level * BMI * Pulse
Stratify on Glucose.
Asks the question: How does the interaction of Pulse and BMI change across levels of glucose?
[Figure: "Adjusted Predictions". Panels: glucose_bl=5, glucose_bl=6, glucose_bl=7, glucose_bl= . Labels: Apnea Index, body mass index (k/m-sq), Pulse.]

26 Three-Way Interactions
Continuous X Continuous X Continuous Interaction: Glucose Level * BMI * Pulse
Look at how the slopes of Glucose on Apnea change.
Asks the question: How does the relationship of Glucose to Apnea change across levels of BMI and pulse?
[Figure: "Average Marginal Effects of glucose_bl". Labels: Apnea Index, body mass index (k/m-sq), Pulse.]

27 Three-Way Interactions
What if we have time as our third variable? Same techniques, but perhaps in the future we won't be limited to just static graphs.
[Figure: "Interaction of BMI and Pulse on Apnea Score across Time".]

28 Presenting Data in Motion
Even better, let's do some of this: als_new_insights_on_poverty.html

29 Four-Way Interactions and Beyond
- Understanding anything much more complex than a 3-way interaction is difficult without a good way to break down variables into categories.
- Classification Techniques/Machine Learning/Exploratory Data Mining: can take high-dimensional data and find homogeneous groups based upon relationships of continuous/categorical variables.

30 Four-Way Interactions and Beyond
CART Model: 4-Way Interaction of continuous variables on Apnea Severity
[Figure: CART tree with splits on Lateral Walls, Soft Palate, Genioglossus, and Mandibular Width; branches labeled Smaller Structure and Larger Structure.]

31 Take Home Points
- Test for interactions at the beginning of model building
  - 'Cause they are interesting
  - 'Cause they obscure your main effects
- Interactions give us clues about underlying etiology (David Schwartz). It is not enough to detect them; we have to understand why the interaction exists. We must search for the variable(s) that make interactions go away (mediated moderation).
- Modern classification/data mining methods are great at detecting high-dimensional (numerous variables) nonlinear interactions.
- Stata versions 12 and 13 are amazing at doing these types of plots (margin plots).
- Also, check out "Interpreting and Visualizing Regression Models Using Stata" by Michael Mitchell.


More information

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions Readings: OpenStax Textbook - Chapters 1 5 (online) Appendix D & E (online) Plous - Chapters 1, 5, 6, 13 (online) Introductory comments Describe how familiarity with statistical methods can - be associated

More information

BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT REGRESSION MODELS

BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT REGRESSION MODELS BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT REGRESSION MODELS 17 December 2009 Michael Wood University of Portsmouth Business School SBS Department, Richmond Building Portland Street, Portsmouth

More information

Before we get started:

Before we get started: Before we get started: http://arievaluation.org/projects-3/ AEA 2018 R-Commander 1 Antonio Olmos Kai Schramm Priyalathta Govindasamy Antonio.Olmos@du.edu AntonioOlmos@aumhc.org AEA 2018 R-Commander 2 Plan

More information

Supplementary Appendix

Supplementary Appendix Supplementary Appendix This appendix has been provided by the authors to give readers additional information about their work. Supplement to: Bjerregaard LG, Jensen BW, Ängquist L, Osler M, Sørensen TIA,

More information

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F Plous Chapters 17 & 18 Chapter 17: Social Influences Chapter 18: Group Judgments and Decisions

More information

Biostatistics II

Biostatistics II Biostatistics II 514-5509 Course Description: Modern multivariable statistical analysis based on the concept of generalized linear models. Includes linear, logistic, and Poisson regression, survival analysis,

More information

Modeling unobserved heterogeneity in Stata

Modeling unobserved heterogeneity in Stata Modeling unobserved heterogeneity in Stata Rafal Raciborski StataCorp LLC November 27, 2017 Rafal Raciborski (StataCorp) Modeling unobserved heterogeneity November 27, 2017 1 / 59 Plan of the talk Concepts

More information

9 research designs likely for PSYC 2100

9 research designs likely for PSYC 2100 9 research designs likely for PSYC 2100 1) 1 factor, 2 levels, 1 group (one group gets both treatment levels) related samples t-test (compare means of 2 levels only) 2) 1 factor, 2 levels, 2 groups (one

More information

3.2 Least- Squares Regression

3.2 Least- Squares Regression 3.2 Least- Squares Regression Linear (straight- line) relationships between two quantitative variables are pretty common and easy to understand. Correlation measures the direction and strength of these

More information

Conditional Distributions and the Bivariate Normal Distribution. James H. Steiger

Conditional Distributions and the Bivariate Normal Distribution. James H. Steiger Conditional Distributions and the Bivariate Normal Distribution James H. Steiger Overview In this module, we have several goals: Introduce several technical terms Bivariate frequency distribution Marginal

More information

Correlation and regression

Correlation and regression PG Dip in High Intensity Psychological Interventions Correlation and regression Martin Bland Professor of Health Statistics University of York http://martinbland.co.uk/ Correlation Example: Muscle strength

More information

Presenting Survey Data and Results"

Presenting Survey Data and Results Presenting Survey Data and Results" Professor Ron Fricker" Naval Postgraduate School" Monterey, California" Reading Assignment:" 2/15/13 None" 1 Goals for this Lecture" Discuss a bit about how to display

More information

Linear Regression in SAS

Linear Regression in SAS 1 Suppose we wish to examine factors that predict patient s hemoglobin levels. Simulated data for six patients is used throughout this tutorial. data hgb_data; input id age race $ bmi hgb; cards; 21 25

More information

C-1: Variables which are measured on a continuous scale are described in terms of three key characteristics central tendency, variability, and shape.

C-1: Variables which are measured on a continuous scale are described in terms of three key characteristics central tendency, variability, and shape. MODULE 02: DESCRIBING DT SECTION C: KEY POINTS C-1: Variables which are measured on a continuous scale are described in terms of three key characteristics central tendency, variability, and shape. C-2:

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 10, 11) Please note chapter

More information

7 Statistical Issues that Researchers Shouldn t Worry (So Much) About

7 Statistical Issues that Researchers Shouldn t Worry (So Much) About 7 Statistical Issues that Researchers Shouldn t Worry (So Much) About By Karen Grace-Martin Founder & President About the Author Karen Grace-Martin is the founder and president of The Analysis Factor.

More information

Biostatistics for Med Students. Lecture 1

Biostatistics for Med Students. Lecture 1 Biostatistics for Med Students Lecture 1 John J. Chen, Ph.D. Professor & Director of Biostatistics Core UH JABSOM JABSOM MD7 February 14, 2018 Lecture note: http://biostat.jabsom.hawaii.edu/education/training.html

More information

Identifying Susceptibility in Epidemiology Studies: Implications for Risk Assessment. Joel Schwartz Harvard TH Chan School of Public Health

Identifying Susceptibility in Epidemiology Studies: Implications for Risk Assessment. Joel Schwartz Harvard TH Chan School of Public Health Identifying Susceptibility in Epidemiology Studies: Implications for Risk Assessment Joel Schwartz Harvard TH Chan School of Public Health Risk Assessment and Susceptibility Typically we do risk assessments

More information

Frequency distributions

Frequency distributions Applied Biostatistics distributions Martin Bland Professor of Health Statistics University of York http://www-users.york.ac.uk/~mb55/ Types of data Qualitative data arise when individuals may fall into

More information

7. Bivariate Graphing

7. Bivariate Graphing 1 7. Bivariate Graphing Video Link: https://www.youtube.com/watch?v=shzvkwwyguk&index=7&list=pl2fqhgedk7yyl1w9tgio8w pyftdumgc_j Section 7.1: Converting a Quantitative Explanatory Variable to Categorical

More information

Chapter 9: Answers. Tests of Between-Subjects Effects. Dependent Variable: Time Spent Stalking After Therapy (hours per week)

Chapter 9: Answers. Tests of Between-Subjects Effects. Dependent Variable: Time Spent Stalking After Therapy (hours per week) Task 1 Chapter 9: Answers Stalking is a very disruptive and upsetting (for the person being stalked) experience in which someone (the stalker) constantly harasses or obsesses about another person. It can

More information

STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS

STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS Circle the best answer. This scenario applies to Questions 1 and 2: A study was done to compare the lung capacity of coal miners to the lung

More information

Data, frequencies, and distributions. Martin Bland. Types of data. Types of data. Clinical Biostatistics

Data, frequencies, and distributions. Martin Bland. Types of data. Types of data. Clinical Biostatistics Clinical Biostatistics Data, frequencies, and distributions Martin Bland Professor of Health Statistics University of York http://martinbland.co.uk/ Types of data Qualitative data arise when individuals

More information

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data TECHNICAL REPORT Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data CONTENTS Executive Summary...1 Introduction...2 Overview of Data Analysis Concepts...2

More information

bivariate analysis: The statistical analysis of the relationship between two variables.

bivariate analysis: The statistical analysis of the relationship between two variables. bivariate analysis: The statistical analysis of the relationship between two variables. cell frequency: The number of cases in a cell of a cross-tabulation (contingency table). chi-square (χ 2 ) test for

More information

Selected Topics in Biostatistics Seminar Series. Missing Data. Sponsored by: Center For Clinical Investigation and Cleveland CTSC

Selected Topics in Biostatistics Seminar Series. Missing Data. Sponsored by: Center For Clinical Investigation and Cleveland CTSC Selected Topics in Biostatistics Seminar Series Missing Data Sponsored by: Center For Clinical Investigation and Cleveland CTSC Brian Schmotzer, MS Biostatistician, CCI Statistical Sciences Core brian.schmotzer@case.edu

More information

The Strucplot Framework for Visualizing Categorical Data

The Strucplot Framework for Visualizing Categorical Data The Strucplot Framework for Visualizing Categorical Data David Meyer 1, Achim Zeileis 2 and Kurt Hornik 2 1 Department of Information Systems and Operations 2 Department of Statistics and Mathematics Wirtschaftsuniversität

More information

Lesson 9: Two Factor ANOVAS

Lesson 9: Two Factor ANOVAS Published on Agron 513 (https://courses.agron.iastate.edu/agron513) Home > Lesson 9 Lesson 9: Two Factor ANOVAS Developed by: Ron Mowers, Marin Harbur, and Ken Moore Completion Time: 1 week Introduction

More information

These results are supplied for informational purposes only.

These results are supplied for informational purposes only. These results are supplied for informational purposes only. Prescribing decisions should be made based on the approved package insert in the country of prescription Sponsor/company: sanofi-aventis ClinialTrials.gov

More information

The Effectiveness of Captopril

The Effectiveness of Captopril Lab 7 The Effectiveness of Captopril In the United States, pharmaceutical manufacturers go through a very rigorous process in order to get their drugs approved for sale. This process is designed to determine

More information

STP226 Brief Class Notes Instructor: Ela Jackiewicz

STP226 Brief Class Notes Instructor: Ela Jackiewicz CHAPTER 2 Organizing Data Statistics=science of analyzing data. Information collected (data) is gathered in terms of variables (characteristics of a subject that can be assigned a numerical value or nonnumerical

More information

Still important ideas

Still important ideas Readings: OpenStax - Chapters 1 11 + 13 & Appendix D & E (online) Plous - Chapters 2, 3, and 4 Chapter 2: Cognitive Dissonance, Chapter 3: Memory and Hindsight Bias, Chapter 4: Context Dependence Still

More information

Following is a list of topics in this paper:

Following is a list of topics in this paper: Preliminary NTS Data Analysis Overview In this paper A preliminary investigation of some data around NTS performance has been started. This document reviews the results to date. Following is a list of

More information

Biostatistics 2 - Correlation and Risk

Biostatistics 2 - Correlation and Risk BROUGHT TO YOU BY Biostatistics 2 - Correlation and Risk Developed by Pfizer January 2018 This learning module is intended for UK healthcare professionals only. PP-GEP-GBR-0957 Date of preparation Jan

More information

Standard Deviation and Standard Error Tutorial. This is significantly important. Get your AP Equations and Formulas sheet

Standard Deviation and Standard Error Tutorial. This is significantly important. Get your AP Equations and Formulas sheet Standard Deviation and Standard Error Tutorial This is significantly important. Get your AP Equations and Formulas sheet The Basics Let s start with a review of the basics of statistics. Mean: What most

More information