M15_BERE8380_12_SE_C15.6.qxd 2/21/11 8:21 PM Page Influence Analysis 1


15.6 Influence Analysis

In Sections 13.5 and 14.3, you used residual analysis to evaluate the regression assumptions. This section introduces several methods that measure the influence of individual observations:

• The hat matrix elements, $h_i$
• The Studentized deleted residuals, $t_i$
• Cook's distance statistic, $D_i$

Figure 15.16 presents the values of these statistics computed by Minitab for the OmniPower sales data.

FIGURE 15.16 Minitab worksheet containing computed values for the Studentized deleted residuals, the hat matrix elements, and Cook's distance statistics for the OmniPower sales data

The Hat Matrix Elements, $h_i$

In Section 13.8, $h_i$ was defined for the simple linear regression model when constructing the confidence interval estimate of the mean response. For multiple regression models, the equation for calculating the hat matrix diagonal elements, $h_i$, requires the use of matrix algebra and is beyond the scope of this text (see references 4, 5, and 7).

The hat matrix diagonal element for observation i, denoted $h_i$, reflects the possible influence of $X_i$ on the regression equation. If potentially influential observations are present, you may need to delete them from the model. In a regression model containing k independent variables, Hoaglin and Welsch (see reference 5) suggest the following decision rule:

If $h_i > 2(k + 1)/n$, then $X_i$ is an influential observation and is a candidate for removal from the model.

For the OmniPower sales data, because n = 34 and k = 2, you flag any $h_i$ value greater than 2(2 + 1)/34 = 0.1765. Referring to Figure 15.16, you see that none of the $h_i$ values are greater than 0.1765. Therefore, none of the observations are candidates for removal from the analysis.

The Studentized Deleted Residuals, $t_i$

Recall from Section 13.5 that a residual is the difference between the observed value of Y and the predicted value of Y [see Equation (13.14) on page 539]. Studentized residuals are the residuals divided by the standard error of the estimate $S_{YX}$ and adjusted for the distance from $\bar{X}$. The Studentized deleted residual, expressed as a t statistic in Equation (15.10), measures the difference of each $Y_i$ from the value predicted by a model that includes all observations except observation i.
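The text leaves the matrix computation of $h_i$ to the references. As a rough illustration only, the following Python sketch computes $h_i = x_i'(X'X)^{-1}x_i$ for each row $x_i$ of a design matrix (a column of ones followed by the k independent variables) and applies the 2(k + 1)/n rule. The data are randomly generated stand-ins, not the OmniPower worksheet, and the helper names `solve`, `hat_diagonal`, and `leverage_cutoff` are hypothetical names introduced for this example.

```python
import random

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def hat_diagonal(X):
    """Return h_i = x_i' (X'X)^-1 x_i for every row x_i of the design matrix X."""
    p = len(X[0])
    XtX = [[sum(row[a] * row[c] for row in X) for c in range(p)] for a in range(p)]
    return [sum(xj * zj for xj, zj in zip(x, solve(XtX, x))) for x in X]

def leverage_cutoff(n, k):
    """Cutoff 2(k + 1)/n from the decision rule in the text."""
    return 2.0 * (k + 1) / n

# Hypothetical data: n = 34 rows, k = 2 independent variables, plus an intercept column.
random.seed(0)
n, k = 34, 2
X = [[1.0] + [random.gauss(0.0, 1.0) for _ in range(k)] for _ in range(n)]

h = hat_diagonal(X)
cutoff = leverage_cutoff(n, k)  # 2(2 + 1)/34
flagged = [i + 1 for i, hi in enumerate(h) if hi > cutoff]  # 1-based observation numbers
print(f"cutoff = {cutoff:.4f}; flagged observations: {flagged}")
```

A quick sanity check on any such computation is that the $h_i$ values sum to k + 1, the number of estimated coefficients, so the cutoff 2(k + 1)/n is twice the average $h_i$.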
STUDENTIZED DELETED RESIDUAL

$$t_i = e_i \sqrt{\frac{n - k - 1}{SSE(1 - h_i) - e_i^2}} \qquad (15.10)$$

where
$e_i$ = residual for observation i
k = number of independent variables
SSE = error sum of squares of the regression model fitted
$h_i$ = hat matrix diagonal element for observation i

Hoaglin and Welsch (see reference 5) suggest that if $t_i > t_{\alpha/2}$ or $t_i < -t_{\alpha/2}$ (using a level of significance of 0.10), the observed and predicted values are so different that observation i is highly influential on the regression equation and is a candidate for removal.

For the OmniPower sales data, n = 34 and k = 2. Thus, you flag any $t_i$ whose absolute value is greater than the critical value of t (see Table E.3). In Figure 15.16, $t_{14}$, $t_{15}$, and $t_{20}$ are highlighted. Thus, the 14th, 15th, and 20th observations may each have an adverse effect on the model. These observations were not previously flagged according to the $h_i$ criterion. Since $h_i$ and $t_i$ measure different aspects of influence, neither criterion is sufficient by itself. When $h_i$ is small, $t_i$ may be large. When $h_i$ is large, $t_i$ may be moderate or small because the observed $Y_i$ is consistent with the rest of the data.

Cook's Distance Statistic, $D_i$

Cook's distance statistic, $D_i$, based on both $h_i$ and the Studentized residual, is a third criterion for identifying influential observations. To decide whether an observation flagged by either the $h_i$ or $t_i$ criterion is unduly affecting the model, Cook and Weisberg (see reference 4) developed Cook's $D_i$ statistic.

COOK'S $D_i$ STATISTIC

$$D_i = \frac{e_i^2}{k\,MSE}\left[\frac{h_i}{(1 - h_i)^2}\right] \qquad (15.11)$$

where
$e_i$ = residual for observation i
k = number of independent variables
MSE = mean square error of the regression model fitted
$h_i$ = hat matrix diagonal element for observation i

Cook and Weisberg suggest that if $D_i > F_{\alpha}$ (the critical value of the F distribution having k + 1 degrees of freedom in the numerator and n - k - 1 degrees of freedom in the denominator at a 0.50 level of significance), the observation is highly influential on the regression equation and is a candidate for removal. Table 15.4 shows critical values for Cook's $D_i$ statistic.

TABLE 15.4 Selected Critical Values of F for Cook's $D_i$ Statistic ($\alpha$ = 0.50; columns: numerator df = k + 1; rows: denominator df = n - k - 1) [table entries not reproduced]
Source: Extracted from E. S. Pearson and H. O. Hartley, eds., Biometrika Tables for Statisticians, 3rd ed., 1966, by permission of the Biometrika Trustees.

For the OmniPower sales data, since n = 34 and k = 2, there are 3 degrees of freedom in the numerator and 31 degrees of freedom in the denominator. Thus, any $D_i$ greater than the critical value $F_{\alpha}$ is flagged. Referring to Figure 15.16, you see that none of the $D_i$ values exceed 0.187, and therefore no observations are identified as influential using Cook's $D_i$ statistic.

Overview

This section discussed three criteria for evaluating the influence of each observation on the multiple regression model. The various statistics did not lead to a consistent set of conclusions. According to both the $h_i$ and the $D_i$ criteria, none of the observations is a candidate for removal. Under such circumstances, most statisticians would conclude that there is insufficient evidence for the removal of any observation from the analysis.

In addition to the three criteria presented here, there are other measures of influence (see references 1 and 6). Although different statisticians seem to prefer particular measures, currently there is no consensus as to the best measure.
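As a minimal sketch of Equation (15.11), the Python function below evaluates Cook's statistic exactly as the equation is printed in this section, given residuals, hat matrix diagonal elements, k, and MSE. The function name `cooks_distance` and all input values are hypothetical and chosen only for illustration; they are not taken from Figure 15.16.

```python
def cooks_distance(e, h, k, mse):
    """Equation (15.11) as printed: D_i = e_i^2 / (k * MSE) * h_i / (1 - h_i)^2.

    e   : residuals e_i
    h   : hat matrix diagonal elements h_i (each strictly between 0 and 1)
    k   : number of independent variables
    mse : mean square error of the fitted regression model
    """
    return [(ei ** 2 / (k * mse)) * (hi / (1.0 - hi) ** 2) for ei, hi in zip(e, h)]

# Hypothetical inputs for three observations (illustrative values only).
residuals = [1.5, -0.8, 2.4]
leverages = [0.08, 0.15, 0.12]
k = 2
mse = 1.9

D = cooks_distance(residuals, leverages, k, mse)
for i, d in enumerate(D, start=1):
    print(f"observation {i}: D = {d:.4f}")
```

Each resulting value would then be compared with the F critical value described above. Some references and software packages scale this statistic by the number of estimated coefficients, k + 1, rather than by k, so values from this sketch may not match software output exactly.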

Problems for Section 15.6

APPLYING THE CONCEPTS

• In Problem 14.4 on page 583, you used sales and number of orders to predict distribution costs at a mail-order catalog business (stored in Warecost). Perform an influence analysis on your results and determine whether any observations should be deleted from the analysis. If necessary, reanalyze the regression model after deleting these observations and compare your results.

• In Problem 14.5 on page 583, you used horsepower and weight to predict gasoline mileage (stored in Auto2010). Perform an influence analysis on your results and determine whether any observations should be deleted from the analysis. If necessary, reanalyze the regression model after deleting these observations and compare your results.

• In Problem 14.6 on page 583, you used the amount of radio advertising and newspaper advertising to predict sales (stored in Advertise). Perform an influence analysis on your results and determine whether any observations should be deleted from the analysis. If necessary, reanalyze the regression model after deleting these observations and compare your results.

• In Problem 14.7 on page 584, you used the total staff present and remote hours to predict standby hours (stored in Standby). Perform an influence analysis on your results and determine whether any observations should be deleted from the analysis. If necessary, reanalyze the regression model after deleting these observations and compare your results.

• In Problem 14.8 on page 584, you used the land area of the property and age in years to predict appraised value (stored in GlenCove). Perform an influence analysis on your results and determine whether any observations should be deleted from the analysis. If necessary, reanalyze the regression model after deleting these observations and compare your results.

REFERENCES

1. Andrews, D. F., and D. Pregibon, "Finding the Outliers That Matter," Journal of the Royal Statistical Society 40 (Ser. B, 1978).
2. Atkinson, A. C., "Robust and Diagnostic Regression Analysis," Communications in Statistics 11 (1982).
3. Belsley, D. A., E. Kuh, and R. Welsch, Regression Diagnostics: Identifying Influential Data and Sources of Collinearity (New York: Wiley, 1980).
4. Cook, R. D., and S. Weisberg, Residuals and Influence in Regression (New York: Chapman and Hall, 1982).
5. Hoaglin, D. C., and R. Welsch, "The Hat Matrix in Regression and ANOVA," The American Statistician 32 (1978).
6. Hocking, R. R., "Developments in Linear Regression Methodology," Technometrics 25 (1983).
7. Kutner, M., C. Nachtsheim, J. Neter, and W. Li, Applied Linear Statistical Models, 5th ed. (New York: McGraw-Hill/Irwin, 2005).
8. Minitab Release 16 (State College, PA: Minitab, Inc., 2010).

EG15.6 EXCEL GUIDE FOR INFLUENCE ANALYSIS

There are no Excel Guide instructions for this section.

MG15.6 MINITAB GUIDE FOR INFLUENCE ANALYSIS

Use Regression to perform influence analysis. Use the Interpreting the Regression Coefficients instructions in Section MG14.1 (repeated below), replacing step 19 of those instructions with steps 19 through 22 listed below. For example, to perform the Figure 15.16 analysis of the OmniPower sales data, open to the OmniPower worksheet. Select Stat → Regression → Regression.

In the Regression dialog box:
1. Double-click C1 Sales in the variables list to add Sales to the Response box.
2. Double-click C2 Price in the variables list to add Price to the Predictors box.
3. Double-click C3 Promotion in the variables list to add Promotion to the Predictors box.
4. Click Graphs.

In the Regression - Graphs dialog box:
5. Click Regular and Individual Plots.
6. Check Histogram of residuals and clear all the other check boxes.
7. Click anywhere inside the Residuals versus the variables box.
8. Double-click C2 Price in the variables list to add Price in the Residuals versus the variables box.
9. Double-click C3 Promotion in the variables list to add Promotion in the Residuals versus the variables box.
10. Click OK.
11. Back in the Regression dialog box, click Results.

In the Regression - Results dialog box:
12. Click In addition, the full table of fits and residuals and then click OK.
13. Back in the Regression dialog box, click Options.

In the Regression - Options dialog box:
14. Check Fit Intercept.
15. Clear all the Display and Lack of Fit Test check boxes.
16. Enter 79 and 400 in the Prediction intervals for new observations box.
17. Enter 95 in the Confidence level box.
18. Click OK.
19. Back in the Regression dialog box, click Storage.

In the Regression - Storage dialog box:
20. Check Deleted t residuals, Hi (leverages), and Cook's distance.
21. Click OK.
22. Back in the Regression dialog box, click OK.


More information

Class 7 Everything is Related

Class 7 Everything is Related Class 7 Everything is Related Correlational Designs l 1 Topics Types of Correlational Designs Understanding Correlation Reporting Correlational Statistics Quantitative Designs l 2 Types of Correlational

More information

Chapter 11 Multiple Regression

Chapter 11 Multiple Regression Chapter 11 Multiple Regression PSY 295 Oswald Outline The problem An example Compensatory and Noncompensatory Models More examples Multiple correlation Chapter 11 Multiple Regression 2 Cont. Outline--cont.

More information

Applied Linear Regression

Applied Linear Regression Applied Linear Regression Applied Linear Regression Third Edition SANFORD WEISBERG University of Minnesota School of Statistics Minneapolis, Minnesota A JOHN WILEY & SONS, INC., PUBLICATION Copyright
