WELCOME! Lecture 11 Thommy Perlinger

1 Quantitative Methods II WELCOME! Lecture 11 Thommy Perlinger

2 Regression based on violated assumptions If any of the assumptions are violated, the estimated regression model may be inaccurate, and confidence in its interpretations and predictions decreases. The result may, for example, include: inappropriate tests of the significance of coefficients (showing significance when it is not present, or vice versa), and biased and inaccurate predictions of the dependent variable. So make sure to analyze your residuals and partial regression plots!

3 Assessing statistical assumptions Rules of thumb Testing assumptions must be done not only for each dependent and explanatory variable, but for the variate as well. Graphical analyses (residual plots, partial regression plots, Normal probability plots) are the most widely used methods of assessing assumptions for the variate. Remedies for problems found in the variate must be accomplished by modifying the dependent variable and/or one or more explanatory variables.

4 A decision process for multiple regression analysis (from Stage 2) Stage 4: Select an estimation technique. Specify the regression model yourself (specification by the researcher), or utilize a procedure that selects variables to optimize prediction (selection by procedure: forward/backward/stepwise estimation, all-possible-subsets)? Does the regression variate meet the assumptions of regression analysis? If no, go to Stage 2.

5 A decision process for multiple regression analysis Stage 1: Select objective(s): prediction or explanation. Select variables. Stage 2 (also entered from Stage 4): Research design: sample size, power, generalizability. Additional variables: Transformations? Dummy variables? Curvilinear relationships? Interaction terms? Stage 4: Select an estimation technique.

6 A decision process for multiple regression analysis (from Stage 2) Stage 4: Select an estimation technique. Specify the regression model yourself (specification by the researcher), or utilize a procedure that selects variables to optimize prediction (selection by procedure: forward/backward/stepwise estimation, all-possible-subsets)? Does the regression variate meet the assumptions of regression analysis? If no, go to Stage 2. If yes, examine statistical and practical significance: coefficient of determination $R^2$, adjusted coefficient of determination, standard error of the estimate $SE_E$, significance of regression coefficients.

7 Estimating the statistical significance of our model All samples are affected by random variation. Since we take only one sample and base our predictive model on that, we need to test the hypothesis that our regression model can represent the population and not just our sample. This can be done in two ways: 1) Testing the coefficient of determination $R^2$ (the variance explained) 2) Testing each regression coefficient

9 Example: Happiness The World Database of Happiness is an online registry of scientific research on the subjective appreciation of life. The average happiness score is presented for various nations. This average is based on individual responses from numerous general population surveys to a general life satisfaction (well-being) question.

10 Example: Happiness Variables:
Dependent variable: Happiness (0=dissatisfied to 10=satisfied)
Independent (explanatory) variables:
GINI index (degree of inequality in the distribution of income, higher score = greater inequality)
Degree of corruption in government (higher score = less corruption)
Average life expectancy
Degree of democracy (higher score = more political liberties)

11 Example: Happiness
ANOVA (dependent variable: Happiness; predictors: (Constant), Life expectancy, GINI, Democracy, Corruption)
Model 1       Sum of Squares   df   Mean Square   F        Sig.
Regression    89.…             4    22.411        57.335   .000
Residual      26.189           67   .391
Total         115.…            71
Sig. is the p-value for the test of overall significance of the model. This model has overall significance, i.e. $R^2$ is significantly larger than zero (at least one of the explanatory variables significantly affects happiness).

12 F value The value of the F statistic shows the amount of variation explained by the model compared to how much is explained by using the simple mean of the dependent variable Y. Happiness example: F = 57.3 tells us that, considering the sample used for estimation, we can explain 57.3 times more variation of the happiness variable using the explanatory variables GINI, degree of corruption, degree of democracy, and life expectancy, than when using the simple average of Y=happiness.
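The arithmetic behind the F value can be sketched as follows. This is a minimal illustration, not SPSS's implementation: the function names are made up for this example, the first call uses invented sums of squares, and the second call backs out $R^2$ from the reported F = 57.335 with 4 explanatory variables and 67 residual degrees of freedom.

```python
def f_statistic(ss_regression, df_regression, ss_residual, df_residual):
    """Overall F: explained variation per df divided by unexplained variation per df."""
    mean_square_regression = ss_regression / df_regression
    mean_square_residual = ss_residual / df_residual
    return mean_square_regression / mean_square_residual


def r_squared_from_f(f_value, df_regression, df_residual):
    """Back out R^2 from F, using F = (R^2 / k) / ((1 - R^2) / df_residual)."""
    explained = f_value * df_regression
    return explained / (explained + df_residual)


# Invented sums of squares, only to show the arithmetic.
print(f_statistic(80.0, 4, 20.0, 10))  # 10.0

# F and degrees of freedom as reported for the happiness model.
print(round(r_squared_from_f(57.335, 4, 67), 3))  # 0.774
```

Under these assumptions the happiness model explains roughly 77% of the variation in the sample.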

13 Estimating the statistical significance of our model If we are comparing different models, we use the adjusted $R^2$ as a measure of how the additional explanatory variable(s) influence(s) the predictive accuracy of the model. We also examine the standard error of the estimate, $SE_E$. The lower the $SE_E$, the better the predictive accuracy of the model.
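Both measures can be computed from quantities in the ANOVA table. The sketch below uses the textbook formulas; the function names are illustrative, and n = 72 is inferred here from 67 residual degrees of freedom and 4 explanatory variables rather than stated on the slides.

```python
import math


def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Penalize R^2 for the number of explanatory variables in the model."""
    df_total = n_observations - 1
    df_residual = n_observations - n_predictors - 1
    return 1.0 - (1.0 - r_squared) * df_total / df_residual


def standard_error_of_estimate(ss_residual, n_observations, n_predictors):
    """Square root of the residual mean square."""
    return math.sqrt(ss_residual / (n_observations - n_predictors - 1))


# Residual sum of squares 26.189 from the happiness ANOVA table;
# n = 72 is an inferred value (67 residual df + 4 predictors + 1).
print(round(standard_error_of_estimate(26.189, 72, 4), 3))  # 0.625
```

Adding a variable always raises $R^2$ (or leaves it unchanged), but it raises the adjusted $R^2$ only if the gain outweighs the lost degree of freedom.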

14 Estimating the statistical significance of our model All samples are affected by random variation. Since we take only one sample and base our predictive model on that, we need to test the hypothesis that our regression model can represent the population and not just our sample. This can be done in two ways: 1) Testing the coefficient of determination $R^2$ (the variance explained) 2) Testing each regression coefficient

15 Significance tests of regression coefficients The other way of testing the hypothesis that our regression model can represent the population and not just our sample is to test the significance of each regression coefficient. We already know how to test if the estimated regression coefficients are significantly different from zero.
$H_0: \beta_i = 0$ (The variable $X_i$ has no linear effect on Y)
$H_a: \beta_i \neq 0$ (The variable $X_i$ has a linear effect on Y)

16 Example: Happiness
Coefficients (dependent variable: Happiness)
Model 1            B (unstandardized)   Std. Error   Beta (standardized)   t        Sig.
(Constant)         -2.720               .866                               -3.141   .003
GINI               .037                 .009         .255                  3.916    .000
Corruption         .186                 .050         .363                  3.680    .000
Democracy          .039                 .066         .052                  .598     .552
Life expectancy    .090                 .011         .639                  8.080    .000
The coefficients are significant for all explanatory variables but degree of democracy.
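Each t value in the coefficients table is the coefficient divided by its standard error. A small sketch (the function name is made up for this example) recomputes the ratios from the rounded table values, so the results differ slightly from the SPSS output, which uses unrounded numbers.

```python
def t_ratio(coefficient, standard_error):
    """Test statistic for H0: beta_i = 0."""
    return coefficient / standard_error


# (B, Std. Error) pairs as printed in the happiness coefficients table.
coefficients = {
    "GINI": (0.037, 0.009),
    "Corruption": (0.186, 0.050),
    "Democracy": (0.039, 0.066),
    "Life expectancy": (0.090, 0.011),
}

for name, (b, se) in coefficients.items():
    # |t| well above 2 points to a significant coefficient at the 5% level.
    print(name, round(t_ratio(b, se), 2))
```

Only Democracy gives a ratio far below 2, which matches its p-value of .552.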

17 Confidence interval for the regression coefficient A point estimate of a regression coefficient says nothing about its precision. To get an estimate that is generalizable to the population, confidence intervals can be created for the regression coefficients. A confidence interval also shows the impact of the sample size on the result: smaller samples lead to wider confidence intervals, and larger samples to narrower ones.

18 Example: Happiness
Coefficients (dependent variable: Happiness), with the 95.0% confidence interval for each regression coefficient β
Model 1            B        Std. Error   Beta    t        Sig.    Lower Bound   Upper Bound
(Constant)         -2.720   .866                 -3.141   .003    -4.449        -.991
GINI               .037     .009         .255    3.916    .000    .018          .056
Corruption         .186     .050         .363    3.680    .000    .085          .286
Democracy          .039     .066         .052    .598     .552    -.092         .170
Life expectancy    .090     .011         .639    8.080    .000    .068          .113
SPSS: Analyze >> Regression >> Linear. Click Statistics, mark Confidence intervals.
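The bounds in the table follow the usual pattern, coefficient plus or minus a critical t value times the standard error. The sketch below assumes a two-sided 95% critical value of about 1.996 for 67 residual degrees of freedom (an approximate table value, not taken from the slides) and uses the rounded B and Std. Error for Democracy, so the result only approximately matches SPSS.

```python
def coefficient_interval(coefficient, standard_error, t_critical):
    """Confidence interval: b +/- t_critical * SE(b)."""
    margin = t_critical * standard_error
    return coefficient - margin, coefficient + margin


# Assumed approximate two-sided 95% t value for 67 residual degrees of freedom.
T_CRITICAL_95 = 1.996

lower, upper = coefficient_interval(0.039, 0.066, T_CRITICAL_95)
print(round(lower, 3), round(upper, 3))

# An interval that contains zero corresponds to a non-significant coefficient.
print(lower < 0.0 < upper)
```

The interval for Democracy covers zero, which is the interval-based counterpart of its non-significant t test.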

19 A decision process for multiple regression analysis (from Stage 2) Stage 4: Select an estimation technique. Specify the regression model yourself (specification by the researcher), or utilize a procedure that selects variables to optimize prediction (selection by procedure: forward/backward/stepwise estimation, all-possible-subsets)? Does the regression variate meet the assumptions of regression analysis? If no, go to Stage 2. If yes, examine statistical and practical significance: coefficient of determination $R^2$, adjusted coefficient of determination, standard error of the estimate $SE_E$, significance of regression coefficients. Identify influential observations. Deletion required?

20 Identifying influential observations Influential observations are observations that have a disproportionate effect on the regression results. The effect on the regression results can either be good, which means that the results are strengthened, or bad, which means that the results are being substantially changed. Any influential observations must be identified to assess their impact.

21 Identifying influential observations There are three basic types of influential observations: 1) Outliers = observations with large residual values. They can be identified only with respect to a specific regression model. 2) Leverage points = observations distinct from the remaining observations based on their explanatory variable values. 3) Other influential observations, that have a disproportionate effect on the regression results.

22 Identifying influential observations Procedures for identifying influential observations are becoming quite widespread, yet they are still not very well known and not frequently used in regression analysis. A good way to identify residual outliers is to look for standardized residuals exceeding 2.0 in absolute value (more than 2 standard deviations from the mean of the residuals). SPSS can calculate Leverage, a measure of how far an observation deviates from the mean of that variable. During this course, we'll focus on identifying outliers among the residuals.
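The rule of thumb for residual outliers can be sketched as below. This is a simplification: the residuals are standardized with their own sample mean and standard deviation, whereas SPSS divides by the standard error of the estimate, and the residual values in the example are invented.

```python
import statistics


def standardized_residuals(residuals):
    """Express each residual in standard deviations from the mean residual."""
    center = statistics.mean(residuals)
    spread = statistics.stdev(residuals)
    return [(r - center) / spread for r in residuals]


def flag_outliers(residuals, threshold=2.0):
    """Return the positions of residuals more than `threshold` SDs from the mean."""
    z_scores = standardized_residuals(residuals)
    return [i for i, z in enumerate(z_scores) if abs(z) > threshold]


# Invented residuals; the ninth observation (index 8) stands out.
example_residuals = [0.1, -0.2, 0.3, -0.1, 0.2, -0.3, 0.1, -0.1, 2.5, -0.2]
print(flag_outliers(example_residuals))  # [8]
```

A flagged observation is only a candidate for closer inspection; whether to keep or delete it is a separate decision.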

23 Identifying influential observations

24 Keeping or deleting influential observations Whether an influential observation should be deleted or kept depends on the type of observation: An error in observations or data entry should be deleted if the data cannot be corrected. A valid but exceptional observation that is explainable by an extraordinary situation should be deleted, unless variables reflecting the extraordinary situation are included in the model.

25 Keeping or deleting influential observations For an exceptional observation with no likely explanation, there is no reason for deleting the case, but also no justification for keeping it. Perform analyses with and without the observation to make a complete assessment. An observation that is ordinary in its individual characteristics but exceptional in its combination of characteristics should be kept.

26 Statistical significance and influential observations Rules of thumb Always ensure practical significance when using large samples, because the model results and regression coefficients could be deemed irrelevant even when statistically significant due just to the statistical power arising from large sample sizes. Use the adjusted $R^2$ as your measure of overall model predictive accuracy when comparing models.

27 Statistical significance and influential observations Rules of thumb, cont'd Statistical significance is required for a relationship to have validity, but statistical significance without theoretical support does not support validity. Although outliers may be easily identifiable, the other forms of influential observations, which require more specialized diagnostic methods, can have an equal or even greater impact on the results.

28 A decision process for multiple regression analysis (from Stage 2) Stage 4: Select an estimation technique. Specify the regression model yourself (specification by the researcher), or utilize a procedure that selects variables to optimize prediction (selection by procedure: forward/backward/stepwise estimation, all-possible-subsets)? Does the regression variate meet the assumptions of regression analysis? If no, go to Stage 2. If yes, examine statistical and practical significance: coefficient of determination $R^2$, adjusted coefficient of determination, standard error of the estimate $SE_E$, significance of regression coefficients. Identify influential observations. Deletion required? If yes, delete influential observations. If no, go to Stage 5.

29 A decision process for multiple regression analysis (from Stage 4) Stage 5: Interpret the regression variate. Evaluate the prediction equation. Evaluate the relative importance of the explanatory variables. Assess multicollinearity.

30 A decision process for multiple regression analysis Stage 5: interpreting the regression variate During this stage it is time to evaluate the estimated regression coefficients for their explanation of the dependent variable.

31 Using the regression coefficients The estimated regression coefficients (the b coefficients) represent both the type of relationship (positive or negative) and the strength of the relationship between explanatory and dependent variables (the value of b). The regression coefficients have two important functions in meeting the objectives of prediction and explanation for any regression analysis.

32 Prediction The estimated regression equation can be used to calculate estimated/predicted values for the dependent variable, based on certain values for the explanatory variable(s). When a regression equation is used for prediction with a set of observations that were not used in the estimation process, it is called forecasting.

33 Confidence intervals for predicted values To express the uncertainty in a predicted/estimated value of Y based on the regression equation, you can create a confidence interval for the prediction.

34 Confidence intervals around estimated/predicted mean values Form intervals around the estimated/predicted value $\hat{y}$ to express uncertainty about the value of y for a given $x_j$. [Figure: fitted line $\hat{y} = b_0 + b_1 x_j$ with the confidence interval for the estimated value of y, given $x_j$; axes X and Y.]

35 Example: Happiness A simple regression model is estimated aiming to explain happiness, with life expectancy as the single explanatory variable. Y = Happiness (0-10) X = Life expectancy (years)

36 Example: Happiness
Coefficients (dependent variable: Happiness)
Model 1            B (unstandardized)   Std. Error   Beta (standardized)   t        Sig.
(Constant)         -1.960               .724                               -2.709   .008
Life expectancy    .114                 .010         .807                  11.446   .000
The regression equation can be used to estimate the average happiness for nations with a life expectancy of e.g. 50 years: $\widehat{\text{Happiness}} = -1.960 + 0.114 \cdot 50 = 3.74$
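The point estimate is just the estimated equation evaluated at the chosen life expectancy. A minimal sketch, using the intercept and slope from the coefficients table (the function and constant names are made up for this example):

```python
# Estimates from the simple regression of happiness on life expectancy.
INTERCEPT = -1.960
SLOPE = 0.114


def predict_happiness(life_expectancy_years):
    """Evaluate the estimated regression equation at a given life expectancy."""
    return INTERCEPT + SLOPE * life_expectancy_years


print(round(predict_happiness(50), 2))  # 3.74
```

Each additional year of life expectancy raises the predicted happiness score by 0.114 points in this model.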

37 Example: Happiness [SPSS output: estimated/predicted mean value of Happiness for X = 49.94 years, with the lower and upper limits of the mean confidence interval.] SPSS: Analyze >> Regression >> Linear. Click Save, mark Mean under Prediction intervals.

38 Example: Happiness The estimated/predicted average value of happiness for a life expectancy of approx. 50 years (49.94) is 3.7. With 95% confidence, the interval 3.3 to 4.2 covers the true average value of happiness score in the population of nations with a life expectancy of approx. 50 years.

39 Confidence intervals around forecasts When using the regression equation to make forecasts, i.e. predictions of Y for a new set of data, you can create a confidence interval for the forecast. Forecasts are affected not only by the sampling variation in the original sample, but also by that of the newly drawn sample. Confidence intervals around forecasts therefore include both the uncertainty in the estimated regression line and the error associated with future observations, which makes them wider than the confidence intervals for estimated/predicted mean values.

40 Confidence intervals around individual forecasted (new observed) values Form intervals around the forecasted value $\hat{y}$ to express uncertainty about the value of y for a given $x_j$. [Figure: fitted line $\hat{y} = b_0 + b_1 x_j$ with the confidence interval for the estimated value of y, given $x_j$, and the wider forecasting interval for a new observed y, given $x_j$; axes X and Y.]
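The difference between the two intervals can be sketched for simple regression with the standard textbook formulas. Everything in this example is an assumption made for illustration: the function names, the five invented data points, and the critical value 3.182 (approximately the two-sided 95% t value for 3 degrees of freedom).

```python
import math


def fit_simple_regression(x, y):
    """Least-squares fit; returns intercept, slope, residual SD, mean of x and Sxx."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    residual_sd = math.sqrt(ss_residual / (n - 2))
    return intercept, slope, residual_sd, x_mean, sxx


def intervals_at(x0, x, y, t_critical):
    """Point estimate, interval for the mean of y, and interval for a new y at x0."""
    intercept, slope, residual_sd, x_mean, sxx = fit_simple_regression(x, y)
    y_hat = intercept + slope * x0
    distance_term = 1.0 / len(x) + (x0 - x_mean) ** 2 / sxx
    mean_margin = t_critical * residual_sd * math.sqrt(distance_term)
    # The extra "1 +" adds the error of a single new observation.
    forecast_margin = t_critical * residual_sd * math.sqrt(1.0 + distance_term)
    return (
        y_hat,
        (y_hat - mean_margin, y_hat + mean_margin),
        (y_hat - forecast_margin, y_hat + forecast_margin),
    )


# Invented data; 3.182 is roughly the 95% two-sided t value for n - 2 = 3 df.
x_values = [1, 2, 3, 4, 5]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]
estimate, mean_interval, forecast_interval = intervals_at(3, x_values, y_values, 3.182)
print(round(estimate, 2), mean_interval, forecast_interval)
```

Both intervals share the same center; the forecast interval is always the wider of the two, and both widen as $x_j$ moves away from the mean of X.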

41 Example: Happiness We can use the regression equation estimated on the data from The World Database of Happiness to predict (forecast) the happiness score for a nation not included in the database. This nation happens to have a life expectancy of 50 years.

42 Example: Happiness
Coefficients (dependent variable: Happiness)
Model 1            B (unstandardized)   Std. Error   Beta (standardized)   t        Sig.
(Constant)         -1.960               .724                               -2.709   .008
Life expectancy    .114                 .010         .807                  11.446   .000
The same regression equation can be used to forecast the average happiness for a nation with a life expectancy of e.g. 50 years: $\widehat{\text{Happiness}} = -1.960 + 0.114 \cdot 50 = 3.74$

43 Example: Happiness [SPSS output: estimated value of Happiness for X = 49.94 years, with the lower and upper limits of the individual (forecast) confidence interval.] SPSS: Analyze >> Regression >> Linear. Click Save, mark Individual under Prediction intervals.

44 Example: Happiness The forecasted value of happiness for a nation with a life expectancy of approx. 50 years (49.94) is 3.7, the same as the estimated/predicted mean value. With 95% confidence, the interval 2.0 to 5.2 covers the happiness score of an individual nation with a life expectancy of approx. 50 years. This interval is wider than the interval 3.3 to 4.2 for the mean.

45 Recap: Using the regression coefficients The estimated regression coefficients (the b coefficients) represent both the type of relationship (positive or negative) and the strength of the relationship between explanatory and dependent variables (the value of b). The regression coefficients have two important functions in meeting the objectives of prediction and explanation for any regression analysis.

46 Explanation The nature and impact of each explanatory variable in making prediction of the dependent variable is often of great interest. An explanation of the relationship between explanatory and dependent variables is gained by examining the relative contributions of each variable. The regression coefficients are indicators of the relative impact and importance of the explanatory variables in the relationship with the dependent variable.

47 Explanation In order to use the regression coefficients for explanation purposes, first ensure that all of the explanatory variables are on comparable scales. Example: If you want to investigate the effect of household income on the number of cars in the household, you might include different individuals' incomes as explanatory variables. Then make sure that all individuals' incomes are measured in the same way, e.g. all in SEK (and not one person's income in a different unit).

48 Explanation Even when all of the explanatory variables are on comparable scales, differences in variability from variable to variable can affect the size of the regression coefficient. To make all explanatory variables comparable in both scale and variability, you can use a modified regression coefficient called the beta coefficient.

49 The beta coefficient The regression coefficients can be standardized, meaning that they are converted to a common scale and variability. When using the standardized beta coefficients you don't have to deal with different units of measurement: they directly reflect the relative impact on the dependent variable of a change of one standard deviation in each variable. Multiple regression provides both the regression coefficients and the standardized beta coefficients.
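The standardization can be sketched with the usual relation between the two kinds of coefficient, beta = b times the standard deviation of X divided by the standard deviation of Y. The function name and the data below are invented for illustration.

```python
import statistics


def beta_coefficient(b, x_values, y_values):
    """Standardized coefficient: change in Y (in SDs) per one-SD change in X."""
    return b * statistics.stdev(x_values) / statistics.stdev(y_values)


# Invented data where y = 2x exactly, so the unstandardized slope is 2.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
print(round(beta_coefficient(2.0, x, y), 6))  # 1.0

# The same X measured in units 1000 times smaller: b shrinks, beta does not change.
x_rescaled = [xi * 1000 for xi in x]
print(round(beta_coefficient(0.002, x_rescaled, y), 6))  # 1.0
```

The second call shows why beta coefficients are unaffected by the unit of measurement: rescaling X changes b and the standard deviation of X by offsetting factors.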

50 Example: Cheddar cheese As cheddar cheese matures, a variety of chemical processes take place. The taste of matured cheese is related to the concentration of several chemicals in the final product. In a study of cheddar cheese from LaTrobe Valley of Victoria, Australia, samples of cheese were analyzed for their chemical composition and were subjected to taste tests (n = 30).

51 Example: Cheddar cheese Variables:
Dependent variable: Taste score (obtained by combining the scores from several tasters)
Independent (explanatory) variables: concentrations of the following chemicals:
Acetic acid
Hydrogen sulfide
Lactic acid

52 Example: Cheddar cheese The coefficient for Lactic acid is largest, but the standard error is also largest for that variable. Standardized beta coefficients can be used to assess the relative impact of the variables.

53 Cautions when using beta coefficients Beta coefficients should be used as a guide to the relative importance of individual explanatory variables only when collinearity is minimal. Collinearity can distort the contributions of any explanatory variable. The beta values can be interpreted only in the context of the other variables in the equation. A beta value for e.g. Hydrogen sulfide reflects its importance only in relation to Lactic acid and Acetic acid, not in any absolute sense.

54 A decision process for multiple regression analysis (from Stage 4) Stage 5: Interpret the regression variate. Evaluate the prediction equation. Evaluate the relative importance of the explanatory variables. Assess multicollinearity.

57 Assessing multicollinearity Correlation among the explanatory variables may cause problems when interpreting the regression results. Some degree of multicollinearity is however often unavoidable. You need to: Assess the degree of multicollinearity Determine its impact on the results Apply the necessary remedies if needed

58 Multicollinearity affects standard errors Standard errors of the coefficients for the correlated independent/explanatory variables will increase compared to the case of no or a low degree of multicollinearity.

59 Identifying multicollinearity The simplest and most obvious means of identifying collinearity is an examination of the correlation matrix for the explanatory variables. The presence of strong correlations (generally $r \geq 0.90$) is the first indication of substantial collinearity. The absence of strong correlations however does not ensure an absence of collinearity. Collinearity may be due to the combined effect of two or more explanatory variables (multicollinearity).

60 Tolerance A direct measure of multicollinearity is tolerance, the amount of variability of an explanatory variable that is not explained by the other explanatory variables. E.g., if the other explanatory variables explain 25% of the variation of the explanatory variable $X_1$, then the tolerance value of $X_1$ is 75%. A high tolerance value means a small degree of multicollinearity.

61 Example: Cheddar cheese Tolerance values of 50-54%. Is this good enough? [SPSS output: the Tolerance column is the tolerance measure of multicollinearity.] SPSS: Analyze >> Regression >> Linear. Click Statistics, mark Collinearity diagnostics.

62 Variance inflation factor (VIF) A second measure of multicollinearity is the variance inflation factor (VIF), which is simply the inverse of the tolerance value. Higher degrees of multicollinearity are reflected in lower tolerance values and higher VIF values. The square root of the VIF is the degree to which the standard error has been increased due to multicollinearity.

63 Example: Cheddar cheese $\sqrt{\mathrm{VIF}} \approx 1.4$ for Lactic acid. This means that the standard error for Lactic acid has increased 1.4 times due to multicollinearity. [SPSS output: the VIF column is the variance inflation factor measure of multicollinearity.]

64 How much multicollinearity is too much? Small tolerance values (and thus large VIF values) denote high collinearity. A common cutoff threshold is a tolerance value of 0.10, which corresponds to a VIF value of 10. With a VIF value of 10, this tolerance corresponds to standard errors being tripled ($\sqrt{10} \approx 3.16$). Most recommended thresholds still allow for substantial collinearity. You may wish to be more restrictive, especially with small sample sizes.
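The relations between tolerance, VIF and the inflation of standard errors can be sketched in a few lines. The function names are made up for this example; the inputs are the 25% example from the tolerance slide and the 0.10 cutoff.

```python
import math


def tolerance(r_squared_from_other_predictors):
    """Share of a predictor's variation NOT explained by the other predictors."""
    return 1.0 - r_squared_from_other_predictors


def variance_inflation_factor(tolerance_value):
    """VIF is the inverse of the tolerance."""
    return 1.0 / tolerance_value


def standard_error_inflation(tolerance_value):
    """Factor by which the standard error grows: the square root of the VIF."""
    return math.sqrt(variance_inflation_factor(tolerance_value))


print(tolerance(0.25))                            # 0.75
print(variance_inflation_factor(0.10))            # 10.0
print(round(standard_error_inflation(0.10), 2))   # 3.16
```

A tolerance of 0.10 therefore means that 90% of the variable's variation is already captured by the other explanatory variables.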

65 How much multicollinearity is too much? Some suggested guidelines: Bivariate correlations of even 0.70 can impact both the explanation and estimation of the regression results. Even weaker correlations can have an impact if the correlation between explanatory variables is greater than either explanatory variable's correlation with the dependent variable. The suggested cutoff for the tolerance value is 0.10. When values at this level are encountered, multicollinearity problems are almost certain.

66 Example: Cheddar cheese
Correlations (Pearson)
                    Taste score   Acetic acid   Hydrogen sulfide   Lactic acid
Taste score         1             .550**        .756**             .704**
Acetic acid         .550**        1             .618**             .604**
Hydrogen sulfide    .756**        .618**        1                  .645**
Lactic acid         .704**        .604**        .645**             1
Sig. (2-tailed): .002 for Taste score and Acetic acid, .000 for all other pairs.
**. Correlation is significant at the 0.01 level (2-tailed).
Correlations around 0.6 between the explanatory variables, and with the dependent variable.

67 Example: Cheddar cheese All tolerance values are far above 0.10. Following the guidelines, we don't have a multicollinearity problem with regard to tolerance values.

68 Remedies for multicollinearity Once the degree of multicollinearity has been determined, you have a number of options: 1) Omit one or more highly correlated explanatory variables and identify other variables to help the prediction (if possible). 2) Use the model with the highly correlated explanatory variables for prediction only (don't interpret the regression coefficients), but be aware of the lowered level of overall predictive ability.

69 Remedies for multicollinearity 3) Use the simple correlations between each explanatory variable and the dependent variable to understand the different relationships. 4) Use a more sophisticated method of analysis to obtain a model that more clearly reflects the simple effects of the explanatory variables (not included in this course).

70 Interpreting the regression variate Rules of thumb Interpret the impact of each explanatory variable relative to the other variables in the model, because model respecification can have a profound effect on the remaining variables o Use standardized beta coefficients when comparing relative importance among explanatory variables

71 Interpreting the regression variate Rules of thumb, cont'd Multicollinearity is generally viewed as harmful because increases in multicollinearity:
o reduce the overall $R^2$ that can be achieved
o confound estimation of the regression coefficients
o negatively affect the statistical significance tests of regression coefficients

72 Interpreting the regression variate Rules of thumb, cont d Generally accepted levels of multicollinearity (tolerance values up to 0.10, corresponding to a VIF of 10) almost always indicate problems with multicollinearity, but these problems may also be seen at much lower levels of collinearity and multicollinearity: o Bivariate correlations of 0.70 or higher may result in problems, and even lower correlations may be problematic if they are higher than the correlations between the explanatory and dependent variables

73 Interpreting the regression variate Rules of thumb, cont d o Values much lower than the suggested thresholds (VIF values of even 3 to 5) may result in interpretation or estimation problems, particularly when the relationships with the dependent measure are weaker

74 A decision process for multiple regression analysis (from Stage 4) Stage 5: Interpret the regression variate. Evaluate the prediction equation. Evaluate the relative importance of the explanatory variables. Assess multicollinearity. Stage 6: Validate the results. Split-sample analysis. PRESS statistic.

75 A decision process for multiple regression analysis Stage 6: validation of the results The final stage is to ensure that the regression model represents the general population (generalizability) and is appropriate for the situations in which it will be used (transferability). The best guideline is the extent to which the model matches an existing theoretical model or a set of previously validated results on the same topic. If prior results or theory are not available, empirical validation approaches can be used.

76 Additional or split samples The most appropriate empirical validation is to test the regression model on an additional sample drawn from the population. This can be done in several ways: The original model can be used to predict values in the new sample and the predicted values are compared to the actual values of the dependent variable in that sample. A separate model can be estimated with the new sample and then compared with the original equation.

77 Additional or split samples When you don't have the possibility to draw a new sample, you can split the existing sample into two parts: one part for creating the regression model and a second part for validating the equation. This is appropriate only when you have large samples! No matter if you use additional or split samples, you will often find differences between the original model and the validation efforts. Your role is then to look for the best model across all samples. No regression model, unless estimated from the entire population, is the final and absolute model.
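A split-sample validation can be sketched as follows for a simple regression. The function names, the 50/50 split, the fixed random seed and the invented data are all assumptions made for this illustration; in practice the model would be estimated in SPSS on one part and checked against the other.

```python
import math
import random


def split_sample(observations, estimation_fraction=0.5, seed=0):
    """Randomly divide the observations into an estimation part and a validation part."""
    shuffled = list(observations)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * estimation_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_line(pairs):
    """Least-squares intercept and slope from (x, y) pairs."""
    n = len(pairs)
    x_mean = sum(x for x, _ in pairs) / n
    y_mean = sum(y for _, y in pairs) / n
    sxx = sum((x - x_mean) ** 2 for x, _ in pairs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in pairs)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def validation_rmse(model, pairs):
    """Root mean squared prediction error of the model on held-out pairs."""
    intercept, slope = model
    squared_errors = [(y - (intercept + slope * x)) ** 2 for x, y in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Invented data with a little alternating noise around y = 3x - 2.
data = [(x, 3 * x - 2 + (0.5 if x % 2 else -0.5)) for x in range(20)]
estimation_part, validation_part = split_sample(data)
model = fit_line(estimation_part)
print(model, validation_rmse(model, validation_part))
```

A validation error that is much larger than the error in the estimation part would suggest that the model does not generalize well.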

78 The PRESS statistic An alternative approach to obtaining additional samples for validation purposes is to calculate the PRESS statistic, a measure of the predictive accuracy of the estimated regression model (similar to $R^2$). The PRESS statistic (Predicted Residual Sum of Squares) is based on the model being fitted repeatedly, leaving out one observation each time. In each repetition the model is used to predict the observation that was left out. The PRESS statistic won't be used during this course.
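The leave-one-out idea behind PRESS can be sketched for a simple regression as below. The function names and the five data points are invented for this example; the code refits the line once per observation, which is the direct but not the most efficient way to obtain the statistic.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for simple regression."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def residual_sum_of_squares(x, y):
    """Ordinary residual sum of squares from the fit on all observations."""
    intercept, slope = fit_line(x, y)
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def press_statistic(x, y):
    """Sum of squared errors when each observation is predicted without itself."""
    total = 0.0
    for i in range(len(x)):
        # Refit the model with observation i left out.
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (intercept + slope * x[i])) ** 2
    return total


# Invented data for illustration.
x_values = [1, 2, 3, 4, 5]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]
print(residual_sum_of_squares(x_values, y_values), press_statistic(x_values, y_values))
```

PRESS is never smaller than the ordinary residual sum of squares, because each observation is predicted by a model that has not seen it.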

79 Forecasting with the model Once you have your final validated model, you might want to use it to make predictions or forecasts. When forecasting, i.e. applying the estimated model to a new data set to calculate predicted values for the dependent variable, there are several factors that can have a serious impact on the quality of the new predictions.

80 Forecasting with the model 1) The predictions are affected not only by the sampling variation in the original sample, but also by that of the newly drawn sample. Always calculate the confidence intervals of your predictions in addition to the point estimate. 2) Make sure that the conditions and relationships measured at the time the original sample was taken have not changed substantially. 3) Don't use the model to estimate beyond the range of explanatory variables found in the sample.


More information

Small Group Presentations

Small Group Presentations Admin Assignment 1 due next Tuesday at 3pm in the Psychology course centre. Matrix Quiz during the first hour of next lecture. Assignment 2 due 13 May at 10am. I will upload and distribute these at the

More information

6. Unusual and Influential Data

6. Unusual and Influential Data Sociology 740 John ox Lecture Notes 6. Unusual and Influential Data Copyright 2014 by John ox Unusual and Influential Data 1 1. Introduction I Linear statistical models make strong assumptions about the

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression Assoc. Prof Dr Sarimah Abdullah Unit of Biostatistics & Research Methodology School of Medical Sciences, Health Campus Universiti Sains Malaysia Regression Regression analysis

More information

10. LINEAR REGRESSION AND CORRELATION

10. LINEAR REGRESSION AND CORRELATION 1 10. LINEAR REGRESSION AND CORRELATION The contingency table describes an association between two nominal (categorical) variables (e.g., use of supplemental oxygen and mountaineer survival ). We have

More information

Chapter 10: Moderation, mediation and more regression

Chapter 10: Moderation, mediation and more regression Chapter 10: Moderation, mediation and more regression Smart Alex s Solutions Task 1 McNulty et al. (2008) found a relationship between a person s Attractiveness and how much Support they give their partner

More information

Business Research Methods. Introduction to Data Analysis

Business Research Methods. Introduction to Data Analysis Business Research Methods Introduction to Data Analysis Data Analysis Process STAGES OF DATA ANALYSIS EDITING CODING DATA ENTRY ERROR CHECKING AND VERIFICATION DATA ANALYSIS Introduction Preparation of

More information

Example of Interpreting and Applying a Multiple Regression Model

Example of Interpreting and Applying a Multiple Regression Model Example of Interpreting and Applying a Multiple Regression We'll use the same data set as for the bivariate correlation example -- the criterion is 1 st year graduate grade point average and the predictors

More information

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS)

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS) Chapter : Advanced Remedial Measures Weighted Least Squares (WLS) When the error variance appears nonconstant, a transformation (of Y and/or X) is a quick remedy. But it may not solve the problem, or it

More information

This tutorial presentation is prepared by. Mohammad Ehsanul Karim

This tutorial presentation is prepared by. Mohammad Ehsanul Karim STATA: The Red tutorial STATA: The Red tutorial This tutorial presentation is prepared by Mohammad Ehsanul Karim ehsan.karim@gmail.com STATA: The Red tutorial This tutorial presentation is prepared by

More information

CLASSICAL AND. MODERN REGRESSION WITH APPLICATIONS

CLASSICAL AND. MODERN REGRESSION WITH APPLICATIONS - CLASSICAL AND. MODERN REGRESSION WITH APPLICATIONS SECOND EDITION Raymond H. Myers Virginia Polytechnic Institute and State university 1 ~l~~l~l~~~~~~~l!~ ~~~~~l~/ll~~ Donated by Duxbury o Thomson Learning,,

More information

Daniel Boduszek University of Huddersfield

Daniel Boduszek University of Huddersfield Daniel Boduszek University of Huddersfield d.boduszek@hud.ac.uk Introduction to Correlation SPSS procedure for Pearson r Interpretation of SPSS output Presenting results Partial Correlation Correlation

More information

12/31/2016. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

12/31/2016. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Introduce moderated multiple regression Continuous predictor continuous predictor Continuous predictor categorical predictor Understand

More information

Political Science 15, Winter 2014 Final Review

Political Science 15, Winter 2014 Final Review Political Science 15, Winter 2014 Final Review The major topics covered in class are listed below. You should also take a look at the readings listed on the class website. Studying Politics Scientifically

More information

MMI 409 Spring 2009 Final Examination Gordon Bleil. 1. Is there a difference in depression as a function of group and drug?

MMI 409 Spring 2009 Final Examination Gordon Bleil. 1. Is there a difference in depression as a function of group and drug? MMI 409 Spring 2009 Final Examination Gordon Bleil Table of Contents Research Scenario and General Assumptions Questions for Dataset (Questions are hyperlinked to detailed answers) 1. Is there a difference

More information

HPS301 Exam Notes- Contents

HPS301 Exam Notes- Contents HPS301 Exam Notes- Contents Week 1 Research Design: What characterises different approaches 1 Experimental Design 1 Key Features 1 Criteria for establishing causality 2 Validity Internal Validity 2 Threats

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

Skala Stress. Putaran 1 Reliability. Case Processing Summary. N % Excluded a 0.0 Total

Skala Stress. Putaran 1 Reliability. Case Processing Summary. N % Excluded a 0.0 Total Skala Stress Putaran 1 Reliability Case Processing Summary N % Cases Valid Excluded a 0.0 Total a. Listwise deletion based on all variables in the procedure. Reliability Statistics Cronbach's Alpha N of

More information

Regression Discontinuity Analysis

Regression Discontinuity Analysis Regression Discontinuity Analysis A researcher wants to determine whether tutoring underachieving middle school students improves their math grades. Another wonders whether providing financial aid to low-income

More information

Overview of Lecture. Survey Methods & Design in Psychology. Correlational statistics vs tests of differences between groups

Overview of Lecture. Survey Methods & Design in Psychology. Correlational statistics vs tests of differences between groups Survey Methods & Design in Psychology Lecture 10 ANOVA (2007) Lecturer: James Neill Overview of Lecture Testing mean differences ANOVA models Interactions Follow-up tests Effect sizes Parametric Tests

More information

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES 24 MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES In the previous chapter, simple linear regression was used when you have one independent variable and one dependent variable. This chapter

More information

CHAPTER ONE CORRELATION

CHAPTER ONE CORRELATION CHAPTER ONE CORRELATION 1.0 Introduction The first chapter focuses on the nature of statistical data of correlation. The aim of the series of exercises is to ensure the students are able to use SPSS to

More information

Results & Statistics: Description and Correlation. I. Scales of Measurement A Review

Results & Statistics: Description and Correlation. I. Scales of Measurement A Review Results & Statistics: Description and Correlation The description and presentation of results involves a number of topics. These include scales of measurement, descriptive statistics used to summarize

More information

Intro to SPSS. Using SPSS through WebFAS

Intro to SPSS. Using SPSS through WebFAS Intro to SPSS Using SPSS through WebFAS http://www.yorku.ca/computing/students/labs/webfas/ Try it early (make sure it works from your computer) If you need help contact UIT Client Services Voice: 416-736-5800

More information

Examining Relationships Least-squares regression. Sections 2.3

Examining Relationships Least-squares regression. Sections 2.3 Examining Relationships Least-squares regression Sections 2.3 The regression line A regression line describes a one-way linear relationship between variables. An explanatory variable, x, explains variability

More information

THE UNIVERSITY OF SUSSEX. BSc Second Year Examination DISCOVERING STATISTICS SAMPLE PAPER INSTRUCTIONS

THE UNIVERSITY OF SUSSEX. BSc Second Year Examination DISCOVERING STATISTICS SAMPLE PAPER INSTRUCTIONS C8552 THE UNIVERSITY OF SUSSEX BSc Second Year Examination DISCOVERING STATISTICS SAMPLE PAPER INSTRUCTIONS Do not, under any circumstances, remove the question paper, used or unused, from the examination

More information

CRITERIA FOR USE. A GRAPHICAL EXPLANATION OF BI-VARIATE (2 VARIABLE) REGRESSION ANALYSISSys

CRITERIA FOR USE. A GRAPHICAL EXPLANATION OF BI-VARIATE (2 VARIABLE) REGRESSION ANALYSISSys Multiple Regression Analysis 1 CRITERIA FOR USE Multiple regression analysis is used to test the effects of n independent (predictor) variables on a single dependent (criterion) variable. Regression tests

More information

Dr. Kelly Bradley Final Exam Summer {2 points} Name

Dr. Kelly Bradley Final Exam Summer {2 points} Name {2 points} Name You MUST work alone no tutors; no help from classmates. Email me or see me with questions. You will receive a score of 0 if this rule is violated. This exam is being scored out of 00 points.

More information

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n. University of Groningen Latent instrumental variables Ebbes, P. IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document

More information

Item-Total Statistics

Item-Total Statistics 64 Reliability Case Processing Summary N % Cases Valid 46 00.0 Excluded a 0.0 46 00.0 a. Listwise deletion based on all variables in the procedure. Reliability Statistics Cronbach's Alpha N of Items.869

More information

Regression Including the Interaction Between Quantitative Variables

Regression Including the Interaction Between Quantitative Variables Regression Including the Interaction Between Quantitative Variables The purpose of the study was to examine the inter-relationships among social skills, the complexity of the social situation, and performance

More information

SPSS output for 420 midterm study

SPSS output for 420 midterm study Ψ Psy Midterm Part In lab (5 points total) Your professor decides that he wants to find out how much impact amount of study time has on the first midterm. He randomly assigns students to study for hours,

More information

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Week 9 Hour 3 Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Stat 302 Notes. Week 9, Hour 3, Page 1 / 39 Stepwise Now that we've introduced interactions,

More information

Modern Regression Methods

Modern Regression Methods Modern Regression Methods Second Edition THOMAS P. RYAN Acworth, Georgia WILEY A JOHN WILEY & SONS, INC. PUBLICATION Contents Preface 1. Introduction 1.1 Simple Linear Regression Model, 3 1.2 Uses of Regression

More information

isc ove ring i Statistics sing SPSS

isc ove ring i Statistics sing SPSS isc ove ring i Statistics sing SPSS S E C O N D! E D I T I O N (and sex, drugs and rock V roll) A N D Y F I E L D Publications London o Thousand Oaks New Delhi CONTENTS Preface How To Use This Book Acknowledgements

More information

Introduction to Quantitative Methods (SR8511) Project Report

Introduction to Quantitative Methods (SR8511) Project Report Introduction to Quantitative Methods (SR8511) Project Report Exploring the variables related to and possibly affecting the consumption of alcohol by adults Student Registration number: 554561 Word counts

More information

Lecture 12 Cautions in Analyzing Associations

Lecture 12 Cautions in Analyzing Associations Lecture 12 Cautions in Analyzing Associations MA 217 - Stephen Sawin Fairfield University August 8, 2017 Cautions in Linear Regression Three things to be careful when doing linear regression we have already

More information

CHAPTER 4 RESULTS. In this chapter the results of the empirical research are reported and discussed in the following order:

CHAPTER 4 RESULTS. In this chapter the results of the empirical research are reported and discussed in the following order: 71 CHAPTER 4 RESULTS 4.1 INTRODUCTION In this chapter the results of the empirical research are reported and discussed in the following order: (1) Descriptive statistics of the sample; the extraneous variables;

More information

Problem #1 Neurological signs and symptoms of ciguatera poisoning as the start of treatment and 2.5 hours after treatment with mannitol.

Problem #1 Neurological signs and symptoms of ciguatera poisoning as the start of treatment and 2.5 hours after treatment with mannitol. Ho (null hypothesis) Ha (alternative hypothesis) Problem #1 Neurological signs and symptoms of ciguatera poisoning as the start of treatment and 2.5 hours after treatment with mannitol. Hypothesis: Ho:

More information

Media, Discussion and Attitudes Technical Appendix. 6 October 2015 BBC Media Action Andrea Scavo and Hana Rohan

Media, Discussion and Attitudes Technical Appendix. 6 October 2015 BBC Media Action Andrea Scavo and Hana Rohan Media, Discussion and Attitudes Technical Appendix 6 October 2015 BBC Media Action Andrea Scavo and Hana Rohan 1 Contents 1 BBC Media Action Programming and Conflict-Related Attitudes (Part 5a: Media and

More information

MULTIPLE OLS REGRESSION RESEARCH QUESTION ONE:

MULTIPLE OLS REGRESSION RESEARCH QUESTION ONE: 1 MULTIPLE OLS REGRESSION RESEARCH QUESTION ONE: Predicting State Rates of Robbery per 100K We know that robbery rates vary significantly from state-to-state in the United States. In any given state, we

More information

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research 2012 CCPRC Meeting Methodology Presession Workshop October 23, 2012, 2:00-5:00 p.m. Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy

More information

CHAPTER 4: FINDINGS 4.1 Introduction This chapter includes five major sections. The first section reports descriptive statistics and discusses the

CHAPTER 4: FINDINGS 4.1 Introduction This chapter includes five major sections. The first section reports descriptive statistics and discusses the CHAPTER 4: FINDINGS 4.1 Introduction This chapter includes five major sections. The first section reports descriptive statistics and discusses the respondent s representativeness of the overall Earthwatch

More information

Section 6: Analysing Relationships Between Variables

Section 6: Analysing Relationships Between Variables 6. 1 Analysing Relationships Between Variables Section 6: Analysing Relationships Between Variables Choosing a Technique The Crosstabs Procedure The Chi Square Test The Means Procedure The Correlations

More information

Section 3.2 Least-Squares Regression

Section 3.2 Least-Squares Regression Section 3.2 Least-Squares Regression Linear relationships between two quantitative variables are pretty common and easy to understand. Correlation measures the direction and strength of these relationships.

More information

Prediction of sheep milk chemical composition using ph, electrical conductivity and refractive index

Prediction of sheep milk chemical composition using ph, electrical conductivity and refractive index Prediction of sheep milk chemical composition using ph, electrical conductivity and refractive index A.I. Gelasakis *, R. Giannakou *, A. Kominakis, G. Antonakos, G. Arsenos * * Department of Animal Production,

More information

2 Assumptions of simple linear regression

2 Assumptions of simple linear regression Simple Linear Regression: Reliability of predictions Richard Buxton. 2008. 1 Introduction We often use regression models to make predictions. In Figure?? (a), we ve fitted a model relating a household

More information

The impact of pre-selected variance inflation factor thresholds on the stability and predictive power of logistic regression models in credit scoring

The impact of pre-selected variance inflation factor thresholds on the stability and predictive power of logistic regression models in credit scoring Volume 31 (1), pp. 17 37 http://orion.journals.ac.za ORiON ISSN 0529-191-X 2015 The impact of pre-selected variance inflation factor thresholds on the stability and predictive power of logistic regression

More information

7 Statistical Issues that Researchers Shouldn t Worry (So Much) About

7 Statistical Issues that Researchers Shouldn t Worry (So Much) About 7 Statistical Issues that Researchers Shouldn t Worry (So Much) About By Karen Grace-Martin Founder & President About the Author Karen Grace-Martin is the founder and president of The Analysis Factor.

More information

Multiple Linear Regression Analysis

Multiple Linear Regression Analysis Revised July 2018 Multiple Linear Regression Analysis This set of notes shows how to use Stata in multiple regression analysis. It assumes that you have set Stata up on your computer (see the Getting Started

More information

Chapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc.

Chapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc. Chapter 23 Inference About Means Copyright 2010 Pearson Education, Inc. Getting Started Now that we know how to create confidence intervals and test hypotheses about proportions, it d be nice to be able

More information

Lecture Outline. Biost 517 Applied Biostatistics I. Purpose of Descriptive Statistics. Purpose of Descriptive Statistics

Lecture Outline. Biost 517 Applied Biostatistics I. Purpose of Descriptive Statistics. Purpose of Descriptive Statistics Biost 517 Applied Biostatistics I Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture 3: Overview of Descriptive Statistics October 3, 2005 Lecture Outline Purpose

More information

RESPONSE SURFACE MODELING AND OPTIMIZATION TO ELUCIDATE THE DIFFERENTIAL EFFECTS OF DEMOGRAPHIC CHARACTERISTICS ON HIV PREVALENCE IN SOUTH AFRICA

RESPONSE SURFACE MODELING AND OPTIMIZATION TO ELUCIDATE THE DIFFERENTIAL EFFECTS OF DEMOGRAPHIC CHARACTERISTICS ON HIV PREVALENCE IN SOUTH AFRICA RESPONSE SURFACE MODELING AND OPTIMIZATION TO ELUCIDATE THE DIFFERENTIAL EFFECTS OF DEMOGRAPHIC CHARACTERISTICS ON HIV PREVALENCE IN SOUTH AFRICA W. Sibanda 1* and P. Pretorius 2 1 DST/NWU Pre-clinical

More information

POL 242Y Final Test (Take Home) Name

POL 242Y Final Test (Take Home) Name POL 242Y Final Test (Take Home) Name_ Due August 6, 2008 The take-home final test should be returned in the classroom (FE 36) by the end of the class on August 6. Students who fail to submit the final

More information

AP Statistics Practice Test Ch. 3 and Previous

AP Statistics Practice Test Ch. 3 and Previous AP Statistics Practice Test Ch. 3 and Previous Name Date Use the following to answer questions 1 and 2: A researcher measures the height (in feet) and volume of usable lumber (in cubic feet) of 32 cherry

More information

Biostatistics II

Biostatistics II Biostatistics II 514-5509 Course Description: Modern multivariable statistical analysis based on the concept of generalized linear models. Includes linear, logistic, and Poisson regression, survival analysis,

More information

EMPOWERMENT INDEX AND FACTORS INFLUENCING RURAL WOMEN EMPOWERMENT

EMPOWERMENT INDEX AND FACTORS INFLUENCING RURAL WOMEN EMPOWERMENT CHAPTER VI EMPOWERMENT INDEX AND FACTORS INFLUENCING RURAL WOMEN EMPOWERMENT Contents 6.1 Introduction 6.2 Empowerment Index: Theoritical and Empirical Bases 6.3 Empowerment Measures 6.4 Major Factors

More information

Part 8 Logistic Regression

Part 8 Logistic Regression 1 Quantitative Methods for Health Research A Practical Interactive Guide to Epidemiology and Statistics Practical Course in Quantitative Data Handling SPSS (Statistical Package for the Social Sciences)

More information

Ordinary Least Squares Regression

Ordinary Least Squares Regression Ordinary Least Squares Regression March 2013 Nancy Burns (nburns@isr.umich.edu) - University of Michigan From description to cause Group Sample Size Mean Health Status Standard Error Hospital 7,774 3.21.014

More information

bivariate analysis: The statistical analysis of the relationship between two variables.

bivariate analysis: The statistical analysis of the relationship between two variables. bivariate analysis: The statistical analysis of the relationship between two variables. cell frequency: The number of cases in a cell of a cross-tabulation (contingency table). chi-square (χ 2 ) test for

More information

Effects of Nutrients on Shrimp Growth

Effects of Nutrients on Shrimp Growth Data Set 5: Effects of Nutrients on Shrimp Growth Statistical setting This Handout is an example of extreme collinearity of the independent variables, and of the methods used for diagnosing this problem.

More information

Analysis and Interpretation of Data Part 1

Analysis and Interpretation of Data Part 1 Analysis and Interpretation of Data Part 1 DATA ANALYSIS: PRELIMINARY STEPS 1. Editing Field Edit Completeness Legibility Comprehensibility Consistency Uniformity Central Office Edit 2. Coding Specifying

More information

THE STATSWHISPERER. Introduction to this Issue. Doing Your Data Analysis INSIDE THIS ISSUE

THE STATSWHISPERER. Introduction to this Issue. Doing Your Data Analysis INSIDE THIS ISSUE Spring 20 11, Volume 1, Issue 1 THE STATSWHISPERER The StatsWhisperer Newsletter is published by staff at StatsWhisperer. Visit us at: www.statswhisperer.com Introduction to this Issue The current issue

More information

Lecture 6B: more Chapter 5, Section 3 Relationships between Two Quantitative Variables; Regression

Lecture 6B: more Chapter 5, Section 3 Relationships between Two Quantitative Variables; Regression Lecture 6B: more Chapter 5, Section 3 Relationships between Two Quantitative Variables; Regression! Equation of Regression Line; Residuals! Effect of Explanatory/Response Roles! Unusual Observations! Sample

More information

Advanced ANOVA Procedures

Advanced ANOVA Procedures Advanced ANOVA Procedures Session Lecture Outline:. An example. An example. Two-way ANOVA. An example. Two-way Repeated Measures ANOVA. MANOVA. ANalysis of Co-Variance (): an ANOVA procedure whereby the

More information

Regression Equation. November 29, S10.3_3 Regression. Key Concept. Chapter 10 Correlation and Regression. Definitions

Regression Equation. November 29, S10.3_3 Regression. Key Concept. Chapter 10 Correlation and Regression. Definitions MAT 155 Statistical Analysis Dr. Claude Moore Cape Fear Community College Chapter 10 Correlation and Regression 10 1 Review and Preview 10 2 Correlation 10 3 Regression 10 4 Variation and Prediction Intervals

More information

Performance of Median and Least Squares Regression for Slightly Skewed Data

Performance of Median and Least Squares Regression for Slightly Skewed Data World Academy of Science, Engineering and Technology 9 Performance of Median and Least Squares Regression for Slightly Skewed Data Carolina Bancayrin - Baguio Abstract This paper presents the concept of

More information

Chapter 11 Nonexperimental Quantitative Research Steps in Nonexperimental Research

Chapter 11 Nonexperimental Quantitative Research Steps in Nonexperimental Research Chapter 11 Nonexperimental Quantitative Research (Reminder: Don t forget to utilize the concept maps and study questions as you study this and the other chapters.) Nonexperimental research is needed because

More information

1.4 - Linear Regression and MS Excel

1.4 - Linear Regression and MS Excel 1.4 - Linear Regression and MS Excel Regression is an analytic technique for determining the relationship between a dependent variable and an independent variable. When the two variables have a linear

More information

STATISTICS INFORMED DECISIONS USING DATA

STATISTICS INFORMED DECISIONS USING DATA STATISTICS INFORMED DECISIONS USING DATA Fifth Edition Chapter 4 Describing the Relation between Two Variables 4.1 Scatter Diagrams and Correlation Learning Objectives 1. Draw and interpret scatter diagrams

More information

Carrying out an Empirical Project

Carrying out an Empirical Project Carrying out an Empirical Project Empirical Analysis & Style Hint Special program: Pre-training 1 Carrying out an Empirical Project 1. Posing a Question 2. Literature Review 3. Data Collection 4. Econometric

More information

SPSS output for 420 midterm study

SPSS output for 420 midterm study Ψ Psy Midterm Part In lab (5 points total) Your professor decides that he wants to find out how much impact amount of study time has on the first midterm. He randomly assigns students to study for hours,

More information

M15_BERE8380_12_SE_C15.6.qxd 2/21/11 8:21 PM Page Influence Analysis 1

M15_BERE8380_12_SE_C15.6.qxd 2/21/11 8:21 PM Page Influence Analysis 1 M15_BERE8380_12_SE_C15.6.qxd 2/21/11 8:21 PM Page 1 15.6 Influence Analysis FIGURE 15.16 Minitab worksheet containing computed values for the Studentized deleted residuals, the hat matrix elements, and

More information

Applications. DSC 410/510 Multivariate Statistical Methods. Discriminating Two Groups. What is Discriminant Analysis

Applications. DSC 410/510 Multivariate Statistical Methods. Discriminating Two Groups. What is Discriminant Analysis DSC 4/5 Multivariate Statistical Methods Applications DSC 4/5 Multivariate Statistical Methods Discriminant Analysis Identify the group to which an object or case (e.g. person, firm, product) belongs:

More information

Chapter 9: Comparing two means

Chapter 9: Comparing two means Chapter 9: Comparing two means Smart Alex s Solutions Task 1 Is arachnophobia (fear of spiders) specific to real spiders or will pictures of spiders evoke similar levels of anxiety? Twelve arachnophobes

More information

(CORRELATIONAL DESIGN AND COMPARATIVE DESIGN)

(CORRELATIONAL DESIGN AND COMPARATIVE DESIGN) UNIT 4 OTHER DESIGNS (CORRELATIONAL DESIGN AND COMPARATIVE DESIGN) Quasi Experimental Design Structure 4.0 Introduction 4.1 Objectives 4.2 Definition of Correlational Research Design 4.3 Types of Correlational

More information

Multiple Regression Models

Multiple Regression Models Multiple Regression Models Advantages of multiple regression Parts of a multiple regression model & interpretation Raw score vs. Standardized models Differences between r, b biv, b mult & β mult Steps

More information

Summary & Conclusion. Lecture 10 Survey Research & Design in Psychology James Neill, 2016 Creative Commons Attribution 4.0

Summary & Conclusion. Lecture 10 Survey Research & Design in Psychology James Neill, 2016 Creative Commons Attribution 4.0 Summary & Conclusion Lecture 10 Survey Research & Design in Psychology James Neill, 2016 Creative Commons Attribution 4.0 Overview 1. Survey research and design 1. Survey research 2. Survey design 2. Univariate

More information

The SAGE Encyclopedia of Educational Research, Measurement, and Evaluation Multivariate Analysis of Variance

The SAGE Encyclopedia of Educational Research, Measurement, and Evaluation Multivariate Analysis of Variance The SAGE Encyclopedia of Educational Research, Measurement, Multivariate Analysis of Variance Contributors: David W. Stockburger Edited by: Bruce B. Frey Book Title: Chapter Title: "Multivariate Analysis

More information

Section 3 Correlation and Regression - Teachers Notes

Section 3 Correlation and Regression - Teachers Notes The data are from the paper: Exploring Relationships in Body Dimensions Grete Heinz and Louis J. Peterson San José State University Roger W. Johnson and Carter J. Kerk South Dakota School of Mines and

More information

LAB ASSIGNMENT 4 INFERENCES FOR NUMERICAL DATA. Comparison of Cancer Survival*

LAB ASSIGNMENT 4 INFERENCES FOR NUMERICAL DATA. Comparison of Cancer Survival* LAB ASSIGNMENT 4 1 INFERENCES FOR NUMERICAL DATA In this lab assignment, you will analyze the data from a study to compare survival times of patients of both genders with different primary cancers. First,

More information

MODEL SELECTION STRATEGIES. Tony Panzarella

MODEL SELECTION STRATEGIES. Tony Panzarella MODEL SELECTION STRATEGIES Tony Panzarella Lab Course March 20, 2014 2 Preamble Although focus will be on time-to-event data the same principles apply to other outcome data Lab Course March 20, 2014 3

More information

Subescala D CULTURA ORGANIZACIONAL. Factor Analysis

Subescala D CULTURA ORGANIZACIONAL. Factor Analysis Subescala D CULTURA ORGANIZACIONAL Factor Analysis Descriptive Statistics Mean Std. Deviation Analysis N 1 3,44 1,244 224 2 3,43 1,258 224 3 4,50,989 224 4 4,38 1,118 224 5 4,30 1,151 224 6 4,27 1,205

More information

In many cardiovascular experiments and observational studies,

In many cardiovascular experiments and observational studies, Statistical Primer for Cardiovascular Research Multiple Linear Regression Accounting for Multiple Simultaneous Determinants of a Continuous Dependent Variable Bryan K. Slinker, DVM, PhD; Stanton A. Glantz,

More information

Chapter 11 Multiple Regression

Chapter 11 Multiple Regression Chapter 11 Multiple Regression PSY 295 Oswald Outline The problem An example Compensatory and Noncompensatory Models More examples Multiple correlation Chapter 11 Multiple Regression 2 Cont. Outline--cont.

More information

Survey research (Lecture 1) Summary & Conclusion. Lecture 10 Survey Research & Design in Psychology James Neill, 2015 Creative Commons Attribution 4.

Survey research (Lecture 1) Summary & Conclusion. Lecture 10 Survey Research & Design in Psychology James Neill, 2015 Creative Commons Attribution 4. Summary & Conclusion Lecture 10 Survey Research & Design in Psychology James Neill, 2015 Creative Commons Attribution 4.0 Overview 1. Survey research 2. Survey design 3. Descriptives & graphing 4. Correlation

More information

Survey research (Lecture 1)

Survey research (Lecture 1) Summary & Conclusion Lecture 10 Survey Research & Design in Psychology James Neill, 2015 Creative Commons Attribution 4.0 Overview 1. Survey research 2. Survey design 3. Descriptives & graphing 4. Correlation

More information

Reminders/Comments. Thanks for the quick feedback I ll try to put HW up on Saturday and I ll you

Reminders/Comments. Thanks for the quick feedback I ll try to put HW up on Saturday and I ll  you Reminders/Comments Thanks for the quick feedback I ll try to put HW up on Saturday and I ll email you Final project will be assigned in the last week of class You ll have that week to do it Participation

More information

THE EFFECTIVENESS OF VARIOUS TRAINING PROGRAMMES 1. The Effectiveness of Various Training Programmes on Lie Detection Ability and the

THE EFFECTIVENESS OF VARIOUS TRAINING PROGRAMMES 1. The Effectiveness of Various Training Programmes on Lie Detection Ability and the THE EFFECTIVENESS OF VARIOUS TRAINING PROGRAMMES 1 The Effectiveness of Various Training Programmes on Lie Detection Ability and the Role of Sex in the Process THE EFFECTIVENESS OF VARIOUS TRAINING PROGRAMMES

More information