Simple Linear Regression: the model, estimation and testing
Lecture No. 05
Example 1
A production manager has compared the dexterity test scores of five assembly-line employees with their hourly productivity.
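Example 1
Population regression model:
$$y = \beta_0 + \beta_1 x + \varepsilon$$
where y is the dependent variable, β0 is the intercept, β1 is the slope, x is the independent variable, and ε is the random error (residual).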
Simple Linear Regression: the model
The goal of a regression analysis is to obtain predictions of one variable using the known values of another.
Simple Linear Regression
Three assumptions: the ε term is assumed to be a random variable that:
1. Has a mean of 0
2. Is normally distributed
3. Has constant variance at every value of x (homoscedastic)
Simple Linear Regression
Three assumptions (continued): for any given value of x, the y values are assumed to be normally distributed about the population regression line and to have the same standard deviation σ. The regression line based on sample data is an estimate of this true line.
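Example 1
Sample regression line:
$$\hat{y} = b_0 + b_1 x$$
where b0 and b1 are the sample estimates of the population intercept and slope.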
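The Least-Squares Criterion
The least-squares criterion requires that the sum of the squared deviations between the y values in the scatter diagram and the y values predicted by the equation be minimized. In symbolic terms:
$$\min \sum (y_i - \hat{y}_i)^2$$
where $y_i$ is an observed value and $\hat{y}_i$ is the value predicted by the regression line for the same x.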
Determining the Least-Squares Regression Line
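A sketch of the usual least-squares estimates for a line written as $\hat{y} = b_0 + b_1 x$. The symbols $\bar{x}$ and $\bar{y}$ for the sample means are assumed notation:

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \qquad b_0 = \bar{y} - b_1 \bar{x}$$

The slope is computed first; the intercept then forces the line through the point $(\bar{x}, \bar{y})$.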
Example 1 - Point Estimates Using the Regression Line
If a job applicant were to score x = 15 on the manual dexterity test, we would predict that this person would be capable of producing 64.2 units per hour on the assembly line.
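The point estimate can be reproduced with a short Python sketch. The five (x, y) pairs below are an assumption: the data table is not part of the surviving text, and these values were chosen because they yield the stated slope b1 = 3.0 and the stated prediction of 64.2 at x = 15. The function name `fit_line` is illustrative.

```python
# Assumed illustrative data (the original data table is not in the text).
x = [12, 14, 17, 16, 11]   # dexterity test scores
y = [55, 63, 67, 70, 51]   # units produced per hour


def fit_line(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx              # slope
    b0 = y_bar - b1 * x_bar     # intercept: line passes through the means
    return b0, b1


b0, b1 = fit_line(x, y)
y_hat_15 = b0 + b1 * 15         # point estimate for an applicant with x = 15
print(f"y-hat = {b0:.1f} + {b1:.1f}x; prediction at x = 15: {y_hat_15:.1f}")
```

With the assumed data, the fitted line is ŷ = 19.2 + 3.0x, so the prediction at x = 15 is 19.2 + 3.0(15) = 64.2.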
Estimation of standard error
To develop interval estimates for the dependent variable, we must first determine the standard error of estimate. This is a standard deviation describing the dispersion of data points above and below the regression line. Its formula is very similar to that for a sample standard deviation s, except that deviations are measured from the regression line and the divisor is n - 2:
$$s_{y.x} = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$$
Example 1 Now calculate the standard error of estimate as
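A minimal sketch of this calculation in Python. The data values are assumed (the original table is not in the text); the coefficients 19.2 and 3.0 follow from the stated slope and the stated prediction of 64.2 at x = 15. The function name is illustrative.

```python
import math

# Assumed illustrative data (the original data table is not in the text).
x = [12, 14, 17, 16, 11]
y = [55, 63, 67, 70, 51]
b0, b1 = 19.2, 3.0   # intercept and slope of the sample regression line


def standard_error_of_estimate(x, y, b0, b1):
    """Square root of (sum of squared residuals) / (n - 2)."""
    n = len(x)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))


s_yx = standard_error_of_estimate(x, y, b0, b1)
print(f"standard error of estimate: {s_yx:.3f}")
```

Under these assumptions the result is about 2.757 units per hour, and it is the quantity that feeds into the interval estimates.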
Confidence and Prediction Intervals for y Given a Specific x Value
Given a specific value of x, we can make two kinds of interval estimates regarding y: (1) a confidence interval for the (unknown) true mean of y, and (2) a prediction interval for an individual y observation.
Confidence interval for the mean of y given a specific x value
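The usual form of this interval is sketched below, with $s_{y.x}$ denoting the standard error of estimate, $\hat{y}$ the point estimate at the chosen x, and $t$ taken from the t distribution with n - 2 degrees of freedom:

$$\hat{y} \pm t \, s_{y.x} \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$

The term under the square root grows as x moves away from $\bar{x}$, so the interval is narrowest at the mean of x.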
Example 1 Confidence Interval
For persons scoring x = 15 on the dexterity test, what is the 95% confidence interval for their mean productivity? For the 95% level of confidence and df = n - 2 = 3, t = 3.182, and the 95% confidence interval can now be calculated as 64.2 ± 4.281. Based on these calculations, we have 95% confidence that the mean productivity for persons scoring x = 15 on the dexterity test will be between 59.919 and 68.481 units per hour.
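The interval can be checked with a short Python sketch. The x values and the standard error 2.757 are assumptions (the data table is not in the text; 2.757 is what the assumed data yield), while t = 3.182 and the point estimate 64.2 are taken from the example. The function name is illustrative.

```python
import math


def mean_confidence_interval(y_hat, t, s_yx, n, x0, x_bar, sxx):
    """Confidence interval for the mean of y at x = x0."""
    half_width = t * s_yx * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    return y_hat - half_width, y_hat + half_width


x = [12, 14, 17, 16, 11]                  # assumed dexterity scores
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)  # sum of squared deviations of x

low, high = mean_confidence_interval(
    y_hat=64.2, t=3.182, s_yx=2.757, n=n, x0=15, x_bar=x_bar, sxx=sxx
)
print(f"95% CI for mean productivity at x = 15: {low:.2f} to {high:.2f}")
```

The result agrees with the stated limits of 59.919 and 68.481 to within a few thousandths; the small gap comes from rounding intermediate values.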
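Prediction Interval for an Individual y Observation
For a given value of x, the estimation interval for an individual y observation is called the prediction interval. Prediction interval for an individual y, given a specific value of x:
$$\hat{y} \pm t \, s_{y.x} \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$
where $s_{y.x}$ is the standard error of estimate and t has n - 2 degrees of freedom. The additional 1 under the square root is the only difference from the confidence interval for the mean of y; it accounts for the scatter of individual observations about the regression line.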
Example 1 Prediction Interval
A prospective employee has scored x = 15 on the dexterity test. What is the 95% prediction interval for his productivity? The interval is 64.2 ± 9.764: for this applicant, we have 95% confidence that his productivity as an employee would be between 54.436 and 73.964 units per hour.
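A hedged Python sketch of the same calculation. As before, the x values and the standard error 2.757 are assumptions consistent with the stated results, not values from the surviving text; t = 3.182 and the point estimate 64.2 come from the example.

```python
import math


def prediction_interval(y_hat, t, s_yx, n, x0, x_bar, sxx):
    """Prediction interval for one individual y observation at x = x0."""
    # The leading 1 is what separates this from the interval for the mean.
    half_width = t * s_yx * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    return y_hat - half_width, y_hat + half_width


x = [12, 14, 17, 16, 11]                  # assumed dexterity scores
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

pi_low, pi_high = prediction_interval(
    y_hat=64.2, t=3.182, s_yx=2.757, n=n, x0=15, x_bar=x_bar, sxx=sxx
)
print(f"95% prediction interval at x = 15: {pi_low:.2f} to {pi_high:.2f}")
```

The limits match 54.436 and 73.964 to within rounding, and the interval is more than twice as wide as the confidence interval for the mean at the same x.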
Example 1 Prediction Interval
The 95% prediction interval for individual y values becomes slightly wider whenever the interval is based on x values that are farther away from the mean of x.
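The widening can be illustrated by computing the half-width of the prediction interval at several x values. This is a sketch under the same assumptions as the other examples: the x values and the standard error 2.757 are assumed, and the chosen evaluation points are arbitrary.

```python
import math

x = [12, 14, 17, 16, 11]                  # assumed dexterity scores
n = len(x)
x_bar = sum(x) / n                        # mean of x (14 for these values)
sxx = sum((xi - x_bar) ** 2 for xi in x)
t, s_yx = 3.182, 2.757                    # t for df = 3; assumed standard error


def prediction_half_width(x0):
    """Half-width of the 95% prediction interval at x = x0."""
    return t * s_yx * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)


for x0 in (14, 15, 17, 20):
    print(f"x = {x0}: half-width = {prediction_half_width(x0):.3f}")
```

The half-width is smallest at the mean of x and increases with the distance from it, which is why the prediction bands curve away from the regression line at both ends.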
Testing and Estimation for the Slope
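A sketch of the usual t test for the slope, with $s_{y.x}$ denoting the standard error of estimate. The symbol $s_{b_1}$ for the standard error of the slope is assumed notation:

$$H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0$$

$$t = \frac{b_1}{s_{b_1}}, \qquad s_{b_1} = \frac{s_{y.x}}{\sqrt{\sum (x_i - \bar{x})^2}}$$

The test statistic is compared with the t distribution with n - 2 degrees of freedom.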
Example 1 Testing and Estimation for the Slope
An equivalent method of testing the significance of the linear relationship is to examine whether the slope β1 of the population regression line could be zero. For the dexterity test data, the slope of the sample regression line was b1 = 3.0.
1. Using the 0.05 level of significance, examine whether the slope of the population regression line could be zero.
2. Construct the 95% confidence interval for the slope of the population regression line.
Example 1 Testing and Estimation for the Slope
The p-value is below the 0.05 level of significance, so we reject the null hypothesis that the slope of the population regression line is zero.
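The decision can be reproduced with the Python sketch below. Two inputs are assumptions rather than values from the text: the standard error of estimate 2.757 and the sum of squared x deviations 26, both of which follow from illustrative data consistent with the stated results. The p-value uses the closed-form CDF of the t distribution, which in this simple form holds only for 3 degrees of freedom.

```python
import math

b1 = 3.0          # sample slope, as stated in the example
s_yx = 2.757      # assumed standard error of estimate
sxx = 26.0        # assumed sum of squared deviations of x from its mean
t_crit = 3.182    # two-tailed critical t for alpha = 0.05, df = 3

s_b1 = s_yx / math.sqrt(sxx)   # standard error of the slope
t_stat = b1 / s_b1             # test statistic for H0: slope = 0


def two_sided_p_df3(t):
    """Two-sided p-value for a t statistic with exactly 3 degrees of freedom."""
    u = abs(t) / math.sqrt(3)
    cdf = 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi
    return 2 * (1 - cdf)


p_value = two_sided_p_df3(t_stat)
reject = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

Under these assumptions the test statistic is about 5.55, well beyond the critical value of 3.182, and the p-value is a little above 0.01, so both decision rules lead to rejection.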
Confidence interval for the Slope
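The usual form of this interval, sketched with $s_{y.x}$ for the standard error of estimate and the assumed symbol $s_{b_1}$ for the standard error of the slope:

$$b_1 \pm t \, s_{b_1}, \qquad s_{b_1} = \frac{s_{y.x}}{\sqrt{\sum (x_i - \bar{x})^2}}$$

Here t is the two-tailed critical value for the chosen confidence level with n - 2 degrees of freedom. If the interval excludes zero, the slope differs significantly from zero at the corresponding level.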
Example 1 Testing and Estimation for the Slope 95% Confidence Interval for the Slope of the Population Regression Line
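A minimal Python sketch of the interval. The slope 3.0 and t = 3.182 come from the example; the standard error of estimate 2.757 and the sum of squared x deviations 26 are assumptions consistent with the stated results.

```python
import math

b1 = 3.0          # sample slope, as stated in the example
t_crit = 3.182    # two-tailed critical t for 95% confidence, df = 3
s_yx = 2.757      # assumed standard error of estimate
sxx = 26.0        # assumed sum of squared deviations of x from its mean

s_b1 = s_yx / math.sqrt(sxx)       # standard error of the slope
slope_low = b1 - t_crit * s_b1
slope_high = b1 + t_crit * s_b1
print(f"95% CI for the slope: {slope_low:.2f} to {slope_high:.2f}")
```

Under these assumptions the interval runs from roughly 1.28 to 4.72. It does not contain zero, which is consistent with rejecting the hypothesis of a zero slope at the 0.05 level.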
Example 2
50 randomly selected students took a math aptitude test before they began their statistics course. The Statistics Department has three questions:
1. What linear regression equation best predicts statistics performance, based on math aptitude scores?
2. If a student made an 80 on the aptitude test, what grade would we expect him to make in statistics?
3. What are the confidence and prediction intervals at x = 80, using the 0.05 level of significance?
Example 2 Solution in Excel
[Excel screenshots not reproduced.]
Example 2 Solution in STATISTICA
[STATISTICA screenshots not reproduced: a dialog sequence with numbered steps 1 to 3, and another way to plot the graphs (steps 1 to 4) that draws regression bands for the prediction intervals and confidence intervals.]
Example 2
If a student made an 80 on the aptitude test, what grade would we expect him to make in statistics? Construct the confidence and prediction intervals at x = 80, using the 0.05 level of significance.