Simple Linear Regression: the model, estimation and testing


 Tyler Casey
 11 months ago
1 Simple Linear Regression: the model, estimation and testing. Lecture No. 05
2 Example 1 A production manager has compared the dexterity test scores of five assembly-line employees with their hourly productivity.
3 Example 1 The model is y = β0 + β1·x + ε, where y is the dependent variable, β0 is the intercept, β1 is the slope, x is the independent variable, and ε is the random error (residual).
4 Simple Linear Regression: the model. The goal of a regression analysis is to obtain predictions of one variable using the known values of another.
5 Simple Linear Regression Three assumptions: The ε term is assumed to be a random variable that: 1. has a mean of 0; 2. is normally distributed; 3. has constant variance at every value of x (homoscedasticity).
6 Simple Linear Regression Three assumptions: For any given value of x, the y values are assumed to be normally distributed about the population regression line and to have the same standard deviation σ. The regression line based on sample data is an estimate of this true line.
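The three assumptions can be made concrete with a small simulation. The sketch below draws y values from a population line with normally distributed errors and checks that, at each x, the errors average close to 0 and have roughly the same spread. The parameter values (BETA0, BETA1, SIGMA), the x values and the seed are arbitrary choices for illustration, not values from the lecture.

```python
import random
import statistics

def simulate_responses(xs, beta0, beta1, sigma, rng):
    """Draw y = beta0 + beta1*x + eps, with eps ~ Normal(0, sigma) at every x."""
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical population parameters (assumed for illustration only).
BETA0, BETA1, SIGMA = 19.2, 3.0, 2.0

rng = random.Random(0)            # fixed seed so the run is repeatable
xs = [10, 15, 20] * 2000          # 2000 simulated observations at each x
ys = simulate_responses(xs, BETA0, BETA1, SIGMA, rng)

# Group the errors (distance from the population line) by x value.
errors_by_x = {}
for x, y in zip(xs, ys):
    errors_by_x.setdefault(x, []).append(y - (BETA0 + BETA1 * x))

# Under the assumptions, each group has mean near 0 and spread near SIGMA.
summary = {
    x: (statistics.mean(errs), statistics.pstdev(errs))
    for x, errs in errors_by_x.items()
}
for x in sorted(summary):
    mean_err, sd_err = summary[x]
    print(f"x={x}: mean error {mean_err:.3f}, std dev {sd_err:.3f}")
```

If the spread differed noticeably between x values, the constant-variance assumption would be in doubt.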
7 Example 1 Sample regression line: ŷ = b0 + b1·x, where b0 and b1 are the sample estimates of β0 and β1.
8 The Least-Squares Criterion The least-squares criterion requires that the sum of the squared deviations between y values in the scatter diagram and y values predicted by the equation be minimized. In symbolic terms: minimize Σ(y − ŷ)², where ŷ is the value predicted by the regression equation.
9 Determining the Least-Squares Regression Line
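A minimal sketch of the least-squares calculation, using the standard closed-form solution: the slope is Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and the intercept is ȳ − slope·x̄. The five (x, y) pairs are not in the transcription; they are illustrative values chosen so that the fitted line predicts 64.2 at x = 15, the figure quoted in Example 1. The function name fit_line is likewise only an illustration.

```python
def fit_line(x, y):
    """Return (b0, b1) for the line that minimizes the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = y_bar - b1 * x_bar      # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

# Illustrative data (assumed, not taken from the slides).
scores = [12, 14, 17, 16, 11]    # dexterity test scores
units = [55, 63, 67, 70, 51]     # units produced per hour

b0, b1 = fit_line(scores, units)
prediction_at_15 = b0 + b1 * 15  # point estimate for an applicant scoring 15
print(f"y_hat = {b0:.1f} + {b1:.1f} x; prediction at x=15: {prediction_at_15:.1f}")
```

Any other line through these points would give a larger sum of squared residuals.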
10 Example 1
11 Example 1
12 Example 1 Point Estimates Using the Regression Line: If a job applicant were to score x = 15 on the manual dexterity test, we would predict this person would be capable of producing 64.2 units per hour on the assembly line.
13 Estimation of standard error To develop interval estimates for the dependent variable, we must first determine the standard error of estimate. This is a standard deviation describing the dispersion of data points above and below the regression line. The formula for the standard error of estimate is very similar to that for determining a sample standard deviation s: s_e = √( Σ(y − ŷ)² / (n − 2) ), where n − 2 is the number of degrees of freedom.
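The standard error of estimate can be sketched in a few lines: square each residual, sum, divide by n − 2 and take the square root. The data are illustrative values assumed for this sketch (they are not in the transcription), and B0 and B1 are the least-squares coefficients for those same values.

```python
import math

def standard_error_of_estimate(x, y, b0, b1):
    """Square root of the residual sum of squares divided by n - 2."""
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (len(x) - 2))

# Illustrative data (assumed) and the least-squares line fitted to them.
scores = [12, 14, 17, 16, 11]
units = [55, 63, 67, 70, 51]
B0, B1 = 19.2, 3.0

s_e = standard_error_of_estimate(scores, units, B0, B1)
print(f"standard error of estimate: {s_e:.3f}")
```

The divisor is n − 2 rather than n − 1 because two parameters, the intercept and the slope, were estimated from the data.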
14 Example 1 A production manager has compared the dexterity test scores of five assembly-line employees with their hourly productivity.
15 Example 1 Now calculate the standard error of estimate as
16 Confidence and Prediction Intervals for y Given a Specific x Value Given a specific value of x, we can make two kinds of interval estimates regarding y: (1) a confidence interval for the (unknown) true mean of y, and (2) a prediction interval for an individual y observation.
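17 Confidence interval for the mean of y given a specific x value: ŷ ± t · s_e · √( 1/n + (x − x̄)² / Σ(x_i − x̄)² ), where s_e is the standard error of estimate and t is taken from the t distribution with df = n − 2.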
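A hedged sketch of the confidence interval for the mean of y at a given x, using the usual textbook form ŷ ± t·s_e·√(1/n + (x − x̄)²/Σ(x_i − x̄)²). The data are illustrative values assumed for this sketch, not figures from the slides; the critical value t = 3.182 for df = 3 is the one quoted in Example 1 and is passed in directly rather than looked up.

```python
import math

def mean_confidence_interval(x, y, x0, t_crit):
    """Confidence interval for the mean of y at x0, from a least-squares fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))                     # standard error of estimate
    half_width = t_crit * s_e * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    y_hat = b0 + b1 * x0
    return y_hat - half_width, y_hat + half_width

# Illustrative data (assumed, not taken from the slides).
scores = [12, 14, 17, 16, 11]
units = [55, 63, 67, 70, 51]

ci_low, ci_high = mean_confidence_interval(scores, units, x0=15, t_crit=3.182)
print(f"95% CI for mean productivity at x=15: {ci_low:.2f} to {ci_high:.2f}")
```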
18 Example 1 Confidence Interval For persons scoring x = 15 on the dexterity test, what is the 95% confidence interval for their mean productivity? For the 95% level of confidence and df = n − 2 = 3, t = 3.182 and the 95% confidence interval can now be calculated as Based on these calculations, we have 95% confidence that the mean productivity for persons scoring x = 15 on the dexterity test will be between and units per hour.
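19 Prediction Interval for an Individual y Observation For a given value of x, the estimation interval for an individual y observation is called the prediction interval. Prediction interval for an individual y, given a specific value of x: ŷ ± t · s_e · √( 1 + 1/n + (x − x̄)² / Σ(x_i − x̄)² ). It has the same form as the confidence interval for the mean of y, with an additional 1 under the square root.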
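The prediction interval can be sketched the same way as the confidence interval for the mean; the only change is the extra 1 under the square root, which accounts for the scatter of an individual observation around the line. The data are again illustrative assumptions rather than values from the slides, and t = 3.182 (df = 3) is the critical value quoted in Example 1.

```python
import math

def prediction_interval(x, y, x0, t_crit):
    """Prediction interval for one new y observation at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    # The leading 1 is the "additional 1" relative to the interval for the mean.
    half_width = t_crit * s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    y_hat = b0 + b1 * x0
    return y_hat - half_width, y_hat + half_width

# Illustrative data (assumed, not taken from the slides).
scores = [12, 14, 17, 16, 11]
units = [55, 63, 67, 70, 51]

pi_low, pi_high = prediction_interval(scores, units, x0=15, t_crit=3.182)
print(f"95% prediction interval at x=15: {pi_low:.2f} to {pi_high:.2f}")
```

For the same x, this interval is always wider than the confidence interval for the mean.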
20 Example 1 Prediction Interval A prospective employee has scored x = 15 on the dexterity test. What is the 95% prediction interval for his productivity? For this applicant, we have 95% confidence that his productivity as an employee would be between and units per hour.
21 Example 1 Prediction Interval The 95% prediction interval for individual y values becomes slightly wider whenever the interval is based on x values that are farther away from the mean of x.
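The widening can be checked numerically. The sketch below computes the full width of the 95% prediction interval at several x values; the data are illustrative assumptions (mean of x is 14), not values from the slides, and t = 3.182 is the df = 3 critical value quoted in Example 1.

```python
import math

def prediction_half_width(x, y, x0, t_crit):
    """Half-width of the prediction interval for an individual y at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    return t_crit * s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)

# Illustrative data (assumed, not taken from the slides); mean of x is 14.
scores = [12, 14, 17, 16, 11]
units = [55, 63, 67, 70, 51]

# Full interval width at the mean of x and at points progressively farther away.
widths = {
    x0: 2 * prediction_half_width(scores, units, x0, t_crit=3.182)
    for x0 in (14, 15, 17, 20)
}
for x0, width in widths.items():
    print(f"x={x0}: interval width {width:.2f}")
```

The interval is narrowest at the mean of x because the (x − x̄)² term is zero there.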
22 Testing and Estimation for the Slope
23 Testing and Estimation for the Slope
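A minimal sketch of the usual t test of H0: β1 = 0. The test statistic is t = b1 / s_b1, where s_b1 = s_e / √Σ(x_i − x̄)² is the standard error of the slope, and it is compared with the critical t value for df = n − 2. The data are illustrative assumptions, not values from the slides; the critical value 3.182 (two-tailed, 0.05 level, df = 3) is the one quoted in Example 1.

```python
import math

def slope_t_statistic(x, y):
    """Return (b1, s_b1, t) for the test of H0: population slope = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    s_b1 = s_e / math.sqrt(sxx)       # standard error of the slope
    return b1, s_b1, b1 / s_b1

# Illustrative data (assumed, not taken from the slides).
scores = [12, 14, 17, 16, 11]
units = [55, 63, 67, 70, 51]

T_CRIT = 3.182                        # two-tailed, alpha = 0.05, df = 3
b1, s_b1, t_stat = slope_t_statistic(scores, units)
reject_h0 = abs(t_stat) > T_CRIT      # two-tailed decision rule
print(f"b1={b1:.2f}, s_b1={s_b1:.3f}, t={t_stat:.2f}, reject H0: {reject_h0}")
```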
24 Example 1 Testing and Estimation for the Slope An equivalent method of testing the significance of the linear relationship is to examine whether the slope β1 of the population regression line could be zero. For the dexterity test data, the slope of the sample regression line was b1 = (1) Using the 0.05 level of significance, examine whether the slope of the population regression line could be zero. (2) Construct the 95% confidence interval for the slope of the population regression line.
25 Example 1 Testing and Estimation for the Slope
26 Example 1 Testing and Estimation for the Slope p-value: We reject the null hypothesis that β1 = 0.
27 Confidence interval for the Slope
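The confidence interval for the slope follows the usual pattern b1 ± t·s_b1, with s_b1 = s_e / √Σ(x_i − x̄)². The sketch below applies it to illustrative data assumed for this purpose (not taken from the slides), with the df = 3 critical value t = 3.182 quoted in Example 1.

```python
import math

def slope_confidence_interval(x, y, t_crit):
    """Confidence interval for the population slope: b1 +/- t * s_b1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)   # standard error of the slope
    return b1 - t_crit * s_b1, b1 + t_crit * s_b1

# Illustrative data (assumed, not taken from the slides).
scores = [12, 14, 17, 16, 11]
units = [55, 63, 67, 70, 51]

slope_low, slope_high = slope_confidence_interval(scores, units, t_crit=3.182)
print(f"95% CI for the slope: {slope_low:.2f} to {slope_high:.2f}")
```

An interval that excludes zero leads to the same conclusion as rejecting H0: β1 = 0 in the two-tailed test at the matching significance level.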
28 Example 1 Testing and Estimation for the Slope 95% Confidence Interval for the Slope of the Population Regression Line
29 Example 2 Fifty randomly selected students took a math aptitude test before they began their statistics course. The Statistics Department has three questions: (1) What linear regression equation best predicts statistics performance, based on math aptitude scores? (2) If a student made an 80 on the aptitude test, what grade would we expect him to make in statistics? (3) What are the confidence and prediction intervals for x = 80, using the 0.05 level of significance?
30 Example 2 Solution in Excel
31 Example 2
32 Example 2
33 Example 2
34 Example 2 Solution in STATISTICA
35 Example
36 Example
37 Example 2
38 Example 2 another way to plot the graphs
39 Example 2 another way to plot the graphs
40 Example 2 another way to plot the graphs Regression bands Prediction intervals Confidence intervals
41 Example
42 Example 2 If a student made an 80 on the aptitude test, what grade would we expect him to make in statistics? Construct the confidence and prediction intervals for x = 80 using the 0.05 level of significance.
43 Example 2