3.2 Least-Squares Regression

Linear (straight-line) relationships between two quantitative variables are common and easy to understand. Correlation measures the direction and strength of these relationships. When a scatterplot shows a linear relationship, we'd like to summarize the overall pattern by drawing a line on the scatterplot. A regression line summarizes the relationship between two variables, but only in a specific setting: when one of the variables helps explain or predict the other. Regression, unlike correlation, requires that we have an explanatory variable and a response variable.

Regression line - A regression line is a line that describes how a response variable y changes as an explanatory variable x changes. We often use a regression line to predict the value of y for a given value of x.

Example: Does Fidgeting Keep You Slim? Regression lines as models

Some people don't gain weight even when they overeat. Perhaps fidgeting and other nonexercise activity (NEA) explains why: some people may spontaneously increase nonexercise activity when fed more. Researchers deliberately overfed 16 healthy young adults for 8 weeks. They measured fat gain (in kilograms) as the response variable and change in energy use (in calories) from activity other than deliberate exercise (fidgeting, daily living, and the like) as the explanatory variable. Here are the data: [table not reproduced]

Do people with larger increases in NEA tend to gain less fat? The figure below is a scatterplot of these data. [figure not reproduced] The plot shows a moderately strong, negative linear association between NEA change and fat gain with no outliers. The correlation is r = [value missing]. The line on the plot is a regression line for predicting fat gain from change in NEA.

3.2.1 Interpreting a Regression Line

To "regress" means to go backward. Why are statistical methods for predicting a response from an explanatory variable called "regression"? Sir Francis Galton (1822-1911) looked at data on the heights of children versus the heights of their parents. He found that the taller-than-average parents tended to have children who were also taller than average but not as tall as their parents. Galton called this fact "regression toward the mean," and the name came to be applied to the statistical method.

A regression line is a model for the data, much like density curves. The equation of a regression line gives a compact mathematical description of what this model tells us about the relationship between the response variable y and the explanatory variable x.

Regression line - Suppose that y is a response variable (plotted on the vertical axis) and x is an explanatory variable (plotted on the horizontal axis). A regression line relating y to x has an equation of the form:

$\hat{y} = a + bx$

In this equation, $\hat{y}$ (read "y hat") is the predicted value of the response variable y for a given value of the explanatory variable x. b is the slope, the amount by which y is predicted to change when x increases by one unit. a is the y intercept, the predicted value of y when x = 0.

Although you are probably used to the form y = mx + b for the equation of a line from algebra, statisticians have adopted a different form for the equation of a regression line. Some use $\hat{y} = b_0 + b_1 x$. We prefer $\hat{y} = a + bx$ for two reasons: (1) it's simpler, and (2) your calculator uses this form. Don't get so caught up in the symbols that you lose sight of what they mean! The coefficient of x is always the slope, no matter what symbol is used.
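The roles of a and b in the regression equation can be checked with a few lines of Python. This is only a sketch: the function name `predict` and the coefficients are hypothetical, chosen for illustration and not taken from the fidgeting study.

```python
def predict(a: float, b: float, x: float) -> float:
    """Predicted response y-hat = a + b*x (a = y intercept, b = slope)."""
    return a + b * x

# Hypothetical coefficients, for illustration only.
a, b = 2.0, 0.5

y_hat_at_0 = predict(a, b, 0)                       # prediction at x = 0
y_hat_at_4 = predict(a, b, 4)                       # prediction at x = 4
change_per_unit = predict(a, b, 5) - predict(a, b, 4)  # effect of one more unit of x
```

The prediction at x = 0 is the intercept a, and raising x by one unit changes the prediction by exactly the slope b, whatever symbols are used for the two coefficients.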

Example: Does Fidgeting Keep You Slim? Interpreting the slope and y intercept

The regression line for the figure to the right is shown below: [equation not reproduced]

Identify the slope and y intercept of the regression line. Interpret each value in context.

The slope of a regression line is an important numerical description of the relationship between the two variables. Although we need the value of the y intercept to draw the line, it is statistically meaningful only when the explanatory variable can actually take values close to zero, as in this setting.

Does a small slope mean that there's no relationship? For the NEA and fat gain regression line, the slope b is a small number. This does not mean that change in NEA has little effect on fat gain. The size of the slope depends on the units in which we measure the two variables. In this setting, the slope is the predicted change in fat gain in kilograms when NEA increases by 1 calorie. There are 1000 grams in a kilogram. If we measured fat gain in grams, the slope would be 1000 times larger. You can't say how important a relationship is by looking at the size of the slope of the regression line.
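The effect of units on the slope can be seen by fitting the same data twice, once with the response in kilograms and once in grams. The sketch below uses made-up numbers (not the study data) and a helper `fit_least_squares` that is an assumption of this example, using the standard closed-form least-squares solution.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical data: x in calories, y in kilograms.
x_cal = [0, 100, 200, 300, 400]
y_kg = [3.6, 3.1, 2.9, 2.4, 2.1]
y_g = [1000 * v for v in y_kg]      # same responses, measured in grams

_, slope_kg = fit_least_squares(x_cal, y_kg)
_, slope_g = fit_least_squares(x_cal, y_g)
print(slope_kg, slope_g)            # the second slope is 1000 times the first
```

The relationship is identical in both fits; only the numerical size of the slope changes.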

3.2.2 Prediction

Example: Does Fidgeting Keep You Slim? Predicting with a regression line

For the NEA and fat gain data, the equation of the regression line is: [equation not reproduced]

If a person's NEA increases by 400 calories when she overeats, substitute x = 400 in the equation. The predicted fat gain is: [calculation not reproduced]

The accuracy of predictions from a regression line depends on how much the data scatter about the line. In this case, fat gains for similar changes in NEA show a spread of 1 or 2 kilograms. The regression line summarizes the pattern but gives only roughly accurate predictions.

Can we predict the fat gain for someone whose NEA increases by 1500 calories when she overeats? We can certainly substitute 1500 calories into the equation of the line. The prediction is: [calculation not reproduced]

Extrapolation - Extrapolation is the use of a regression line for prediction far outside the interval of values of the explanatory variable x used to obtain the line. Such predictions are often not accurate.

Few relationships are linear for all values of the explanatory variable. Don't make predictions using values of x that are much larger or much smaller than those that actually appear in your data.
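One way to guard against extrapolation is to check each requested x against the range of x-values used to fit the line. The sketch below is a simplification: it flags any x outside the observed range, whereas the definition above speaks of values "far outside" it. The coefficients and x-values are hypothetical round numbers, not the values from the study.

```python
def predict_with_range_check(a, b, x, x_observed):
    """Return (y_hat, is_extrapolation) for the line y-hat = a + b*x.

    is_extrapolation is True when x lies outside the range of the
    x-values that were used to obtain the line.
    """
    lo, hi = min(x_observed), max(x_observed)
    return a + b * x, not (lo <= x <= hi)

# Hypothetical intercept, slope, and observed x-values.
a, b = 3.5, -0.003
x_observed = [0, 100, 250, 400, 690]

y_400, flag_400 = predict_with_range_check(a, b, 400, x_observed)
y_1500, flag_1500 = predict_with_range_check(a, b, 1500, x_observed)
print(y_400, flag_400)     # inside the observed range
print(y_1500, flag_1500)   # flagged: far beyond the largest observed x
```

With these made-up numbers the prediction at x = 1500 is negative, which shows how a line can give implausible answers when pushed well past the data.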

CHECK YOUR UNDERSTANDING

Some data were collected on the weight of a male white laboratory rat for the first 25 weeks after its birth. A scatterplot of the weight (in grams) and time since birth (in weeks) shows a fairly strong, positive linear relationship. The linear regression equation [equation not reproduced] models the data fairly well.

1. What is the slope of the regression line? Explain what it means in context.
2. What's the y intercept? Explain what it means in context.
3. Predict the rat's weight after 16 weeks. Show your work.
4. Should you use this line to predict the rat's weight at age 2 years? Use the equation to make the prediction and think about the reasonableness of the result. (There are 454 grams in a pound.)

3.2.3 Residuals and the Least-Squares Regression Line

In most cases, no line will pass exactly through all the points in a scatterplot. Because we use the line to predict y from x, the prediction errors we make are errors in y, the vertical direction in the scatterplot. A good regression line makes the vertical distances of the points from the line as small as possible.

Look at the following example describing the relationship between body weight and backpack weight for a group of 8 hikers. The figure below shows a scatterplot of the data with a regression line added. The prediction errors are marked as bold segments in the graph. These vertical deviations represent "leftover" variation in the response variable after fitting the regression line. For that reason, they are called residuals.

Residual - A residual is the difference between an observed value of the response variable and the value predicted by the regression line. That is:

$\text{residual} = \text{observed } y - \text{predicted } y = y - \hat{y}$
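The definition of a residual translates directly into code. The following is a minimal sketch with made-up data and a made-up line (neither is the hiker data); the function name `residuals` is an assumption of this example.

```python
def residuals(xs, ys, a, b):
    """Observed y minus predicted y for each point, using y-hat = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical points and a hypothetical line y-hat = 1.0 + 1.5x.
xs = [1, 2, 3, 4]
ys = [3, 5, 4, 8]
res = residuals(xs, ys, a=1.0, b=1.5)
print(res)   # positive: point above the line; negative: point below it
```

A positive residual means the line underpredicted that observation, and a negative residual means it overpredicted.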

Example: Back to the Backpackers. Finding a residual

Find and interpret the residual for the hiker who weighed 187 pounds.

AP EXAM TIP There's no firm rule for how many decimal places to show for answers on the AP exam. Our advice: Give your answer correct to two or three nonzero decimal places. Exception: If you're using one of the tables in the back of the book, give the value shown in the table.

The line shown in the figure above makes the residuals for the 8 hikers "as small as possible." But what does that mean? Maybe this line minimizes the sum of the residuals. Actually, if we add up the prediction errors for all 8 hikers, the positive and negative residuals cancel out. That's the same issue we faced when we tried to measure deviation around the mean. We'll solve the current problem in much the same way: by squaring the residuals. The regression line we want is the one that minimizes the sum of the squared residuals. That's what the line shown in the above figure does for the hiker data, which is why we call it the least-squares regression line.

Least-squares regression line - The least-squares regression line of y on x is the line that makes the sum of the squared residuals as small as possible.

The figure at the right gives a geometric interpretation of the least-squares idea for the hiker data. The least-squares regression line shown minimizes the sum of the squared prediction errors. No other regression line would give a smaller sum of squared residuals.
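The claim that no other line gives a smaller sum of squared residuals can be spot-checked numerically. The sketch below fits a line with the standard closed-form least-squares solution and compares it with a handful of nearby lines. The data are made up, and the helper names `fit_least_squares` and `sse` are assumptions of this example.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

def sse(xs, ys, a, b):
    """Sum of squared residuals for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_least_squares(xs, ys)
best = sse(xs, ys, a, b)

# Nudge the intercept and slope and recompute the sum of squares.
others = [sse(xs, ys, a + da, b + db)
          for da in (-0.5, 0, 0.5)
          for db in (-0.2, 0, 0.2)
          if (da, db) != (0, 0)]
print(best, min(others))   # every perturbed line does worse
```

Checking eight nearby lines is not a proof, but it illustrates what "least squares" means: any change to the fitted intercept or slope increases the sum of squared residuals.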

CHECK YOUR UNDERSTANDING

It's time to practice your calculator regression skills. Using the familiar hiker data in the table below, calculate the least-squares regression line on your calculator. You should get [equation not reproduced] as the equation of the regression line.

3.2.4 Calculating the Equation of the Least-Squares Line

Another reason for studying the least-squares regression line is that the problem of finding its equation has a simple answer. We can give the equation of the least-squares regression line in terms of the means and standard deviations of the two variables and their correlation.

Equation of the least-squares regression line - We have data on an explanatory variable x and a response variable y for n individuals. From the data, calculate the means $\bar{x}$ and $\bar{y}$ and the standard deviations $s_x$ and $s_y$ of the two variables and their correlation r. The least-squares regression line is the line $\hat{y} = a + bx$ with slope

$b = r \dfrac{s_y}{s_x}$

and y intercept

$a = \bar{y} - b\bar{x}$

AP EXAM TIP The formula sheet for the AP exam uses different notation for these equations: $b_1 = r \dfrac{s_y}{s_x}$ and $b_0 = \bar{y} - b_1 \bar{x}$. That's because the least-squares line is written as $\hat{y} = b_0 + b_1 x$. We prefer our simpler versions without the subscripts.
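The two formulas can be applied directly once the five summary statistics are known. The sketch below computes them from a small made-up data set using Python's `statistics` module; the helper names `correlation` and `line_from_summary` are assumptions of this example.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r as the average product of standardized values."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / sx) * ((y - y_bar) / sy)
               for x, y in zip(xs, ys)) / (n - 1)

def line_from_summary(x_bar, y_bar, sx, sy, r):
    """Least-squares intercept and slope from the summary statistics."""
    b = r * sy / sx          # slope: b = r * s_y / s_x
    a = y_bar - b * x_bar    # intercept: a = y-bar - b * x-bar
    return a, b

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
a, b = line_from_summary(mean(xs), mean(ys), stdev(xs), stdev(ys), r)
print(r, a, b)
```

Because $a = \bar{y} - b\bar{x}$, the fitted line always passes through the point $(\bar{x}, \bar{y})$.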

What does the slope of the least-squares line tell us? The figure below shows the regression line in black for the hiker data. We have added four more lines to the graph: a vertical line at the mean body weight $\bar{x}$; a vertical line at $\bar{x} + s_x$ (one standard deviation above the mean body weight); a horizontal line at the mean pack weight $\bar{y}$; and a horizontal line at $\bar{y} + s_y$ (one standard deviation above the mean pack weight). Note that the regression line passes through $(\bar{x}, \bar{y})$ as expected.

From the graph, the slope of the line is:

$\text{slope} = \dfrac{??}{s_x}$

From the definition box, we know that the slope is

$b = r \dfrac{s_y}{s_x}$

Setting the two formulas equal to each other, we have

$\dfrac{??}{s_x} = r \dfrac{s_y}{s_x}$

So the unknown distance ?? above must be equal to $r \cdot s_y$. In other words, for an increase of one standard deviation in the value of the explanatory variable x, the least-squares regression line predicts an increase of r standard deviations in the response variable y.

There is a close connection between correlation and the slope of the least-squares line. The slope is

$b = r \dfrac{s_y}{s_x}$

This equation says that along the regression line, a change of one standard deviation in x corresponds to a change of r standard deviations in y. When the variables are perfectly correlated (r = 1 or r = -1), the change in the predicted response $\hat{y}$ is the same (in standard deviation units) as the change in x. Otherwise, because $-1 \le r \le 1$, the change in $\hat{y}$ is less than the change in x. As the correlation grows less strong, the prediction $\hat{y}$ moves less in response to changes in x.

Example: Fat Gain and NEA. Calculating the least-squares regression line

Refer to the data from the example below: [table not reproduced]

The mean and standard deviation of the 16 changes in NEA are $\bar{x}$ = [value missing] calories (cal) and $s_x$ = [value missing] cal. For the 16 fat gains, the mean and standard deviation are $\bar{y}$ = [value missing] kg and $s_y$ = [value missing] kg. The correlation between fat gain and NEA change is r = [value missing].

(a) Find the equation of the least-squares regression line for predicting fat gain from NEA change. Show your work.

(b) What change in fat gain does the regression line predict for each additional [value missing] cal of NEA? Explain.

What happens if we standardize both variables? Standardizing a variable converts its mean to 0 and its standard deviation to 1. Doing this to both x and y will transform the point $(\bar{x}, \bar{y})$ to (0, 0). So the least-squares line for the standardized values will pass through (0, 0). What about the slope of this line? From the formula, it's $b = r s_y / s_x$. Since we standardized, $s_x = s_y = 1$. That means b = r. In other words, the slope is equal to the correlation. The Fathom screen shot confirms these results. It shows that $r^2 = 0.63$, so $r = \sqrt{0.63} \approx 0.79$.
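The standardization argument can be verified numerically: after converting both variables to z-scores, the fitted intercept should be 0 and the fitted slope should equal r. The sketch below uses made-up data; the helper names are assumptions of this example.

```python
from statistics import mean, stdev

def standardize(values):
    """Convert values to z-scores (mean 0, standard deviation 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def fit_least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

zx, zy = standardize(xs), standardize(ys)
a_z, b_z = fit_least_squares(zx, zy)

# Correlation computed separately, as the average product of z-scores.
r = sum(p * q for p, q in zip(zx, zy)) / (len(xs) - 1)
print(a_z, b_z, r)   # intercept near 0; slope equal to r
```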

3.2.5 How Well the Line Fits the Data: Residual Plots

Example: Does Fidgeting Keep You Slim? Examining residuals

Let's return to the fat gain and NEA study involving 16 young people who volunteered to overeat for 8 weeks. Those whose NEA rose substantially gained less fat than others. We confirmed that the least-squares regression line for these data is [equation not reproduced]. The calculator screen shot above shows a scatterplot of the data with the least-squares line added.

One subject's NEA rose by 135 cal. That subject gained 2.7 kg of fat. (This point is marked in the screen shot with an X.) The predicted fat gain for 135 cal is: [calculation not reproduced]

The residual for this subject is therefore: [calculation not reproduced]

This residual is negative because the data point lies below the line.

The 16 data points used in calculating the least-squares line produce 16 residuals. Rounded to two decimal places, they are: [values not reproduced]

Because the residuals show how far the data fall from our regression line, examining the residuals helps assess how well the line describes the data. Although residuals can be calculated from any model that is fitted to the data, the residuals from the least-squares line have a special property: the mean of the least-squares residuals is always zero. You can check that the sum of the residuals in the above example is [value missing]. The sum is not exactly 0 because we rounded to two decimal places.
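The zero-mean property of least-squares residuals is easy to confirm on any data set. The sketch below uses made-up data (not the 16 subjects) and an assumed helper `fit_least_squares`; without rounding, the sum differs from zero only by floating-point error.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_least_squares(xs, ys)
res = [y - (a + b * x) for x, y in zip(xs, ys)]

residual_sum = sum(res)
residual_mean = residual_sum / len(res)
print(residual_sum, residual_mean)   # both essentially zero
```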

You can see the residuals in the scatterplot of (a) by looking at the vertical deviations of the points from the line. The residual plot in (b) makes it easier to study the residuals by plotting them against the explanatory variable, change in NEA. Because the mean of the residuals is always zero, the horizontal line at zero in (b) helps orient us. This "residual = 0" line corresponds to the regression line in (a).

Residual plot - A residual plot is a scatterplot of the residuals against the explanatory variable. Residual plots help us assess how well a regression line fits the data.

CHECK YOUR UNDERSTANDING

Refer to the data below: [table not reproduced]

1. Find the residual for the subject who increased NEA by 620 calories. Show your work.
2. Interpret the value of this subject's residual in context.
3. For which subject did the regression line overpredict fat gain by the most? Justify your answer.

Examining residual plots

A residual plot in effect turns the regression line horizontal. It magnifies the deviations of the points from the line, making it easier to see unusual observations and patterns. If the regression line captures the overall pattern of the data, there should be no pattern in the residuals. Figure (a) shows a residual plot with a clear curved pattern. A straight line is not an appropriate model for these data, as Figure (b) confirms.

Here are two important things to look for when you examine a residual plot.

1. The residual plot should show no obvious pattern. Ideally, the residual plot will look something like the one in the figure to the right below. This graph shows an unstructured (random) scatter of points in a horizontal band centered at zero. A curved pattern in a residual plot shows that the relationship is not linear. Another type of pattern is shown in the figure to the left. This residual plot reveals increasing spread about the regression line as x increases. Predictions of y using this line will be less accurate for larger values of x.

2. The residuals should be relatively small in size. A regression line that fits the data well should come close to most of the points. That is, the residuals should be fairly small. How do we decide whether the residuals are small enough? We consider the size of a typical prediction error.
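The curved pattern described in point 1 can be reproduced without drawing anything. The sketch below fits a straight line to made-up data that follow y = x squared exactly, then lists the residuals in order of x; the helper `fit_least_squares` is an assumption of this example.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical curved data: y = x**2, so a straight line is the wrong model.
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

a, b = fit_least_squares(xs, ys)
res = [y - (a + b * x) for x, y in zip(xs, ys)]
signs = ["+" if r > 0 else "-" for r in res]
print(res)     # residuals listed from smallest x to largest x
print(signs)   # positive at both ends, negative in the middle
```

Residuals that are positive at both ends and negative in the middle trace out a U shape in the residual plot, the signature of a relationship that is not linear even though the residuals still average to zero.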

In the figure above, for example, most of the residuals are between -0.7 and 0.7. For these individuals, the predicted fat gain from the least-squares line is within 0.7 kilogram (kg) of their actual fat gain during the study. That sounds pretty good. But the subjects gained only between 0.4 and 4.2 kg, so a prediction error of 0.7 kg is relatively large compared with the actual fat gain for an individual. The largest residual, 1.64, corresponds to a prediction error of 1.64 kg. This subject's actual fat gain was 3.8 kg, but the regression line predicted a fat gain of only 2.16 kg. That's a pretty large error, especially from the subject's perspective!

Standard deviation of the residuals

We have already seen that the average prediction error (that is, the mean of the residuals) is 0 whenever we use a least-squares regression line. That's because the positive and negative residuals "balance out." But that doesn't tell us how far off the predictions are, on average. Instead, we use the standard deviation of the residuals:

$s = \sqrt{\dfrac{\sum \text{residuals}^2}{n - 2}}$

For the NEA and fat gain data, the sum of the squared residuals is [value missing]. So the standard deviation of the residuals is: [calculation not reproduced]

Standard deviation of the residuals - If we use a least-squares line to predict the values of a response variable y from an explanatory variable x, the standard deviation of the residuals (s) is given by:

$s = \sqrt{\dfrac{\sum \text{residuals}^2}{n - 2}} = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$
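The formula for s is short enough to compute by hand, but a sketch in code makes the n - 2 divisor explicit. The data below are made up, and the helper names are assumptions of this example.

```python
import math

def fit_least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

def residual_sd(xs, ys, a, b):
    """Standard deviation of the residuals: sqrt(sum of squares / (n - 2))."""
    sum_sq = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sum_sq / (len(xs) - 2))

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_least_squares(xs, ys)
s = residual_sd(xs, ys, a, b)
print(s)   # roughly the size of a typical prediction error, in units of y
```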

CHECK YOUR UNDERSTANDING

The graph shown is a residual plot for the least-squares regression of pack weight on body weight for the 8 hikers.

1. The residual plot does not show a random scatter. Describe the pattern you see.
2. For this regression, s = [value missing]. Interpret this value in context.

3.2.6 How Well the Line Fits the Data: The Role of $r^2$ in Regression

A residual plot is a graphical tool for evaluating how well a regression line fits the data. The standard deviation of the residuals, s, gives us a numerical estimate of the average size of our prediction errors from the regression line. There is another numerical quantity that tells us how well the least-squares line predicts values of the response variable y. It is $r^2$, the coefficient of determination. Some computer packages call it "R-sq." You may have noticed this value in some of the calculator and computer regression output that we showed earlier. Although it's true that $r^2$ is equal to the square of r, there is much more to this story.

Example: Pack weight and body weight. How can we predict y if we don't know x?

Suppose a new student is assigned at the last minute to our group of 8 hikers. What would we predict for his pack weight? The figure above shows a scatterplot of the hiker data that we have studied throughout this chapter. The least-squares line is drawn on the plot in green. Another line has been added in blue: a horizontal line at the mean y-value, $\bar{y}$. If we don't know this new student's body weight, then we can't use the regression line to make a prediction. What should we do? Our best strategy is to use the mean pack weight of the other 8 hikers as our prediction.

The figure above (a) shows the prediction errors if we use the average pack weight $\bar{y}$ as our prediction for the original group of 8 hikers. We can see that the sum of the squared residuals for this line is $\text{SST} = \sum (y_i - \bar{y})^2$ = [value missing]. SST measures the total variation in the y-values.

If we learn our new hiker's body weight, then we could use the least-squares line to predict his pack weight. How much better does the regression line do at predicting pack weights than simply using the average pack weight $\bar{y}$ of all 8 hikers? Figure (b) reminds us that the sum of squared residuals for the least-squares line is $\sum \text{residual}^2$ = [value missing]. We'll call this SSE, for sum of squared errors.

The ratio SSE/SST tells us what proportion of the total variation in y still remains after using the regression line to predict the values of the response variable. In this case, SSE/SST = 0.368. This means that 36.8% of the variation in pack weight is unaccounted for by the least-squares regression line. Taking this one step further, the proportion of the total variation in y that is accounted for by the regression line is

$1 - \dfrac{\text{SSE}}{\text{SST}} = 1 - 0.368 = 0.632$

We interpret this by saying that 63.2% of the variation in backpack weight is accounted for by the linear model relating pack weight to body weight. For this reason, we define:

Coefficient of determination - The coefficient of determination $r^2$ is the fraction of the variation in the values of y that is accounted for by the least-squares regression line of y on x. We can calculate $r^2$ using the following formula:

$r^2 = 1 - \dfrac{\text{SSE}}{\text{SST}}$

where $\text{SSE} = \sum \text{residual}^2$ and $\text{SST} = \sum (y_i - \bar{y})^2$.

It seems pretty remarkable that the coefficient of determination is actually the correlation squared. This fact provides an important connection between correlation and regression. When you report a regression, give $r^2$ as a measure of how successful the regression was in explaining the response. When you see a correlation, square it to get a better feel for the strength of the linear relationship.
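The link between 1 - SSE/SST and the squared correlation can be confirmed numerically. The sketch below computes both quantities independently on made-up data (not the hiker data); the variable names are assumptions of this example.

```python
from statistics import mean, stdev

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
sx, sy = stdev(xs), stdev(ys)

# Correlation as the average product of standardized values.
r = sum(((x - x_bar) / sx) * ((y - y_bar) / sy)
        for x, y in zip(xs, ys)) / (n - 1)

# Least-squares line from the summary statistics.
b = r * sy / sx
a = y_bar - b * x_bar

# SST: squared errors when predicting y-bar for everyone.
sst = sum((y - y_bar) ** 2 for y in ys)
# SSE: squared errors when predicting with the regression line.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

r_squared = 1 - sse / sst
print(sse / sst, r_squared, r ** 2)   # last two values agree
```

Here SSE/SST is the share of variation left over after using the line, and its complement matches the square of r.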

CHECK YOUR UNDERSTANDING

1. For the least-squares regression of fat gain on NEA, $r^2 = 0.606$. Which of the following gives a correct interpretation of this value in context?
(a) 60.6% of the points lie on the least-squares regression line.
(b) 60.6% of the fat gain values are accounted for by the least-squares line.
(c) 60.6% of the variation in fat gain is accounted for by the least-squares line.
(d) 77.8% of the variation in fat gain is accounted for by the least-squares line.

2. A recent study discovered that the correlation between the age at which an infant first speaks and the child's score on an IQ test given upon entering elementary school is [value missing]. A scatterplot of the data shows a linear form. Which of the following statements about this finding is correct?
(a) Infants who speak at very early ages will have higher IQ scores by the beginning of elementary school than those who begin to speak later.
(b) 68% of the variation in IQ test scores is explained by the least-squares regression of age at first spoken word and IQ score.
(c) Encouraging infants to speak before they are ready can have a detrimental effect later in life, as evidenced by their lower IQ scores.
(d) There is a moderately strong, negative linear relationship between age at first spoken word and later IQ test score for the individuals in this study.

3.2.7 Interpreting Computer Regression Output

The figure above displays the basic regression output for the NEA data from two statistical software packages: Minitab and JMP. Other software produces very similar output. Each output records the slope and y intercept of the least-squares line. The software also provides information that we don't yet need (or understand!), although we will use much of it later. Be sure that you can locate the slope, the y intercept, and the values of s and $r^2$ on both computer outputs. Once you understand the statistical ideas, you can read and work with almost any software output.

AP EXAM TIP Students often have a hard time interpreting the value of $r^2$ on AP exam questions. They frequently leave out key words in the definition. Our advice: Treat this as a fill-in-the-blank exercise. Write "____% of the variation in [response variable name] is accounted for by the regression line."

Example: Beer and Blood Alcohol. Interpreting regression output

How well does the number of beers a person drinks predict his or her blood alcohol content (BAC)? Sixteen volunteers with an initial BAC of 0 drank a randomly assigned number of cans of beer. Thirty minutes later, a police officer measured their BAC. Least-squares regression was performed on the data. A scatterplot with the regression line added, a residual plot, and some computer output from the regression are shown below.

(a) What is the equation of the least-squares regression line that describes the relationship between beers consumed and blood alcohol content? Define any variables you use.
(b) Interpret the slope of the regression line in context.

(c) Find the correlation.
(d) Is a line an appropriate model to use for these data? What information tells you this?
(e) What was the BAC reading for the person who consumed 9 beers? Show your work.

3.2.8 Correlation and Regression Wisdom

Correlation and regression are powerful tools for describing the relationship between two variables. When you use these tools, you should be aware of their limitations.

1. The distinction between explanatory and response variables is important in regression. This isn't true for correlation: switching x and y doesn't affect the value of r. Least-squares regression makes the distances of the data points from the line small only in the y direction. If we reverse the roles of the two variables, we get a different least-squares regression line.

Example: Predicting Fat Gain, Predicting NEA. Two different regression lines

Figure (a) repeats the scatterplot of the NEA data with the least-squares regression line for predicting fat gain from change in NEA added. We might also use the data on these 16 subjects to predict the NEA change for another subject from that subject's fat gain when overfed for 8 weeks. Now the roles of the variables are reversed: fat gain is the explanatory variable and change in NEA is the response variable. Figure (b) shows a scatterplot of these data with the least-squares line for predicting NEA change from fat gain. The two regression lines are very different. However, no matter which variable we put on the x axis, $r^2$ = [value missing] and the correlation is r = [value missing].

2. Correlation and regression lines describe only linear relationships. You can calculate the correlation and the least-squares line for any relationship between two quantitative variables, but the results are useful only if the scatterplot shows a linear pattern. Always plot your data!
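Returning to point 1 above, the asymmetry between the two regression lines, and the symmetry of r, can be checked on a small made-up data set. The helper names in this sketch are assumptions of the example.

```python
import math

def slope(xs, ys):
    """Least-squares slope for predicting ys from xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def correlation(xs, ys):
    """Correlation r; the formula treats xs and ys symmetrically."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b_y_on_x = slope(xs, ys)   # line for predicting y from x
b_x_on_y = slope(ys, xs)   # line for predicting x from y
r_xy = correlation(xs, ys)
r_yx = correlation(ys, xs)
print(b_y_on_x, b_x_on_y, r_xy, r_yx)
```

Drawn on the same axes, the second line has slope 1 / b_x_on_y in the (x, y) plane, which differs from b_y_on_x unless the correlation is perfect. The product of the two fitted slopes equals $r^2$.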

3. Correlation and least-squares regression lines are not resistant. You already know that the correlation r is not resistant. One unusual point in a scatterplot can greatly change the value of r. Is the least-squares line resistant? Not surprisingly, the answer is no. The following example sheds some light on this issue.

Example: Gesell Scores. Dealing with unusual points in regression

Does the age at which a child begins to talk predict a later score on a test of mental ability? A study of the development of young children recorded the age in months at which each of 21 children spoke their first word and their Gesell Adaptive Score, the result of an aptitude test taken much later. The data appear in the table below. [table not reproduced]

STATE: Can we use a child's age at first word to predict his or her Gesell score? How accurate will our predictions be?

PLAN: Let's start by making a scatterplot with age at first word as the explanatory variable and Gesell score as the response variable. If the graph shows a linear form, we'll fit a least-squares line to the data. Then we should make a residual plot. The residuals, $r^2$, and s will tell us how well the line fits the data and how large our prediction errors will be.

DO: The figure below shows a scatterplot of the data. Children 3 and 13, and also Children 16 and 21, have identical values of both variables. We used a different plotting symbol to show that one point stands for two individuals. The scatterplot shows a negative association. That is, children who begin to speak later tend to have lower test scores than early talkers. The overall pattern is moderately linear (a calculator gives r = -0.640). There are two outliers on the scatterplot: Child 18 and Child 19. These two children are unusual in different ways. Child 19 is an outlier in the y direction, with a Gesell score so high that we should check for a mistake in recording it. (In fact, the score is correct.) Child 18 is an outlier in the x direction. This child began to speak much later than any of the other children.

We used a calculator to perform least-squares regression. The equation of the least-squares line is [equation not reproduced]. We added this line to the scatterplot in figure (a) above. The slope suggests that for every month older a child is when she first speaks, her Gesell score is predicted to decrease by [value missing] points. Since a child isn't going to speak her first word at age 0 months, the y intercept of this line has no statistical meaning.

How well does the least-squares line fit the data? Figure (b) above shows a residual plot. The graph shows a fairly random scatter of points around the "residual = 0" line with one very large positive residual (Child 19). Most of the prediction errors (residuals) are 10 points or fewer on the Gesell score. We calculated the standard deviation of the residuals to be s = [value missing]. This is roughly the size of an average prediction error using the regression line. Since $r^2 = 0.41$, 41% of the variation in Gesell scores is accounted for by the least-squares regression of Gesell score on age at first spoken word. That leaves 59% of the variation in Gesell scores unaccounted for by the linear relationship for these data.

CONCLUDE: We can use the equation [equation not reproduced] to predict a child's score on the Gesell test from the age at which the child first speaks. Our predictions may not be very accurate, though. On average, we'll be off by about 11 points on the Gesell score. Also, most of the variation in Gesell score from child to child is not accounted for by this linear model. We should hesitate to use this model to make predictions, especially until we better understand the effect of the two outliers on the regression results.

In the previous example, Child 18 and Child 19 were identified as outliers in the scatterplot of figure (a). These points are also marked in the residual plot of figure (b). Child 19 has a very large residual because this point lies far from the regression line. However, Child 18 has a pretty small residual. That's because Child 18's point is close to the line.

How do these two outliers affect the regression? The figure below shows the results of removing each of these points on the correlation and the regression line. The graph adds two more regression lines, one calculated after leaving out Child 18 and the other after leaving out Child 19. You can see that removing the point for Child 18 moves the line quite a bit. (In fact, the equation of the new least-squares line is [equation not reproduced].) Because of Child 18's extreme position on the age scale, this point has a strong influence on the position of the regression line. However, removing Child 19 has little effect on the regression line.
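The different effects of the two kinds of outlier can be reproduced on a tiny made-up data set (this is not the Gesell data). The sketch below adds one point at a time to five base points and refits the line; the helper `fit_least_squares` is an assumption of this example.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical base data with a mild positive trend.
base_x = [1, 2, 3, 4, 5]
base_y = [5, 4, 6, 5, 6]

# Add an outlier in the x direction: far to the right of the other points.
x_out_x, x_out_y = base_x + [20], base_y + [1]
# Add an outlier in the y direction: typical x, unusually high y.
y_out_x, y_out_y = base_x + [3], base_y + [12]

_, slope_base = fit_least_squares(base_x, base_y)
_, slope_x_outlier = fit_least_squares(x_out_x, x_out_y)
_, slope_y_outlier = fit_least_squares(y_out_x, y_out_y)
print(slope_base, slope_x_outlier, slope_y_outlier)
```

In this constructed case the single point far out in x reverses the sign of the slope, while the point that is extreme only in y leaves the slope unchanged and shifts only the intercept. Real data will rarely be this clean, but the contrast mirrors the one between Child 18 and Child 19.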

Outliers and influential observations in regression

An outlier is an observation that lies outside the overall pattern of the other observations. Points that are outliers in the y direction but not the x direction of a scatterplot have large residuals. Other outliers may not have large residuals.

An observation is influential for a statistical calculation if removing it would markedly change the result of the calculation. Points that are outliers in the x direction of a scatterplot are often influential for the least-squares regression line.

We finish with our most important caution about correlation and regression.

4. Association does not imply causation. When we study the relationship between two variables, we often hope to show that changes in the explanatory variable cause changes in the response variable. A strong association between two variables is not enough to draw conclusions about cause and effect. Sometimes an observed association really does reflect cause and effect. A household that heats with natural gas uses more gas in colder months because cold weather requires burning more gas to stay warm. In other cases, an association is explained by lurking variables, and the conclusion that x causes y is not valid.

Example Does Having More Cars Make You Live Longer? Association, not causation

A serious study once found that people with two cars live longer than people who own only one car. Owning three cars is even better, and so on. There is a substantial positive correlation between number of cars x and length of life y. The basic meaning of causation is that by changing x we can bring about a change in y. Could we lengthen our lives by buying more cars? No. The study used number of cars as a quick indicator of wealth. Well-off people tend to have more cars. They also tend to live longer, probably because they are better educated, take better care of themselves, and get better medical care. The cars have nothing to do with it. There is no cause-and-effect tie between number of cars and length of life.

Correlations such as those in the previous example are sometimes called nonsense correlations. The correlation is real. What is nonsense is the conclusion that changing one of the variables causes changes in the other. A lurking variable that influences both x and y, such as personal wealth in this example, can create a high correlation even though there is no direct connection between x and y.

Remember: It only makes sense to talk about the correlation between two quantitative variables. If one or both variables are categorical, you should refer to the association between the two variables. To be safe, you can use the more general term association when describing the relationship between any two variables.
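The way a lurking variable can produce a correlation with no direct connection between x and y can be imitated with simulated data. The sketch below is only an illustration under invented assumptions: a hypothetical wealth score between 0 and 1 drives both the number of cars and the length of life, and the number of cars never enters the formula for length of life. None of the numbers come from the study mentioned in the example.

```python
import random

def correlation(xs, ys):
    """Pearson correlation r between two lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

random.seed(0)  # fixed seed so the simulation is repeatable

# Hypothetical lurking variable: a wealth score between 0 and 1.
wealth = [random.uniform(0, 1) for _ in range(2000)]

# Both variables depend on wealth plus noise; life does NOT depend on cars.
cars = [max(0, round(1 + 3 * w + random.gauss(0, 0.5))) for w in wealth]
life = [70 + 12 * w + random.gauss(0, 3) for w in wealth]

# Correlation between cars and length of life over everyone.
r_all = correlation(cars, life)

# Correlation among people with nearly the same wealth.
band = [(c, y) for w, c, y in zip(wealth, cars, life) if 0.45 <= w <= 0.55]
r_band = correlation([c for c, _ in band], [y for _, y in band])

print("r, all individuals:     ", round(r_all, 2))
print("r, similar wealth only: ", round(r_band, 2))
```

Over all simulated individuals the correlation between cars and length of life is clearly positive, yet among individuals with nearly the same wealth it is close to zero. Giving a simulated person more cars would change nothing, because length of life was generated from wealth alone.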


More information

10.1 Estimating with Confidence. Chapter 10 Introduction to Inference

10.1 Estimating with Confidence. Chapter 10 Introduction to Inference 10.1 Estimating with Confidence Chapter 10 Introduction to Inference Statistical Inference Statistical inference provides methods for drawing conclusions about a population from sample data. Two most common

More information

Math 124: Module 2, Part II

Math 124: Module 2, Part II , Part II David Meredith Department of Mathematics San Francisco State University September 15, 2009 What we will do today 1 Explanatory and Response Variables When you study the relationship between two

More information

Two-Way Independent ANOVA

Two-Way Independent ANOVA Two-Way Independent ANOVA Analysis of Variance (ANOVA) a common and robust statistical test that you can use to compare the mean scores collected from different conditions or groups in an experiment. There

More information

Math 075 Activities and Worksheets Book 2:

Math 075 Activities and Worksheets Book 2: Math 075 Activities and Worksheets Book 2: Linear Regression Name: 1 Scatterplots Intro to Correlation Represent two numerical variables on a scatterplot and informally describe how the data points are

More information

Statistics and Probability

Statistics and Probability Statistics and a single count or measurement variable. S.ID.1: Represent data with plots on the real number line (dot plots, histograms, and box plots). S.ID.2: Use statistics appropriate to the shape

More information

Chapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc.

Chapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc. Chapter 23 Inference About Means Copyright 2010 Pearson Education, Inc. Getting Started Now that we know how to create confidence intervals and test hypotheses about proportions, it d be nice to be able

More information