Chapter 4: More about Relationships between Two Variables

1. Which of the following scatterplots corresponds to a monotonic decreasing function f(t)?
[Scatterplots A) through E) not reproduced.]

2. Which of the following transformations is monotonic increasing?
A) I transform gas mileage in miles per gallon to gallons per mile.
B) I transform a salary from dollars per week to the time required to earn a dollar.
C) I transform the outside temperature from degrees Fahrenheit to degrees Centigrade.
D) I transform the number of seconds it takes runners to run 100 yards to the number of yards each runs per second.
E) All of the above.

3. The transformation displayed in the scatterplot below is
[Scatterplot not reproduced.]
A) concave up.
B) concave down.
C) an example of logarithmic growth.
D) an example of linear growth.
E) an example of exponential growth.

4. Which of the following is true?
A) log(AB) = log A log B.
B) log(A + B) = log A + log B.
C) log A B = log A log B.
D) log(A/B) = log A log B.
E) All of the above.

5. Suppose we measure a response variable Y at each of several times. A scatterplot of log Y versus time of measurement looks approximately like a positively sloping straight line. We may conclude that
A) the correlation between time of measurement and Y is negative, since logarithms of positive fractions (such as correlations) are negative.
B) the rate of growth of Y is positive, but slowing down over time.
C) a logarithmic growth model would approximately describe the relationship between Y and the time of measurement.
D) a mistake has been made. It would have been better to plot Y versus the logarithm of the time of measurement.
E) an exponential growth model would approximately describe the relationship between Y and time of measurement.

6. Using least-squares regression, I determine that the (base 10) logarithm of the population of a country is approximately described by the equation
predicted log(population) = -13.5 + 0.01 (year)
Based on this equation, the population of the country in the year 2010 should be about
A) 6.6.
B) 735.
C) 2,000,000.
D) 3,981,072.
E) 33,000,000.

7. Which of the following would provide evidence that a power law model describes the relationship between a response variable y and an explanatory variable x?
A) A scatterplot of y versus x looks approximately linear.
B) A scatterplot of log y versus x looks approximately linear.
C) A scatterplot of y versus log x looks approximately linear.
D) A scatterplot of log y versus log x looks approximately linear.
E) A scatterplot of the square root of y versus x looks approximately linear.
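As a worked illustration of the back-transformation that question 6 relies on, the following minimal Python sketch converts a prediction made on the base-10 log scale back to the original scale. The function name and the coefficients are hypothetical, chosen only for illustration; they are not the values from the question.

```python
def predict_from_log10_fit(intercept: float, slope: float, x: float) -> float:
    """Back-transform a fitted line log10(y) = intercept + slope * x."""
    log_y = intercept + slope * x  # prediction on the log scale
    return 10 ** log_y             # undo the base-10 logarithm

# Hypothetical coefficients (an assumption for this sketch only).
A_HYP, B_HYP = -10.0, 0.008

log_pred = A_HYP + B_HYP * 2000                      # about 6 on the log scale
pop_pred = predict_from_log10_fit(A_HYP, B_HYP, 2000)  # about one million
print(log_pred, pop_pred)
```

The point of the sketch is that the regression output is a logarithm, so the prediction on the original scale is 10 raised to that value, not the value itself.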

8. Which of the following scatterplots would indicate that Y is growing exponentially over time?
[Scatterplots A) through E) not reproduced.]

9. A scatterplot of a response variable Y versus an explanatory variable X is given below. Which of the following is true?
[Scatterplot not reproduced.]
A) There is a nonlinear relationship between Y and X.
B) There is a very strong positive correlation between Y and X because there is an obvious relationship between these variables.
C) There is a monotonic relationship between Y and X.
D) There is a strong quadratic relationship between Y and X.
E) All of the above.

10. Suppose the relationship between a response variable y and a predictor variable x is approximately
y = 2.7 × 10^(0.5x)
Which of the following plots would approximately follow a straight line?
A) A plot of y against x.
B) A plot of y against log x.
C) A plot of log y against x.
D) A plot of 10^y against x.
E) A plot of log y against log x.

11. A scatterplot of the world record time for women in the 10,000-meter run versus the year in which the record was set appears below. Note that the time is in seconds and the data are for the period 1965 to 1995. Based on this plot, we can expect
[Scatterplot not reproduced.]
A) that by 2005, the world record time for women will be well below 1500 seconds.
B) that by 2005, the world record time should level out at about 1700 seconds.
C) that about every decade, we can expect the world record time to decrease by about 50 seconds.
D) that about every decade, we can expect the world record time to decrease by at least 100 seconds.
E) none of the above.
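Taking the model in question 10 to be y = 2.7 × 10^(0.5x), this short Python sketch illustrates how taking logarithms straightens an exponential relationship. The function name `model` and the chosen x values are illustrative assumptions.

```python
import math

def model(x: float) -> float:
    """Exponential model from question 10, read as y = 2.7 * 10**(0.5 * x)."""
    return 2.7 * 10 ** (0.5 * x)

xs = [0, 1, 2, 3, 4]
log_ys = [math.log10(model(x)) for x in xs]

# If log10(y) is linear in x, equal steps in x give equal steps in log10(y).
steps = [later - earlier for earlier, later in zip(log_ys, log_ys[1:])]
print(steps)       # each step is the slope of the straightened line
print(log_ys[0])   # value at x = 0 is the intercept, log10(2.7)
```

Algebraically, log y = log 2.7 + 0.5x, so the constant steps equal the exponent's coefficient and the intercept is log 2.7.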

12. Researchers studied a sample of 100 adults between the ages of 25 and 35 and found a strong negative correlation between the amount of vitamin C an individual consumed and the number of pounds the individual was overweight. Which of the following may we conclude?
A) This is strong, but not conclusive, evidence that large amounts of vitamin C inhibit weight gain.
B) If the amount of vitamin C consumed and the number of pounds overweight for each individual in this study were plotted on a scatterplot, the points would lie close to a negatively sloping straight line.
C) If a larger sample of adults between the ages of 25 and 35 had been studied, the correlation would have been even stronger.
D) If people consumed more vitamin C, they would likely lose more weight.
E) All of the above.

13. The owner of a chain of supermarkets notices that there is a positive correlation between the sales of beer and the sales of ice cream over the course of the previous year. During seasons when sales of beer were above average, sales of ice cream also tended to be above average. Likewise, during seasons when sales of beer were below average, sales of ice cream also tended to be below average. Which of the following would be a valid conclusion from these facts?
A) The sales records must be in error. There should be no association between beer and ice cream sales.
B) Temperature is clearly a lurking variable when considering sales of beer and ice cream.
C) A scatterplot of monthly ice cream sales versus monthly beer sales would show that a straight line describes the pattern in the plot, but it would have to be a horizontal line.
D) Evidently, for a significant proportion of customers of these supermarkets, drinking beer causes a desire for ice cream or eating ice cream causes a thirst for beer.
E) None of the above.

14. A researcher studies the relationship between the total SAT score (SAT math score plus SAT verbal score) and the grade point average (GPA) of college students at the end of their freshman year. In order to use a relatively homogeneous group of students, the researcher only examines data from high school valedictorians (students who graduated at the top of their high school class) who have completed their first year of college. The researcher finds the correlation between total SAT score and GPA at the end of the freshman year to be very close to 0. Which of the following would be a valid conclusion from these facts?
A) Since the group of students studied is a very homogeneous group of students, the results should give a very accurate estimate of the correlation the researcher would find if all college students who have completed their freshman year were studied.
B) The correlation we would find if all college students who have completed their freshman year were studied would be even smaller than that found by the researcher. By restricting to valedictorians, the researcher is examining a group that will be more informative than those students who have only completed their freshman year.
C) The researcher made a mistake. Correlation cannot be calculated (that is, the formula for correlation is invalid) unless all students who completed their freshman year are included.
D) Since the SAT score involves three separate measures, it is not possible to determine a correlation between SAT score and GPA.
E) None of the above.

15. When exploring very large sets of data involving many variables, which of the following is true?
A) The correlation coefficient will be close to 1 due to the large sample size.
B) Associations will be stronger than would be seen in a much smaller subset of the data.
C) A strong association is good evidence for causation because it is based on a large quantity of information.
D) Extrapolation is safe because it is based on a greater quantity of evidence.
E) None of the above.

Use the following to answer questions 16 and 17:
The scatterplot below plots, for each of the 50 states, the percent of 18-year-olds in the state Y in 1990 that graduated from high school versus the state's infant mortality rate (deaths per 1,000 births) X in 1990.
[Scatterplot not reproduced.]

16. For the data above, the correlation between X and Y is r = 0.54. If instead of plotting these variables for each of the 50 states we plotted the values of these variables for each county in the United States, we would expect the value of the correlation r to be
A) exactly the same.
B) smaller.
C) 0.54 (the magnitude is the same, but the sign changes).
D) much higher and probably near 1 since there are many more counties than states.
E) much smaller and probably near 0 since there are many more counties than states.

17. Referring to the information above, the least-squares regression line was fitted to the data in the scatterplot and the residuals were computed. A plot of the residuals versus the 1990 population in the state is given below. This plot suggests
[Residual plot not reproduced.]
A) that states with larger populations have lower infant mortality rates due to superior hospital facilities.
B) that high infant mortality rates imply low nutrition and thus higher dropout rates later in life, but only for states with small populations.
C) that population may be a lurking variable in understanding the association between infant mortality rate and percent graduating from high school.
D) that high infant mortality rates imply low nutrition and thus higher dropout rates later in life, but only for states with large populations.
E) none of the above.

18. Two variables, an explanatory variable x and a response variable y, are measured on each of several individuals. The correlation between these variables is found to be 0.88. To help us interpret this correlation, we should do which of the following?
A) Compute the least-squares regression line of y on x and consider whether the slope is positive or negative.
B) Interchange the roles of x and y (i.e., treat x as the response variable and y as the explanatory variable) and recompute the correlation.
C) Plot the data.
D) Determine whether x or y has larger values before computing the residuals.
E) All of the above.

19. A researcher computed the average SAT math score of all high school seniors who took the SAT exam for each of the 50 states. The researcher also computed the average salary of high school teachers in each of these states and plotted these average salaries against the average SAT math scores. The plot showed a distinct negative association between average SAT math scores and average teacher salaries. A second researcher conducted a similar study, but computed the average SAT math score for each school district in the nation and plotted these against the average salary of high school teachers in each district. The association between average SAT math score and average teacher salaries in the plot of the second researcher will most likely be
A) about the same as that seen by the first researcher.
B) much weaker than that seen by the first researcher (close to 0).
C) much stronger than that seen by the first researcher, but with the opposite sign.
D) a little weaker than that seen by the first researcher.
E) much stronger than that seen by the first researcher, but with the same sign.

20. Consider the following scatterplot. From this plot we can conclude
[Scatterplot not reproduced.]
A) that there is evidence of a modest cause-and-effect relation between X and Y, with increases in X causing increases in Y.
B) that there is an outlier in the plot.
C) that there is a strongly influential point in the plot.
D) that removing the outlier will cause the slope to increase.
E) all of the above.

21. According to the 1990 census, those states that had an above-average number X of people who failed to complete high school tended to have an above-average number Y of infant deaths. In other words, there was a positive association between X and Y. The most plausible explanation for this association is that
A) populations were used instead of rates.
B) Y causes X. Therefore, programs that reduce infant deaths will ultimately reduce the number of high school dropouts.
C) changes in X and Y are due to a common response to other variables. For example, states with large populations will have both larger numbers of people who fail to complete high school and larger numbers of infant deaths.
D) the association between X and Y is purely coincidental. It is implausible to believe the observed association could be anything other than accidental.
E) X causes Y. Therefore, programs to keep teens in school will help reduce the number of infant deaths.

22. When possible, the best way to establish that an observed association is the result of a cause-and-effect relation is by means of
A) the least-squares regression line.
B) the correlation coefficient.
C) randomization to select the data variables.
D) a well-designed experiment.
E) examining z-scores rather than the original variables.

23. Which of the following would be necessary to establish a cause-and-effect relation between two variables?
A) Strong association between the variables.
B) A well-designed experiment.
C) Plausibility of the alleged cause.
D) An association between the variables observed in many different settings.
E) All of the above.

24. Recent data show that states that spend an above-average amount of money X per pupil in high school tend to have below-average mean SAT verbal scores Y of all students taking the SAT in the state. In other words, there is a negative association between X (spending per pupil) and Y (mean SAT verbal score). High spending per pupil and low mean SAT verbal scores are particularly common in states that have a large percentage of all high school students taking the exam. Such states also tend to have larger populations. The most plausible explanation for the observed association between X and Y is that
A) the association between X and Y is causal since more money spent on education leads to higher averages for each state.
B) Y causes X. Low SAT scores create concerns about the quality of education. This inevitably leads to additional spending to help solve the problem.
C) changes in X and Y are due to a common response to other variables. If a higher percentage of students take the exam, the average score will be lower. Also, states with larger populations have large urban areas where the cost of living is higher and more money is needed for expenses.
D) the association between X and Y is purely coincidental. It is implausible to believe the observed association could be anything other than accidental.
E) X causes Y. Overspending generally leads to extra, unnecessary programs, diverting attention from basic subjects. Inadequate training in these basic subjects generally leads to lower SAT scores.

25. A researcher observes that, on average, the number of divorces in cities with major league baseball teams is larger than in cities without major league baseball teams. The most plausible explanation for this observed association is that
A) the presence of a major league baseball team causes the number of divorces to rise (perhaps husbands are spending too much time at the ballpark).
B) the high number of divorces is responsible for the presence of a major league baseball team (more single men means potentially more fans at the ballpark, making it attractive for an owner to relocate to such cities).
C) the association is due to common response (major league teams tend to be in large cities with more people and thus with a greater number of divorces).
D) the observed association is purely coincidental. It is implausible to believe the observed association could be anything other than accidental.
E) the presence of a major league baseball team in a city will increase the mean income (some wives would expect that their husbands would have more money to spend on them).

Use the following to answer questions 26-27:
An article in the student newspaper of a large university with the headline "A's Swapped for Evaluations?" included the following:
According to a new study, teachers may be more inclined to give higher grades to students, hoping to gain favor with the university administrators who grant tenure. The study examined the average grade and teaching evaluation in a large number of courses given in 1997 in order to investigate the effects of grade inflation on evaluations. "I am concerned with student evaluations because instruction has become a popularity contest for some teachers," said Professor Smith, who recently completed the study. Results showed higher grades directly corresponded to a more positive evaluation.

26. The underlined statement means that the study found
A) that course grade is positively associated with teaching evaluation.
B) that higher evaluations were the direct result of higher grades.
C) that there was a perfect positive correlation between course grade and teaching evaluation.
D) that teaching evaluation is negatively associated with course grade.
E) all of the above.

27. Which of the following would be a valid conclusion to draw from the study cited in the article?
A) Teachers who give higher grades are more likely to gain tenure.
B) A good teacher, as measured by teaching evaluations, helps students learn better, resulting in higher grades.
C) Teachers of courses in which the mean grade is above average apparently tend to have above-average teaching evaluations.
D) A teacher can improve his or her teaching evaluations by giving good grades.
E) All of the above.

28. A researcher computed the average SAT math score of all high school seniors who took the SAT exam for each of the 50 states. The researcher also computed the average salary of high school teachers in each of these states and plotted these average salaries against the average SAT math scores for each state. The plot showed a distinct negative association between average SAT math scores and teacher salaries. The researcher may legitimately conclude which of the following?
A) Increasing the average salary of teachers will cause the average of SAT math scores to decrease, but it is not correct to conclude that increasing the salaries of individual teachers causes the SAT math scores of individual students to increase.
B) States that pay teachers high salaries tend to do a poor job of teaching mathematics, on average.
C) As the pay for an individual teacher increases, the teacher's students are more likely to do poorly on the SAT math.
D) The data used by the researcher do not provide evidence that increasing the salaries of teachers will cause the performance of students on the SAT math to get worse.
E) States in which students tend to perform poorly in mathematics probably have a higher proportion of problem students and thus need to pay teachers higher salaries in order to attract them to teach in those states.

29. The average number of home runs hit by major league baseball players is greater now than it was three decades ago. A researcher suspects that the reason may be that baseballs are livelier now than they were 30 years ago. To check this he tested two baseballs, one that was manufactured 30 years ago (but never used) and one that was new. He noticed that the new baseball bounced higher than the older ball when both were dropped from the same height; that is, the new baseball was livelier than the old one. The researcher can legitimately conclude
A) that there is a positive association between the liveliness of the balls tested and the average number of home runs hit in the year that the ball was manufactured.
B) that newer baseballs are livelier than older baseballs.
C) that there is good evidence that the increase in the liveliness of baseballs has caused the increase in home runs. This is because there is a positive association between liveliness of baseballs and average number of home runs hit and because there is a plausible theory for the observed association.
D) that baseballs have been gradually getting livelier over the last three decades.
E) all of the above.

30. A researcher notices that in a sample of adults, those that take larger amounts of vitamin C have fewer illnesses. However, those that take larger amounts of vitamin C also tend to exercise more. As explanations for having fewer illnesses, the variables amount of vitamin C taken and amount of exercise are
A) skewed.
B) confounded.
C) common responses.
D) symmetric.
E) linked.

31. In 1982 Kennesaw, Georgia, passed a law requiring all citizens to own at least one gun. Although the law was never enforced, six months after the law was passed the number of burglaries in that month was less than in the month prior to passage of the law. We may conclude which of the following?
A) Gun ownership and burglary rates are negatively associated.
B) Gun ownership causes a reduction in crime. This is because there is a negative association between gun ownership and burglary rates and because there is a plausible explanation for this association (gun ownership acts as a deterrent to crime).
C) Criminals are more likely to avoid homes in towns where guns are more prevalent.
D) All of the above.
E) None of the above.

32. A study of the salaries of full professors at Upper Wabash Tech shows that the median salary for female professors is considerably less than the median male salary. However, further investigation shows that the median salaries for male and female full professors are about the same in every department (English, physics, etc.) of the university. This apparent contradiction is an example of
A) extrapolation.
B) Simpson's paradox.
C) confounded responses.
D) correlation.
E) causation.

33. The reversal of the direction of an association when lurking variables are taken into account is called
A) Simpson's paradox.
B) least-squares regression.
C) confounding.
D) a residual plot.
E) negative association.

34. The two-way table below categorizes suicides committed in 1983 by the sex of the victim and the method used.

Method      Male     Female
Firearms    13,959   2,641
Poison      3,148    2,469
Hanging     3,222    709
Other       1,457    690

Which of the following statements is consistent with the table?
A) There is absolutely no evidence of a relation between the sex of the victim and the method of suicide used.
B) More women commit suicide than men.
C) Men display a greater tendency to use firearms to commit suicide than do women.
D) The correlation between method of suicide and sex of the victim is clearly positive.
E) The proportion of men who use poison to commit suicide is higher than the proportion of women who use poison to commit suicide.

35. In a study of the link between high blood pressure and cardiovascular disease, a group of white males ages 35 to 64 was followed for five years. At the beginning of the study, each man had his blood pressure measured; the blood pressure was classified as either low systolic blood pressure (less than 140 mm Hg) or high systolic blood pressure (140 mm Hg or higher). The following table gives the number of men in each blood pressure category and the number of deaths from cardiovascular disease during the five-year period.

Blood Pressure   Deaths   Total
Low              10       2000
High             50       3500

Based on the data given here, which of the following statements is correct?
A) These data are consistent with the idea that there is a link between high blood pressure and death from cardiovascular disease.
B) More men have high blood pressure, so it is reasonable to expect more deaths among men due to cardiovascular disease.
C) These data probably understate the link between high blood pressure and death from cardiovascular disease, since men will tend to understate their true blood pressure.
D) The mortality rate (proportion of deaths) for men with high blood pressure is five times that of men with low blood pressure.
E) All of the above.

36. X and Y are two categorical variables. The best way to determine whether there is a relationship between them is to
A) compute the least-squares regression line between X and Y.
B) draw a scatterplot of the X and Y values.
C) make a two-way table of the X and Y values.
D) calculate the correlation between X and Y.
E) do all of the above.

Use the following to answer questions 37 through 39:
A business has two types of employees, managers and workers. Managers earn either $100,000 or $200,000 per year. Workers earn either $10,000 or $20,000 per year. The number of male and female managers at each salary level and the number of male and female workers at each salary level are given in the tables below.

Managers
Salary      Male   Female
$100,000    80     20
$200,000    20     30

Workers
Salary      Male   Female
$10,000     30     20
$20,000     20     80

37. The proportion of male managers who make $200,000 per year is
A) 0.067.
B) 0.133.
C) 0.200.
D) 0.400.
E) 0.600.

38. The proportion of female managers who make $200,000 per year is
A) 0.100.
B) 0.200.
C) 0.300.
D) 0.400.
E) 0.600.

39. From these data, we may conclude
A) that the mean salary of female managers is greater than that of male managers.
B) that the proportion of female managers earning $200,000 per year is greater than the proportion of male managers earning $200,000 per year.
C) that the mean salary of female workers is greater than that of male workers.
D) that the mean salary of males in this business is greater than the mean salary of females.
E) all of the above.

Use the following to answer questions 40 through 43:
A review of voter registration records in a small town yielded the following table of the number of males and females registered as Democrat, Republican, or some other affiliation.

Affiliation   Male   Female
Democrat      300    600
Republican    500    300
Other         200    100

40. Which of the following bar graphs represents the distribution of Democrats, Republicans, and other affiliations in this town?
[Bar graphs A) through D) not reproduced.]
E) None of the above.

41. The proportion of males that are registered as Democrats is
A) 300.
B) 30.
C) 0.33.
D) 0.30.
E) 0.15.

42. The proportion of registered Democrats who are male is
A) 300.
B) 33.
C) 0.33.
D) 0.30.
E) 0.15.

43. The proportion of all voters who are male and registered Democrats is
A) 300.
B) 15.
C) 0.33.
D) 0.30.
E) 0.15.
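Questions 41 through 43 divide the same cell count by three different totals. The Python sketch below, using the voter registration counts given for questions 40 through 43, shows how the choice of denominator separates the two conditional proportions from the joint proportion. The dictionary layout and variable names are illustrative assumptions.

```python
# Counts from the voter registration table (rows: affiliation, columns: sex).
voters = {
    "Democrat":   {"Male": 300, "Female": 600},
    "Republican": {"Male": 500, "Female": 300},
    "Other":      {"Male": 200, "Female": 100},
}

cell = voters["Democrat"]["Male"]                          # male Democrats
males_total = sum(row["Male"] for row in voters.values())  # column total
democrats_total = sum(voters["Democrat"].values())         # row total
grand_total = sum(sum(row.values()) for row in voters.values())

p_democrat_given_male = cell / males_total      # condition on the column
p_male_given_democrat = cell / democrats_total  # condition on the row
p_male_and_democrat = cell / grand_total        # joint proportion

print(round(p_democrat_given_male, 2),
      round(p_male_given_democrat, 2),
      round(p_male_and_democrat, 2))
```

The numerator never changes; only the group being described (all males, all Democrats, or all voters) does.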

The U.S. Population: An Exponential Regression

Here is the U.S. population, in millions, from 1790 until 1990:

Year   Pop     Year   Pop     Year   Pop
1790   3.9     1860   31.4    1930   122.8
1800   5.3     1870   39.8    1940   131.7
1810   7.2     1880   50.2    1950   151.3
1820   9.6     1890   62.9    1960   179.3
1830   12.9    1900   76      1970   203.3
1840   17.1    1910   92      1980   226.5
1850   23.2    1920   105.7   1990   248.7

Use the table above to answer questions 1-9.

1) Make a scatterplot of the data. Does the data appear to be linear?
2) Write down the linear regression line and r. How would you classify this value of r?
3) Straighten out the data. What did you do? Does it now look more linear?
4) At about what year in U.S. history did the slope change, i.e., did population appear to slow down?
5) Create a new linear regression line. Write out the new r. How would you classify this value of r?
6) Perform an inverse transformation to de-log your y's. Estimate the population for 1935, 1835, 1845.
7) What do you notice about the line and its proximity to the points for the dates 1835 and 1845?
8) Hit 2nd QUIT on your calculator. Hit STAT, arrow right to CALC. Arrow down to EXP REG. Hit ENTER. Input L1, L2, Y3. Write out the resulting r. Compare it with r in Step 5. What do you notice?
9) What is b and what is the in-context meaning of b?
10) A review of voter registration for a small town reveals that voters are either Democrats, Republicans or Other. There are 600 Democrat, 300 Republican and 100 Other women. There are 1000 males, of which 200 are Other and 500 are Republican. Create and label a two-way table for this information (include marginal distributions).
11) Using the above information, answer the following questions:
a) What is the conditional distribution of males given that they are Democrat?
b) What is the conditional distribution of Democrats given that they are male?
c) What is the difference between the two?
12) Define Simpson's paradox.
13) You are given some bivariate data. You use your calculator to get the regression of (ln x, ln y). Your y1 = 0.2 + 0.4x is in your calculator. When x = 2, what is the correct prediction for y?
14) List the lurking variables in the following situations:
a) Neighborhoods with station wagons tend to have more playgrounds
b) Beaches with more sand than rocks tend to be older
c) Towns with more teachers have higher sales of floor wax and cat litter
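For the U.S. population exercise above, steps 2, 3, 5 and 6 can be sketched in Python as follows. This is one possible approach, not the calculator procedure the worksheet describes: it assumes a base-10 log transform of the population and a hand-written least-squares helper (`fit_line` and `predict_population` are hypothetical names introduced here). The data are the table values.

```python
import math

years = list(range(1790, 2000, 10))  # 1790, 1800, ..., 1990
pops = [3.9, 5.3, 7.2, 9.6, 12.9, 17.1, 23.2, 31.4, 39.8, 50.2, 62.9,
        76.0, 92.0, 105.7, 122.8, 131.7, 151.3, 179.3, 203.3, 226.5, 248.7]

def fit_line(xs, ys):
    """Return (intercept, slope, r) for the least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy / math.sqrt(sxx * syy)

# Step 2: straight-line fit to the raw data.
_, _, r_linear = fit_line(years, pops)

# Steps 3 and 5: straighten with log10, then refit.
log_pops = [math.log10(p) for p in pops]
a, b, r_log = fit_line(years, log_pops)

# Step 6: inverse transformation back to millions of people.
def predict_population(year):
    return 10 ** (a + b * year)

print(r_linear, r_log)
for year in (1835, 1845, 1935):
    print(year, predict_population(year))
```

Using natural logs instead of base-10 logs would rescale the fitted intercept and slope but leave r unchanged, since correlation is unaffected by multiplying a variable by a positive constant.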