REVIEW PROBLEMS FOR FIRST EXAM

M358K Sp 6

Please Note: This review sheet is not intended to tell you what will or will not be on the exam. However, most of these problems have appeared on, or are very similar to problems that have appeared on, previous exams in this course. Remember that on the exam you will be expected to give reasons and explain your work clearly.

1. A researcher wishes to study how the average weight y (in kilograms) of children changes during the first year of life. He gathers data from a suitable sample of children, recording each child's weight each month for a year after they are born. He then plots mean weight y against age x (in months) and fits a least squares regression line to the data, with x as the explanatory variable and y as the response variable. He computes the following quantities:

r (= correlation between x and y) = .9
x-bar (= mean of the values of x) = 6.5
y-bar (= mean of the values of y) = 6.6
s_x (= standard deviation of the values of x) = 3.6
s_y (= standard deviation of the values of y) = 1.2

a. Based on this information, how much would you expect mean weight to change as age increases one month in this age range? (Explain)
b. What percent of the variation in mean weight is accounted for by regression on age, for this age range? (Explain briefly)
c. If instead of mean weights, we considered the individual weights of all the children in the sample (but still in the same age range), how would you expect the correlation between weight and age to compare with the correlation between average weight and age given here? (Explain)
d. If you just knew x-bar, y-bar, s_x and s_y (i.e., not r or the original data), you would still know one point on the regression line. What is that point?
e. Find an equation for the least squares regression line.
f. A second researcher collects data on another sample of children four months old. She finds their mean weight to be 5 kg. Find the predicted mean weight of four-month-old children from the first sample, and find the residual for the observed mean weight in the second sample.

2. The distribution of actual weights of chocolate bars produced by a certain machine is normal with a mean of 8.1 oz. and a standard deviation of .1 oz. What weight should be put on the chocolate bar wrappers so that only 1% of bars are underweight? (Explain)

3. The temperature at randomly chosen locations in a kiln used in the manufacture of bricks is normally distributed with a mean of F and a standard deviation of F. If bricks are fired at a temperature above 1125 F, they will crack. If they are fired at a temperature below F, they will discolor. If bricks are placed randomly throughout the kiln when fired, what proportion will be good (i.e., neither cracked nor discolored)?

4. One hundred volunteers who suffer from severe depression are available for a study. The study will compare the effectiveness of a new drug for treating severe depression with an existing drug for treating this condition. A psychiatrist will evaluate the symptoms of all volunteers after two months to determine if there has been a substantial improvement in the severity of the depression. Describe an appropriate design for the study. Explain why your design is appropriate.

5. True or false. If the statement is false, explain why. (Depending on the statement, this may need to be done by an example showing that the statement is false.)
a. A sample chosen in such a way that every unit in the population has the same chance of being selected is a simple random sample.
b. A normal quantile plot is formed by plotting points (z,x) where z is the normal score (x-µ)/σ of x.
c. If the mean of a distribution equals the median of the distribution, then the distribution is normal.
d. If we regress x on y, then solve for y, we get the same line as when we regress y on x.
e. If two variables have correlation zero, then there is no relationship between them.
f. The largest value in a data set is an outlier.
g. If the correlation between people's memories of what they ate and what they actually ate is .22, then 22% of the people remembered accurately what they ate.
h. If the correlation between the actual weight of a sample of test weights and the weight indicated by a certain scale is 1, then the scale is accurate.
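The arithmetic behind problems 1 and 2 can be checked with a short script. This is only a sketch under stated assumptions: it takes the numbers exactly as printed above (r = .9, x-bar = 6.5, y-bar = 6.6, s_x = 3.6, s_y = 1.2, an observed mean of 5 kg, and a normal distribution with mean 8.1 oz. and standard deviation .1 oz.), and it uses the standard relations slope = r * s_y / s_x and intercept = y-bar - slope * x-bar. The variable names are illustrative, and the explanations the problems ask for still have to be written out.

```python
from statistics import NormalDist

# Problem 1: summary statistics as printed in the problem statement.
r, x_bar, y_bar, s_x, s_y = 0.9, 6.5, 6.6, 3.6, 1.2

slope = r * s_y / s_x              # expected change in mean weight (kg) per month
intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
r_squared = r ** 2                 # fraction of variation accounted for by the regression

predicted_at_4 = intercept + slope * 4  # predicted mean weight at four months
residual = 5 - predicted_at_4           # observed (second sample) minus predicted

# Problem 2: the label weight is the 1st percentile of the normal
# distribution with mean 8.1 oz. and standard deviation 0.1 oz.
label_weight = NormalDist(mu=8.1, sigma=0.1).inv_cdf(0.01)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r^2 = {r_squared:.2f}")
print(f"prediction at 4 months = {predicted_at_4:.2f}, residual = {residual:.2f}")
print(f"label weight = {label_weight:.3f} oz.")
```

The residual is computed as observed minus predicted, so a negative value means the second sample's mean weight falls below the fitted line.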

6. For each of the following, state which of correlation, regression, a well designed experiment, or none of these would be most appropriate to study the question. Explain why.
a. Determining how well height at age four predicts adult height.
b. Determining whether a new drug causes blood pressure to lower.
c. Determining the average per person beef consumption in Texas.

7. A marketing research firm wishes to determine if the adult men in a certain small city would be interested in a new upscale men's clothing store. From a list of all 95,481 residential addresses in the city, the firm selects a simple random sample of and mails a brief questionnaire to each of these addresses.
a. What is the population of this study?
b. What is the sample in the study?
c. What is the sampling frame in this study?

8. We have seen that when you do a least squares regression, the sum of the residuals e_i is zero. What can you say about the mean e-bar of the residuals?

9. In each of the following examples, explain how the sample chosen or the wording of the question asked probably biased the results of the study.
a. A 1992 Roper poll reported that 22% of Americans say that the Holocaust may not have happened. The actual question asked in the poll was "Does it seem possible or impossible to you that the Nazi extermination of the Jews never happened?" Twenty-two percent of participants in the poll responded "possible."
b. A study sponsored by American Express Co. and the French Government tourist office reported that old stereotypes about French unfriendliness weren't true. The respondents were over Americans who had visited France more than once for pleasure over the past two years.
c. A simple random sample of adult Americans was selected; each person in the sample was asked the following question: "In light of the huge national deficit, should the government at this time spend additional money to establish a national system of health insurance?" Thirty-nine of those responding answered yes.

10. Discuss the difference between the uses of the following words in ordinary usage and in statistics:
a. Correlation
b. Predict
c. Experiment

11. Compare and contrast: the correlation coefficient of a set of two-variable data, and the slope of the least squares regression line of one of the variables on the other.

12. A researcher is interested in studying college students' perceptions about whether student drinking is a problem. She will sample students from UT Austin (which has about, students) and Rice University (which has about students).
a. i. Suppose simple random samples of students from each school are taken. Will the variability of the sample proportion for the two schools differ? If so, how? Explain briefly.
ii. Suppose that instead simple random samples of 5% of the students from each school are taken. Will the variability of the sample proportion for the two schools differ? If so, how? Explain.
b. Suppose % of UT students actually believe that student drinking is a problem. Suppose UT students are asked their opinion and % of them say they think student drinking is a problem. Identify the parameter and the statistic in this situation.

13. A random sample of records of sales of 117 homes during a certain period in 1993 in Albuquerque, New Mexico gave price (in thousands of dollars) and size (in square feet). The resulting least squares regression equation (with y = price and x = size) was

ŷ = 47.82 + .61x

Explain what the slope of the least squares line says about housing prices and sizes. Be as precise as possible given the above information.

14. For each statement, identify and explain the error in understanding and/or interpretation of correlation:
a. The correlation of -.72 between average household income and infant mortality rate shows that there is almost no association between these two variables.
b. The correlation of .83 between gross domestic product and literacy rate shows that countries wanting to increase their standard of living should invest heavily in education.
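The fact quoted in problem 8 (least squares residuals sum to zero) can be seen numerically before reasoning about their mean. The sketch below uses a small data set invented purely for illustration; nothing in it comes from the course or the textbook, and the slope and intercept are computed from the usual least squares formulas.

```python
from statistics import mean

# HYPOTHETICAL data, invented only to illustrate problem 8.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar, y_bar = mean(x), mean(y)

# Least squares slope and intercept.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual = observed y minus fitted y.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
residual_sum = sum(residuals)
residual_mean = residual_sum / len(residuals)

print(f"sum of residuals  = {residual_sum:.2e}")
print(f"mean of residuals = {residual_mean:.2e}")
```

Both printed values differ from zero only by floating-point rounding, which is the numerical counterpart of the algebraic fact the problem refers to.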

15. Here is part of the output Minitab will give you if you enter the 1996 mean SAT verbal scores in the states plus the District of Columbia and ask for descriptive statistics:

N    Mean   Median   StDev   Min   Max    Q1   Q3
51   531.   525.     33.76   4.    596.   1.   565.

a. Use this information to construct a boxplot of these data.
b. Based on the Minitab output and your boxplot, what can you tell about the shape of the distribution of these data?

16. A few years ago, a group of students in this class was interested in the question of whether people who make illegal right turns on red signal for the turn. They narrowed their project down to study people making illegal right turns on red on weekdays from 8 AM to 5 PM from west-bound Twenty-first Street onto north-bound Guadalupe. In a preliminary study, they found that six cars made illegal right turns on red at the intersection in a fifteen-minute observation period, so they decided that fifteen-minute observation intervals were appropriate for collecting their data. They divided the times from 8 AM to 5 PM in one week of weekdays (Monday through Friday) into fifteen-minute intervals and excluded the time slots when no one in the group was able to collect data. (This turned out to be exactly the times when the course met.) They numbered the remaining time intervals and used a random number generator to select 12 of them. They chose an average week that, to the best of their ability to determine, had no unusual events that might disrupt normal traffic patterns. During observations, the observer sat near the sidewalk somewhat out of sight of drivers to prevent any influence on drivers' behavior.
a. What is (are) the population(s) studied in this project?
b. What is (are) the variable(s) studied in this project?
c. What is (are) the parameter(s) studied in this project?
d. What is (are) the sample(s) used in this project?
e. What is (are) the statistic(s) calculated in this project?
f. What is (are) the sampling frame(s) used in this project?
g. Comment on the data collection process. Could it be improved?
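The selection procedure described in problem 16 (number the available fifteen-minute intervals, then pick 12 at random) might look roughly like the sketch below. The excluded slots are an assumption made up for illustration, since the sheet says only that they were the times when the course met; the seed and all names are likewise hypothetical.

```python
import random

days = ["Mon", "Tue", "Wed", "Thu", "Fri"]

# Start times of fifteen-minute intervals from 8 AM to 5 PM,
# expressed as minutes after midnight.
starts = range(8 * 60, 17 * 60, 15)
all_slots = [(day, start) for day in days for start in starts]

def is_excluded(day, start):
    """HYPOTHETICAL class meeting time: Tue/Thu, 11:00 to 12:30."""
    return day in ("Tue", "Thu") and 11 * 60 <= start < 12 * 60 + 30

# The sampling frame is every slot in which someone could observe.
frame = [slot for slot in all_slots if not is_excluded(*slot)]

rng = random.Random(2024)        # fixed seed so the draw can be repeated
chosen = rng.sample(frame, 12)   # 12 distinct slots, each equally likely
chosen.sort(key=lambda slot: (days.index(slot[0]), slot[1]))

for day, start in chosen:
    print(f"{day} {start // 60:02d}:{start % 60:02d}")
```

Sampling without replacement from the numbered frame mirrors what the students did; which slots are excluded matters for part g, because excluded times can never appear in the sample.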

17. Below are three plots of data points each:

[Plots A, B, and C]

Without doing any calculations, answer each of the following questions, and explain how you know.
a. How does s_x compare for the three plots?
b. How does s_y compare for the three plots?
c. How does the correlation coefficient r compare for the three plots?
d. How does the slope of the regression line compare for the three plots?

Additional review problems from the textbook:
1.131, 1.137, 1.143
2.9, 2.111
3.81, 3.83, 3.85, 3.89, 3.95
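Problems 11 and 17 both turn on how the correlation and the regression slope respond when the spread of one variable changes. The sketch below is not a reconstruction of plots A, B, and C; it uses invented data and simply stretches the x values by a factor of 2 to show which quantities change. The helper names are hypothetical.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Sample correlation coefficient r."""
    x_bar, y_bar = mean(x), mean(y)
    n = len(x)
    cross = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return cross / ((n - 1) * stdev(x) * stdev(y))

def ls_slope(x, y):
    """Least squares slope of y on x, using slope = r * s_y / s_x."""
    return correlation(x, y) * stdev(y) / stdev(x)

# HYPOTHETICAL data, invented only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.9]
x_stretched = [2 * a for a in x]  # same points, x axis stretched by 2

for label, xs in (("original ", x), ("stretched", x_stretched)):
    print(f"{label}: s_x = {stdev(xs):.3f}, "
          f"r = {correlation(xs, y):.4f}, slope = {ls_slope(xs, y):.4f}")
```

In this example, stretching x doubles s_x and halves the slope while r is unchanged, which is one concrete way to see that r is unit-free and the slope is not.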