Quiz 4.1C AP Statistics Name:

1. The school's newspaper has asked you to contact 100 of the approximately 1100 students at the school to gather information about student opinions regarding food at your school's cafeteria.
(a) With as much precision as possible, describe the population for your study.
(b) You are pretty sure that there is a big difference between the opinions of males and females when it comes to cafeteria food. Describe a study design that takes into account this potentially important variable. Explain the advantage of this method.
(c) You decide to conduct a survey about the quality of food served in the school cafeteria by randomly selecting students as they leave the cafeteria after lunch on a specific day next week. Describe a source of bias that may result from using this method. Be sure to use the correct terminology, and indicate the direction of the potential bias.

164 The Practice of Statistics, 4/e- Chapter 4 2011 BFW Publishers
2. A university's financial aid office wants to know how much it can expect students to earn from summer employment. This information will be used to set the level of financial aid. The population contains 3478 students who have completed at least one year of study but have not yet graduated. A questionnaire will be sent to an SRS of 100 of these students, drawn from an alphabetized list.
(a) Describe how you will select the sample.
(b) Starting at line 135, use the portion of the random digits table below to select the first three students in the sample.

135  66925 55658 39100 78458 11206 19876 87151 31260
136  08421 44753 77377 28744 75592 08563 79140 92454
137  53645 66812 61421 47836 12609 15373 98481 14592
Quiz 4.2B AP Statistics Name:

Agricultural scientists for a chemical company want to determine if a newly developed fertilizer produces heavier tomatoes than the fertilizer they currently manufacture. For their first pilot study, they have 24 healthy young tomato plants growing in individual pots, numbered from 1 to 24.

1. Describe the design of a completely randomized, controlled experiment to test whether the new fertilizer produces heavier tomatoes.

2. Use the section from the random digits table below to carry out the randomization required by your design and list the outcome of the randomization.

27816 78416 18329 21337 35213 37741 04312 68508
66925 55658 39100 78458 11206 19876 87151 31260
08421 44753 77377 28744 75592 08563 79140 92454
53645 66812 61421 47836 12609 15373 98481 14592
66831 68908 40772 21558 47781 33586 79177 06928
3. The customer service call center for a major electronics manufacturer is trying to determine how to keep customers who are on hold as happy as possible. They want to examine whether the type of music they play while customers are on hold and whether or not there is a periodically repeated recorded message ("Thank you for your patience, we'll be with you as soon as possible.") have an impact on customer satisfaction. They plan to randomly select customers who are on hold and play one of three different types of music (smooth jazz, classical, or Broadway show tunes) and either play recorded messages or not. After the entire call is over, they will ask the customers to rate their overall customer service experience.
(a) Suppose the company plans to conduct a completely randomized design. List the experimental units, factors, and treatments in this experimental design.
(b) Suppose the company is concerned that the time when the call is made (daytime versus evening) will have an impact on which combination of music and messages is most effective. How might they alter the design of this experiment to take this into account?

4. Many utility companies have introduced programs to encourage energy conservation among their customers. An electric company considers placing electronic meters in households to show what the cost would be if the electricity use at that moment continued for a month. It gives these meters to 100 of its customers for a year and then compares the average electricity use in these customers' homes this year to the previous year. Result: these customers' average electricity use decreased by 10%. Explain why this is not strong evidence that the use of the electronic meters caused customers to decrease their electricity use.
Quiz 4.2C AP Statistics Name:

1. An article in a women's magazine says that women who choose to nurse their babies feel warmer and more receptive toward the infants than mothers who bottle-feed. The author concludes that nursing has desirable effects on the mother's attitude toward the child. Explain why asserting a causal relationship based on this information is suspect, and give another plausible explanation for the association between the decision to bottle-feed or nurse and mothers' attitudes toward their children.

2. A cookie manufacturer is trying to determine how long cookies stay fresh on store shelves, and the extent to which the type of packaging and the store's temperature influence how long the cookies stay fresh. He designs a completely randomized experiment involving low (64°F) and high (75°F) temperatures and two types of packaging: plastic and waxed cardboard. List the experimental units, factors, and treatments in this experiment.

3. An experiment to investigate the effectiveness of white-willow-bark capsules as a remedy for back pain plans to compare the reduction in pain for people using this herbal remedy to a control group. Is it possible to conduct this experiment as a double-blind experiment? Explain.
4. The Brigham Young University statistics department is conducting a series of randomized comparative experiments to compare teaching methods. Response variables include students' final-exam scores and a measure of their attitude toward statistics. One study compares two levels of technology for large-group lectures: standard (overhead projectors and chalk) and multimedia. The experimental units in the study are the 8 lecture sections in a basic statistics course. There are four instructors, each of whom teaches two sections. Because the lecturers differ, their lectures form four blocks. Suppose the sections and lecturers are as follows:

Section  Lecturer
1        Hilton
2        Christensen
3        Hadfield
4        Hadfield
5        Tolley
6        Hilton
7        Tolley
8        Christensen

(a) Outline the design of an experiment using blocking to determine which lecture method is most effective. Be sure to explain how you will randomly assign the treatments, using the random digits table below.

71487 09984 29077 14863 61683 47052 62224 51025
13873 81598 95052 90908 73592 75186 87136 95761
54580 81507 27102 56027 55892 33063 41842 81868
71035 09001 43367 49497 72719 96758 27611 91596

(b) Explain why a randomized block design is better than a completely randomized design in this case.
Chapter 4 Solutions

Quiz 4.1A
1. (a) Population is eating-and-drinking establishments in the large city that advertise in the telephone directory. (b) Population is those constituents who are inclined to write letters to the Congressman (probably a subset of the entire constituent population).
2. Only those listeners with strong opinions are likely to call in. The poll probably overestimates opposition to the increase. This is bias arising from voluntary response.
3. The wording of the questions is different enough to produce different responses: mentioning bribery may cause a more negative reaction than not mentioning it, or some subjects might not even know about the accusations.
4. (a) No. Not every group of 35 seniors is equally likely to be selected. It's impossible, for example, to have a group that is all girls. This is a stratified random sample. (b) Answers may vary. Make sure assigned digits are all the same length and that sampling is done without replacement. If we select the first three 3-digit numbers between 001 and 200, they are 179, 090, and 009.

Quiz 4.1B
1. (a) Population is all auto claims filed in a given month for this insurance company. (b) Population is all applicants to this particular college (a subset of all high school seniors).
2. This is bias arising from the wording of a question. Knowledge of how many troops were going to be deployed increased people's concerns about troop safety.
3. Sampling only during workday hours meant that only people without regular daytime jobs were available to answer the door, so the poll suffered from undercoverage of people who were employed. Since those who are not employed may be more likely to have time to volunteer, the poll probably overestimated the proportion of potential volunteers. There is also potential response bias: a person is likely to say he or she will volunteer in order to look like a good person.
4. (a) No. Not every group of locations is equally likely to be selected.
It's impossible, for example, to have a sample of locations that are all on Oahu. This is a stratified random sample. (b) Answers may vary. Make sure assigned digits are all the same length and that sampling is done without replacement. If we select the first three 3-digit numbers between 001 and 476, they are 354, 239, and 421.

Quiz 4.1C
1. (a) Population is the approximately 1100 students in the school.
(b) Take a stratified random sample, randomly selecting males and females in proportion to their relative abundance at the school. The principal advantage is that there will be much less variation from sample to sample, since the proportion of boys and girls in the sample is fixed: you can't get a sample that is all, or nearly all, one sex or the other.
(c) Two possible solutions (others are possible): 1) You may have undercoverage if students who don't like the food don't come to the cafeteria at all, which would mean you would overestimate how much people like the food. 2) You might get response bias if the day you choose to conduct your survey is one when something particularly good or bad is served. This could overestimate or underestimate how much people like the food, depending on what is served that day.
2. (a) Assign a unique 4-digit number between 0001 and 3478 to each student. Choose 4-digit numbers from the random digits table, ignoring repeats and numbers outside the designated range. The students with the first 100 numbers chosen form the simple random sample.
(b) The first three 4-digit numbers between 0001 and 3478 are 1007, 1120, and 1513.
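The selection in Quiz 4.1C, question 2(b) can be checked with a short Python sketch. It assumes the usual table-reading convention: the spaces between five-digit groups are ignored and labels are read left to right in consecutive 4-digit blocks. The function name `select_labels` and its parameters are illustrative, not part of the quiz.

```python
def select_labels(digit_rows, width, low, high, count):
    """Read fixed-width labels left to right from a random digits table.

    Spaces between the five-digit groups carry no meaning, so they are
    removed first. Labels outside [low, high] and repeats are skipped.
    """
    digits = "".join("".join(row.split()) for row in digit_rows)
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        if low <= label <= high and label not in chosen:
            chosen.append(label)
            if len(chosen) == count:
                break
    return chosen


# Lines 135-137 of the random digits table as printed in the quiz.
TABLE_LINES = [
    "66925 55658 39100 78458 11206 19876 87151 31260",
    "08421 44753 77377 28744 75592 08563 79140 92454",
    "53645 66812 61421 47836 12609 15373 98481 14592",
]

first_three = select_labels(TABLE_LINES, width=4, low=1, high=3478, count=3)
print(first_three)  # [1007, 1120, 1513]
```

The 4-digit blocks on line 135 are 6692, 5556, 5839, 1007, 8458, 1120, 6198, 7687, 1513, 1260; only 1007, 1120, and 1513 are needed before three valid labels have been found.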
Quiz 4.2A
1. Use a random number table to choose ten 2-digit numbers from 01 to 20, ignoring repeats. The patients with these numbers will receive the beta-blocker during their operation. The remaining 10 patients will act as a control group and will not receive the beta-blocker. Measure the pulse rate of all patients at the specified point in the operation, and compare the mean pulse rates of the two groups.
2. The first 10 patient numbers are 18, 19, 10, 03, 06, 08, 11, 15, 13, 09. These patients will constitute the treatment group. The remaining 10 patients will be in the control group.
3. (a) Experimental units: the 60 restaurants. Factors: menu description and price. Treatments: Healthy-High price, Healthy-Medium price, Healthy-Low price, Value-High price, Value-Medium price, Value-Low price.
(b) Block for building type: randomly assign the six different treatments to the 30 free-standing buildings, then do the same to the 30 mall restaurants, so that there are exactly five of each building type assigned to each treatment.
4. Since this was an observational study, we cannot establish cause and effect. It's possible, for instance, that people with more self-confidence are more likely to choose to exercise. That is, people's personality might be a confounding variable.

Quiz 4.2B
1. Use a random number table to choose twelve 2-digit numbers from 01 to 24, ignoring repeats. The tomato plants with these numbers will receive the new fertilizer. The remaining 12 plants will act as a control group and will receive the current fertilizer. Measure the total weight of tomatoes produced by each plant, and compare the mean weight in the two groups.
2. The first 12 plant numbers are 16, 18, 13, 21, 04, 08, 10, 07, 11, 20, 15, 12. These tomato plants will constitute the treatment group. The remaining 12 plants will be in the control group.
3. (a) Experimental units: customers who are put on hold. Factors: type of music, presence of recorded message.
Treatments: jazz-message, classical-message, show tunes-message, jazz-no message, classical-no message, show tunes-no message.
(b) Block by the time of day when the call is made: separately within the daytime callers and within the evening callers, randomly assign customers to the six treatments, so that each treatment group contains roughly the same number of customers from each time period.
4. The effect of the electronic meters is confounded with year, so we can't be sure that the meters were the cause of the reduced electricity use. Perhaps the second year was not as cold as the first, so less electricity was used for heating. In this case, temperature would be a confounding variable.

Quiz 4.2C
1. The impact of mothers nursing their infants cannot be separated from possible lurking variables in this observational study. For example, perhaps mothers who bottle-feed must do so because they work outside the home. That work might make them more tired and irritable when they return home. Tiredness is thus confounded with bottle-feeding.
2. Experimental units: packages of cookies. Factors: temperature and packaging. Treatments: low temp and plastic, high temp and plastic, low temp and waxed cardboard, high temp and waxed cardboard.
3. Yes. If the subjects in the control group receive a placebo that appears to be similar to the willow bark so that they do not know which treatment they are getting, the experiment is blind. If the researchers who interact directly with the subjects and discuss their pain reduction do not know which group each subject is in, then it's double-blind.
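The completely randomized assignment in Quiz 4.2B, questions 1 and 2 can be reproduced with a minimal Python sketch. It assumes the table is read left to right in consecutive 2-digit blocks and that it was printed with eight groups per row; the reading order does not depend on that row layout. The helper name `read_labels` is illustrative.

```python
def read_labels(digit_rows, width, low, high, count):
    """Return the first `count` distinct labels in [low, high], reading
    fixed-width blocks left to right and ignoring the printed spaces."""
    digits = "".join("".join(row.split()) for row in digit_rows)
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        if low <= label <= high and label not in chosen:
            chosen.append(label)
            if len(chosen) == count:
                break
    return chosen


# Random digits printed with Quiz 4.2B (eight groups per row is assumed).
TABLE_ROWS = [
    "27816 78416 18329 21337 35213 37741 04312 68508",
    "66925 55658 39100 78458 11206 19876 87151 31260",
    "08421 44753 77377 28744 75592 08563 79140 92454",
    "53645 66812 61421 47836 12609 15373 98481 14592",
    "66831 68908 40772 21558 47781 33586 79177 06928",
]

# Plants 1-24: the first 12 distinct labels get the new fertilizer,
# and every remaining plant gets the current fertilizer.
new_fertilizer = read_labels(TABLE_ROWS, width=2, low=1, high=24, count=12)
current_fertilizer = [p for p in range(1, 25) if p not in new_fertilizer]

print(new_fertilizer)      # [16, 18, 13, 21, 4, 8, 10, 7, 11, 20, 15, 12]
print(current_fertilizer)  # [1, 2, 3, 5, 6, 9, 14, 17, 19, 22, 23, 24]
```

Note that label 13 appears a second time in the second row and is skipped as a repeat, which is why 12 is the last plant chosen.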
Quiz 4.2C Continued
4. (a) Treat each teacher as a separate block, and randomly assign one of each teacher's sections to each of the two methods. Since each section already has a number associated with it, we can randomize by choosing 1-digit numbers from 1 to 8 and letting the first section of each instructor that is selected be taught using multimedia and the other section by the standard method. The first 4 numbers are 7, 1, 4, 8, so those four sections, each taught by a different instructor, will use multimedia. At the end of the semester, final exam scores and student attitudes can be compared.
(b) Blocking allows us to compensate for variability in scores arising from differences in the instructors' teaching ability, since each teacher can act as his or her own control. (Thanks to Jay Windley from A.B. Miller High School in California for suggesting revisions to this question.)

Quiz 4.3A
1. (a) Random assignment: cause and effect can be inferred. No random sampling: cannot generalize beyond the subjects of the study.
(b) No random assignment: cause and effect cannot be inferred. No random sampling: cannot generalize beyond the subjects of the study.
(c) No random assignment: cause and effect cannot be inferred. Random sampling: can generalize to the population from which the random sample was selected.
2. Answers will vary. Good general categories: establish a strong association between caffeine consumption and miscarriages in a wide variety of studies; establish a plausible mechanism for the impact of caffeine on miscarriages; show the association exists in studies that stratify for possible lurking variables, such as other health factors that may be confounded with caffeine consumption.

Quiz 4.3B
1. (a) No random assignment: cause and effect cannot be inferred. No random sampling: cannot generalize beyond the subjects of the study.
(b) No random assignment: cause and effect cannot be inferred.
Random sampling: can generalize to the population from which the random sample was taken.
(c) Random assignment: cause and effect can be inferred. No random sampling (in fact it was a census of all students at the school): cannot generalize beyond the students at the school.
2. Answers will vary. Good general categories: establish a strong association between proximity to power lines and cancer in a wide variety of studies; establish a plausible mechanism for the impact of power lines on cancer; show the association exists in studies that stratify for possible lurking variables, such as other environmental factors that may be confounded with proximity to power lines.

Quiz 4.3C
1. (a) Since packaging type is confounded with time, cause and effect cannot be inferred: we cannot separate the effect of packaging from differences in sales from last month to this month. We can, however, make inferences about the population of all stores in this city, since a random sample of stores was used.
(b) Random assignment within matched pairs: cause and effect can be inferred. Random sampling of cars: can generalize to the population of all cars.
(c) No random assignment: cause and effect cannot be inferred. No random sampling: cannot generalize beyond the subjects of the study.
2. Since the cars were driven on a track for 1000 miles, one could argue that fuel efficiency characteristics of cars would be different in normal use, because of traffic, topography, and road conditions.
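As a supplementary check on the blocked randomization described in the solution to Quiz 4.2C, question 4(a), the rule "the first section of each instructor that is selected uses multimedia" can be sketched in Python. The function name `assign_within_blocks` is illustrative, and the sketch assumes the random digits are read one at a time from the first printed row, skipping 0 and 9 because no section carries those numbers.

```python
# Section-to-lecturer table from Quiz 4.2C, question 4.
SECTION_LECTURER = {
    1: "Hilton", 2: "Christensen", 3: "Hadfield", 4: "Hadfield",
    5: "Tolley", 6: "Hilton", 7: "Tolley", 8: "Christensen",
}

# First row of the random digits printed with the question
# (the eight-groups-per-row layout is an assumption).
FIRST_ROW = "71487 09984 29077 14863 61683 47052 62224 51025"


def assign_within_blocks(digit_row, section_lecturer):
    """Within each lecturer (block), the first section whose number turns
    up in the digits gets multimedia; the other section gets standard.
    If the digits run out early, the leftover blocks stay all-standard."""
    blocks = set(section_lecturer.values())
    multimedia = {}  # lecturer -> section chosen for multimedia
    for char in digit_row.replace(" ", ""):
        section = int(char)
        lecturer = section_lecturer.get(section)  # None for digits 0 and 9
        if lecturer is None or lecturer in multimedia:
            continue
        multimedia[lecturer] = section
        if len(multimedia) == len(blocks):
            break
    return {
        section: "multimedia" if multimedia.get(lecturer) == section else "standard"
        for section, lecturer in section_lecturer.items()
    }


assignment = assign_within_blocks(FIRST_ROW, SECTION_LECTURER)
multimedia_sections = sorted(s for s, m in assignment.items() if m == "multimedia")
print(multimedia_sections)  # [1, 4, 7, 8]
```

Here the first four digits (7, 1, 4, 8) already belong to four different lecturers, so each block has one multimedia section and one standard section, in agreement with the solution.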