STATWAY STUDENT HANDOUT
STUDENT NAME
DATE

INTRODUCTION

The United States Census Bureau collects large amounts of data on the American public. The two-way table below gives the current marital status by gender for Americans age 18 and older. Each of the numbers is in millions, so 32.4 means 32,400,000. Each number is rounded to the nearest 100,000.

                      Gender
Marital Status    Male    Female    Total
Never Married     32.4      26.7     59.1
Married           64.8      65.5    130.3
Divorced           9.9      13.3     23.2
Widowed            2.8      11.4     14.2
Total            109.9     116.9    226.8

Here are a few questions about this data set.

1 What proportion of American adults are male?

2 What proportion of American adults are currently married?

We calculated these two proportions using numbers in the margins (the total column and row) of the table, so they are called marginal proportions. Since each of these proportions only involves one of the variables, they do not help us determine if there is a relationship between gender and marital status.

If we choose one American adult at random from the entire population, how likely is it that that adult is currently married?
3 What percent of American adults are currently married?

4 So if we choose 100 American adults, about how many would you expect to be married?

We say there is a 57.5% chance that a randomly selected American adult is married. Another way to say the same thing is that the probability of randomly choosing an American adult who is married is about 0.575. When we refer to the probability of some outcome, we are talking about how likely that outcome is.

Note that the probability is exactly the same value as the marginal proportion. For an individual selected at random from a sample or population, the probability is the same as the proportion. In this case, the probability we have calculated is a marginal probability: marginal probabilities are calculated from the totals in the margins of the table. Since these data are for the entire U.S. population age 18 and older, any probabilities we calculate are for one person selected from the 226.8 million U.S. adults.

Before we move on, we will simplify things with a bit of notation. We use P to represent probability. Thus the probability that a randomly selected American adult is married can be written: P(married).

5 If we choose one American adult at random, determine P(widowed).

For the data on marital status, marginal probabilities involve only the categories of one variable, such as the probability that a randomly chosen adult is female or the probability that a randomly chosen adult has never been married. We can also calculate probabilities that involve both variables.

6 What is the probability that a randomly chosen adult is male and has never been married? Symbolically we are looking for P(male and never married).

Since this probability combines the two variables, we call it a joint probability.
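As a rough illustration of the marginal and joint probabilities described above, the following Python sketch recomputes a few of them from the marital-status table. The dictionary layout and the function names (`grand_total`, `marginal_status`, `marginal_gender`, `joint`) are assumptions made for this example and are not part of the handout; the counts are the table values, in millions.

```python
# Counts in millions, copied from the marital-status table in this handout.
# The nested-dictionary layout is an assumption made for this sketch.
counts = {
    "never married": {"male": 32.4, "female": 26.7},
    "married": {"male": 64.8, "female": 65.5},
    "divorced": {"male": 9.9, "female": 13.3},
    "widowed": {"male": 2.8, "female": 11.4},
}


def grand_total(table):
    """Total of every cell: the number of adults in the whole population."""
    return sum(sum(row.values()) for row in table.values())


def marginal_status(table, status):
    """Marginal probability of a marital status: row total / grand total."""
    return sum(table[status].values()) / grand_total(table)


def marginal_gender(table, gender):
    """Marginal probability of a gender: column total / grand total."""
    return sum(row[gender] for row in table.values()) / grand_total(table)


def joint(table, status, gender):
    """Joint probability of a status AND a gender: one cell / grand total."""
    return table[status][gender] / grand_total(table)


p_married = marginal_status(counts, "married")
p_female = marginal_gender(counts, "female")
p_female_and_divorced = joint(counts, "divorced", "female")

print(f"P(married)              = {p_married:.3f}")
print(f"P(female)               = {p_female:.3f}")
print(f"P(female and divorced)  = {p_female_and_divorced:.3f}")
```

Every function divides by the same grand total, because marginal and joint probabilities both describe one adult drawn from the entire population; only the numerator changes (a margin total or a single cell).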
The next step is to calculate conditional probabilities, such as the probability that a randomly chosen female is widowed or the probability that a randomly chosen divorced adult is male. These are similar to the conditional percentages that we calculated in the previous lesson.

7 Let's consider the first probability, that a randomly chosen female is widowed. Our starting point is that the adult chosen is female.

A How many females are in the population?

B Since our probability is just about females, how many of the females are widowed?

C Now determine the probability that a randomly chosen female is widowed.

A conditional probability is one that is based on some given condition. In this case the given condition is that the randomly chosen American adult is female. We can reword the probability in this way: the probability that a randomly chosen adult is widowed given that the adult is female. Symbolically we write:

P(widowed | female)

The vertical line replaces the word "given." When we calculate conditional probabilities, the denominator is always the total of the given condition, such as the total number of females in the last exercise.

8 Find the probability that a randomly chosen divorced adult is male. Write the probability symbolically first, then calculate the value of the probability.
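The rule that the denominator is the total of the given condition can be sketched in Python as follows. This is a minimal example, not part of the handout: the dictionary layout and the function names `prob_status_given_gender` and `prob_gender_given_status` are assumptions, and the two probabilities computed are chosen so that they differ from the ones asked for in the exercises.

```python
# Counts in millions, copied from the marital-status table in this handout.
# The nested-dictionary layout is an assumption made for this sketch.
counts = {
    "never married": {"male": 32.4, "female": 26.7},
    "married": {"male": 64.8, "female": 65.5},
    "divorced": {"male": 9.9, "female": 13.3},
    "widowed": {"male": 2.8, "female": 11.4},
}


def prob_status_given_gender(table, status, gender):
    """P(status | gender): the cell divided by the total for that gender."""
    gender_total = sum(row[gender] for row in table.values())
    return table[status][gender] / gender_total


def prob_gender_given_status(table, gender, status):
    """P(gender | status): the cell divided by the total for that status."""
    status_total = sum(table[status].values())
    return table[status][gender] / status_total


# Given condition: the adult is male, so the denominator is the male total.
p_married_given_male = prob_status_given_gender(counts, "married", "male")

# Given condition: never married, so the denominator is that row's total.
p_female_given_never_married = prob_gender_given_status(
    counts, "female", "never married"
)

print(f"P(married | male)          = {p_married_given_male:.3f}")
print(f"P(female | never married)  = {p_female_given_never_married:.3f}")
```

Both functions use the same cell in the numerator as a joint probability would; what makes them conditional is that the denominator is a row or column total rather than the grand total.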
9 The newspaper headline "Drinking Coffee Reduces the Risk of Dementia by 65%" summarizes the findings of a study described in the paper "Caffeine as a Protective Factor in Dementia and Alzheimer's Disease." [1] The study followed 1,409 adults for 21 years. During that time, 61 adults developed dementia. The researchers classified these people into three categories based on how much coffee they consumed in a typical day: Low (0 to 2 cups per day), Medium (3 to 5 cups per day), and High (6 or more cups per day). Information in the paper was used to construct the following data table:

                            Low coffee     Medium coffee    High coffee
                            consumption    consumption      consumption    Total
Developed dementia               20              20              21           61
Did not develop dementia        204             622             522        1,348
Total                           224             642             543        1,409

A What percentage of the subjects in the study developed dementia?

B The number of subjects who developed dementia in each of the three different coffee consumption groups was very close (20, 20, and 21). Does this mean that the risk of developing dementia is about the same for each group? Explain your reasoning.

C What we need to compare is the conditional proportions. What proportion of those in the low coffee consumption group developed dementia?

D What proportion of those in the medium coffee consumption group developed dementia?

[1] Eskelinen, M. H., & Kivipelto, M. (2010). Caffeine as a protective factor in dementia and Alzheimer's disease. Journal of Alzheimer's Disease, 20, 167-174.
E Which group has the lower risk of developing dementia?

To calculate the percent decrease in risk, we take the difference between the two risks and divide it by the higher risk. To calculate the percent increase in risk, we take the difference and divide it by the lower risk. In this instance, the decrease in risk of dementia for the medium coffee consumption group compared with the low coffee consumption group is:

(20/224 - 20/642) / (20/224) = (0.0893 - 0.0312) / 0.0893 ≈ 0.65, or about 65%,

which tells us that the headline is accurate.

F The proportions in C and D can also be considered as probabilities for the two-way table. What type of probabilities are they?
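The percent decrease and percent increase rules stated above can be sketched in Python. This is only an illustration: the function names `percent_decrease` and `percent_increase` are assumptions made for this example, and the counts come from the low and medium coffee consumption columns of the dementia table.

```python
def percent_decrease(risk_a, risk_b):
    """Difference between two risks divided by the HIGHER risk."""
    higher, lower = max(risk_a, risk_b), min(risk_a, risk_b)
    if higher <= 0:
        raise ValueError("at least one risk must be positive")
    return (higher - lower) / higher


def percent_increase(risk_a, risk_b):
    """Difference between two risks divided by the LOWER risk."""
    higher, lower = max(risk_a, risk_b), min(risk_a, risk_b)
    if lower <= 0:
        raise ValueError("both risks must be positive")
    return (higher - lower) / lower


# Conditional proportions from the dementia table:
# (developed dementia in the group) / (total in the group).
risk_low = 20 / 224     # low coffee consumption group
risk_medium = 20 / 642  # medium coffee consumption group

decrease = percent_decrease(risk_low, risk_medium)
increase = percent_increase(risk_low, risk_medium)

print(f"Risk, low consumption:     {risk_low:.4f}")
print(f"Risk, medium consumption:  {risk_medium:.4f}")
print(f"Percent decrease in risk:  {decrease:.1%}")
print(f"Percent increase in risk:  {increase:.1%}")
```

The same pair of risks gives two different percentages depending on which group is treated as the baseline, which is why it matters whether a headline reports a decrease (relative to the higher risk) or an increase (relative to the lower risk).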
TAKE IT HOME

1 Polio is a very severe illness that can cause paralysis in its victims. The most famous polio victim was Franklin D. Roosevelt. In 1954 a randomized experiment was conducted on the effectiveness of a vaccine to prevent polio. The real vaccine was given to 200,745 children and a fake vaccine was given to 201,229 children. The results showed that 33 of the children given the real vaccine developed polio, whereas 115 of the children given the fake vaccine developed polio.

A Draw a two-way table that compares vaccine type and whether or not the children developed polio.

B Calculate the conditional proportion of vaccinated children who developed polio.

C Calculate the conditional proportion of children given the fake vaccine who developed polio.

D What was the reduction in risk of polio for the children given the real vaccine?

E Even though the risk of developing polio for unvaccinated children was very small, the vaccine was considered a huge success. Explain why.
2 Suppose two treatments (Treatment A and Treatment B) have been recommended for treating a particular medical condition. Both treatments have two potential side effects (severe skin rash and blurred vision). The report of a recent study of side effects associated with these treatments included the following information:

75 people who received Treatment A and 150 people who received Treatment B experienced severe skin rash.

5 people who received Treatment A and 11 people who received Treatment B experienced blurred vision.

A A family member with this medical condition has asked for your advice about which treatment to choose. Do you have enough information to make a recommendation? If so, which treatment would you recommend? If not, what additional information do you need?

B The study had 650 participants, and 200 of them received Treatment A. The others received Treatment B. Do you now have enough information to make a recommendation? If so, which treatment would you recommend and why?

+++++ This lesson is part of STATWAY, A Pathway Through College Statistics, which is a product of a Carnegie Networked Improvement Community that seeks to advance student success. Version 1.0, A Pathway Through Statistics, Statway was created by the Charles A. Dana Center at the University of Texas at Austin under sponsorship of the Carnegie Foundation for the Advancement of Teaching. Version 1.5 and all subsequent versions result from the continuous improvement efforts of the Carnegie Networked Improvement Community. The network brings together community college faculty and staff, designers, researchers, and developers. It is an open-resource research and development community that seeks to harvest the wisdom of its diverse participants in systematic and disciplined inquiries to improve developmental mathematics instruction. For more information on the Statway Networked Improvement Community, please visit carnegiefoundation.org.
For the most recent version of instructional materials, visit Statway.org/kernel. +++++
STATWAY and the Carnegie Foundation logo are trademarks of the Carnegie Foundation for the Advancement of Teaching. A Pathway Through College Statistics may be used as provided in the CC BY license, but neither the Statway trademark nor the Carnegie Foundation logo may be used without the prior written consent of the Carnegie Foundation.