CCM6+7+ Unit 12 Packet: Statistics and Data Analysis

CCM6+7+ Unit 12: Data Collection and Analysis

Big Ideas                                          Page(s)
What is data/statistics?                           2-4
Measures of Reliability and Variability:
  Sampling                                         5-7
  Box Plots and Dot Plots                          8-13
  Histograms                                       14-17
  MAD (and some review)                            18-23
Measures of Center (Mean, Median, Mode)
  and Shapes of Distributions                      24-38
Analyzing and Comparing Data Distributions         39-54
Study Guide/Review                                 55-58
Random Samples and Surveys
Common Core Additional Investigations, Grade Seven, page 31

Reviewing Good Sampling:
Common Core Additional Investigations, Grade Seven, page 29
Common Core Additional Investigations, Grade Seven, page 30
Common Core Additional Investigations, Grade Seven, page 32
The histogram is a popular graphing tool. It is used to summarize discrete or continuous data that are measured on an interval scale. It is often used to illustrate the major features of the distribution of the data in a convenient form.

A histogram divides up the range of possible values in a data set into classes or groups. For each group, a rectangle is constructed with a base length equal to the range of values in that specific group, and an area proportional to the number of observations falling into that group. This means that the rectangles will be drawn of non-uniform height.

A histogram has an appearance similar to a vertical bar graph, but when the variables are continuous, there are no gaps between the bars. When the variables are discrete, however, gaps should be left between the bars. Figure 1 is a good example of a histogram.

A vertical bar graph and a histogram differ in these ways:
In a histogram, frequency is measured by the area of the column.
In a vertical bar graph, frequency is measured by the height of the bar.

Histogram characteristics

Generally, a histogram will have bars of equal width, although this is not the case when class intervals vary in size. Choosing the appropriate width of the bars for a histogram is very important. As you can see in the example above, the histogram consists simply of a set of vertical bars. Values of the variable being studied are measured on an arithmetic scale along the horizontal x-axis. The bars are of equal width and correspond to the equal class intervals, while the height of each bar corresponds to the frequency of the class it represents.

The histogram is used for variables whose values are numerical and measured on an interval scale. It is generally used when dealing with large data sets (greater than 100 observations). A histogram can also help detect any unusual observations (outliers) or any gaps in the data.
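The idea of dividing the range of values into equal-width classes and counting the observations in each class can be tried out with a short Python sketch. The measurements, the starting value of 150, and the class width of 5 are all made up for illustration; they do not come from the packet or from Figure 1.

```python
def bin_counts(values, low, width, n_classes):
    """Count how many values fall in each equal-width class.

    Class i covers low + i*width up to (but not including) low + (i+1)*width.
    Values outside all classes are ignored.
    """
    counts = [0] * n_classes
    for v in values:
        index = int((v - low) // width)
        if 0 <= index < n_classes:
            counts[index] += 1
    return counts


# Made-up measurements, for illustration only (not from the packet).
heights_cm = [151, 153, 158, 160, 161, 162, 164, 165, 167, 168, 171, 174]
counts = bin_counts(heights_cm, low=150, width=5, n_classes=5)

# Print a sideways text histogram: one row per class, one # per observation.
for i, count in enumerate(counts):
    start = 150 + 5 * i
    print(f"{start} to under {start + 5}: " + "#" * count)
```

Because every class here has the same width, the length of each row of # marks plays the role of both the height and the area of the bar.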
Histographs

A histograph, or frequency polygon, is a graph formed by joining the midpoints of histogram column tops. These graphs are used only when depicting data from the continuous variables shown on a histogram. A histograph smooths out the abrupt changes that may appear in a histogram, and is useful for demonstrating continuity of the variable being studied. Figures 2 and 3 are good examples of histographs.

Unlike Figure 2, this histograph has spaces between the bars. By just looking at this illustration, the reader can immediately tell that the spaces mean the variables are discrete. In this way, histographs make it easier for readers to determine what type of variables have been used.
Common Core Additional Investigations, Grade Seven, page 32
Common Core Additional Investigations, Grade Seven, page 33
Data Distributions
CMP2 Grade 7 Lesson 2.2, pages 32-35

You can think of the MEAN as the balance point in a distribution of data. It acts like the fulcrum of a see-saw.

Here is a set of data showing the amount of sugar in a serving of each of ten cereals, in grams:
1, 3, 6, 6, 6, 6, 6, 6, 10, 10

Make a line plot to show the distribution. What is the mean for these data?

Make one or more changes to the data so that the mean is 7 and the range is:
a) the same as the range of the original data set
b) greater than the range of the original data set
c) less than the range of the original data set

Below is a dot plot showing the sugar per serving in nine cereals. What is indicated by the arrows on each side of the line marking the mean?
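If you want to check a hand-drawn line plot of the ten-cereal data, a short Python sketch like the one below prints a sideways version, with one X per cereal. The function name line_plot_rows is made up for this example.

```python
from collections import Counter

# Sugar per serving, in grams, for the ten cereals listed in the problem.
sugar_grams = [1, 3, 6, 6, 6, 6, 6, 6, 10, 10]


def line_plot_rows(values):
    """Return one text row per whole number from min to max, with an X per data value."""
    tally = Counter(values)
    return [f"{v:>2} | " + "X" * tally[v] for v in range(min(values), max(values) + 1)]


for row in line_plot_rows(sugar_grams):
    print(row)

# The mean is the total divided by the number of values;
# the range is the largest value minus the smallest.
mean = sum(sugar_grams) / len(sugar_grams)
data_range = max(sugar_grams) - min(sugar_grams)
print("mean:", mean, "range:", data_range)
```

Changing numbers in the list and re-running is one way to test ideas for parts a) through c).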
Graph A and Graph B show two different data distributions. Latoya guesses that each distribution has a mean of 5 grams of sugar per serving. For each distribution, answer questions 1-3.

1. Find the difference from Latoya's guess of 5 for each data value that is greater than 5. What is their sum?
2. Find the difference from Latoya's guess of 5 for each data value that is less than 5. What is their sum?
3. Is Latoya correct that the mean is 5? Does the distribution balance? If so, explain. If not, change one or more data values to make it balance.
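The balance test in questions 1-3 can be sketched in Python. The values for Graph A and Graph B are not reproduced in this text, so the sketch reuses the ten-cereal data set from the earlier mean problem (1, 3, 6, 6, 6, 6, 6, 6, 10, 10) as a stand-in. The function name balance_check is made up for this example.

```python
def balance_check(values, guess):
    """Return (total distance above the guess, total distance below the guess).

    The guess is the mean exactly when the two totals are equal.
    """
    above = sum(v - guess for v in values if v > guess)
    below = sum(guess - v for v in values if v < guess)
    return above, below


# Stand-in data: the ten-cereal set from the earlier mean problem,
# not the values plotted in Graph A or Graph B.
sugar_grams = [1, 3, 6, 6, 6, 6, 6, 6, 10, 10]

for guess in (5, 6):
    above, below = balance_check(sugar_grams, guess)
    verdict = "balances" if above == below else "does not balance"
    print(f"guess {guess}: above = {above}, below = {below}, {verdict}")
```

When the two totals differ, the larger total shows which side of the guess is "heavier", which suggests the direction the guess should move.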
**A few more MEAN problems!**

5. Send It Quick Mail House mailed five packages with a mean weight of 6.7 pounds. Suppose the mean weight of four of these packages is 7.2 pounds. What is the weight, in pounds, of the fifth package?
A) 3.35  B) 4.7  C) 6.95  D) 8.7

6. Del Kenya's test scores are 100, 83, 88, 96, and 100. His teacher tells the class that they can choose the measure of center she will use to determine final grades. Which measure should Del Kenya use?
A) mean  B) median  C) mode  D) range

7. a) A gymnast receives these scores from five judges: 7.6, 8.2, 8.5, 8.2, 8.9
What happens to the mean of the scores when you multiply each data value by 2? By 2/3? By 0.2?
b) Why do you think the mean changes as it does in each situation?

8. A student gets 40 points out of 100 on a test. Her teacher announces that this test and next week's test will be averaged together for her grade. The student wonders if she can still get a 70% if she gets a 100 on the next test. She reasons, "I think my average (mean) would be 70 because half of 40 is 20 and half of 100 is 50. That is a 70% because 20 plus 50 is 70." Does her method always work? Explain your thinking.
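Two ideas behind these problems can be checked with a short Python sketch: a mean times the number of values gives the total (problem 5), and multiplying every value by the same factor multiplies the mean by that factor (problem 7). The function names are made up for this example, and the results are rounded because decimal arithmetic on a computer is only approximate.

```python
def missing_value(mean_all, count_all, mean_known, count_known):
    """Total of all the values minus the total of the known values."""
    return mean_all * count_all - mean_known * count_known


def mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)


# Problem 5: five packages average 6.7 lb; four of them average 7.2 lb.
fifth = missing_value(6.7, 5, 7.2, 4)
print("fifth package:", round(fifth, 2))

# Problem 7: compare the mean of the scaled scores with the scaled mean.
scores = [7.6, 8.2, 8.5, 8.2, 8.9]
for factor in (2, 0.2):
    scaled = [s * factor for s in scores]
    print(factor, round(mean(scaled), 3), round(mean(scores) * factor, 3))
```

The same comparison can be repeated with any other factor by adding it to the tuple in the loop.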
CMP2 Data Distributions Grade 7 Lesson 2.3, pages 36-39

The MODE is the data value that occurs the MOST frequently in a set of data.

The MEDIAN is the MIDPOINT in an ordered distribution. In the graph of a distribution, data values are located below, above, or at this midpoint.

The graph below shows numerical data about the number of pets each of the 26 students has.
Using the graph above, answer the following questions.

1. What is the range of these data?
2. What is the mean and where is it located?
3. How does the location of the median compare to the location of the mean?
4. Why do you think this is so?
5. Do the data seem to cluster in some parts of the distribution?
6. Does clustering of the data appear to be related to the locations of the median and the mean?
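To see how the mean and the median can land in different places, here is a small Python sketch using the standard statistics module. The pet counts are made up for illustration (11 students, not the 26 in the packet's graph), so the numbers it prints are not answers to the questions above.

```python
from statistics import mean, median, multimode

# Made-up pet counts for 11 students (not the values in the packet's graph).
pets = [0, 0, 1, 1, 1, 2, 2, 3, 4, 6, 13]

print("range: ", max(pets) - min(pets))
print("mode:  ", multimode(pets))   # list of the most frequent value(s)
print("median:", median(pets))      # middle value of the ordered data
print("mean:  ", mean(pets))        # total divided by the number of values
```

In this made-up set, most values cluster at the low end while one student has 13 pets. That single large value raises the mean but leaves the median where the cluster is.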
Repeated Values in a Distribution

A. Jorge is ordering pizza for a party. Tamika shows Jorge the graph below. She tells him to order only thin-crust pizzas because thin-crust is the mode. Do you agree or disagree? Explain.

B. The data from 70 cereals are shown below. Which option do you suggest using to find the typical amount of sugar in a serving of cereal? Explain.

Option 1: Use the mode, 3 grams. The typical amount of sugar in a serving of cereal is 3 grams.
Option 2: Use the median, 7.5 grams. The typical amount of sugar in a serving of cereal is 7.5 grams.
Option 3: Use clusters. There are several cereals that have either 3 or 6 grams of sugar per serving. 40% of the data seem to be evenly spread between 8 and 12 grams of sugar.
Option 4: Use something else. Write your own statement about what you consider to be the typical amount of sugar in a serving of cereal.
C. An advertiser wants more people to listen to a phone message. He uses the graph below. Which of the options below should the advertiser use to decide how long the message should be? Explain.

Option 1: Use the mode. The most frequent amount of time spent listening to the phone advertisement was 3 minutes.
Option 2: Use the mean. Listening times lasted, on average, 1.51 minutes per person.
Option 3: Use clusters. One third of the people listened less than 1 minute and more than half listened less than 1.5 minutes. Only 20% listened for 3 minutes.
Option 4: Use something else. Write your own response.
D. 1. In the plots below, the data for the 70 cereals in Question B are organized by the cereals' locations on the shelves in a supermarket. Use means, medians, clusters, or other strategies to compare the three distributions. Explain your reasoning.
2. Use the information from part 1 to make a prediction about the sugar content of a cereal based on its shelf location.
CMP2 Data Distributions Grade 7 Lesson 2.4, pages 40-43

Unlike with categorical data, the mode is not always useful with numerical data. Sometimes there is no mode and sometimes there is more than one mode.

Sometimes the mean and median of a distribution are located close together. The graph below shows the distribution of the amount of sugar in cereals located on the bottom shelf in a supermarket. The mean and the median are marked. The median is 7 grams and the mean is 6.9 grams.

The overall shape of a distribution is determined by where the data cluster, where there are repeated values, and how spread out the data are. The shape of a distribution influences where the median and mean are located. In the next problem, you will experiment with making changes to distributions. Observe what these changes do to the locations of the mean and median in a distribution.
For Questions A-C, predict what will happen. Then do the computation to see whether you are correct.

A. The graph below shows the distribution of the amount of sugar in 20 cereals found on the top shelf. The sum of the values in this distribution is 91 grams. Use sticky notes or blocks to make a copy of the distribution. Note the location of the mean at 4.55 grams of sugar and the median at 3 grams of sugar.

1. Suppose you remove the three cereals with 6 grams of sugar per serving and add three new cereals, each with 9 grams of sugar per serving. What happens to the mean and the median? Why do you think this happens?
2. a) Use the new distribution from part 1. Suppose you remove a cereal with 3 grams of sugar and add a cereal with 8 grams of sugar. How do the mean and median change?
b) Suppose you remove another cereal with 3 grams of sugar and add another cereal with 8 grams of sugar. How do the mean and median change?
c) Suppose you remove a third cereal with 3 grams of sugar and add a third cereal with 8 grams of sugar. How do the mean and median change?
B. Use the new distribution from Question A, part 2. Experiment with removing data values and replacing them with new data values.

1. How does replacing smaller data values with larger data values affect the mean and the median?
2. How does replacing larger data values with smaller data values affect the mean and the median?
3. How does replacing larger and smaller data values with values that are closer to the middle of the distribution affect the mean and the median?
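After making your predictions for Question A, one way to check the mean is to work from the total instead of the individual values. The Python sketch below uses only the numbers stated in Question A (20 cereals, sum of 91 grams) and the swaps described there. The function name mean_after_swaps is made up for this example. The median is not computed, because it depends on the individual values shown in the graph, which are not reproduced in this text.

```python
def mean_after_swaps(total, count, removed, added):
    """New mean when the values in `removed` are replaced by the values in `added`."""
    if len(removed) != len(added):
        raise ValueError("a swap keeps the number of data values the same")
    return (total - sum(removed) + sum(added)) / count


total, count = 91, 20                      # given in Question A
print("original mean:", total / count)

# Part 1: three cereals with 6 grams are replaced by three with 9 grams.
print("after part 1:", mean_after_swaps(total, count, [6, 6, 6], [9, 9, 9]))

# Part 2 a), b), c): one more 3-gram cereal is replaced by an 8-gram cereal each time.
removed, added = [6, 6, 6], [9, 9, 9]
means_part_2 = []
for _ in range(3):
    removed = removed + [3]
    added = added + [8]
    means_part_2.append(mean_after_swaps(total, count, removed, added))
print("after part 2 a), b), c):", means_part_2)
```

Each swap changes the total by the difference between the new value and the old one, so the mean moves by that difference divided by 20.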
CMP2 Data Distributions Grade 7 Lessons 3.3 and 3.4, pages 60-61
REACTION TIME CARDS ARE ON THIS AND THE FOLLOWING PAGES. Feel free to pull them out of the packet and cut them apart to help group, categorize, and compare them!
COMPARING DISTRIBUTIONS: Unequal Numbers of Data Values
1. Based on visual inspection of the dotplots, which group appears to have the larger average height? Which group appears to have the greater variability in the heights?

2. Compute the mean and mean absolute deviation (MAD) for each group. Do these values support your answers in #1?

3. How many of the 12 basketball players are shorter than the tallest field hockey player?

4. Imagine that an athlete from one of the two teams told you she needs to go to practice. You estimate that she is about 65 inches tall. If you had to pick, would you think that she was a field hockey player or a basketball player? Explain your reasoning.

5. The women on the Maryland field hockey team are not a random sample of all female college field hockey players. Similarly, the women on the Maryland basketball team are not a random sample of all female college basketball players. However, for purposes of this task, suppose that these two groups can be regarded as random samples of all female college field hockey players and all female college basketball players, respectively. If these were random samples, would you think that female college basketball players are typically taller than female college field hockey players? Explain your decision using answers to questions 1-4 or any other additional analysis.
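The mean absolute deviation asked for in question 2 is the average distance of the data values from their mean. The Python sketch below shows the computation on two made-up groups of different sizes. The heights are invented for illustration and are not the Maryland team data from the dotplots.

```python
def mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)


def mean_absolute_deviation(values):
    """Average distance of the data values from their mean."""
    center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)


# Made-up heights in inches, for illustration only (not the team rosters).
group_a = [62, 64, 65, 65, 66, 68]   # six values
group_b = [66, 70, 72, 74, 78]       # five values

for name, heights in (("group A", group_a), ("group B", group_b)):
    print(name,
          "mean:", round(mean(heights), 2),
          "MAD:", round(mean_absolute_deviation(heights), 2))
```

Because both the mean and the MAD divide by the number of values in the group, they can be compared fairly even when the groups have unequal numbers of data values.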
This is an old test review from the old pre-algebra curriculum. See how you do!