Chapter 5: Producing Data

Observational study: observes individuals and measures variables of interest but does not attempt to influence the responses.
Experiment: deliberately imposes some treatment on individuals in order to observe their responses.

An observational study is a poor way to gauge the effect of an intervention. When looking for cause-and-effect relationships you MUST have an experiment. Look at Example 5.1 on page 270.

Simulation: a way to produce data when you cannot impose a treatment.

5.1 Designing Samples

Population: the entire group of individuals we want information about.
Sample: the part of the population that we actually examine in order to gather information.
Sampling: studying a part in order to gain information about the whole.
Census: attempts to contact every individual in the population.
Design: the method used to choose the sample from the population. Poor designs produce misleading conclusions. Ex. page 272.
Bad Sample Designs

Voluntary response sample: consists of people who choose themselves by responding to a general appeal. Biased because people with strong opinions, especially negative opinions, are most likely to respond.
Convenience sample: chooses the individuals easiest to reach.
Bias: a design is biased if it systematically favors certain outcomes.

Good Sample Designs

Simple random sample (SRS): n individuals chosen from the population in such a way that every set of n individuals has an equal chance of being the sample selected. (As a result, every individual has the same chance of being chosen.)

Random digits
Table of random digits (0-9): Table B (talk about # of places: every label must have the same number of digits, very important)
Random number generator: calculator

Choosing an SRS
1) Label: assign a numerical label to each individual
2) Use Table B to select labels at random
3) Or use random digits on the calculator: MATH, PRB, randInt(start #, end #, how many you need)

Probability sample: a sample chosen by chance. We must know what samples are possible and what chance, or probability, each possible sample has.
Stratified random sample: divide the population into groups called strata. Choose a separate SRS in each stratum and then combine these to form the full sample.
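The label-then-select steps for an SRS, and the stratified version, can be sketched in Python. This is only an illustration, not part of the textbook: the class of 30 students, the junior/senior split, and the function names are made up for the example, and Python's random module stands in for Table B or randInt.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Choose n labels so that every set of n labels is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a separate SRS inside each stratum, then combine them."""
    rng = random.Random(seed)
    combined = []
    for name, members in strata.items():
        combined.extend(rng.sample(members, n_per_stratum[name]))
    return combined

# Step 1: label the individuals 1..30 (a hypothetical class of 30 students).
labels = list(range(1, 31))

# Steps 2-3: choose 5 labels at random.
srs = simple_random_sample(labels, 5, seed=1)

# Hypothetical strata: labels 1-18 are juniors, labels 19-30 are seniors.
strata = {"juniors": labels[:18], "seniors": labels[18:]}
strat = stratified_sample(strata, {"juniors": 3, "seniors": 2}, seed=1)

print("SRS:", sorted(srs))
print("Stratified sample:", sorted(strat))
```

Fixing the seed only makes the run repeatable for checking; leave it as None to get a fresh sample each time.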
Cautions about sample surveys

Undercoverage: we can't get a complete list of the population. Creates bias because the poor are most often not included in this list.
Nonresponse: individuals can't be contacted or do not cooperate.
Response bias: the behavior of the respondent or the interviewer may bias the sample (the body language of the interviewer or the inflection in how they ask the question; the respondent may lie).
Wording of question: this may influence the response. See page 282, diaper example.

Larger random samples give more accurate results than smaller samples.

Homework: pages 269-288, problems 1-12 and 19-23. Methods of sampling worksheet and problems 24-29, pages 285-288.
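The claim that larger random samples give more accurate results can be checked with a quick simulation. The sketch below is an illustration under stated assumptions: the true proportion 0.6 and the sample sizes 25 and 400 are invented, and each individual is drawn independently, which approximates an SRS from a very large population.

```python
import random
import statistics

def sample_proportions(p_true, n, repetitions, seed=0):
    """Draw many random samples of size n and record each sample proportion."""
    rng = random.Random(seed)
    results = []
    for _ in range(repetitions):
        yes = sum(1 for _ in range(n) if rng.random() < p_true)
        results.append(yes / n)
    return results

# Assumed true proportion of "yes" in the population: 0.6
small = sample_proportions(0.6, 25, 1000)
large = sample_proportions(0.6, 400, 1000)

# Spread of the sample proportions around the truth for each sample size
spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)

print("spread with n = 25: ", round(spread_small, 3))
print("spread with n = 400:", round(spread_large, 3))
```

The results from the larger samples cluster more tightly around 0.6. Note that a larger sample does not fix bias from undercoverage, nonresponse, or question wording.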
5.2 Designing Experiments

Experimental units: the individuals on which the experiment is done.
Subjects: the experimental units when the individuals are humans.
Treatment: a specific experimental condition applied to the units.
Factors: the explanatory variables in an experiment.
See examples, pages 290, 291.

Experiments can give good evidence for causation.
Experiment advantage: allows us to study specific factors while controlling lurking variables. See Example 5.11, page 292.
Control group: the group that receives a placebo.
Example 5.12

Random assignment
    Group 1 (15 rats) -> Treatment 1: new diet
    Group 2 (15 rats) -> Treatment 2: standard diet
Compare weight gain

Need to explain how you are randomly assigning rats to the groups (names in a hat, random numbers, etc.).
Need to decide what treatment each group receives (roll a die, flip a coin, etc.).
Please be very specific in explaining an experimental design!!!

Principles of experimental design
1. Control the effects of lurking variables on the response, most simply by comparing two or more treatments. (Many times this means adding a placebo to use as a control.)
2. Randomize: use impersonal chance to assign individuals to treatments.
3. Replicate each treatment on many units to reduce chance variation.

Statistically significant: an observed effect so large that it would rarely occur by chance.

Homework: read pages 290-297, do problems 31-39. Why randomize worksheet Monday.
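One way to be specific about the random assignment in the rat example is to label the rats 1-30, shuffle the labels, and let the first 15 form Group 1. The sketch below assumes that labeling scheme and uses a made-up function name; it plays the role of names in a hat.

```python
import random

def randomize_two_groups(units, seed=None):
    """Shuffle the labels and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Label the 30 rats 1..30, then let chance decide the groups.
rats = list(range(1, 31))
group1, group2 = randomize_two_groups(rats, seed=5)

print("Group 1 (new diet):     ", sorted(group1))
print("Group 2 (standard diet):", sorted(group2))
```

Here Group 1 is simply declared to be the new-diet group; a coin flip could decide that instead, as the notes suggest.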
Cautions about experimentation

***** Double blind *****: neither the subjects nor the people who have contact with the subjects (administrators) know which treatment each subject received.
Lack of realism: the treatment, subjects, or setting may not realistically duplicate the conditions we really want to study. See Example 5.15 on page 300.

Matched pairs design

The first form of matched pairs has the same subject exposed to both treatments so we can compare which is best. It is hard to compare what I like to what you like, how well I did on a test to how well you did on a test, etc. We need to randomly assign which treatment each subject receives first.

The second form of matched pairs is in agriculture. The soil, water content, sunlight, and elevation may vary when using different plots. We split a plot in half and use one treatment on one side and a second treatment on the other side. Which side gets which treatment needs to be randomly assigned.

Subject   First treatment   Second treatment
1         A                 B
2         A                 B
3         B                 A
4         A                 B
5         B                 A
6         B                 A
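A table of treatment orders like the one above can be produced by letting chance pick the order separately for each subject. This is a sketch only: the function name is invented, and the random draw stands in for a coin flip per subject.

```python
import random

def randomize_order(subjects, treatments=("A", "B"), seed=None):
    """For each subject, let chance decide which treatment comes first."""
    rng = random.Random(seed)
    order = {}
    for subject in subjects:
        first, second = rng.sample(treatments, 2)  # a random ordering of A and B
        order[subject] = (first, second)
    return order

orders = randomize_order(range(1, 7), seed=3)

print("Subject   First   Second")
for subject, (first, second) in orders.items():
    print(f"{subject:<9} {first:<7} {second}")
```

Because each subject is randomized independently, the split between A-first and B-first will not always be 3 and 3.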
Block design

Used when we think there is a lurking variable. We first sort the subjects into groups (blocks) by the possible lurking variable, and then, within each block, we randomly assign which treatment each subject receives.

Block by gender
    Male   -> randomly assign -> Treatment A or Treatment B -> Results (response)
    Female -> randomly assign -> Treatment A or Treatment B -> Results (response)

Homework: read pages 299-303, do problems 43-48. Soda Pop Challenge. Review 49-53, 56, 58. Quiz and capture/recapture worksheet.
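Randomizing within blocks can be sketched as follows. The subject labels (M1-M6, F1-F8) and the function name are hypothetical; the point is that the shuffle happens separately inside each block, so each block ends up with both treatments.

```python
import random

def block_randomize(blocks, treatments=("A", "B"), seed=None):
    """Within each block, shuffle the subjects and deal out the treatments in turn."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, members in blocks.items():
        shuffled = list(members)
        rng.shuffle(shuffled)
        for i, subject in enumerate(shuffled):
            assignment[subject] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical subjects, blocked by gender.
blocks = {
    "male": ["M1", "M2", "M3", "M4", "M5", "M6"],
    "female": ["F1", "F2", "F3", "F4", "F5", "F6", "F7", "F8"],
}
assignment = block_randomize(blocks, seed=7)

for subject, (block, treatment) in sorted(assignment.items()):
    print(subject, block, "Treatment", treatment)
```

Responses are then compared between Treatment A and Treatment B inside each block, so the blocking variable cannot be confused with the treatment effect.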
5.3 Simulating Experiments

Probability model: develop a model for simulating (imitating) the procedure.

Example: a couple having a girl among their first four children.
Flip a coin: Heads = girl, Tails = boy.
Do this enough times to represent a large number of families. The proportion of families in which heads appears within the first 4 flips would be a good estimate.
Could do this same experiment with a die: even = girl, odd = boy.
Let's do this experiment 20 times.

Simulation: the imitation of chance behavior, based on a model that accurately reflects the experiment.

Simulation steps
1. State the problem or describe the experiment
2. State the assumptions (each outcome equally likely to occur, and each independent of the others)
3. Assign digits to represent outcomes, or assign heads, tails, or a number from a die
4. Simulate many repetitions
5. State your conclusion
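The coin-flip model can be run many more times on a computer than by hand. The sketch below follows the five steps under the stated assumptions (girl and boy equally likely, births independent); the function names and the repetition counts are chosen only for illustration.

```python
import random

def girl_in_first_four(rng):
    """One family: flip up to 4 coins, heads = girl; stop at the first girl."""
    for _ in range(4):
        if rng.random() < 0.5:  # heads = girl
            return True
    return False

def estimate(repetitions, seed=0):
    """Proportion of simulated families with a girl among the first 4 children."""
    rng = random.Random(seed)
    hits = sum(girl_in_first_four(rng) for _ in range(repetitions))
    return hits / repetitions

estimate_20 = estimate(20)            # like doing the experiment 20 times in class
estimate_large = estimate(100_000)    # many more repetitions
exact = 1 - 0.5 ** 4                  # 1 - P(four boys) under the same assumptions

print("20 families:     ", estimate_20)
print("100,000 families:", estimate_large)
print("exact value:     ", exact)
```

With only 20 repetitions the estimate can wander noticeably; with many repetitions it settles close to the exact value.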
Assigning digits

Example: choose a person at random from a group of which 70% are employed.
Employed: numbers 0-6, or 00-69, or 001-070
Not employed: numbers 7-9, or 70-99, or 071-100

Another example: 50% employed, 20% not employed, 30% not in labor force.

Category             One digit   Two digits   Three digits
Employed             0-4         00-49        001-050
Not employed         5-6         50-69        051-070
Not in labor force   7-9         70-99        071-100

I prefer one-digit numbers when possible, because it is faster and easier to get the numbers from the random digits table.

Ex. Frozen yogurt has the following relative frequencies: 38% chocolate, 42% vanilla, 20% strawberry. The task is to simulate sales.
00-37 Chocolate
38-79 Vanilla
80-99 Strawberry
Use row 115 and let's do 5 simulations.
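The frozen yogurt digit assignment can be written as a small lookup. This sketch does not reproduce row 115 of the table; Python's random number generator stands in for it, and the function name is made up for the example.

```python
import random

def flavor_for(two_digit):
    """Map a two-digit number 00-99 to a flavor using the assignment above."""
    if not 0 <= two_digit <= 99:
        raise ValueError("expected a number from 00 to 99")
    if two_digit <= 37:
        return "chocolate"   # 00-37: 38 of the 100 pairs
    if two_digit <= 79:
        return "vanilla"     # 38-79: 42 of the 100 pairs
    return "strawberry"      # 80-99: 20 of the 100 pairs

# Check that the assignment matches the stated relative frequencies.
counts = {"chocolate": 0, "vanilla": 0, "strawberry": 0}
for pair in range(100):
    counts[flavor_for(pair)] += 1

# Simulate 5 sales with random two-digit numbers (a stand-in for the table).
rng = random.Random(115)
sales = [flavor_for(rng.randrange(100)) for _ in range(5)]

print(counts)
print(sales)
```

Counting how many of the 100 pairs go to each flavor is a quick way to confirm that the digit ranges add up to the right percentages.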
Simulate an experiment to find the probability of having a girl among the first 4 children. Let's use the random digits table.
Girl: 0, 2, 4, 6, 8
Boy: 1, 3, 5, 7, 9
We stop when we get a girl or once we reach 4 children.
Let's do 5 simulations of 10 trials, find the percent for each simulation, and then average the 5 results.

Homework: read pages 309-319, do problems 59-63 and 74-80. Review worksheet and 82, 83, 86. Test chapter 5.
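The plan of 5 simulations of 10 trials each can be sketched as below. Random digits from Python replace the printed table, and the function names and seed are arbitrary choices for the illustration; the digit assignment (even = girl, odd = boy) and the stopping rule follow the notes.

```python
import random

def one_family(rng):
    """Read digits until an even digit (girl) appears or 4 children are reached."""
    for _ in range(4):
        digit = rng.randrange(10)   # one random digit 0-9
        if digit % 2 == 0:          # even digit = girl, so stop
            return True
    return False                    # four odd digits = four boys

def run_simulation(trials, rng):
    """Percent of trials (families) that had a girl among the first 4 children."""
    girls = sum(one_family(rng) for _ in range(trials))
    return 100 * girls / trials

rng = random.Random(115)
percents = [run_simulation(10, rng) for _ in range(5)]
average = sum(percents) / len(percents)

print("percent for each simulation:", percents)
print("average of the 5 results:   ", average)
```

Each simulation of only 10 trials can only land on a multiple of 10 percent, which is why averaging the 5 results gives a steadier estimate.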