Villarreal Rm. 170 Handout (4.3)/(4.4) - 1 Designing Experiments I

Statistics and Probability B
Ch. 4 Sample Surveys and Experiments

Suppose we wanted to investigate whether caffeine truly affects one's pulse rate, and we wanted to use our class to investigate. How could we design an experiment?

What is the explanatory variable (factor)?

What is the response variable?

Who will be the experimental units?

Here is an initial plan:
- measure initial pulse rate
- give each student some caffeine
- wait for a specified time
- measure final pulse rate
- compare final and initial rates

What are some problems with this plan?

A major problem with this plan can be addressed by including a control group that does not receive caffeine.

A control group (also called a comparison group) is a group of subjects in an experiment who receive either no treatment or a comparative treatment. The use of a control group allows the experimenter to assess how the response variable behaves when a comparative treatment (or non-treatment) is used. Control groups provide the basis of comparison for evaluating the effectiveness of the experimental treatment.

In our experiment, we can accomplish this by using 2 levels of caffeine: no caffeine and some caffeine. For example, we could assign each member one of two treatments: Regular Coke or Caffeine Free Coke.

Why don't we give Coke to one group and nothing to the other group?

The placebo effect refers to the human phenomenon found in certain experiments (mainly medical) wherein subjects who believe they are receiving a special treatment tend to feel better or show improvement, regardless of what the treatment actually is. This belief may cause a change in the response variable that is confounded with the effect of the treatment. In this case, if one group got Coke and the other group got nothing, it might be difficult to tell if an increase in pulse rate was due to the caffeine (explanatory variable) or due to the excitement/anticipation of drinking Coke (placebo effect).
Having every subject receive a treatment ensures that the placebo effect acts on both groups in the same way. Then, any difference between the pulse rates of the two groups can be attributed to the explanatory variable (drinking caffeine/not drinking caffeine) and not to the excitement of being in an experiment. Of course, it is essential that the subjects do not know which treatment they are receiving! When a person doesn't know who is receiving which treatment, that person is said to be blinded.

There are two classes of individuals who can influence the results of an experiment:
- those who take part in the experiment (subjects, treatment administrators, etc.)
- those who evaluate the results (experimenters)

When every individual in one of these classes is blinded, the experiment is called single blind. If every individual in both classes is blinded, then the experiment is double blind.

Can our experiment be run in a double-blind manner?

But doesn't someone need to know which subjects received which treatments?

Four Key Principles of a Good Experiment:

THE BIG IDEA: Our goal when designing an experiment is to make the treatment groups as similar as possible, with the exception of the treatments. Then, if there is a change in the response, it can be attributed to the explanatory variable and not to any extraneous variables.

An extraneous variable is one that is not of interest in the current study but is thought to affect the response variable. For example, sugar is an extraneous variable since it may affect pulse rates. If one treatment group were given regular Coke and the other treatment group were given caffeine-free Diet Coke, then sugar and caffeine would be confounded. If there were a difference in the average pulse rates of the two groups after receiving the treatments, we wouldn't know which variable caused the change, or to what extent.

To prevent sugar from becoming a confounding variable, we need to make sure that both treatment groups get the same amount of sugar. This is called direct control.

Principle #1: Direct Control means holding extraneous variables constant for all treatment groups so that their effects are not confounded with the explanatory variable.

What extraneous variables should we try to hold constant in our caffeine experiment?

If we do not control these extraneous variables by making them the same for all treatment groups, they could be confounded with the effect of the caffeine on pulse rates.
In other words, we may not be able to tell if it was the caffeine or (__________) that caused the higher pulse rate.

Principle #2: Blocking is when subjects are divided into groups (blocks) based on some extraneous variable they have in common.

What if men react to caffeine differently than women? If more men end up in the experimental group and more women end up in the control group, then gender and caffeine will be confounded. We will not know which variable caused the change in pulse rates, gender or caffeine.

How can we eliminate this confounding variable?
- Eliminate one gender from the study, but then we could only draw conclusions about one gender.
- Make sure there is a proportional number of men and women in each treatment group. For example, if there are 20 women and 30 men in the experiment, then the experimental group should have 10 women and 15 men, and the control group should have the same.

In this example, we have formed 2 blocks: men and women. Then, we assigned treatments to the subjects within each block.

Blocking in experiments is similar to stratification in sampling. Blocking reduces the variability of the results, just like stratifying. Blocks should be chosen like strata: the units within a block should be similar to each other, but different from the units in the other blocks. You should only block when you expect that the subjects in one block will respond differently than the subjects in other blocks.

What are some other extraneous factors that we can block for in our caffeine experiment?

You should try to make the blocks as small as possible. Ideally, the size of each block should be the same as the number of treatments. For example, if there are 3 treatments, then there should be 3 subjects in each block. If each block has only 2 subjects, then the subjects are called a matched pair.

Can we create small blocks and use a matched pair design for our caffeine experiment?

Principle #3: Randomization is the random assignment of subjects to treatments to ensure that the experiment doesn't systematically favor one treatment over the other.

What about all of the other extraneous variables we do not think of? What about the variables we cannot directly control or block for?
- amount of food eaten before the experiment
- caffeine tolerance

If we randomly assign subjects to treatments, this should even out (but not eliminate) the effects of these variables, since their effects should be spread roughly equally between the treatment groups.

Note: We must ALWAYS randomize since there will always be extraneous variables we do not consider.

How do we randomize?
- Draw names from a hat. The first half chosen go in one group, and the remaining names go in the other.
- Number the class from 01-38. Then, generate random numbers without replacement until half are chosen for one group. The remaining names go in the other group.
- For matched pairs, we can flip a coin to determine which subject goes into which group. If it's heads, the first person in the pair goes to A and the other to B. If it's tails, it's the opposite.
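
The randomization methods above can be sketched in a few lines of Python. This is only an illustration, not part of the class procedure: the function names, the seed, the subject labels (W1, M1, P1a, ...) and the three example pairs are all made up for the sketch. It shows a completely random split of a class numbered 1 to 38, a split done separately inside each block (using the 20 women / 30 men example), and a coin flip for each matched pair.

```python
import random

def completely_randomized(subjects, rng):
    """Shuffle the subjects (like drawing names from a hat) and split them in half."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def randomized_block(blocks, rng):
    """Randomize separately inside each block, then combine the halves."""
    group_a, group_b = [], []
    for members in blocks.values():
        a, b = completely_randomized(members, rng)
        group_a.extend(a)
        group_b.extend(b)
    return group_a, group_b

def matched_pairs(pairs, rng):
    """Flip a coin for each pair: heads sends the first person to A, tails to B."""
    group_a, group_b = [], []
    for first, second in pairs:
        if rng.random() < 0.5:  # heads
            group_a.append(first)
            group_b.append(second)
        else:  # tails
            group_a.append(second)
            group_b.append(first)
    return group_a, group_b

# Arbitrary seed (an assumption) so the same assignment can be reproduced.
rng = random.Random(170)

# Class numbered 01-38, split into two groups of 19.
class_numbers = list(range(1, 39))
group_1, group_2 = completely_randomized(class_numbers, rng)

# Hypothetical labels for the 20 women and 30 men in the blocking example.
blocks = {
    "women": [f"W{i}" for i in range(1, 21)],
    "men": [f"M{i}" for i in range(1, 31)],
}
experimental, control = randomized_block(blocks, rng)

# Hypothetical matched pairs (how the pairs are formed is up to the experimenter).
pairs = [("P1a", "P1b"), ("P2a", "P2b"), ("P3a", "P3b")]
pair_group_a, pair_group_b = matched_pairs(pairs, rng)

print("Group 1:", sorted(group_1))
print("Group 2:", sorted(group_2))
```

With blocking, each treatment group always ends up with 10 women and 15 men; only who those people are is left to chance.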

If you do not use blocking when dividing the subjects, the result is a completely randomized design. If you incorporate blocking in your design, it is called a randomized complete block design (every subject is assigned to a block based on some characteristic, and the members of each block are randomly assigned to the different treatments).

"Blocking is used to control the factors you can see; randomization helps balance the ones you cannot see." (Dick Scheaffer, statistics scholar)

Principle #4: Replication means ensuring that there are an adequate number of observations in each treatment group.

If each treatment group had only one experimental unit, then we would not be able to conclude that any changes in the response are due to the treatments, because some characteristic of that unit could have caused the change instead.

Increasing the sample size makes randomization more effective. The more subjects we have, the more balanced our treatment groups are likely to be. For example, if we have 10 subjects and only 2 have a certain unknown characteristic that significantly affects the response variable, there is a sizable chance (4/9, or about 44%, if the subjects are split into two groups of 5) that both of those subjects will end up in the same treatment group simply by chance. However, if we have 100 subjects and 20 have the characteristic, it is extremely unlikely for all 20 to end up in the same group. There is a much better chance that the groups will be close to balanced (10/10, 9/11, 11/9, etc.) when the sample size is larger.

Note: Replication also refers to repeating the experiment with different subjects. This can help us feel more confident applying the results of our experiment to a wider population.

SUMMARY: With control, blocking, randomization, and replication, each treatment group should be nearly identical, and the effects of unknown or uncontrolled extraneous variables should be about the same in each group.
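
The balance example under Principle #4 (2 of 10 subjects versus 20 of 100 subjects sharing a hidden characteristic) can be checked with a short counting calculation. This is a sketch under one assumption that the handout does not state explicitly: the subjects are split at random into two equal-sized groups. The function names are made up for the illustration.

```python
from math import comb

def prob_all_in_one_group(n_subjects, n_with_trait):
    """Chance that every subject with the trait lands in the same half
    when the subjects are split at random into two equal groups."""
    half = n_subjects // 2
    # Fill the rest of that group with subjects who lack the trait;
    # multiply by 2 because either group could be the one that gets them all.
    ways = 2 * comb(n_subjects - n_with_trait, half - n_with_trait)
    return ways / comb(n_subjects, half)

def prob_split(n_subjects, n_with_trait, k):
    """Chance that exactly k of the subjects with the trait land in the first group."""
    half = n_subjects // 2
    ways = comb(n_with_trait, k) * comb(n_subjects - n_with_trait, half - k)
    return ways / comb(n_subjects, half)

# 10 subjects, 2 with the trait: chance both end up in the same group.
small_study = prob_all_in_one_group(10, 2)

# 100 subjects, 20 with the trait: chance all 20 end up in the same group.
large_study = prob_all_in_one_group(100, 20)

# 100 subjects, 20 with the trait: chance of a 9/11, 10/10, or 11/9 split.
near_balanced = sum(prob_split(100, 20, k) for k in (9, 10, 11))

print(f"10 subjects, both in one group:  {small_study:.3f}")
print(f"100 subjects, all 20 in a group: {large_study:.2e}")
print(f"100 subjects, split within 1 of even: {near_balanced:.3f}")
```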
Now, if changes in the explanatory variable are associated with changes in the response variable, we can conclude that it is a cause-and-effect relationship.

Note: Not every experiment needs a no-treatment control group or a placebo, as long as there is a comparison. For example, if you are testing a new drug, it is usually compared to the currently used drug, not a placebo. Moreover, for non-medical experiments a placebo treatment is often unnecessary.

Note: There are ethical issues to consider when doing experiments:
- smoking and lung cancer: we cannot force people to smoke, even though a randomized experiment would be the most direct way to show that smoking causes lung cancer.
- many medical experiments are ended early if the experimenters discover that one treatment is much more effective (e.g., the aspirin study).

Note: The results of an experiment are called statistically significant if they would be unlikely to occur by random chance alone. For example, if caffeine really has no effect on pulse rates, then we would expect the average pulse rates of the two groups to be the same. However, because the results will vary depending on which subjects are assigned to which group, the averages will probably differ slightly. Thus, whenever we do an experiment and find a difference between two groups, we need to determine whether this difference could have occurred by chance or whether there really is a difference in the treatments. To do that, we need to use probability and learn about statistical inference procedures. You can learn about these procedures in a college statistics class.
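
One informal way to see what "unlikely to occur by random chance" means is to re-do the random assignment many times on a computer. The sketch below is only an illustration and is not necessarily the procedure taught in a college course. The pulse-rate changes are HYPOTHETICAL numbers invented for the example, and the function names, seed, and number of shuffles are assumptions. If the treatment labels did not matter, any shuffling of the subjects between the two groups would be just as plausible as the one we observed, so we count how often a shuffled split gives a difference in averages at least as large as the observed one.

```python
import random
from statistics import mean

def mean_difference(group_a, group_b):
    """Difference between the two group averages."""
    return mean(group_a) - mean(group_b)

def rerandomization_proportion(group_a, group_b, n_shuffles, rng):
    """Proportion of random re-assignments whose difference in averages
    is at least as large (in size) as the observed difference."""
    observed = abs(mean_difference(group_a, group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the group labels were handed out again
        shuffled_diff = abs(mean_difference(pooled[:n_a], pooled[n_a:]))
        if shuffled_diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles

# HYPOTHETICAL data: made-up changes in pulse rate (beats per minute).
caffeine = [8, 5, 10, 3, 7, 9, 6, 4]
caffeine_free = [2, 4, 1, 5, 0, 3, 2, 6]

rng = random.Random(4)  # arbitrary seed so the run can be repeated
observed_difference = mean_difference(caffeine, caffeine_free)
proportion = rerandomization_proportion(caffeine, caffeine_free, 10_000, rng)

print("Observed difference in averages:", observed_difference)
print("Proportion of shuffles at least that large:", proportion)
```

A small proportion suggests the observed difference would be hard to get from the random assignment alone; a large proportion suggests chance is a reasonable explanation.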