Mathacle. PSet Stats, Concepts In Statistics Level Number Name: Date:

II. DESIGN OF STUDIES

Observational studies and experiments are two types of studies that aim to describe or explain the variation of responses under hypothesized factors, without or with manipulation, respectively. One important advantage of experiments over observational studies is that well-designed experiments can provide good evidence for causation.

2.1. Observational Studies

Observational studies: studies based on data collected without manipulation of the variables. The variables must be observed, the data must be collected through observation, and the researchers cannot interfere with the study in any way. Conclusions are drawn from the observed data.

Retrospective: subjects are selected and then their past conditions are studied.
Prospective: subjects are followed to observe future outcomes.

Two common types of observational studies are cohort studies and case-control studies.

Cohort: a cohort is a group of subjects who share a defining characteristic. Forming a cohort is similar to the stratified method in sampling, and a cohort study is a prospective study.
Case-control: these studies compare subjects who have the outcome of interest (cases) with subjects who do not (controls). The conclusions are not as strong as those of an experiment, which can use randomization, but comparing groups increases the strength of the conclusions. These studies are sometimes labeled quasi-experiments, although they are retrospective observational studies.

Example 2.1.1. Which of the following situations qualifies as an observational study?
(A) The girls at your high school are surveyed to determine if they believe there is any sexual stereotyping in the school newspaper.
(B) Two flowerpots are planted with the same type of seed. One is given 0.2 cups of water each day while the other is given 0.5 cups of water each day. At the end of one month, the growth of each plant is observed and measured.
(C) A team of researchers records the number and type of cars that pass a specific intersection.
(D) A student flips a coin 100 times and records the number of heads.
(E) None of these are observational studies.

Solution: The answer is C.

Example 2.1.2. An advantage to using surveys as opposed to experiments is that
(A) surveys are generally cheaper to conduct.
(B) it is generally easier to conclude cause and effect from surveys.
(C) surveys are generally not subject to bias.
(D) surveys involve use of randomization.
(E) surveys can make use of stratification.

Solution: The answer is A.

Quick-Check 2.1.1. --- Observational Studies

QC 2.1.1.1. Two studies are run to compare the experiences of low-income families receiving food stamps to those receiving cash subsidies. The first study interviews 50 families who have been in each government program for at least 2 years, while the second randomly assigns 50 families to each program and interviews them after 2 years. Which of the following is a true statement?
(A) Both studies are observational studies because of the time period involved.
(B) Both studies are observational studies because there are no control groups.
(C) The first study is an observational study; the second is an experiment.
(D) The first study is an experiment; the second is an observational study.
(E) Both studies are experiments, because in each, families are receiving treatments (food stamps or cash).

QC 2.1.1.2. Which of the following can be used to show a cause-and-effect relationship between two variables?
(A) A census
(B) A controlled experiment
(C) An observational study
(D) A sample survey
(E) A cross-sectional survey

QC 2.1.1.3. [MC1510M] Suppose you toss two fair, four-sided dice whose faces are labeled 1, 2, 3, 4. Let X represent the average of the two sides that are facing up. What is P(X = 1.5)?

Answers

QC 2.1.1.1. The answer is C.
QC 2.1.1.2. The answer is B.
QC 2.1.1.3. The outcomes of tossing two dice can be summarized below, where each entry is the average X of the two faces:

D1/D2    1      2      3      4
1        1      1.5    2      2.5
2        1.5    2      2.5    3
3        2      2.5    3      3.5
4        2.5    3      3.5    4

Two of the 16 equally likely outcomes give X = 1.5, so

P(X = 1.5) = 2/16 = 1/8
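The table for QC 2.1.1.3 can also be checked by enumerating all outcomes. The following is a minimal sketch in Python; the variable names (faces, outcomes, favorable, prob) are illustrative and not part of the original problem.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4]

# All 16 equally likely (D1, D2) pairs for two fair four-sided dice.
outcomes = list(product(faces, repeat=2))

# X is the average of the two faces; X = 1.5 means the faces sum to 3.
favorable = [(d1, d2) for d1, d2 in outcomes if Fraction(d1 + d2, 2) == Fraction(3, 2)]

prob = Fraction(len(favorable), len(outcomes))
print(favorable, prob)
```

Using Fraction keeps the arithmetic exact, so the result can be compared directly with the hand computation 2/16 = 1/8.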

2.2. Experiments

Experiments: planned studies that manipulate the levels of factors to create treatments. An experiment usually has two basic elements: a control and a treatment. An experiment uses randomization.

The basic components of an experiment are:

Experimental units: the subjects or participants that are experimented on.
Treatments: the process or intervention applied to the experimental units.
Explanatory variables or Factors: the independent variables in the experiment.
Levels of factors: the discrete values in the domains of the explanatory variables.
Response variables: the dependent variables whose values are compared across different treatments.

Experimental units can be identified as the objects that can be randomly assigned in the experimental design. As an example, suppose that you want to make your own kind of cake in your cooking class, where ten spots in your classroom can be used to place the ovens. Your experimental units are the 10 spots. The experiment can be displayed graphically as shown below:

[Figure: diagram of the experiment]

The experiment can also be expressed mathematically. In the above example,

Treatment(Oven, Sugar, Flour, Eggs) = Cake(Taste, Color, Consistency)

where

Experimental units: {Spot 1, Spot 2, ...}
Explanatory variables: {Oven, Sugar, Flour, Eggs}
Response variables: {Taste, Color, Consistency}
Levels of factors: Oven = {temp1, temp2, ...}, Sugar = {1 cup, 2 cups, ...}, Flour = {1 cup, 2 cups, ...}, Eggs = {1, 2, ...}

Example 2.2.1. Identify the experimental units, treatments, explanatory and response variables, and levels of factors for each of the following experiments.

a.) An agricultural experimental station is going to test two varieties of wheat. Each variety will be tested with two types of fertilizers. Each combination will be applied to two plots of land. The yield will be measured for each plot.

Experimental units:
Treatments:
Explanatory variables:
Response variables:
Levels of factors:

b.) Scientists want to study the effect of an anti-bacterial drug in fish lungs. The drug is administered at 3 dose levels (0, 20, and 40 mg/l). Each dose is administered to a large controlled tank through the filtration system. Each tank has 100 fish. At the end of the experiment, the fish are sacrificed, and the amount of bacteria in each fish is measured.

Experimental units:
Treatments:
Explanatory variables:
Response variables:
Levels of factors:

c.) A study was conducted to examine the crop yield for 3 varieties of corn, under 5 different fertilizers. A field of fifteen rows was available for the experiment.

Experimental units:
Treatments:
Explanatory variables:
Response variables:
Levels of factors:

Solution:

a.)
Experimental units: the plots of land (two plots for each combination)
Treatments: the combinations of wheat variety and fertilizer type applied to the plots
Explanatory variables: variety of wheat and type of fertilizer
Response variables: yield
Levels of factors: 2 varieties of wheat by 2 types of fertilizer

b.)
Experimental units: the tanks
Treatments: the doses of the anti-bacterial drug administered to the tanks
Explanatory variables: the dose of the anti-bacterial drug
Response variables: the amount of bacteria in each fish lung
Levels of factors: 3 levels of the anti-bacterial drug (0, 20, and 40 mg/l)

c.)
Experimental units: the rows
Treatments: the combinations of corn variety and fertilizer applied to the rows
Explanatory variables: variety of corn and type of fertilizer
Response variables: yield
Levels of factors: 3 varieties of corn by 5 types of fertilizer
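In part c.), the treatments are the combinations of the levels of the two factors, and the field has exactly one row per combination. The following Python sketch enumerates the combinations and assigns them to rows at random. The labels ("Variety 1", "Fertilizer 1", "Row 1"), the fixed seed, and the one-treatment-per-row random assignment are assumptions made for illustration; the problem statement does not describe how the rows were assigned.

```python
import itertools
import random

# Hypothetical labels for the levels of each factor and for the rows.
varieties = [f"Variety {i}" for i in range(1, 4)]        # 3 levels
fertilizers = [f"Fertilizer {i}" for i in range(1, 6)]   # 5 levels
rows = [f"Row {i}" for i in range(1, 16)]                # 15 experimental units

# Each treatment is one (variety, fertilizer) combination: 3 x 5 = 15.
treatments = list(itertools.product(varieties, fertilizers))

# Assumed randomization: shuffle the treatments and give one to each row.
rng = random.Random(0)  # fixed seed only so the sketch is repeatable
shuffled = treatments[:]
rng.shuffle(shuffled)
assignment = dict(zip(rows, shuffled))

for row, (variety, fertilizer) in assignment.items():
    print(row, variety, fertilizer)
```

With fifteen rows and fifteen treatments, each combination appears exactly once, so this layout has no replication.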

Quick-Check 2.2.1. --- Experiments

QC 2.2.1.1. Identify the experimental units, treatments, explanatory and response variables, and levels of factors for the following experiment.

The effectiveness of three laundry detergents is being compared. We will do three loads of laundry (all equally dirty), one with Tide, one with Cheer, and the other with Sunlight. After the wash cycle, the amount of stains removed from the clothes is compared for the three detergents.

Experimental units:
Treatments:
Explanatory variables:
Response variables:
Levels of factors:

Answers

QC 2.2.1.1.
Experimental units: the loads of laundry/washing machines
Treatments: the different brands of detergent used
Explanatory variables: detergent brand
Response variables: amount of stains removed
Levels of factors: three brands

2.3. Experimental Designs

Design of experiments (DOE) is a systematic method to determine the relationship between the factors affecting a process and the output of that process. Some commonly used terms in DOE are:

Confounding variables: variables whose effect on the response cannot be separated from the effect of the explanatory variables. That is, for a given factor A and factor B, the combination (factor A)(factor B) is also a factor.
Lurking variables: variables that are not among the explanatory or response variables in a study but that may influence the response variable. That is, for the given factors and a lurking variable B, response = f((factors)(lurking B)).
Placebo group: a control group that receives a placebo in an experiment.
Placebo effect: a beneficial effect that cannot be attributed to the properties of the placebo itself, but rather to the patient's belief in the treatment.

Single blind: the experimental units (subjects) do not know which treatment is given.
Double blind: neither the experimental units nor the experimenters know which treatment is given.
Blocking: used to control the effect of known factors such as gender. Blocking reduces the effect of confounding factors.
Randomized block design: experimental units are randomly assigned to treatments within each block.

Matched-pair study: only two treatments are applied within each block, with each subject receiving only one treatment.

Four principles of experimental design:

Control: a group that receives no treatment or the old treatment and is used for comparison.
Randomization: experimental units are assigned to treatments at random to even out the effects that cannot be controlled. The principles of randomization and control reduce the potential for bias and prevent confounding by increasing the chance that confounding variables operate equally on the intervention group and the placebo group.
Replication: repeat the same treatments on multiple experimental units to obtain redundant data. Replication reduces the chance that the results of the experiment depend on chance variation.
Blocking: reduce the effects of uncontrolled attributes of the experimental units.

Example 2.3.1. Which of the following is not a requirement of a controlled experiment?
(A) control
(B) comparison
(C) replication
(D) randomization
(E) All of these are required.

Solution: The answer is E.

Example 2.3.2. [MC1516M] A matched-pairs design is NOT an appropriate way to analyze data consisting of which of the following?
(A) Measurements of annual income taken both before and after a two-year training course for a random sample of 100 people who took the course.
(B) Measurements of annual income for each twin for 100 randomly selected pairs of twins.
(C) Measurements of annual income for both individuals in pairs formed by matching 100 people from State A and 100 people from State B based on level of education.
(D) Measurements of annual income for both individuals in pairs formed by assigning 100 people to pairs at random.
(E) Measurements of annual income recorded for both spouses of 100 randomly selected married couples.

Solution: The answer is D.

Example 2.3.3. The researchers randomly assign 1200 subjects to two treatment groups, Group 1 (600 subjects taking Drug X 325 mg) and Group 2 (600 subjects taking placebo). Three hours after taking the treatments, the researchers compare the change in body temperature between the treatment groups.

In this example, there are two treatments, Drug X and placebo. The treatment of interest (Drug X) is called an intervention and the Drug X group is called the intervention group. The placebo group is sometimes called the non-intervention group. This is a one-factor experiment, i.e. there is only one explanatory variable, namely the fever-reducing medication. The one factor has two levels (Drug X 325 mg and placebo). Outline the design.

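Solution:

1. Start with the 1200 subjects.
2. Randomly assign the subjects to two treatment groups: Group 1 (600 subjects, Drug X 325 mg) and Group 2 (600 subjects, placebo).
3. Three hours after the treatments, compare the change in body temperature between the two groups.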
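The random assignment in Example 2.3.3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name completely_randomized, the subject IDs 1 to 1200, and the seed are hypothetical, and the sketch assumes the number of subjects divides evenly among the treatments.

```python
import random

def completely_randomized(subjects, treatments, seed=None):
    """Shuffle the subjects and split them into equal-sized treatment groups.

    Assumes len(subjects) is divisible by len(treatments).
    """
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(treatments)
    return {
        treatment: shuffled[i * size:(i + 1) * size]
        for i, treatment in enumerate(treatments)
    }

# Hypothetical subject IDs 1..1200; the seed only makes the run repeatable.
subjects = list(range(1, 1201))
groups = completely_randomized(subjects, ["Drug X 325 mg", "Placebo"], seed=1)

for treatment, members in groups.items():
    print(treatment, len(members))
```

Shuffling the whole list before splitting gives every subject the same chance of landing in either group, which is the randomization principle at work.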

Quick-Check 2.3.1. --- Experimental Designs

QC 2.3.1.1. An appropriate design for the study is:
(A) a blocked design experiment.
(B) a stratified random sample.
(C) a completely randomized design.
(D) a simple random sample.
(E) none of these.

QC 2.3.1.2. Which one of the following statements about experiments is true?
(A) All experiments must have a control group.
(B) Blocking is employed to reduce variation.
(C) Random assignment is only critical for treatment groups, as opposed to control groups.
(D) Matching can be used in any experiment to eliminate lurking variables.
(E) None of these is true.

QC 2.3.1.3. The 1200 subjects are assigned to blocks based on gender, with 600 in each block. Then subjects within each block are randomly assigned to the two treatment groups (Drug X 325 mg and Placebo). The variable gender is called a blocking variable. Three hours after taking the treatments, the researchers compare the change in body temperature between the treatment groups within each block. Outline the design.

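Answers

QC 2.3.1.1. The answer is A.
QC 2.3.1.2. The answer is B.
QC 2.3.1.3.
1. Start with the 1200 subjects.
2. Divide the subjects into two blocks by gender, with 600 subjects in each block.
3. Within each block, randomly assign the subjects to the two treatment groups (Drug X 325 mg and Placebo).
4. Three hours after the treatments, compare the change in body temperature between the treatment groups within each block.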
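The blocked design in QC 2.3.1.3 differs from a completely randomized design only in that the shuffle happens separately inside each block. The Python sketch below illustrates this under several assumptions that are not in the problem: the function name randomized_block, the block labels, the subject IDs, the seed, and an equal split of each block between the two treatments.

```python
import random

def randomized_block(blocks, treatments, seed=None):
    """Randomly assign subjects to treatments separately within each block.

    Assumes each block's size is divisible by len(treatments), so the
    split within a block is equal (an assumption of this sketch).
    """
    rng = random.Random(seed)
    design = {}
    for name, members in blocks.items():
        shuffled = list(members)
        rng.shuffle(shuffled)
        size = len(shuffled) // len(treatments)
        design[name] = {
            treatment: shuffled[i * size:(i + 1) * size]
            for i, treatment in enumerate(treatments)
        }
    return design

# Hypothetical IDs: one gender block holds subjects 1..600, the other 601..1200.
blocks = {"Block 1": list(range(1, 601)), "Block 2": list(range(601, 1201))}
design = randomized_block(blocks, ["Drug X 325 mg", "Placebo"], seed=1)

for name, groups in design.items():
    for treatment, members in groups.items():
        print(name, treatment, len(members))
```

Because no subject ever leaves its block, the comparison between Drug X and Placebo can be made within each gender, which is what removes gender as a source of variation.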