Lesson Plan. Class Policies. Designed Experiments. Observational Studies. Weighted Averages


0. Class Policies and General Information

Our class website is at www.stat.duke.edu/ banks. My office hours are from 1:00-2:30 p.m. on Sundays, in Old Chem 116.

Your grade will be a weighted average of homework, quiz, exam, and lab grades. There will be no final exam. Homework and labs each count for 10%; quizzes count for 20%. The best exam counts for 20%, the worst for 10%, and the others for 15% each. You may drop the two lowest quiz grades and the lowest homework and lab grades.

Each assignment receives a letter grade. An A+ gives 12 points, an A gives 11, and so forth. The final grade is a weighted average of the components, with cutpoints at 11.5, 10.5, etc.

The first homework assignment has been posted and is due next week.
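The grading scheme can be sketched in a few lines of Python. This is only an illustration: the function name `course_grade` is made up, the component scores are assumed to be averages on the 12-point scale after the allowed drops, and the count of four exams is an inference from the stated weights (10% + 10% + 20% + 20% + 10% leaves 30%, or two exams at 15% each).

```python
def course_grade(homework, labs, quizzes, exams):
    """Weighted course grade on the 12-point scale (A+ = 12, A = 11, ...).

    homework, labs, quizzes: component averages after dropping the lowest
    scores (assumed to be computed beforehand).
    exams: four exam scores; four is an assumption inferred from the weights.
    """
    # Best exam gets 20%, worst gets 10%, the two in between get 15% each.
    ranked = sorted(exams, reverse=True)
    exam_weights = [0.20, 0.15, 0.15, 0.10]
    exam_part = sum(w * score for w, score in zip(exam_weights, ranked))
    return 0.10 * homework + 0.10 * labs + 0.20 * quizzes + exam_part


# Hypothetical student: steady 10s (an A-minus-like score) with uneven exams.
print(round(course_grade(10, 10, 10, [8, 12, 10, 10]), 2))
```

Because the best exam carries twice the weight of the worst, an uneven exam record is pulled toward its stronger scores.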

1. DEs and OSs

We contrast designed experiments (DEs) with observational studies (OSs). The usual purpose of both kinds of research is to draw conclusions about causation. For example: Does smoking cause cancer? Does premarital sex cause higher divorce rates? Does college partying cause low grades? A double-blind, randomized, controlled experiment generally supports more reliable conclusions about causation than an observational study.

1.1 Designed Experiments

The gold standard for a statistical study is the double-blind, randomized, controlled experiment. A study is controlled if one group receives the treatment and another group does not. (In medicine, that group usually gets either a placebo, or standard medical care, or both.) A study is double-blind if neither the subjects nor the scientists know who is assigned to which group until after the data are collected. This prevents subjects in different groups from behaving in different ways, and it prevents scientists from introducing any unconscious bias into the data collection process.

A study is randomized if subjects are assigned to the control group and the treatment group at random. Without randomization, the groups may differ in a systematic way. For example, surgeons used to assign only the healthiest patients to receive an experimental new surgical treatment, since those patients could best withstand the invasive procedure. But the outcomes for those patients are not a reliable forecast of how typical patients would respond.

Historical controls do not give a randomized experiment, which is one reason their use is problematic. The FDA is very reluctant to approve drugs on the basis of trials in which all patients receive the drug, while the control group consists of patients who were treated before the drug was invented. One concern is that the standard of basic care constantly improves, so the drug may appear effective when, in fact, the only difference is that current patients get, say, better nursing care.
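As a minimal sketch of what random assignment means in practice, the following Python shuffles a list of subjects and splits it in half. The function name `randomize` and the subject labels are hypothetical; real trials typically use more elaborate schemes (blocking, stratification) that are not shown here.

```python
import random


def randomize(subjects, seed=None):
    """Split subjects into treatment and control groups by chance alone."""
    rng = random.Random(seed)      # seed only makes the example reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)          # every ordering is equally likely
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical subject IDs; the experimenter has no say in who lands where.
treatment, control = randomize([f"patient_{i}" for i in range(20)], seed=1)
print(len(treatment), len(control))
```

Because chance alone decides group membership, a surgeon's judgment about who is healthiest cannot influence which patients receive the treatment.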

Studies of the portacaval shunt, a treatment for cirrhosis of the liver, are telling. Physicians reported 51 experiments on the procedure in the medical literature (most of these experiments were small, involving only ten or so patients). The following table breaks out the 51 studies according to their design and their conclusions:

                             Degree of Enthusiasm
Design                       High   Moderate   Low
No Control                    24       7        1
Control, Not Randomized       10       3        2
Randomized, Controlled         0       1        3

Enthusiasm for the procedure drops sharply as the study design becomes more rigorous.
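To make the pattern in the table concrete, one can compute the share of highly enthusiastic studies within each design. The sketch below uses only the counts from the table; the variable and function names are illustrative.

```python
# Counts from the table: (high, moderate, low) enthusiasm for each design.
shunt_studies = {
    "No control": (24, 7, 1),
    "Control, not randomized": (10, 3, 2),
    "Randomized, controlled": (0, 1, 3),
}


def high_enthusiasm_share(counts):
    """Fraction of studies of one design that reported high enthusiasm."""
    high, moderate, low = counts
    return high / (high + moderate + low)


for design, counts in shunt_studies.items():
    print(f"{design}: {high_enthusiasm_share(counts):.0%}")
```

The shares come out to 75% for uncontrolled studies, about 67% for controlled but non-randomized studies, and 0% for randomized, controlled studies.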

1.2 Observational Studies

In an observational study, the researcher does not get to determine who receives the treatment. For example, people who smoke get lung cancer at a higher rate than those who do not smoke. Does smoking cause lung cancer? The tobacco lobby used to say no, arguing that: (1) there might be a gene that predisposes people to both enjoy smoking and get cancer; (2) people who like to smoke may tend to follow unhealthy lifestyles (e.g., alcohol use), and that may be the real cause of lung cancer; (3) no randomized, controlled, double-blind experiment (on humans) has shown causation.

Obviously, it would be ethically problematic to do a randomized controlled experiment (one would have to assign 14-year-olds at random to smoke heavily for the rest of their lives). And it would be hard to make this double-blind: people know whether they smoke. But animal studies strongly indicate that smoking causes lung cancer in mammals and birds. The other two arguments from the tobacco lobby carry more weight. The difference in lung cancer rates between smokers and non-smokers may be due to smoking, or it may be due to a confounding factor. In this case, the tobacco lobby suggested two possible confounding factors: genes and lifestyle.

A confounding factor is associated with both the outcome and group membership. For example, one might argue that lung cancer is caused by matches, not tobacco. Similarly, one might argue that cholesterol does not cause heart disease, but rather is a result of poor circulation or breakdown of heart muscle tissue, so it is associated with heart disease but does not cause it. One way to try to handle confounding is to make subgroup comparisons that control for possible confounding effects. For example, one could compare the lung cancer rates for smokers who use matches against smokers who use lighters.

Do seatbelts save lives? Seatbelt studies are usually observational (why?). One compares the fatality rate in accidents in which seatbelts were worn to the fatality rate in accidents without seatbelts. But one has to worry about confounding factors. For example, people who don't wear seatbelts may drive more recklessly, and people who don't wear seatbelts may prefer cars that are not designed with safety in mind. Some researchers try to control for this by comparing the fatality rates among seatbelt wearers and non-wearers in similar cars, or in cars that are thought to have been traveling at the same speed. But this is awkward to do and invites criticism.

In order to control for a confounding factor, one has to guess what it is. But that can be hard, and you are never sure that you have thought of everything. In contrast, with a randomized design, the random assignment of people to the treatment and control groups makes a systematic difference between the groups very unlikely. You are unlikely to get most of the safe drivers in one group and the reckless in the other, or most of the people with good genes for lung cancer in one group and all those with bad genes in the other.

Health experts say that exercise increases one's lifespan. What kinds of data might they have, and what would be the statistical issues regarding the validity of their claim?

1.2.1 Weighted Averages

Subgroup analysis is one way to control for a potential confounding factor. Here one studies each group defined by the confounder separately. Another way to control for a confounder is to use a weighted average.

In the 1960s, the University of California at Berkeley was embarrassed. It was rejecting a larger proportion of women than men, and applicants claimed there was gender bias. But when the Dean asked each department to report its admission rates separately, it turned out that each department accepted a larger proportion of women than men. (The Dean was doing a subgroup analysis without realizing it.)

This apparent reversal of a pattern is sometimes called Simpson's Paradox. It happens when there is a third, confounding variable (major) which affects the other two (admission and gender). The Dean asked Professor Betty Scott to study the problem. She showed that women tended to apply to the majors that were most selective, whereas the men applied to majors that were less selective. So overall, the women had higher rejection rates. To put such comparisons on a fair footing, she calculated the weighted average admission rates for women and men, where the weights are determined by the proportion of people applying to each of the different majors. This controls for the confounding variable.

To see how the weighted average works, we focus on just two majors. Assume Major A accepts 80% of all applicants, but Major B accepts just 10%. Suppose 100 men and 200 women apply. Consider two scenarios:

Scenario 1: Half the men and half the women apply to A, the rest apply to B.
Scenario 2: 90 men apply to A, 10 to B; but 180 women apply to B, 20 to A.

In the first case, major is not a confounding variable. Men and women show the same major preferences. (Note: They do not have to apply in 50-50 ratios; it would still not be a confounder if both genders applied in 25-75 ratios, for example.) In the second case, major is a confounder. Men prefer A, but women prefer B.

In Scenario 1, the raw number of men who are accepted is $0.8 \times 50 + 0.1 \times 50 = 45$, or 45% of the 100 male applicants, and for women the percentage is the same: (80+10)/200 is 45%. In Scenario 2, the raw number of men who are accepted is $0.8 \times 90 + 0.1 \times 10 = 73$, or 73%. And the raw number of women accepted is $0.1 \times 180 + 0.8 \times 20 = 34$, so their acceptance rate is 34/200, or 17%. This looks like gender bias, but it is not: the admission policy is completely gender blind.
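The raw acceptance rates in both scenarios can be checked with a short Python sketch. The acceptance rates and applicant counts come from the two scenarios; the names `ACCEPT_RATE` and `raw_rate` are illustrative.

```python
# Each major accepts the same fraction of applicants regardless of gender.
ACCEPT_RATE = {"A": 0.8, "B": 0.1}


def raw_rate(applicants):
    """Overall acceptance rate for one group, given its applicants per major."""
    accepted = sum(ACCEPT_RATE[major] * n for major, n in applicants.items())
    return accepted / sum(applicants.values())


scenario_1 = {"men": {"A": 50, "B": 50}, "women": {"A": 100, "B": 100}}
scenario_2 = {"men": {"A": 90, "B": 10}, "women": {"A": 20, "B": 180}}

for name, scenario in [("Scenario 1", scenario_1), ("Scenario 2", scenario_2)]:
    for group, applicants in scenario.items():
        print(f"{name}, {group}: {raw_rate(applicants):.0%}")
```

The per-major rates never depend on gender in this code, yet Scenario 2 still yields 73% for men and 17% for women, purely because of where each group applies.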

To make a fair comparison, weight the acceptance rates for men in each major by the fraction of people applying to that major:

$$\frac{90+20}{300} \times \frac{72}{90} + \frac{10+180}{300} \times \frac{1}{10} = 0.357$$

and the weighted average proportion of women accepted is

$$\frac{90+20}{300} \times \frac{16}{20} + \frac{10+180}{300} \times \frac{18}{180} = 0.357$$

The weighted average shows that the acceptance rates for men and women, controlling for major, are equal. The general formula for finding the weighted average correction for the acceptance rate of men is:

$$\text{wtd avg} = \sum_i (\text{prop. of people applying to major } i) \times (\text{acceptance rate for men at major } i)$$
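The weighted average correction can also be sketched in Python, using the Scenario 2 numbers. The function name `weighted_rate` and the dictionary layout are illustrative choices, not part of the original lecture.

```python
def weighted_rate(group_rates, total_applicants):
    """Weighted average acceptance rate for one group.

    group_rates: the group's acceptance rate at each major.
    total_applicants: number of people (both genders) applying to each major.
    """
    total = sum(total_applicants.values())
    return sum(
        (total_applicants[major] / total) * group_rates[major]
        for major in total_applicants
    )


# Scenario 2: 90 men + 20 women apply to A; 10 men + 180 women apply to B.
total_by_major = {"A": 90 + 20, "B": 10 + 180}
men_rates = {"A": 72 / 90, "B": 1 / 10}       # accepted / applied, per major
women_rates = {"A": 16 / 20, "B": 18 / 180}

print(round(weighted_rate(men_rates, total_by_major), 3))
print(round(weighted_rate(women_rates, total_by_major), 3))
```

Both groups are weighted by the same applicant mix, so any remaining gap would reflect differences in the per-major acceptance rates themselves. Here there is none, and both values round to 0.357.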

Here is a case study from a former student who is now working as a financial analyst at a Wall Street firm. The following is an excerpt from his email, used with his permission.

[...] On a less exciting note, I thought I'd share a personal anecdote regarding Simpson's Paradox and how my statistics knowledge allowed me to spot a major issue with the strategy of one of our clients. I was working with a cable company reviewing their product strategy, and was looking at one product in particular. The client had observed that customers with a certain product seemed to churn at a rate significantly lower than customers without the product and as a result was planning on pursuing a marketing and pricing strategy to try and get this product into as many homes as possible.

I immediately identified confounding as customers who tended to take the product were those that naturally churned at a lower rate. To prove this was an example of Simpson's paradox, I did a comparison of all customer segments and looked at the churn rates for identical cohorts with the product vs. cohorts without the product. The analysis showed that for EVERY segment, customers with the product churned MORE than the same segment without it, but that when looking at the population as a whole, the distribution of customers with the product across favorable churn segments made it look like the product reduced churn. Voila, Simpson's Paradox! To give you a perspective as to the order of magnitude of this little deceptive statistic, we projected that the client may have lost up to $400M by pursuing their original strategy.