STATISTICS: A METHOD TO GET INSIGHT INTO VARIATION IN A POPULATION


Statistics is a method for gaining insight into variation in a population. If every unit in the population had the same value (say, everyone had the same income or the same blood pressure), there would be no need for statistics.

Statistics draws conclusions about a population, not about individual units in the population. Examples: 40% of the student population wear glasses (this says nothing about individual students). The U.S. unemployment rate is 7% (no mention of who is unemployed). In 70% of cases Lipitor lowers cholesterol.

Suppose I want to know the percentage of students at MSU who use Microsoft Windows, Apple iOS, Linux, or others.

Method 1: Query ALL students about the software. This gives information on the entire population. Collecting data on an entire population is called a CENSUS. A census is generally infeasible in large populations.

Method 2: Draw a sample of students from the entire population, collect data on the sample, and use it to draw conclusions about the population.

Statistical inference: We need to draw a representative sample, but we cannot judge whether a sample is representative of the population (we have no information on the population). Instead, draw a random sample and calculate the probability that the sample is (approximately) representative.

A representative sample exhibits characteristics typical of those possessed by the population of interest. A simple random sample of n experimental units is a sample selected from the population in such a way that every different sample of size n has an equal chance of selection.
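The definition of a simple random sample can be illustrated with a short sketch. The population of student ID numbers, the sample size, and the helper name `simple_random_sample` are all hypothetical choices made for illustration; Python's standard `random.sample` draws without replacement, which is one way to give every subset of size n the same chance of selection.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement; every size-n subset is equally likely."""
    rng = random.Random(seed)  # a seed is used only to make the sketch repeatable
    return rng.sample(population, n)


# Hypothetical population: student ID numbers 1 through 1000.
student_ids = list(range(1, 1001))

# Hypothetical sample size of 50 students.
sample = simple_random_sample(student_ids, 50, seed=1)
print(len(sample), "students drawn;", len(set(sample)), "distinct")
```

In practice the hard part is not the draw itself but obtaining a complete list of the population to draw from.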

Issues in sampling: logistics (non-statistical); ensuring that the sample is drawn randomly from the population of interest (if it is not, this leads to bias); other sampling errors.

How a sample is selected from a population is of vital importance in statistical inference because the probability of an observed sample will be used to infer the characteristics of the sampled population.

Statistics is used in various situations: to infer some feature of a population based on a sample; to test the effect of a treatment; to estimate the effect of a treatment; to classify an object.

There is a whole theory of sampling and design of experiments: simple random sampling; designed trials to control variability; designed trials to avoid bias.

Case study: sampling from a population (digest1.pdf). The Literary Digest poll, 1936: Roosevelt versus Landon. The campaign centered on economic policies.

Literary Digest prediction: Landon 57%, FDR 43%. Used a sample of 2.4 million people. Actual result: FDR 60.8%.

The Literary Digest's method for choosing its sample was as follows: Based on every telephone directory in the United States, lists of magazine subscribers, rosters of clubs and associations, and other sources, a mailing list of about 10 million names was created. Every name on this list was mailed a mock ballot and asked to return the marked ballot to the magazine.

There were two basic causes of the Literary Digest's downfall: selection bias and nonresponse bias.

Selection bias: Samples were selected from telephone directories, club membership lists, etc. These sources were biased toward upper-class voters and excluded lower-income voters. The Literary Digest mailing list was far from being a representative cross-section of the population.

Nonresponse: Out of 10 million on the mailing list, only about 2.4 million responded to the survey. People who respond to surveys are different from people who don't, not only in the obvious way (their attitude toward surveys) but also in more subtle and significant ways. Nonresponse is difficult to handle.

Moral: A badly chosen big sample is much worse than a well-chosen small sample. Watch out for selection bias and nonresponse bias.
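The moral can be checked with a small simulation. Everything in this sketch is made up for illustration: the population of 100,000 voters, the 30% who appear on a mailing list, and the support levels inside and outside the list. It models selection bias only, not nonresponse, and it is not a reconstruction of the 1936 data.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100,000 voters, each stored as (on_list, supports_fdr).
# Assumed: 30% are on the mailing list and only 40% of them support FDR,
# while overall support is exactly 60%.
on_list = [(True, True)] * 12_000 + [(True, False)] * 18_000
off_list = [(False, True)] * 48_000 + [(False, False)] * 22_000
population = on_list + off_list


def share_for_fdr(voters):
    """Fraction of the given voters who support FDR."""
    return sum(1 for _, supports in voters if supports) / len(voters)


true_share = share_for_fdr(population)

# Badly chosen big sample: 20,000 voters, all taken from the mailing list.
biased_estimate = share_for_fdr(rng.sample(on_list, 20_000))

# Well-chosen small sample: 1,000 voters drawn at random from everyone.
random_estimate = share_for_fdr(rng.sample(population, 1_000))

print(f"true {true_share:.3f}  big biased {biased_estimate:.3f}  "
      f"small random {random_estimate:.3f}")
```

Under these assumptions the large list-based sample settles near the list's own support level rather than the population's, and taking more names from the same list does not help.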

Quiz 01 Not graded

Testing for the effect of a treatment: We want to study the effect of one variable on another, for example the effect of fertilizer on yield, or of an ad campaign. Ensure that whatever effect on yield is observed is attributable only to the fertilizer: conduct the trial on similar plots, in a similar environment, etc.

Case study: Twins. Studies of twins reared apart are one of the most powerful tools that scholars have to analyze the relative contributions of heredity and environment to the makeup of individual human natures. Such a study might not set to rest the quarrel over the relative importance of nature versus nurture, but few other experiments would be more pertinent.

Polio Vaccine Trial: The 1954 Salk polio vaccine trials were the biggest public health experiment ever. Polio epidemics hit the U.S. in the 20th century and struck hardest at children. Polio was responsible for 6% of deaths among 5- to 9-year-olds.

Polio Vaccine Trial: Polio is rare but the virus itself is common. Children from higher-income families were more vulnerable to polio! Children in less hygienic surroundings contract mild polio early in childhood while still protected by their mothers' antibodies. They develop immunity early. Children from more hygienic surroundings don't develop such antibodies.

Case study: Polio vaccine. The polio rate of occurrence is about 50 per 100,000. Suppose the vaccine was 50% effective and 10,000 subjects were recruited for each of the control and treatment groups. You would expect 5 polio cases in the control group and 2-3 in the treatment group. Such a difference could be attributed to random variation, so clinical trials were needed on a massive scale.
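The arithmetic behind the expected counts can be written out as a short sketch. The rate, group size, and effectiveness come from the slide above. The use of a Poisson approximation (standard deviation equal to the square root of the mean) to gauge chance variation is an assumption added here, not something the slide states.

```python
import math

rate = 50 / 100_000    # polio cases per person, from the slide
group_size = 10_000    # subjects in each group, from the slide
effectiveness = 0.5    # supposed vaccine effectiveness, from the slide

expected_control = rate * group_size
expected_treatment = expected_control * (1 - effectiveness)

# Assumption: counts of a rare event are roughly Poisson, so sd = sqrt(mean).
sd_control = math.sqrt(expected_control)
sd_treatment = math.sqrt(expected_treatment)

print(f"control:   about {expected_control:.1f} cases, sd about {sd_control:.1f}")
print(f"treatment: about {expected_treatment:.1f} cases, sd about {sd_treatment:.1f}")
```

With expected counts this small, the chance spread in each group is of the same order as the gap between the groups, which is why a trial of 10,000 per group would be too small to be convincing.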

Case study: Polio vaccine. Why not just distribute the vaccine to some children and see if it lowered the polio rate? A yearly drop might mean the vaccine was effective, or that that year was not an epidemic year.
[Figure: Number of polio cases in the U.S., 1930 to 1955, by year]

The NFIP study: Vaccinate all children in grade 2 whose parents give consent. Leave grades 1 and 3 unvaccinated. Compare the incidence of polio in grade 2 to that in the other grades. Vaccinated group (grade 2): treatment group. Grades 1 and 3: control group.

The observed control experiment suffers from selection bias and diagnostic bias.

Selection bias: In terms of selection, the treatment and control groups differ with respect to at least two variables: age and parental consent for vaccination. Parents who consented to vaccination were on average better educated and more affluent, and lived in more sanitary conditions. The critical issue is whether these variables are related to the variable of interest, namely contracting polio.

Diagnostic bias: In terms of diagnosis, the problem is that mild cases of polio resemble influenza and other common diseases. Doctors, who generally believe in the value of vaccines, would tend to diagnose polio slightly more often for the unvaccinated children in the control group than for the vaccinated children in the treatment group.

Randomized study: Of the children whose parents give consent, randomly allocate half to a treatment group and the other half to a control group. The treatment group gets the vaccine. The control group gets a placebo.

Double blinding: Neither children nor parents know whether the child has received the vaccine or the placebo. Doctors also don't know.
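The random allocation step can be sketched as follows. The list of consenting children and the helper name `randomize` are hypothetical. Shuffling and splitting in half is one common way to allocate at random and is not claimed to be the procedure used in the 1954 trial.

```python
import random


def randomize(subjects, seed=None):
    """Shuffle the subjects and split them into two halves: (treatment, control)."""
    rng = random.Random(seed)  # a seed is used only to make the sketch repeatable
    shuffled = list(subjects)  # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical roster of 20 children whose parents gave consent.
consented = [f"child_{i}" for i in range(1, 21)]

treatment, control = randomize(consented, seed=42)
print("treatment (vaccine):", treatment)
print("control (placebo):  ", control)
```

Because chance alone decides who lands in which group, both groups consist of consenting families and should be similar, on average, in age, income, and hygiene.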

Field trial data: polio cases

Experiment                    Study group               Population   Paralytic   Non-paralytic   False reports
Randomized (placebo control)  Vaccinated                   200,745          33              24              25
                              Placebo                      201,229         115              27              20
                              Not inoculated               338,778         121              36              25
                              Incomplete vaccinations        8,484           1               1               0
Observed control              Vaccinated                   221,998          38              18              20
                              Controls                     725,173         330              61              48
                              Grade 2 not inoculated       123,605          43              11              12
                              Incomplete vaccinations        9,904           4               0               0
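Because the groups differ in size, raw case counts are easier to compare as rates. The sketch below converts the paralytic case counts from the field trial data above into cases per 100,000 children. The helper name `rate_per_100k`, the dictionary layout, and the label for the first experiment are choices made here for illustration.

```python
def rate_per_100k(cases, population):
    """Cases per 100,000 children in the group."""
    return cases / population * 100_000


# (paralytic cases, population) taken from the field trial data above.
groups = {
    ("Randomized", "Vaccinated"): (33, 200_745),
    ("Randomized", "Placebo"): (115, 201_229),
    ("Observed control", "Vaccinated"): (38, 221_998),
    ("Observed control", "Controls"): (330, 725_173),
}

rates = {key: rate_per_100k(cases, pop) for key, (cases, pop) in groups.items()}

for (experiment, group), rate in rates.items():
    print(f"{experiment:17s} {group:11s} {rate:5.1f} paralytic cases per 100,000")
```

In both designs the vaccinated group has the lower paralytic rate, but only the randomized comparison is free of the selection and diagnostic biases described earlier.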

Statistics is used in various situations: to infer some feature of a population based on a sample; to test the effect of a treatment (with three or more treatments: analysis of variance); to estimate the effect of a treatment (regression problems); to classify an object (classification or pattern recognition problems).

Dewey vs. Truman (1948): The Gallup, Roper, and Crossley polls all predicted that Dewey would defeat Truman by a significant margin. The Crossley, Gallup, and Roper organizations all used quota sampling. Each interviewer was assigned a specified number of subjects to interview. Moreover, the interviewer was required to interview specified numbers of subjects in various categories, based on residential area, sex, age, race, economic status, and other variables.

Dewey vs. Truman (1948): poll predictions and election results, in percent (slides/chapter 1/1948 election2.pdf)

Candidate   Crossley   Gallup   Roper   Election results
Truman            45       44      38                 50
Dewey             50       50      53                 45
Others             5        6       9                  5

Dewey vs. Truman (1948): In quota sampling, the subjects are hand-picked to resemble the population with respect to some key characteristics. Quota sampling SEEMS reasonable because it ensures that the sample will resemble the population with respect to some of the important characteristics related to voting behavior. BUT: quota sampling does not work very well due to unintentional bias on the part of the interviewers. Probability methods use objective chance procedures to select samples. They guard against bias because they leave no discretion to the interviewer.

Quiz 01 (not graded)