OCW Epidemiology and Biostatistics, 2010 David Tybor, MS, MPH and Kenneth Chui, PhD Tufts University School of Medicine October 27, 2010


SAMPLING AND CONFIDENCE INTERVALS

Learning objectives for this session:
1) Understand how a histogram can be read as a probability distribution
2) Understand the importance of random sampling in Statistics
3) Understand how sample means can have distributions
4) Explain the behavior (distribution) of sample means and the Central Limit Theorem
5) Know how to interpret confidence intervals as seen in the medical literature
6) Know how to calculate a confidence interval for a mean

Outside preparation: Pagano & Gauvreau: Chapter 7, pp ; Chapters 8 and 9

In the previous section we introduced the normal distribution and how to reference individual values in a distribution when given the mean and standard deviation. To see ways in which this information is applied, do a quick internet search on the definition of osteoporosis and osteopenia, or the difference between stunting and wasting in terms of malnutrition, or the development of the CDC's definitions of childhood overweight and obesity. In this section we will move beyond data from individuals and discuss the behavior of data summarizing groups, such as the behavior of means.

HISTOGRAMS AS PROBABILITY DISTRIBUTIONS

What do we mean by the behavior of numbers? It is important to realize that a histogram shows not only the distribution of values in a collection of people (the proportion of people between specified limits) but also the probability that an individual selected at random will lie between the specified limits. Recall the normal distribution. For example, we can ask: what is the probability that a randomly selected MCAT test-taker will have a score greater than 12? Well, if we know that the mean is 8 and the standard deviation is 2, then a score of 12 lies two standard deviations above the mean, so only about 2% of people have a score greater than 12. This is equivalent to saying that if we randomly choose someone from this distribution, there is only a 1 in 50 chance (2%) that the randomly selected person will have a score that high. It is crucial to understand this concept: a histogram is also a type of probability plot. If you couple this with the special features of a normal distribution, you have a powerful tool for explaining the probability of observing certain events.

SAMPLING

Sampling is basically the process of taking a random selection of individuals from a population. The population is everybody, and we use parameters to describe population characteristics. Greek letters are used when talking about population parameters (the mean is indicated with μ, the standard deviation with σ, etc.). A sample is a subset of people chosen from the population, and the term statistic (with Roman symbols) is used when summarizing their characteristics.

Suppose we wanted to get a good estimate of some value in the population, such as the average Calories of fat consumed daily by women in the US. It would be very difficult (and somewhat pointless) to query every single member of the population (a process called a census). Instead, we can take a good sample, calculate the mean Calories from fat, and then couch that mean with some sort of numerical acknowledgement of the fact that a different sample would have given rise to a slightly different mean. This process is part of inferential Statistics (notice the capital S) and will be covered later when we discuss Confidence Intervals.

What do we mean by a good sample? You could take an entire course on sampling design, learning about simple random samples, stratified sampling, convenience samples, etc. For now, let's assume that we are working with a simple random sample, in which every member of the population has an equal probability of being selected in the sample.

Mmmmm. Pepperoni

Imagine that your local pizza place claims that their delivery times are normally distributed, with a mean delivery time of 30 minutes and a standard deviation of 10 minutes.
You order up a pizza and it takes 42 minutes for the pie to arrive at your house. Do you have reason to disbelieve the chain's claim?

In a sense, if the company is telling the truth, when you call to order a pizza the subsequent delivery time is simply a random value chosen from a normal distribution with a mean of 30 and a standard deviation of 10 (this is known as a random variable [1], in that you don't know how long a given pizza delivery is going to take until you actually call up and order that pizza!). Sometimes it might take 25 minutes to get there, sometimes 35, and sometimes 40, but most of the time it will be around the mean of 30 minutes. This time, your particular zesty pie took 42 minutes, which is equivalent to a z-score of (42 - 30) / 10 = 1.2. Deliveries this slow happen about 11% of the time (you can look at the histogram to estimate the percentile, or look at a z-table to get the exact value). Now, that's not very rare. If you tried to go on Judge Judy and make the claim that the company committed false advertising, you wouldn't have much of a case. After all, if the mean is 30 and the standard deviation is 10, you would expect that some pizzas take that long to get there. It's the definition of a distribution.

[1] A random variable can be defined as a potential quantity whose values are determined by a chance-governed mechanism. In the real world, obviously, pizza delivery time wouldn't exactly meet this definition, but let's pretend.

So far this is all a review of last class. But let's add a twist. Imagine that you are skeptical of the chain's claim, and you convince 20 of your random friends in 20 random locations to randomly call and order pizzas. You then calculate the average delivery time of these 20 pizzas, and it is 42 minutes. Do you now have reason to doubt the chain's claim?

Now just stop and think about this. It feels different, doesn't it? When it was just you, ordering just one pizza, 42 minutes didn't seem crazy; you could probably just chalk it up to bad luck. But now you have TWENTY FLIPPING PIZZAS, and the average delivery time of those 20 pizzas is 42 minutes. If the company really delivered pizzas in about a half hour, wouldn't you expect the average of your 20 pizzas to be closer to that half hour? Like maybe 35 minutes? Or maybe 32 minutes? Or maybe even 30.5 minutes? What would a judge say with this evidence? Where do we draw the cut-off? How do we determine what is weird or rare enough to bring forth as evidence that the chain's claim is false?

It doesn't make sense to use the same histogram as before, with its mean of 30, standard deviation of 10, and z-score of 1.2. We need a distribution on which to place our observed mean of 42, and your gut probably says (correctly) that it should be a distribution with a relatively tighter spread. That appropriate distribution is known as the distribution of sample means.

THE DISTRIBUTION OF SAMPLE MEANS

Many students struggle with the concept of a distribution of means. How can means have a distribution? "Dr. Tybor, isn't a mean simply a statistic, a number that summarizes my data?" You have to resist the temptation of thinking about a sample mean as a specific number. Yes, it is true that in your sample, your friends observed a mean pizza delivery time of 42 minutes. But if you had randomly chosen different friends, or if your friends had been in different random locations, or if your friends had called at different times, you would have, by definition, taken a different sample. In each different sample that you could possibly have taken, the observed mean could be higher or lower than 42.
So the value that you observe (the sample mean) depends on the random friends that you chose and the random locations where they are; this is a chance-governed mechanism that results in the observed value. In other words, it is a random variable. In a sense, when you collect data, you are randomly choosing one random sample mean from the universe of all possible sample means. We can represent this on a histogram. But what does that histogram look like?

The distribution of sample means is driven by something called the Central Limit Theorem. It is central because almost all statistical techniques are built upon it. (If the CLT were not true, then you would be at Clery's right now instead of in the library reading these notes.) The Central Limit Theorem says that if you theoretically take many random samples from a population, calculate the mean of each sample, and plot those means, the resulting distribution has the following characteristics:

1. The sample means will be approximately normally distributed, provided the samples are reasonably large. This is true even if the original population characteristic does not follow a normal distribution.
2. The mean of the sample means (x-bar) is equal to the population mean.
3. The standard deviation of the distribution of sample means depends on both the size of the sample and the standard deviation of the population distribution. This spread of sample means is called the Standard Error of the Mean (SEM).
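
A quick simulation can make these three characteristics concrete. The sketch below is illustrative only: the exponential population with mean 30, the sample size of 25, and the 5,000 repetitions are assumptions chosen for the demonstration, not values taken from these notes. It draws repeated samples from a clearly non-normal population and checks that the sample means center on the population mean, with a spread close to (population SD) / (square root of sample size).

```python
import random
import statistics

def simulate_sample_means(n, repetitions, population_mean=30.0, seed=1):
    """Draw `repetitions` samples of size n from a skewed (exponential)
    population and return the list of sample means."""
    rng = random.Random(seed)
    rate = 1.0 / population_mean  # exponential: mean = SD = 1 / rate
    return [
        statistics.fmean(rng.expovariate(rate) for _ in range(n))
        for _ in range(repetitions)
    ]

n = 25  # hypothetical sample size
means = simulate_sample_means(n=n, repetitions=5000)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
predicted_sem = 30.0 / n ** 0.5  # the population SD is 30 for this population

print(f"mean of sample means: {mean_of_means:.2f} (population mean 30)")
print(f"SD of sample means:   {sd_of_means:.2f} (predicted SEM {predicted_sem:.2f})")
```

Plotting a histogram of `means` should show a roughly bell-shaped curve even though individual draws from this population are strongly right-skewed.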

SEM = (standard deviation) / (square root of sample size)

That should give you a mental image of a bell curve of sample means, centered on the population mean and narrower than the population distribution. Let's look at each of these points in turn.

CLT says the distribution of sample means is Gaussian

This is nice because we know that normal distributions have special features that make it easy to reference values somewhere in the distribution. We can easily calculate z-scores or percentiles to determine how rare an observed outcome is. The importance of this will be evident in later topics.

CLT says the mean of the sample means is equal to the population mean

This should intuitively make sense. If the average delivery time of a pizza is 30 minutes, and you take a sample of a large number of pizzas, your best guess of the mean of that sample will be 30 minutes. It would be surprising if the mean of the sample means were not equal to the population mean.

CLT says the spread of sample means can be described with the SEM

The Standard Error of the Mean, which is basically the standard deviation of the distribution of sample means, depends on two factors. First is the underlying spread of the population itself. It makes sense that if there was relatively little variability in the delivery time of pizzas (say, for example, that all pizza deliveries took between 29 and 31 minutes), then the spread of the mean delivery times would also be small. It also makes sense that the spread of the distribution is contingent on the size of the sample: the larger the sample you take, the better its mean will estimate the true population mean. These two common-sense observations can be seen in the calculation of the SEM.

SEM = (SD of population) / (square root of sample size)

Should I sue the pizza chain or not?

We are now equipped with the tools to evaluate the veracity of the chain's claim. They claim that pizza delivery times are normally distributed with μ = 30 and σ = 10. Suppose that the sample contains 25 pizzas (n = 25, a convenient number for the arithmetic) and that we observed a sample mean of 42 (x-bar = 42). How weird is this outcome if we assume that the chain is telling the truth? The SEM describes the spread of all possible sample means (of n = 25) that would be observed if the chain's claim were correct. We can calculate the SEM and visualize the distribution of sample means.

SEM = 10 / sqrt(25) = 2

Notice where 42 falls on this distribution. It is (42 - 30) / 2 = 6 SEMs above the mean. Think about this histogram from the perspective of probability: if the chain's claim were correct, it would be very, very unlikely that a random sample of 25 pizzas would result in a mean delivery time of 42 minutes. With this evidence, it would be reasonable to conclude that the pizza chain is not telling the truth.

In just a few short pages, we have shown how the special features of the normal distribution and the Central Limit Theorem can be used to make powerful statements about the probability of observing certain outcomes. Now let's build upon this knowledge.

Confidence Intervals

One of the major goals of statistics is to make inferences about a population based on a sample. Suppose, for example, that a study found that in a sample of diabetic adolescents, the mean resting blood glucose level was 135 mg/dl. The number 135 mg/dl is a point estimate, a statistic summarizing our sample. (In this example, the point estimate is a mean, but the point estimate could be any statistic: an odds ratio, risk ratio, proportion, etc.) The point estimate is our best estimate of what the population parameter might be. However, it would be silly to conclude that the true population average glucose level in all diabetic adolescents is equal to exactly 135 mg/dl. It seems reasonable that the true population parameter is probably a little bit higher or lower, and that the point estimate is just an approximation.

To express how closely the point estimate approximates the population parameter of interest, we use a confidence interval. A confidence interval is a range of values that we expect to include or contain the true population parameter we wish to estimate. The sample mean, for example, is the best single estimate of the population parameter, whereas the confidence interval is the range that is most likely to include the population parameter.

In the blood glucose example above, a potential confidence interval for the mean might consist of all the numbers between 130 and 140 mg/dl. This confidence interval would typically be written as (130, 140). In this case, the endpoints, or lower and upper bounds, of the confidence interval are 130 and 140, respectively. If the true, unknown population mean μ is 133 mg/dl, then we would say that the true mean is included in the confidence interval. If the unknown population mean is 142 mg/dl, however, then μ is not included or contained in the confidence interval.

In statistics, we can never be completely certain that the true population parameter falls within the confidence interval. Only God knows, and She's not telling. Therefore, we establish confidence levels, which provide the statistical level of certainty that the true population parameter is contained in the confidence interval. Typically, the confidence level chosen in biomedical research is 95%; occasionally 90% or 99% confidence intervals are used.
A 95% confidence interval, for example, is constructed in such a way that the intervals computed from 95% of all possible samples from the population include the true population parameter of interest. Thinking in terms of probability, this is equivalent to saying that there is a 95% chance that the confidence interval from my study (computed from my random sample) includes the population parameter.

Picture the theoretical situation in which we take 20 different samples from a population (as if we had enough money to repeat our study 20 different times!). Each horizontal bar represents the sample mean (x-bar) and confidence bounds for one of the twenty samples taken. Note that in 19 of the 20 samples, the population mean is contained in the confidence interval, and in one sample, the population mean is not.

[Figure: Illustration of the sampling distribution, centered on μ, and 20 confidence intervals extracted from different samples]

Interpretation of Confidence Intervals
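
The repeated-sampling idea behind a 95% confidence interval can be mimicked with a short simulation. This is a sketch under stated assumptions: the population values (μ = 135, σ = 20), the sample size of 40, and the 10,000 repeated "studies" are hypothetical, and σ is treated as known. Each study draws a fresh sample, builds its interval, and records whether the interval captures μ.

```python
import random
import statistics

def ci_covers(mu, sigma, n, rng, z=1.96):
    """Draw one sample of size n from Normal(mu, sigma) and report whether
    the interval x-bar +/- z * sigma / sqrt(n) contains mu."""
    sample_mean = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
    margin = z * sigma / n ** 0.5
    return sample_mean - margin <= mu <= sample_mean + margin

rng = random.Random(2010)
mu, sigma, n = 135.0, 20.0, 40  # hypothetical population and sample size
studies = 10_000                # number of simulated repeat studies

hits = sum(ci_covers(mu, sigma, n, rng) for _ in range(studies))
coverage = hits / studies
print(f"{coverage:.1%} of the {studies} intervals contain mu")
```

The printed proportion should land close to 95%, which is the long-run sense in which any single interval deserves "95% confidence".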

Assume that from our sample, we calculated a 95% confidence interval of (120, 160). This means that we are 95% confident that the true population mean glucose falls between 120 and 160 mg/dl.

Remember that this process can also be done for statistics other than means. For example, another study might find that the odds ratio for the risk factor of cigar smoking on the outcome of emphysema is 1.20, with a 95% confidence interval of 1.07 to 1.34. This means that we are 95% confident that the true population odds ratio for cigar smoking and emphysema falls between 1.07 and 1.34. The best estimate is the point estimate of 1.20. In other words, we are 95% confident that cigar smokers have between 7% and 34% greater odds of emphysema than non-cigar smokers.

The general formula for a confidence interval is

Point Estimate ± Multiplier * Standard Error

To calculate the confidence interval for a mean, use this formula:

$\bar{X} \pm Z \, \sigma / \sqrt{n}$

where Z is the two-tailed z-score for the chosen confidence level, σ is the standard deviation, and n is the sample size. Please note that the confidence interval is calculated for μ, not for $\bar{X}$. For 95% confidence intervals, the formula is $\bar{X} \pm 1.96 \, \sigma / \sqrt{n}$. The z-scores corresponding to 90% and 99% confidence limits are 1.65 and 2.58 (or, approximately, 3), respectively.

Thus, there are three scenarios that can make a confidence interval wider:

1. Increasing the confidence level; say, from 95% to 99%.
2. Increasing the standard deviation.
3. Decreasing sample size.

In this way, confidence intervals are related to the power of the study to detect an effect or difference. In general, the larger the sample size in a study, the greater the power the study has to detect an effect. The larger the sample size, the narrower the confidence interval will be, given the same standard deviation and confidence level. We also say that studies with large sample sizes have high power to detect effects or differences. A more formal discussion of how power is used in hypothesis testing is provided in the next lecture.
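
The formula for a mean can be turned into a few lines of code. This is a minimal sketch, assuming σ is known; the function name `mean_confidence_interval` and the study values (mean 135 mg/dl, σ = 25 mg/dl, n = 100) are hypothetical choices for illustration. The multiplier is looked up from the standard normal distribution rather than typed in by hand.

```python
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) bounds for the population mean,
    assuming the population standard deviation sigma is known."""
    # Two-tailed multiplier: about 1.645 for 90%, 1.96 for 95%, 2.576 for 99%
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sigma / n ** 0.5  # multiplier * standard error
    return sample_mean - margin, sample_mean + margin

# Hypothetical study: mean glucose 135 mg/dl, sigma 25 mg/dl, n = 100
for level in (0.90, 0.95, 0.99):
    lower, upper = mean_confidence_interval(135, 25, 100, confidence=level)
    print(f"{level:.0%} CI: ({lower:.1f}, {upper:.1f})")
```

Rerunning the loop with a larger σ or a smaller n widens every interval, matching the three scenarios listed above.
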
There are certain assumptions required for different types of confidence intervals. Generally speaking, confidence intervals for a mean require the assumption that the distribution is approximately normal, or Gaussian. This assumption is not valid for all measures, however, such as odds ratios, proportions, and others.

In the above example of calculating a confidence interval for a mean, the confidence interval is symmetric. That is, the difference between the mean and the lower boundary of the confidence interval is exactly equal to the difference between the upper boundary of the confidence interval and the mean. Confidence intervals are symmetric in many, but not all, instances. You'll find that confidence intervals are symmetric for certain measures, such as means, differences in means, and related measures where the data are assumed to be derived from a symmetric distribution, such as the normal distribution. However, in the case of odds ratios and risk ratios (relative risks), where the confidence intervals are calculated on the log scale, the arithmetic difference between the upper boundary of the interval and the point estimate will tend to be larger than the difference between the point estimate and the lower boundary of the confidence interval.

Thanks to Dr. Steve Cohen for his contributions to these notes.


More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 5, 6, 7, 8, 9 10 & 11)

More information

The Fallacy of Taking Random Supplements

The Fallacy of Taking Random Supplements The Fallacy of Taking Random Supplements Healthview interview with Dr. Paul Eck Healthview: We can see from our conversations that you are totally against people taking random supplements even if people

More information

Descriptive Statistics Lecture

Descriptive Statistics Lecture Definitions: Lecture Psychology 280 Orange Coast College 2/1/2006 Statistics have been defined as a collection of methods for planning experiments, obtaining data, and then analyzing, interpreting and

More information

Student Performance Q&A:

Student Performance Q&A: Student Performance Q&A: 2009 AP Statistics Free-Response Questions The following comments on the 2009 free-response questions for AP Statistics were written by the Chief Reader, Christine Franklin of

More information

Statisticians deal with groups of numbers. They often find it helpful to use

Statisticians deal with groups of numbers. They often find it helpful to use Chapter 4 Finding Your Center In This Chapter Working within your means Meeting conditions The median is the message Getting into the mode Statisticians deal with groups of numbers. They often find it

More information
