Basic Statistics for Comparing the Centers of Continuous Data From Two Groups


STATS CONSULTANT

Matt Hall, PhD, Troy Richardson, PhD

Comparing continuous data across groups is paramount in research and hospital operations. Naively assuming that any observed difference between the two groups implies that the groups are truly different ignores a few important facts: (1) natural variation occurs in almost every process; and (2) we need to quantify our confidence that the differences are real and not just due to chance. In the present article, we describe statistical procedures used to perform a comparison across 2 groups.

At some point in your career, you will likely want to compare data across ≥2 groups. In fact, this function is paramount to the research you read in this journal (eg, as shown in the first table given in almost any article) and to the day-to-day operations within hospitals. You may, for example, be asked to compare factors such as length of stay, mortality rates, productivity, or utilization of a specific drug between your hospital and another hospital. You may also be asked to compare some measure within your own institution at different time points, perhaps before and after a specific intervention.

Naively assuming that any difference in outcome implies that the groups are truly different is risky. Doing so ignores the same two facts: natural variation occurs in almost every process, and we need to quantify our confidence that the differences are real and not just due to chance. In comes statistics.

In this article, we focus our attention on comparing the center of continuous data across 2 groups, but the ideas are generalizable to >2 groups. We outline here a series of important steps that you can take to identify the type of test you need and how to interpret the results.
Throughout these steps, we consider a sample data set to compare the birth weight and charges for 187 female children and 227 male children with a principal diagnosis of septicemia of the newborn (International Classification of Diseases, Ninth Revision, code ) from the 2009 Kids Inpatient Database (KID; Healthcare Cost and Utilization Project, Agency for Healthcare Research and Quality) [1]. KID contains an unweighted sample of 3.4 million hospital discharges for children ages 0 to 20 years from all community, nonrehabilitation hospitals in 44 states, regardless of payer. Data quality and reliability are jointly assured through the Healthcare Cost and Utilization Project and participating states and health care institutions. Discharges in which the birth weight was unavailable were excluded.

Because we are comparing the birth weights of female and male subjects in this population, the underlying hypothesis that we are testing is the null hypothesis of no difference ($H_0: \mu_{\text{Females}} = \mu_{\text{Males}}$) versus any difference ($H_1: \mu_{\text{Females}} \neq \mu_{\text{Males}}$). The goal is to determine if we have enough evidence in the data to reject the null hypothesis in favor of the alternative.

DOI: /hpeds
Copyright 2016 by the American Academy of Pediatrics
Address correspondence to Matt Hall, 6803 W. 64th Street, Overland Park, KS. matt.hall@childrenshospitals.org
HOSPITAL PEDIATRICS (ISSN Numbers: Print, ; Online, ).
FINANCIAL DISCLOSURE: The authors have indicated they have no financial relationships relevant to this article to disclose.
FUNDING: No external funding.
POTENTIAL CONFLICT OF INTEREST: The authors have indicated they have no potential conflicts of interest to disclose.
Send questions, comments, or ideas for a future section to us at: matt.hall@childrenshospitals.org.
Children's Hospital Association, Overland Park, Kansas

STEP 1: KNOW WHAT KIND OF DATA YOU HAVE

The first step in determining if observed differences in outcomes are real or due to random chance is to know what kind of data you have. Although there are several different types of data (and more sophisticated definitions), our focus is on 2 of the most common types in the hospital setting: continuous and categorical. Broadly speaking, continuous data are data that can typically take on any numerical value within a range. Some examples include cost, age, and hours worked per day. These data are, in many ways, the more challenging type to analyze and the focus of the present article. In our next article, we will focus on categorical data, which are data that can be put into nonoverlapping categories or groups. Examples of categorical data include gender, payer, disposition, and receipt of a specific drug.

FIGURE 1 Distribution of (A) birth weight and (B) total charges from the KID 2009 database for 414 children with a principal diagnosis of septicemia of the newborn.

HOSPITAL PEDIATRICS Volume 6, Issue 1, January

STEP 2: HOW MANY GROUPS ARE YOU COMPARING?

If you are doing a comparison, you clearly have at least 2 groups. Two groups may be all you ever need to compare: pre versus post, drug A versus drug B, or us versus them. However, there may be times when you have >2 groups. Suppose, for example, that you want to compare the average length of stay for patients with asthma across 3 different attending physicians at your hospital. In this situation, you would need a different type of statistical test to determine if there are differences across the 3 groups; this form of analysis is beyond the scope of the present article, however.

STEP 3: IF THE DATA ARE CONTINUOUS, KNOW HOW YOUR DATA LOOK

All statistical tests have underlying assumptions that must hold for the tests to be valid. One assumption for continuous data that is often overlooked concerns how the data are distributed: normal (ie, bell shaped) or nonnormal. Although there are tests to determine if your data are normal (eg, the Shapiro-Wilk test [2]), one method is just to eyeball the data from a histogram. If you have a statistical program, you can even request that a normal distribution be overlaid on the histogram to assist your visual determination. Fig 1 presents some sample histograms. It appears that birth weight (Fig 1A) is fairly normal, but total charges (Fig 1B) are highly skewed right (ie, a few observations are much higher than the rest). You should be cautious of using any statistical test on the total charges data that relies on the data being from a normal distribution. One way to try to fix up the charge data is to transform them by taking the natural log (Fig 2).
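The visual check and the log transformation described in Step 3 can also be approximated numerically. The sketch below is a minimal illustration on simulated data, not the KID records: the charge values are drawn from a log-normal distribution chosen only for illustration, and sample skewness is used as a simple stand-in for eyeballing a histogram (it is not the Shapiro-Wilk test). The names `sample_skewness` and `simulated_charges` are hypothetical.

```python
import math
import random
import statistics


def sample_skewness(values):
    """Moment-based skewness: near 0 for symmetric data, positive for a long right tail."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Simulated right-skewed "charges" (an assumption for illustration, not real data).
rng = random.Random(0)
simulated_charges = [rng.lognormvariate(9.0, 1.0) for _ in range(414)]

# Natural log transformation, as described in Step 3.
log_charges = [math.log(c) for c in simulated_charges]

raw_skew = sample_skewness(simulated_charges)
log_skew = sample_skewness(log_charges)
raw_mean = statistics.fmean(simulated_charges)
raw_median = statistics.median(simulated_charges)

print(f"skewness of raw charges: {raw_skew:.2f}")
print(f"skewness of ln(charges): {log_skew:.2f}")
print(f"raw mean vs raw median:  {raw_mean:.0f} vs {raw_median:.0f}")
```

A skewness close to zero after the transformation is consistent with a more symmetric, bell-like shape. A mean well above the median on the raw scale is another quick sign of a long right tail.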
The distribution of the transformed data looks much more normal, but trying to explain differences between 2 groups on a natural log-transformed scale can become confusing. When in doubt, it is always safer to assume that your data are not normally distributed. Most of the time in health care, they are not. You won't lose much statistical power, and you won't lose any sleep at night because you made the wrong decision.

FIGURE 2 Distribution of charges from the KID 2009 database for 414 children with a principal diagnosis of septicemia of the newborn after the natural log (ln) transformation.

STEP 4: SUMMARIZE YOUR DATA

Although summarizing the data is not really a necessary step in the process for comparing the center of 2 distributions, it is an important thing to remember when you are presenting your results. If you feel comfortable from Step 3 that your data are normal, you can summarize the data within the groups by using means with SDs or 95% confidence intervals. However, if you have nonnormal data, it is always best to use medians and quartiles (ie, the 25th and 75th percentiles) to avoid the influence of outliers in the data. There are other options if you really prefer means, with the added advantage of being more sensitive to changes than the median. The more common robust (ie, performs well with nonnormal distributions) means include the following: (1) the geometric mean, which can be calculated by taking the natural log of your data, calculating the mean, and then exponentiating the result; (2) a trimmed mean [3], which excludes any value beyond a specified threshold (eg, below the fifth or higher than the 95th percentiles [ie, 10% trimmed mean]); or (3) the Winsorized mean, which does not exclude values below the fifth or higher than the 95th percentiles but replaces them with the fifth or 95th percentile (ie, the 10% Winsorized mean). Table 1 describes these various measures of the center for the charge data. It is noteworthy that the mean is much larger than the other measures because it is influenced by the high outliers in our data.

TABLE 1 Summary Statistics for Charges According to Gender From the KID 2009 Database for Infants With a Principal Diagnosis of Septicemia of the Newborn

Measure of the Center | Charges ($) for Female Subjects (n = 187) | Charges ($) for Male Subjects (n = 227)
Mean (95% CI) | ( to ) | ( to )
Median [IQR] | [ to ] | [ to ]
Geometric mean (95% CI) | (2218 to ) | (2543 to )
10% Trimmed mean (95% CI) | ( to ) | ( to )
10% Winsorized mean (95% CI) | ( to ) | ( to )

IQR, interquartile range.

STEP 5: PICK YOUR TEST

Finally, you are ready to think about a statistical test. The Supplemental Figure displays a flowchart to help you select the appropriate test based on your earlier answers. Suppose we want to know if differences in birth weight exist based on gender. Using the histogram in Fig 3, it looks like male infants generally weigh a little more, but we should confirm it with a t test because we believe we have normally distributed data and 2 groups. (As a side note, the t test was introduced in a 1908 article in Biometrika by a chemist at the Guinness brewery named William Gosset [4]. Because he was not permitted to publish under his own name, Gosset published his work under the pseudonym "Student.")

STEP 6: INTERPRETING THE RESULT

Once a statistical test is performed, it generally yields a value of the test statistic and an associated P value. The P value provides a measure of how much evidence we have to reject the null hypothesis.
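Steps 4 and 5 can be sketched together in code. The example below is a minimal illustration under stated assumptions, not the analysis behind Table 1: all data are simulated (the group means, spreads, and charge distribution are made-up values), the percentile cutoffs follow the fifth and 95th percentile definitions given in Step 4, the equal-variance (pooled) form of the t test is assumed, and the P value is obtained by numerically integrating the t density rather than by calling a statistics package. The function names are hypothetical.

```python
import math
import random
import statistics


def center_summaries(values):
    """Step 4: measures of the center for one group (values must be positive)."""
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    p5, p95 = cuts[0], cuts[-1]  # 5th and 95th percentiles
    kept = [v for v in values if p5 <= v <= p95]      # tails excluded
    clipped = [min(max(v, p5), p95) for v in values]  # tails replaced
    logs = [math.log(v) for v in values]
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # natural log, then mean, then exponentiate
        "geometric_mean": math.exp(statistics.mean(logs)),
        "trimmed_mean_10": statistics.mean(kept),
        "winsorized_mean_10": statistics.mean(clipped),
    }


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided P value from Simpson integration of the t density (steps must be even)."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_area = 2 * total * h / 3  # probability between -|t| and +|t|
    return max(0.0, 1.0 - central_area)  # clamp tiny negative rounding error


def pooled_t_test(a, b):
    """Step 5: two-sample t test assuming equal variances in the two groups."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(a) - statistics.mean(b)) / se
    return t_stat, df, t_two_sided_p(t_stat, df)


rng = random.Random(2009)

# Step 4 on simulated right-skewed charges (illustrative values only).
charges = [rng.lognormvariate(9.0, 1.0) for _ in range(187)]
for name, value in center_summaries(charges).items():
    print(f"{name:>20}: {value:,.0f}")

# Step 5 on simulated, roughly normal birth weights in grams (illustrative values only).
females = [rng.gauss(3200, 500) for _ in range(187)]
males = [rng.gauss(3350, 500) for _ in range(227)]
t_stat, df, p_value = pooled_t_test(females, males)
print(f"t = {t_stat:.2f}, df = {df}, P = {p_value:.4f}")
```

On skewed data such as the simulated charges, the plain mean sits well above the median and the robust means, which mirrors the pattern noted for Table 1. The numerical integration is used here only to keep the sketch free of external packages; in practice a statistics package would report the same quantities directly.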
For the tests discussed here, the null hypothesis is that the groups have the same central tendency (ie, mean or median). The alternative hypothesis would be that the groups are different. Generally, if P < .05, we reject the null hypothesis in favor of the alternative. Figure 4 presents the SAS version 9.4 (SAS Institute, Inc, Cary, NC) output for the birth weight example comparing female and male infants, but the t test can also be performed in Excel by using the T.TEST function (Microsoft Corporation, Redmond, WA). In Fig 4A, we note that female subjects have a mean weight of g compared with male subjects with an average weight of g. The version of the t test used to compare the averages depends on whether the 2 groups have a similar spread (ie, variance). This question is tested in Fig 4B. The P value for this test is large (P = .447), so we have no evidence of unequal variances; we therefore treat the groups as having equal variance and use the pooled version of the t test in Fig 4C as opposed to the Satterthwaite version, which was designed for unequal variance. For the pooled version, P indicates strong evidence against the null hypothesis of equal means in birth weight between the genders. In other words, we are very confident that true differences in birth weight exist between male and female newborns and that these observed differences are more than just random chance. There is a lot of other interesting information in the output from SAS (including the estimated difference of g), but we have done our duty to compare the means.

FIGURE 3 Distribution of birth weight according to gender from the KID 2009 database for 187 female subjects and 227 male subjects with a principal diagnosis of septicemia of the newborn.

FIGURE 4 SAS output from the TTEST procedure comparing the birth weights of females and males with a principal diagnosis of septicemia of the newborn. CL, confidence level; SD, standard deviation; SE, standard error.

You may be wondering what would have happened if we had not assumed that the data were normally distributed and used a Wilcoxon rank-sum test [5] instead. In this example, we would have come to the same conclusion because the result of that test was P.

Comparing continuous data from 2 groups is common in research, and the comparisons are typically presented in the first table of the article. The results of the statistical tests allow the reader to assess baseline differences between the groups and guide the multivariable modeling that needs to occur. These statistical tests can easily be performed with the few steps outlined here and some basic software.

REFERENCES

1. Introduction to the HCUP KIDS Inpatient Database (KID). Healthcare Cost and Utilization Project. Available at: kid_2009_introduction.jsp. Accessed November 13,
2. Shapiro SS, Wilk MB. An analysis of variance test for normality (complete samples). Biometrika. 1965;52(3/4):
3. Wilcox RR. Introduction to Robust Estimation and Hypothesis Testing. 3rd ed. Amsterdam, the Netherlands: Elsevier/Academic Press;
4. Raju TN. William Sealy Gosset and William A. Silverman: two students of science. Pediatrics. 2005;116(3):
5. Wilcoxon F. Individual comparisons by ranking methods. Biom Bull. 1945;1(6):


More information

Use Survey Happiness Complete. SAV file on Blackboard Assigned to Mistake 8

Use Survey Happiness Complete. SAV file on Blackboard Assigned to Mistake 8 Deborah Hughis Final Exam 9802 Dr. Morote Use Survey Happiness Complete. SAV file on Blackboard Assigned to Mistake 8 (We were given individualized data from files and had to analyze them to complete the

More information

Reliability, validity, and all that jazz

Reliability, validity, and all that jazz Reliability, validity, and all that jazz Dylan Wiliam King s College London Introduction No measuring instrument is perfect. The most obvious problems relate to reliability. If we use a thermometer to

More information

STAT 503X Case Study 1: Restaurant Tipping

STAT 503X Case Study 1: Restaurant Tipping STAT 503X Case Study 1: Restaurant Tipping 1 Description Food server s tips in restaurants may be influenced by many factors including the nature of the restaurant, size of the party, table locations in

More information

breast cancer; relative risk; risk factor; standard deviation; strength of association

breast cancer; relative risk; risk factor; standard deviation; strength of association American Journal of Epidemiology The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail:

More information

Introduction & Basics

Introduction & Basics CHAPTER 1 Introduction & Basics 1.1 Statistics the Field... 1 1.2 Probability Distributions... 4 1.3 Study Design Features... 9 1.4 Descriptive Statistics... 13 1.5 Inferential Statistics... 16 1.6 Summary...

More information

Department of Statistics TEXAS A&M UNIVERSITY STAT 211. Instructor: Keith Hatfield

Department of Statistics TEXAS A&M UNIVERSITY STAT 211. Instructor: Keith Hatfield Department of Statistics TEXAS A&M UNIVERSITY STAT 211 Instructor: Keith Hatfield 1 Topic 1: Data collection and summarization Populations and samples Frequency distributions Histograms Mean, median, variance

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 5, 6, 7, 8, 9 10 & 11)

More information

Statistics Guide. Prepared by: Amanda J. Rockinson- Szapkiw, Ed.D.

Statistics Guide. Prepared by: Amanda J. Rockinson- Szapkiw, Ed.D. This guide contains a summary of the statistical terms and procedures. This guide can be used as a reference for course work and the dissertation process. However, it is recommended that you refer to statistical

More information

Political Science 15, Winter 2014 Final Review

Political Science 15, Winter 2014 Final Review Political Science 15, Winter 2014 Final Review The major topics covered in class are listed below. You should also take a look at the readings listed on the class website. Studying Politics Scientifically

More information

Chapter 2--Norms and Basic Statistics for Testing

Chapter 2--Norms and Basic Statistics for Testing Chapter 2--Norms and Basic Statistics for Testing Student: 1. Statistical procedures that summarize and describe a series of observations are called A. inferential statistics. B. descriptive statistics.

More information

Psychology Research Process

Psychology Research Process Psychology Research Process Logical Processes Induction Observation/Association/Using Correlation Trying to assess, through observation of a large group/sample, what is associated with what? Examples:

More information

1) What is the independent variable? What is our Dependent Variable?

1) What is the independent variable? What is our Dependent Variable? 1) What is the independent variable? What is our Dependent Variable? Independent Variable: Whether the font color and word name are the same or different. (Congruency) Dependent Variable: The amount of

More information

Chapter 17 Sensitivity Analysis and Model Validation

Chapter 17 Sensitivity Analysis and Model Validation Chapter 17 Sensitivity Analysis and Model Validation Justin D. Salciccioli, Yves Crutain, Matthieu Komorowski and Dominic C. Marshall Learning Objectives Appreciate that all models possess inherent limitations

More information

Instructions and Checklist

Instructions and Checklist BIOSTATS 540 Fall 2015 Exam 1 Corrected 9-28-2015 Page 1 of 11 BIOSTATS 540 - Introductory Biostatistics Fall 2015 Examination 1 Due: Monday October 5, 2015 Last Date for Submission with Credit: Monday

More information

Example The median earnings of the 28 male students is the average of the 14th and 15th, or 3+3

Example The median earnings of the 28 male students is the average of the 14th and 15th, or 3+3 Lecture 3 Nancy Pfenning Stats 1000 We learned last time how to construct a stemplot to display a single quantitative variable. A back-to-back stemplot is a useful display tool when we are interested in

More information

EXERCISE: HOW TO DO POWER CALCULATIONS IN OPTIMAL DESIGN SOFTWARE

EXERCISE: HOW TO DO POWER CALCULATIONS IN OPTIMAL DESIGN SOFTWARE ...... EXERCISE: HOW TO DO POWER CALCULATIONS IN OPTIMAL DESIGN SOFTWARE TABLE OF CONTENTS 73TKey Vocabulary37T... 1 73TIntroduction37T... 73TUsing the Optimal Design Software37T... 73TEstimating Sample

More information

Chapter 7: Descriptive Statistics

Chapter 7: Descriptive Statistics Chapter Overview Chapter 7 provides an introduction to basic strategies for describing groups statistically. Statistical concepts around normal distributions are discussed. The statistical procedures of

More information

STA Module 9 Confidence Intervals for One Population Mean

STA Module 9 Confidence Intervals for One Population Mean STA 2023 Module 9 Confidence Intervals for One Population Mean Learning Objectives Upon completing this module, you should be able to: 1. Obtain a point estimate for a population mean. 2. Find and interpret

More information

An Introduction to Statistical Thinking Dan Schafer Table of Contents

An Introduction to Statistical Thinking Dan Schafer Table of Contents An Introduction to Statistical Thinking Dan Schafer Table of Contents PART I: CONCLUSIONS AND THEIR UNCERTAINTY NUMERICAL AND ELEMENTS OF Chapter1 Statistics as a Branch of Human Reasoning Chapter 2 What

More information

Students will understand the definition of mean, median, mode and standard deviation and be able to calculate these functions with given set of

Students will understand the definition of mean, median, mode and standard deviation and be able to calculate these functions with given set of Students will understand the definition of mean, median, mode and standard deviation and be able to calculate these functions with given set of numbers. Also, students will understand why some measures

More information

Medical Statistics 1. Basic Concepts Farhad Pishgar. Defining the data. Alive after 6 months?

Medical Statistics 1. Basic Concepts Farhad Pishgar. Defining the data. Alive after 6 months? Medical Statistics 1 Basic Concepts Farhad Pishgar Defining the data Population and samples Except when a full census is taken, we collect data on a sample from a much larger group called the population.

More information

Choosing the Correct Statistical Test

Choosing the Correct Statistical Test Choosing the Correct Statistical Test T racie O. Afifi, PhD Departments of Community Health Sciences & Psychiatry University of Manitoba Department of Community Health Sciences COLLEGE OF MEDICINE, FACULTY

More information

Identify two variables. Classify them as explanatory or response and quantitative or explanatory.

Identify two variables. Classify them as explanatory or response and quantitative or explanatory. OLI Module 2 - Examining Relationships Objective Summarize and describe the distribution of a categorical variable in context. Generate and interpret several different graphical displays of the distribution

More information

GSK Clinical Study Register

GSK Clinical Study Register In February 2013, GlaxoSmithKline (GSK) announced a commitment to further clinical transparency through the public disclosure of GSK Clinical Study Reports (CSRs) on the GSK Clinical Study Register. The

More information

Psychology Research Process

Psychology Research Process Psychology Research Process Logical Processes Induction Observation/Association/Using Correlation Trying to assess, through observation of a large group/sample, what is associated with what? Examples:

More information

HS Exam 1 -- March 9, 2006

HS Exam 1 -- March 9, 2006 Please write your name on the back. Don t forget! Part A: Short answer, multiple choice, and true or false questions. No use of calculators, notes, lab workbooks, cell phones, neighbors, brain implants,

More information

Observational studies; descriptive statistics

Observational studies; descriptive statistics Observational studies; descriptive statistics Patrick Breheny August 30 Patrick Breheny University of Iowa Biostatistical Methods I (BIOS 5710) 1 / 38 Observational studies Association versus causation

More information

UNEQUAL CELL SIZES DO MATTER

UNEQUAL CELL SIZES DO MATTER 1 of 7 1/12/2010 11:26 AM UNEQUAL CELL SIZES DO MATTER David C. Howell Most textbooks dealing with factorial analysis of variance will tell you that unequal cell sizes alter the analysis in some way. I

More information

Quantitative Data and Measurement. POLI 205 Doing Research in Politics. Fall 2015

Quantitative Data and Measurement. POLI 205 Doing Research in Politics. Fall 2015 Quantitative Fall 2015 Theory and We need to test our theories with empirical data Inference : Systematic observation and representation of concepts Quantitative: measures are numeric Qualitative: measures

More information

9 research designs likely for PSYC 2100

9 research designs likely for PSYC 2100 9 research designs likely for PSYC 2100 1) 1 factor, 2 levels, 1 group (one group gets both treatment levels) related samples t-test (compare means of 2 levels only) 2) 1 factor, 2 levels, 2 groups (one

More information

Types of data and how they can be analysed

Types of data and how they can be analysed 1. Types of data British Standards Institution Study Day Types of data and how they can be analysed Martin Bland Prof. of Health Statistics University of York http://martinbland.co.uk In this lecture we

More information

Investigating the robustness of the nonparametric Levene test with more than two groups

Investigating the robustness of the nonparametric Levene test with more than two groups Psicológica (2014), 35, 361-383. Investigating the robustness of the nonparametric Levene test with more than two groups David W. Nordstokke * and S. Mitchell Colp University of Calgary, Canada Testing

More information

Welcome to OSA Training Statistics Part II

Welcome to OSA Training Statistics Part II Welcome to OSA Training Statistics Part II Course Summary Using data about a population to draw graphs Frequency distribution and variability within populations Bell Curves: What are they and where do

More information

LAB ASSIGNMENT 4 INFERENCES FOR NUMERICAL DATA. Comparison of Cancer Survival*

LAB ASSIGNMENT 4 INFERENCES FOR NUMERICAL DATA. Comparison of Cancer Survival* LAB ASSIGNMENT 4 1 INFERENCES FOR NUMERICAL DATA In this lab assignment, you will analyze the data from a study to compare survival times of patients of both genders with different primary cancers. First,

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Statistics Final Review Semeter I Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) The Centers for Disease

More information

List of Figures. List of Tables. Preface to the Second Edition. Preface to the First Edition

List of Figures. List of Tables. Preface to the Second Edition. Preface to the First Edition List of Figures List of Tables Preface to the Second Edition Preface to the First Edition xv xxv xxix xxxi 1 What Is R? 1 1.1 Introduction to R................................ 1 1.2 Downloading and Installing

More information

Empirical Rule ( rule) applies ONLY to Normal Distribution (modeled by so called bell curve)

Empirical Rule ( rule) applies ONLY to Normal Distribution (modeled by so called bell curve) Chapter 2.5 Interpreting Standard Deviation Chebyshev Theorem Empirical Rule Chebyshev Theorem says that for ANY shape of data distribution at least 3/4 of all data fall no farther from the mean than 2

More information

Biostatistics for Med Students. Lecture 1

Biostatistics for Med Students. Lecture 1 Biostatistics for Med Students Lecture 1 John J. Chen, Ph.D. Professor & Director of Biostatistics Core UH JABSOM JABSOM MD7 February 14, 2018 Lecture note: http://biostat.jabsom.hawaii.edu/education/training.html

More information

STA 3024 Spring 2013 EXAM 3 Test Form Code A UF ID #

STA 3024 Spring 2013 EXAM 3 Test Form Code A UF ID # STA 3024 Spring 2013 Name EXAM 3 Test Form Code A UF ID # Instructions: This exam contains 34 Multiple Choice questions. Each question is worth 3 points, for a total of 102 points (there are TWO bonus

More information

Chapter 2 Norms and Basic Statistics for Testing MULTIPLE CHOICE

Chapter 2 Norms and Basic Statistics for Testing MULTIPLE CHOICE Chapter 2 Norms and Basic Statistics for Testing MULTIPLE CHOICE 1. When you assert that it is improbable that the mean intelligence test score of a particular group is 100, you are using. a. descriptive

More information

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data TECHNICAL REPORT Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data CONTENTS Executive Summary...1 Introduction...2 Overview of Data Analysis Concepts...2

More information

Statistics: A Brief Overview Part I. Katherine Shaver, M.S. Biostatistician Carilion Clinic

Statistics: A Brief Overview Part I. Katherine Shaver, M.S. Biostatistician Carilion Clinic Statistics: A Brief Overview Part I Katherine Shaver, M.S. Biostatistician Carilion Clinic Statistics: A Brief Overview Course Objectives Upon completion of the course, you will be able to: Distinguish

More information

Explore. sexcntry Sex according to country. [DataSet1] D:\NORA\NORA Main File.sav

Explore. sexcntry Sex according to country. [DataSet1] D:\NORA\NORA Main File.sav EXAMINE VARIABLES=nc228 BY sexcntry /PLOT BOXPLOT HISTOGRAM NPPLOT /COMPARE GROUPS /STATISTICS DESCRIPTIVES /CINTERVAL 95 /MISSING LISTWISE /NOTOTAL. Explore Notes Output Created Comments Input Missing

More information

Our plan for giving better care to people with dementia Oxleas Dementia

Our plan for giving better care to people with dementia Oxleas Dementia Our plan for giving better care to people with dementia Oxleas Dementia 2013-2016 November 2013 1 Contents 1. What is our plan about? 2. Finding out if someone has dementia 3. Finding out the care and

More information