Biostatistics & SAS programming


Kevin Zhang
April 18, 2017

Determine Sample Size and Power

Errors

In practice

When you design a study, you first need to decide how many units, i.e. the sample size, should be involved: 10, 100, 1000, or more?

Which one would you trust?
  A sample with 10 observations
  A sample with 10,000 observations

Power

The power of a hypothesis test measures the sensitivity of the test: the probability of rejecting the null hypothesis when it is in fact false. It tells us how far the conclusion can be relied on.

Power function
The power function expresses power as a function of the sample size, so we can increase the power by taking a larger sample.
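To make the link between sample size and power concrete, here is a minimal Python sketch. It is not from the slides: it assumes a two-sided, two-sample comparison of means with a normal approximation, and the standardized effect size of 0.5 is a hypothetical value chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, effect_size, alpha=0.05):
    """Normal-approximation power of a two-sided, two-sample test of means.

    effect_size is the mean difference divided by the common SD
    (a hypothetical input for illustration). The negligible rejection
    probability in the far tail is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(effect_size * sqrt(n_per_group / 2) - z_crit)


# Power for increasing group sizes at a fixed effect size of 0.5.
powers = {n: round(approx_power(n, 0.5), 3) for n in (10, 30, 100, 300)}
print(powers)
```

Under these assumptions the power rises steadily with the group size, which is the behaviour the power function describes.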

PROC POWER

proc power in SAS is used for power analysis. You can compute the power for a given sample size, or determine the sample size needed for a desired power.

PROC POWER needs to know what kind of problem you want to solve:

  MULTREG -- Tests of one or more coefficients in multiple linear regression
  ONECORR -- Fisher's z test and t test of (partial) correlation
  ONESAMPLEFREQ -- Tests, confidence interval precision, and equivalence tests of a single binomial proportion
  ONESAMPLEMEANS -- One-sample t test, confidence interval precision, or equivalence test
  ONEWAYANOVA -- One-way ANOVA including single-degree-of-freedom contrasts
  PAIREDMEANS -- Paired t test, confidence interval precision, or equivalence test
  PLOT -- Displays plots for the previous sample size analysis
  TWOSAMPLEMEANS -- Two-sample t test, confidence interval precision, or equivalence test

Example 1

A clinical dietician wants to compare two different diets, A and B, for diabetic patients. She hypothesizes that diet A (Group 1) will be better than diet B (Group 2) in terms of lower blood glucose. She plans to take a random sample of diabetic patients and randomly assign them to one of the two diets. At the end of the experiment, which lasts 6 weeks, a fasting blood glucose test will be conducted on each patient. She expects the average difference in blood glucose between the two groups to be about 10 mg/dl. She also assumes the standard deviation of blood glucose to be 15 for diet A and 17 for diet B. The dietician wants to know the number of subjects needed in each group, assuming equal-sized groups.

Analysis

SAS code explanation

    proc power;
      twosamplemeans test=diff   /* two-sample test of the difference in means */
        groupmeans = 0 | 10      /* group means; 0 and 10 simply describe the desired difference */
        stddev     = 16.03
        npergroup  = .           /* left missing so that SAS solves for the group size */
        power      = 0.8;        /* desired power of 80% */
    run;

Your settings

We achieve 80% power with 42 patients in each group.
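The slides do not say where stddev = 16.03 comes from. One reading that reproduces it is to pool the two assumed standard deviations (15 and 17) under equal group sizes. The sketch below checks that reading and then applies the textbook normal-approximation formula for the group size. Both the pooling step and the two-sided alpha of 0.05 (the SAS default) are assumptions here, and the formula is an approximation rather than the t-based calculation SAS performs.

```python
from math import ceil, sqrt
from statistics import NormalDist

sd_a, sd_b = 15.0, 17.0      # assumed SDs for diets A and B (Example 1)
mean_diff = 10.0             # expected difference in mg/dl
alpha, power = 0.05, 0.80    # two-sided alpha = 0.05 is an assumption

# Pool the two SDs, assuming equal group sizes.
pooled_sd = sqrt((sd_a**2 + sd_b**2) / 2)

# Normal-approximation sample size per group:
# n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / diff)^2
z = NormalDist()
z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
n_per_group = ceil(2 * (z_sum * pooled_sd / mean_diff) ** 2)

print(round(pooled_sd, 2), n_per_group)  # 16.03 41
```

The approximation lands at 41 per group, one below the 42 reported above. That is the direction expected when a normal quantile stands in for the t distribution.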

Evaluate the power of a given sample

What happens if we only have 30 patients in each group?

    proc power;
      twosamplemeans test=diff
        groupmeans = 0 | 10
        stddev     = 16.03
        npergroup  = 30          /* 30 patients in each group */
        power      = .;          /* left missing so that SAS solves for the power */
    run;
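As a rough cross-check of what to expect from this run, the power can be approximated by hand. This is a sketch under a normal approximation with a two-sided alpha of 0.05 (an assumption), so the figure SAS reports from the t distribution will be slightly lower.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(n1, n2, mean_diff, sd, alpha=0.05):
    """Normal-approximation power for a two-sided, two-sample test of means."""
    z = NormalDist()
    se = sd * sqrt(1 / n1 + 1 / n2)          # standard error of the difference
    return z.cdf(abs(mean_diff) / se - z.inv_cdf(1 - alpha / 2))


power_30 = approx_power_two_sample(30, 30, mean_diff=10, sd=16.03)
print(f"approximate power with 30 per group: {power_30:.2f}")
```

Under these assumptions the power comes out at roughly two thirds, clearly short of the 80% target.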

In practice, how do we evaluate the power of an unbalanced design?

More patients are assigned to Diet A, say 40.
Only 20 patients are willing to take Diet B.

    proc power;
      twosamplemeans test=diff
        groupmeans = 0 | 10
        stddev     = 16.03
        groupns    = (40 20)     /* unequal group sizes: 40 on Diet A, 20 on Diet B */
        power      = .;
    run;
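One way to see the cost of the imbalance is to compare the 40/20 split with a 30/30 split of the same 60 patients. The sketch below does this with a normal approximation (two-sided alpha of 0.05 assumed), and also computes the harmonic-mean "effective" group size, which is a standard shortcut rather than something taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

sd, mean_diff, alpha = 16.03, 10.0, 0.05
z = NormalDist()
z_crit = z.inv_cdf(1 - alpha / 2)


def approx_power(n1, n2):
    """Normal-approximation power for group sizes n1 and n2."""
    se = sd * sqrt(1 / n1 + 1 / n2)
    return z.cdf(mean_diff / se - z_crit)


# Equal-sized groups that would give the same standard error as 40 vs 20.
effective_n = 2 * 40 * 20 / (40 + 20)

power_unbalanced = approx_power(40, 20)
power_balanced = approx_power(30, 30)
print(round(effective_n, 1), round(power_unbalanced, 2), round(power_balanced, 2))
```

With the same total of 60 patients, the 40/20 split behaves like a balanced design with fewer than 27 per group, so its power is lower than that of the 30/30 design.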

Small simulation study

How does the required sample size change when the mean difference changes?

    proc power;
      twosamplemeans test=diff
        meandiff  = .2 to 1.2 by .2   /* differences checked: 0.2, 0.4, 0.6, 0.8, 1.0, 1.2 */
        stddev    = 1
        power     = .8
        npergroup = .;
    run;

A larger difference is easier to detect, so a smaller sample size is needed.
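The same sweep can be sketched outside SAS with the normal-approximation formula. It assumes a two-sided alpha of 0.05, and because it ignores the t distribution the group sizes are expected to be slightly smaller than the ones SAS prints.

```python
from math import ceil
from statistics import NormalDist

z = NormalDist()
z_sum = z.inv_cdf(0.975) + z.inv_cdf(0.80)   # two-sided alpha 0.05, power 0.80
sd = 1.0

sizes = {}
for k in range(1, 7):
    diff = round(0.2 * k, 1)                 # 0.2, 0.4, ..., 1.2
    sizes[diff] = ceil(2 * (z_sum * sd / diff) ** 2)

print(sizes)
# {0.2: 393, 0.4: 99, 0.6: 44, 0.8: 25, 1.0: 16, 1.2: 11}
```

The required group size falls roughly with the square of the difference: doubling the difference cuts the sample size to about a quarter.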

Power chart

A plot showing how the required sample size changes with power.

    proc power;
      twosamplemeans test=diff
        meandiff = .2 to 1 by .2
        stddev   = 1
        power    = .9
        ntotal   = .;
      plot x = power min=.5 max=.95;   /* power on the x axis, from 0.5 to 0.95 */
    run;

Correlation Example

A researcher is interested in seeing whether a significant positive correlation exists between reading speed and IQ in adolescents. Before beginning the study, the researcher would like to know what sample size would be required to detect a positive correlation of 0.5 with power of 80%.

  Correlation analysis
  Hypothesis test about the significance of the correlation
  Assumed correlation is 0.5

$H_0: \rho = 0$ vs $H_1: \rho > 0$

    proc power;
      onecorr
        alpha  = 0.05
        sides  = 1               /* one-sided test, matching H1: rho > 0 */
        corr   = 0.5             /* assumed correlation */
        ntotal = .               /* solve for the total sample size */
        power  = 0.8;
    run;
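For intuition about the size of the answer, the classical Fisher z approximation can be evaluated by hand. This sketch uses the unadjusted textbook formula, n = ((z_alpha + z_power) / atanh(rho))^2 + 3. That is an assumption on my part: the method SAS applies may include refinements, so its result can differ by a unit or so.

```python
from math import atanh, ceil
from statistics import NormalDist

rho = 0.5                    # assumed correlation under the alternative
alpha, power = 0.05, 0.80    # one-sided alpha, as in sides=1

z = NormalDist()
z_sum = z.inv_cdf(1 - alpha) + z.inv_cdf(power)

# Fisher z transform: atanh(r) is approximately normal with variance 1/(n - 3).
n_total = ceil((z_sum / atanh(rho)) ** 2 + 3)
print(n_total)  # 24
```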

Proportion Example

A survey claims that 90% of dentists recommend a particular brand of toothpaste for their patients suffering from sensitive teeth. A researcher decides to test this claim by taking a random sample of 80 dentists, but first wants to find out whether this sample size is large enough to achieve 80% power.

  Hypothesis test about the proportion (i.e. percentage)

    proc power;
      onesamplefreq test
        sides          = 2
        nullproportion = 0.9                   /* the claimed 90% */
        proportion     = 0.05 to 0.85 by 0.05  /* possible true proportions, all different from 90% */
        alpha          = 0.05
        ntotal         = 80
        power          = .;
    run;

We assume the true proportion is something other than the claimed 90%, and check the power for a list of possible proportions.
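To get a feel for whether 80 dentists are enough, the power can be approximated with a normal-approximation z test. This is a sketch only: the slide does not show which test SAS is asked to use, so the numbers below should be read as rough guides, not as the SAS output.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_one_prop(p, p0=0.9, n=80, alpha=0.05):
    """Normal-approximation power of a two-sided z test of H0: proportion = p0."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    se0 = sqrt(p0 * (1 - p0) / n)            # standard error under the null
    se1 = sqrt(p * (1 - p) / n)              # standard error under the alternative
    lower = p0 - z_crit * se0                # reject if the sample proportion is below this
    upper = p0 + z_crit * se0                # ... or above this
    return z.cdf((lower - p) / se1) + (1 - z.cdf((upper - p) / se1))


for p in (0.85, 0.80, 0.75):
    print(p, round(approx_power_one_prop(p), 2))
```

Under this approximation, a sample of 80 has well under 80% power if the true proportion is 0.85, but comfortably more than 80% if it is as low as 0.75.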

One-sample t test of a mean

A researcher is planning a pharmaceutical study on a new formulation of a drug. The current formulation has an average elimination rate of 0.06. The researcher hypothesizes that the elimination rate for the new formulation is higher than 0.06. Wanting to be confident, the researcher would like to see how large the sample size must be to achieve 90% power. A standard deviation of 0.02 will be used, based on studies of the original formulation of the drug.

  Hypothesis test comparing the mean to 0.06
  One-tailed test

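    proc power;
      onesamplemeans
        sides    = 1                   /* test structure: one-tailed */
        nullmean = 0.06                /* null hypothesis */
        mean     = 0.01 to 0.1 by 0.01 /* candidate true means; 0.06 equals the null, so no sample size reaches 90% power there */
        stddev   = 0.02
        ntotal   = .
        power    = 0.9;
    run;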
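A quick hand calculation shows how strongly the answer depends on the assumed true mean. The sketch uses the one-sided normal-approximation formula n = ((z_alpha + z_power) * sd / (mean - nullmean))^2 and only the means above 0.06. Since it ignores the t distribution, SAS is expected to report somewhat larger sample sizes, especially where n is small.

```python
from math import ceil
from statistics import NormalDist

null_mean, sd = 0.06, 0.02
alpha, power = 0.05, 0.90    # one-sided alpha, as in sides=1

z = NormalDist()
z_sum = z.inv_cdf(1 - alpha) + z.inv_cdf(power)

sizes = {}
for mean in (0.07, 0.08, 0.09, 0.10):
    sizes[mean] = ceil((z_sum * sd / (mean - null_mean)) ** 2)

print(sizes)
```

An improvement of 0.01 over the current rate needs a few dozen subjects under this approximation, while larger improvements need only a handful.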

Paired t test

A researcher is interested in investigating whether BMI changes in males aged 55-65 years after spending four weeks on a novel diet and exercise program. The researcher plans to take BMI measurements on a random sample of men before and after the intervention and see whether there was a change. An 80% level of power is desired, and a standard deviation of 2.0, based on past studies of weight loss and BMI change, is used for the calculations.

  Comparison between two readings: t test
  The SAME sample is measured twice (before vs after)
  Paired design

    proc power;
      pairedmeans
        sides    = 2
        nulldiff = 0                /* the null assumes no difference */
        meandiff = 0.5 to 3 by 0.5  /* possible differences in the sample */
        corr     = 0.5              /* correlation: before vs after */
        stddev   = 2.0
        npairs   = .                /* npairs instead of ntotal */
        power    = 0.8;
    run;
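The role of corr becomes clearer when the standard deviation of the before/after differences is written out. The sketch assumes that stddev = 2.0 is the common SD of each reading, so the SD of the differences is sd * sqrt(2 * (1 - corr)). It then applies a two-sided normal approximation with alpha = 0.05 (an assumption), which slightly understates the t-based answer.

```python
from math import ceil, sqrt
from statistics import NormalDist

sd, corr = 2.0, 0.5
alpha, power = 0.05, 0.80

# SD of the paired differences, assuming both readings share the same SD.
sd_diff = sd * sqrt(2 * (1 - corr))

z = NormalDist()
z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)

pairs = {}
for k in range(1, 7):
    diff = 0.5 * k                           # 0.5, 1.0, ..., 3.0
    pairs[diff] = ceil((z_sum * sd_diff / diff) ** 2)

print(sd_diff, pairs)
```

With a correlation of 0.5 the SD of the differences equals the SD of a single reading. A higher correlation would shrink it and reduce the number of pairs needed.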

Example of ANOVA

A researcher is interested in investigating the effects of three different diets on percent weight loss when implemented along with a 5-day-per-week cardio exercise program. The diets include a low carbohydrate diet, a high protein diet, and a control diet (just exercise). Before beginning the study, sample size determinations must be made. The researcher would like to achieve power of 80%. From a previous study, the average percent weight loss values are 9 for Low, 12 for High and 8 for Control. Assume the standard deviation is 3.0.

  Comparing 3 groups (Low, High, Control)
  One-way ANOVA

    proc power;
      onewayanova test=overall
        groupmeans = 9 | 12 | 8    /* Low, High, Control */
        stddev     = 3.0
        npergroup  = .             /* solve for the number of subjects in each group */
        power      = 0.8;
    run;

Here we have 3 groups, so we need to know how many subjects are needed in each. A balanced design is assumed.
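The quantity that drives the ANOVA power calculation is how far the group means spread relative to the within-group standard deviation. The sketch below computes Cohen's f and the noncentrality parameter of the overall F test for a balanced design. It stops short of the power itself, which needs the noncentral F distribution and is what PROC POWER evaluates. The names `cohen_f` and `noncentrality` are illustrative, not SAS terms.

```python
from math import sqrt

group_means = [9.0, 12.0, 8.0]   # Low, High, Control (percent weight loss)
sd = 3.0                         # assumed common within-group SD

grand_mean = sum(group_means) / len(group_means)
ss_means = sum((m - grand_mean) ** 2 for m in group_means)

# Cohen's f: SD of the group means divided by the within-group SD.
cohen_f = sqrt(ss_means / len(group_means)) / sd


def noncentrality(n_per_group):
    """Noncentrality parameter of the overall F test for a balanced design."""
    return n_per_group * ss_means / sd**2


print(round(cohen_f, 3), round(noncentrality(10), 2))
```

An f of this size is large by the usual conventions, so a fairly small number of subjects per group should be enough to reach 80% power.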