Knowledge is Power: The Basics of SAS Proc Power


Elaina Gates, California Polytechnic State University, San Luis Obispo

ABSTRACT

There are many statistics applications where it is important to understand how the power function of a specific distribution behaves. Coding a power function from scratch can be an arduous process, and it becomes more complicated when investigating effect size and sample size. This presentation will cover the basic uses of proc power with regard to testing proportions and include examples of power analyses. It will also demonstrate how simple it is to use proc power to generate plots of power curves and obtain other valuable information.

INTRODUCTION

In hypothesis testing, there is a null hypothesis and an alternative hypothesis. When a result is found to be statistically significant, we reject the null in favor of the alternative. Power, with regard to hypothesis testing, is defined as the probability of correctly rejecting the null hypothesis, that is, of rejecting it when the alternative is true. More generally, the power function gives the probability of rejecting the null hypothesis without any assumption about which hypothesis is true. Many factors play a part in calculating power, one of which is sample size. Choosing a sample size is perhaps the most common reason to study power curves prior to a study. In order to save time and money in statistical studies, researchers use power analysis to determine the sample size they need in order to show statistical significance. This paper will outline how to use proc power, specifically with proportions, to determine a suitable sample size, how to calculate power after a sample size is chosen, and how to interpret the plot of the power curve.

DETERMINING SAMPLE SIZE

Power analysis is useful in determining the number of subjects needed in a study or a clinical trial. One such application is deciding how many subjects are needed in a control group versus a treatment group to achieve a specific level of power. For example, a new drug is being developed to treat migraine headaches.
The current treatment reduces symptoms in 40% of patients; the new drug will be put into production only if its effectiveness is at least 15 percentage points higher than that of the current treatment, that is, at least 55%. For this experiment, we need two groups of patients. One group will be given the current drug and the other will be given the new drug. The results of the two groups will be compared to determine whether the new drug is at least 15 percentage points more effective. Before conducting this experiment, the researchers need to know how many subjects will be needed in each group to achieve power of at least 0.8.

Using proc power, we will conduct a power analysis for this experiment. The code is shown below. We will use the twosamplefreq statement. The test we will use to compare the two groups is a Pearson chi-square test, which is specified in the test= option. The default in proc power is a two-sided test. In this study we will change it to a one-sided test because we are interested only in an improvement in symptoms. Finally, we specify the level of power we want to achieve after power= and include ntotal=. so that SAS solves for the minimum sample size.

    power = 0.8
    ntotal = .;
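The options described above can be assembled into a complete call. The following is a minimal sketch reconstructed from the prose, not the author's original listing: the `sides = 1` spelling of the one-sided option and the space-separated proportion pair are assumptions, and the default significance level of 0.05 is left unchanged.

```sas
/* Sketch: solve for the total sample size in the migraine example. */
/* Assumed reconstruction from the prose, not the author's exact code. */
proc power;
  twosamplefreq test = pchi          /* Pearson chi-square test for two proportions */
    groupproportions = (0.4 0.55)    /* response rates: current drug, new drug */
    sides = 1                        /* one-sided test (assumed option value) */
    power = 0.8                      /* target power */
    ntotal = .;                      /* missing value: SAS solves for total N */
run;
```

Leaving exactly one of `power=` and `ntotal=` missing is what tells the procedure which quantity to solve for.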

Figure 1. Proc Power Results for Migraine Example Part I

After running the power analysis, the output shows that in order to achieve power of at least 0.8 we need a total sample size of 272 subjects. Since we haven't changed any options regarding the weights of the two groups, the default of equal group sizes applies. This can be altered with the groupweights= option, shown below.

    groupweights = (1 2)
    power = 0.8
    ntotal = .;

This will give us the minimum sample size for two groups where one has twice as many subjects as the other.

DETERMINING POWER FOR A GIVEN SAMPLE SIZE

When the sample size for an experiment has already been selected, you can use proc power to calculate the power and to produce plots showing how the sample size affects the power. Suppose that only 160 subjects are qualified to participate in the migraine treatment study described above. The researchers want to know how powerful the test will be with this sample size. Now we change the code to specify the total number of subjects and set power= to missing.

    ntotal = 160
    power = .;
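The two variations discussed in this part of the paper, unequal allocation and a fixed total sample size, can be sketched as complete calls. As before, this is an assumed reconstruction from the prose: the `sides = 1` option and the space-separated proportion pair are not taken from the author's listing.

```sas
/* Sketch: minimum total N when one group is twice the size of the other. */
proc power;
  twosamplefreq test = pchi
    groupproportions = (0.4 0.55)    /* response rates: current drug, new drug */
    groupweights = (1 2)             /* allocation ratio 1:2 between the groups */
    sides = 1                        /* one-sided test (assumed option value) */
    power = 0.8
    ntotal = .;                      /* missing value: SAS solves for total N */
run;

/* Sketch: power when only 160 subjects are available in total. */
proc power;
  twosamplefreq test = pchi
    groupproportions = (0.4 0.55)
    sides = 1
    ntotal = 160                     /* fixed total sample size */
    power = .;                       /* missing value: SAS solves for power */
run;
```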

Figure 2. Proc Power Results for Migraine Example Part II

Effect size can also have an impact on the power of a test. It is harder to detect a small difference between the null and alternative hypotheses than a large one. In the migraine example, the effect size is relatively small. Using the plot statement with proc power, we can look at how different effect sizes and sample sizes change the power of this test (Figure 3). Instead of including only 0.4 and 0.55, which represent the proportions of subjects whose symptoms improve with the current and new medication respectively, we can include many pairs. We have included a smaller effect size, represented by the pair 0.4 and 0.5, as well as two larger effect sizes. The output will now report the power for all of these pairs of proportions. After the twosamplefreq statement, we have included a plot statement. This will generate a plot in the output. By including x=n, we plot sample size on the x-axis. I have also included the limits of the x-axis after min= and max=.

    proc power;
      twosamplefreq test = pchi
        groupproportions = (0.4, 0.5) (0.4, 0.55) (0.4, 0.6) (0.4, 0.65)
        power = .
        ntotal = 300;
      plot x = n min = 100 max = 500;
    run;
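For comparison, curves of this kind can also be approximated by hand. The sketch below uses a standard normal approximation for comparing two proportions and is not PROC POWER's own code; it assumes equal group sizes, a one-sided significance level of 0.05, and the four proportion pairs listed above. The data set name `power_curve` and all variable names are hypothetical.

```sas
/* Sketch: hand-coded power curves using a normal approximation.     */
/* Assumes equal group sizes and a one-sided test at alpha = 0.05.   */
data power_curve;
  alpha   = 0.05;
  p1      = 0.4;                          /* response rate, current drug */
  z_alpha = probit(1 - alpha);            /* one-sided critical value */
  do p2 = 0.5, 0.55, 0.6, 0.65;           /* response rate, new drug */
    pbar = (p1 + p2) / 2;                 /* pooled proportion under equal allocation */
    do ntotal = 100 to 500 by 10;
      n   = ntotal / 2;                   /* subjects per group */
      se0 = sqrt(2 * pbar * (1 - pbar) / n);                 /* SE under the null */
      se1 = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n);       /* SE under the alternative */
      power = probnorm(((p2 - p1) - z_alpha * se0) / se1);
      output;
    end;
  end;
run;

/* One curve per alternative proportion, total sample size on the x-axis. */
proc sgplot data = power_curve;
  series x = ntotal y = power / group = p2;
  xaxis label = "Total sample size";
  yaxis label = "Power";
run;
```

Even this simplified version needs nested loops and an explicit formula for each standard error, which is the bookkeeping the plot statement handles automatically.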

Figure 3. Plot and Results Generated by Proc Power

The plot in Figure 3 illustrates the effects of increasing sample size and increasing effect size. We can see that as the sample size increases, so does the power. The power also increases as the effect size gets larger. This application of proc power saves a lot of time: if we were to code this plot from scratch, we would need many loops and iterations. The output also reports the power for each of the effect sizes.

CONCLUSION

The POWER procedure saves the user coding time and provides the relevant output and plots needed for a power analysis. As a student, I have found proc power to be extremely beneficial because there are many assignments in which I have needed to conduct a power analysis. It is also beneficial for SAS users who design experiments. The plot statement is flexible and more user-friendly than coding from scratch. Other available tests include two-sample tests involving means as well as one-sample tests for both means and proportions. The procedure also has options for tests involving linear regression, survival analysis, ANOVA, and logistic regression. Proc power is a great tool for calculating power, determining sample sizes, and creating plots.

REFERENCES

SAS Institute Inc. SAS 9.2 User Guide: The Power Procedure. Cary, NC: SAS Institute Inc., 2016.

UCLA Statistical Consulting Group. SAS Data Analysis Examples. http://www.ats.ucla.edu/stat/sas/dae/proportionpow.htm

ACKNOWLEDGMENTS

I would like to acknowledge Professor Matthew Carlton for his phenomenal instruction on power curves and hypothesis testing. I would like to thank Rebecca Ottesen for her advice on my presentation as well as her never-ending help and instruction in my SAS endeavors.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Name: Elaina Gates
E-mail: elainagates@gmail.com
Web: https://www.linkedin.com/in/elaina-gates-500247105

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.