Using SAS to Calculate Tests of Cliff s Delta. Kristine Y. Hogarty and Jeffrey D. Kromrey

Similar documents
Statistics as a Tool. A set of tools for collecting, organizing, presenting and analyzing numerical facts or observations.

Quantitative Methods in Computing Education Research (A brief overview tips and techniques)

Business Statistics Probability

STATISTICS AND RESEARCH DESIGN

ABSTRACT THE INDEPENDENT MEANS T-TEST AND ALTERNATIVES SESUG Paper PO-10

A SAS Macro to Investigate Statistical Power in Meta-analysis Jin Liu, Fan Pan University of South Carolina Columbia

What you should know before you collect data. BAE 815 (Fall 2017) Dr. Zifei Liu

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Still important ideas

Investigating the robustness of the nonparametric Levene test with more than two groups

Parameter Estimation of Cognitive Attributes using the Crossed Random- Effects Linear Logistic Test Model with PROC GLIMMIX

Unit 1 Exploring and Understanding Data

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14

Still important ideas

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F

Midterm Exam ANSWERS Categorical Data Analysis, CHL5407H

The Association Design and a Continuous Phenotype

Lessons in biostatistics

Overview of Non-Parametric Statistics

Dr. Kelly Bradley Final Exam Summer {2 points} Name

Prepared by: Assoc. Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies

AMSc Research Methods Research approach IV: Experimental [2]

Examining differences between two sets of scores

SOME NOTES ON STATISTICAL INTERPRETATION

Using a Likert-type Scale DR. MIKE MARRAPODI

One-Way Independent ANOVA

Today: Binomial response variable with an explanatory variable on an ordinal (rank) scale.

Linear Regression in SAS

Simple Linear Regression the model, estimation and testing

On testing dependency for data in multidimensional contingency tables

CHAPTER 3 RESEARCH METHODOLOGY

9 research designs likely for PSYC 2100

Manifestation Of Differences In Item-Level Characteristics In Scale-Level Measurement Invariance Tests Of Multi-Group Confirmatory Factor Analyses

Title: A new statistical test for trends: establishing the properties of a test for repeated binomial observations on a set of items

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Choosing the Correct Statistical Test

Chapter 2--Norms and Basic Statistics for Testing

Introduction to statistics Dr Alvin Vista, ACER Bangkok, 14-18, Sept. 2015

Reveal Relationships in Categorical Data

bivariate analysis: The statistical analysis of the relationship between two variables.

Basic Biostatistics. Chapter 1. Content

PRINCIPLES OF STATISTICS

Application of Local Control Strategy in analyses of the effects of Radon on Lung Cancer Mortality for 2,881 US Counties

CHAPTER - 6 STATISTICAL ANALYSIS. This chapter discusses inferential statistics, which use sample data to

Psychology Research Process

Comparison of the Null Distributions of

Section on Survey Research Methods JSM 2009

Statistics Guide. Prepared by: Amanda J. Rockinson- Szapkiw, Ed.D.

Statistical analysis DIANA SAPLACAN 2017 * SLIDES ADAPTED BASED ON LECTURE NOTES BY ALMA LEORA CULEN

Single-Factor Experimental Designs. Chapter 8

Study Guide for the Final Exam

Research Methods 1 Handouts, Graham Hole,COGS - version 1.0, September 2000: Page 1:

Appraisal of Statistical Practices in HRI vis-á-vis the T-Test for Likert Items/Scales

Practical Experience in the Analysis of Gene Expression Data

Survey research (Lecture 1) Summary & Conclusion. Lecture 10 Survey Research & Design in Psychology James Neill, 2015 Creative Commons Attribution 4.

Survey research (Lecture 1)

STA 3024 Spring 2013 EXAM 3 Test Form Code A UF ID #

Chapter 2 Norms and Basic Statistics for Testing MULTIPLE CHOICE

Empirical Formula for Creating Error Bars for the Method of Paired Comparison

Applied Medical. Statistics Using SAS. Geoff Der. Brian S. Everitt. CRC Press. Taylor Si Francis Croup. Taylor & Francis Croup, an informa business

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Overview of Lecture. Survey Methods & Design in Psychology. Correlational statistics vs tests of differences between groups

Abstract. Introduction A SIMULATION STUDY OF ESTIMATORS FOR RATES OF CHANGES IN LONGITUDINAL STUDIES WITH ATTRITION

List of Figures. List of Tables. Preface to the Second Edition. Preface to the First Edition

SUMMER 2011 RE-EXAM PSYF11STAT - STATISTIK

Chapter 1: Introduction to Statistics

How to describe bivariate data

Readings Assumed knowledge

How to analyze correlated and longitudinal data?

Two-Way Independent ANOVA

MMI 409 Spring 2009 Final Examination Gordon Bleil. 1. Is there a difference in depression as a function of group and drug?

Score Tests of Normality in Bivariate Probit Models

Six Modifications Of The Aligned Rank Transform Test For Interaction

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp

SESUG Paper SD

Chapter 2 Organizing and Summarizing Data. Chapter 3 Numerically Summarizing Data. Chapter 4 Describing the Relation between Two Variables

SPRING GROVE AREA SCHOOL DISTRICT. Course Description. Instructional Strategies, Learning Practices, Activities, and Experiences.

EPIDEMIOLOGY. Training module

Confidence Intervals On Subsets May Be Misleading

PSY 216: Elementary Statistics Exam 4

Research and Evaluation Methodology Program, School of Human Development and Organizational Studies in Education, University of Florida

HOW STATISTICS IMPACT PHARMACY PRACTICE?

Analysis and Interpretation of Data Part 1

MEA DISCUSSION PAPERS

Hungry Mice. NP: Mice in this group ate as much as they pleased of a non-purified, standard diet for laboratory mice.

Research Designs and Potential Interpretation of Data: Introduction to Statistics. Let s Take it Step by Step... Confused by Statistics?

POLS 5377 Scope & Method of Political Science. Correlation within SPSS. Key Questions: How to compute and interpret the following measures in SPSS

Daniel Boduszek University of Huddersfield

n Outline final paper, add to outline as research progresses n Update literature review periodically (check citeseer)

appstats26.notebook April 17, 2015

An Introduction to Research Statistics

Louis Leon Thurstone in Monte Carlo: Creating Error Bars for the Method of Paired Comparison

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj

Meta-Analysis: A Gentle Introduction to Research Synthesis

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Meta-Analysis of Correlation Coefficients: A Monte Carlo Comparison of Fixed- and Random-Effects Methods

LAB ASSIGNMENT 4 INFERENCES FOR NUMERICAL DATA. Comparison of Cancer Survival*

Bayes Factors for t tests and one way Analysis of Variance; in R

Transcription:

Using SAS to Calculate Tests of Cliff s Delta Kristine Y. Hogarty and Jeffrey D. Kromrey Department of Educational Measurement and Research, University of South Florida ABSTRACT This paper discusses a program that calculates three estimates for making statistical inferences regarding the delta statistic. The delta statistic (Cliff, 99) may be used when testing null level measurements. This statistic and the inferential methods associated with it are addressed by considering data arranged in a dominance matrix. The program calculates the marginal values of the dominance matrix that are used in the inferential statistics associated with d (sample estimate of the population parameter). The three methods of inference for d include two estimates of the variance of d, and a confidence band around the sample value of d. The paper provides a demonstration of the SAS/IML code and examples of application of the code to simulation studies in statistics. INTRODUCTION A variety of options are available to researchers interested in examining response variables that are measured as ordered categories, such as Likert scale and other rating scale items. A Pearsonian chi-square test of homogeneity or a test for the equality of population means such as the independent means t-test are commonly employed when testing for the equality of two groups on such a response variable. Implicit in the former analysis is the treatment of the response variable as nominal-level measurement, while the latter analysis implies an assumption of interval-level data. Ordinal indices of association fall between these two extremes, but are less frequently seen in applied research (Cliff, 996a). Norman Cliff (99, 996a) has proposed the use of the delta statistic for testing null level measurements. The delta statistic is used to test equivalence of probabilities of scores in each group being larger than scores in the other (the property that Cliff (99) referred to as "dominance"). Cliff (996a) states several reasons for preferring ordinal analyses over the more familiar methods. First, ordinal measures of association such as Tau and delta are useful both descriptively and inferentially because of their robustness properties when compared to traditional parametric tests such as the independent means t-test. Second, many variables in behavioral research can only be given ordinal scale status, and ordinal statistics are invariant under monotonic transformations, whereas those based on certain parametric analysis may not be. Lastly, ordinal statistics may provide answers more directly aligned with the research questions under investigation. In a recent Monte Carlo study, Kromrey and Hogarty (998) compared the Type I error control and statistical power of four tests of group differences on ordered categorical response data: a parametric test of mean differences (independent means t-test), the Pearsonian chi-square test of homogeneity, the cumulative logit model recommended by Agresti (989, 996), and the delta statistic recommended by Cliff (99, 996a). The number of categories of the response variable, sample size, population distribution shape, and effect size were examined. The findings suggest, that under certain conditions (e.g., skewed marginal distributions, 7-point response scale), the cumulative logit models and the tests of delta tend to be more powerful than the other tests. These initial findings suggest that for many data conditions, the choice of an appropriate test statistic is vitally important to the validity of research inferences.

AN EXAMPLE This program was designed to build a dominance matrix; calculate the sample estimate of the delta statistic (d); calculate both a consistent and an unbiased estimate of the variance of d; and to construct three asymmetric confidence intervals (i.e., 90%, 95%, and 99%). These statistics will be presented in reference to the set of data presented in Table. Table Sample of Two Groups Responses to a 5-Point Likert Item Control Group 5 Experimental Group 5 These data, consisting of responses to a 5-point Likert item, were obtained from six members of an experimental group and ten members of a control group. The research question to be addressed is whether the two populations from which the samples were obtained differ in their response to this item. Norman Cliff (99, 996a) has proposed the use of the delta statistic for testing null level measurements. The population parameter for which such tests are intended is the probability that a randomly selected member of one population has a higher response than a randomly selected member of the second population, minus the reverse probability. That is, delta = Pr(x i >x j ) - Pr(x i <x j ) where x i is a member of population one and x j is a member of population two. A sample estimate of this parameter can be obtained by enumerating the number of occurrences of a sample one member having a higher response value than a sample two member, and the number of occurrences of the reverse. This gives the sample statistic #(x i >x j ) - #(x i <x j ) d = n This statistic, and inferential methods associated with it, are readily addressed by considering the data in an arrangement called a dominance matrix. This by n matrix has elements taking the value of if the row response is larger than the column response, - if the row response is less than the column response, and 0 if the two responses are identical. The sample value of d is simply the average value of the elements in the dominance matrix. The dominance matrix for the Table data is presented in Table. Table Dominance Matrix for the Sample Data 5 di. 0 - - - - - -0.8 0 - - - - - -0.8 0 - - - - -.0500 0 - - - - -.0500 0 - - - - -.0500 0 - - - -0.67 0 - - - -0.67 0 - - - -0.67 0 0-0. 5 0 0.8 d.j 0.8 0. -0. -0.7-0.7-0.9-0.50 The row and column marginals of this table provide mean values of the elements in the respective rows and columns of the matrix. These marginals are used in the inferential statistics associated with d. The null hypothesis tested in such inferential statistics (representing no relationship between the grouping variable and the response variable) is that delta is equal to zero.

Cliff (996b) presented three methods of inference for d. The first method uses an "unbiased" estimate of the variance of d. This estimate is given by n Σ (d i. - d) + Σ (d.j - d) - ΣΣ (d ij - d) S d = n (n -)( -) where d i. is the marginal value of row i, d.j is the column marginal of column j, and d ij is the value of element ij in the matrix. For the sample data in Table, the value of d is -0.5 and the value of S d is 0.098. The square root of this variance is used as the denominator of a z statistic: z unbiased = d / S d For the sample data, the value of z is -0.798, yielding a probability under the null hypothesis of 0.5. The unbiased test fails to reject the null hypothesis of delta = 0. The second method of inference for d uses a "consistent" estimate of the variance: (n -)S di. + ( -)S d.j + S dij S dc = n where S di. = Σ(d i. - d) / ( - ), S d.j = Σ(d j. - d) / (n - ), and S dij = ΣΣ(d ij - d) / [( - )(n - )]. As with the "unbiased" estimate of variance, the square root of this "consistent" estimate of the variance of d can be used as the denominator of a z statistic: z consistent = d / S dc For the Table data, the value of S dc is 0.06, yielding a value for z consistent of -0.768, with a probability under the null hypothesis of 0.. The conclusion with this sample is the same as that reached with the unbiased test, that is, a failure to reject the null hypothesis of delta = 0. The final method of inference regarding d uses S dc to construct an asymmetric confidence interval around the sample value of d. When such an interval does not include the value of zero, the null hypothesis of delta = 0 can be rejected. The limits of this asymmetric confidence interval are given by d - d ± Z α/ S dc [(-d ) + Z α/ S dc ] ½ -d + Z α/ S dc where Z α/ is the normal deviate corresponding to the ( - α/) th percentile of the normal distribution. For the Table data, the lower limit of the 95% confidence interval is -0.7, and the upper limit is 0.6. Because this interval contains the value of zero, the null hypothesis is not rejected at the.05 level. Cliff (996a) has pointed out that the well-known Mann-Whitney Wilcoxon statistic can also provide a test of delta = 0 (because d and U are related by d = U/[ n - ]). However, the rank test is not recommended by Cliff because it is actually testing for the equivalence of the two groups' distributions rather than focusing on the parameter delta. MACRO %macro cliffd(testdata,grpvar,dvscore); proc iml; use &testdata; read all var {&dvscore} where (&grpvar=) into A; read all var {&dvscore} where (&grpvar=) into B; NA=NROW(A); NB=NROW(B); START CLIFF(A,B,NA,NB); SCORE={&dvscore}; GRP = {&grpvar}; DOM_MTRX = J(NA,NB,0); do i = to NA; do j = to NB; if A[i,] > B[j,] then DOM_MTRX[i,j] = ; if A[i,] < B[j,] then DOM_MTRX[i,j] = -; end; end; *print DOM_MTRX; di = DOM_MTRX[,+] # (/NB); dj = DOM_MTRX[+,] # (/NA); d = DOM_MTRX[+,+] # (/(NA#NB));

di = (di - d # J(NA,,)) ##; sdi = di[+,]#(/(na-)); dj = (dj - d # J(,NB,)) ##; sdj = dj[,+]#(/(nb-)); dij = (DOM_MTRX - d #J(NA,NB,))##; sdij = dij[+,+] # (/((NA-)#(NB-))); "unbiased" estimate sd = (NB###di[+,] + NA###dj[,+] - dij[+,+]) / (NA#NB#(NA-)#(NB-)); sd = ((NB#sdi)/(NA#(NB-))) + ((NA#sdj)/ (NB#(NA-))) - (sdij/(na#nb)); if sd < (( - d##)/(na#nb-)) then sd = (( - d##)/(na#nb-)); z = d / sqrt(sd); prob_z = ( - probnorm(abs(z)))#; "consistent" estimate sdc = ((NB-)#sdi + (NA-)#sdj + sdij) / (NA#NB); z = d / sqrt(sdc); prob_z = ( - probnorm(abs(z)))#; Asymmetric confidence intervals CI_95U = (d - d## +.96#sqrt(sdc) # sqrt(( - d##)## +.96###sdc)) / ( - d## +.96###sdc); CI_95L = (d - d## -.96#sqrt(sdc) # sqrt(( - d##)## +.96###sdc)) / ( - d## +.96###sdc); CI_90U = (d - d## +.65# sqrt(sdc) # sqrt(( - d##)## +.65###sdc)) / ( - d## +.65###sdc); CI_90L = (d - d## -.65#sqrt(sdc) # sqrt(( - d##)## +.65###sdc)) / ( - d## +.65###sdc); CI_99U = (d - d## +.576# sqrt(sdc) # sqrt(( - d##)## +.576###sdc)) / ( - d## +.576###sdc); CI_99L = (d - d## -.576#sqrt(sdc) # sqrt(( - d##)## +.576###sdc)) / ( - d## +.576###sdc); file print; put @0 'Tests of d'/ @0 ' '// @0 'Dependent Variable=' @5 SCORE // @0 'Grouping Variable=' @5 GRP // @0 'Sample Sizes=' @ NA. @0 NB. // @0 'Sample Value of d=' @ 5 d 8.6 // @0 'Confidence Bands' @0 'Lower Limit' @50 'Upper Limit'/ @0 ' ' @50 ' '/ @5 '90%' @ CI_90L 8.6 @5 CI_90U 8.6 / @5 '95%' @ CI_95L 8.6 @5 CI_95U 8.6 / @5 '99%' @ CI_99L 8.6 @5 CI_99U 8.6 // @0 ' '// @0 'Tests of delta=0:' @7 'Z' @8 'Two-tailed probability'/ @0 ' ' @ ' ' @7 ' '// @ 'Consistent' @ z 6. @5 prob_z 8.6 / @ 'Unbiased' @ z 6. @5 prob_z 8.6 /; FINISH; run cliff(a,b,na,nb); quit; %mend cliffd; data one; input group score; cards; 5 5 ; %cliffd(one,group,score) run;

OUTPUT Table Tests of delta Tests of d Dependent Variable= Grouping Variable= SCORE GROUP Sample Sizes= 0 6 Sample Value of d= -.50000 Confidence Bands Lower Limit Upper Limit 90% -.66806 0.80995 95% -.750 0.6009 99% -.7858 0.5000 Tests of delta=0: Z Two-tailed probability Consistent -0.768 0.688 Unbiased -0.798 0.656 CONCLUSION As stated earlier, under certain circumstances, tests of delta have been found to be more powerful than some of the more commonly used tests such as the independent means t-test or the Pearsonian chi-square. This macro is provided to facilitate the calculation of estimates of the delta statistic (d), both the unbiased and consistent estimates of the variance of d; and to construct asymmetric confidence intervals. Cliff, N. (99). Dominance statistics: Ordinal analyses to answer ordinal questions. Psychological Bulletin,, 9-509. Cliff, N. (996a). Answering ordinal questions with ordinal data using ordinal statistics. Multivariate Behavioral Research,, -50. Cliff, N. (996b). Ordinal methods for behavioral data analysis. Hillsdale, NJ: Erlbaum. Kromrey, J. D., & Hogarty, K. Y. (998). Analysis Options for Testing Group Differences on Ordered Categorical Variables: An Empirical Investigation of Type I Error Control and Statistical Power. Multiple Linear Regression Viewpoints, 5, 70-8. SAS/IML is a registered trademark of SAS Institute Inc. in the USA and other countries. indicates USA registration. CONTACT INFORMATION The authors can be contacted at the University of South Florida, Department of Educational Measurement and Research, FAO 00U, 0 E. Fowler Ave, Tampa, FL 60, by telephone at (8) 97-0, or by e-mail: khogarty@luna.cas.usf.edu or kromrey@typhoon.coedu.usf.edu REFERENCES Agresti, A. (989). Tutorial on modeling ordered categorical response data. Psychological Bulletin, 05, 90-0. Agresti, A. (996). An introduction to categorical data analysis. New York: Wiley.