A Monte Carlo Simulation Study for Comparing Power of the Most Powerful and Regular Bivariate Normality Tests

International Journal of Statistics and Applications 014, 4(1): 40-45 DOI: 10.593/j.statistics.0140401.04 A Monte Carlo Simulation Study for Comparing Power of the Most Powerful and Regular Bivariate Normality Tests Hamed Tabesh, Zeynab Haghdust, Azadeh Saki * Department of Biostatistics and Epidemiology, School of Health, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran Abstract In many areas of medical research, a bivariate analysis is desirable because it simultaneously tests two correlated response variables. Several parametric bivariate procedures are available but each of them requires bivariate normality assumption for response variables. Although in recent years, continuous efforts have been made to test bivariate normality but it is not clear that which test is the most powerful in specified situation. The aim of this study is to compare power of eight different test of bivariate normality with at least one paper which marked them as a powerful test and dedicate the most powerful test in the specified situation. In this study, power of Mardia skewness, Mardia kurtosis, Henze-Zirkler, Mshapiro, Shapiro-wilk, Royston s W, Doornik-Hansen and Szekely-Rizzo compared with each other using Monte Carlo simulation techniques. The power study shows that the most powerful test under bivariate distributions with different shapes is not the same. Using simulation studies, we show that Mshapiro test will perform much better under symmetric, skewed, medium tailed and heavy tailed distributions. Also, Royston s W test will perform much better when underlying distribution is highly skewed. Keywords Bivariate Analysis, Bivariate Normality Test, Power, Monte Carlo Simulation 1. Introduction 1.1. Preliminary It is often the case that health science studies involve some correlated variables as response variables. For example in a clinical trial whose main purpose was to compare control and treatment group, treatment of chronic obstructive pulmonary diseases, lung function test can be used as the main response variable and peak expiratory flow rate, forced vital capacity and forced expiratory volume were considered as secondary response variables[1]. It is obvious that the correlation between these main and secondary response variables were great. Structure of most data according to different aspect such as underlying structure (intrinsic properties), environmental structure or laboratory structure is observable and usually they are corrected[]. So, for many health research studies and medical studies experiments involve two correlated response data as called bivariate response data. For such studies, a bivariate analysis that compares the treatment on two response variables simultaneously may have advantages over two separate univariate test, one for each variable[3]. The great advantage of bivariate analysis is the * Corresponding author: azadehsaki@yahoo.com (Azadeh Saki) Published online at http://journal.sapub.org/statistics Copyright 014 Scientific & Academic Publishing. All Rights Reserved possibility of increased power. If the response variables are not too highly correlated, the bivariate test has a chance of finding significant differences among the treatments even if none of the univariate tests is significant[4]. But many of the procedure required to analyze bivariate date, including Hotelling T, discriminant analysis and bivariate regression, assume bivariate normality (BVN). So its testing has great importance. Actually most of bivariate tests are based on bivariate normality assumption and perhaps this is a reason that make greater use of BVN tests in recent years[5]. Despite the sensitivity of these bivariate techniques to BVN assumption, this assumption frequently does not tested. May be because of the lack of awareness of the existence of the test and the lack of information regarding size and power[6]. This study focuses on the latter issue of size and power. Bivariate normality test may be restated as follows. Consider random bivariate sample (xxxx, yyyy) ii = 1,.., mm from continuous bivariate population e.g. weight and height of infants in a population. HHHH: (xx, yy)~nn (μμ, ) (1) HH1: HH0 We examine via a Monte Carlo simulation the performance of some of the most common and powerful tests for hypothesis (1) in the literature. 1.. Background In recent years, continuous efforts have been made to test hypothesis (1) and numerous papers have been written on

International Journal of Statistics and Applications 014, 4(1): 40-45 41 this topic. Some properties of these tests are "overall", "omnibus", "directed" or "graphical" but with increase in the number of BVN test conducting logical test seems to be difficult[7]. Most of the BVN tests are development of univariate normality tests. Thus, most of BVN tests are based on skewness and kurtosis or goodness of fit procedure[8]. Generally, BVN tests can be classified in to four groups but none of them are quite distinct[8]: Class 1: goodness of fit test, class : skewness and kurtosis approaches, class3: consistent and invariant test, class 4: graphical and correlational approach. In the first category, most of the comparative and review studies introduced Royston test which is an extension of the Shapiro-Wilk tests, as the best[6, 9]. Mardia skewness and Mardia kurtosis test as the examples of second group have been widely used, although some previous studies have reported them as low power tests. Doornik and Hansen test based on kurtosis can be another example of this category [10]. For the third category approach Epps and Pully in 1983 proposed and invariant test[11]. In the last category, Healy approach could be mentioned that draw BVN based on squared raddi[1]. In many previous studies, researchers have compared power of few tests with their proposed tests and most of the available tests have not been compared with each other, so in this study we wish to calculate power of the most powerful and common recommended BVN tests in previous researches via simulation studies for different distributions (symmetric, skewed and highly skewed).. Methods A total of 8 different tests of BVN, with at least one paper which marked them as powerful test, were studied in this investigation..1. Tests for Bivariate Normality In this section, for completeness a brief description of selected test statistics is presented. Suppose XX represents a bivariate data matrix, where is the sample size..1.1. Mardia Skewness Mardia skewness test is given by MMMM = 1 gg iiii where gg iiii = (xx ii xx ) SS 1 (xx jj xx ) And SS 1 is the inverse of the sample variance matrixes [13]..1.. Mardia Kurtosis Mardia kurtosis test is given by iiii MMMM = 1 gg iiii iiii where gggggg = (xxxx xx ) SS 1 (xxxx xx ) And SS 1 is the inverse of the sample variance matrix SS [13]..1.3. Henze-Zirkler Henze-Zirkler test with test statistic: HHHH = ψψ (tt) exp( tt ) φφ ββ (tt)dd(tt) where ψψ (tt) is an empirical characteristic function, φφ ββ (tt) is a kernel function that was chosen to be NN (0, ββ II ) and ββ is a smoothing parameter[14, 15]..1.4. Mshapiro Mshapiro test is given by where WW = [ aa ii MMMMh = [(1 WW) λλ μμ yy ] σσ yy ii=1 yy (ii) ] (yy ii yy ) and aa = (aa1, aa,, aaaa) = mm VV 1 [mm VV 1 (VV 1 mm )] 1 with mm and VV being of the vector of expected value and the covariance matrix of standard normal order statistic, respectively[16, 17]..1.5. Shapiro wilk Test Shapiro wilk test with test statistics: 1 SSSS = 1 WW zzzz where WW zzzz is Shapiro-Wilk s univariate statistics for the ii tth standardized variate ZZ ii [18]..1.6. Royston s W Test Royston extended the Shapiro-Wilk univariate test to a bivariate (multivariate) case. The test statistic: ii=1 RR = eeee Where GG = jj kkkk and ee estimated by cccccc ee = [1 + (pp 1)cc ] where cc = iiii is an estimate for the average (pp pp) correlation among kkkk and kkkk = (FF 1 FF( ZZZZ ) ) ; FF(xx)is the normal cumulative distribution function[9]..1.7. Doornik and Hansen Test Doornik and Hansen s omnibus test with test statistic: DH = Z 1 Z 1 + Z Z where Z 1 and ZZ denote the transformed skewness and kurtosis, respectively [10]..1.8. Szekely and Rizzo Test Szekely and Rizzo test is given by

4 Hamed Tabesh et al.: A Monte Carlo Simulation Study for Comparing Power of the Most Powerful and Regular Bivariate Normality Tests where SSSS = ( EE yy jj ZZ Γ (p + 1) Γ pp jj =1 EE yy ZZ = Γ (p + 1) Γ pp 1 + ( 1)kk yy kk+ ππ kk! (kk + 1)(kk + ) kk=0 yy jj yy kk ) jj,kk=1 Γ (p + 1) Γ(kk + 3 ) Γ(kk + pp + 1) pp = ; and YY = [yy 1, yy,, yy is the pp standardized data matrix [ 19]... Simulation Study All of the eight tests mentioned in the previous section compared with each other using samples from bivariate normal and non-normal distribution. Simulations were run for bivariate normal distribution with ρρ = 0.5,0, 0.5. Also simulations were run for some non-normal distributions generated using the gg aaaaaa h distribution [0], i.e. generating ZZZZZZ from a bivariate normal distribution and setting XX iiii = exp(ggzz iiii ) 1 gg exp( h ZZ iiii ) For gg = 0 this expression is taken to be XXXXXX = ZZZZZZ exp( hzzzzzz ). As the gg aaaaaa h distribution provides a convenient method for considering a very wide range of situation corresponding to both symmetric and asymmetric distributions, it use is highly recommended. The case gg = h = 0 corresponds to a normal distribution, the case gg = 0 corresponds to a symmetric distribution, and as gg increases the skewness increases as well. For example, with gg = 0.5 and h = 0 the skewness is 1.75, which is great[3]. In this study, simulations were run with gg = 0, 0., 0.5 to span the range of skewness values that seems to occur in practice. These different values of g, gg = 0, 0.,0.5, stand for non-skewed, skewed and highly skewed distributions respectively. The parameter h determines the heaviness of the tail. As h increases, the heaviness increases as well. With h = 0. and gg = 0 the kurtosis equals 36. This might seem extreme [3], so our simulation were run for h = 0, 0.1, 0.. As the power of tests dependent on sample size, so our simulations were run for = 15, 5, 50, 100.The monte carlo study was employed, where 5000 samples were generated for each combination of = 15,5,50,100, gg = 0,0.,0.5 and h = 0,0.1,0.. 3. Results and Discussion Table 1. Monte carlo rejection proportion for the bivariate population (ρ =0, n=15) 0.0, 0.0 0.0018 0.0164 0.0368 0.1344 0.0450 0.0568 0.0.878 0.0470 0., 0.0 0.008 0.054 0.0898 0.74 0.1 0.1470 0.1768 0.1050 0.5, 0.0 0.0806 0.88 0.4036 0.5700 0.568 0.5610 0.566 0.4350 0.0, 0.1 0.074 0.0976 0.1050 0.900 0.156 0.1918 0.3 0.1346 0., 0.1 0.0494 0.14 0.1700 0.3750 0.366 0.798 0.970 0.066 0.5, 0.1 0.171 0.3878 0.468 0.6460 0.5698 0.6138 0.6048 0.5098 0.0, 0. 0.1068 0.31 0.48 0.4550 0.394 0.3798 0.3980 0.980 0., 0. 0.1356 0.81 0.3058 0.504 0.3918 0.445 0.4530 0.347 0.5, 0. 0.66 0.486 0.5478 0.7004 0.6360 0.6766 0.6648 0.5844 Table. Monte carlo rejection proportion for the bivariate population (ρ =0, n=5) 0.0, 0.0 0.006 0.078 0.04 0.1188 0.049 0.0494 0.0616 0.0486 0., 0.0 0.043 0.1384 0.1 0.958 0.158 0.35 0.196 0.1466 0.5, 0.0 0.994 0.636 0.6636 0.771 0.83 0.84 0.84 0.7096 0.0, 0.1 0.1346 0.1994 0.144 0.366 0.396 0.764 0.936 0.1876 0., 0.1 0.004 0.3188 0.63 0.4784 0.3854 0.4106 0.418 0.3196 0.5, 0.1 0.4706 0.7334 0.7 0.874 0.83 0.847 0.8338 0.7638 0.0, 0. 0.366 0.458 0.381 0.599 0.50 0.543 0.571 0.44 0., 0. 0.41 0.5048 0.4694 0.671 0.5956 0.698 0.64 0.598 0.5, 0. 0.606 0.795 0.786 0.8654 0.8634 0.8764 0.870 0.8188

International Journal of Statistics and Applications 014, 4(1): 40-45 43 Table 3. Monte carlo rejection proportion for the bivariate population (ρ =0, n=50) 0.0, 0.0 0.016 0.0448 0.044 0.11 0.0518 0.05 0.057 0.0484 0., 0.0 0.114 0.3454 0.316 0.435 0.4348 0.4396 0.406 0.976 0.5, 0.0 0.65 0.9646 0.934 0.953 0.994 0.9938 0.9918 0.968 0.0, 0.1 0.3616 0.3376 0.96 0.5014 0.4066 0.416 0.491 0.305 0., 0.1 0.4994 0.6056 0.4576 0.6876 0.6636 0.676 0.685 0.5566 0.5, 0.1 0.839 0.977 0.946 0.9654 0.987 0.987 0.9850 0.9704 0.0, 0. 0.7478 0.6376 0.646 0.807 0.783 0.7904 0.8336 0.7164 0., 0. 0.7954 0.778 0.7578 0.8694 0.8696 0.8734 0.8940 0.8114 0.5, 0. 0.931 0.977 0.9706 0.977 0.988 0.988 0.9876 0.981 Table 4. Monte carlo rejection proportion for the bivariate population (ρ =0, n=75) g, h MK MS HZ Msh W R DH SR 0.0, 0.0 0.054 0.048 0.0536 0.108 0.0498 0.050 0.0534 0.0518 0., 0.0 0.1846 0.546 0.3364 0.5577 0.684 0.668 0.406 0.976 0.5, 0.0 0.6500 0.9646 0.934 0.953 0.994 0.9938 0.9918 0.966 0.0, 0.1 0.3616 0.3376 0.96 0.5014 0.4066 0.416 0.491 0.305 0., 0.1 0.4994 0.6056 0.4576 0.6876 0.6636 0.676 0.685 0.5566 0.5, 0.1 0.8390 0.977 0.946 0.9654 0.987 0.987 0.985 0.9704 0.0, 0. 0.7478 0.6376 0.646 0.807 0.783 0.7904 0.8336 0.7164 0., 0. 0.7954 0.778 0.7578 0.8694 0.8696 0.8734 0.8899 0.8114 0.5, 0. 0.931 0.9770 0.9706 0.9770 0.9880 0.988 0.9876 0.981 Table 5. Monte carlo rejection proportion for the bivariate population (ρ =0.5, n=15) 0.0, 0.0 0.0018 0.0164 0.0368 0.1344 0.0450 0.5760 0.0868 0.0470 0., 0.0 0.009 0.0606 0.0978 0.484 0.1306 0.1518 0.174 0.1158 0.5, 0.0 0.1106 0.31 0.4534 0.5988 0.578 0.5674 0.4130 0.3404 0.0, 0.1 0.07 0.0980 0.1100 0.956 0.153 0.190 0.086 0.1350 0., 0.1 0.060 0.1598 0.188 0.3864 0.380 0.776 0.960 0.198 0.5, 0.1 0.018 0.446 0.5038 0.660 0.5800 0.61 0.5940 0.538 0.0, 0. 0.1196 0.410 0.54 0.476 0.3194 0.3774 0.3846 0.3044 0., 0. 0.158 0.864 0.38 0.5310 0.3880 0.4386 0.446 0.366 0.5, 0. 0.996 0.5064 0.5658 0.7080 0.634 0.6638 0.6580 0.5986 The results in Table 1, Table, Table 3 and Table 4 were based on 5000 samples of sizes 15, 5, 50, 75 respectively, from a bivariate population with uncorrelated variables. The result in Table 5, Table 6, Table 7 and Table 8 were based on similar conditions from a bivariate population with ρρ = 0.5 for two considered variables also the results in Table 9, Table 10, Table 11 and Table 1 were for two considered variables with ρρ = 0.5. A nominal significance level of 0.05 wasused. mvnormt est, mvshapirotest, MVN, normwhn.test and "energy Library in R program version.15.3 were used. Under the bivariate distribution with uncorrelated variables and = 15, the simulation results showed that "Msh" test statistic performed better than any of the test statistics compared here for almost all distributions (symmetric, skewed and heavy or light tailed). It followed by "R" and "DH. The findings of this study show that "Msh" test had greater power than others for symmetric, skewed and heavy tailed uncorrelated, bivariate distributions but not for highly skewed when sample size is medium( = 5). For highly skewed uncorrelated bivariate distributions when sample size is medium it seems that "R" is the best and "SW" and "DH" are the second and the third. But for large samples( 50) under similar condition approximately power of "Msh,"R, and "DH" are similar and better than others. The simulation results under bivariate distribution with ρρ = 0.5, 0.5 were closely similar. In general simulation performed for bivariate distributions with different correlations showed similar power trends.

44 Hamed Tabesh et al.: A Monte Carlo Simulation Study for Comparing Power of the Most Powerful and Regular Bivariate Normality Tests Table 6. Monte carlo rejection proportion for the bivariate population (ρ =0.5, n=5) 0.0, 0.0 0.006 0.078 0.04 0.1188 0.0408 0.0516 0.058 0.0486 0., 0.0 0.051 0.1548 0.1476 0.3094 0.186 0.38 0.6 0.1744 0.5, 0.0 0.3568 0.674 0.7188 0.781 0.8184 0.3 0.806 0.755 0.0, 0.1 0.137 0.05 0.144 0.361 0.7 0.64 0.858 0.196 0., 0.1 0. 0.3498 0.874 0.4956 0.3978 0.4176 0.464 0.3408 0.5, 0.1 0.578 0.7516 0.7434 0.831 0.898 0.87 0.86 0.785 0.0, 0. 0.379 0.436 0.3956 0.6108 0.507 0.5378 0.5786 0.455 0., 0. 0.4398 0.594 0.481 0.68 0.5894 0.6178 0.6416 0.5454 0.5, 0. 0.658 0.8044 0.796 0.867 0.8536 0.8566 0.8576 0.834 Table 7. Monte carlo rejection proportion for the bivariate population (ρ =0.5, n=50) 0.0, 0.0 0.016 0.0448 0.044 0.1100 0.0470 0.0466 0.0474 0.0484 0., 0.0 0.134 0.3674 0.614 0.469 0.44 0.46 0.43 0.3336 0.5, 0.0 0.718 0.974 0.9606 0.959 0.996 0.99 0.9908 0.9790 0.0, 0.1 0.3738 0.3518 0.404 0.51 0.406 0.4056 0.4908 0.3198 0., 0.1 0.54 0.668 0.4818 0.7084 0.6718 0.6494 0.698 0.5750 0.5, 0.1 0.873 0.9808 0.9618 0.97 0.987 0.985 0.986 0.9860 0.0, 0. 0.763 0.6604 0.6586 0.86 0.7874 0.781 0.840 0.778 0., 0. 0.848 0.7836 0.771 0.8756 0.8698 0.8584 0.8956 0.886 0.5, 0. 0.951 0.9784 0.976 0.98 0.9886 0.9884 0.9884 0.9854 Table 8. Monte carlo rejection proportionfor the bivariate population (ρ =0.5, n=75) 0.0, 0.0 0.054 0.048 0.0536 0.108 0.048 0.0508 0.0486 0.0518 0., 0.0 0.196 0.578 0.3746 0.5818 0.635 0.615 0.616 0.4910 0.5, 0.0 0.870 0.9978 0.9958 0.989 0.999 0.9996 0.999 0.9984 0.0, 0.1 0.5546 0.4178 0.35 0.6166 0.5456 0.5346 0.6404 0.446 0., 0.1 0.7166 0.7768 0.637 0.834 0.806 0.8048 0.837 0.7394 0.5, 0.1 0.9650 0.9980 0.993 0.9914 0.9984 0.998 0.9986 0.9974 0.0, 0. 0.918 0.756 0.81 0.9136 0.9178 0.9038 0.9466 0.8696 0., 0. 0.939 0.8776 0.904 0.9508 0.958 0.9518 0.967 0.9344 0.5, 0. 0.991 0.9960 0.996 0.9970 0.998 0.9980 0.9986 0.9974 4. Conclusions In health and medical researches, were two variables such as cholesterol level and blood pressure are considered for important diagnoses, the bivariate values may be related in an unknown way[1], so bivariate analysis is considered. An important topic for applying most of the bivariate analysis, BVN assumption, is necessary. So make a decision about BVN considered an important problem. According to number of BVN available test, we have compared them under various bivariate distributions with different correlations and different sample sizes. The results of the simulation studies showed that the Mshapiro test performed better than most of its competitors whether the underlying distribution was normal, or non-normal, skewed or symmetric or heavy tailed, but not for highly skewed also it makes type I error inflated. It means that we could not conclude that Mshapiro is the best overall but we could say Mshapiro is the best under specific conditions. The simulation results also revealed that similar power could be considered for "SW, "R", "DH" whether the underlying distribution was highly skewed and they had greater power than other competitors. Therefore, Mshapiro s application where underlying distribution is not highly skewed recommended, since it is more powerful than any of alternatives compared here for almost all sample sizes and Royston s application recommended where Mshapiro is not the best. ACKNOWLEDGEMENTS This work was financially supported by vice-chancellor for Research Affairs of Ahvaz Jundishapur University of

International Journal of Statistics and Applications 014, 4(1): 40-45 45 Medical Sciences. This study is part of biostatistics MS degree thesis of Zaynab Haghdust. REFERENCES [1] Pocock, S.J., Clinical trials: a practical approach. NY: Wiley, 1983. [] Gnanadesikan, R., Methods for statistical data analysis of multivariate observations. 011: Wiley-Interscience. [3] Tabesh, H., S. Ayatollahi, and M. Towhidi, A simple powerful bivariate test for two sample location problem. Theoretical Biology and Medical Modeling, 010,7:13. [4] Fleiss, J.L., The design and analysis of clinical experiments, 1986, Taylor & Francis. [5] Bogdan, M., Data driven smooth tests for bivariate normality. Journal of Multivariate Analysis, 1999. 68(1): p. 6-53. [6] Farrell, P.J., M. Salibian-Barrera, and K. Naczk, On tests for multivariate normality and associated simulation studies. Journal of Statistical Computation and Simulation, 007. 77(1): p. 1065-1080. [7] Henze, N., Invariant tests for multivariate normality: a critical review. Statistical papers, 00. 43(4): p. 467-506. [8] Mecklin, C.J. and D.J. Mundfrom, A Monte Carlo comparison of the Type I and Type II error rates of tests of multivariate normality. Journal of Statistical Computation and Simulation, 005. 7:()5 p. 93-107. [9] Royston, J., Some techniques for assessing multivarate normality based on the shapiro-wilk W. Applied Statistics, 1983: p. 11-133. [10] Doornik, J.A. and H. Hansen, An Omnibus Test for Univariate and Multivariate Normality*. Oxford Bulletin of Economics and Statistics, 008. 70(s1): p. 97-939. [11] Epps, T.W. and L.B. Pulley, A test for normality based on the empirical characteristic function. Biometrika, 1983. 70(3): p. 73-76. [1] Healy, M., Multivariate normal plotting. Applied Statistics, 1968: p. 157-161. [13] Mardia, K.V., Measures of multivariate skewness and kurtosis with applications. Biometrika, 1970. 57(3): p. 519-530. [14] Henze, N. and B. Zirkler, A class of invariant consistent tests for multivariate normality. Communications in Statistics-Theory and Methods, 1990. 19(10): p. 3595-3617. [15] Henze, N. and T. Wagner, A new approach to the BHEP tests for multivariate normality. Journal of Multivariate Analysis, 1997. 6(1): p. 1-3. [16] Royston, J., An extension of Shapiro and Wilk's W test for normality to large samples. Applied Statistics, 198: p. 115-14. [17] Royston, J., Algorithm AS 181: The W test for normality. Journal of the Royal Statistical Society. Series C (Applied Statistics), 198. 31(): p. 176-180. [18] Villasenor Alva, J.A. and E.G. Estrada, A generalization of Shapiro Wilk's test for multivariate normality. Communications in Statistics-Theory and Methods, 009. 38(11): p. 1870-1883. [19] Székely, G.J. and M.L. Rizzo, A new test for multivariate normality. Journal of Multivariate Analysis, 005. 93(1): p. 58-80. [0] Hoaglin, D.C., Summarizing Shape Numerically: The g and h Distributions. Exploring data tables, trends, and shapes, 1985: p. 461-513. [1] Rawlings, J.O., S.G. Pantula, and D.A. Dickey, Applied regression analysis: a research tool. 1998: Springer.