Insights in Genetics and Genomics


Research Article, Open Access

New Score Tests for Equality of Variances in the Application of DNA Methylation Data Analysis [Version ]

Weiliang Qiu, Xuan Li, Jarrett Morrow, Dawn L DeMeo, Scott T Weiss, Xiaogang Wang and Yuejiao Fu

Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, USA
Department of Mathematics and Statistics, York University, Canada

Corresponding author: Weiliang Qiu, Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, 8 Longwood Avenue, Boston 5, USA. Tel: -67-55- 84; Email: stwxq@channing.harvard.edu

: Co-first author

Copyright: 7 Weiliang Qiu et al. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source.

Original Submission. Received: March 7; Accepted: July 3 7; Published: August 7

How to cite this article: Weiliang Qiu, Xuan Li, Jarrett Morrow, Dawn L DeMeo, Scott T Weiss, Xiaogang Wang, Yuejiao Fu. New Score Tests for Equality of Variances in the Application of DNA Methylation Data Analysis [Version ]. Insights Genet Genomics. (7) : 3.

Abstract

Recently, DNA methylation marks with differential variability have been reported to have biological meanings. Ahn and Wang (3) proposed a joint test for testing if two distributions have the same mean and the same variance for DNA methylation data analysis. Their joint test statistic, which has good performance, is a quadratic form of a vector of two test statistics. The first test statistic aims to test for equality of means, while the second test statistic (denoted as AWvar) aims to test for equality of variances. One advantage of the score tests for logistic regressions is that no assumptions are required for the distributions of the predictors. However, the performance of the AWvar test has not been studied yet.
In this study, we evaluated the performance of the AWvar test and proposed three improved AWvar tests (denoted as AWvar.Levene, AWvar.BF and AWvar.Ltrm) based on the Levene test, the Brown-Forsythe test and the trimmed-mean-based Levene test, respectively. Systematic simulation studies and a real DNA methylation data analysis showed that (1) the AWvar test worked well under the normality assumption; (2) the three improved AWvar tests had much higher testing power than AWvar if the normality assumption is violated; and (3) AWvar.Levene, AWvar.BF and AWvar.Ltrm had slightly better performance than their corresponding counterparts: the Levene test, the Brown-Forsythe test and the trimmed-mean-based Levene test.

Keywords

Levene Test; Brown-Forsythe Test; Logistic Regression; Systematic Simulation; Methylation

OPR Science: Open Access, Open Peer Review

Introduction

Recently, a series of papers about DNA methylation data analysis [-7] proposed to identify DNA methylation marks that have different variances between two groups of subjects (e.g. cases and controls). DNA methylation is a biochemical process that can regulate gene expression without changing the genetic code by adding a methyl group to the cytosine DNA nucleotides. Variable DNA methylation has been associated with cancers and many complex diseases.

Testing for equality of variances is a well-studied topic in statistics. The classical test for the equality of variances is the F-test, which is based on the ratio of two sample variances. Under the null hypothesis that the two distributions are normal distributions with the same variance, the F-test statistic follows the F distribution. The main limitation of the F-test is that it is sensitive to violation of the normality assumption (i.e. the type I error rate can be much higher than the nominal value). This is relevant to the investigation of epigenetics, as high-throughput DNA methylation data might contain outliers (e.g. caused by technical failure) and DNA methylation levels might not be normally distributed [8].

More than 5 robust equal-variance tests have been proposed since the main limitation of the F-test was reported. [9] compared 56 equal-variance testing procedures using simulation studies and found that the Brown-Forsythe test [] (denoted as the BF test) was one of the best tests in terms of keeping the nominal type I error rate while having adequate power. We have recently performed a comparative study of tests for homogeneity of variances with application to DNA methylation data [8]. We found that the trimmed-mean-based Levene test (denoted as the Ltrm test) and the BF test outperformed other tests for most scenarios.

Recently, [] proposed a joint test aiming to test if two distributions have the same mean and the same variance, and showed its good performance by using simulation studies and real data analysis. The joint test statistic is a quadratic form of a vector of two test statistics.
The first test statistic aims to test for equality of means. The second test statistic aims to test for equality of variances. We denote the second test as the AWvar test. [] evaluated the performance of their joint test. One advantage of the score tests of logistic regressions is that no assumptions are required for the distributions of the predictors. Hence we expect the AWvar test would be robust to outliers and to violation of the normality assumption. However, they did not evaluate the performance of the AWvar test.

In this article, we evaluated the performance of the AWvar test and proposed three improved AWvar tests (denoted as AWvar.Levene, AWvar.BF and AWvar.Ltrm) based on Levene, BF and Ltrm, respectively. We did systematic simulation studies and a real DNA methylation data analysis to compare the performances of AWvar, Levene, BF, Ltrm, AWvar.Levene, AWvar.BF and AWvar.Ltrm.

Method

In this section, we first review the definition of the AWvar test and then propose three improved AWvar tests.

AWvar Test

For a given DNA methylation mark (i.e. CpG site), [] defined the following statistic for testing equality of variance between two distributions:

$$U = \sum_{i=1}^{n_0+n_1} z_i \,(y_i - \bar{y}),$$

where $n_1$ is the number of cases, $n_0$ is the number of controls, $y_i$ is a binary variable indicating case-control status (i.e. $y_i = 1$ indicates that the $i$-th subject is a case and $y_i = 0$ indicates a control), $\bar{y} = n_1/(n_0+n_1)$ is the average of the $y_i$ (i.e. $\bar{y}$ is equal to the proportion of cases), and $z_i$ is the squared within-group deviation. That is,

$$z_i = \begin{cases} (x_i - \bar{x}_0)^2 & \text{if subject } i \text{ is a control,} \\ (x_i - \bar{x}_1)^2 & \text{if subject } i \text{ is a case,} \end{cases}$$

where $x_i$ is the DNA methylation level at a given DNA methylation mark for the $i$-th subject, $\bar{x}_0 = \sum_{i=1}^{n_0+n_1} x_i (1 - y_i)/n_0$ (i.e. the sample mean DNA methylation level for controls) and $\bar{x}_1 = \sum_{i=1}^{n_0+n_1} x_i y_i/n_1$ (i.e. the sample mean DNA methylation level for cases).

For the logistic regression model $\mathrm{logit}(p_i) = \beta_0 + \beta_1 z_i$, where $p_i = \Pr(y_i = 1 \mid z_i)$, the score test statistic $T$ is asymptotically chi-squared distributed with 1 degree of freedom under the null hypothesis that $\beta_1 = 0$:

$$T = \frac{U^2}{\mathrm{Var}(U)} \sim \chi^2_1 \quad \text{under } \beta_1 = 0,$$

where

$$\mathrm{Var}(U) = \bar{y}(1 - \bar{y}) \sum_{i=1}^{n_0+n_1} (z_i - \bar{z})^2 \quad \text{and} \quad \bar{z} = \sum_{i=1}^{n_0+n_1} z_i/(n_0+n_1)$$

(i.e. $\bar{z}$ is the sample mean of the $z_i$).

Three Improved AWvar Tests

Since $z_i$ is sensitive to outliers, we borrow the ideas of three Levene-type tests (the Levene test, the Brown-Forsythe test and the trimmed-mean-based Levene test) to propose three improved AWvar tests by implementing robust versions of $z_i$, so that the improved AWvar tests are less sensitive to outliers.

The first improved AWvar test (denoted as AWvar.Levene) is based on the Levene test (c.f. Supplementary Document Section A). The idea is to replace the squared deviation of $x_i$ by the absolute deviation:

$$z_i = \begin{cases} |x_i - \bar{x}_0| & \text{if subject } i \text{ is a control,} \\ |x_i - \bar{x}_1| & \text{if subject } i \text{ is a case.} \end{cases}$$

For the logistic regression $\mathrm{logit}(p_i) = \beta_0 + \beta_1 z_i$, where $p_i = \Pr(y_i = 1 \mid z_i)$, the score test statistic $T$ is asymptotically chi-squared distributed with 1 degree of freedom under the null hypothesis that $\beta_1 = 0$:

$$T = \frac{U^2}{\mathrm{Var}(U)} \sim \chi^2_1,$$

where $U = \sum_{i=1}^{n_0+n_1} z_i (y_i - \bar{y})$, $\mathrm{Var}(U) = \bar{y}(1 - \bar{y}) \sum_{i=1}^{n_0+n_1} (z_i - \bar{z})^2$ and $\bar{z} = \sum_{i=1}^{n_0+n_1} z_i/(n_0+n_1)$.

The second improved AWvar test (denoted as AWvar.BF) is based on the Brown-Forsythe test (c.f. Supplementary Document Section B). The idea is to replace the within-group mean by the within-group median in AWvar.Levene:

$$z_i = \begin{cases} |x_i - \tilde{x}_0| & \text{if subject } i \text{ is a control,} \\ |x_i - \tilde{x}_1| & \text{if subject } i \text{ is a case,} \end{cases}$$

where $\tilde{x}_0$ is the median for controls and $\tilde{x}_1$ is the median for cases. For the logistic regression $\mathrm{logit}(p_i) = \beta_0 + \beta_1 z_i$, where $p_i = \Pr(y_i = 1 \mid z_i)$, the score test statistic $T$ is asymptotically chi-squared distributed with 1 degree of freedom under the null hypothesis that $\beta_1 = 0$:

$$T = \frac{U^2}{\mathrm{Var}(U)} \sim \chi^2_1,$$

where $U = \sum_{i=1}^{n_0+n_1} z_i (y_i - \bar{y})$, $\mathrm{Var}(U) = \bar{y}(1 - \bar{y}) \sum_{i=1}^{n_0+n_1} (z_i - \bar{z})^2$ and $\bar{z} = \sum_{i=1}^{n_0+n_1} z_i/(n_0+n_1)$.

The third improved AWvar test (denoted as AWvar.Ltrm) is based on the trimmed-mean-based Levene test (c.f. Supplementary Document Section C). The idea is to use trimmed within-group means. Denote $\bar{x}_1^{\mathrm{trim}}$ and $\bar{x}_0^{\mathrm{trim}}$ as the 5% trimmed means for cases and controls, respectively. The 5% trimmed mean for a sample is the sample mean after trimming the 5% lowest values and the 5% highest values. Let

$$z_i = \begin{cases} |x_i - \bar{x}_0^{\mathrm{trim}}| & \text{if subject } i \text{ is a control,} \\ |x_i - \bar{x}_1^{\mathrm{trim}}| & \text{if subject } i \text{ is a case.} \end{cases}$$

For the logistic regression $\mathrm{logit}(p_i) = \beta_0 + \beta_1 z_i$, where $p_i = \Pr(y_i = 1 \mid z_i)$, the score test statistic $T$ is asymptotically chi-squared distributed with 1 degree of freedom under the null hypothesis that $\beta_1 = 0$:

$$T = \frac{U^2}{\mathrm{Var}(U)} \sim \chi^2_1,$$

where $U = \sum_{i=1}^{n_0+n_1} z_i (y_i - \bar{y})$, $\mathrm{Var}(U) = \bar{y}(1 - \bar{y}) \sum_{i=1}^{n_0+n_1} (z_i - \bar{z})^2$ and $\bar{z} = \sum_{i=1}^{n_0+n_1} z_i/(n_0+n_1)$.

Simulation Studies

We used the same systematic simulation design as the one used in [8] to compare the 7 equal-variance tests: AWvar, Levene, BF, Ltrm, AWvar.Levene, AWvar.BF and AWvar.Ltrm. Briefly, we performed two sets of simulation studies. The first set of simulation studies (denoted as Simulation I) is based on the simulation studies in [].
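The four score tests defined in the Method section share one computation and differ only in how the within-group deviation $z_i$ is built. The following is a minimal Python sketch of that computation, not the authors' implementation. The function names `deviations` and `score_test`, the `kind` labels and the `trim` argument (the proportion cut from each tail) are hypothetical names introduced here for illustration, and the simulated input at the end is only a toy example.

```python
import numpy as np
from scipy import stats


def deviations(x, y, kind="AWvar", trim=None):
    """Within-group deviation z_i for one CpG site.

    kind: "AWvar" (squared deviation from the group mean), "Levene"
    (absolute deviation from the mean), "BF" (absolute deviation from the
    median) or "Ltrm" (absolute deviation from the trimmed mean).
    trim: proportion cut from EACH tail; only used when kind == "Ltrm".
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=int)
    z = np.empty_like(x)
    for g in (0, 1):  # 0 = controls, 1 = cases
        xg = x[y == g]
        if kind == "AWvar":
            z[y == g] = (xg - xg.mean()) ** 2
        elif kind == "Levene":
            z[y == g] = np.abs(xg - xg.mean())
        elif kind == "BF":
            z[y == g] = np.abs(xg - np.median(xg))
        elif kind == "Ltrm":
            if trim is None:
                raise ValueError("trim is required when kind='Ltrm'")
            z[y == g] = np.abs(xg - stats.trim_mean(xg, trim))
        else:
            raise ValueError(f"unknown kind: {kind}")
    return z


def score_test(x, y, kind="AWvar", trim=None):
    """Return (T, p-value) with T = U^2 / Var(U) referred to chi-squared(1)."""
    y = np.asarray(y, dtype=int)
    z = deviations(x, y, kind=kind, trim=trim)
    ybar = y.mean()  # proportion of cases
    u = np.sum(z * (y - ybar))
    var_u = ybar * (1.0 - ybar) * np.sum((z - z.mean()) ** 2)
    t = u ** 2 / var_u
    return t, stats.chi2.sf(t, df=1)


# Toy usage: controls and cases with equal means but different spreads.
rng = np.random.default_rng(0)
x_demo = np.concatenate([rng.normal(0, 1, 50), rng.normal(0, 2, 50)])
y_demo = np.repeat([0, 1], 50)
for kind in ("AWvar", "Levene", "BF", "Ltrm"):
    t_demo, p_demo = score_test(x_demo, y_demo, kind=kind, trim=0.05)
    print(kind, round(t_demo, 3), round(p_demo, 4))
```

Keeping the deviation step separate from the score statistic mirrors the text: only the robust version of $z_i$ changes between the tests, while $U$, $\mathrm{Var}(U)$ and $T$ stay the same.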
Specifically, for a given CpG site, DNA methylation levels for cases and controls were generated from a mixture of two normal, two Student's t or two chi-squared distributions. DNA methylation levels of all CpG sites were independently generated. The second set of simulation studies (denoted as Simulation II) is based on the simulation studies in [7]. DNA methylation levels were generated from a mixture of Bayesian hierarchical models. Specifically, for a CpG site, given its variance, the DNA methylation levels were generated from normal distributions. The variances themselves are random variables from a scaled inverse chi-squared distribution. Each pair of CpG sites was marginally correlated with each other.

In each simulation scenario, we generated data sets so that we can evaluate the variations of the estimated type I error rates or estimated power. In each simulated data set, CpG sites were generated. We set the case group and control group to have the same number of subjects. Please refer to [8] for details.

The simulation studies evaluated the effects of (1) sample size, (2) the presence of heterogeneity of means, (3) violation of the normality assumption and (4) outliers on the performances of the 7 equal-variance tests. We considered three sample sizes (number of subjects per group = 5 or ) to evaluate the effect of sample size on the performance of the 7 tests of equality of variance. To evaluate the effect of inequality of means, we considered 4 scenarios: distributions have the same mean and same variance; distributions have the same mean but different variances; distributions have different means but the same variance; and distributions have different means and different variances. To evaluate the effect of non-normal distributions, we considered t distributions and chi-squared distributions in addition to normal distributions. When evaluating the effect of outliers, we followed [7] and replaced the DNA methylation level of one randomly picked case subject by the maximum DNA methylation level across all CpG sites and all subjects.

There were 36 pairs of different scenarios in Simulation I and 4 pairs of different scenarios in Simulation II. Within a pair, cases and controls have the same variance in one scenario and have different variances in the other scenario. Hence there were 48 pairs of scenarios in total.

Forty-eight scenarios, in which cases and controls have the same variance of DNA methylation levels, were used to evaluate whether the empirical type I error rates are equal to or less than the nominal value 0.05. Specifically, for each simulated data set, we test for equality of variance for the CpG sites separately. A test is positive if the p-value of the test is < 0.05. The proportion of positive tests among the tests is an estimate of the type I error rate. For the simulated data sets of a given scenario, we will have estimated type I error rates. We then tested the null hypothesis H_0 that the mean type I error rate is 0.05 by using a one-sided one-sample t-test. The number (n_reject) of scenarios that rejected H_0 could be used to evaluate if a test for equality of variance is good or not. The smaller n_reject is, the better the equal-variance test is. The other 48 scenarios, in which cases and controls have different variances of DNA methylation levels, were used to evaluate the power of each of the 7 equal-variance tests.
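The type I error check described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: the function names `rejection_rate` and `type1_error_inflated` are hypothetical, the alternative hypothesis is taken to be "the mean type I error rate is larger than the nominal value" (the text only says the t-test is one-sided), and the significance level `level` used for the t-test itself is an assumption, since the text does not state it.

```python
import numpy as np
from scipy import stats


def rejection_rate(p_values, alpha=0.05):
    """Proportion of CpG-level tests with p-value below alpha in one data set.

    Under equal variances this estimates the type I error rate; under
    unequal variances the same proportion estimates the power.
    """
    return float(np.mean(np.asarray(p_values, dtype=float) < alpha))


def type1_error_inflated(rates, alpha=0.05, level=0.05):
    """One-sided one-sample t-test of H0: mean type I error rate = alpha.

    rates: estimated type I error rates, one per simulated data set.
    Returns True when H0 is rejected in favour of a larger mean rate.
    (Identical rates give an undefined t statistic, which counts as
    "not rejected" here.)
    """
    res = stats.ttest_1samp(rates, popmean=alpha, alternative="greater")
    return bool(res.pvalue < level)
```

Counting the scenarios for which `type1_error_inflated` returns True, separately for each equal-variance test, would give the quantity called n_reject in the text.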
Specifically, for each simulated data set, we test for equality of variance for the CpG sites separately. A test is positive if the p-value of the test is < 0.05. The proportion of positive tests among the tests is an estimate of the power. For each scenario, we then obtained the median of the estimated powers for each equal-variance test. We then ranked the median powers only for the equal-variance tests that did not reject the null hypothesis that the mean type I error rate is 0.05. If the median power of a test is the highest, then the rank of the test is 1. For ties, average ranks were used. If a test rejected the null hypothesis that the mean type I error rate is 0.05, then we set its rank as the missing value "-". We next obtained the median rank (denoted as m) for each of the 7 equal-variance tests across the scenarios. The smaller m is, the better the equal-variance test is. We drew the plot of n_reject versus m to visualize the relative performance of the 7 equal-variance tests.

Real Data Analyses

We used two publicly available DNA methylation data sets (GSE8 [] and GSE37 [6]), which were downloaded from the Gene Expression Omnibus (GEO) (www.ncbi.nlm.nih.gov/geo), to compare the performances of the 7 equal-variance tests. For both data sets, we are interested in detecting CpG sites that are differentially variable between samples with normal histology and samples with cervical intraepithelial neoplasia of grade or higher (CIN+). GSE8 contains 3 normal samples and 8 CIN+ samples, while GSE37 contains 4 normal samples and 4 CIN+ samples. DNA methylation levels in both GSE8 and GSE37 were measured by the IlluminaHumanMethylation7 platform, which measures the methylation levels for 7578 CpG sites. We checked data quality for each of the data sets and excluded CpG sites residing on SNPs or with missing values. After data QC and preprocessing (for details please refer to [8]), there were 859 CpG sites appearing in both cleaned data sets.
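The power-ranking rule used in the simulation studies (rank the median powers among the tests that kept the nominal type I error rate, use average ranks for ties, and record a missing rank otherwise) can be sketched as follows. This is a hedged illustration: the function names `rank_tests` and `median_rank` are hypothetical, missing ranks are represented by NaN instead of "-", and ignoring missing ranks when taking the median across scenarios is an assumption, since the text does not say how they enter m.

```python
import numpy as np
from scipy import stats


def rank_tests(median_power, rejected_h0):
    """Rank the tests within one scenario (rank 1 = highest median power).

    median_power: median estimated power of each test in the scenario.
    rejected_h0: True where the test failed the type I error check in the
    paired equal-variance scenario; those tests get a missing rank (NaN).
    """
    power = np.asarray(median_power, dtype=float)
    keep = ~np.asarray(rejected_h0, dtype=bool)
    ranks = np.full(power.shape, np.nan)
    # Negate so that the largest power receives rank 1; ties share the
    # average of the ranks they span.
    ranks[keep] = stats.rankdata(-power[keep], method="average")
    return ranks


def median_rank(rank_matrix):
    """Median rank m of each test (columns) across scenarios (rows)."""
    return np.nanmedian(np.asarray(rank_matrix, dtype=float), axis=0)
```

With all 7 tests retained and all median powers tied, every test receives rank 4, the average of the ranks 1 to 7.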
We used GSE8 as the discovery set and GSE37 as the validation set. For a given CpG site in a given data set, we applied each of the 7 equal-variance tests to test for equality of variance. For a given equal-variance test, we claimed a CpG site in the analysis of GSE8 as significantly differentially variable (DV) if the false discovery rate (FDR) [3] adjusted p-value for the CpG site is less than 0.05. For a significantly DV CpG site in the analysis of GSE8, if the corresponding unadjusted p-value in the analysis of GSE37 is less than 0.05, then we claimed that the significance in the analysis of GSE8 is validated in the analysis of GSE37.

Results

Results of Simulation Studies

Table 1 summarizes the median ranks of the median estimated powers of the 7 equal-variance tests for the 48 scenario pairs in which cases and controls have different variances of DNA methylation levels. The lower the median rank is, the better a method is. We can see that (1) AWvar worked well (median ranks =) for data generated from normal distributions without outliers, no matter whether cases have the same mean methylation levels as controls or not; (2) however, for data generated from non-normal distributions (t distributions or chi-squared distributions) or for data containing outliers, AWvar did not perform well (median ranks are among the highest); (3) both Levene (n_reject = 34) and AWvar.Levene (n_reject = 37) had very high values of n_reject.
That is, they tend to have inflated type I error rates in most of the 48 scenarios; (4) the other 4 equal-variance tests (BF, Ltrm, AWvar.BF and AWvar.Ltrm) tend to have higher type I error rates for data generated from the chi-squared distribution; (5) compared to BF and Ltrm, AWvar.BF and AWvar.Ltrm had smaller median ranks and the same n_reject; (6) for data that have a large sample size (npergroup=) and were generated from normal distributions, all 7 equal-variance tests tend to have the same median ranks; (7) although AWvar.Levene had the highest value of n_reject, it had the smallest median ranks for scenarios in which AWvar.Levene kept the nominal type I error rate; (8) AWvar.Ltrm had the smallest median rank of the median powers (m=.5); and (9) in terms of keeping the nominal type I error rate, BF, AWvar.BF and AWvar are better than the other 4 equal-variance tests (Levene, Ltrm, AWvar.Levene and AWvar.Ltrm).

Table 1: Rank of the median powers of the simulated data sets for each of the 48 simulation scenario pairs. out=yes means scenarios in which data contain outliers; eqm=yes means scenarios in which the mean methylation level of cases is equal to that of controls; "-" in the cells indicates that the null hypothesis H_0 that the type I error rate of the equal-variance test is 0.05 was rejected based on simulated data sets for the scenario; n_reject = number of scenarios where an equal-variance test rejected H_0; m = median rank of the median powers among the 7 equal-variance tests (for ranks with ties, average ranks were used); c.n and c.chisq indicate conditional normal and chi-squared distributions, respectively.

npergroup out eqm Distr Levene BF Ltrm AWvar AWvar.Levene AWvar.BF AWvar.Ltrm

no no c.chsq - - 3 - - no no chsq - - - - - - - no no c.n - 5 3-4 no no N - 5 3-4 no no t - 5 3-4 no yes chsq - - - - - no yes N - 5 3-4 no yes t - 4 5-3 yes no c.chsq - 4 5-3 yes no chsq - 4 5-3 yes no c.n 6 4 7 5 3 yes no N - 4 5-3 yes no t - 4 5-3 yes yes chsq - 4 5-3 yes yes N - 4 5-3 yes yes t 5 5 5 5 5 5 no no c.chsq - - 3 - - 5 no no chsq - - - - - - - 5 no no c.n 6 4-5 3 5 no no N - 3.5.5 - - 3.5.5 5 no no t - 4 5-3 5 no yes chsq - 3 - - - 5 no yes N - 3.5.5 - - 3.5.5 5 no yes t - 4 5-3 5 yes no c.chsq - 4 5-3 5 yes no chsq - 4 5-3 5 yes no c.n 6 4 7 5 3 5 yes no N -.5.5 5 -.5.5 5 yes no t - 4 5-3 5 yes yes chsq - 4 5-3 5 yes yes N 3.5 3.5 6-3.5 3.5 5 yes yes t 6 3.5 7 5 3.5 no no c.chsq -.5-3 -.5 - no no chsq - - - - - - no no c.n 4 4 4 4 4 4 4 no no N 4 4 4 4 4 4 4 no no t -.5-3 -.5 - no yes chsq - - 3 - - no yes N 4 4 4 4 4 4 4 no yes t 6 3.5-5 3.5 yes no c.chsq - 3.5.5 5-3.5.5 yes no chsq - 3.5.5 5-3.5.5 yes no c.n 3.5 3.5 3.5 7 3.5 3.5 3.5 yes no N 4 4 4 4 4 4 4 yes no t - - - - - - yes yes chsq - 4.5 5-3.5 yes yes N 4 4 4 4 4 4 4 yes yes t 6 5 7 4 3

n_reject 34 4 5 37 4
m 4 5 3.5 3.5.5

Figure 1: Plots of n_reject versus m, where n_reject is the number of scenarios where an equal-variance test rejected the null hypothesis H_0 that the mean type I error rate is 0.05 and m is the median rank of the median powers. The upper-left, upper-right and bottom-left panels are the plots where n_reject and m were obtained based on scenarios with sample size 5 or subjects per group, respectively. The bottom-right panel is the plot where n_reject and m were obtained based on all scenarios.

Figure 1 shows the plots of n_reject versus m for the 7 equal-variance tests for scenarios with npergroup = 5 or, or for all 48 scenario pairs, separately. If an equal-variance test is good, it should have both small n_reject and small m. Hence a good equal-variance test should appear in the bottom-left corner of the plots in Figure 1. Figure 1 showed that (1) AWvar has low n_reject in all 4 plots, indicating that AWvar is good at keeping the nominal type I error rate compared to the other 6 tests; (2) AWvar has the largest m in all 4 plots, indicating that AWvar tends to be less powerful than the other 6 tests; (3) AWvar.Levene, AWvar.BF and AWvar.Ltrm have smaller m than AWvar, indicating that the 3 improved AWvar tests tend to be more powerful than AWvar; (4) compared to AWvar, AWvar.BF has smaller or similar n_reject in all 4 plots, which indicates that AWvar.BF tends to perform better than AWvar in terms of both type I error rate and power; (5) compared to BF and Ltrm, AWvar.BF and AWvar.Ltrm had smaller m and the same n_reject; and (6) AWvar.Levene's n_reject values are the highest among the 7 equal-variance tests in all 4 plots, although its m values are the smallest for scenarios with npergroup= and npergroup=5.

Online Supplementary Figures S to S6 showed the parallel boxplots of the estimated type I error rates and estimated power for each pair of the 96 scenarios that were designed to evaluate the type I error rate and the power of the 7 equal-variance tests.
From these boxplots we observed that (1) AWvar performed best when data were generated from the normal distribution without outliers (Online Supplementary Figures S and S); (2) AWvar performed badly when data were generated from a distribution with outliers and without equal mean (e.g. Online Supplementary Figures S6, S4 and S6); (3) sample size had a large effect on the performance of the 7 equal-variance tests. For example, the power of the 7 tests was less than .5 when npergroup= while power was greater than 0.99 when npergroup= for the scenarios where data were generated from normal distributions without outliers (e.g. Online Supplementary Figure S5); and (4) departure from normality had effects on all 7 tests (e.g. comparing Online Supplementary Figures S, S7 and S).

Results of the Real Data Analysis

For the real data set GSE8, the numbers of DV CpG sites (i.e. CpG sites with FDR-adjusted p-value < 0.05) obtained by the 7 equal-variance tests are (AWvar), 448 (Levene), 3 (BF), 39 (Ltrm), 33 (AWvar.Levene), (AWvar.BF) and 4 (AWvar.Ltrm), respectively. The cross table of overlapping DV CpG sites is shown in Table 2. We can see that the 448 DV CpG sites detected by the Levene test contain the 33 DV CpG sites detected by AWvar.Levene. The 33 DV CpG sites detected by AWvar.Levene in turn contain the 39 DV CpG sites detected by Ltrm. The 39 DV CpG sites detected by Ltrm in turn contain the 4 DV CpG sites detected by AWvar.Ltrm and the 3 DV CpG sites detected by BF. And of the 3 DV CpG sites detected by BF were also detected by AWvar.Ltrm.

Table 2: Cross table of overlapping DV CpG sites among the 7 equal-variance tests in the analysis of GSE8.

Levene BF Ltrm AWvar AWvar.Levene AWvar.BF AWvar.Ltrm
Levene 448 3 39 33 4
BF 3 3 3 3
Ltrm 39 3 39 39 4
AWvar
AWvar.Levene 33 3 39 33 4
AWvar.BF
AWvar.Ltrm 4 4 4 4

Table 3: Number of DV CpG sites for GSE8 and number/proportion of DV CpG sites validated by GSE37. nvalidated/pvalidated is the number/proportion of DV CpG sites that were validated in GSE37.

Test nsig nvalidated pvalidated (%)
Levene 448 76 6.6
BF 3 76.9
Ltrm 39 8 7.8
AWvar NA NA
AWvar.Levene 33 6 68.5
AWvar.BF NA NA
AWvar.Ltrm 4 78.6

The numbers/proportions of validated DV CpG sites are shown in Table 3. We can see that AWvar.Ltrm had the highest validation ratio (78.6%), followed by BF (76.9%), Ltrm (7.8%), AWvar.Levene (68.5%) and Levene (6.6%).

Discussion

In this article, we evaluated the performance of the AWvar score test and proposed three improved AW score tests for equality of variance.
The simulation studies showed that the AWvar test is good at keeping the nominal type I error rate for all 48 pairs of scenarios and had the highest power for the scenarios where data were generated from normal distributions without outliers. For other scenarios, the other 6 tests performed better than the AWvar test in terms of power. Note that the AWvar score test statistic can be rewritten as the difference of the two sample variances [] and that the F test, which is the ratio of two sample variances, is sensitive to non-normality and outliers. Hence we expect that the AWvar test is sensitive to non-normality and outliers too. The Levene, BF and Ltrm tests are robust versions of the F test, which are robust to outliers and departures from normality. In this article, we proposed three improved AWvar tests (AWvar.Levene, AWvar.BF and AWvar.Ltrm) based on Levene, BF and Ltrm, respectively. The three improved AW score tests (AWvar.Levene, AWvar.BF and AWvar.Ltrm) had slightly larger power than, and similar type I error rates to, their counterparts (Levene, BF and Ltrm). For the larger sample size (npergroup=), all 7 equal-variance tests had similar power; however, Levene and AWvar.Levene tend to have high type I error rates.

The results of the real data analysis showed that Levene and AWvar.Levene are much more powerful than the other tests, since Levene and AWvar.Levene detected more than 8 times as many DV CpG sites as the other 5 equal-variance tests. The results also showed that the DV CpG sites detected by Levene and AWvar.Levene contain the DV CpG sites detected by the other 5 equal-variance tests. AWvar.Ltrm had the highest proportion of validated DV CpG sites, and AWvar.Levene had a higher proportion of validated DV CpG sites than Levene. The results also showed that the AWvar test is not as powerful as the other 6 equal-variance tests. The fact that AWvar.BF did not find any DV CpG sites for GSE8 indicates that real data are more complicated than simulated data sets, and more investigations of the 3 improved AWvar tests are warranted.
Supplementary Table S lists the 4 DV CpG sites obtained by AWvar.Ltrim. The test statistics and p-values for both GSE8 and GSE37 are listed. The functional annotation clustering obtained by the DAVID functional annotation tool[14] (Supplementary Table S) showed that the 3 genes corresponding to the 4 DV CpG sites are related to the biological functions of cell membrane, plasma membrane, and cell adhesion. The parallel boxplots of methylation levels of the 4 DV CpG sites versus disease status (CIN+ samples versus normal samples) are shown in Supplementary Figure S7 (for GSE8) and Supplementary Figure S8 (for GSE37). The variabilities of these 4 DV CpG sites are much larger in CIN+ samples than in normal samples, which is consistent with what was observed by [6].

The top DV CpG site cg783 (p-value=6.4 x -7, FDR-adjusted p-value=.53 in the analysis of GSE8) detected by AWvar.Ltrim is near the gene EPB41L3 on chromosome 18. The full name of EPB41L3 is erythrocyte membrane protein band 4.1 like 3, which is a protein-coding gene. According to GeneCards[15], EPB41L3 is a tumor suppressor that inhibits cell proliferation and promotes apoptosis. By searching PubMed using the keywords EPB41L3 and cervical, we found that it is known in the literature that DNA methylation in the EPB41L3 gene is associated with CIN+[16-26]. For example, it has been reported that EPB41L3 is a potential biomarker in cervical cancer and is often silenced by cancer-specific promoter methylation[]. EPB41L3 has also been used to construct classifiers for triaging women, to reduce adverse events and costs associated with unnecessary colposcopy[68-45]. These results of literature searches show that the significant results obtained by AWvar.Ltrim have biological meaning. Therefore, the improved AWvar tests could help identify DNA methylation marks that could potentially uncover the molecular differences between diseased samples and normal samples. It would be interesting future work to find what additional information differentially variable DNA methylation marks could bring.

DNA methylation usually occurs in CpG island regions. Thus, within a CpG island, correlation might exist between CpG sites. We performed an equal-variance test for each CpG site, one at a time. Hence, we ignored the potential correlations among CpG sites. We might improve the testing power by borrowing information from correlated CpG sites. We will investigate how to utilize the correlation information between CpGs in future work. The proposed methods might help in calling peaks that could be associated with DNA methylation regions. The differential variability tests can also be applied to other omics data analyses.

Acknowledgement

This work was sponsored by NIH P HL 385, NIH P HL 5339, and NIH P HL 45.

References

1. Feinberg AP, Irizarry RA. Evolution in health and medicine Sackler colloquium: Stochastic epigenetic variation as a driving force of development, evolutionary adaptation, and disease. Proc Natl Acad Sci U S A. ; 7: 757-764.
2. Feinberg AP, Irizarry RA, Fradin D, Aryee MJ, Murakami P, et al. Personalized epigenomic signatures that are stable over time and covary with body mass index. Sci Transl Med. ; : 49ra67.
3. Issa JP. Epigenetic variation and cellular Darwinism. Nat Genet. ; 43: 74-76.
4. Hansen KD, Timp W, Bravo HC, Sabunciyan S, Langmead B, et al. Increased methylation variation in epigenetic domains across cancer types. Nat Genet. ; 43: 768-775.
5. Jaffe AE, Feinberg AP, Irizarry RA, Leek JT. Significance analysis and statistical dissection of variably methylated regions. Biostatistics. ; 3: 66-78.
6. Teschendorff AE, Widschwendter M. Differential variability improves the identification of cancer risk markers in DNA methylation studies profiling precursor cancer lesions. Bioinformatics. ; 8: 487-494.
7. Phipson B, Oshlack A. DiffVar: a new method for detecting differential variability with application to methylation in cancer and aging. Genome Biol. 4; 5: 465.
8. Li X, Qiu W, Morrow J, DeMeo DL, Weiss ST, et al. A Comparative Study of Tests for Homogeneity of Variances with Application to DNA Methylation Data. PLoS One. 5; : e4595.
9. Conover WJ, Johnson ME, Johnson MM. A Comparative Study of Tests for Homogeneity of Variances with Applications to the Outer Continental Shelf Bidding Data. Technometrics. 98; 3: 35-36.
10. Brown MB, Forsythe AB. Robust tests for equality of variances. Journal of the American Statistical Association. 974; 69: 364-367.
11. Ahn S, Wang T. A powerful statistical method for identifying differentially methylated markers in complex diseases. Pacific Symposium on Biocomputing. 3; 69-79.
12. Teschendorff AE, Jones A, Fiegl H, Sargent A, Zhuang JJ, et al. Epigenetic variability in cells of normal cytology is associated with the risk of future morphological transformation. Genome Medicine. ; 4: 4.
13. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B. 995; 57: 89-3.
14. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 9; 4: 44-57.
15. Rebhan M, Chalifa-Caspi V, Prilusky J, Lancet D. GeneCards: integrating information about genes, proteins and diseases. Trends Genet. 997; 3: 63.
16. Eijsink JJ, Lendvai Á, Deregowski V, Klip HG, Verpooten G, et al. A four-gene methylation marker panel as triage test in high-risk human papillomavirus positive patients. Int J Cancer. ; 3: 86-869.
17. Guerrero-Setas D, Pérez-Janices N, Blanco-Fernandez L, Ojer A, Cambra K, et al. RASSF hypermethylation is present and related to shorter survival in squamous cervical cancer. Mod Pathol. 3; 6: -.
18. Vasiljević N, Scibior-Bentkowska D, Brentnall AR, Cuzick J, Lorincz AT. Credentialing of DNA methylation assays for human genes as diagnostic biomarkers of cervical intraepithelial neoplasia in high-risk HPV positive women. Gynecol Oncol. 4; 3: 79-74.
19. Brentnall AR, Vasiljević N, Scibior-Bentkowska D, Cadman L, Austin J, et al. A DNA methylation classifier of cervical precancer based on human papillomavirus and human genes. Int J Cancer. 4; 35: 45-43.
20. Boers A, Bosgraaf RP, van Leeuwen RW, Schuuring E, Heideman DA, et al. DNA methylation analysis in self-sampled brush material as a triage test in hrHPV-positive women. Br J Cancer. 4; : 95-.
21. Louvanto K, Franco EL, Ramanakumar AV, Vasiljević N, Scibior-Bentkowska D, et al. Methylation of viral and host genes and severity of cervical lesions associated with human papillomavirus type 6. Int J Cancer. 5; 36: E638-645.
22. Huisman C, van der Wijst MG, Falahi F, Overkamp J, Karsten G, et al. Prolonged re-expression of the hypermethylated gene EPB41L3 using artificial transcription factors and epigenetic drugs. Epigenetics. 5; : 384-396.
23. Blanco-Luquin I, Guarch R. Differential role of gene hypermethylation in adenocarcinomas, squamous cell carcinomas and cervical intraepithelial lesions of the uterine cervix. Pathol Int. 5; 65: 476-485.
24. Brentnall AR, Vasiljević N, Scibior-Bentkowska D, Cadman L, Austin J, et al. HPV33 DNA methylation measurement improves cervical pre-cancer risk estimation of an HPV6, HPV8, HPV3 and EPB41L3 methylation classifier. Cancer Biomark. 5; 5: 669-675.
25. Lorincz AT, Brentnall AR, Scibior-Bentkowska D, Reuter C, Banwait R, et al. Validation of a DNA methylation HPV triage classifier in a screening sample. Int J Cancer. 6; 38: 745-75.
26. Boers A, Wang R, van Leeuwen RW, Klip HG, de Bock GH, et al. Discovery of new methylation markers to improve screening for cervical intraepithelial neoplasia grade /3. Clin Epigenetics. 6; 8: 9.
Weiliang Qiu et al. Insights Genet Genomics. (7) : 3.