Optimal probability weights for estimating causal effects of time-varying treatments with marginal structural Cox models

Michele Santacatterina, Celia García-Pareja, Rino Bellocco, Anders Sönnerborg, Anna Mia Ekström and Matteo Bottai

Karolinska Institutet, Stockholm, Sweden 17177

The authors are grateful to the KID grant program at Karolinska Institutet and an ALF grant, Stockholm, Sweden, for the provided support. The authors also thank Dr. Erica E.M. Moodie and Dr. Xiao Yongling for their help on the simulation setup.

Abstract

Marginal structural Cox models have been used to estimate the causal effect of a time-varying treatment on a survival outcome in the presence of time-dependent confounders. These methods rely on the positivity assumption, which states that the propensity scores are bounded away from zero and one. Practical violations of this assumption are common in longitudinal studies, resulting in extreme weights that may yield erroneous inferences. Truncation, which consists of replacing outlying weights with less extreme ones, is the most common approach to control for extreme weights to date. While truncation reduces the variability in the weights and the consequent sampling variability of the estimator, it can also introduce bias. Instead of truncated weights, we propose using optimal probability weights, defined as those that have a specified variance and the smallest Euclidean distance from the original, untruncated weights. The set of optimal weights is obtained by solving a constrained quadratic optimization problem. The proposed weights are evaluated in a simulation study and applied to the assessment of the effect of treatment on time to death among people in Sweden who live with human immunodeficiency virus and inject drugs.

Keywords: Causal inference, longitudinal data, positivity assumption, probability weights, survival analysis.

1 Introduction

Marginal structural Cox models (MSCM) (Robins et al., 2000; Hernán et al., 2000) have been used to estimate the causal effect of a time-varying treatment on a survival outcome with observational data. The increasing popularity of MSCM derives from their ability to handle time-dependent confounders, which are confounders that are affected by previous treatments and affect future ones (Daniel et al., 2013). For example, the HIV-Causal Collaboration (HIV-Causal Collaboration, 2011) used MSCM to evaluate the optimal timing of human immunodeficiency virus (HIV) treatment initiation on time to death, where CD4 cell count was a predictor of both treatment initiation and survival, as well as being itself influenced by prior treatment. Standard procedures, such as regression adjustment or matching, fail to control for time-dependent confounding, thus introducing post-treatment bias (Blackwell, 2013; Robins, 2000).

MSCM are estimated via inverse probability of treatment weighting (IPTW) (Hernán and Robins, 2010), which controls for time-dependent confounding by creating a hypothetical population where time-dependent and time-invariant confounders are balanced over time (Cole and Hernán, 2008). These weights are constructed as the inverse of the product of the probabilities of being assigned to the treatment conditional on covariates and treatment history, i.e. the propensity scores (Rosenbaum and Rubin, 1983), estimated separately at each time point (Cole and Hernán, 2008).

Despite their theoretical appeal and their wide range of applications, IPTW-based methods are sensitive to violations of the positivity assumption, also referred to as the experimental treatment assignment assumption (Imbens and Rubin, 2015). This states that the propensity score of each unit under study is bounded away from zero and one. Positivity is practically violated when subjects in specific strata of the population under study have a low probability of receiving the treatment, leading to extreme weights, erroneous
inferences, and low precision (Robins et al., 1995; Scharfstein et al., 1999; Robins et al., 2007; Kang and Schafer, 2007).

Several methods have been developed to alleviate the problems caused by extreme weights when considering one single time point (Santacatterina and Bottai, 2017; Zubizarreta, 2015; Hainmueller, 2012; Athey et al., 2016). With longitudinal data, truncation, which consists of replacing outlying weights with less extreme ones, remains the most popular solution to this problem (Cole et al., 2005). However, while truncation reduces the variability of the weights, thus increasing inferential precision, it can also introduce considerable bias. Ad-hoc and empirical criteria have been proposed to choose the truncation threshold. Under the assumption that the MSCM estimates are unbiased, Cole et al. (2005) suggested choosing the truncation level by progressively truncating the weights until a trade-off between bias and variance is found. Xiao et al. (2013) compared different truncation levels for MSCM, and proposed a data-adaptive approach to select the level of truncation that minimizes the mean squared error. The authors showed an improvement in the MSCM estimates when truncating the weights at high percentiles of their distribution.

Methods other than truncation have been proposed, including history-restricted MSCM (Neugebauer et al., 2007), where information on a restricted portion of the treatment history is used to estimate the causal effects; trimming (Stürmer et al., 2010), where observations that violate the positivity assumption are excluded; and G-computation (Robins, 2000), a non-IPTW-based method.

The purpose of this paper is to introduce optimal probability weights (OPW) (Santacatterina and Bottai, 2017) to the estimation of the causal effect of a time-varying treatment with longitudinal data when the positivity assumption is practically violated. OPW are the solution to a constrained quadratic optimization problem, which finds the closest set of weights to the original, untruncated weights while controlling the precision of the resulting
weighted estimator. Unlike Santacatterina and Bottai (2017), this paper focuses on repeated observations. In addition, the constraint is placed on the variance of the weights instead of the variance of the weighted estimator. This formulation of the optimization problem is novel and has two main advantages: (1) it is quadratic and convex and therefore admits a unique solution; and (2) it is independent of both the chosen estimator for the causal parameter of interest and that for its standard error.

The following section briefly reviews MSCM. Section 3 introduces the quadratic problem used to obtain the set of optimal probability weights, describes their properties, and discusses the choice of the parameter that controls precision. Section 4 shows the results of a simulation study. Section 5 presents an application of the optimal probability weights to the evaluation of the effect of HIV treatment initiation on time to death among people in Sweden who inject drugs. Final conclusions are given in Section 6.

2 Marginal structural Cox models

We consider a longitudinal study where n units are observed at regular time intervals k = 1, ..., K (e.g. every 3 months). For each unit i = 1, ..., n, we denote by $T_i$ the observed follow-up time, and by $V_i$ the vector of baseline covariates. For each unit i at time t, we denote by $A_i(t)$ the binary time-varying treatment variable, where $A_i(t) = 0$ means not being treated at time t and $A_i(t) = 1$ means being treated at time t, and by $X_i(t)$ the time-dependent covariates. We assume that the treatment $A_i(t)$ and the covariates $X_i(t)$ do not change between two time intervals (k, k + 1). We denote by $\bar{A}_i(t)$ the treatment history up to time t and by $\bar{X}_i(t)$ the covariate history up to time t, i.e. the history of the time-dependent confounders. We define $Y_i(t)$ as the event indicator at time t, which equals 1 if the subject had
the event at time t, and 0 otherwise. Finally, we denote by $T_{\bar{a}(t)}$ the counterfactual failure time, had the subject followed the treatment history $\bar{a}(t) = \{a(t);\ 0 \le t < \infty\}$. For each $\bar{a}(t)$, we define the MSCM as follows,

$$\lambda_{T_{\bar{a}(t)}}(t \mid V_i) = \lambda_0(t)\exp\big(\beta_1 \gamma(\bar{a}(t)) + \beta_2 V_i\big) \tag{2.1}$$

where $\lambda_{T_{\bar{a}(t)}}(t \mid V_i)$ is the hazard at time t given baseline covariates $V_i$ had, contrary to fact, the subject followed the treatment history $\bar{a}(t)$, $\lambda_0(t)$ is the baseline hazard at time t for a never-treated subject, $\bar{a}(t) = \bar{0}(t)$, with $V_i = 0$, $\gamma(\cdot)$ is a known function of the treatment history, and $\beta_1$ is the causal parameter of interest. Under the assumptions of positivity, consistency, no unmeasured confounders, and correct specification of the models, the causal parameter $\beta_1$ can be consistently estimated using IPTW (Hernán et al., 2000; Cole and Hernán, 2008). The stabilized version of the inverse probability of treatment weights can be obtained as follows (Hernán et al., 2000)

$$w_i(t) = \prod_{k=1}^{m(t)} \frac{\Pr\big(A_i(k) = a_i(k) \mid \bar{A}_i(k-1) = \bar{a}_i(k-1),\ V_i = v_i\big)}{\Pr\big(A_i(k) = a_i(k) \mid \bar{A}_i(k-1) = \bar{a}_i(k-1),\ \bar{X}_i(k) = \bar{x}_i(k),\ V_i = v_i\big)} \tag{2.2}$$

where m(t) is the number of visits up to time t. When informative censoring is present, under all the aforementioned assumptions, and with the additional assumption of no unmeasured informative censoring, the causal parameter $\beta_1$ can be consistently estimated using weights obtained as the product of inverse probability of treatment weights and inverse probability of censoring weights (Hernán et al., 2001). The set of inverse probability of censoring weights is computed similarly to that of equation (2.2). Parametric models, such as logistic regression, are commonly used to estimate $w_i(t)$, along with machine learning methods, such as support vector machines and classification and regression trees (Karim et al., 2017). Throughout this paper, we refer to $\hat{w}_i(t)$, the estimated weights used to control for
time-dependent confounding, as the set of target weights.

3 Optimal probability weights

When the positivity assumption is practically violated, the estimated set of target weights $\hat{w}(t)$ may contain outliers, which may yield low precision and erroneous inferences on the causal parameter $\beta_1$. As suggested by Santacatterina and Bottai (2017), rather than truncating, we propose to obtain weights $\hat{w}_o(t)$ that are the closest to $\hat{w}(t)$ with respect to the Euclidean norm, while constraining the variance of the weights $\hat{w}_o(t)$ to be less than or equal to a specified level ξ. The resulting quadratic optimization problem can be formulated as follows.

$$\underset{w_o(t) \in \mathbb{R}^{n_t}}{\text{minimize}} \quad \lVert w_o(t) - \hat{w}(t) \rVert_2 \tag{3.1}$$

$$\text{subject to} \quad \lVert w_o(t) - \bar{w}_o(t) \rVert_2^2 \le \xi \tag{3.2}$$

$$w_o(t) \ge 0 \tag{3.3}$$

where $\bar{w}_o(t)$ is the mean of the weights $w_o(t)$. Constraint (3.2) controls the variance of the weights, and therefore the precision of the resulting weighted estimator. Constraint (3.3) ensures that the weights are non-negative. We refer to $\hat{w}_o(t)$, the solution to problem (3.1)-(3.3), as the set of optimal probability weights (OPW).

Santacatterina and Bottai (2017) showed that the weighted estimator that uses optimal weights $\hat{w}_o(t)$ is consistent. They also showed that if the weighted estimator that uses target weights $\hat{w}(t)$ is unbiased, minimizing the distance between $\hat{w}_o(t)$ and $\hat{w}(t)$ is equivalent to minimizing the bias of the weighted estimator that uses $\hat{w}_o(t)$. They concluded that high precision could be reached with a low increase in bias in all the scenarios considered in their
simulations. Finally, the objective function and the constraint in the proposed quadratic problem (3.1)-(3.3) are convex, therefore admitting a unique solution.

3.1 On the choice of ξ

The solution to the quadratic problem (3.1)-(3.3) depends on the constant ξ, which controls the variance of the weights and consequently the precision of the estimates. We suggest choosing ξ according to the aims of the study. The following are some practical guidelines.

1. Variance of weights obtained by truncation. Xiao et al. (2013) suggested that truncation at high percentiles, such as the 99th or the 99.5th percentile of the distribution of the target weights, improves the IPTW estimators. Therefore, one can truncate the target weights at high percentiles, compute their variance, and set ξ equal to the variance of the obtained truncated weights. In Section 4.2, we show how the MSCM that uses OPW performs better, in terms of mean squared error, than that using truncated weights, especially when the weights are truncated at high percentiles.

2. Evaluation of the Lagrange multiplier. Constraint (3.2) in the quadratic problem (3.1)-(3.3) has an associated Lagrange multiplier, $\lambda_L$, which can provide insight on the relationship between the optimal solution and the constraint. Specifically, small values of $\lambda_L$ suggest that a small decrease in ξ would lead to a small increase in the optimal value of the objective function (3.1). Large values of $\lambda_L$ suggest that a small decrease in ξ would lead to a large increase in the optimal value of the objective function. Consequently, $\lambda_L$ may be used to select the level of precision ξ. In Section 4.2 we show how $\lambda_L$ reflects the behavior of the bias across different levels of precision.

3. Bias-variance trade-off. Cole and Hernán (2008) suggested using truncation as a means to trade off bias and variance. If the untruncated IPTW estimate, weighted by the set of target weights $\hat{w}(t)$, is unbiased for the causal parameter of interest, minimizing the objective function in (3.1) leads to minimizing the bias of the IPTW estimator that uses the set of optimal weights, while controlling the precision of the resulting IPTW estimator. A grid of values for ξ may be used to evaluate the bias-variance trade-off. As in Cole and Hernán (2008), an acceptable value for ξ may be selected after inspecting the values of the estimated weighted parameter and its estimated standard error against the grid of levels of ξ.

4. Pre-specified level of precision. Similarly to sample size and power calculations, the level of ξ may be set to match a pre-specified, desired precision of the resulting MSCM estimates.

5. Variance of the weights obtained with simplified weights models. Deep classification trees and logistic regression models with many covariates and higher-order interactions can estimate the set of target weights. This yields nearly unbiased but highly variable estimates of the causal parameters. Simplifying these models by considering, for example, a logistic regression model with only the main effects or a less deep tree, may increase the precision. The value for ξ may be set equal to the variance of the weights obtained with the simplified model.
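As a rough illustration of guideline 1, the following Python sketch sets ξ from upper-truncated weights and then computes weights that satisfy (3.1)-(3.3). It is not the authors' implementation, which relied on a general-purpose solver. The sketch assumes that ξ is expressed on the scale of the left-hand side of constraint (3.2), that is, as a sum of squared deviations from the mean, and that all target weights are non-negative. Under those assumptions the minimizer keeps the mean of the target weights and shrinks their deviations from it by a common factor, so constraint (3.3) holds automatically. Truncating only the upper tail is also an assumption, and the function names and example weights are hypothetical.

```python
import math


def sum_sq_dev(weights):
    """Sum of squared deviations from the mean (left-hand side of (3.2))."""
    mean = sum(weights) / len(weights)
    return sum((w - mean) ** 2 for w in weights)


def upper_truncate(weights, q):
    """Cap weights at their q-th empirical quantile (nearest-rank rule).

    Assumption of this sketch: only the upper tail is truncated.
    """
    rank = max(1, math.ceil(q * len(weights)))
    cutoff = sorted(weights)[rank - 1]
    return [min(w, cutoff) for w in weights]


def optimal_weights(target, xi):
    """Weights closest to `target` in Euclidean norm with sum_sq_dev <= xi.

    Assumes non-negative target weights, in which case shrinking toward
    the mean also satisfies the non-negativity constraint (3.3).
    """
    if xi < 0:
        raise ValueError("xi must be non-negative")
    if min(target) < 0:
        raise ValueError("target weights must be non-negative")
    mean = sum(target) / len(target)
    spread = sum_sq_dev(target)
    if spread <= xi:
        return list(target)  # constraint (3.2) is not active
    shrink = math.sqrt(xi / spread)  # common shrinkage factor in [0, 1)
    return [mean + shrink * (w - mean) for w in target]


# Hypothetical target weights with one extreme value.
target = [0.5, 0.8, 1.0, 1.2, 6.5]
xi = sum_sq_dev(upper_truncate(target, 0.8))  # guideline 1
opw = optimal_weights(target, xi)
print([round(w, 3) for w in opw])
```

Because the mean of the weights is preserved, the extreme weight is pulled in while the small weights are pulled up, which differs from truncation, where only the outlying weights change.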

4 Simulations

In this section, we present the setup and results of a simulation study designed to compare OPW, the solution to (3.1)-(3.3), and weights truncated at different levels with respect to mean squared error (MSE), bias, and standard error of the MSCM estimator. The study is aimed at mimicking data from a longitudinal study of a hypothetical cohort of HIV-positive patients (Xiao et al., 2013), similar to that discussed in Section 5.

4.1 Setup

We randomly generated 1,000 samples, each of which comprised 200 or 1,000 observations, using a maximum follow-up time of K = 10 biyearly visits. For each interval k = 1, ..., K, we generated the expected survival time $t_i$ by using the quantile function of an exponential distribution with the interval-specific hazard rate computed from the following model

$$\lambda_{i,k}(t_i \mid A_i(k), X_i(k)) = \lambda_0(t_i)\exp\big(\theta_1 A_i(k) + \theta_2 X_i(k)\big) \tag{4.1}$$

with $\lambda_0(t_i) = 0.12$, $\theta_1 = \log(0.5)$, $\theta_2 = 0.0016$, $A_i(k) \sim \text{Binomial}(\pi)$, and

$$\pi = \Big(1 + \exp\big(3.623 - 2.605\,I[X_i(k) > 500] - 0.022\,(X_i(k) - 200) + 0.009\,(X_i(k) - 200)\,I[X_i(k) > 500] + 0.405\,A_i(k-1)\big)\Big)^{-1}$$

for k = 1, ..., K, with $A_i(0) = 0$. For k = 2, ..., K, the covariate $X_i(k)$ was generated as $X_i(k-1) + 70A_i(k-1)$ plus a Uniform(-80, 5) term plus $\varepsilon$, with $\varepsilon \sim \text{Normal}(0, 3)$, and $X_i(1) = V_i \sim \text{Lognormal}(6, 1)$. We defined the observed follow-up time as $\min(T_i, C_i, 5)$, where $T_i = 0.5(k - 1) + t_i(k)$ for $1 \le k < K$ and $T_i = 5$ for $k \ge K$, and $C_i \sim \text{Uniform}(0, 40)$. The true causal parameter of interest, the hazard ratio (HR), was set to HR = 0.5. A detailed explanation of the data generating process is provided by Xiao et al. (2010).

We also considered two additional scenarios in which the practical positivity assumption was weakly and strongly violated. Specifically, under the weak violation scenario we considered
$$\pi = \Big(1 + \exp\big(1.623 - 0.605\,I[X_i(k) > 500] - 0.0015\,(X_i(k) - 200) + 0.405\,A_i(k-1)\big)\Big)^{-1},$$

which provided almost uniformly distributed weights, while under the strong violation scenario we considered

$$\pi = \Big(1 + \exp\big(4.623 - 2.605\,I[X_i(k) > 500] - 0.02\,(X_i(k) - 200) + 0.009\,(X_i(k) - 200)\,I[X_i(k) > 500] + 0.405\,A_i(k-1)\big)\Big)^{-1},$$

which provided more extreme weights than the original setting. In particular, for n = 1,000, the mean of the weights across simulations ranged between 0.9659 and 1.0660 under the weak violation scenario, between 0.6203 and 18.3 under the scenario provided by Xiao et al. (2010), and between 0.3390 and 51.77 under the strong violation scenario.

We considered the set of stabilized inverse probability weights as the target weights of interest. Truncated weights were obtained by truncating the set of target weights across different quantiles defined as a grid of twenty equally-spaced values between 0.8 and 1. OPW were obtained by solving (3.1)-(3.3) with ξ equal to the variance of the truncated weights for each of the different levels of truncation. In each simulated sample, we estimated the causal parameter of interest by using the following Cox regression model

$$\lambda_{i,k}(t \mid A_i(k), V_i) = \lambda_0(t)\exp\big(\beta_1 A_i(k) + \beta_2 A_i(k-1) + \beta_3 V_i\big) \tag{4.2}$$

weighted by the truncated weights and by the set of OPW. We used a robust estimator of the standard error (Austin, 2016). We estimated the stabilized inverse probability of treatment weights using the R package ipw (van der Wal et al., 2011), and we solved the quadratic problem (3.1)-(3.3) by using the package ipoptr (Wächter and Biegler, 2005) and the MA57 sparse symmetric system as line-search method (HSL, 2017). We provide the R code for the simulations as Supporting Information.
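To make the construction of the target weights concrete, the following Python sketch computes stabilized weights for one subject as the running product of probability ratios in equation (2.2). It is only an illustration and not the ipw package used in the simulations. It assumes the per-visit probabilities of being treated have already been estimated by a numerator model (treatment history and baseline covariates) and a denominator model (which also includes the time-dependent covariates); the function names and the example values are hypothetical.

```python
from itertools import accumulate
from operator import mul


def prob_of_observed(p_treated, a):
    """Probability of the treatment actually received at one visit."""
    return p_treated if a == 1 else 1.0 - p_treated


def stabilized_weights(a, p_num, p_den):
    """Running product of numerator/denominator ratios, one weight per visit.

    a     : observed treatment (0 or 1) at each visit
    p_num : P(treated) from the numerator model at each visit
    p_den : P(treated) from the denominator model at each visit
    """
    ratios = []
    for a_k, pn, pd in zip(a, p_num, p_den):
        den = prob_of_observed(pd, a_k)
        if den <= 0.0:
            # A zero denominator is a structural positivity violation.
            raise ValueError("denominator probability must be positive")
        ratios.append(prob_of_observed(pn, a_k) / den)
    return list(accumulate(ratios, mul))


# Hypothetical subject: untreated at visits 1-2, treated at visit 3,
# where the denominator model gave treatment a probability of only 0.05.
a = [0, 0, 1]
p_num = [0.2, 0.2, 0.3]
p_den = [0.1, 0.4, 0.05]
w = stabilized_weights(a, p_num, p_den)
print([round(x, 3) for x in w])
```

The small denominator probability at the last visit inflates the weight from that visit onward, which is the mechanism by which practical positivity violations produce extreme weights.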

4.2 Results

The top-left panels of Figure 1 and Figure 2 show the MSE ratio between the hazard ratio estimated with truncated weights and that estimated with OPW across truncation levels when n = 200 and n = 1,000, respectively. The optimally weighted MSCM performed better than the truncated MSCM at all truncation levels, especially between the 98th and the 99.5th percentile of the distribution of the target weights. In particular, the value between the 98th and the 99.5th percentile for which the MSE ratio is largest is equal to the 99.5th percentile when n = 200 and equal to the 99th percentile when n = 1,000. When truncating at lower percentiles, optimally weighted and truncated MSCM performed equally in small samples (n = 200), but not in larger samples (n = 1,000), where the optimally weighted MSCM showed a substantially smaller MSE. At the lowest truncation levels and with the smaller sample size, the truncated weights and the OPW were both almost uniformly distributed, resulting in a similar MSE. In the larger samples, the bias of the truncated MSCM increased with increasing levels of truncation while that of the optimally weighted MSCM remained almost constant.

The top-right panels of Figure 1 and Figure 2 show the MSE (solid line), variance (dotted), and bias (dashed) of the estimated hazard ratio that uses OPW across truncation levels. Setting the constant ξ based on high-percentile truncation weights improves the behaviour of the MSCM by introducing small bias but considerably increasing precision. The mean solving time of the algorithm was below 0.22 seconds in the smaller samples and below 1.0 second in the larger samples (bottom-left panels of Figure 1 and Figure 2). The standardized mean Lagrange multiplier associated with constraint (3.2) partially reflected the behaviour of the bias (bottom-right panels of Figure 1 and Figure 2), and it may be used to choose ξ as discussed in Section 3.1.
Figure 3 shows scatter-plots and histograms of the mean truncated weights (X-axis) and the mean OPW (Y-axis) across simulations when n = 1,000 for each of the four
thresholds, 1, 0.99, 0.85 and 0.80. Weights were first scaled to have mean 0 and variance 1 and then log-transformed. In particular, the top-left panel of Figure 3 shows the original untruncated distribution of the weights, which was asymmetric with a long right tail. For the remaining thresholds, 0.99, 0.85 and 0.80, truncated weights showed a wider distribution compared with OPW. For instance, when the threshold was set equal to 0.80, the set of OPW ranged between -0.0188 and 0.0504, while that of the truncated weights ranged between -0.1705 and 0.2767.

The left panels of Figure 4 show the MSE ratio between the hazard ratio estimated with truncated weights and that estimated with OPW across truncation levels when n = 1,000 under the weak and strong violation scenarios. Under weak violation of the positivity assumption we observed no differences between the truncated MSCM and the optimally weighted MSCM. Under the strong violation scenario, however, the optimally weighted MSCM showed a consistently smaller MSE across truncation levels, and a greater precision compared with the scenario presented in Figure 2, i.e., the MSE ratios were larger under the strong scenario. The right panels of Figure 4 show the MSE (solid line), variance (dotted), and bias (dashed) of the estimated hazard ratio that uses OPW across truncation levels. Under weak violation, no differences were seen across truncation levels, while under strong violation, similarly to the scenario presented in Figure 2, the constant ξ based on high-percentile truncation weights improved the behaviour of the MSCM by introducing small bias but considerably increasing precision.

We conclude that OPW were more narrowly distributed than truncated weights across all considered thresholds, thus leading to more precise inferences, and that OPW outperformed truncated weights across both sample sizes and different scenarios of practical positivity violation, especially under strong violations of the practical positivity assumption.
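The quantities reported in this section can be computed from the per-sample hazard ratio estimates as in the following Python sketch. It assumes the MSE ratio is the MSE under truncated weights divided by the MSE under OPW, so that values above 1 favour OPW, and it uses the population variance so that MSE equals squared bias plus variance exactly. The estimates below are made-up numbers for illustration only, not results from the simulation study, and the function name is hypothetical.

```python
from statistics import fmean, pvariance


def bias_var_mse(estimates, truth):
    """Bias, variance, and MSE of a list of estimates of a known truth."""
    bias = fmean(estimates) - truth
    variance = pvariance(estimates)  # population variance across samples
    mse = fmean([(e - truth) ** 2 for e in estimates])
    return bias, variance, mse


TRUE_HR = 0.5  # true hazard ratio used in the simulation setup

# Hypothetical hazard ratio estimates from five simulated samples.
hr_truncated = [0.42, 0.61, 0.55, 0.38, 0.66]
hr_opw = [0.47, 0.56, 0.52, 0.45, 0.58]

bias_t, var_t, mse_t = bias_var_mse(hr_truncated, TRUE_HR)
bias_o, var_o, mse_o = bias_var_mse(hr_opw, TRUE_HR)

mse_ratio = mse_t / mse_o  # above 1 means OPW had the smaller MSE
print(round(mse_ratio, 2))
```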

[Figure 1: four panels plotted against truncation level (0.80 to 1.00): MSE ratios; bias, variance, and MSE of the OPW hazard ratio; mean computational time (sec); Lagrange multiplier.]

Figure 1: Sample size n = 200. Top-left panel: ratio between the observed mean squared error of the estimated hazard ratio that uses truncated weights and that of the estimated hazard ratio that uses OPW across truncation levels. Top-right panel: mean squared error (solid line), variance (dotted), and bias (dashed) of the estimated hazard ratio that uses OPW across truncation levels. The value between the 98th and the 99.5th percentile for which the MSE ratio is largest is the 99.5th percentile. Bottom-left panel: mean computational time in seconds to solve the quadratic problem across levels of truncation. Bottom-right panel: mean standardized Lagrange multiplier associated with constraint (3.2) across truncation levels.

[Figure 2: four panels plotted against truncation level (0.80 to 1.00): MSE ratios; bias, variance, and MSE of the OPW hazard ratio; mean computational time (sec); Lagrange multiplier.]

Figure 2: Sample size n = 1,000. Top-left panel: ratio between the observed mean squared error of the estimated hazard ratio that uses truncated weights and that of the estimated hazard ratio that uses OPW across truncation levels. Top-right panel: mean squared error (solid line), variance (dotted), and bias (dashed) of the estimated hazard ratio that uses OPW across truncation levels. The value between the 98th and the 99.5th percentile for which the MSE ratio is largest is the 99th percentile. Bottom-left panel: mean computational time in seconds to solve the quadratic problem across levels of truncation. Bottom-right panel: mean standardized Lagrange multiplier associated with constraint (3.2) across truncation levels.

[Figure 3: four panels titled "Truncation level: 1", "Truncation level: 0.99", "Truncation level: 0.85" and "Truncation level: 0.80"; X-axis: Truncated, Y-axis: OPW.]

Figure 3: Scatter-plots and histograms of the mean truncated weights (X-axis) and the mean OPW (Y-axis) across simulations when n = 1,000 for each of the four thresholds, 1, 0.99, 0.85 and 0.80. Weights were first scaled to have mean 0 and variance 1 and then log-transformed.

[Figure 4: two rows of panels (Weak violation, Strong violation), each plotted against truncation level (0.80 to 1.00). Panel titles in each row: MSE ratios; Bias, Var, MSE, OPW HR.]

Figure 4: Left panels: MSE ratios between the hazard ratio estimated with truncated weights and that estimated with OPW across truncation levels when n = 1,000 under the scenarios of weak and strong violations of the positivity assumption. Right panels: mean squared error (solid line), variance (dotted), and bias (dashed) of the estimated hazard ratio that uses OPW across truncation levels under the scenarios of weak and strong violations of the positivity assumption.
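The quantities summarized in Figures 1, 2 and 4 (bias, variance, mean squared error, and the MSE ratio between the two weighting schemes) can be computed from a set of simulated estimates as in the following minimal sketch. The function names and the numerical values are hypothetical and serve only as illustration; they are not the simulation results, and the scale of the estimates (hazard ratio or log hazard ratio) is left to the user.

```python
import statistics


def summarize(estimates, truth):
    """Bias, variance and MSE of simulated estimates around a known true value."""
    bias = statistics.fmean(estimates) - truth
    variance = statistics.pvariance(estimates)  # population variance across replicates
    mse = statistics.fmean([(e - truth) ** 2 for e in estimates])
    return {"bias": bias, "variance": variance, "mse": mse}


def mse_ratio(truncated_estimates, opw_estimates, truth):
    """MSE with truncated weights divided by MSE with OPW.

    A ratio above 1 means that the OPW-based estimator has the smaller MSE.
    """
    return (summarize(truncated_estimates, truth)["mse"]
            / summarize(opw_estimates, truth)["mse"])


# Hypothetical estimates from four simulated data sets (illustrative only).
true_value = 0.5
est_truncated = [0.40, 0.70, 0.55, 0.35]
est_opw = [0.45, 0.60, 0.52, 0.43]

opw_summary = summarize(est_opw, true_value)
ratio = mse_ratio(est_truncated, est_opw, true_value)
print(opw_summary, ratio)
```

With the population variance used here, the MSE equals the squared bias plus the variance, which is why the three curves in the right-hand panels can be read jointly.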

5 HIV treatment initiation on time to death

The HIV epidemic is a leading global burden with major economic and social consequences. Drug injection is responsible for more than 10% of all HIV infections globally (Mathers et al., 2008). Consequently, the efficacy of HIV treatment is of primary concern when treating people who inject drugs (PWID). Several studies have shown the beneficial effect of HIV treatment among PWID (Wood et al., 2008; Mathers et al., 2010). We evaluated the effect of HIV treatment initiation on time to death among PWID. To control for time-dependent confounding and informative censoring, we used OPW obtained by solving (3.1)-(3.3). We computed the set of target weights as the product of the inverse probability of treatment and censoring weights (Robins et al., 2000). As discussed in Section 3.1, we truncated the set of target weights at different truncation levels, computed the variance of the resulting truncated weights, and used it as the value of ξ in constraint (3.2).

5.1 Study population

We used prospective observational data from the Swedish InfCare HIV registry (Sönnerborg, 2017), which contains socio-demographic, clinical and virological information, collected longitudinally from all clinics that treat people living with HIV. The number of people diagnosed between 1987 and 2017 in Sweden was 10,015. Our study was restricted to those who were alive, HIV treatment-naive and under follow-up after January 1996, when HIV treatment became readily available in the country. We excluded 1,055 people who had both their first and last visit before January 1996 (due to emigration or death) and 1,187 who started HIV treatment before January 1996, leaving 7,773 people. The baseline visit was set equal to the first available visit for each person. For those enrolled in the HIV monitoring program before January 1996, it was set at the first available visit after January 1996. People living with HIV were monitored and visited repeatedly from baseline onward, contributing from a minimum of 2 to a maximum of 102 visits. At each visit, data on socio-demographic characteristics, type of HIV treatment, and laboratory measurements, including absolute CD4 cell count and HIV-RNA load, were collected. HIV treatment was defined as a combination of at least 3 drugs, classified in 4 major categories: regimens based on non-nucleoside reverse-transcriptase inhibitors, ritonavir-boosted protease inhibitors, protease inhibitors, and others. Out of the 7,773 people, 459 lacked information on absolute CD4 cell count, 199 had only one absolute CD4 cell count observation, and 1,110 did not have sufficient information on the route of infection. We considered only people living with HIV infected by injecting drugs. The final sample comprised 538 treatment-naive PWID and a total of 9,247 clinical visits.

5.2 Treatment and censoring models

We used logistic regression to estimate the set of target weights that control for time-dependent confounding and informative censoring. We used time-invariant and time-dependent confounders to construct the set of stabilized inverse probability of treatment weights as shown in (2.2). Specifically, we identified the following variables as potential time-invariant confounders of the effect of HIV treatment initiation on time to death: baseline absolute CD4 cell count (<200, 200-350, 350-500, and >500 cells/ml); baseline HIV-RNA viral load (≤100,000 vs >100,000 copies/ml); age at baseline (0-30, 31-40, 41-50, and >50 years); gender (female vs male); country of birth (Sweden vs outside Sweden); type of HIV treatment regimen (4 drug categories); and calendar year of HIV treatment initiation. We considered the following potential time-dependent confounders: absolute CD4 cell count, modelled as cubic splines with 3 knots placed at the 25th, 50th and 75th percentiles; cumulative follow-up time, modelled as cubic splines with 5 knots at the 5th, 25th, 50th, 75th and 95th percentiles; undetectable HIV-RNA viral load; and HIV treatment at previous time points. HIV-RNA viral load was considered undetectable if it was lower than 50 copies/ml. We constructed the set of inverse probability of censoring weights similarly. Finally, we obtained the set of target stabilized weights as the product of the inverse probability of treatment and censoring weights.

5.3 Results

We considered the following MSCM to evaluate the effect of HIV treatment on time to death among PWID,

$$\lambda_{i,k}(t \mid A_i(k), V_i) = \lambda_0(t)\exp\left(\beta_1 A_i(k) + \beta_2 A_i(k-1) + \beta_3 V_i\right) \qquad (5.1)$$

where $V_i$ was the baseline absolute CD4 cell count for each PWID. We estimated the MSCM in (5.1) by a weighted Cox proportional hazards model. The unweighted estimated hazard ratio was HR = 1.65, with a robust standard error of 0.36, suggesting the presence of confounding. When using the set of target weights constructed as previously described, the estimated hazard ratio was 0.68, suggesting a protective effect of HIV treatment on time to death. The standard error was 0.74, more than twice that of the unweighted analysis. In particular, when analyzing the distribution of the target weights, few subjects (n = 2) were assigned a weight of more than 500, indicating a possible, although not large, practical violation of the positivity assumption.

To alleviate the presence of extreme weights, we computed the set of OPW and used it to estimate the hazard ratio. Specifically, we considered a grid of truncation levels between 0.8 and 1 and computed the truncated weights. We obtained the set of OPW by setting ξ equal to the variance of the truncated weights for each of the considered truncation levels. When the truncation level was equal to the 99.5th percentile of the target weight distribution, the set of OPW had a minimum value of 0.86 and a maximum value of 27. Figure 5 shows the estimated hazard ratio for the risk of death among PWID and its 95% confidence interval across the considered truncation levels. Consistent with our simulation results, we based the conclusions of this study on an estimated HR of 0.71 (95% CI 0.19-2.73, standard error equal to 0.68), obtained by using OPW with ξ set equal to the variance of the truncated weights at the 99.5th percentile. We concluded that with adequate support, PWID can benefit from HIV treatment.

[Figure 5: single panel. Y-axis: HR for the risk of death (0.0 to 3.0); X-axis: truncation level (0.80 to 1.00).]

Figure 5: Estimated hazard ratio for the risk of death and 95% confidence interval comparing treated vs. untreated individuals across levels of truncation of the target stabilized weights.

6 Conclusions

In this paper, we introduced OPW to the estimation of causal effects of time-varying treatments on survival outcomes with MSCM under practical violation of the positivity assumption. Xiao et al. (2013) and Cole and Hernán (2008) suggested truncating the weights at high percentiles of their observed sample distribution. In our simulations, OPW outperformed the truncated weights across all the considered truncation levels, especially at high percentiles. The results were similar in both small and large samples. In addition, the results showed that OPW were generally more narrowly distributed than truncated weights across all considered threshold levels and that OPW outperformed truncated weights across all scenarios of practical violation of the positivity assumption, especially under strong violation. This suggests that OPW may be used instead of truncated weights regardless of the sample size and the strength of practical positivity violation. By using OPW, we showed the beneficial effect of treatment on time to death among people in Sweden who live with HIV after being infected by injecting drugs.

We considered MSCM, but other methods, such as pooled logistic regression, can also be used (Robins et al., 2000). Different methods to estimate the standard error may also be applied, such as the bootstrap. In any given setting these may be preferable to the robust estimator we used in the present paper. The optimization problem (3.1)-(3.3) and its interpretation remain unchanged whichever estimator is used. We show this by providing additional simulations as Supporting Information.

We derived the target weights by using logistic regression. However, a number of alternative techniques have been proposed (Karim et al., 2017; Lee et al., 2010, 2011). We considered scenarios where the treatment and censoring models were well specified. When they are suspected to be misspecified, Karim et al. (2017) suggested using boosted regression and classification trees. These can be used to estimate the set of target weights employed in (3.1)-(3.3).

The convex optimization problem (3.1)-(3.3) can be solved by using existing software, such as the gurobi, quadprog, Ipoptr, and nloptr packages in R. The sample size has an impact on the computation time of the proposed method. For instance, in our simulations the average time was 0.2 seconds with n = 200 and about 1 second with n = 1,000. Decreasing ξ increases the computational time and may impact the feasibility of the problem. With small values of ξ, an optimal solution may not exist. In this case, we suggest increasing the value of ξ.

Future work may focus on extensions and applications of OPW to a variety of other settings. For example, they may prove useful when comparing dynamic treatment regimes, where treatment decisions are made based on the time-varying state of individual patients and weights are applied to control for time-dependent confounding, and informative and artificial censoring (Hernán et al., 2006, 2009). Further work may improve the robustness to misspecification of the treatment model and violations of the positivity assumption.
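The way ξ is chosen from truncated weights can be illustrated with a deliberately simplified sketch. It does not reproduce problem (3.1)-(3.3): it assumes that the only constraint is an upper bound ξ on the variance of the weights and that the objective is the Euclidean distance from the target weights, in which case the solution has a closed form (the deviations from the mean are shrunk by a common factor). The function names, the upper-tail-only truncation rule, and the weights themselves are assumptions made for illustration.

```python
import math
import statistics


def truncate_weights(weights, level):
    """Cap weights at their empirical `level` quantile (upper tail only; an assumption)."""
    ordered = sorted(weights)
    # Nearest-rank quantile: smallest value with at least level * n weights at or below it.
    rank = max(1, math.ceil(level * len(ordered)))
    cap = ordered[rank - 1]
    return [min(w, cap) for w in weights]


def variance_bounded_weights(target, xi):
    """Weights closest in Euclidean distance to `target` with population variance <= xi.

    Simplified stand-in for the full problem: only the variance bound is imposed.
    The minimizer keeps the mean of `target` and rescales the deviations from it.
    """
    mean = statistics.fmean(target)
    var = statistics.pvariance(target)
    if var <= xi:
        return list(target)  # bound already satisfied, nothing to change
    shrink = math.sqrt(xi / var)
    return [mean + shrink * (w - mean) for w in target]


# Hypothetical target weights with one extreme value (illustrative only).
target = [0.6, 0.8, 0.9, 1.0, 1.0, 1.1, 1.2, 1.4, 2.0, 12.0]
truncated = truncate_weights(target, 0.9)
xi = statistics.pvariance(truncated)       # variance of truncated weights used as xi
opw = variance_bounded_weights(target, xi)
print(max(truncated), round(max(opw), 3), round(statistics.pvariance(opw), 3))
```

In this simplified setting a solution exists for any ξ > 0, so the sketch cannot exhibit the infeasibility mentioned above; that behaviour would arise only once further constraints are imposed alongside the variance bound.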

References

Athey, S., G. W. Imbens, and S. Wager (2016). Approximate residual balancing: De-biased inference of average treatment effects in high dimensions. arXiv preprint arXiv:1604.07125.

Austin, P. C. (2016). Variance estimation when using inverse probability of treatment weighting (IPTW) with survival analysis. Statistics in Medicine 35 (30), 5642–5655.

Blackwell, M. (2013). A framework for dynamic causal inference in political science. American Journal of Political Science 57 (2), 504–520.

Cole, S. R. and M. A. Hernán (2008). Constructing inverse probability weights for marginal structural models. American Journal of Epidemiology 168 (6), 656–664.

Cole, S. R., M. A. Hernán, J. B. Margolick, M. H. Cohen, and J. M. Robins (2005). Marginal structural models for estimating the effect of highly active antiretroviral therapy initiation on CD4 cell count. American Journal of Epidemiology 162 (5), 471–478.

Daniel, R., S. Cousens, B. De Stavola, M. Kenward, and J. Sterne (2013). Methods for dealing with time-dependent confounding. Statistics in Medicine 32 (9), 1584–1618.

Hainmueller, J. (2012). Entropy balancing for causal effects: A multivariate reweighting method to produce balanced samples in observational studies. Political Analysis 20 (1), 25–46.

Hernán, M. A., B. Brumback, and J. M. Robins (2000). Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology 11 (5), 561–570.

Hernán, M. A., B. Brumback, and J. M. Robins (2001). Marginal structural models to estimate the joint causal effect of nonrandomized treatments. Journal of the American Statistical Association 96 (454), 440–448.

Hernán, M. A., E. Lanoy, D. Costagliola, and J. M. Robins (2006). Comparison of dynamic treatment regimes via inverse probability weighting. Basic & Clinical Pharmacology & Toxicology 98 (3), 237–242.

Hernán, M. A., M. McAdams, N. McGrath, E. Lanoy, and D. Costagliola (2009). Observation plans in longitudinal studies with time-varying treatments. Statistical Methods in Medical Research 18 (1), 27–52.

Hernán, M. A. and J. M. Robins (2010). Causal inference. CRC Boca Raton, FL.

HIV-Causal Collaboration (2011). When to initiate combined antiretroviral therapy to reduce mortality and AIDS-defining illness in HIV-infected persons in developed countries: an observational study. Annals of Internal Medicine 154 (8), 509.

HSL (2017). A collection of Fortran codes for large scale scientific computation. http://www.hsl.rl.ac.uk.

Imbens, G. W. and D. B. Rubin (2015). Causal inference in statistics, social, and biomedical sciences. Cambridge University Press.

Kang, J. D. and J. L. Schafer (2007). Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science 22 (4), 523–539.

Karim, M. E., J. Petkau, P. Gustafson, H. Tremlett, and T. B. S. Group (2017). On the application of statistical learning approaches to construct inverse probability weights in marginal structural Cox models: Hedging against weight-model misspecification. Communications in Statistics-Simulation and Computation 0 (0), 1–30.

Lee, B. K., J. Lessler, and E. A. Stuart (2010). Improving propensity score weighting using machine learning. Statistics in Medicine 29 (3), 337–346.

Lee, B. K., J. Lessler, and E. A. Stuart (2011). Weight trimming and propensity score weighting. PloS one 6 (3), e18174.

Mathers, B. M., L. Degenhardt, H. Ali, L. Wiessing, M. Hickman, R. P. Mattick, B. Myers, A. Ambekar, S. A. Strathdee, et al. (2010). HIV prevention, treatment, and care services for people who inject drugs: a systematic review of global, regional, and national coverage. The Lancet 375 (9719), 1014–1028.

Mathers, B. M., L. Degenhardt, B. Phillips, L. Wiessing, M. Hickman, S. A. Strathdee, A. Wodak, S. Panda, M. Tyndall, A. Toufik, et al. (2008). Global epidemiology of injecting drug use and HIV among people who inject drugs: a systematic review. The Lancet 372 (9651), 1733–1745.

Neugebauer, R., M. J. van der Laan, M. M. Joffe, and I. B. Tager (2007). Causal inference in longitudinal studies with history-restricted marginal structural models. Electronic Journal of Statistics 1, 119.

Robins, J., M. Sued, Q. Lei-Gomez, and A. Rotnitzky (2007). Comment: Performance of double-robust estimators when inverse probability weights are highly variable. Statistical Science 22 (4), 544–559.

Robins, J. M. (2000). Marginal structural models versus structural nested models as tools for causal inference. In Statistical models in epidemiology, the environment, and clinical trials, pp. 95–133. Springer.

Robins, J. M., M. A. Hernán, and B. Brumback (2000). Marginal structural models and causal inference in epidemiology. Epidemiology 11 (5), 550–560.

Robins, J. M., A. Rotnitzky, and L. P. Zhao (1995). Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. Journal of the American Statistical Association 90 (429), 106–121.

Rosenbaum, P. R. and D. B. Rubin (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 (1), 41–55.

Santacatterina, M. and M. Bottai (2017). Optimal probability weights for inference with constrained precision. Journal of the American Statistical Association 0 (ja), 0–0.

Scharfstein, D. O., A. Rotnitzky, and J. M. Robins (1999). Adjusting for nonignorable drop-out using semiparametric nonresponse models. Journal of the American Statistical Association 94 (448), 1096–1120.

Sönnerborg, A. (2017). InfCare HIV database. http://infcare.se/hiv/sv/. Accessed: 2017-08-16.

Stürmer, T., K. J. Rothman, J. Avorn, and R. J. Glynn (2010). Treatment effects in the presence of unmeasured confounding: dealing with observations in the tails of the propensity score distribution - a simulation study. American Journal of Epidemiology 172 (7), 843–854.

van der Wal, W. M., R. B. Geskus, et al. (2011). Ipw: an R package for inverse probability weighting. Journal of Statistical Software 43 (13), 1–23.

Wächter, A. and L. T. Biegler (2005, April). On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Mathematical Programming 106 (1), 25–57.

Wood, E., T. Kerr, M. W. Tyndall, and J. S. Montaner (2008). A review of barriers and facilitators of HIV treatment among injection drug users. AIDS 22 (11), 1247–1256.

Xiao, Y., M. Abrahamowicz, and E. E. Moodie (2010). Accuracy of conventional and marginal structural Cox model estimators: a simulation study. The International Journal of Biostatistics 6 (2).

Xiao, Y., E. E. Moodie, and M. Abrahamowicz (2013). Comparison of approaches to weight truncation for marginal structural Cox models. Epidemiologic Methods 2 (1), 1–20.

Zubizarreta, J. R. (2015). Stable weights that balance covariates for estimation with incomplete outcome data. Journal of the American Statistical Association 110 (511), 910–922.