Analysis of Vaccine Effects on Post-Infection Endpoints Biostat 578A Lecture 3

Similar documents
Case-Cohort Approach to Assessing Immunological Correlates of Risk, With Application to Vax004. Biostat 578A: Lecture 11

HIV Vaccine Trials. 1. Does vaccine prevent HIV infection? Vaccine Efficacy (VE) = 1 P(infected vaccine) P(infected placebo)

A novel approach to estimation of the time to biomarker threshold: Applications to HIV

Accommodating informative dropout and death: a joint modelling approach for longitudinal and semicompeting risks data

Abstract. Introduction A SIMULATION STUDY OF ESTIMATORS FOR RATES OF CHANGES IN LONGITUDINAL STUDIES WITH ATTRITION

Generalized Estimating Equations for Depression Dose Regimes

Strategies for handling missing data in randomised trials

Bayesian approaches to handling missing data: Practical Exercises

Comparison of imputation and modelling methods in the analysis of a physical activity trial with missing outcomes

A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY

Evaluating health management programmes over time: application of propensity score-based weighting to longitudinal datajep_

MLE #8. Econ 674. Purdue University. Justin L. Tobias (Purdue) MLE #8 1 / 20

Selected Topics in Biostatistics Seminar Series. Missing Data. Sponsored by: Center For Clinical Investigation and Cleveland CTSC

THE UNIVERSITY OF OKLAHOMA HEALTH SCIENCES CENTER GRADUATE COLLEGE A COMPARISON OF STATISTICAL ANALYSIS MODELING APPROACHES FOR STEPPED-

Estimands, Missing Data and Sensitivity Analysis: some overview remarks. Roderick Little

The Relative Performance of Full Information Maximum Likelihood Estimation for Missing Data in Structural Equation Models

Lecture 14: Adjusting for between- and within-cluster covariates in the analysis of clustered data May 14, 2009

An application of a pattern-mixture model with multiple imputation for the analysis of longitudinal trials with protocol deviations

CAUSAL INFERENCE IN HIV/AIDS RESEARCH: GENERALIZABILITY AND APPLICATIONS

Package speff2trial. February 20, 2015

Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time.

Bayesian graphical models for combining multiple data sources, with applications in environmental epidemiology

Heritability. The Extended Liability-Threshold Model. Polygenic model for continuous trait. Polygenic model for continuous trait.

A Comparison of Methods for Determining HIV Viral Set Point

The Statistical Analysis of Failure Time Data

Statistical Science Issues in HIV Vaccine Trials: Part I

How to analyze correlated and longitudinal data?

GENERALIZED ESTIMATING EQUATIONS FOR LONGITUDINAL DATA. Anti-Epileptic Drug Trial Timeline. Exploratory Data Analysis. Exploratory Data Analysis

Department of Statistics, Biostatistics & Informatics, University of Dhaka, Dhaka-1000, Bangladesh. Abstract

On the Use of Local Assessments for Monitoring Centrally Reviewed Endpoints with Missing Data in Clinical Trials*

Catherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1

Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study

Instrumental Variables Estimation: An Introduction

Longitudinal data monitoring for Child Health Indicators

Comparing Dynamic Treatment Regimes Using Repeated-Measures Outcomes: Modeling Considerations in SMART Studies

BIOSTATISTICAL METHODS AND RESEARCH DESIGNS. Xihong Lin Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA

Section on Survey Research Methods JSM 2009

Implementing double-robust estimators of causal effects

MS&E 226: Small Data

Propensity scores: what, why and why not?

Bayesian Joint Modelling of Longitudinal and Survival Data of HIV/AIDS Patients: A Case Study at Bale Robe General Hospital, Ethiopia

Multilevel analysis quantifies variation in the experimental effect while optimizing power and preventing false positives

Estimation of delay to diagnosis and incidence in HIV using indirect evidence of infection dates

Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models, 2nd Ed.

Logistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values

Supplementary Appendix

COMMITTEE FOR PROPRIETARY MEDICINAL PRODUCTS (CPMP) POINTS TO CONSIDER ON MISSING DATA

Models for HSV shedding must account for two levels of overdispersion

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS)

On Regression Analysis Using Bivariate Extreme Ranked Set Sampling

In this module I provide a few illustrations of options within lavaan for handling various situations.

Statistical data preparation: management of missing values and outliers

Analysis of TB prevalence surveys

Use of GEEs in STATA

Studying the effect of change on change : a different viewpoint

Analyzing incomplete longitudinal clinical trial data

Missing data. Patrick Breheny. April 23. Introduction Missing response data Missing covariate data

How should the propensity score be estimated when some confounders are partially observed?

Imputation approaches for potential outcomes in causal inference

Computer Age Statistical Inference. Algorithms, Evidence, and Data Science. BRADLEY EFRON Stanford University, California

Proof. Revised. Chapter 12 General and Specific Factors in Selection Modeling Introduction. Bengt Muthén

Eliciting a Counterfactual Sensitivity Parameter

Analysis of Hearing Loss Data using Correlated Data Analysis Techniques

Module 14: Missing Data Concepts

MEA DISCUSSION PAPERS

THE ANALYSIS OF METHADONE CLINIC DATA USING MARGINAL AND CONDITIONAL LOGISTIC MODELS WITH MIXTURE OR RANDOM EFFECTS

Analysis of left-censored multiplex immunoassay data: A unified approach

Complier Average Causal Effect (CACE)

A re-randomisation design for clinical trials

Session 9. Correlate of Protection (CoP) (Plotkin and Gilbert, CID 2012)

Poisson regression. Dae-Jin Lee Basque Center for Applied Mathematics.

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

Modelling variable dropout in randomised controlled trials with longitudinal outcomes: application to the MAGNETIC study

EU regulatory concerns about missing data: Issues of interpretation and considerations for addressing the problem. Prof. Dr.

1 Missing Data. Geert Molenberghs and Emmanuel Lesaffre. 1.1 Introduction

Dichotomizing partial compliance and increased participant burden in factorial designs: the performance of four noncompliance methods

Mediation Analysis With Principal Stratification

Bayesian versus maximum likelihood estimation of treatment effects in bivariate probit instrumental variable models

Should a Normal Imputation Model Be Modified to Impute Skewed Variables?

George B. Ploubidis. The role of sensitivity analysis in the estimation of causal pathways from observational data. Improving health worldwide

Monte Carlo Analysis of Univariate Statistical Outlier Techniques Mark W. Lukens

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016

Mostly Harmless Simulations? On the Internal Validity of Empirical Monte Carlo Studies

A Bayesian Perspective on Unmeasured Confounding in Large Administrative Databases

Recommendations for the Primary Analysis of Continuous Endpoints in Longitudinal Clinical Trials

CHL 5225 H Advanced Statistical Methods for Clinical Trials. CHL 5225 H The Language of Clinical Trials

Estimating the impact of community-level interventions: The SEARCH Trial and HIV Prevention in Sub-Saharan Africa

Sampling Weights, Model Misspecification and Informative Sampling: A Simulation Study

SEMI-MARKOV MULTI-STATE MODELING OF HUMAN PAPILLOMAVIRUS

Biostatistics II

Appendix 1. Sensitivity analysis for ACQ: missing value analysis by multiple imputation

Estimating drug effects in the presence of placebo response: Causal inference using growth mixture modeling

Kelvin Chan Feb 10, 2015

Inverse Probability of Censoring Weighting for Selective Crossover in Oncology Clinical Trials.

Pharmaceutical Statistics Journal Club 15 th October Missing data sensitivity analysis for recurrent event data using controlled imputation

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp

Comparisons of Dynamic Treatment Regimes using Observational Data

Supplementary Material*

Methods for adjusting survival estimates in the presence of treatment crossover a simulation study

Transcription:

Analysis of Vaccine Effects on Post-Infection Endpoints Biostat 578A Lecture 3 Analysis of Vaccine Effects on Post-Infection Endpoints p.1/40

Data Collected in Phase IIb/III Vaccine Trial Longitudinal data from all randomized subjects: HIV tests Vaccine-induced immune responses to a panel of HIV isolates Covariates (e.g., risk behavior) Longitudinal data from infected subjects: 1. Viral loads/cd4 cell counts 2. ART initiation 3. Clinical events 4. Immune responses to the infecting HIV strain and to a panel of HIV isolates 5. Genetic sequences (and phenotypes) of (multiple clones of) infecting HIVs Analysis of Vaccine Effects on Post-Infection Endpoints p.2/40

Approach to Post-infection Endpoints for Assessing V E P /V E I Assess vaccine effects both in the randomized cohort and in the infected subcohort Clinical endpoints (HIV-related conditions, WHO stage 2/3, etc.) Surrogate endpoints for AIDS and secondary transmission (viral load, CD4, etc.) Analytic challenges of surrogate endpoints: Surrogate effects may not predict clinical effects ART use obscures direct assessment of mid/late vaccine effects on viral load/cd4 count Post-randomization selection bias Gilbert PB, DeGruttola V, Hudgens M, Self S, Hammer S, Corey L. (2003). JID 21:2933-2947. Analysis of Vaccine Effects on Post-Infection Endpoints p.3/40

Time-to-Event Late Endpoints ART initiation HIV-related clinical events Composite endpoints ART initiation or viral failure > 100,000 copies/ml ART initiation or CD4 failure < 200 copies/ml E.g., surrogate efficacy parameter V E C = Percent reduction (vaccine vs placebo) in composite endpoint rate by 2 years Analysis of Vaccine Effects on Post-Infection Endpoints p.4/40

Conduct Both Intent-to-Treat (ITT) and Infected Subcohort Analyses Pros randomized cohort analysis: Intent-to-treat (ITT) [unbiased] Interpretation Risks/benefits determined for all vaccinees Cons: Cannot separate vaccine effects on infection and post infection Follow-up period for counting endpoints restricted to infection monitoring period Difficult to follow randomized cohort long-term Analysis of Vaccine Effects on Post-Infection Endpoints p.5/40

Why Must Restrict Follow-up to the Infection Monitoring Period Suppose all subjects followed for 3 years for infection, and all infected subjects are followed 5 more years for a disease endpoint Ideal design would follow all subjects 8 years from entry Too expensive to follow 3000-10,000 subjects for 8 years Kaplan-Meier analysis that censors those uninfected at 3 years would underestimate the survival probability Alternatively, assuming that all subjects uninfected at 3 years would not experience the disease endpoint would overestimate the survival probability Kaplan-Meier analysis following everyone for 3 years only is unbiased Analysis of Vaccine Effects on Post-Infection Endpoints p.6/40

Conduct Both ITT and Infected Subcohort Analyses Pros infected subcohort analysis: Interest to infer vaccine effects on HIV progression in HIV infected persons Feasible to follow the infected subcohort long-term [ 10% of randomized cohort] Cons infected subcohort analysis: Post-randomization selection bias: Infected vaccine and infected placebo groups may differ in other ways besides vaccine/placebo assignment Conduct sensitivity analyses (causal inference methods) Analysis of Vaccine Effects on Post-Infection Endpoints p.7/40

Analysis of Longitudinal Viral Loads and CD4 Cell Counts Notation: T scheduled fixed visit times post-infection Dx Y t = log 10 viral load at visit t X t = all covariates measured up to time t R t = indicator of whether the subject attends visit t and is ART-naive at visit t Estimands of interest: E[Y t Vaccine] E[Y t Placebo] E[Y t Vaccine,Xt other ] E[Y t Placebo,Xt other ] for a hypothetical situation where no one starts ART during follow-up Analysis of Vaccine Effects on Post-Infection Endpoints p.8/40

Analysis of Longitudinal Viral Loads and CD4 Cell Counts Consider analyses that count all Y t s missing that are measured after ART initiation Types of missing data Missing Completely at Random (MCAR): The probability of missing Y does not depend on Y (it may depend on X) Missing at Random (MAR): The probability of missing Y depends on observed Y s, but not on unobserved Y s Not Missing at Random (NMAR): The probability of missing Y depends on unobserved Y s Analysis of Vaccine Effects on Post-Infection Endpoints p.9/40

Analysis of Longitudinal Viral Loads and CD4 Cell Counts Two ways that Y s may be missing: ART initiation and study drop-out Study drop-out may be MCAR In Vax004, viral load over time did not predict the rate of study drop-out: RR = 1.23 per 1log 10 higher viral load, 95% CI 0.88-1.72, p = 0.23 Starting ART is not MCAR In Vax004, viral load predicted the rate of starting ART: Viral load at 1 month: RR = 1.57 per 1log 10 higher viral load, 95% CI 1.22-2.01, p = 0.0003 Viral load over time: RR = 1.88 per 1log 10 higher viral load, 95% CI 1.51-2.34, p < 0.0001 Analysis of Vaccine Effects on Post-Infection Endpoints p.10/40

Methods for Addressing the MAR Data Weighted Generalized Estimating Equations (GEE) methods Likelihood based linear mixed effects (LME) models Other Approaches Analysis of Vaccine Effects on Post-Infection Endpoints p.11/40

Notation for Complete Data T visit times common to all n subjects Y i = (Y i1,,y it ) : Complete data response vector X i = (X i1,,x it ) : Covariate matrix X it a p 1 covariate vector at visit time t Analysis of Vaccine Effects on Post-Infection Endpoints p.12/40

Notation for Observed Data Subject i has data at visits 1,,T i,1 T i T Y o i = (Yi1 o,,y o it i ) : Observed response vector Xi o = (Xi1 o,,xo it i ) : Observed covariate matrix X o it a p 1 covariate vector at time t Analysis of Vaccine Effects on Post-Infection Endpoints p.13/40

Weighted GEE Approach Estimand of interest: E(Y it X it ) = µ it Generalized Linear Model (GLM): g(µ it ) = X it β g a known link function For Y continuous, e.g., log 10 viral load, g(x) = x, so that the GLM is E(Y it X it ) = X itβ Analysis of Vaccine Effects on Post-Infection Endpoints p.14/40

Weighted GEE Approach Example models, with V i = 1 if vaccine and V i = 0 if placebo; t i a fixed visit time among 1,,T i X it = (1,t i,v i ) : E(Y it X it ) = β 0 + β 1 t i + β 2 V i β 2 = E[Y it V i = 1,t i ] E[Y it V i = 0,t i ] [Constant vaccine effect over time] X it = (1,t i,v i,t i V i ) : E(Y it X it ) = β 0 +β 1 t i +β 2 V i +β 3 t i V i β 2 + β 3 t i = E[Y it V i = 1,t i ] E[Y it V i = 0,t i ] [Vaccine effect changing linearly over time] Analysis of Vaccine Effects on Post-Infection Endpoints p.15/40

Weighted GEE Methods for Addressing MAR Data Standard GEE of Liang and Zeger (Biometrika, 1986): n D o i (Xi o,β)(vi o ) 1 [Yi o µ i o (β)] = 0 i=1 where D o i (X i o,β) = µo i / β is a T i p matrix and Vi o a working covariance matrix for Yi o i = diag(var(y 1/2 it ))Ci o 1/2 (ρ)diag(var(yit )) [T i T i matrix] (ρ) is a working correlation matrix depending on an unknown vector parameter ρ, which is estimated V o C o i is Analysis of Vaccine Effects on Post-Infection Endpoints p.16/40

Weighted GEE Methods for Addressing MAR Data Examples Independent working correlation: C o i = identity matrix Exchangeable working correlation: C o i = 1 s on the diagonal and common correlation ρ for all off-diagonal elements Auto-regressive-1: C o i = 1 s on the diagonal and ρ k for k steps off the diagonal (logical choice for repeated measures data) Analysis of Vaccine Effects on Post-Infection Endpoints p.17/40

Weighted GEE Methods for Addressing MAR Data Solution β obtained by iteratively reweighted estimation of β (McCullogh and Nelder, 1989, Generalized Linear Models) Variance of β estimated by sandwich variance estimator" Analysis of Vaccine Effects on Post-Infection Endpoints p.18/40

Weighted GEE Methods for Addressing MAR Data Standard GEE provides unbiased estimation of β under MCAR, but not under MAR GEE not valid for vaccine trials Weighted GEE valid for MAR data (Robins, Rotnitzky, Zhao (RRZ), 1995, JASA) Analysis of Vaccine Effects on Post-Infection Endpoints p.19/40

Weighted GEE Methods for Addressing Weighted GEE (RRZ): n D i(x i,β)(v i ) 1 W i [Y i µ i (β)] = 0 i=1 MAR Data where D i (X i,β) = µ i / β and V i = A i C i A i is a working covariance matrix for Y i W i is a T T diagonal matrix of time-specific weights: W i = diag(r i1 w i1,,r it w it ) R it = I(Y it observed) = indicator of whether ith subject has a pre-art value at visit t w it > 0 for observed pre-art visits; = 0 o/w Analysis of Vaccine Effects on Post-Infection Endpoints p.20/40

Weighted GEE Methods for Addressing MAR Data w it = reciprocal of the probability the ith subject is observed (pre-art) at the tth visit If observed with probability 1, assign weight 1 If observed with probability 1/2, assign weight 2 If observed with probability 1/10, assign weight 10 Heuristically, reconstruct the complete dataset by weighting the observed data Analysis of Vaccine Effects on Post-Infection Endpoints p.21/40

Weighted GEE Methods for Addressing MAR Data w it is estimated using a missing data model Let λ it = Pr(R it = 1 R i(t 1) = 1,X i,y i,α) For the first time point, assume R i1 = 1 and set λ i1 = 1 Second and later time-points: The MAR assumption implies λ it = Pr(R it = 1 R i(t 1) = 1,X i,(y i1,,y i(t 1) ),α) i.e., Missingness depends only on observed data and a parametric model with unknown parameter α Analysis of Vaccine Effects on Post-Infection Endpoints p.22/40

Weighted GEE Methods for Addressing MAR Data Example missingness model: logit{λ it (α)} = Z itα where Z it may include anything (covariates and/or responses) observed prior to time t The MLE of α is computed, and λ it is estimated by λit = Z it α Analysis of Vaccine Effects on Post-Infection Endpoints p.23/40

Weighted GEE Methods for Addressing MAR Data w it is then estimated as the reciprocal of the product of conditional probabilities: ŵ it = [ λi1 λ it ] 1 β and Var( β ) computed similarly as in standard GEE Analysis of Vaccine Effects on Post-Infection Endpoints p.24/40

Comments on Weighted GEE Approach for MAR Data If the marginal mean model is correctly specified, and the parametric model for missingness is correctly specified, then weighted GEE provides unbiased estimation of β, and the sandwich variance estimator is consistent, even if the working correlation matrix is misspecified If the missingness model is correctly specified, weighted GEE peforms better than standard GEE If the missingness model is misspecified, weighted GEE can perform poorly, and standard GEE sometimes does better Analysis of Vaccine Effects on Post-Infection Endpoints p.25/40

Comments on Weighted GEE Approach for MAR Data Weighted GEE can have problems (biased estimation and huge variance estimates) if the ŵ it s are large (i.e., estimated probability of being observed is near zero) Analysis of Vaccine Effects on Post-Infection Endpoints p.26/40

Comments on Weighted GEE Approach for MAR Data In vaccine trials, the estimated probability of having pre-art viral loads might be near zero! This is due to standardized treatment guidelines Current HVTN policy: Provide ARTs to all infected participants when they meet pre-specified criteria E.g., CD4 < 300 cells/mm 3, viral load > 100,000 copies/ml, or HIV-related clinical symptoms Z it may predict perfectly whether a participant will start ART! w it = if and only if Pr(subject i drops out by t or starts ART by t) = 1 If everybody receives ART when they should, then w it = for some subjects and weighted GEE breaks down Analysis of Vaccine Effects on Post-Infection Endpoints p.27/40

Comments on Weighted GEE Approach for MAR Data Irony with weighted GEE: Want to predict missingness to handle MAR missingness, but if predict missingness too well then the method fails Weighted GEE does not handle well the censoring of responses Can use ad hoc approaches, such as assigning all left-censored viral loads a value equal to half the detection limit Can study the marginal mean conditional on non-censored response (truncated mean) Analysis of Vaccine Effects on Post-Infection Endpoints p.28/40

Linear Mixed Effects (LME) Models Approach LME models provide unbiased inferences under assumptions: 1. Multivariate normality of viral loads 2. All predictors of ART initiation are captured in observables The LME models can accommodate censored viral values (Jim Hughes, 1999, Biometrics) Analysis of Vaccine Effects on Post-Infection Endpoints p.29/40

Standard LME Model Y i = X i β + Z i γ i + e i β a vector of fixed effects γ a vector of random effects for subject i e i a vector of random errors Assume γ i and e i are independent with γ i N(0,Σ) e i N(0,σ 2 I) Analysis of Vaccine Effects on Post-Infection Endpoints p.30/40

Standard LME Model Hughes (1999, Biometrics) developed an EM algorithm to obtain the MLEs of β, Σ, and σ 2, allowing for any number of subjects to have left-censored or right-censored viral loads Louis (1982, JRSS B) method used to estimate Var( β ) This method adjusts the variances of the esimated fixed effects for the information lost due to censoring Analysis of Vaccine Effects on Post-Infection Endpoints p.31/40

Standard LME Model Viral load assay used in the VaxGen trials: quantification range 400-750,000 copies/ml Left-censoring: Y it 400 copies/ml Right-censoring: Y it 750,000 copies/ml In Vax004, 259 values 400 and 43 values 750,000 (18.2% of all values censored) In simulations Hughes (1999) showed that his method does much better (less bias in estimating β, Σ, and σ 2 ) than typically applied hoc methods that assign an arbitrary value for censored values (e.g., the detection limit or one-half the detection limit) Efficiency gain derived from the parametric distributional assumption Analysis of Vaccine Effects on Post-Infection Endpoints p.32/40

Comparison of Weighted GEE and LME Models LMEs handle MAR data without needing a missingness model, and performance improves the extent to which the variables included in the model predict missingness LMEs can handle left-censoring and right-censoring of viral loads LMEs preferred if ART initiation is predicted very well, if a credible missingness model cannot be built, or if left/right censoring is heavy Analysis of Vaccine Effects on Post-Infection Endpoints p.33/40

Comparison of Weighted GEE and LME Models Weighted GEE avoids the assumption of multivariate normal viral loads, and is more robust to specification of the correlation structure of measurements over time Weighted GEE preferred if its assumptions are credible, ART is not predicted very well, and left/right censoring is light Analysis of Vaccine Effects on Post-Infection Endpoints p.34/40

Example: Analysis of VAX004 Data Marginal mean model of Y = log 10 viral load with covariates: vaccination status white/nonwhite baseline risk score (0-7) time after infection diagnosis in years education (4 levels) region (6 regions) calendar time of infection diagnosis (4 intervals) age (5 levels) Analysis of Vaccine Effects on Post-Infection Endpoints p.35/40

Methods to be Compared Weighted GEE of RRZ Multiple imputation for GEE (Paik, JASA, 1997) Robust efficient score (Annie Qu, 2006, unpublished manuscript) Analyze 319 subjects with pre-art viral load data Analysis of Vaccine Effects on Post-Infection Endpoints p.36/40

Results WGEE Mult. Imp. Rob. Effic. Score Variable β j s.e. Z β j s.e. Z β j s.e. Z Intercept 4.084 0.342 11.928 4.277 0.326 13.086 4.420 0.344 12.849 Vaccine -0.013 0.097-0.138-0.030 0.092-0.334 0.016 0.080 0.199 White 0.058 0.164 0.354 0.208 0.131 1.586 0.198 0.114 1.735 Risk score 0.035 0.038 0.920 0.0196 0.033 0.587 0.015 0.025 0.602 Time (years) 0.025 0.058 0.441-0.048 0.057-0.844 0.062 0.036 1.723 Educ.2 0.362 0.172 2.097 0.048 0.162 0.298 0.177 0.136 1.302 Educ.3 0.131 0.154 0.851-0.033 0.159-0.207 0.110 0.131 0.840 Educ.4 0.076 0.200 0.380-0.133 0.185-0.721-0.021 0.159-0.134 Region.1-0.164 0.228 0.113-0.252 0.251-1.006-0.399 0.289-1.376 Region.2 0.034 0.236 0.147 0.036 0.266 0.138-0.307 0.302-1.016 Region.3 0.197 0.228 0.863 0.064 0.251 0.254-0.212 0.287-0.739 Region.4 0.098 0.229 0.431-0.091 0.252-0.362-0.315 0.289-1.090 Region.5 0.144 0.245 0.588-0.030 0.268-0.114-0.263 0.293-0.890 P < 0.05 Analysis of Vaccine Effects on Post-Infection Endpoints p.37/40

Results, Continued WGEE Mult. Imp. Rob. Effic. Score Variable β j s.e. Z β j s.e. Z β j s.e. Z CaltimeD.2-0.383 0.218-1.756-0.188 0.173-1.084-0.228 0.160-1.426 CaltimeD.3-0.591 0.199-2.970-0.309 0.174-1.769-0.406 0.156-2.600 CaltimeD.4-0.164 0.185-0.886 0.050 0.158 0.320 0.001 0.144 0.007 CaltimeD.5-0.445 0.208-2.139-0.292 0.180-1.623-0.390 0.163-2.393 age.2 0.184 0.174 1.053-0.017 0.138-0.129 0.011 0.130 0.082 age.3 0.171 0.184 0.930-0.068 0.140-0.488-0.001 0.125-0.006 age.4 0.389 0.188 2.066 0.168 0.160 1.050 0.130 0.148 0.881 age.5-0.333 0.221-1.509-0.196 0.243-0.805-0.147 0.236-0.624 P < 0.05 Analysis of Vaccine Effects on Post-Infection Endpoints p.38/40

Notes on Results The 3 methods are all stable (standard errors reasonably small) and perform fairly similarly Estimates differ for region and time The assumption that 1/w it > 0 is evidently reasonable Robust efficient score approach provides lower standard errors except for intercept and region Analysis of Vaccine Effects on Post-Infection Endpoints p.39/40

Other Approaches to Analyzing the MAR Data If fewer than < 50% of infected subjects start ART by the late time-point t, can minimize dependent censoring problem by estimating the median difference (vaccine vs placebo) in pre-art viral load Utility" ITT approach: Analyze ranks of outcomes for all randomized participants E.g., ranks from best to worst: 1. Not infected 2. Infected and did not start ART within 2 years -Rank by level of viral loads 3. Infected and started ART within 2 years Analysis of Vaccine Effects on Post-Infection Endpoints p.40/40