Two-stage Methods to Implement and Analyze the Biomarker-guided Clinical Trial Designs in the Presence of Biomarker Misclassification

RESEARCH HIGHLIGHT

Yong Zang 1, Beibei Guo 2

1 Department of Mathematical Sciences, Florida Atlantic University, Boca Raton, FL 33431, USA
2 Department of Experimental Statistics, Louisiana State University, Baton Rouge, LA 70803, USA

Correspondence: Yong Zang
E-mail: zangy@fau.edu
Received: February 24, 2016
Published online: March 28, 2016

With the advance of targeted therapies, physicians increasingly select personalized treatments for cancer according to each patient's biomarker profile. To efficiently evaluate the treatment and marker effects of targeted therapies, various biomarker-guided clinical trial designs have been proposed. Implementing and analyzing these designs requires that the biomarkers be measured accurately, which is not always feasible in practice. We recently investigated this problem and proposed a series of two-stage designs that correct for biomarker misclassification. In this article, we review these two-stage methods, which provide practical guidance for accommodating biomarker misclassification in biomarker-guided designs.

Keywords: Clinical trial; Biomarkers; Marker-stratified design; Enrichment design; Measurement error; Optimal design; Personalized medicine.

To cite this article: Yong Zang, et al. Two-stage Methods to Implement and Analyze the Biomarker-guided Clinical Trial Designs in the Presence of Biomarker Misclassification. Precis Med 2016; 2: e1230. doi: 10.14800/pm.1230.

Copyright: 2016 The Authors.
Licensed under a Creative Commons Attribution 4.0 International License, which allows users, including authors of articles, to copy and redistribute the material in any medium or format, and to remix, transform, and build upon the material for any purpose, even commercially, as long as the author and original source are properly cited or credited.

Introduction

Targeted therapy has revolutionized the way physicians treat cancer by enabling them to select personalized treatment according to each patient's specific biomarker profile [1]. Because targeted therapy blocks the growth of cancer cells by identifying and attacking specific functional units needed for carcinogenesis and tumor growth while sparing normal tissue, it is expected to be more effective and less toxic than conventional chemotherapy and radiotherapy [2]. Targeted therapy has been used to treat breast cancer, multiple myeloma, prostate cancer and other types of cancer [3]. Various biomarker-guided clinical trial designs have been proposed to detect and evaluate the treatment effect of the targeted therapy and the marker effect of the candidate biomarkers [4, 5, 6, 7].

A biomarker-guided clinical trial design requires precise measurement of the biomarker to carry out the trial. One practical obstacle is that, because of logistic or cost issues such as specimens not being submitted or insufficient tumor tissue, the biomarker status may be classified with error for a subset of patients in the trial. The common practice of excluding such patients from the trial wastes precious patient resources and biases the inference results. Therefore, novel statistical methods that efficiently correct for biomarker measurement error in biomarker-guided designs are urgently needed.

Recently, we investigated this issue and developed a two-stage strategy to correct for biomarker measurement error. We have applied this method to a series of biomarker-guided designs, including the enrichment design, the marker-adaptive design and the marker-stratified design [8, 9, 10]. The two-stage method supplements the original design with an additional stage of standard randomization, whose purpose is to gather the information needed to estimate the misclassification error. In addition, for the data analysis, we developed several statistical tests that adjust for the bias caused by biomarker misclassification. In this research highlight, we briefly review our work on this topic.

Enrichment design and marker-adaptive design

Figure 1. Diagram of the enrichment design.

The enrichment design is probably one of the earliest biomarker-guided designs. As shown in Figure 1, the enrichment design classifies patients into a marker-positive subgroup and a marker-negative subgroup based on their biomarker status. The marker-positive patients are then randomized to receive either the molecularly targeted agent (MTA) or the standard treatment, whereas the marker-negative patients are treated off protocol. When the trial ends, the treatment effect of the MTA is evaluated by comparing the response rates of the MTA and the standard treatment within the marker-positive subgroup. Because the biomarker is involved in both the conduct and the analysis of the enrichment trial, biomarker measurement error can seriously undermine the integrity of the trial (i.e., inappropriate patients are recruited or excluded) and substantially bias the inference results (i.e., the type I and type II error rates are distorted).

We proposed an optimal two-stage design to address this issue [10]. The optimal rule is obtained by decomposing the correct assignment rate conditional on the surrogate marker. The design is therefore optimal in the sense that it minimizes the mis-assignment rate given the surrogate marker information. Specifically, in the first stage, we enroll a subset of patients and measure both their precise biomarker and their surrogate marker. Each patient is then randomized to receive either the MTA or the standard treatment. In the second stage, we use the first-stage data to estimate the misclassification rates and build the optimal allocation rule. We then apply this rule to the remaining patients, assigning each of them either to receive the MTA or to be treated off protocol, based solely on the patient's surrogate marker measurement.

When the trial ends, the next step is to analyze the data. Before doing so, we analytically derived the bias that biomarker misclassification introduces into the data analysis. According to our derivation, the bias is substantial and does not vanish even under the null hypothesis of no treatment effect, so an adjustment is required. For this purpose, we developed a likelihood ratio test (LRT) that corrects for the bias. The maximum likelihood estimates (MLEs) were calculated using an expectation-maximization (EM) algorithm [11, 12]. In particular, the missing precise biomarker values were iteratively imputed by their expected probabilities conditional on the observed surrogate marker values and response outcomes. Finally, we use the LRT to test the treatment effect of the MTA within the marker-positive patients for the enrichment design in the presence of biomarker misclassification.

The enrichment design relies on the assumption that there is no suitable treatment for the marker-negative patients. In practice, however, a certain standard treatment may be appropriate for these patients. If such a treatment exists, it is unethical to treat the marker-negative patients off protocol. The marker-adaptive design has been proposed for this setting. Under this design, treatment allocation is fully determined by the biomarker status: marker-positive patients are treated with the MTA and marker-negative patients are treated with the standard therapy. The only difference between the two designs is that the enrichment design excludes the marker-negative patients from the trial, whereas the marker-adaptive design assigns them to the standard therapy.

We also applied the two-stage method to the marker-adaptive design with an imperfectly measured biomarker [8]. In particular, we proposed two optimal designs, referred to as the MinError and MaxResp designs. The MinError design is similar to the optimal design we used for the enrichment trial and minimizes the mis-assignment rate. The MaxResp design focuses on group ethics by maximizing the overall response rate of the trial. To correct for the bias, we developed a Wald-type test based on the profile-likelihood function. The biomarker measurement errors were treated as nuisance parameters in the profile-likelihood function and were estimated through a regression calibration model. Moreover, to optimally allocate patients between the first and second stages of the trial, we derived the asymptotic power function of the proposed Wald test.

Marker-stratified design

Figure 2. Diagram of the marker stratified design.

The marker-stratified design is another widely used biomarker-guided design.
The purpose of the marker-stratified design is to evaluate both the treatment effect and the marker effect. According to its role in diagnosing and selecting treatments for individuals with cancer, a marker effect can be broadly categorized as prognostic or predictive [5, 6]. A prognostic biomarker separates a population with respect to the risk of a specific outcome, such as disease progression, in the absence of treatment or despite a non-targeted standard treatment. A predictive biomarker foretells the differential efficacy of a particular therapy based on the presence or absence of the biomarker; for example, only patients whose tissues highly express the biomarker are expected to respond favorably to a specific targeted therapy.

Figure 2 depicts the marker-stratified design. The design stratifies patients into a marker-positive subgroup and a marker-negative subgroup based on each patient's biomarker profile, and then randomizes the patients within each subgroup to receive either the targeted therapy or the standard therapy. The predictive biomarker effect can be evaluated by comparing the treatment effect within the marker-positive subgroup with that within the marker-negative subgroup. The prognostic biomarker effect can be evaluated by comparing the responses of marker-positive and marker-negative patients who receive the standard therapy only. The treatment effect can be evaluated by comparing the responses to the different treatments within each marker subgroup.

We note that in the marker-stratified design, the treatment allocation is independent of the patient's biomarker status. Consequently, biomarker measurement error does not affect the trial implementation: whether a patient is correctly classified or misclassified, he or she is randomized between the treatments with the same randomization ratio. However, as with the enrichment design and the marker-adaptive design, the inference procedure for the marker-stratified design is vulnerable to biomarker misclassification, so a corresponding adjustment of the data analysis is still necessary.

Liu et al. [13] proposed a Wald test that analytically corrects for biomarker misclassification under the marker-stratified design. Unfortunately, this analytic method has two drawbacks. First, it assumes that the biomarker misclassification rate is known a priori, which rarely holds in practice. Second, it cannot incorporate other covariates. To overcome these limitations, we extended the two-stage method and applied it to the marker-stratified design [9]. Our method does not require pre-specification of the misclassification rate and can accommodate any covariates in addition to the biomarker and treatment outcomes.

Along the same lines as for the enrichment design, we analytically derived the bias in the treatment, prognostic marker and predictive marker effects arising from biomarker misclassification. We found that bias arose in all of the treatment and marker effects, even though the measurement error affected only the biomarker. To correct for this bias, we developed a series of Wald-type tests based on an EM algorithm, which is a modification of the EM algorithm we adopted for the enrichment design. The misclassification rates do not need to be prespecified because they are estimated iteratively within the EM algorithm.
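The iterative estimation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and it makes several simplifying assumptions: the true marker M, the surrogate S, the treatment T and the response Y are all binary; M is observed only for first-stage patients; misclassification is non-differential (S is independent of Y given M); and there are no additional covariates. The function names `em_misclassified_marker` and `simulate` and the parameter `fixed_rates` are hypothetical. The sketch returns point estimates only and does not compute the variance estimates or Wald statistics.

```python
import random


def _clip(x, eps=1e-9):
    """Keep a probability strictly inside (0, 1) to avoid division by zero."""
    return min(max(x, eps), 1.0 - eps)


def em_misclassified_marker(data, n_iter=100, fixed_rates=None):
    """EM sketch for a binary marker observed with error (illustrative only).

    data: list of (m, s, t, y) with s, t, y in {0, 1}; m is the true marker
          (0 or 1) for first-stage patients and None when it was not measured.
          Both treatment arms and both marker groups are assumed non-empty.
    fixed_rates: optional (sensitivity, specificity); if given, the rates are
          held fixed instead of being updated in the M-step.
    Returns (pi, se, sp, p), where p[m][t] estimates P(Y=1 | M=m, T=t).
    """
    pi = 0.5                                   # P(M=1), arbitrary start
    se, sp = fixed_rates if fixed_rates is not None else (0.8, 0.8)
    p = [[0.5, 0.5], [0.5, 0.5]]               # response rates, arbitrary start
    n = len(data)
    for _ in range(n_iter):
        # E-step: w = P(M=1 | S, T, Y); equals m itself when m was observed.
        w = []
        for m, s, t, y in data:
            if m is not None:
                w.append(float(m))
                continue
            l1 = pi * (se if s else 1 - se) * (p[1][t] if y else 1 - p[1][t])
            l0 = (1 - pi) * (1 - sp if s else sp) * (p[0][t] if y else 1 - p[0][t])
            w.append(l1 / (l1 + l0))
        # M-step: weighted proportions, treating w as fractional membership.
        w_pos = sum(w)
        pi = _clip(w_pos / n)
        if fixed_rates is None:
            se = _clip(sum(wi for wi, d in zip(w, data) if d[1] == 1) / w_pos)
            sp = _clip(sum(1 - wi for wi, d in zip(w, data) if d[1] == 0) / (n - w_pos))
        for t in (0, 1):
            arm = [(wi, d[3]) for wi, d in zip(w, data) if d[2] == t]
            pos = sum(wi for wi, _ in arm)
            neg = sum(1 - wi for wi, _ in arm)
            p[1][t] = _clip(sum(wi * y for wi, y in arm) / pos)
            p[0][t] = _clip(sum((1 - wi) * y for wi, y in arm) / neg)
    return pi, se, sp, p


def simulate(n1, n2, pi, se, sp, p, seed=0):
    """Simulate n1 first-stage patients (true marker kept) and n2 second-stage
    patients (true marker hidden), with 1:1 randomization for everyone."""
    rng = random.Random(seed)
    data = []
    for i in range(n1 + n2):
        m = int(rng.random() < pi)
        s = int(rng.random() < (se if m else 1 - sp))
        t = rng.randrange(2)
        y = int(rng.random() < p[m][t])
        data.append((m if i < n1 else None, s, t, y))
    return data
```

For example, `em_misclassified_marker(simulate(2000, 4000, 0.4, 0.85, 0.9, [[0.2, 0.25], [0.3, 0.6]]))` returns estimates of the marker prevalence, the sensitivity and specificity of the surrogate, and the four response rates. Passing `fixed_rates=(0.85, 0.9)` instead holds the two rates at those values throughout the iterations.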
However, when the misclassification rates are known a priori, the above EM algorithm still applies. The only modification is that we skip the update of the misclassification rates and fix them at their prespecified values. We use EM-I and EM-II to denote the EM method without and with specification of the misclassification rates, respectively. According to our simulation studies, the difference between the two methods is negligible when the misclassification rates are correctly prespecified. However, because the EM-II method is sensitive to misspecification of the misclassification rates, we generally recommend the EM-I method.

Conclusions

The era of targeted therapy has arrived. Adopting a targeted therapy requires identifying the biomarkers associated with it. To evaluate the treatment effect of the targeted therapy and the corresponding biomarker effect, many biomarker-guided clinical trial designs have been proposed. Unfortunately, these designs are vulnerable to biomarker misclassification, which can seriously distort the evaluation of the treatment and biomarker effects and undermine the implementation of the trial. In this research highlight, we briefly reviewed the two-stage method we recently proposed and described how it can be used to accommodate a series of biomarker-guided designs subject to biomarker measurement error, including the enrichment design, the marker-adaptive design and the marker-stratified design.

The proposed method can be extended in several directions. For example, our studies considered only a binary endpoint. It is of interest to extend the method to other types of outcomes, such as time-to-event and ordinal outcomes. Additionally, other biomarker-guided designs, such as the marker strategy design, are also worth studying.
We hope that our study can inspire more research on this topic from both the statistical and clinical communities.

Conflicting interests

The authors have declared that no conflict of interest exists.

Author contributions

Y. Z. proposed the idea and wrote the manuscript. B. G. discussed with Y. Z. and revised the manuscript.

References

1. Sawyers C. Targeted cancer therapy. Nature 2004; 432: 294-297.
2. Sledge GW. What is targeted therapy. J Clin Oncol 2005; 23: 1614-1615.
3. Wu HC, Chang DK, Huang CT. Targeted therapy for cancer. Journal of Cancer Molecules 2006; 2: 57-66.
4. Simon R, Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clin Cancer Res 2004; 10: 6759-6763.
5. Mandrekar SJ, Sargent DJ. Clinical trial designs for predictive biomarker validation: theoretical considerations and practical challenges. J Clin Oncol 2009; 27: 4027-4034.
6. Sargent DJ, Conley BA, Allegra C, Collette L. Clinical trial designs for predictive marker validation in cancer treatment trials. J Clin Oncol 2005; 23: 2020-2027.
7. Sargent D, Allegra C. Issues in clinical trial design for tumor marker studies. Semin Oncol 2002; 29: 222-230.
8. Zang Y, Liu S, Yuan Y. Optimal marker-adaptive designs for targeted therapy based on imperfectly measured biomarkers. Journal of the Royal Statistical Society Series C 2015; 64: 635-650.
9. Zang Y, Lee J, Yuan Y. Two-stage marker-stratified clinical trial design in the presence of biomarker misclassification. Journal of the Royal Statistical Society Series C 2016; DOI: 10.1111/rssc.12140.
10. Zang Y, Guo B. Optimal two-stage enrichment design correcting for biomarker misclassification. Stat Methods Med Res 2016; pii: 0962280215618429.
11. Dempster AP, Laird NM, Rubin DB. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B 1977; 39: 1-38.
12. Ibrahim JG. Incomplete data in generalized linear models. Journal of the American Statistical Association 1990; 85: 765-769.
13. Liu C, Liu A, Hu J, Yuan V, Halabi S. Adjusting for misclassification in a stratified biomarker clinical trial. Stat Med 2014; 33: 3100-3113.