Title: Stopping randomized trials early for benefit: a protocol of the Study Of Trial Policy Of Interim Truncation-2 (STOPIT-2)

Author's response to reviews

Authors:
Matthias Briel (brielm@uhbs.ch)
Melanie Lane (lane.melanie@mayo.edu)
Victor M Montori (montori.victor@mayo.edu)
Dirk Bassler (dirk.bassler@med.uni-tuebingen.de)
Paul Glasziou (paul.glasziou@dphpc.ox.ac.uk)
German Malaga (gmalaga01@gmail.com)
Elie A Akl (elieakl@buffalo.edu)
Ignacio Ferreira-Gonzalez (nacho_ferreira@hotmail.com)
Pablo Alonso-Coello (palonso@santpau.cat)
Gerard Urrutia (gurrutia@santpau.cat)
Regina Kunz (rkunz@uhbs.ch)
Carolina Ruiz Culebro (caro.ruiz09@gmail.com)
Suzana Alves da Silva (suzana.silva@procardiaco.com.br)
David N Flynn (dnflynn@gmail.com)
Mohamed B Elamin (elamin.mohamed@mayo.edu)
Brigitte Strahm (brigitte.strahm@uniklinik-freiburg.de)
M-Hassan Murad (murad.mohammad@mayo.edu)
Benjamin Djulbegovic (bdjulbeg@health.usf.edu)
Neill KJ Adhikari (neill.adhikari@utoronto.ca)
Edward J Mills (emills@cfenet.ubc.ca)
Femida Gwadry-Sridhar (femida.gwadrysridhar@lhsc.on.ca)
Haresh Kirpalani (kirpalanih@email.chop.edu)
Heloisa P Soares (soareshp@gmail.com)
Nisrin O Abu Elnour (abuelnour.nisrin@mayo.edu)
John J You (jyou@mcmaster.ca)
Paul J Karanicolas (pjkarani@uwo.ca)
Heiner C Bucher (hbucher@uhbs.ch)
Julianna F Lampropulos (lampropulos.julianna@mayo.edu)
Alain J Nordmann (anordmann@uhbs.ch)
Karen EA Burns (burnsk@smh.toronto.on.ca)
Sohail M Mulla (mullasm@mcmaster.ca)
Heike Raatz (hraatz@uhbs.ch)
Amit Sood (sood.amit@mayo.edu)
Jagdeep Kaur (kaurj4@mcmaster.ca)
Clare R Bankhead (clare.bankhead@dphpc.ox.ac.uk)
Rebecca J Mullan (mullan.rebecca@mayo.edu)
Kara A Nerenberg (nerenbk@mcmaster.ca)
Per Olav Vandvik (perolav.vandvik@sykehuset-innlandet.no)
Fernando Coto-Yglesias (fernandocoto@racsa.co.cr)
Holger Schünemann (hjs@buffalo.edu)
Fabio Tuche (fabiotuche@gmail.com)
Deborah J Cook (debcook@mcmaster.ca)
Kristina Lutz (kristina.lutz@learnlink.mcmaster.ca)
Christine M Ribic (christine.ribic@medportal.ca)
Patricia J Erwin (erwin.patricia@mayo.edu)
Rafael Perera (rafael.perera@dphpc.ox.ac.uk)
Qi Zhou (qzhou@mcmaster.ca)
Diane Heels-Ansdell (ansdell@mcmaster.ca)
Stephen D Walter (walter@mcmaster.ca)
Gordon H Guyatt (guyatt@mcmaster.ca)

Version: 2
Date: 28 May 2009

To the Trials Editorial Team

Rochester, May 27th, 2009

MS #: 1381969708248214
First revision: Stopping randomized trials early for benefit: a protocol of the Study Of Trial Policy Of Interim Truncation-2 (STOPIT-2)

Dear Editors,

We thank you for reviewing our manuscript and for considering a revised version for publication in Trials. We addressed all of the points raised by the reviewers in the point-by-point reply and thank them for their constructive comments. We checked that the revised manuscript conforms to the journal style; in particular, we included the figure legends in the main manuscript file and added a section for Abbreviations. All contributors have approved this revised version of the manuscript and fulfill the criteria for authorship.

We hope that this revision meets your and the reviewers' expectations and that it can be accepted for publication in its present form. We look forward to your reply.

With kind regards,
Victor M. Montori, MD, MSc

Address for correspondence:
Victor M. Montori, MD, MSc
Knowledge and Encounter Research Unit, Mayo Clinic, Plummer 3-35, 200 First Street SW, Rochester, MN 55905-0001, USA
Fax: 507.538.0850
Phone: 507.293.0175
Email: montori.victor@mayo.edu

Conflicts of interest: None to declare.

Point by point reply:

MS #: 1381969708248214
First revision: Stopping randomized trials early for benefit: a protocol of the Study Of Trial Policy Of Interim Truncation-2 (STOPIT-2)

The reviewers' comments are in bold font and our replies in regular font.

A) Comments from Reviewer 1: Kenneth Schulz

I would suggest that the authors define objective criteria for determining the methodological quality of allocation concealment, blinding, loss to follow-up, and patients analyzed in the groups to which they were randomized. They also should caution readers on their ability to measure all these aspects, even with objective criteria, particularly with the poor reporting that currently exists in medical journals. For example, losses to follow-up frequently are not reported.

Thank you for raising this important point. We followed your suggestion and revised the manuscript on page 15, para 1 as follows:

"2. Methodological quality: allocation concealment (documented as central independent randomization facility or numbered/coded medication containers prepared and distributed by an independent facility (e.g. pharmacy)); blinding of participants, care providers, and outcome adjudicators (blinding of participants and care providers will be rated as "probably yes" when the trial report states "double blinded" or "placebo controlled"); loss to follow-up (we will collect the number of participants randomized and the number of participants with outcome data for the outcome of interest, allowing for an estimation of loss to follow-up)."

We further added the following to the limitations mentioned in the discussion section of our manuscript (page 20, para 1):

"Despite objective criteria, when assessing the methodological quality of RCTs we are limited by the quality of the reporting of the trials."

B) Comments from Reviewer 2: Janet Wittes

I have a few small questions and suggestions.

1.) The authors ask whether Bayesian methods would reduce bias. They might also consider frequentist methods; see Proschan, Lan, Wittes, Statistical Monitoring of Clinical Trials, who described several frequentist methods.

Thank you for this suggestion. As we say on page 18 of our manuscript ("we will compare possible methods for correcting the estimates from tRCTs for possible bias, in particular the use of Bayesian methods. The basic approach here is to use a conservative prior for trials and combine this information with the data from the tRCT to obtain a posterior estimate of effect."), we are not specifically excluding frequentist methods to correct the treatment effect estimate of trials that were stopped early for benefit. However, we are currently not aware of such frequentist methods, and we could not find any in the book the reviewer mentioned. Proschan, Lan, and Wittes present a lucid discussion of various methods to statistically monitor clinical trials, but not of how to correct the effect estimate of a trial after it was stopped for apparent benefit. If the reviewer could point us to the exact section in her or other books or reports that documents the use of frequentist methods for this purpose, we would be happy to include those.

2.) I found the plethora of acronyms irritating. SR was particularly unpleasant.

We reduced the number of acronyms and made sure that we defined each acronym at first use. In addition, we included a section of Abbreviations in the revised manuscript.

3.) On page 10, the words "in duplicate" are unclear. Do the authors mean two independent reviewers?

We rephrased the corresponding sentence on page 10 as follows: "From each eligible systematic review we will blind each RCT's results and two independent reviewers will determine eligibility."

4.) On page 14, the statistical analysis says that the authors will calculate relative risks. That assumes the studies will all have binary or time-to-event variables. What if some studies are investigating means?

Thank you for raising this point. For our analysis of studies that provide results as continuous data (means, standard deviations), we will assume normal distributions of the results. We will assume that half a standard deviation represents the minimally important change (Norman et al. 2003).
Using baseline data, we will obtain the 0.5 standard deviation threshold from the baseline distribution and calculate the proportion of each follow-up distribution above or below the threshold (depending on the direction of the outcome), i.e. the proportion of patients in each treatment arm who did worse. This will allow us to specify relative risks and associated confidence intervals. If baseline data are not available, we will use the follow-up distribution of the control group in place of the baseline distribution to obtain the 0.5 standard deviation threshold. The proportion of patients in each treatment arm who did worse will then be calculated in the same way to specify relative risks and corresponding confidence intervals.

We added this information to the revised manuscript on page 15, last paragraph, as follows:

"For studies that provide results as continuous data (means, standard deviations), we will estimate an approximate dichotomous equivalent. To do this we will assume normal distributions of the results and that half a standard deviation represents the minimal important change.[10] Using baseline data we will obtain the 0.5 standard deviation threshold from the baseline distribution and calculate the proportion of each follow-up distribution above or below (depending on the direction of the outcome) the threshold, i.e. the proportion of patients in each treatment arm who did worse. This will allow us to specify relative risks and associated confidence intervals. If baseline data are not available, we will use the follow-up distribution of the control group to substitute for the 0.5 standard deviation threshold."

5.) The figures look like slides not figures.

We checked the figures and made sure that they conform to the journal style.

Additional comments by the authors:

Sparked by simulations from members of our group, we had an intense discussion about the most appropriate comparator for our analysis to estimate the magnitude of bias due to stopping early. We carefully considered two potential comparators: non-truncated randomized trials only, and truncated plus non-truncated randomized trials. Given that both approaches have compelling strengths and compelling limitations, we decided to conduct both analyses. We chose non-truncated RCTs only as the comparator in our primary analysis. In a second analysis, we will compare the truncated RCT (tRCT) with the pooled estimate of all trials, including the tRCT. To give readers more detail on our considerations, we added a Table to the revised manuscript that compares the two comparators.

We highlighted changes to the manuscript in red.
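
Illustration for the reply to comment 4 of Reviewer 2: the conversion of continuous outcomes into an approximate relative risk can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the study's analysis code. The function names are hypothetical. We assume that the threshold is the baseline mean shifted by 0.5 baseline standard deviations in the "worse" direction, and that the confidence interval uses the usual log relative risk standard error with the arm sample sizes, treating the estimated proportions as if they were observed.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def proportion_worse(mean, sd, threshold, higher_is_worse=True):
    """Share of a normal follow-up distribution on the 'worse' side of the threshold."""
    below = NormalDist(mean, sd).cdf(threshold)
    return 1.0 - below if higher_is_worse else below


def approximate_relative_risk(baseline_mean, baseline_sd,
                              treat_mean, treat_sd, n_treat,
                              control_mean, control_sd, n_control,
                              higher_is_worse=True):
    """Return (RR, lower, upper) for 'did worse', treatment versus control."""
    # Assumed reading of the threshold: half a baseline SD beyond the baseline mean.
    shift = 0.5 * baseline_sd
    threshold = baseline_mean + shift if higher_is_worse else baseline_mean - shift

    p_treat = proportion_worse(treat_mean, treat_sd, threshold, higher_is_worse)
    p_control = proportion_worse(control_mean, control_sd, threshold, higher_is_worse)
    rr = p_treat / p_control

    # Assumption: standard error of log(RR) as for observed binomial proportions.
    se = sqrt((1 - p_treat) / (n_treat * p_treat)
              + (1 - p_control) / (n_control * p_control))
    z = NormalDist().inv_cdf(0.975)  # two-sided 95% interval
    return rr, exp(log(rr) - z * se), exp(log(rr) + z * se)
```

With hypothetical values (baseline mean 50 and standard deviation 10; follow-up means of 45 and 50 with standard deviation 10 in the treatment and control arms; higher scores worse), the threshold is 55 and the sketch gives a relative risk of roughly 0.51. When baseline data are missing, the control group's follow-up mean and standard deviation could be passed in place of the baseline values.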