Flaws, Bias, Misinterpretation and Fraud in Randomized Clinical Trials
Steven E. Nissen MD
Chairman, Department of Cardiovascular Medicine, Cleveland Clinic
Disclosure
Consulting: Many pharmaceutical companies
Clinical Trials: AstraZeneca, Eli Lilly, Takeda, Sankyo, Sanofi-Aventis, Resverlogix, Roche, and Pfizer.
Companies are directed to pay any honoraria, speaking or consulting fees directly to charity so that neither income nor tax deduction is received.
Common Flaws in Clinical Trials
  Inappropriate design and/or comparators
  Entry criteria not generalizable to population
  Errors of omission (selective reporting of results)
  Type I (particularly multiplicity) and Type II error
  Excessive emphasis on subgroups
  Use of unblinded study designs
  Ascertainment bias
  Misleading composite endpoints
  Problems with adherence or crossovers
  Issues with censoring rules/truncation
  Sponsor/CRO manipulation or misconduct
Inappropriate Comparators
Sample size = 2425 patients
Conclusions: Esomeprazole superior
Healing rates: 93.7% vs. 84.2%, p < 0.001
Comparators: 40 mg esomeprazole vs. 20 mg omeprazole
Issues of Generalizability
Eligibility Criteria and Recruitment: Physicians' Health Study
Male physicians 40-84 years on 1/1/82
  216,248  Sent letter
  112,528  Responded
   59,285  Willing to participate
   33,223  Eligible
   22,071  Randomized (after 18-week run-in)
NEJM 318:262-264, 1988.
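To make the generalizability concern concrete, the recruitment counts on this slide can be turned into the share of the original mailing that survives each step. This is only an arithmetic sketch; the names `funnel` and `retention` are illustrative and not part of the study.

```python
# Counts copied from the Physicians' Health Study slide.
funnel = [
    ("Sent letter", 216_248),
    ("Responded", 112_528),
    ("Willing to participate", 59_285),
    ("Eligible", 33_223),
    ("Randomized", 22_071),
]


def retention(stages):
    """Return (label, count, fraction of the first stage) for each stage."""
    start = stages[0][1]
    return [(label, n, n / start) for label, n in stages]


for label, n, frac in retention(funnel):
    print(f"{label:<24}{n:>8,}  {frac:6.1%}")
```

Roughly one in ten of the physicians who were sent a letter ended up randomized, which is the point of the slide: the trial population is a heavily selected subset.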
Selective Reporting of Results
Rofecoxib (Vioxx): The VIGOR Trial
A seemingly routine clinical trial report
November 23, 2000
VIGOR: General Safety Section The safety of both rofecoxib and naproxen was similar to that reported in previous studies. The mortality rate was 0.5% in the rofecoxib group and 0.4% in the naproxen group. Ischemic cerebrovascular events occurred in 0.2 percent of the patients in each group. Myocardial infarctions were less common in the naproxen group than in the rofecoxib group (0.1 percent vs. 0.4 percent; 95% confidence interval for the difference, 0.1 to 0.6 percent; relative risk 0.2; 95% confidence interval 0.1 to 0.7).
VIGOR: Thrombotic Events (FDA Hearing)
                             Vioxx N=4047   Naproxen N=4029
Any CV Thrombotic Event          45*             19
Cardiac Events                   28              10
  Fatal MI/Sudden Death           5               4
  Nonfatal MI                    18               4
  Unstable Angina                 5               2
Cerebrovascular Event            11               8
  Ischemic Stroke                 9               8
  TIA                             2               0
Peripheral                        6               1
*p=0.002 p=0.006
Table deleted from NEJM Manuscript by Merck
VIGOR: Thrombotic Cardiovascular Events
[Figure: cumulative incidence (%) of thrombotic cardiovascular events, rofecoxib vs. naproxen, over 0 to 14 months of follow-up]
FDA Advisory Committee, Feb 8, 2001
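The size of the imbalance in the first row of the FDA table can be checked with a crude risk ratio. The sketch below uses the standard log-scale (Katz) approximation for the confidence interval; it ignores follow-up time and is not the analysis the FDA or the sponsor performed. The function name `risk_ratio` is illustrative.

```python
import math


def risk_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Crude risk ratio of group A vs. group B with a log-scale 95% CI."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Any CV thrombotic event: Vioxx 45/4047 vs. naproxen 19/4029.
rr, lower, upper = risk_ratio(45, 4047, 19, 4029)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Under these assumptions the interval excludes 1, consistent with the significant p value shown for this row.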
Inappropriate Censoring Rules
Vioxx to Prevent Pre-Cancerous Colon Polyps: APPROVe stopped by DMC (2004)
[Figure: cumulative incidence of thrombotic events (%)]
Per-protocol analysis: CV events censored 2 weeks after drug discontinuation
Rofecoxib Epilogue: ITT Analysis of APPROVe
The original publication reported outcomes after censoring events that occurred more than 14 days after drug discontinuation.
Documents surfaced in Vioxx liability litigation that revealed a previously undisclosed intention-to-treat analysis.
Appropriate Kaplan-Meier curves show an early hazard with no 18-month delay.
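How a 14-day censoring rule can hide events is easiest to see on a toy dataset. The records below are hypothetical and invented purely for illustration (they are not APPROVe data); `Patient` and `count_events` are illustrative names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    arm: str
    stop_day: int              # day study drug was discontinued
    event_day: Optional[int]   # day of thrombotic event, None if no event


def count_events(patients, arm, window_days=None):
    """Count events in one arm.

    window_days=None counts every event (intention-to-treat style).
    Otherwise events later than stop_day + window_days are censored.
    """
    total = 0
    for p in patients:
        if p.arm != arm or p.event_day is None:
            continue
        if window_days is None or p.event_day <= p.stop_day + window_days:
            total += 1
    return total


# Hypothetical records, invented for illustration only.
patients = [
    Patient("drug", stop_day=100, event_day=105),
    Patient("drug", stop_day=100, event_day=160),
    Patient("drug", stop_day=300, event_day=400),
    Patient("drug", stop_day=500, event_day=None),
    Patient("placebo", stop_day=200, event_day=210),
    Patient("placebo", stop_day=500, event_day=None),
]

for arm in ("drug", "placebo"):
    censored = count_events(patients, arm, window_days=14)
    itt = count_events(patients, arm)
    print(f"{arm}: {censored} event(s) with 14-day censoring, {itt} by ITT")
```

In this made-up example the censoring rule discards two of the three events in the drug arm and none in the placebo arm, so the two arms look identical until every event is counted.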
Misleading Composite Endpoints
CREST: Carotid Stenting vs. Surgery
Among patients with symptomatic or asymptomatic carotid stenosis, the risk of the composite primary outcome of stroke, myocardial infarction, or death did not differ significantly in the group undergoing carotid-artery stenting and the group undergoing carotid endarterectomy.
             Stenting   CEA    HR     p value
Composite       66       56    1.18    0.38
Death            9        4    2.25    0.18
Stroke          52       29    1.79    0.01
MI              14       28    0.50    0.03
Inflated Type I Error: Multiplicity of Endpoints
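The CREST numbers show why a neutral composite can mislead: two of its components differ significantly, but in opposite directions. A minimal sketch of that check, using only the values on the slide (the function name and the 0.05 cutoff are assumptions made for illustration):

```python
# Rows from the CREST slide: (stenting events, CEA events, HR, p value).
crest = {
    "Composite": (66, 56, 1.18, 0.38),
    "Death": (9, 4, 2.25, 0.18),
    "Stroke": (52, 29, 1.79, 0.01),
    "MI": (14, 28, 0.50, 0.03),
}


def opposing_components(table, alpha=0.05, composite_key="Composite"):
    """Split the significant components by the direction of their hazard ratio."""
    higher_with_stenting, lower_with_stenting = [], []
    for name, (_, _, hr, p) in table.items():
        if name == composite_key or p >= alpha:
            continue
        if hr > 1:
            higher_with_stenting.append(name)
        else:
            lower_with_stenting.append(name)
    return higher_with_stenting, lower_with_stenting


higher, lower = opposing_components(crest)
print("Significantly more frequent with stenting:", higher)
print("Significantly less frequent with stenting:", lower)
```

Stroke lands in the first list and MI in the second, so the composite of 66 vs. 56 averages over effects that point in different directions.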
Type I Error and Multiplicity of Tests
[Figure: probability of Type I error (0 to 0.6) vs. number of tests (1 to 20), starting at 0.05 for a single test]
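The relationship plotted on this slide can be reproduced with one line of arithmetic if the tests are assumed to be independent, which is a simplification: the chance of at least one false positive among n tests is 1 - (1 - alpha)^n. The function name below is illustrative.

```python
def familywise_error(n_tests, alpha=0.05):
    """Probability of at least one false positive among n independent tests."""
    return 1 - (1 - alpha) ** n_tests


for n in (1, 5, 10, 20):
    print(f"{n:>2} tests: {familywise_error(n):.3f}")
```

With a single test the error rate is the nominal 0.05; by 10 tests it is already about 0.40 under the independence assumption.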
Primary Endpoints in Dalcetrapib Plaque
Primary Endpoints             Number                          p value
PET-CT inflammation           3 anat. sites, 2 time points    0.51
MRI wall area                 4 time points                   0.12
MRI total vessel area         4 time points                   0.04
MRI change in vessel area     3 time points                   NS
MRI wall thickness            4 time points                   0.45
MRI normalized wall index     4 time points                   0.57
Total potential endpoints     6 to 25                         ---
Conclusion: On MRI, significantly less progression in total vessel area was seen with dalcetrapib
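One way to see the problem with the stated conclusion is to apply a Bonferroni correction to the reported p values. This is a sketch of a standard adjustment, not the trial's own analysis plan; the endpoint listed as NS is omitted because no numeric value is given, and 6 and 25 are the bounds of the slide's "6 to 25" potential endpoints.

```python
# Numeric p values from the slide.
reported_p = {
    "PET-CT inflammation": 0.51,
    "MRI wall area": 0.12,
    "MRI total vessel area": 0.04,
    "MRI wall thickness": 0.45,
    "MRI normalized wall index": 0.57,
}


def bonferroni_significant(p_values, n_tests, alpha=0.05):
    """Flag each endpoint as significant at the Bonferroni threshold alpha / n_tests."""
    threshold = alpha / n_tests
    return {name: p <= threshold for name, p in p_values.items()}


for n_tests in (1, 6, 25):
    flags = bonferroni_significant(reported_p, n_tests)
    hits = [name for name, ok in flags.items() if ok]
    print(f"{n_tests:>2} tests, threshold {0.05 / n_tests:.4f}: {hits}")
```

The total vessel area result (p = 0.04) only clears the bar if it is treated as the sole test; with 6 endpoints the threshold is already about 0.008.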
Angiographic variables, demographic variables and many IVUS variables, more than 100 in total
Inappropriate Emphasis On Subgroups
The combination of fenofibrate and simvastatin did not reduce the rate of fatal cardiovascular events, nonfatal myocardial infarction, or nonfatal stroke, as compared with simvastatin alone. These results do not support the routine use of combination therapy with fenofibrate and simvastatin.
However, in the presentation and subsequent spin, the authors repeatedly emphasized the benefits observed in the subgroup with triglycerides ≥204 mg/dl and HDL ≤34 mg/dl.
Interaction p value = NS!
Misleading Subgroup Findings
Amlodipine in CHF: The PRAISE Trials
[Figure panels: PRAISE 1 Non-Ischemic Stratum; PRAISE 1 Ischemic Stratum]
PRAISE 1 (1153 patients), amlodipine vs. placebo
  Overall p = 0.07
  Interaction p = 0.004
  Ischemic p = NS
  Non-Ischemic p < 0.001
PRAISE 2 (1653 patients)
  Repeated non-ischemic stratum: RR ~ 1.0, no difference
Val-HeFT: Primary Composite Endpoint
HR 0.87 (0.79-0.97), p = 0.009
Harm in Triple Therapy Subgroup?
1610 of 5010 patients: ACEi, ARB and β-blocker
[Figure panels: Composite Endpoint; Death; β-blocker; ARB + ACE]
Enormous attention given to this subgroup that showed the hazards of triple therapy.
FDA Advisory Panel denied label claim
Meaningless Subgroup Findings
ISIS-2 Trial: Aspirin in Acute MI
Astrologic Sign      N         Placebo    ASA
Gemini or Libra      2799      15.2%      11.9%
Others               14,388    14.4%      11.0%
[Figure: odds ratio and 95% CI on an axis from 0.5 (Aspirin Better) through 1 to 1.5 (Placebo Better)]
ISIS-2 Investigators. Lancet 1988;2:349.
Bias Related to Unblinding
The Effects of Unblinding
NIH Trial of Vitamin C for Common Cold
Duration of Cold (Days)
                 Blinded Subjects   Unblinded Subjects
Placebo                6.3                 8.6
Ascorbic Acid          6.5                 4.8
Karlowski et al. JAMA 1975;231:1038.
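The apparent treatment effect in each group is simply the difference in cold duration between ascorbic acid and placebo. A small sketch using only the four values on the slide (the names `durations` and `treatment_effect` are illustrative):

```python
# Mean cold duration in days, from the slide.
durations = {
    "blinded": {"placebo": 6.3, "ascorbic acid": 6.5},
    "unblinded": {"placebo": 8.6, "ascorbic acid": 4.8},
}


def treatment_effect(group):
    """Ascorbic acid minus placebo, in days (negative means shorter colds)."""
    return round(group["ascorbic acid"] - group["placebo"], 1)


for name, group in durations.items():
    print(f"{name}: {treatment_effect(group):+.1f} days")
```

Among subjects who stayed blinded the difference is essentially nil, while among those who guessed their assignment vitamin C appears to shorten colds by almost four days.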
Odds ratio for myocardial infarction 1.43 (95% CI 1.03-1.98) Odds ratio for cardiovascular death 1.64 (95% CI 0.98-2.74)
2009: The RECORD Trial (Lancet)
Demonstrates Non-Inferiority for Rosiglitazone
Myocardial Infarction HR = 1.14 (95% CI 0.80-1.63)
Academic Steering Committee, independent external validation of results
2010 FDA Advisory Panel
GSK requests elimination of the black box warning placed in 2007.
The FDA schedules a new 2010 Advisory Committee to consider this request.
Fortunately, FDA reviewer Tom Marciniak is assigned to review the RECORD trial.
RECORD: How Not to Perform a Safety Study
A completely unblinded study: patients and physicians knew who was taking rosiglitazone.
Extraordinary unblinding: unrestricted availability of treatment codes to Quintiles and GSK!
The company censored silent MIs (10 to 5, rosiglitazone vs. control) from the analysis AFTER analyzing RECORD data
HR = 1.38 (0.99-1.93)
September 23, 2010: Pharmageddon for rosiglitazone
European regulators ban the drug entirely
FDA restricts use to patients who have failed all other diabetes drugs
September 2006: DREAM Appears in Lancet, Placebo Controlled Diabetes Prevention Trial
              RSG n=2635   Placebo n=2634   HR (95% CI)          p value
MI                15             9          1.66 (0.73-3.80)     0.2
Stroke             7             5          1.39 (0.44-4.40)     0.6
CV Death          12            10          1.20 (0.52-2.77)     0.7
Adj. CHF          14             2          7.03 (1.6-30.9)      0.01
New Angina        24            20          1.2 (0.66-2.17)      0.5
Revasc.           35            27          1.29 (0.78-2.14)     0.3
Composite         75            55          1.37 (0.97-1.94)     0.08
Publication Bias
Significant Studies Are More Likely to be Published
218 Studies at a Single Institution over a 10-Year Period
Stern and Simes. BMJ 1997;315:640.