Demographics, Subgroup Analyses, and Statistical Considerations in Cluster Randomized Trials Monique L. Anderson, MD MHS Assistant Professor of Medicine Division of Cardiology Duke Clinical Research Institute Duke University School of Medicine
Aims Definition of Terms to Describe HTE Review of Federal Policies for HTE analyses Systematic Review on Demographic Reporting and Subgroup Analyses Simulation Models to explore impact of imbalance of demographic subgroups
Definitions
Cluster Randomized Trial (CRT) (refs 1, 2)
- Experiments in which intact social units, rather than independent individuals, are randomly assigned to intervention groups.
- CRT designs may be chosen to avoid contamination, when randomization of individuals is not possible, or to evaluate interventions that operate at a group level or manipulate the social or physical environment.
Treatment Effect (ref 3)
- A comparison between treatment groups in a trial, usually measured by relative risk, odds ratio, or arithmetic difference.
Subgroup Analysis (ref 3)
- Any evaluation of treatment effects for a specific endpoint in subgroups of patients defined by baseline characteristics.
- Undertaken to investigate the consistency of trial conclusions across subpopulations defined by baseline characteristics.
- Also called heterogeneity of treatment effect analysis.
Heterogeneity of Treatment Effect (HTE) (ref 3)
- Refers to circumstances in which treatment effects vary across levels of a baseline characteristic, e.g. men versus women, Black race versus White race.
- Expressed in a statistical model as an interaction term between treatment group and the baseline variable.
References
1. Donner A. Appl Statist 1998;47:95-113
2. Murray D et al. J Natl Cancer Inst 2008;100:483-491
3. Wang R et al. NEJM 2007;357:2189-2194
FDA Policies and Guidance for Race and Ethnicity Reporting and HTE Analyses
1988: Guidelines for the Format and Content of Clinical and Statistical Sections of NDAs
- Emphasized the importance of subgroup analyses; specified that race and ethnicity subgroups should be analyzed
1998: Demographic Rule
- ½ NDAs have sufficient analyses
- Sponsors of IND applications to submit annual demographics of enrolled population
- NDAs required to submit effectiveness and safety data for demographic subgroups
2005: FDA Guidance on Reporting Race and Ethnicity in Clinical Research
- OMB categories recommended
2007: FDAAA 801, Reporting of Basic Results
- Mandatory for applicable clinical trials
- Race and ethnicity reporting is optional; age and sex are mandatory
- 2016: Final rule for FDAAA now mandates reporting of race and ethnicity to CT.gov
2012: Section 907 of FDASIA
- Action plan to improve demographic reporting and HTE
2016
- Race and ethnicity now required in submission of results to clinicaltrials.gov
- NIH now requires all funded clinical trials to report data to the website
NIH Policies on Minority Inclusion and HTE
1993: NIH Revitalization Act
- Directs the NIH to establish guidelines for inclusion of women and minorities in clinical research
- Established Office of Minority Health and Office of Women's Health
1994: NIH Guidelines on the Inclusion of Women and Minorities as Subjects in Clinical Research
- Inclusion of minorities to be addressed in funding proposals and annual progress reports
- Phase III trials must examine HTE where applicable
1997: OMB standards revised
2000: Guidelines updated
- Research plan, progress reports, competitive renewal applications, and final progress reports to include a plan for subgroup analyses
- Subgroup analyses strongly encouraged in all publication submissions
2001: NIH Policy on Reporting Race and Ethnicity Data: Subjects in Clinical Research
- OMB revised standards adopted by the NIH
- Inclusion guidelines updated to reflect OMB categories
NIH Inclusion Policy: Additional Guidance on HTE for Clinical Trials
| Prior Data | HTE Analysis Required | Sufficient Power Needed to Detect Difference in Subgroups |
|---|---|---|
| Support HTE by race/ethnicity | Mandatory | Yes |
| Neither support nor negate HTE | Yes | No |
| Do not support HTE | Encouraged | No |
Heterogeneity of Treatment Effect in CRTs for Baseline Demographics
- Pragmatic clinical trials may be optimal for understanding how standard treatments affect demographic subgroups.
- It is unclear how to address demographic subgroups in CRTs (or whether they should be addressed at all).
- Currently, no federal policies or guidance exist for addressing demographic inclusion and analysis in CRTs.
(Diagram labels: cluster randomization; design and analysis)
Research Aims
1. Systematic Review: Cluster Randomized Trials
   1. Define the proportion of CRTs that report on demographics.
   2. Report on the frequency of addressing demographic subgroups in the design or analysis of CRTs in the literature.
   3. Describe the degree to which heterogeneity of treatment effect analyses are conducted overall and for demographic subgroups.
2. Simulation Modeling
   1. Describe preliminary results of computer simulation models to quantify the impact of racial imbalance on treatment effect bias.
   2. Understand whether any detected bias due to imbalance is overcome by commonly used statistical approaches in CRTs.
Systematic Review
PubMed and EMBASE database search: CRTs published between Jan 2010 and March 2016.
Key Questions
- How often, and with what methods, is heterogeneity of treatment effect (HTE) explored in published CRTs targeting the top 3 leading causes of death?
- How often are HTE analyses conducted for demographic subgroups in published CRTs targeting the top 3 leading causes of death?
- How do these findings differ by clinical area, intervention type, and key characteristics of the population studied?
Definitions
- Centers for Disease Control and Prevention definition of leading causes of death: heart disease, chronic lower respiratory diseases, and cancer
- ICD 9/10 codes used to identify diseases in each clinical area
- Prevention trials were included only if they targeted the diseases of interest (e.g. a study aimed at preventing coronary artery disease, but not a study aimed at preventing diabetes or hypertension).
Literature Flow Diagram
- 1939 citations identified by literature search (PubMed: 1422; Embase: 517)
- 257 duplicates removed, leaving 1682 unique citations
- 1565 abstracts excluded; 117 passed abstract screening
- 52 articles excluded at full-text screening:
  - Not a publication type of interest (study protocol, editorial, systematic/nonsystematic review, meta-analysis, letter): 11
  - Not a cluster-randomized trial: 3
  - Study population is not individuals with a disease of interest: 24
  - Does not report any patient-level outcomes: 5
  - Not the primary/main report of the study results: 9
- 65 articles (65 studies) passed full-text screening and were abstracted:
  - Cardiovascular Disease: 18 studies
  - Chronic Lower Respiratory Disease: 31 studies
  - Cancer: 16 studies
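The counts in the flow diagram can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the numbers on the slide; the variable names are illustrative and not part of the original analysis.

```python
# Counts transcribed from the literature flow diagram.
identified = {"PubMed": 1422, "Embase": 517}
duplicates = 257
passed_abstract = 117
passed_full_text = 65
full_text_exclusions = {
    "publication type not of interest": 11,
    "not a cluster-randomized trial": 3,
    "population without a disease of interest": 24,
    "no patient-level outcomes": 5,
    "not the primary report": 9,
}
by_area = {"cardiovascular": 18, "chronic lower respiratory": 31, "cancer": 16}

# Each stage should equal the previous stage minus what was removed.
total_identified = sum(identified.values())
unique = total_identified - duplicates
abstracts_excluded = unique - passed_abstract
full_text_excluded = passed_abstract - passed_full_text

print(total_identified, unique, abstracts_excluded, full_text_excluded)
print(sum(full_text_exclusions.values()), sum(by_area.values()))
```

Every stage of the diagram is internally consistent: the exclusion reasons sum to the number of articles dropped at full-text review, and the three clinical areas sum to the abstracted studies.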
CRT Systematic Review: Study Characteristics
| Study Characteristics of CRTs | All (N=65) | Cancer (N=16) | Cardiovascular (N=18) | Pulmonary (N=31) |
|---|---|---|---|---|
| Geographic Location, n (%) | | | | |
|   US | 13 (20%) | 2 (12.5%) | 4 (22.2%) | 7 (22.6%) |
|   Non-US | 49 (75.4%) | 14 (87.5%) | 11 (61.1%) | 24 (77.4%) |
|   Mixed | 3 (4.6%) | 0 (0%) | 3 (16.7%) | 0 (0%) |
| Funding Source, n (%) | | | | |
|   Government | 20 (30.8%) | 5 (31.25%) | 5 (27.8%) | 10 (32.3%) |
|   Industry | 2 (3.1%) | 0 (0%) | 1 (5.5%) | 1 (3.2%) |
|   Non-Gov, Non-Industry | 6 (9.2%) | 1 (6.25%) | 2 (11.1%) | 3 (9.7%) |
|   Mixed sources | 22 (33.8%) | 6 (37.5%) | 5 (27.8%) | 11 (35.5%) |
|   Unclear | 15 (23.1%) | 4 (25%) | 5 (27.8%) | 6 (19.4%) |
| Setting, n (%) | | | | |
|   Clinic | 41 (63.1%) | 10 (62.5%) | 6 (33.3%) | 25 (80.6%) |
|   Hospital | 13 (20%) | 6 (37.5%) | 6 (33.3%) | 1 (3.2%) |
|   Emergency Medical Services | 6 (9.2%) | 0 (0%) | 4 (22.2%) | 2 (6.5%) |
|   School/Community | 2 (3.1%) | 0 (0%) | 1 (5.6%) | 1 (3.2%) |
|   Other | 3 (4.6%) | 0 (0%) | 1 (5.6%) | 2 (6.5%) |
| Intervention, n (%) | | | | |
|   Devices | 3 (4.6%) | 0 (0%) | 2 (11.1%) | 1 (3.2%) |
|   Drug or Biologic | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
|   Quality Improvement | 33 (50.8%) | 11 (68.75%) | 9 (50%) | 13 (41.9%) |
|   Behavioral Interventions | 12 (18.5%) | 2 (12.5%) | 4 (22.2%) | 6 (19.4%) |
|   Mixed Interventions | 7 (10.8%) | 2 (12.5%) | 0 (0%) | 5 (16.1%) |
|   Other | 10 (15.4%) | 1 (6.25%) | 3 (16.7%) | 6 (19.4%) |
| Number of Patients Enrolled, median (IQR) | 484.5 (263, 1108) | 297.5 (243.5, 505.5) | 1405 (637, 4307.5) | 427 (307, 878) |
| Number of Clusters Enrolled, median (IQR) | 40 (18.5, 95) | 17 (12, 87.8) | 98 (39, 174) | 39 (23, 49) |
Reporting of Baseline Characteristics in CRTs
| Baseline Characteristics | All (N=65) | Cancer (N=16) | Cardiovascular (N=18) | Pulmonary (N=31) |
|---|---|---|---|---|
| Baseline Characteristics Reported, n (%) | 62 (95.4%) | 16 (100%) | 18 (100%) | 28 (90.3%) |
| Age, n (%) reported | 56 (86.1%) | 14 (87.5%) | 18 (100%) | 24 (77.4%) |
| Sex, n (%) reported | 58 (89.2%) | 15 (93.8%) | 18 (100%) | 25 (80.6%) |
| Race and/or Ethnicity, n (%) reported | 12 (18.4%) | 3 (18.8%) | 4 (22.2%) | 5 (16.1%) |
|   All NIH Categories | 1 | 1 | 0 | 0 |
|   White Race Only | 3 | 0 | 2 | 1 |
|   1 Category, other than White race | 1 | 1 | 0 | 0 |
|   > 1 Race Category | 7 | 1 | 2 | 4 |
| Socioeconomic Status, n (%) reported | 25 (38.5%) | 9 (56.3%) | 5 (27.8%) | 11 (35.5%) |
|   Income Level | 3 | 0 | 1 | 2 |
|   Level of Education | 23 | 9 | 5 | 7 |
|   Social Class | 4 | 0 | 1 | 0 |
|   Insurance Status | 4 | 1 | 1 | 2 |
|   Other | 7 | 0 | 2 | 4 |
Primary Outcomes in CRTs
| Primary Outcomes | All (N=65) | Cancer (N=16) | Cardiovascular (N=18) | Pulmonary (N=31) |
|---|---|---|---|---|
| Primary Outcome Identified | 60 (92.3%) | 15 (93.8%) | 18 (100%) | 27 (87.1%) |
| Type of Outcome | | | | |
|   Patient Reported Outcome | 27 (45.0%) | 10 (66.7%) | 2 (11.1%) | 15 (55.6%) |
|   Clinical Outcome | 15 (25.0%) | 3 (20.0%) | 5 (27.8%) | 7 (25.9%) |
|   Mortality/Survival | 8 (30.0%) | 0 | 7 (38.9%) | 1 (3.7%) |
|   Process Outcome | 9 (15.0%) | 2 (13.3%) | 5 (27.8%) | 2 (6.5%) |
|   Economic Outcome | 1 (1.7%) | 0 | 0 | 1 (3.7%) |
|   Behavioral Outcome | 1 (1.7%) | 0 | 0 | 1 (3.7%) |
| Power Analysis Reported | 50 (76.9%) | 10 (62.5%) | 16 (88.9%) | 24 (77.4%) |
| ICC Presented | 35 (53.8%) | 7 (43.8%) | 10 (55.5%) | 18 (58.1%) |
| Mean ICC (range) | 0.05 (0.01-0.283) | 0.04 (0.01-0.05) | 0.09 (0.01-0.283) | 0.04 (0.01-0.1) |
| Statistical test for primary outcome | 61 (93.9%) | 14 (87.5%) | 17 (94.4%) | 30 (96.8%) |
| Statistical test accounts for clustering | 49 (75.4%) | 14 (87.5%) | 14 (77.8%) | 21 (66.7%) |
| Statistical test adjusts for baseline characteristics | 36 (55.4%) | 6 (37.5%) | 12 (66.7%) | 18 (58.1%) |
| Treatment effect present | 20 (30.8%) | 5 (31.25%) | 5 (27.8%) | 15 (48.4%) |
Heterogeneity Analysis in CRTs
| Subgroup Analyses for Primary Outcomes | All (N=65) | Cancer (N=16) | Cardiovascular (N=18) | Pulmonary (N=31) |
|---|---|---|---|---|
| Any Subgroup Analysis Performed | 14 (21.5%) | 2 (16.4%) | 9 (50%) | 3 (9.7%) |
| Demographic Subgroup Analysis | 6 | 0 | 4 | 2 |
|   Age | 4 | 0 | 3 | 1 |
|   Sex | 4 | 0 | 3 | 1 |
|   Race/Ethnicity | 0 | 0 | 0 | 0 |
|   Socioeconomic Status | 1 | 0 | 0 | 1 |
| Power analysis for Subgroup | 1 (1.5%) | 0 (0%) | 0 (0%) | 1 (3.2%) |
| Statistical Test | | | | |
|   Interaction Testing | 10 | 1 | 7 | 2 |
|   Other | 1 | 0 | 1 | |
|   Not Reported | 2 | 0 | 1 | 1 |
| HTE found for any subgroup, n | 3 | 0 | 3 | 0 |
| HTE found for demographic subgroup, n | 0 | 0 | 0 | 0 |
Quality Assessment of CRT Design (refs 4, 5)
| Quality Assessment | All (N=65) | Cancer (N=16) | Cardiovascular (N=18) | Pulmonary (N=31) |
|---|---|---|---|---|
| Justification for cluster design, n (%) | 23 (35.4%) | 8 (50%) | 4 (22.2%) | 11 (35.5%) |
| Uses at least 4 clusters per treatment group | 58 (89.2%) | 13 (81.25%) | 16 (88.9%) | 29 (93.5%) |
| Allows for clustering in sample size | 39 (60%) | 10 (62.5%) | 12 (66.7%) | 17 (54.8%) |
| Uses matching, stratification, or minimization | 34 (52.3%) | 6 (37.5%) | 11 (61.1%) | 17 (54.8%) |
| Allows for clustering in analysis | 49 (75.4%) | 14 (87.5%) | 14 (77.8%) | 21 (66.7%) |
References
4. Tokolahi et al. OTJR: Occupation, Participation and Health 2016;36(1):14-24
5. Eldridge S et al. Clinical Trials, 1(1), 80-90. doi:10.1191/1740774504cn006rr
Consolidated Standards of Reporting Trials (CONSORT) Guidelines for CRTs (ref 6)
| Reporting Quality | All (N=65) | Cancer (N=16) | Cardiovascular (N=18) | Pulmonary (N=31) |
|---|---|---|---|---|
| Cluster RCT in title | 36 (55.4%) | 11 (68.75%) | 8 (44.4%) | 17 (54.8%) |
| ICC estimate included | 39 (60%) | 10 (62.5%) | 10 (55.6%) | 19 (61.3%) |
| Lists number of clusters randomized | 61 (93.8%) | 15 (93.75%) | 17 (94.4%) | 29 (93.5%) |
| Describes baseline comparison of clusters | 21 (32.3%) | 5 (31.25%) | 5 (27.8%) | 11 (35.5%) |
| Describes baseline comparison of individuals | 62 (95.4%) | 16 (100%) | 18 (100%) | 28 (90.3%) |
| Average cluster size listed | 23 (35.4%) | 6 (37.5%) | 6 (33.3%) | 11 (35.5%) |
| Explains whether analysis conducted at the cluster or individual level | 38 (58.5%) | 10 (62.5%) | 8 (44.4%) | 20 (64.5%) |
| Reports on loss to follow-up of clusters | 56 (86.2%) | 13 (81.25%) | 17 (94.4%) | 26 (83.9%) |
| Reports on loss to follow-up of individuals | 60 (92.3%) | 15 (93.75%) | 17 (94.4%) | 28 (90.3%) |
| 100% Reporting Quality | 7 (10.8%) | 2 (12.5%) | 1 (5.5%) | 4 (12.9%) |
References
6. Campbell M et al. BMJ 2012;345:e5661. doi:10.1136/bmj.e5661
Systematic Review: Conclusions
- Demographic reporting was high for age and sex in CRTs, but less common for race and ethnicity (18.4%).
- NIH categories were rarely listed when race was reported.
- Most trials used matching or stratification on 1-2 cluster variables, but it was rare to ensure balance by demographics or individual covariates (1 study matched on age).
- Baseline adjustment in statistical models is common. Most adjust for age and over half adjust for sex. Race and ethnicity are rarely included.
- HTE analyses in general are very uncommon.
- HTE analyses for race and ethnicity did not occur in our sample.
Should We Care About Demographic Subgroup Imbalance?
Over the past 12 months, a team of statisticians in the biostatistics core:
- Derived theoretical expressions for treatment effect in CRTs
  - Continuous and binary outcomes
  - Imbalance and HTE
- Validated assumptions with computer simulation models
- Examined them in simulations based on real clinical trial data
Will focus on results from HF-ACTION trial data assessing a continuous outcome.
HF-ACTION Trial
- Multicenter, individually randomized controlled trial of exercise training vs usual care
- Patients with left ventricular dysfunction (ejection fraction ≤35%) and NYHA class II-IV symptoms despite optimal therapy for 6 weeks
- Patients randomized from April 2003 to Feb 2007 at 82 hospitals in the US, Canada, and France
- Intervention: 3 months of a supervised aerobic exercise program with a goal of 3 sessions per week for 36 sessions, followed by home-based exercise 5 times per week for 40 minutes
- Primary end point: all-cause mortality or all-cause hospitalization
- After adjusting for 4 important variables, exercise training was associated with a lower rate of all-cause mortality or all-cause hospitalization (HR 0.89, 95% CI 0.81-0.99, P=0.03)
- Trial found a significant difference in the secondary outcome of 6-minute walk test
O'Connor et al. JAMA. 2009;301(14):1439-1450. doi:10.1001/jama.2009.454
CRT Simulations in HF-ACTION
- Population restricted to include only Black and White patients: 2175/2331 patients (93.3%) of HF-ACTION
- Recreated a cluster randomized trial by assigning one intervention (treatment or usual care) to all patients in a cluster
- Each site served as a cluster
- 1000 CRTs created by randomly assigning intervention to cluster sites
- We examined the degree of imbalance and bias created in the recreated CRTs.
- For this analysis, we focus on the continuous outcome of 6-minute walking distance measured at 3 months.
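The re-randomization procedure described on this slide can be sketched as follows. This is a minimal illustration, not the study's code: the site sizes and racial mix are made up (HF-ACTION data are not reproduced here), the function name `simulate_imbalance` is hypothetical, and an equal split of sites between arms is an assumption, since the slide does not state the allocation ratio.

```python
import random

def simulate_imbalance(sites, n_trials=1000, seed=0):
    """Re-randomize whole sites to arms and record racial imbalance.

    sites: list of (n_patients, n_black) tuples, one per cluster.
    Returns, for each simulated CRT, the proportion of Black patients in
    the intervention arm minus the proportion in the usual-care arm.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_trials):
        order = sites[:]
        rng.shuffle(order)                 # random assignment of sites
        half = len(order) // 2             # assumption: equal split of sites
        treat, control = order[:half], order[half:]
        p_t = sum(b for _, b in treat) / sum(n for n, _ in treat)
        p_c = sum(b for _, b in control) / sum(n for n, _ in control)
        diffs.append(p_t - p_c)
    return diffs

# Hypothetical sites: sizes and racial mix are invented for illustration.
site_rng = random.Random(1)
sites = []
for _ in range(82):
    n = site_rng.randint(5, 60)
    n_black = round(n * site_rng.random())  # site-level proportion varies widely
    sites.append((n, n_black))

diffs = simulate_imbalance(sites)
print(len(diffs), round(max(abs(d) for d in diffs), 3))
```

Because whole sites move between arms together, sites with very different racial composition can leave the two arms noticeably imbalanced even though the assignment is random.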
Baseline Characteristics in HF-ACTION: Restricted to Blacks and Whites
| Selected Baseline Characteristics | Exercise Training (N=1075) | Usual Care (N=1100) | Total (N=2175) | p value |
|---|---|---|---|---|
| Age at randomization, mean (SD) | 59.4 (12.4) | 59.1 (13.0) | 59.3 (12.7) | 0.83 |
| Male Sex | 70.3% | 72.9% | 71.6% | 0.18 |
| Race | | | | 0.54 |
|   Black or African American | 35.1% | 33.8% | 34.4% | |
|   White | 698 (64.9%) | 728 (66.2%) | 1426 (65.6%) | |
| Distance walked at baseline (m), mean (SD) | 363.2 (102.8) | 365.1 (106.6) | 364.1 (104.7) | 0.71 |
Results of Walking Distance at 3 Months Using Linear Regression in HF-ACTION RCT
| Covariate | Estimate (meters) | 95% CI | P-value |
|---|---|---|---|
| Exercise Training | 23.6 | (11.4, 35.8) | <0.001 |
| Black Race | -32.41 | (-47.7, -17.1) | <0.001 |
| Treatment * Race Interaction | -10.1 | (-31.2, 11.0) | 0.349 |
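One way to read the interaction row is to convert the coefficients into race-specific treatment effects. The sketch below assumes reference-cell coding, with White race and usual care as the reference levels; the slide does not state the coding, so treat this as an illustration only.

```python
# Coefficients transcribed from the regression slide (meters).
coef = {"exercise_training": 23.6, "black_race": -32.41, "interaction": -10.1}

# Assumption: reference-cell coding, White race and usual care as references.
effect_in_white = coef["exercise_training"]
effect_in_black = coef["exercise_training"] + coef["interaction"]

print(round(effect_in_white, 1), round(effect_in_black, 1))  # 23.6 13.5
```

Under that coding the estimated effect of exercise training is 23.6 m in White patients and 13.5 m in Black patients, but the interaction confidence interval (-31.2, 11.0) includes zero, so the data do not establish that the effect differs by race.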
Racial Variation Among 82 HF-ACTION Sites
Imbalance Among Recreated CRT Using HF-ACTION Data Among the 1000 simulated CRTs, the mean ICC is 0.14, with min 0.07 and max 0.21
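To give a sense of what an ICC of this size means, the sketch below applies the standard design effect for equal-sized clusters, 1 + (m - 1) × ICC. Two assumptions are made for illustration: all 82 sites contribute to the restricted population of 2175 patients, and clusters are treated as equal in size, which real sites are not.

```python
def design_effect(mean_cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc

n_patients = 2175   # Black and White HF-ACTION patients
n_sites = 82        # assumption: every site remains after the restriction
mean_size = n_patients / n_sites

results = {}
for label, icc in [("min", 0.07), ("mean", 0.14), ("max", 0.21)]:
    deff = design_effect(mean_size, icc)
    results[label] = (deff, n_patients / deff)
    print(f"ICC {icc:.2f} ({label}): design effect {deff:.2f}, "
          f"effective sample size {n_patients / deff:.0f}")
```

At the mean ICC of 0.14 the design effect is roughly 4.6, so the recreated CRTs carry about as much information as an individually randomized trial of fewer than 500 patients.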
Bias Estimation in CRTs with Continuous Outcomes
$$ y_{ij} = \alpha x_{ij} + \theta t_{ij} + \beta x_{ij} t_{ij} + \varepsilon_{ij} $$
- Several effects, but the primary interest is to estimate the treatment effect ($\theta$) while accounting for the others.
- Treatment effect heterogeneity ($\beta$) is uncommon.
- Effects of covariates ($\alpha$), e.g. race, are common.
- Inadequate adjustment for covariates can lead to bias in the treatment effect estimate.
- Bias is proportional to the degree of covariate imbalance.
$$ \mathrm{Bias}_U = E(\hat{\theta}_U) - \theta = \alpha(\bar{X}_t - \bar{X}_c) + \beta(\bar{X}_t - \bar{X}_c) $$
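The proportionality between bias and covariate imbalance can be checked numerically. The sketch below is a simplified illustration, not the team's simulation: it sets the heterogeneity term to zero, draws individuals independently (no cluster structure), and uses made-up values for the covariate effect (`alpha`), treatment effect (`theta`), noise, and baseline covariate prevalence.

```python
import random

def unadjusted_bias(alpha, theta, imbalance, n_per_arm=20000, seed=0):
    """Simulate y = alpha*x + theta*t + noise and compare arm means.

    Returns (simulated bias of the unadjusted estimate,
             bias predicted by alpha * (mean x treated - mean x control)).
    Assumptions: no treatment-by-covariate interaction, binary covariate,
    independent individuals.
    """
    rng = random.Random(seed)
    p_control = 0.35                     # hypothetical covariate prevalence
    p_treated = p_control + imbalance

    def arm_means(t, p):
        total_y, total_x = 0.0, 0
        for _ in range(n_per_arm):
            x = 1 if rng.random() < p else 0
            total_y += alpha * x + theta * t + rng.gauss(0, 10)
            total_x += x
        return total_y / n_per_arm, total_x / n_per_arm

    y_t, x_t = arm_means(1, p_treated)
    y_c, x_c = arm_means(0, p_control)
    return (y_t - y_c) - theta, alpha * (x_t - x_c)

bias_by_imbalance = {}
for imbalance in (0.0, 0.07, 0.14, 0.20):
    simulated, predicted = unadjusted_bias(alpha=-30.0, theta=25.0,
                                           imbalance=imbalance)
    bias_by_imbalance[imbalance] = (simulated, predicted)
    print(f"imbalance {imbalance:.2f}: simulated {simulated:.2f}, "
          f"predicted {predicted:.2f}")
```

With a negative covariate effect, an arm that contains more of the covariate-positive group looks worse than it should, and the shortfall grows roughly linearly with the imbalance.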
Impact of Imbalance of Cluster Race on Bias in 6 Minute Walk Test at 3 Months
The Impact of Imbalance of Race on Bias in Treatment Effect: Adjustment Models
Summary of Bias in Treatment Effect Using Different Model Adjustment Scenarios
| Model Adjustment | 0% Imbalance | 7% Imbalance | 14% Imbalance | 20% Imbalance |
|---|---|---|---|---|
| Crude | -0.45 | -4.28 | -8.11 | -11.39 |
| Adj. for race | -0.62 | -2.82 | -5.02 | -6.9 |
| Adj. for race and other covariates | 0.03 | -1.65 | -3.33 | -4.77 |
| Pre-post difference in baseline walking distance | 0.63 | -1.65 | -3.92 | -5.87 |
Conclusion: Bias remains irrespective of adjustment method.
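To put these bias values in context, the sketch below computes the average change in bias per percentage point of imbalance for each adjustment scenario, and the bias at 20% imbalance as a share of the 23.6 m exercise-training estimate from the HF-ACTION regression. It assumes the bias values are in meters, like the regression estimates; the slide does not state the unit.

```python
imbalance = [0, 7, 14, 20]   # percent imbalance, as in the table
bias = {
    "crude": [-0.45, -4.28, -8.11, -11.39],
    "adjusted for race": [-0.62, -2.82, -5.02, -6.9],
    "adjusted for race and other covariates": [0.03, -1.65, -3.33, -4.77],
    "pre-post difference": [0.63, -1.65, -3.92, -5.87],
}
treatment_effect = 23.6      # meters, from the HF-ACTION regression slide

slopes, shares = {}, {}
for name, values in bias.items():
    # Average change in bias per percentage point of imbalance.
    slopes[name] = (values[-1] - values[0]) / (imbalance[-1] - imbalance[0])
    # Size of the bias at 20% imbalance relative to the treatment effect.
    shares[name] = abs(values[-1]) / treatment_effect
    print(f"{name}: {slopes[name]:.2f} per point of imbalance, "
          f"{shares[name]:.0%} of the treatment effect at 20% imbalance")
```

Under these assumptions the crude bias at 20% imbalance is close to half of the estimated treatment effect, and even the most heavily adjusted model leaves a bias of about a fifth of it.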
Relationship Between Imbalance and Cluster Size
Conclusions
- Imbalance of demographic subgroups may introduce significant bias in the treatment effect estimate.
- If the bias is greater than the treatment effect, it can nullify the results or reverse the apparent treatment effect.
- If HTE analyses by demographic subgroups are expected, imbalance issues may need to be addressed.
- Demographics should be consistently collected and reported.
- Strategies to ensure a balanced design may be needed if demographic HTE analyses are expected for relevant CRTs.
Acknowledgements Systematic Review Gillian Sanders, PhD Remy Coeytaux, MD, PhD Isaretta Riley, MD MPH Larry Jackson, MD MPH Hussein Al-Khalidi, PhD Amanda McBroom Brooks, PhD Kathryn Lallinger Jennifer Gierisch, PhD Simulation Modeling Kingshuk Roy Choudhury, PhD Siyun Yang, MS HF-ACTION investigators- data access Mentor Adrian Hernandez, MD, MHS Grant Management Jill George Tammy Reece
Disclosure
Research supported by Common Fund Research Supplements to Promote Diversity in Health-Related Research under Award Number 3U54AT007748-04S1 and the Health Care Systems Research Collaboratory Coordinating Center under Award Number 4U54AT007748-04 from the National Center for Complementary and Integrative Health, a center of the National Institutes of Health. The views presented here are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health.