Lessons learned for the conduct of a successful screening trial Christine D. Berg, M.D. Adjunct Professor Department of Radiation Oncology Johns Hopkins Medicine IOM State of the Science in Ovarian Cancer Research April 7, 2015
Disclosures Consultant for Medial Cancer Screening
Objectives Review PLCO and ovarian cancer screening results Contrast with NLST trial UKCTOCS Important design issues Important conduct issues Next steps
What is the PLCO? Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial Screening Centers: 10 Coordinating Center Participants: 154,935 Gender: 50:50 Age: 55-74 years Recruitment: 1993-2001 Screening: 1993-2006 Baseline risk factor questionnaire Dietary questionnaires Follow-up: Annual surveys Monitoring and QA Mortality searches Interim analyses regularly
PLCO Trial: Protocol
Participants 55-74 years of age; 13+ year follow-up
Randomization: 76,705 male; 78,237 female
Screened arm: 38,350 male; 39,115 female
  Chest X-ray (T0-T2, T3 for current and former smokers)
  Flexible sigmoidoscopy (T0, T5)
  39,115 female: CA-125 (T0-T5); transvaginal ultrasound (T0-T3)
  38,350 male: PSA (T0-T5); digital rectal examination (T0-T3)
Control arm: 38,355 male; 39,122 female
  Routine medical care
Ovarian Study Design Screening intervention: CA-125 annually for 6 years; abnormal if ≥35 U/mL. TVU annually for 4 years; abnormal if ovarian volume >10 cm³; cyst volume >10 cm³; any solid area or papillary projection extending into the cavity of a cystic ovarian tumor of any size; or any mixed (solid/cystic) component within a cystic ovarian tumor. 88% power to detect a 35% reduction in mortality. Average follow-up 12.4 years.
Females Randomized: 78,216 total
Intervention: 39,105 (34,253 ovaries intact; 4,852 prior bilateral oophorectomy)
Usual care: 39,111 (34,304 ovaries intact; 4,807 prior bilateral oophorectomy)
Figure 2. Ovarian Cancer Cumulative Cases and Deaths. Buys SS et al. JAMA 2011;305:2295-2303.
NLST design Prospective, randomized trial comparing low-dose helical CT (LDCT) screening with chest x-ray (CXR) screening for three annual screens, with the endpoint of lung cancer-specific mortality, in 53,454 high-risk participants. Eligibility: age 55-74; asymptomatic current or former smoker; ≥30 pack-year smoking history; former smokers must have quit within the preceding 15 years. Parameters: 90% power to detect a 20% difference in lung cancer mortality; α = 0.05. Median follow-up for outcomes ~6.5 years (maximum: 7.4). Vital status known for 97% (LDCT) and 96% (CXR). http://radiology.rsna.org/content/early/2010/10/28/radiol.10091808.full
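The power statement above can be turned into a rough count of lung cancer deaths needed. This is a minimal sketch, not the NLST's actual design calculation: it assumes 1:1 allocation, a two-sided test on the log rate ratio, and no adjustment for interim looks, non-compliance, or contamination. The function name `events_needed` is hypothetical.

```python
from math import log
from statistics import NormalDist


def events_needed(rate_ratio, power, alpha=0.05):
    """Approximate total deaths needed across both arms (assumed 1:1 allocation).

    Uses the normal approximation var(log RR) ~ 1/d_intervention + 1/d_control,
    with deaths split between arms in proportion to their rates.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = z.inv_cdf(power)
    numerator = (z_alpha + z_beta) ** 2 * (1 + rate_ratio) ** 2
    return numerator / (rate_ratio * log(rate_ratio) ** 2)


# 20% mortality difference (rate ratio 0.80), 90% power, alpha = 0.05
nlst_like = events_needed(rate_ratio=0.80, power=0.90)
print(f"Approximate total deaths needed: {nlst_like:.0f}")
```

Under these assumptions the answer is on the order of several hundred deaths, which is why a high-risk entry population matters so much for feasibility.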
Lung cancer mortality: 10-20-2010
Arm    Person-years (py)    Lung cancer deaths    Lung cancer mortality per 100,000 py
LDCT   144,102.6            356 (118)             247
CXR    143,367.5            443 (100)             309
Reduction in lung cancer mortality: 20.0%. Value of test statistic: 3.2; efficacy boundary: 2.033; p = 0.0041.
The deficit of lung cancer deaths in the CT arm exceeds that expected by chance, even allowing for multiple looks at the data. CXR arm compared with a matched cohort of 30,000 in PLCO: no benefit of CXR seen.
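The mortality rates and the relative reduction follow directly from the deaths and person-years reported on this slide. A short sketch of the arithmetic (the function name `rate_per_100k` is illustrative only):

```python
def rate_per_100k(deaths, person_years):
    """Mortality rate per 100,000 person-years."""
    return deaths / person_years * 100_000


# Deaths and person-years as reported on the slide
ldct_rate = rate_per_100k(356, 144_102.6)
cxr_rate = rate_per_100k(443, 143_367.5)

# Relative reduction in lung cancer mortality, LDCT vs CXR
reduction_pct = (1 - ldct_rate / cxr_rate) * 100

print(f"LDCT: {ldct_rate:.0f} per 100,000 py")
print(f"CXR:  {cxr_rate:.0f} per 100,000 py")
print(f"Relative reduction: {reduction_pct:.1f}%")
```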
UK Collaborative Trial of Ovarian Cancer Screening (UKCTOCS) Ian Jacobs, Usha Menon, Steven Skates etc. 202,638 women aged 50-74 enrolled from 2001 to 2006, screened until December 31, 2011 and followed until December 31, 2014. Three arms: control (101,359); multi-modal screening (50,640): annual CA125 with the risk of ovarian cancer algorithm applied, with women recalled for repeat CA125 and transvaginal ultrasound as second-line test; annual TVU (50,639). Results perhaps by end of 2015. Ongoing: UK Familial Ovarian Cancer Screening Study.
BRCA1 and BRCA2 Carriers Impact of oophorectomy on cancer incidence and mortality in women with a BRCA1 or BRCA2 mutation (Narod's cohort). Finch APM, Lubinski J, Moller P et al. JCO 2014:1547-1553. 5,783 carriers followed prospectively for an average of 5.6 years; 186 new ovarian, fallopian tube and peritoneal cancers. 3,660 patients followed: 108 diagnosed clinically with symptoms or screening. 1,390 chose prophylactic surgery: 46 women with occult cancers (27 ovarian, 18 primary fallopian tube and one peritoneal). Five-year survival rates for cancer cases: prophylactic group 91.6%; clinical group 38.4%. Reduction in risk: cancer, adjusted HR 0.20; overall mortality, adjusted HR to age 70 0.31. In my opinion, prophylactic surgery does very well and screening cannot come close.
FOCUS ON POPULATION BASED SCREENING QUESTION 1: Is it worth doing? QUESTION 2: Is RCT needed? How large? QUESTION 3: Performance characteristics of modalities to test? QUESTION 4: Trial conduct challenges Compliance/contamination Monitoring of harms
Change in Ovarian Cancer Incidence Yang HP, Anderson WF, Rosenberg PS et al JCO 2013:31;2146-2151.
Design Challenges Ovarian cancer is uncommon: 1 in 2,500 post-menopausal women per year, whereas NLST enrolled participants with a risk of 4 per 1,000 per year. A risk model to improve entry criteria for ovarian cancer would be highly useful. An adequately powered trial therefore needs a very large sample size, such as that of UKCTOCS; this is expensive and hard to do in the US. PLCO recruitment ran from 1993 to 2001; because of the high rate of TAH-BSO in the U.S., the entry criteria were changed to allow women with surgery, and the age range was expanded to add ages 55-59 due to slow accrual. NLST recruitment, by contrast, took 20 months: September 2002 to April 2004.
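To make the sample-size point concrete, the two annual risks on this slide can be compared directly. This is a back-of-the-envelope sketch only: it treats both figures as constant annual event rates and ignores attrition, mortality versus incidence, and screening efficacy. The variable names are illustrative.

```python
# Annual risks as quoted on the slide
ovarian_annual_risk = 1 / 2_500   # post-menopausal women
nlst_annual_risk = 4 / 1_000      # NLST-eligible smokers

# How many times rarer the ovarian outcome is
risk_ratio = nlst_annual_risk / ovarian_annual_risk

# Enrollment needed to expect as many events per year as NLST's 53,454
nlst_enrollment = 53_454
expected_nlst_events_per_year = nlst_enrollment * nlst_annual_risk
ovarian_enrollment_needed = expected_nlst_events_per_year / ovarian_annual_risk

print(f"Event rate is {risk_ratio:.0f}-fold lower for ovarian cancer")
print(f"Women needed for the same yearly event count: {ovarian_enrollment_needed:,.0f}")
```

On these assumptions an unselected ovarian trial needs roughly ten times the enrollment of NLST to accrue events at the same pace, which is the case for a better risk model at entry.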
Risk Prediction for Breast, Endometrial and Ovarian Cancers Pfeiffer RM, Park Y, Kreimer AR et al. PLoS Med 2013;10(7):e1001492. The ovarian model included oral contraceptive use, menopausal hormone therapy (MHT), parity, and family history of breast or ovarian cancer. White, non-Hispanic women aged 50+ from the PLCO and the AARP Diet and Health Study; validated in the Nurses' Health Study. 10-year absolute risk ranged from 0.28% to 0.96%; AUC 0.59; discriminatory accuracy may not be as good due to the decline in MHT use. NLST applied to PLCO: AUC 0.689. The model could help enroll fewer patients into a trial, but many cancers will still occur outside enrollment criteria.
Design Challenges Multiple histologies with varied natural history. NLST subset analysis: benefit varied by histology. Adenocarcinoma: 0.75 (95% CI 0.60-0.94); LDCT deaths 136 vs CXR 181. Squamous cell carcinoma: 1.23 (95% CI 0.92-1.64); LDCT deaths 102 vs CXR 83. Multiple biomarkers may be needed; imaging may discover more histologies.
Performance Characteristics of chosen tests Were the performance characteristics of CA125 good enough to launch a large study? Sensitivity of CA125, estimated in two case-control studies at a level of 35 U/mL, was 20-57% for cases occurring within 3 years, with specificity of 95%. In women with ovarian masses, levels were elevated in 68-100% of cases and in 40-50% of those with Stage I disease. TVU technology was rather early in its development. How well does a chosen test (or tests) need to perform? With 98% sensitivity and 99% specificity: screen 250,000 women and expect 100 cancers; detect 98 cancers, and 2,499 women test false-positive.
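The closing example on this slide can be reproduced, and extended to a positive predictive value, with a few lines of arithmetic. The sketch below assumes the 100 expected cancers correspond to a prevalence of 1 in 2,500 among those screened; the function name `screening_yield` is hypothetical.

```python
def screening_yield(n_screened, prevalence, sensitivity, specificity):
    """Return (cancers, true positives, false positives, PPV) for one screening round."""
    cancers = n_screened * prevalence
    true_pos = cancers * sensitivity
    false_pos = (n_screened - cancers) * (1 - specificity)
    ppv = true_pos / (true_pos + false_pos)
    return cancers, true_pos, false_pos, ppv


cancers, true_pos, false_pos, ppv = screening_yield(
    n_screened=250_000,
    prevalence=1 / 2_500,   # assumed from "250,000 women, 100 cancers"
    sensitivity=0.98,
    specificity=0.99,
)

print(f"Cancers expected: {cancers:.0f}")
print(f"Detected:         {true_pos:.0f}")
print(f"False positives:  {false_pos:.0f}")
print(f"PPV:              {ppv:.1%}")
```

Even with a test this good, fewer than 1 in 25 positive results reflects a cancer, because the disease is so rare in the screened population.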
Conduct Challenges Compliance and Contamination in Ovarian Arm of PLCO. Compliance with CA-125: assumed >90%; observed 85% at baseline, 79% at year 4, 73% at year 6. Compliance with TVU: assumed >85%; observed 84% at baseline, 78% at year 4. Contamination: assumed <10%; observed CA-125 1.4-1.8%, TVU 2.9-4.6%.
Conduct challenges Contamination may be more of a problem if there is an exciting new modality of moderate cost and risk; the population at risk of ovarian cancer probably has more health-seeking behaviors than heavy smokers. NLST: compliance 95% in LDCT arm, 93% in CXR arm; contamination of the CXR arm with CT was 4.3%. Prostate arm of PLCO: 34% had a PSA test prior to entry; control-arm contamination (PSA test within the previous 12 months) rose from 40% in the first year to 52% in the sixth year; intervention-arm compliance: 85%.
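Why compliance and contamination matter can be illustrated with a simple dilution model. This is a sketch under strong assumptions, not an analysis from either trial: screening is assumed to lower mortality only in people actually screened, baseline risk is assumed equal across everyone, and the 35% "true" reduction is purely illustrative. The function name `observed_reduction` is hypothetical.

```python
def observed_reduction(true_reduction, compliance, contamination):
    """Intention-to-treat mortality reduction under a simple dilution model.

    compliance:    fraction of the intervention arm actually screened
    contamination: fraction of the control arm screened outside the trial
    """
    intervention_rate = 1 - compliance * true_reduction
    control_rate = 1 - contamination * true_reduction
    return 1 - intervention_rate / control_rate


TRUE_REDUCTION = 0.35  # illustrative assumption, not a trial result

# Design assumptions for the ovarian arm (>90% compliance, <10% contamination)
planned = observed_reduction(TRUE_REDUCTION, compliance=0.90, contamination=0.10)
# Year-6 CA-125 compliance with the upper CA-125 contamination figure
ovarian_late = observed_reduction(TRUE_REDUCTION, compliance=0.73, contamination=0.018)
# Prostate-arm-like conduct: 85% compliance, 52% control-arm contamination
prostate_like = observed_reduction(TRUE_REDUCTION, compliance=0.85, contamination=0.52)

for label, value in [("planned", planned),
                     ("ovarian, year 6", ovarian_late),
                     ("prostate-like", prostate_like)]:
    print(f"{label:16s} observable reduction: {value:.1%}")
```

In this model heavy control-arm contamination shrinks the observable effect far more than a moderate drop in compliance, so a trial powered on optimistic conduct assumptions can end up badly underpowered.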
Conduct challenges Procedures to assess abnormal findings frequently involve surgery, with its risks. PLCO: 34,253 ovary screening patients; 3,285 (9.5%) had false positives, of whom 1,080 (32%) underwent surgery; 163 (15% of those, 0.47% overall) had major complications. NLST: 26,309 in LDCT arm; 10,287 (39.1%) had a positive test; 713 (6.9%) had surgical procedures; 87 (12.2%, 0.33% overall) had major complications. Use non-invasive evaluation as often as possible to rule out malignancy, and monitor closely during the trial.
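The harm cascade for the two trials can be recomputed from the counts on this slide. A minimal sketch follows; the function name `harm_cascade` is illustrative. For PLCO the "positives" are false positives, as on the slide, and the computed percentages may differ slightly from the rounded figures quoted.

```python
def harm_cascade(screened, positives, surgeries, major_complications):
    """Step-by-step proportions from screening to major surgical complication."""
    return {
        "positive_rate": positives / screened,
        "surgery_among_positives": surgeries / positives,
        "complications_among_surgeries": major_complications / surgeries,
        "complications_per_screened": major_complications / screened,
    }


# Counts as reported on the slide
plco_ovary = harm_cascade(34_253, 3_285, 1_080, 163)
nlst_ldct = harm_cascade(26_309, 10_287, 713, 87)

for name, cascade in [("PLCO ovary", plco_ovary), ("NLST LDCT", nlst_ldct)]:
    print(name)
    for step, value in cascade.items():
        print(f"  {step:32s} {value:.2%}")
```

Fewer PLCO participants had a positive screen, but about a third of them went on to surgery compared with about 7% in NLST, so the overall rate of major complications per person screened ends up higher in the ovarian arm.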
If you get what you want, then what? Implementation of a successful screening test has many challenges. An important one is the population to be screened: should it be all post-menopausal women? Women who meet trial entry criteria? CT screening recommendations have followed NLST criteria with higher age limits; however, the PLCOm2012 model demonstrates that with better selection fewer people can be screened with more cancers detected (8.8% fewer and 12.4% more, respectively). Tammemagi et al. PLoS Medicine 2014;11:e1001764.
CONCLUSIONS Need a randomized controlled trial for ovarian cancer screening, focused on ovarian cancer only. If UKCTOCS shows serial screening works, then would need to reevaluate. Need good preliminary performance data for the screening modality or modalities chosen. An improved risk model, with GWAS or other data added, may improve patient selection. Adequate sample size and accrual as rapid as possible, which will require money. Attention to detail during the trial: close monitoring of compliance/contamination, morbidity/mortality, and QA/QC.
With Appreciation Foremost, the patients who participated in these trials, without whom they could not have been conducted. NLST: National Lung Screening Trial, National Cancer Institute
Back-up Slides
Limitations PLCO was studying CA125 as a screening tool, which introduced bias: the natural history of the marker to onset of clinical disease was altered. Estimates of the performance of CA125 (and markers correlated with CA125) may be biased downward in specimens more than 1 year remote from diagnosis. Low sample size, particularly when stratified by histology. Missing T3 research specimens. EDRN/SPORE work was done in clinical samples, at which time acute phase reactants and other manifestations of advanced disease may affect markers.
EDRN/SPORE/PLCO Biomarker Analysis
Single Marker Analysis at 95% Specificity
Marker           Sensitivity
CA125            0.73
HE4              0.57
Transthyretin    0.47
CA15.3           0.46
CA72.4           0.40
Panel Analysis at 98% Specificity
Panel                                                     Sensitivity (%)
CA125 alone                                               64.6
CA125, IGF2, Leptin, MIF, Osteopontin, Prolactin          32.8
CA125, B7-H4, CA15-3, CA19-9, CA72-4, HE4                 64.6
CA125, HE4, IGFBP2, Mesothelin, MMP-7, SLPI, Spondin2     25.4
CA125, Apo-A1, β2m, CTAPIII, Transthyretin                52.3
Lessons Learned Meticulous attention required to many details: specimen acquisition; study design; analytic performance; quality control. Biomarker levels and patterns at the time of known clinical diagnosis are not directly comparable with those measured pre-clinically. Difficult to establish an appropriate resource of preclinical specimens of sufficient quality and numbers. PLCO: 67 cases of various histologic types with specimens suitable for assessment of biomarker performance in screening, out of a cohort of 75,000 women.