
The Women's Health Initiative: Lessons Learned

Ross L. Prentice and Garnet L. Anderson
Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109-1024; email: rprentic@whi.org, garnet@whi.org

Annu. Rev. Public Health 2007. 29:131–50
The Annual Review of Public Health is online at http://publhealth.annualreviews.org
This article's doi: 10.1146/annurev.publhealth.29.020907.090947
Copyright © 2007 by Annual Reviews. All rights reserved
0163-7525/08/0421-0131$20.00

Key Words
calcium/vitamin D, dietary fat, disease prevention, postmenopausal hormones, randomized controlled trial, specimen repository

Abstract
The Women's Health Initiative (WHI) was initiated in 1992 as a major disease-prevention research program among postmenopausal women. The program includes a randomized controlled intervention trial involving 68,132 women and four distinct interventions: conjugated equine estrogens, alone or in combination with medroxyprogesterone acetate, for coronary heart disease prevention with breast cancer as an anticipated adverse effect; a low-fat eating pattern for breast and colorectal cancer prevention; and calcium and vitamin D supplementation for hip fracture prevention. Results from this multifaceted trial have made a substantial impact in clinical practice. A companion cohort study among 93,676 women serves as a source for new risk factor information and provides a comparative observational assessment of the clinical trial interventions. A specimen repository and quality-controlled outcome data for a range of diseases are among the resources that support the ongoing research program. WHI clinical trial contributions and challenges are reviewed and discussed.

WHI: Women's Health Initiative
CT: clinical trial
OS: observational study
DM: dietary modification
HT: hormone therapy
CHD: coronary heart disease
E-alone: conjugated equine estrogens
E+P: conjugated equine estrogens plus medroxyprogesterone acetate
CaD: calcium and vitamin D

INTRODUCTION

The Women's Health Initiative (WHI) is perhaps the most ambitious population research investigation ever undertaken. The centerpiece of the WHI program is a randomized, controlled clinical trial (CT) to evaluate the health benefits and risks of four distinct interventions among 68,132 postmenopausal women in the age range 50–79 at randomization. Participating women were identified from the general population living in proximity to any of the 40 participating clinical centers throughout the United States. The WHI program also includes an observational study (OS) that comprised 93,676 postmenopausal women recruited from the same population base as the CT. Enrollment into the WHI began in 1993 and concluded in 1998.

Intervention activities in the estrogen plus progestin component of the CT ended early on July 8, 2002, when evidence had accumulated that the health risks exceeded the benefits for this study population. Intervention activities in the estrogen-alone component of the CT also ended early, on February 29, 2004, in part because of an increased risk of stroke. Intervention activities in the other two CT components ended as planned on March 31, 2005. Nonintervention follow-up on participating women is ongoing through 2010, which will give an average follow-up duration of 13 years in the CT and 12 years in the OS.

The CT used a partial factorial design (Figure 1). Participating women met eligibility for, and agreed to be randomized to, either the dietary modification (DM) or one of the postmenopausal hormone therapy (HT) components, or to both the DM and HT. The DM component randomly assigned 48,835 eligible women to either a low-fat eating pattern (40%) or self-selected dietary behavior (60%), with breast cancer and colorectal cancer incidence as designated primary outcomes and coronary heart disease (CHD) incidence as a secondary outcome. The nutrition goals for women assigned to the DM intervention group were to reduce total dietary fat to 20% of corresponding daily calories, to increase daily servings of vegetables and fruits to at least five and of grain products to at least six, and to maintain these changes throughout the trial intervention period.

The HT component of the CT consisted of two parallel randomized, double-blind, placebo-controlled trials among 27,347 women; CHD was the primary outcome, hip and other fractures were secondary outcomes, and breast cancer was the primary safety outcome. Of these, 10,739 women had a hysterectomy prior to randomization, in which case there was a randomized allocation between conjugated equine estrogens (E-alone) 0.625 mg/day or placebo. The remaining 16,608 women, each having a uterus at baseline, were randomized (aside from an early assignment of 331 of these women to E-alone) to the same preparation of estrogen plus 2.5 mg/day of medroxyprogesterone (E+P) or placebo. A total of 8050 women were randomized to both the DM and HT CT components.

At their one-year anniversary from DM and/or HT trial enrollment, all CT women were further screened for possible randomization in the calcium and vitamin D (CaD) component, a randomized, double-blind, placebo-controlled trial of 1000 mg elemental calcium plus 400 international units of vitamin D3 daily vs. placebo. Hip fracture was the designated primary outcome for the CaD component, with other fractures and colorectal cancer as secondary outcomes. In total, 36,282 (53.3% of CT enrollees) women were randomized to the CaD component.
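
The component counts quoted above can be reconciled with the overall CT total by simple arithmetic. The following is a minimal sketch, assuming that women randomized to both DM and HT should be counted once in the CT total; the variable names are illustrative and not part of the WHI protocol.

```python
# Enrollment counts as reported in the text above.
dm = 48_835              # dietary modification (DM) component
ht_e_alone = 10_739      # HT trial, women with prior hysterectomy
ht_e_plus_p = 16_608     # HT trial, women with a uterus at baseline
both_dm_and_ht = 8_050   # women randomized to both DM and HT
cad = 36_282             # calcium and vitamin D (CaD) component

ht = ht_e_alone + ht_e_plus_p

# Inclusion-exclusion: women in both DM and HT are counted only once.
ct_total = dm + ht - both_dm_and_ht

# Share of CT enrollees who were also randomized to CaD, in percent.
cad_share = 100 * cad / ct_total

print(ht)                   # 27347
print(ct_total)             # 68132
print(round(cad_share, 1))  # 53.3
```

The totals agree with the 27,347 HT participants, the 68,132 CT participants, and the 53.3% CaD share stated in the text.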
Postmenopausal women ages 50–79 years who were screened for the CT but proved to be ineligible or unwilling to be randomized were offered the opportunity to enroll in the OS. The OS is intended to provide additional knowledge about risk factors for a range of diseases, including cancer, cardiovascular disease, and fractures. It has an emphasis on biological markers of disease risk and on risk-factor changes as modifiers of risk. Table 1 provides information on enrollment by age group in the various WHI components.

Table 1 Women's Health Initiative sample sizes [% of total] by age group

                                    Postmenopausal hormone therapy
Age group   Dietary modification   With uterus (E+P a)   Without uterus (E-alone a)   Calcium and vitamin D   Observational study
50–54        6,961 [16]             2,029 [14]            1,396 [15]                    5,157 [16]             12,386 [15]
55–59       11,043 [25]             3,492 [23]            1,916 [20]                    8,265 [25]             17,321 [20]
60–69       22,713 [52]             7,512 [50]            4,852 [50]                   16,520 [51]             41,196 [49]
70–79        8,118 [19]             3,575 [24]            2,575 [26]                    6,340 [19]             22,773 [26]
Total       48,835                 16,608                10,739                        36,282                  93,676

a Abbreviations: E-alone, estrogen alone; E+P, estrogen plus progestin.

In addition to the 40 participating clinical centers, the WHI program is implemented through a clinical coordinating center based at the Fred Hutchinson Cancer Research Center in Seattle. Several components of the National Institutes of Health sponsor and oversee the WHI program, with the National Heart Lung and Blood Institute (NHLBI) taking a coordinating role.

A research program as massive, complex, and lengthy as the WHI involves a number of challenges. Some of these are described here with an emphasis on results from each of the CT comparisons and on related reporting and communication issues. Additional detail on the background, design, and organization of the WHI has been published elsewhere (18, 56, 57, 71).

HORMONE THERAPY TRIALS

Background and Design

The primary hypothesis of the HT trials regarding CHD benefit arose from a considerable literature derived from observational studies, animal studies, and intermediate endpoint trials. Observational studies suggested as much as a 45% reduction in CHD risk could be achieved with HT (4, 12, 28, 62). The E-alone and E+P trials were designed (71) to have projected power of 81% and 88%, respectively, to detect a more conservative 22% reduction in CHD incidence.
The conservatism in the effect size was intended to account for both anticipated lack of adherence to study pills and loss to follow-up in the trial and potential anticonservatism in the estimated effects obtained from observational studies.

At trial initiation HT had been approved by the U.S. Food and Drug Administration for the treatment of both menopausal symptoms and osteoporosis. HT was one of the most commonly prescribed medications in the United States: Prescriptions increased to 90 million annually by 2002 (33). Nonetheless, no randomized trial had been conducted with a goal of demonstrating the efficacy of HT for fracture prevention. Hip fracture was therefore designated as a secondary outcome of the trial. The power for observing a 21% reduction in hip fracture rates was somewhat lower than for CHD (65% for E-alone and 73% for E+P) but was excellent (>99% for both trials) for testing effects on a combined osteoporotic fracture outcome.

On the basis of a wealth of observational studies showing a modest increase (~20%) in breast cancer rates with longer-term exposure to exogenous hormone therapy (4, 64), breast cancer incidence was the designated primary safety outcome of the trials. The power to detect a 22% increase in breast cancer in the trials was relatively low (46% for E-alone and 54% for E+P) during the planned nine-year follow-up period. Two options were identified to address this potential safety concern. First, if both therapies yielded similar effects, the results from the two trials could be pooled, which would give nearly 80% power. A second option outlined in the protocol and motivated particularly by the hypothesized lag time to intervention effects was to continue follow-up without intervention for another 5 years. In this approach, the power to detect a 22% increase was 79% and 87% for E-alone and E+P, respectively. Effects on multiple other disease processes as summarized in Table 2 were also of interest because of results from previous studies.

A feature in both the power calculations and the corresponding statistical analysis plan was the acknowledgment of an anticipated lag-time to full intervention effect via a weighted log rank statistic. The purpose of the weighted log rank test was to improve power. Because differences observed very early in the intervention period were more likely attributable to chance, these early events would be given less weight than events occurring later in the intervention period. The weighting function was defined to rise linearly from zero at randomization to a plateau of one after three years postrandomization for cardiovascular and fracture outcomes and after ten years for cancer and mortality outcomes.

Results from the NHLBI-sponsored Postmenopausal Estrogen and Progestin Intervention (PEPI) trial (48, 50) generally supported trial hypotheses and contributed to an early change in the E+P trial design because of adherence concerns related to the use of E-alone among women with a uterus (49).

Table 2 Hazard ratios (HR) and 95% confidence intervals (CIs) for various clinical outcomes in the estrogen plus progestin and estrogen-alone trials a

                                      E+P                            E-alone
Outcome (references)                  HR       95% CI       AR       HR       95% CI       AR
CHD (39, 45)                          1.24     1.00–1.54    +6       0.95     0.79–1.15    −3
Stroke (32, 68)                       1.31     1.02–1.68    +8       1.37     1.09–1.73    +12
Pulmonary embolism (20, 21)           2.13     1.45–3.11    +10      1.37     0.90–2.07    +4
Venous thromboembolism (20, 21)       2.06     1.57–2.70    +18      1.32     0.99–1.75    +8
Breast cancer (15, 63)                1.24     1.02–1.50    +8       0.80     0.62–1.04    −6
Colorectal cancer (16, 70)            0.56     0.38–0.81    −7       1.08     0.75–1.55    +1
Endometrial cancer (1)                0.81     0.48–1.36    −1       NA
Hip fractures (13, 44)                0.67     0.47–0.96    −5       0.65     0.45–0.94    −7
Total fractures (13, 44)              0.76     0.69–0.83    −47      0.71     0.64–0.80    −53
Total mortality (70, 74)              0.98 c   0.82–1.18    −1       1.04 c   0.91–1.12    +3
Global index b (70, 74)               1.15 c   1.03–1.28    +19      1.01 c   1.09–1.12    +2
Diabetes (9, 46)                      0.79     0.67–0.93             0.88     0.77–1.01
Gall bladder disease (17)             1.59     1.28–1.97             1.67     1.35–2.06
Stress incontinence (31)              1.87     1.61–2.18             2.15     1.77–2.62
Urge incontinence (31)                1.15     0.99–1.34             1.32     1.10–1.58
Peripheral artery disease (37, 38)    0.89     0.63–1.25             1.32     0.99–1.77
Probable dementia (60, 61)            2.05     1.21–3.48             1.49     0.83–2.66

a Abbreviations: AR, attributable risk per 10,000 person years; E+P, estrogen plus progestin; E-alone, estrogen alone; HR, hazard ratio; CI, confidence interval. Hazard ratio estimates are based on proportional hazards analysis stratified by age (five-year categories) and randomization in the dietary modification trial.
b Global index defined for each woman as the time to the earliest diagnosis of CHD, stroke, pulmonary embolism, breast cancer, colorectal cancer, endometrial cancer (for E+P), hip fractures, and death from other causes.
c Based on an average 5.2 and 6.8 years of follow-up for E+P and E-alone, respectively. All others based on an average of 5.6 (E+P) and 7.1 (E-alone) years of follow-up.

Trial Findings

The E+P trial was terminated on the advice of the independent Data and Safety Monitoring Board after a mean 5.2 years of follow-up because of an increased risk of breast cancer and an overall assessment of harms exceeding benefits for chronic disease prevention (2, 69, 74). These interim data, also summarized in (5), revealed an early adverse effect on CHD rates, which was consistent with the Heart Estrogen/Progestin Study (HERS) secondary prevention trial results (41), and continuing adverse effects on risk of stroke and venous thromboembolic (VT) disease, which were not offset by the observed reductions in hip fracture and colorectal cancer incidence rates. A prespecified global index, devised to assist in benefit vs. risk monitoring and defined for each woman as time to the first event for any of the designated clinical events (CHD, stroke, pulmonary embolism, breast cancer, colorectal cancer, endometrial cancer, hip fractures, or death from other causes), found a 15% increase in the risk of women experiencing one or more of these events.

A summary of the most complete trial results published (Table 2), in most cases including all events occurring through the blinded intervention period ending July 7, 2002 (mean follow-up 5.6 years), confirmed the interim findings. The observed 24% increase in breast cancer incidence (15), the 31% increase in risk of stroke (68), and the doubling of VT rates (21) with E+P suggest attributable risks of 8, 8, and 18 per 10,000 person-years, respectively, in this population. Benefits of 7 fewer colorectal cancer cases (44% reduction) (16), 5 fewer women with hip fractures (33% reduction), and 47 fewer women with any osteoporotic fractures (24% reduction) (13) per 10,000 person-years were found.
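
The relative and absolute effect summaries quoted above are two views of the same E+P results. The short sketch below shows how the percentages follow from the hazard ratios in Table 2 and how an attributable risk per 10,000 person-years rescales to events per 1,000 women per year. The helper names are illustrative, negative attributable risks encode the "fewer cases" reported in the text, and treating one person-year as one woman followed for one year is a simplifying assumption.

```python
def percent_change(hr: float) -> float:
    """Relative change in hazard implied by a hazard ratio, in percent."""
    return (hr - 1.0) * 100.0


def per_1000_women_per_year(ar_per_10000_py: float) -> float:
    """Rescale an attributable risk per 10,000 person-years to per 1,000."""
    return ar_per_10000_py / 10.0


# E+P values from Table 2: outcome -> (HR, AR per 10,000 person-years).
e_plus_p = {
    "breast cancer": (1.24, 8),
    "stroke": (1.31, 8),
    "venous thromboembolism": (2.06, 18),
    "colorectal cancer": (0.56, -7),
    "hip fractures": (0.67, -5),
    "total fractures": (0.76, -47),
}

for outcome, (hr, ar) in e_plus_p.items():
    print(
        f"{outcome}: {percent_change(hr):+.0f}% relative change, "
        f"{per_1000_women_per_year(ar):+.1f} events per 1,000 women per year"
    )
```

A hazard ratio of 2.06 corresponds to a 106% increase, which the text describes as a doubling of VT rates.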
The observed 24% increase in CHD risk was not quite statistically significant, but the confidence intervals (CIs) clearly rule out any meaningful benefit (45) during this intervention period. A nonsignificant 19% reduction in endometrial cancer incidence and a nonsignificant 56% increase in risk of ovarian cancer were reported (1). No effect was observed on total mortality (74).

Approximately 18 months later, the National Institutes of Health (NIH) stopped the E-alone trial. At that point, the Data and Safety Monitoring Board had reached an impasse with regard to stopping or continuing the trial (2, 69). With a mean 6.8 years of follow-up, the E-alone hazard ratio (HR) comparing incidence rates for CHD in the E-alone arm to that in the placebo group was less than one, but with only a year of follow-up remaining and 50% of participants no longer taking study hormones, the likelihood of finding a statistically significant reduction was small. The E-alone effect on stroke risk was approaching the stopping boundary for harm, and the HR was almost identical to that observed with E+P, adding credence to the results. Estrogen alone appeared to increase the risk of thromboembolic events but to a lesser extent than was observed with E+P. The two therapies were closely comparable in their decreased HRs for hip, vertebral, and other fractures. Most surprising was an estimated 23% reduction in breast cancer incidence, which narrowly missed being statistically significant, in stark contrast with the increased risk seen in most observational studies and the E+P trial. The lack of an effect of E-alone on colorectal cancer risk was also inconsistent with previous studies and the sister trial. E-alone HRs for total mortality and the global index were close to one, indicating an overall balance in the number of women randomized to E-alone or to placebo who experienced these designated outcome events. After additional review, the NIH made the decision to stop on the basis of the assessment of an increased risk in stroke and the low probability of detecting either a benefit for CHD or an adverse effect on breast cancer (70).

The most complete results available from the E-alone trial (20, 32, 39, 44, 63) are summarized in Table 2, encompassing an average of 7.1 years of follow-up except where noted. A different and more balanced profile of effects is evident for E-alone than was observed with combined HT. Only the increased risk of stroke and the reduction in fracture risk reached statistical significance. The HRs for both the global index and total mortality are close to one.

Analyses of several auxiliary outcomes have also been reported, many of which parallel those from the HERS trial (27, 40). Both HT preparations were associated with a lower rate of new diagnoses of diabetes (9, 46), similar increases in risks of gall bladder disease (17), and adverse effects on urinary incontinence (31). The estimated effects on peripheral artery disease were mixed (37, 38). In the WHI Memory Study, an ancillary study among women 65 years and older, a significant increased risk of probable dementia was observed with E+P, and a smaller, nonsignificant increase was observed with E-alone (60, 61). No important effects on health-related quality of life were observed in either trial (11, 29).

Implication for Results Presentation

The initial E+P publication provided an overall summary of intervention effects on the major clinical outcomes (74). Technical aspects of this report caused confusion for some readers in interpreting the strength of these findings. An appreciation for the clinical importance of the findings was also somewhat challenging to portray. The data were primarily reported using HRs and 95% CI from Cox proportional hazard models, instead of using the weighted log rank statistics specified in the protocol, to provide interpretable estimates of effect size across all the multiple outcomes.
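
For readers unfamiliar with the protocol-specified statistic, the following is a minimal sketch of a two-group weighted log rank test using the linear ramp weight described under Background and Design (zero at randomization, one after a plateau of three or ten years). It is a textbook form of the statistic written for illustration, not the WHI analysis code; the function names and the toy data are hypothetical.

```python
import math


def ramp_weight(t_years, plateau_years):
    """Weight rising linearly from 0 at randomization to 1 at the plateau."""
    return min(max(t_years, 0.0) / plateau_years, 1.0)


def weighted_logrank_z(times, events, groups, plateau_years):
    """Weighted log rank Z statistic comparing group 1 with group 0.

    times  : follow-up time in years for each participant
    events : 1 if the outcome occurred at that time, 0 if censored
    groups : 1 for active treatment, 0 for placebo
    """
    score, variance = 0.0, 0.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Participants still under observation just before time t.
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n, n1 = len(at_risk), sum(at_risk)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d1 = sum(
            1 for ti, e, g in zip(times, events, groups)
            if ti == t and e == 1 and g == 1
        )
        w = ramp_weight(t, plateau_years)
        # Observed minus expected events in group 1, down-weighted early on.
        score += w * (d1 - d * n1 / n)
        if n > 1:
            variance += w**2 * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return score / math.sqrt(variance) if variance > 0 else 0.0


# Hypothetical toy data, not WHI data.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
groups = [1, 0, 1, 0, 1, 0]

z_cardio = weighted_logrank_z(times, events, groups, plateau_years=3)
z_cancer = weighted_logrank_z(times, events, groups, plateau_years=10)
print(z_cardio, z_cancer)
```

Setting every weight to one recovers the ordinary log rank test, which is the score test associated with the unweighted proportional hazards model.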
Furthermore, the underlying assumption of the weighting function for cardiovascular outcomes was clearly contradicted; that is, any protective effect had not emerged by three years from randomization, so an unweighted analysis was used to provide the most transparent summary of the data. For simplicity, the reporting of other outcomes, including breast cancer, used the same unweighted proportional hazards model regardless of whether the assumptions behind the weighting were satisfied. These unweighted hazard ratios provided a conservative view of the statistical significance of the effect on breast cancer because the lower limit of the corresponding CI was 1, even though the weighted log rank test was highly statistically significant (p = 0.001).

Another technical point was related to the inclusion of two sets of CIs, nominal and adjusted. The commonly used nominal or unadjusted CIs test the treatment effect on each outcome separately at the 5% level. Because they do not acknowledge the multiple outcomes or multiple looks during trial monitoring, the probability is greater than 5% that one or more will be significant by chance alone. The adjusted CIs, in contrast, control the experiment-wise type I error rate. The widths of these adjusted CIs were obtained by incorporating both the Bonferroni correction for multiple outcomes and the group sequential corrections defined for trial monitoring. These adjusted confidence intervals provided a very conservative data summary, with less than a 5% chance that any of the comparisons will produce a false positive result. Neither approach precisely describes the variability associated with the hazard ratio estimates, but the two sets of CIs can be seen as bracketing the true values. See Anderson et al. (2) for additional details. In retrospect, a focus on the nominal CIs, with a cautionary note on the multiple testing issues, may have been more useful to some readers.
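
To give a sense of how much a multiplicity adjustment widens an interval, the sketch below applies a Bonferroni correction alone to a nominal 95% CI for a hazard ratio. This is only part of what the text describes: the adjusted CIs in the WHI report also incorporated group sequential corrections, which are not modeled here. The number of outcomes used in the example is a hypothetical placeholder, and the standard error is backed out of the nominal interval on the log scale.

```python
import math
from statistics import NormalDist


def bonferroni_ci(hr, lower, upper, n_outcomes, alpha=0.05):
    """Widen a nominal (1 - alpha) CI for a hazard ratio by Bonferroni.

    The standard error of log(HR) is recovered from the nominal interval,
    then the interval is rebuilt at level alpha / n_outcomes.
    """
    z_nominal = NormalDist().inv_cdf(1 - alpha / 2)
    z_adjusted = NormalDist().inv_cdf(1 - alpha / (2 * n_outcomes))
    se = (math.log(upper) - math.log(lower)) / (2 * z_nominal)
    log_hr = math.log(hr)
    return math.exp(log_hr - z_adjusted * se), math.exp(log_hr + z_adjusted * se)


# E+P breast cancer estimate from Table 2; n_outcomes=7 is an assumed value.
adj_lower, adj_upper = bonferroni_ci(1.24, 1.02, 1.50, n_outcomes=7)
print(round(adj_lower, 2), round(adj_upper, 2))
```

Under these assumptions the adjusted interval extends below one even though the nominal interval does not, which illustrates why the two sets of intervals could leave readers with different impressions of the same estimate.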
Additional methodology development is clearly needed to allow more precise adjustments to control the error rates that can arise from multiple hypothesis testing.

Both HRs and absolute risks were presented for each of the major outcomes (74), but these two effect summaries provide a somewhat different sense of the magnitude of effects. A 20%–30% increased risk of CHD, stroke, or breast cancer, especially when associated with a prevalent exposure such as E+P, is noteworthy and important for regulatory and policy purposes when considering at-risk populations. However, differences in absolute risk in the range of 1–2 events per year per 1000 women taking E+P seem small when put in the context of an individual woman seeking relief from severe vasomotor symptoms.

Clinical Implications

These trial data provide the most reliable summary of the risks and benefits of estrogen alone and estrogen plus progestin on chronic disease rates in healthy postmenopausal women currently available. Although the attributable risks in absolute terms for the outcomes of interest were generally small, the observed benefits were even smaller. The overall profile of HT effects is not consistent with recommendations for primary prevention because of the high sensitivity to safety concerns when intervening on otherwise healthy women.

The interpretation of the WHI results for clinical practice regarding women with menopausal symptoms remains controversial. For informing decisions regarding the management of symptomatic women, the limitation of the clinical trials should be noted. The trial populations represent those for whom chronic disease prevention, not symptom relief, was the primary objective. The primary results represent an average effect among the diverse study population. The small attributable risks and benefits for future clinical event rates may not be generalizable across populations and must be viewed against the potential for symptom relief in the present.

The possibility that either the risks or the benefits might be isolated to a subset of women to facilitate a risk-based decision process has been examined. For both trials, additional detailed analyses of each of the major outcomes have examined potential interactions with multiple baseline characteristics, including a number of cardiovascular disease biomarkers.
Very few of these secondary analyses have yielded nominally significant interactions, and no consistent subgroup of women has been found to be at particularly high or low risk across the major diseases. In this regard, interest has focused on the youngest women studied (aged 50 59) for multiple reasons: These women are most likely to seek help for vasomotor symptoms; the observational studies motivating the trials studied women who usually initiated therapy during the menopausal transition; and an argument can be made that early intervention in atherosclerosis may be helpful, whereas later maneuvers could be detrimental. For E- alone, the data suggest that women aged 50 59 may experience some reduction in CHD risk and in the global index (39, 70). There was no evidence of a reduction in CHD risk for the younger women in the E+P trial, however (45). A recent exploratory analysis that pooled the data from the two trials suggests that CHD effects may be a function of time since menopause; women beginning HT soon after menopause displayed a possible reduced risk of CHD. The effects observed for stroke and total mortality did not vary significantly with age or years since menopause, however, suggesting that younger women do not escape all these adverse effects (58). Comparison of HT Effects from the CT and OS WHI investigators have also presented some detailed comparisons of HR estimates for E+P and E-alone between the CT and OS. The relevance of these comparisons is enhanced by the fact that women were concurrently recruited from the same sources for the two cohorts, with much commonality of data collection procedures, including a personal interview to ascertain information on the use of HT prior to WHI enrollment. For CHD, stroke, and VT, HR estimates from the OS were 40% 50% lower than in the CT, for both E+P and E-alone, and were rather similar to those reported from other observational www.annualreviews.org WHI Lessons Learned 137

studies. This discrepancy was reduced to 30%–40% following detailed control for potential confounding factors. However, HR estimates from the CT and OS aligned rather closely for CHD and VT after also accounting for HR dependence on duration of HT use, with some possible residual difference for stroke (52, 53). Evidently cohort studies, such as the WHI OS in which participants may be some years postmenopausal at enrollment, are likely to miss the early adverse effects identified in the CT and produce summary statistics (e.g., average HRs over cohort follow-up periods) that reflect primarily later years of HT use, where cardiovascular effects may be more favorable. This type of joint CT and OS analysis also has potential to augment findings from the HT trials, e.g., in subsets of participants of particular interest (e.g., age 50–59), or for effects related to longer-term HT use than was studied in the CT. Additional analyses of the type described above have also been carried out for breast cancer, where hazard ratios from the OS are substantially higher than from the CT, for both E+P and E-alone, and are similar to those from other observational studies (e.g., 47). These analyses have yet to be published.

Lessons Concerning Study Design, Monitoring, and Analysis

Early criticisms of HT trial methodology have mostly subsided with more complete information. One criticism persists: WHI did not study HT in the way that it is administered in practice; the women were older than would normally be considered for initiating HT, and the study did not tailor the study hormones to the individual women. If medical practice is to be evidence based, should not the design of these trials reflect the way medicine would be practiced? As important as this principle is, designing the study to better reflect clinical practice would require that the study limit recruitment to women as they entered menopause or generally those in their 50s.
Because CHD rates are still low during this time of life, this design would have required perhaps a tenfold increase in sample size or an intervention and follow-up period beyond any previously attempted, both infeasible options for WHI. Also, it is important to recall that the principal motivation for the WHI HT trials was to test CHD-prevention hypotheses, and information on potential CHD prevention was of great interest for women across the entire age range (50–79 years) included in these trials.

The absence of a randomized comparison between E+P and E-alone is another limitation of the WHI design. The original design allowed for a direct comparison of E+P to E-alone by randomizing women with an intact uterus to E-alone and requiring rigorous follow-up for endometrial pathology. The PEPI study results (49) indicated the infeasibility and safety concerns of this option and required an early change in the WHI protocol. The effects of the two preparations appear to differ for some outcomes, but our ability to test for these differences is conditional on the assumption that adequate adjustment can be made for differences in the two study populations.

One of the most difficult aspects of trial design and monitoring was related to the multiple outcomes. HT was anticipated to have effects on multiple organ systems, differing in direction, magnitude, and timeline. The hypothesized benefits for CHD were expected to be larger and to occur earlier than the anticipated adverse effects for breast cancer. In addition, the age-specific incidence varies notably across outcomes, with CHD and fracture events being more concentrated in women of advanced age compared with breast cancer. With this in mind, the sample size and trial duration were defined to try to generate adequate information for all these important clinical outcomes. See Anderson et al. (2) for additional detail on the CT monitoring plan and on its relationship to the early cessation of the HT trials.
Perhaps the most important lesson learned from the HT trials was the value of full-scale

randomized trials to assess the overall health impact of preventive interventions. Despite a large number of observational studies examining the associations between HT and various health outcomes, understanding of the risks and benefits of HT, and the differences between E-alone and E+P, was not reached.

Implications for Communicating Results to a Diverse Audience

The WHI findings were released simultaneously to WHI participants, health care providers, other researchers, and the public. Because of the access to information available through the Internet, it is difficult to release such high-profile data in stages. And although there was deliberate conservatism built into the presentation and interpretation of these results, they undermined a strong tenet of the then-standard practice in women's health, creating a large and immediate demand for recommendations for change in practice. The WHI was not in a position to provide guidance on alternatives to the regimens tested, and the regulatory and professional organizations with interest in HT were caught off guard by the announcement of these results.

Considering the technical and substantive nuances described above, it is not surprising that some women and their physicians felt confused by the E+P trial findings. The data in hand made implications for current clinical practice more pressing than would have been the case, for example, had a clear benefit for primary prevention been found. The Escher-like effect of these results, with the magnitude appearing to change depending on the point of focus (population or individual), was challenging to communicate to a broad audience.

DIETARY MODIFICATION TRIAL

Study Background, Design, and Implementation

The DM trial tested the hypothesis that a change in dietary composition to a low-fat dietary pattern would reduce breast and colorectal cancer and would have a favorable health benefit vs. risk profile.
The breast cancer hypothesis goes back to rodent-feeding studies in the 1940s (66). Subsequent meta-analyses of rodent experiments indicated that both total energy consumption and the percent of energy from fat influenced mammary tumor incidence (24). International correlation studies of dietary fat in relation to various human cancers followed (3, 54) and supported the cancer-prevention potential of a low-fat dietary pattern. A large number of case-control and cohort studies subsequently examined dietary fat and the consumption of other nutrients and foods in relation to a range of chronic diseases, especially cancers. As summarized in later international reviews (72, 73), some of these studies suggested positive associations between dietary fat and the risk of certain cancers, but no associations were judged to have been clearly shown.

The National Cancer Institute conducted pilot and feasibility intervention trials of a low-fat dietary pattern for breast cancer prevention beginning in the mid-1980s (10, 42). Even though the intervention aimed to change the composition of the diet, without a goal of calorie reduction, intervention group women did experience some weight loss (42) and had a reduction in blood estradiol (55) within a few weeks of starting the low-fat diet intervention. Participating women also experienced a 5% reduction in plasma cholesterol, leading to the inclusion of CHD as a secondary outcome in the subsequent DM trial.

Observational data on diet and cancer that had emerged by 1992 also raised interest in the potential of a low-fat diet to reduce the risk of colon, rectum, ovary, and endometrial cancers. Hence, colorectal cancer was added as a second primary outcome in the DM trial, and all five cancers were listed as diet-related cancers that may benefit from the low-fat diet intervention. The emerging observational data also led to interest in vegetable and fruit consumption, and grain consumption, for the possible reduction in cancer risk.
Therefore, the DM intervention goals were

augmented to include not only reducing dietary fat to 20% of calories, but also increasing vegetable and fruit servings to five per day and grain servings up to six per day. A major motivation for an intervention trial with disease outcomes, compared with additional observational study of dietary fat, was uncertainty concerning the reliability of analytic epidemiologic studies on this topic due to measurement error in dietary assessment (23, 30, 65).

With this background the dietary modification trial was initiated in 1993. Eligible women were postmenopausal in the age range 50–79 and not already consuming a low-fat diet. It seemed evident from the knowledge of lag time between changes in exposure and corresponding changes in cancer risk that any risk reduction would arise gradually following the adoption of a low-fat dietary pattern, likely over a period of some years or decades. Hence, the DM trial design (71) assumed that the hazard ratio for cancer would decrease linearly from randomization to a minimum value 10 years later. On the basis of international correlation data, this minimum value, for a change from a 40%-energy-from-fat diet to a 20%-energy-from-fat diet, was assumed to be 0.5 for breast cancer and an even smaller 0.3 for colorectal cancer. Even with these rather strong assumptions, upon acknowledging the anticipated difference in percentage of energy from fat between randomization groups on the basis of preceding feasibility studies (42), and making provision for competing risks and loss to follow-up, one obtains a projected breast cancer incidence in the intervention group that is only 14% lower than that in the comparison (usual diet) group, and a 20% lower colorectal cancer incidence in the intervention group (71). These calculations underlie the large sample sizes needed to test this type of dietary hypothesis in a randomized controlled trial setting.
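The design assumption described above, a hazard ratio that falls linearly from 1 at randomization to a minimum value 10 years later, can be illustrated with a short sketch. The function names and the midpoint-rule averaging are illustrative assumptions and are not taken from the WHI design (71); the sketch also ignores the adherence, competing-risk, and loss-to-follow-up adjustments that enter the published projections.

```python
def design_hazard_ratio(years, hr_min, ramp_years=10.0):
    """Hazard ratio at `years` since randomization under a linear ramp.

    The ratio is 1.0 at randomization, falls linearly to `hr_min` at
    `ramp_years`, and is held at `hr_min` afterwards (an assumption of
    this sketch for times beyond the ramp).
    """
    if years < 0:
        raise ValueError("years since randomization must be non-negative")
    fraction = min(years / ramp_years, 1.0)
    return 1.0 - (1.0 - hr_min) * fraction


def mean_hazard_ratio(follow_up_years, hr_min, ramp_years=10.0, steps=1000):
    """Time-averaged hazard ratio over follow-up, using the midpoint rule."""
    width = follow_up_years / steps
    total = sum(
        design_hazard_ratio((i + 0.5) * width, hr_min, ramp_years)
        for i in range(steps)
    )
    return total / steps


if __name__ == "__main__":
    # Minimum hazard ratios assumed in the text: 0.5 (breast), 0.3 (colorectal).
    for label, hr_min in (("breast", 0.5), ("colorectal", 0.3)):
        print(label, round(mean_hazard_ratio(9.0, hr_min), 3))
```

Even before any allowance for adherence, the time-averaged hazard ratio over a 9-year period is 0.775 under the breast cancer assumption, well above the 0.5 minimum, which is one reason the projected incidence differences are modest.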
For example, formal power calculations using a test statistic (weighted logrank) that is suited to the relative risk model just described produced a target sample size of 48,000 (40% intervention; 60% comparison) to achieve an 86% power for breast cancer and a 90% power for colorectal cancer, with a 9-year average intervention period (71). No controlled dietary intervention trial of this size or duration had previously been attempted.

The only major departure from design assumptions that occurred in the DM trial arose from the lower-than-anticipated reported fat content of the diet at baseline for participating women, with resulting reduced differences in percentage of energy from fat between the comparison and intervention groups. Specifically, in preceding feasibility studies, the baseline percentage of energy from fat averaged 38%–39% (42), without any systematic attempt to screen out women already consuming a low-fat diet. To increase study power, 50% of otherwise eligible women were excluded on the basis of an estimated percent of energy from fat <32, as assessed by a food frequency questionnaire (FFQ). With this major exclusion investigators expected that the mean baseline percentage of energy from fat among enrollees would be at least 38. However, this baseline percentage of energy from fat turned out to be about 35. Hence, the difference between the fat content of the diet in the comparison group and the percentage of energy from fat that could be achieved in the intervention group was reduced by 3% from design assumptions, much reducing trial power. WHI investigators reacted to this development by setting rather stringent daily fat gram targets for intervention group women and through a series of specialized initiatives to maintain the dietary differences between intervention and comparison groups.
These efforts were partially successful, but the achieved differences in percentage of energy from fat between intervention and comparison groups, as judged by an FFQ, were only 10.7% at 1 year from randomization, 9.5% at 3 years, and 8.1% at 6 years (51), and constituted 70% of the differences projected in the WHI design (71). This single problem reduces the anticipated difference between randomization

groups in breast cancer incidence from 14% to 9%–10%, and the colorectal cancer difference from 20% to 14%, under the other trial design assumptions. A smaller departure from design assumption also arose for trial duration. The study plan called for 45 clinical centers to conduct the WHI. However, only 40 applicant centers met review criteria, so these 40 were asked to accept even larger recruitment goals, leading to a somewhat protracted recruitment period and a reduction in average follow-up time from 9 years to 8.1 years at the time the DM trial reached its planned completion date. The lower baseline percentage of energy from fat and the shortened follow-up period combined to yield a projected 8%–9% lower breast cancer incidence in the intervention vs. comparison groups and a 12% lower colorectal cancer incidence under other trial design assumptions (71).

Trial Results and Interpretation

The DM trial results for the primary and secondary outcomes were published concurrently in 2006 (6, 34, 51). For (invasive) breast cancer (51) the intervention vs. comparison group HR (95% CI) was 0.91 (0.83, 1.01). Some of the media coverage of this suggested benefit took the perspective that previous advice to consume a low-fat diet was clearly contradicted because the breast cancer difference did not meet conventional 0.05 significance level criteria. However, it seems clear that the trial result should not be used as evidence against the low-fat diet and breast cancer hypothesis because the 9% lower risk observed for the intervention group coincides with the 8%–9% lower risk projected from design assumptions, upon acknowledging the reported dietary differences between the randomization groups and the slightly shortened intervention period.
Furthermore, the breast cancer HR was lower (p = 0.04) among women who had a relatively high fat content in their diet at baseline and who, if assigned to the intervention group, made a comparatively larger reduction in percentage of energy from fat (51). One does not expect to see such an interaction if the low-fat dietary pattern implemented has no effect on breast cancer. The overall breast cancer HR appeared to vary (p = 0.04) according to the estrogen and progesterone receptor status of the tumor, with the smallest HR (95% CI) of 0.64 (0.49, 0.84) for estrogen receptor positive, progesterone receptor negative tumors.

For colorectal cancer (6) the intervention vs. comparison group HR (95% CI) was 1.08 (0.90, 1.29). This finding is evidently inconsistent with the projected 12% reduction mentioned above, based on acknowledging the achieved percentage of energy from fat difference between randomized groups. This negative finding seems important in guiding the diet and cancer research programs. It may not be surprising in light of observational data now available. For example, Howe and colleagues (35) assembled data from 13 colorectal cancer case-control studies conducted in various countries using various dietary assessment methods and found that the apparent association between dietary fat and colorectal cancer largely disappeared upon controlling for total energy, a result quite different from that from corresponding breast cancer analyses (36).

By three years after randomization the reduction in total plasma cholesterol concentration (95% CI) was only 3.26 (0.0, 6.5) mg/dl greater in the intervention than in the comparison group, a difference considerably smaller than had been seen in preceding feasibility studies (42). In correspondence, the CHD HR (95% CI) during the intervention period was 0.97 (0.90, 1.06), not significantly different from unity (34).
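The comparison of projected and observed effects discussed above can be laid out as simple arithmetic. In this sketch, scaling a projected incidence reduction in proportion to the achieved share of the dietary difference is a simplifying assumption that happens to reproduce the figures quoted in the text; it is not the design calculation in (71). Likewise, checking whether a projected hazard ratio falls inside the observed 95% CI is only an informal consistency check, not a formal test. Function names are hypothetical.

```python
def attenuated_reduction(design_reduction_pct, achieved_fraction):
    """Scale a projected incidence reduction (in percent) by the share of
    the planned dietary difference that was actually achieved.

    Proportional scaling is an assumption of this sketch.
    """
    return design_reduction_pct * achieved_fraction


def within_interval(value, lower, upper):
    """True if `value` lies inside the closed interval [lower, upper]."""
    return lower <= value <= upper


if __name__ == "__main__":
    # About 70% of the projected dietary difference was achieved.
    print("breast:", attenuated_reduction(14.0, 0.70))      # design: 14% lower
    print("colorectal:", attenuated_reduction(20.0, 0.70))  # design: 20% lower

    # Projected hazard ratios after also allowing for shorter follow-up,
    # taken from the text: 8%-9% lower (breast), 12% lower (colorectal).
    print("breast projection inside observed CI:",
          within_interval(0.91, 0.83, 1.01) and within_interval(0.92, 0.83, 1.01))
    print("colorectal projection inside observed CI:",
          within_interval(0.88, 0.90, 1.29))
```

Under these assumptions the breast cancer projection sits inside the observed interval, whereas the colorectal projection of 0.88 falls just below the lower limit of 0.90, in line with the contrast drawn in the text.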
In media coverage of the DM trial results, some researchers suggested that a dietary pattern that focused more on saturated and trans-fat reduction rather than total fat reduction should have been attempted. However, when the DM trial design was being formulated, there was no field-tested intervention that would preferentially reduce saturated fat, while ensuring a major reduction in total fat. Moreover, the

low-fat dietary pattern implemented was expected to have favorable effects on cardiovascular disease risk factors, even though primarily motivated by cancer risk reduction. In fact, small but statistically significant differences did arise between intervention and comparison groups for each of low-density lipoprotein cholesterol, diastolic blood pressure, and clotting factor VIIC (34), with presumably favorable implications for long-term cardiovascular disease risk.

Annu. Rev. Public. Health. 2008.29:131-150. Downloaded from arjournals.annualreviews.org

Table 3 Incidence of breast cancer, colorectal cancer, and some other major clinical outcomes in the Dietary Modification Trial (6, 51)

| Outcome | Intervention group: number of cases (annualized %) | Comparison group: number of cases (annualized %) | Hazard ratio (a, b) (95% CI) | Unweighted P-value (c) | Weighted P-value (c) |
|---|---|---|---|---|---|
| Breast cancer incidence | 655 (0.42%) | 1072 (0.45%) | 0.91 (0.83, 1.01) | 0.07 | 0.09 |
| Colorectal cancer incidence | 201 (0.13%) | 279 (0.12%) | 1.08 (0.90, 1.29) | 0.42 | 0.29 |
| Total cancer incidence | 1946 (1.23%) | 3040 (1.28%) | 0.96 (0.91, 1.02) | 0.15 | 0.10 |
| Breast cancer mortality | 27 (0.02%) | 53 (0.02%) | 0.77 (0.48, 1.22) | 0.26 | 0.27 |
| Total cancer mortality | 436 (0.28%) | 690 (0.29%) | 0.95 (0.84, 1.07) | 0.41 | 0.22 |
| Total mortality | 950 (0.60%) | 1454 (0.61%) | 0.98 (0.91, 1.07) | 0.70 | 0.29 |
| Global index (d) | 2051 (1.30%) | 3207 (1.35%) | 0.96 (0.91, 1.02) | 0.16 | 0.16 |

a Abbreviation: CI, confidence interval.
b Proportional hazards model stratified by age (five-year categories), and randomization in the hormone therapy trial.
c Weighted log-rank test stratified by age (five-year categories) and randomization in the hormone therapy trial. Weights increase linearly from zero at randomization to a maximum of 1 at 10 years.
d Global index defined as the time to earliest diagnosis of breast cancer, colorectal cancer, CHD, or death from other causes.
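For readers who want to relate the hazard ratios and confidence intervals in Table 3 to the reported P-values, the following sketch backs out an approximate standard error of the log hazard ratio from a 95% CI and computes a two-sided Wald P-value. This is a normal-approximation check on rounded published figures, not the stratified log-rank procedure used in the trial, so only rough agreement with the unweighted P-values should be expected. The function name is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.96):
    """Approximate SE, z statistic, and two-sided P-value for log(HR).

    Assumes the interval is symmetric on the log scale, i.e. it was
    formed as log(HR) +/- z_crit * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_two_sided


if __name__ == "__main__":
    rows = {
        "breast cancer incidence": (0.91, 0.83, 1.01),
        "colorectal cancer incidence": (1.08, 0.90, 1.29),
    }
    for outcome, (hr, lower, upper) in rows.items():
        se, z, p = wald_from_ci(hr, lower, upper)
        print(f"{outcome}: SE={se:.3f}, z={z:.2f}, P={p:.2f}")
```

Because the inputs are rounded to two decimals, the approximate P-values can differ from the tabulated ones in the second decimal place.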
Additional reports from the DM trial continue to be developed, including an evaluation of intervention effects on ovarian and endometrial cancers, as well as on other cancers. These reports, along with the continued follow-up of trial participants for clinical outcomes, will contribute importantly to the longer-term findings from the DM trial. Table 3 provides a summary of some key published results from the DM trial. The global index for the trial was defined as time to the earliest diagnosis of breast cancer, colorectal cancer, CHD, or death from any other cause.

Additional Lessons and Related Ongoing Research

The DM trial exemplifies the challenges inherent in primary disease prevention trials, particularly of behavioral interventions. Even if the intervention has important long-term chronic disease risk reduction potential, lengthy lag times until the full achievement of HR reductions, and the challenges of maintaining intervention goals among free-living persons for many years, should not be underestimated. It is not clear how the lower-than-anticipated baseline percentage of energy from fat (35 vs. 38) could have been obviated, although using a dietary assessment approach other than the FFQ to exclude women already consuming a low-fat diet may have helped.

The DM trial provides evidence, stronger than has arisen from other sources, suggesting a breast cancer benefit with a low-fat dietary pattern. These data, along with recent cohort study reports of positive relationships between fat consumption and breast cancer

risk when a food record, rather than an FFQ, is used for dietary assessment (8, 25), can be expected to augment interest in the dietary fat and breast cancer hypothesis. They also illustrate the complementary and reinforcing roles that randomized controlled trials and observational studies can play in assessing diet and chronic disease risk hypotheses.

Of course, the issue of measurement error in dietary assessment has a dominant role in nutritional epidemiology observational studies because these studies depend intrinsically on individual dietary assessment. Measurement error could have some influence on the interpretation of secondary analysis of DM trial results if dietary differences between randomized groups are inaccurately ascertained. Analyses of the breast cancer results according to specific dietary changes reported by the intervention group have not been conducted, owing at least in part to the lack of objective information to guide the modeling of measurement error. Hence, the future research agenda in nutritional epidemiology needs to come to terms with longstanding measurement error questions. WHI investigators are attempting to do so through the conduct of nutrient biomarker studies in subsets of the WHI cohorts.

A nutrient biomarker study among 544 women in the DM trial (50% intervention, 50% control) has completed data collection. It includes a doubly labeled water assessment of total energy consumption (59) and a urinary nitrogen-based assessment of protein consumption (7). The doubly labeled water measure shows energy to be substantially underestimated using an FFQ, with greater underestimation among obese women and younger women and among women of certain ethnic minorities. These data allow corresponding calibrated (corrected) nutrient consumption estimates to be derived for women participating in the DM trial, or in the OS.
WHI investigators are currently using such calibrated dietary data in disease association studies in the WHI cohorts and in explanatory analyses of DM trial results. A companion biomarker study of several dietary assessment and physical activity assessment methods is also underway in the OS.

Results from the DM trial are challenging to communicate to practitioners, researchers, and the general public. The notion that an association of substantial public health potential, along the lines suggested by international correlation studies of dietary fat and breast cancer, may translate to modest differences in disease incidence between randomization groups in an intervention trial (because of lag time and adherence issues) is not easily appreciated by researchers or the public. Clinical researchers are accustomed to seeing targeted incidence rate differences of 30% or more among randomization groups in treatment trials, so there may be a tendency to dismiss observed differences in the 5%–15% range, even though they are entirely consistent with study hypotheses.

Another communication challenge relates to potential effects on multiple clinical outcomes. This implies the need for priority reporting on suitable summary measures of benefit vs. risk, as WHI investigators attempted to do through the specification of a global index. It also implies the need for strategic reporting so that results for primary outcomes receive an appropriate early focus, without distractions related to the reporting of secondary or tertiary outcomes.

CALCIUM AND VITAMIN D TRIAL

The CaD trial was intended to be a comparatively inexpensive add-on to the CT that would test whether 1000 mg per day of calcium plus 400 international units (IU) of vitamin D3 would lower the risk of hip fracture, and secondarily other fractures and colorectal cancer, among postmenopausal women.
Randomized trials had shown that calcium supplementation slowed bone loss (19, 22), and vitamin D acts to increase intestinal absorption of calcium. There was also one randomized trial that provided evidence of

fracture prevention among elderly women in France (14). Uncertainty remained, however, as to whether CaD supplementation would reduce the risk of hip and other fractures among postmenopausal American women, among whom both calcium supplementation and postmenopausal hormone use tended to be frequent. Some observational data supported an inverse relationship between vitamin D consumption and colorectal cancer risk (e.g., 26).

Out of concern that it would be too demanding to recruit women into as many as three randomized trials simultaneously, enrollment into the CaD component was delayed until the first anniversary of randomization to one or both of the HT and DM components. A total of 36,282 of the 68,132 CT women (53.3%) were randomized to the CaD trial. Adherence to study medications was good: 59% of participants were taking 80% or more of their study pills at the end of the 7.0-year follow-up period, and another 17% were taking a smaller fraction of pills. The trial proceeded to its planned completion date with results of primary and secondary outcomes presented in 2006 (43, 67).

Hip bone density was slightly higher in the active compared with the placebo group at time points considered during the follow-up period. Table 4 shows key results for clinical outcomes. The estimated incidence of hip fracture was 12% lower in the active vs. the placebo group with HR (95% CI) of 0.88 (0.72, 1.08). Although not reaching conventional 0.05 levels of significance overall, the data suggested some reduced risk among women who were adherent to study pills. There was also evidence of an interaction with age, with suggested benefit among women over 60, and suggested detriment among younger women. Overall fracture incidence was slightly lower in the active compared with the placebo group, but this difference too was not significant. There was no suggestion whatsoever of a colorectal cancer risk reduction with CaD.
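The enrollment, adherence, and hip fracture figures quoted above can be tied together with a few lines of arithmetic. The helper names below are hypothetical, and the check of whether a confidence interval excludes 1 is only the usual informal reading of a 95% CI at the 0.05 level.

```python
def percent(part, whole, digits=1):
    """Share of `whole` represented by `part`, in percent."""
    return round(100.0 * part / whole, digits)


def percent_change_from_hr(hr, digits=1):
    """Relative change in incidence implied by a hazard ratio, in percent."""
    return round((hr - 1.0) * 100.0, digits)


def excludes_one(lower, upper):
    """True if a confidence interval for a ratio does not contain 1."""
    return lower > 1.0 or upper < 1.0


if __name__ == "__main__":
    # Share of CT women randomized to the CaD trial.
    print("randomized to CaD (%):", percent(36282, 68132))
    # Participants still taking at least some study pills at end of follow-up.
    print("taking any study pills (%):", 59 + 17)
    # Hip fracture: HR 0.88 with 95% CI (0.72, 1.08).
    print("hip fracture change (%):", percent_change_from_hr(0.88))
    print("CI excludes 1:", excludes_one(0.72, 1.08))
```

The interval for hip fracture contains 1, which corresponds to the statement that the 12% lower incidence did not reach conventional significance.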
The principal criticism following the publication of the CaD trial concerned the fact that women with personal supplementation up to 1000 mg per day and vitamin D up to 600 IU per day were allowed to continue and be randomized in the trial. Women already supplementing with CaD were included because the recruitment pool of 68,132 women in the HT and/or DM trials was fixed. Excluding women already supplementing with CaD could only decrease trial enrollment. Moreover, study results could be displayed according to baseline calcium and vitamin D intake from diet, supplements, and medications, allowing the results that would have been obtained by excluding supplementing women to be recovered, along with additional results concerning the benefits and risks of CaD among women already replete with these nutrients.

Table 4 Incidence of hip fractures, total fractures, and invasive colorectal cancer in the Calcium and Vitamin D Trial (43, 67)

| Outcome | Active (CaD): number of cases (annualized %) | Placebo: number of cases (annualized %) | Hazard ratio (a) (95% CI) |
|---|---|---|---|
| Hip fracture | 175 (0.14) | 199 (0.16) | 0.88 (0.72, 1.08) |
| Total fracture | 2102 (1.64) | 2158 (1.70) | 0.96 (0.91, 1.02) |
| Colorectal cancer | 168 (0.13) | 154 (0.12) | 1.08 (0.86, 1.34) |

a Hazard ratio and 95% confidence interval (CI) from proportional hazards analysis, stratified on age and randomization assignments in the HT and DM trials and, for fracture outcomes, on presence or absence of prior fracture.

As it turned out, the hip fracture HR did not depend significantly on total calcium or total vitamin D consumption from sources

other than the study pills, but most of the evidence suggestive of benefit arose from women taking <1200 mg of calcium and <600 IU of vitamin D from other sources. On the other hand, there was no suggestion whatsoever of benefit among women consuming >1200 mg of calcium or >600 IU of vitamin D; this provided valuable public health information against the value of very high calcium consumption, particularly because there was an excess of renal calculi in the active group, with an HR (95% CI) of 1.17 (1.02, 1.34).

CONCLUDING REMARKS

The WHI program includes the collection and long-term storage of blood specimens at baseline and one year in the CT and baseline and three years in the OS. These specimens, in very well-characterized cohorts having quality-controlled ascertainment of a wide range of clinical outcomes, provide a most valuable population science resource. A large number of publications and ancillary studies continue to emanate from this resource, with considerable involvement by non-WHI investigators as described on the Web site http://www.whiscience.org. A portion of the blood specimens (serum, plasma, DNA) have been set aside for use in competitive contract proposals under an NIH Broad Agency Announcement, about which details can be found on the same Web site.

The WHI CT, with its four intervention comparisons, was a most challenging trial to conduct and a time-consuming and demanding program for participating women. Each CT intervention had been subject to considerable observational study, but findings were not clear enough to support clinical decision making and public health recommendations, although the HT trials were criticized in the early implementation stages as being unnecessary because it was thought by some to be evident that the health benefits exceeded risks for E-alone and E+P. The initial reports from each CT component were widely publicized and were subject to some controversy, especially that from the E+P trial.
The related discussion has been quite valuable in helping to identify the open questions related to the CT interventions and to define future research priorities. Although many of the more detailed analyses of DM and CaD results have yet to be reported, the work already published makes clear the valuable, substantially complementary, roles to be filled by randomized trials and observational studies in population science research. The fact that the HT trials, following decades of observational studies, can provide new information that leads to major changes in product regulation and clinical practice attests to the value of randomized controlled trials when public health implications are sufficiently great. However, preventive intervention trial costs are high enough that few such trials can be conducted, and it will be necessary to rely on observational study designs for information on the benefits and risks of most public health maneuvers that may be entertained. The availability of both randomized trial and observational study data on the same intervention, in the same study population, as in the WHI CT and OS, provides an important opportunity to identify observational study biases, to strengthen the design and analysis methods for both types of studies, and to use observational data to extend the implications of randomized trials.

The future population science research agenda may also include an important place for intermediate outcome trials having biomarker outcomes. For example, moderate-sized trials having traditional risk factor outcomes, along with high-dimensional plasma proteomic change outcomes in conjunction with new knowledge linking changes in the proteome to risk of various clinical outcomes, may add much specificity to preventive intervention development and help to identify when the next WHI-like clinical trial is needed.
With stimulus from WHI and other large population science efforts, and from the technological advances in genetic and molecular

biologic measurement of the past decade, the tools are coming together for a vigorous population science and disease prevention research program in the upcoming years.

DISCLOSURE STATEMENT

The authors are not aware of any biases that might be perceived as affecting the objectivity of this review.

ACKNOWLEDGMENTS

The WHI program is supported by contracts from the National Heart, Lung, and Blood Institute. Dr. Prentice's work was partially supported by grants CA53996 and CA119171 from the National Cancer Institute. The authors acknowledge the 161,808 women who participated in WHI and the 200 WHI investigators who contributed valuably to the work summarized here.

LITERATURE CITED

1. Anderson GL, Judd HL, Kaunitz AM, Barad DH, Beresford SA, et al. 2003. Effects of estrogen plus progestin on gynecologic cancers and associated diagnostic procedures: the Women's Health Initiative randomized trial. JAMA 290:1739–48
2. Anderson GL, Kooperberg C, Geller N, Rossouw JE, Pettinger M, Prentice RL. 2007. Monitoring and reporting of the Women's Health Initiative randomized hormone therapy trials. Clin. Trials 4:207–17
3. Armstrong B, Doll R. 1975. Environmental factors and cancer incidence and mortality in different countries, with special reference to dietary practices. Int. J. Cancer 15:617–31
4. Barrett-Connor E, Grady D. 1998. Hormone replacement therapy, heart disease and other considerations. Annu. Rev. Public Health 19:55–72
5. Barrett-Connor E, Grady D, Stefanick ML. 2005. The rise and fall of menopausal hormone therapy. Annu. Rev. Public Health 26:115–41
6. Beresford SA, Johnson KC, Ritenbaugh C, Lasser NL, Snetselaar LG, et al. 2006. Low-fat dietary pattern and risk of colorectal cancer: the Women's Health Initiative randomized controlled Dietary Modification trial. JAMA 295:643–54
7. Bingham SA. 2003. Urine nitrogen as a biomarker for the validation of dietary protein intake. J.
Nutr. 133:9215 45 8. Bingham SA, Luben R, Welch A, Wareham N, Khaw KT, Day N. 2003. Are imprecise methods obscuring a relation between fat and breast cancer? Lancet 362:212 14 9. Bonds DE, Lasser N, Qi L, Brzyski R, Caan B, et al. 2006. The effect of conjugated equine oestrogen on diabetes incidence: the Women s Health Initiative randomised trial. Diabetologia 49:459 68 10. Bowen D, Clifford CK, Coates R, Evans M, Feng Z, et al. 1996. The Women s Health Trial: feasibility study in minority populations. Design and baseline descriptions. Ann. Epidemiol. 6:507 19 11. Brunner RL, Gass M, Aragaki A, Hays J, Granek I, et al. 2005. Effects of conjugated equine estrogen on health-related quality of life in postmenopausal women with hysterectomy: results from the Women s Health Initiative randomized clinical trial. Arch. Intern. Med. 165:1976 86 12. Bush TL, Barrett-Connor E, Cowan LD, Criqui MH, Wallace RB, et al. 1987. Cardiovascular mortality and noncontraceptive use of estrogen in women: results from the Lipid Research Clinics Program Follow-up Study. Circulation 75:1102 9 146 Prentice Anderson

13. Cauley JA, Robbins J, Chen Z, Cummings SR, Jackson RD, et al. 2003. Effects of estrogen plus progestin on risk of fracture and bone mineral density: the Women s Health Initiative randomized trial. JAMA 290:1729 38 14. Chapuy MC, Arlot ME, Duboeuf F, Brun J, Crouzet B, et al. 1992. Vitamin D3 and calcium to prevent hip fractures in elderly women. N. Engl. J. Med. 327:1637 42 15. Chlebowski RT, Hendrix SL, Langer RD, Stefanick ML, Gass M, et al. 2003. Influence of estrogen plus progestin on breast cancer and mammography in healthy postmenopausal women: the Women s Health Initiative randomized trial. JAMA 289:3243 53 16. Chlebowski RT, Wactawski-Wende J, Ritenbaugh C, Hubbell FA, Ascensao J, et al. 2004. Estrogen plus progestin and colorectal cancer in postmenopausal women. N. Engl. J. Med. 350:991 1004 17. Cirillo DJ, Wallace RB, Rodabough RJ, Greenland P, LaCroix AZ, et al. 2005. Effect of estrogen therapy on gallbladder disease. JAMA 293:330 39 18. Cochrane B, Lund B, Anderson S, Prentice RL. 2003. The Women s Health Initiative: aspects of management and coordination. In Diversity in Health Care Research: Strategies for Multisite, Multidisciplinary and Multiethnic Projects, ed. JW Hawkins, LA Haggerty, pp. 181 207. New York: Springer 19. Cummings RG. 1990. Calcium intake and bone mass: a quantitative review of the evidence. Calcif. Tissue Int. 47:194 201 20. Curb JD, Prentice RL, Bray PF, Langer RD, Van Horn L, et al. 2006. Venous thrombosis and conjugated equine estrogen in women without a uterus. Arch. Intern. Med. 166:772 80 21. Cushman M, Kuller LH, Prentice R, Rodabough RJ, Psaty BM, et al. 2004. Estrogen plus progestin and risk of venous thrombosis. JAMA 292:1573 80 22. Dawson-Hughes B. 1991. Calcium supplementation and bone loss: a review of controlled clinical trials. Am. J. Clin. Nutr. 54:2745 805 23. Day NE, McKeown N, Wong MY, Welch A, Bingham S. 2001. 
Epidemiologic assessment of diet: a comparison of a 7-day diary with a food frequency questionnaire using urinary markers of nitrogen, potassium and sodium. Int. J. Epidemiol. 30:309 17 24. Freedman LS, Clifford C, Messina M. 1990. Analysis of dietary fat, calories, body weight and the development of mammary tumors in rats and mice: a review. Cancer Res. 50:5710 19 25. Freedman LS, Potischman N, Kipnis V, Midthune D, Schatzkin A, et al. 2006. A comparison of two dietary instruments for evaluating the fat-breast cancer relationship. Int. J. Epidemiol. 35:1011 21 26. Garland CF, Comstock GW, Garland FC, Helsing KJ, Shaw EK, Gorham ED. 1989. Serum 25-hydroxyvitamin D and colon cancer: eight-year prospective study. Lancet 2:1176 78 27. Grady D, Herrington D, Bittner V, Blumenthal R, Davidson M, et al. 2002. Cardiovascular disease outcomes during 6.8 years of hormone therapy: Heart and Estrogen/progestin Replacement Study follow-up (HERS II). JAMA 288:49 57 28. Grady D, Rubin SM, Pettiti DB, Fox CS, Black D, et al. 1992. Hormone therapy to prevent disease and prolong life in postmenopausal women. Ann. Intern. Med. 117:1016 36 29. Hays J, Ockene JK, Brunner RL, Kotchen JM, Manson JE, et al. 2003. Effects of estrogen plus progestin on health-related quality of life. N. Engl. J. Med. 348:1839 54 30. Heitmann BL, Lissner L. 1995. Dietary underreporting by obese individuals: Is it specific or nonspecific? BMJ 311:986 89 31. Hendrix SL, Cochrane BB, Nygaard IE, Handa VL, Barnabei VM, et al. 2005. Effects of estrogen with and without progestin on urinary incontinence. JAMA 293:935 48 www.annualreviews.org WHI Lessons Learned 147

32. Hendrix SL, Wassertheil-Smoller S, Johnson KC, Howard BV, Kooperberg C, et al. 2006. Effects of conjugated equine estrogen on stroke in the Women s Health Initiative. Circulation 113:2425 34 33. Hersh AL, Stefanick M, Stafford RS. 2004. National use of postmenopausal hormone therapy. JAMA 291:47 53 34. Howard BV, Van Horn L, Hsia J, Manson JE, Stefanick ML, et al. 2006. Low-fat dietary pattern and risk of cardiovascular disease: the Women s Health Initiative randomized controlled Dietary Modification trial. JAMA 295:655 66 35. Howe GR, Aronson KJ, Benito E, Castelleto R, Cornee J, et al. 1997. The relationship between dietary fat intake and risk of colorectal cancer: evidence from the combined analysis of 13 case-control studies. Cancer Causes Control 8:215 28 36. Howe GR, Hirohata T, Hislop TG, Iscovich JM, Yuan JM, et al. 1990. Dietary factors and risk of breast cancer: combined analysis of 12 case-control studies. J. Natl. Cancer Inst. 82:561 69 37. Hsia J, Criqui MH, Herrington DM, Manson JE, Wu L, et al. 2006. Conjugated equine estrogens and peripheral arterial disease risk: the Women s Health Initiative. Am. Heart J. 152:170 76 38. Hsia J, Criqui MH, Rodabough RJ, Langer RD, Resnick HE, et al. 2004. Estrogen plus progestin and the risk of peripheral arterial disease: the Women s Health Initiative. Circulation 109:620 26 39. Hsia J, Langer RD, Manson JE, Kuller L, Johnson KC, et al. 2006. Conjugated equine estrogens and coronary heart disease: the Women s Health Initiative. Arch. Intern. Med. 166:357 65 40. Hulley S, Furberg C, Barrett-Connor E, Cauley J, Grady D, et al. 2002. Noncardiovascular disease outcomes during 6.8 years of hormone therapy: Heart and Estrogen/Progestin Replacement Study Follow-up (HERS II). JAMA 288:58 64 41. Hulley S, Grady D, Bush T, Furberg C, Herrington D, et al. 1998. Randomized trial of estrogen plus progestin for secondary prevention of coronary heart disease in postmenopausal women. JAMA 280:605 13 42. 
Insull W, Henderson MM, Prentice RL, Thompson DJ, Clifford C, et al. 1990. Results of a randomized feasibility study of a low-fat diet. Arch. Intern. Med. 150:421 27 43. Jackson RD, LaCroix AZ, Gass M, Wallace RB, Robbins J, et al. 2006. Calcium plus Vitamin D supplementation and the risk of fractures. N. Engl. J. Med. 354:669 83 44. Jackson RD, Wactawski-Wende J, LaCroix AZ, Pettinger M, Yood RA, et al. 2006. Effects of conjugated equine estrogen on risk of fractures and BMD in postmenopausal women with hysterectomy: results from the Women s Health Initiative randomized trial. J. Bone Miner. Res. 21:817 28 45. Manson JE, Hsia J, Johnson KC, Rossouw JE, Assaf AR, et al. 2003. Estrogen plus progestin and the risk of coronary heart disease. N. Engl. J. Med. 349:523 34 46. Margolis KL, Bonds DE, Rodabough RJ, Tinker L, Phillips LS, et al. 2004. Effect of oestrogen plus progestin on the incidence of diabetes in postmenopausal women: results from the Women s Health Initiative Hormone Trial. Diabetologia 47:1175 87 47. Million Women Study Collab. 2003. Breast cancer and hormone replacement therapy in the Million Women Study. Lancet 362:419 27 48. PEPI Trial Writ. Group. 1995. Effects of estrogen or estrogen/progestin regimes on heart disease risk factors in postmenopausal women: the Postmenopausal Estrogen/Progestin Interventions (PEPI) Trial. JAMA 273:199 208 148 Prentice Anderson

49. PEPI Trial Writ. Group. 1996. Effects of hormone replacement therapy on endometrial histology in postmenopausal women: the Postmenopausal Estrogen/Progestin Interventions (PEPI) Trial. JAMA 275:370 75 50. PEPI Trial Writ. Group. 1996. Effects of hormone therapy on bone mineral density: results from the Postmenopausal Estrogen/Progestin Interventions (PEPI) Trial. JAMA 276:1389 96 51. Prentice RL, Caan B, Chlebowski RT, Patterson R, Kuller LH, et al. 2006. Low-fat dietary pattern and risk of invasive breast cancer: the Women s Health Initiative randomized controlled Dietary Modification trial. JAMA 295:629 42 52. Prentice RL, Langer R, Stefanick M, Howard B, Pettinger M, et al. 2005. Combined postmenopausal hormone therapy and cardiovascular disease: toward resolving the discrepancy between Women s Health Initiative Clinical Trial and Observational Study Results. Am. J. Epidemiol. 162:404 14 53. Prentice RL, Langer R, Stefanick ML, Howard BV, Pettinger M, et al. 2006. Combined analysis of Women s Health Initiative observational and clinical trial data on postmenopausal hormone treatment and cardiovascular disease. Am. J. Epidemiol. 163:589 99 54. Prentice RL, Sheppard L. 1990. Dietary fat and cancer: consistency of the epidemiologic data and disease prevention that may follow from a practical reduction in fat consumption. Cancer Causes Control 1:81 97 55. Prentice RL, Thompson DJ, Clifford C, Gorbach S, Goldin B, Byar D. 1990. Dietary fat reduction and plasma estradiol concentration among healthy postmenopausal women. J. Natl. Cancer Inst. 82:129 34 56. Rossouw JE, Anderson GL, Oberman A. 2003. Women s Health Initiative: baseline monograph foreword. Ann. Epidemiol. 13:S1 4 57. Rossouw JE, Finnegan LP, Harlan WR, Pinn VW, Clifford C, McGowan JA. 1995. The evolution of the Women s Health Initiative: perspectives from the NIH. J. Am. Med. Womens Assoc. 50:50 55 58. Rossouw JE, Prentice RL, Manson JE, Wu L, Barad D, et al. 2007. 
Postmenopausal hormone therapy and risk of cardiovascular disease by age and years since menopause. JAMA 297:1465 77 59. Schoeller DA. 1999. Recent advances from application of doubly labeled water to measurement of human energy expenditure. J. Nutr. 192:1765 68 60. Shumaker SA, Legault C, Kuller L, Rapp SR, Thal L, et al. 2004. Conjugated equine estrogens and incidence of probable dementia and mild cognitive impairment in postmenopausal women: Women s Health Initiative Memory Study. JAMA 291:2947 58 61. Shumaker SA, Legault C, Rapp SR, Thal L, Wallace RB, et al. 2003. Estrogen plus progestin and the incidence of dementia and mild cognitive impairment in postmenopausal women: the Women s Health Initiative Memory Study: a randomized controlled trial. JAMA 289:2651 62 62. Stampfer MJ, Colditz GA. 1991. Estrogen replacement therapy and coronary heart disease: a quantitative assessment of the epidemiologic evidence. Prev. Med. 20:47 63 63. Stefanick ML, Anderson GL, Margolis KL, Hendrix SL, Rodabough RJ, et al. 2006. Effects of conjugated equine estrogens on breast cancer and mammography screening in postmenopausal women with hysterectomy. JAMA 295:1647 57 64. Steinberg KK, Thacker SB, Smith SJ, Stroup DF, Zack MM, et al. 1991. A meta-analysis of the effect of estrogen replacement therapy on the risk of breast cancer. JAMA 265:1985 90 www.annualreviews.org WHI Lessons Learned 149

65. Subar AF, Kipnis V, Troiano RP, Midthune D, Schoeller DA, et al. 2003. Using intake biomarkers to evaluate the extent of dietary misreporting in a large sample of adults: the OPEN study. Am. J. Epidemiol. 158:1 13 66. Tannenbaum A. 1942. Genesis and growth of tumors III. Effect of a high fat diet. Cancer Res. 2:468 75 67. Wactawski-Wende J, Kotchen JM, Anderson GL, Assaf AR, Brunner RL, et al. 2006. Calcium plus Vitamin D supplementation and the risk of colorectal cancer. N. Engl. J. Med. 354:684 96 68. Wassertheil-Smoller S, Hendrix SL, Limacher M, Heiss G, Kooperberg C, et al. 2003. Effect of estrogen plus progestin on stroke in postmenopausal women: the Women s Health Initiative: a randomized trial. JAMA 289:2673 84 69. Wittes J, Barrett-Connor E, Braunwald E, Chesney M, Cohen HJ, et al. 2007. Monitoring the randomized trials of the Women s Health Initiative: the experience of the Data and Safety Monitoring Board. Clin. Trials 4:218 34 70. Women s Health Initiat. Steer. Comm. 2004. Effects of conjugated equine estrogen in postmenopausal women with hysterectomy: the Women s Health Initiative randomized controlled trial. JAMA 291:1701 12 71. Women s Health Initiat. Study Group. 1998. Design of the Women s Health Initiative Clinical Trial and Observational Study. Control. Clin. Trials 19:61 109 72. World Cancer Res. Fund. 1997. Food, Nutrition, and the Prevention of Cancer: A Global Perspective. Washington, DC: Am. Inst. Cancer Res. 73. World Health Organ./Food Agric. Organ. 2003. Diet, nutrition, and the prevention of chronic diseases. WHO Tech. Rep. Ser. 916, Geneva, Switz. 74. Writ. Group Women s Health Initiat. Investig. 2002. Risks and benefits of estrogen plus progestin in healthy postmenopausal women: principal results from the Women s Health Initiative randomized controlled trial. JAMA 288:321 33 150 Prentice Anderson

Figure 1. Pictorial representation of the partial factorial design of the WHI clinical trial (CT) and the WHI observational study (OS). Enrollment shown in the figure: DM, 48,835; CaD, 36,282; HT, 27,347; OS, 93,676; CT total, 68,132; WHI total, 161,808. DM, dietary modification; CaD, calcium and vitamin D supplementation; HT, hormone therapy; WHI, Women's Health Initiative.
