Improving Screening Mammography Outcomes Through Comparison With Multiple Prior Mammograms

Women's Imaging • Original Research
Hayward et al.
Comparing Screening Mammograms With Multiple Prior Mammograms

Jessica H. Hayward¹, Kimberly M. Ray¹, Dorota J. Wisner², John Kornak³, Weiwen Lin⁴, Bonnie N. Joe¹, Edward A. Sickles¹

Hayward JH, Ray KM, Wisner DJ, et al.

Keywords: cancer detection, mammography, recall rate, screening

DOI:10.2214/AJR.15.15917

Received November 25, 2015; accepted after revision April 14, 2016.

W. Lin is chief executive officer of Jambeyang Research, LLC. K.M. Ray was supported by seed grant 13-34 from the Department of Radiology and Biomedical Imaging, University of California, San Francisco.

Based on a presentation at the Radiological Society of North America 2015 annual meeting, Chicago, IL.

¹ Department of Radiology and Biomedical Imaging, University of California, San Francisco, 1600 Divisadero St, Rm C250, Mail Box 1667, San Francisco, CA 94115. Address correspondence to K. M. Ray (kimberly.ray@ucsf.edu).
² Kaiser San Rafael Medical Center, San Rafael, CA.
³ Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA.
⁴ Jambeyang Research, LLC, Cupertino, CA.

This article is available for credit.

AJR 2016; 207:918–924 (October 2016)
0361–803X/16/2074–918
© American Roentgen Ray Society

OBJECTIVE. The objective of the present study is to evaluate the effect of comparison with multiple prior mammograms on the outcomes of screening mammography relative to comparison with a single prior mammogram.

MATERIALS AND METHODS. We retrospectively analyzed 46,288 consecutive screening mammograms performed at our institution for 22,792 women. We divided these examinations into three groups: those interpreted without comparison with prior mammograms, those interpreted in comparison with one prior examination, and those interpreted in comparison with two or more prior examinations. For each group, we determined the rate of examination recall. We also calculated the positive predictive value of recall (i.e., positive predictive value level 1 [PPV1]) and the cancer detection rate (CDR) for both the group of examinations compared with a single prior mammogram and the group compared with multiple prior mammograms. Generalized estimating equations with the logistic link function were used to determine the odds ratio of recall as a function of the number of comparisons, with adjustment made for age as a confounding variable. The Fisher exact test was performed to compare the PPV1 and the CDR in the different cohorts.

RESULTS. The recall rate for mammograms interpreted without comparison with prior examinations was 16.6%, whereas that for mammograms compared with one prior examination was 7.8% and that for mammograms compared with two or more prior examinations was 6.3%. After adjustment was made for age, the odds ratio of recall for the group with multiple prior examinations relative to the group with a single prior examination was 0.864 (95% CI, 0.776–0.962; p = 0.0074). Statistically significant increases in the PPV1 of 0.05 (p = 0.0009) and in the CDR of 2.3 cases per 1000 examinations (p = 0.0481) were also noted for mammograms compared with multiple prior examinations relative to those compared with a single prior examination.

CONCLUSION. Comparison with two or more prior mammograms resulted in a statistically significant reduction in the screening mammography recall rate and increases in the CDR and PPV1 relative to comparison with a single prior mammogram.

A large body of evidence, including findings from randomized controlled trials and modern evaluations of organized screening programs, indicates that screening mammography reduces the breast cancer mortality rate by approximately 20–40% for women 40 years and older [1].
However, false-positive findings are a frequently cited downside of mammographic screening and may result in patient anxiety, additional radiation exposure, biopsies, and increased health care costs [2, 3]. The U.S. Preventive Services Task Force cited excessive false-positive results as a reason not to support routine screening of women in their forties and to recommend less frequent screening among older women [4]. Consequently, although the reduced mortality rate associated with screening mammography arguably far outweighs the potential harms of the examination, a reduction in false-positive results should be a priority to ensure that support for and access to screening mammography are maintained. Because millions of women undergo screening mammography each year in the United States, even a small reduction in the recall rate may result in a widespread benefit.

Several previous studies have shown that comparison with at least a single prior mammogram is an effective strategy for reducing the screening mammogram recall rate; some of these studies have shown that comparison with a prior mammogram also increases the biopsy yield of cancer (i.e., the chance that a biopsy specimen will include cancerous tissue), although none have shown an increase in the cancer detection rate (CDR) [5–12]. To our knowledge, no established standard currently exists for determining whether comparison should be made with a single prior mammogram or with more than one prior mammogram, although we hypothesized that there would likely be a difference in the outcomes associated with these approaches. Therefore, we chose to test the incremental efficacy of comparison with two or more prior examinations versus a single prior examination. We did not aim to compare performance in these study groups with that of interpretation performed without any prior mammograms, because this topic has already been studied.

The objective of the present study is to determine whether there is a significant difference in the recall rate, the positive predictive value of recall (i.e., positive predictive value level 1 [PPV1]), and the CDR when a screening examination is compared with a single prior examination versus more than one prior examination.

Materials and Methods

The present study was approved by the institutional review board at the University of California, San Francisco, and was HIPAA compliant. We performed a retrospective search of the mammography database at our institution to identify screening mammograms obtained at a single screening facility at an academic medical center between June 14, 2010, and March 3, 2015.
Although we did not apply specific exclusion criteria to the present study, women seen at our facility during the study period were eligible for screening mammography (and hence for inclusion in the present study) if they met the following criteria: they were 30 years or older or, if they were BRCA1 gene mutation carriers, older than 25 years; they did not have breast cancer diagnosed during the 5 years before screening mammography; they had no new localized breast symptoms, such as a lump or focal pain; they had no suspicious abnormalities on clinical breast examination; and they did not receive an abnormal assessment (BI-RADS category 0, 3, 4, 5, or 6) on their most recent prior screening or diagnostic mammogram. Use of these eligibility criteria yielded a dataset of 46,288 consecutive screening mammograms performed for 22,792 women.

We collected data on patient age, breast composition (i.e., density), personal and family history of breast cancer, the dates of the mammograms used for comparison (as recorded in the clinical report), recommendation for recall, and subsequent cancer diagnosis. All prior examinations recorded in the mammographic report were counted as prior examinations, whether screening or diagnostic, regardless of how long ago the examination was performed and regardless of whether film-screen or digital mammographic equipment was used.

During the study, our facility performed digital mammography only. Screening mammograms were displayed on our PACS workstation in accordance with a hanging protocol that was standardized for all of the radiologists. In the initial step of this protocol, the two most recent prior mammograms, whether screening or diagnostic, were displayed next to the current mammogram, with six images shown per monitor screen (Fig. 1).
The subsequent two steps displayed two images per monitor screen, with the current craniocaudal views shown next to the most recent prior craniocaudal views, followed by the current and most recent prior mediolateral oblique views. In these steps of the hanging protocol, the older prior images were tiled beneath the most recent prior images, but the radiologist could choose whether to view the older prior examinations in this step. In the next step of the hanging protocol, the current images were displayed in full resolution. If at least two prior digital mammograms were not available, then the most recent available film-screen mammograms were hung on the alternator for comparison, so that the radiologist had at least two prior mammograms for comparison whenever they were available. Finally, any additional prior digital or film-screen mammograms were available in the PACS or the patient folder, respectively. These additional prior mammograms were reviewed at the discretion of the radiologist.

Fig. 1. Photograph of workstation hanging protocol with simultaneous display of two prior mammograms used for comparison. Current mammogram is shown at center of display, appearing partially on both monitors. For workstations not capable of displaying six images per monitor, current and most recent examinations may be displayed simultaneously, followed by both current and older examinations. Circles denote skin mole markers.

Women who present for mammography at our institution are encouraged to obtain their prior mammograms from outside facilities for comparison. If these are not available at the time of their visit, they are asked to sign a release form that allows our facility to request such prior mammograms from the outside facilities. The screening mammogram is reported the day after it is performed, regardless of whether prior examinations are available for comparison. If and when prior mammograms are obtained from outside facilities, the radiologist adds an addendum to the original mammogram report and issues a revised final assessment, which is recorded in the mammography database. In the present study, if an addendum was issued to the original screening mammogram report for prior comparison, we counted all prior mammograms used for comparison in the original report and the subsequent addendum, and we recorded the final assessment based on the addendum report.

Information regarding cancer diagnosis was obtained through a search of the pathology records at our institution, with a minimum follow-up of 7 months.
We divided the screening mammograms into three groups: those interpreted without comparison with prior mammograms, those interpreted with comparison with one prior examination, and those interpreted with comparison with two or more prior examinations. The latter two groups constitute the comparison groups in the present study. For each of the three groups, we determined the recall rate. The recall rate was defined as the percentage of screening examinations given a BI-RADS category 0, 3, 4, or 5 assessment [13]. In addition, we calculated the PPV1 of recall and the CDR for the groups with one prior mammogram and two or more prior mammograms available for comparison. The PPV1 was defined as the percentage of screening examinations with positive findings (i.e., a BI-RADS category 0, 3, 4, or 5 assessment) that resulted in a cancer diagnosis within 1 year [13]. The CDR was defined as the number of cancers detected per 1000 screening mammography examinations [13].

Generalized estimating equations (GEEs) with a logistic link function were used to determine the odds ratio (OR) of recall as a function of the number of comparison examinations, both with and without adjustment for age as a confounding variable. The GEE model (rather than conventional logistic regression) was used so that we could account for repeated observations regarding recall for the same patient over the study period. The Fisher exact test was performed to compare the PPV1 and the CDR between the study groups. The GEE model was not used for these metrics because patients with breast cancer diagnosed during the study period were removed from the screening pool for the duration of the study, in accordance with the lumpectomy protocol at our institution; therefore, repeated cancer diagnoses would not have been made for the same patient. A nominal p value of less than 0.05 was considered to denote statistical significance.
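As an illustration of the definitions above, the following minimal sketch computes the recall rate, PPV1, and CDR from aggregate counts, together with a crude odds ratio of recall. It is not the GEE model used in the study: fitting that model requires examination-level data with patient identifiers, which are not reproduced here, so the crude odds ratio is a swapped-in simplification that ignores both clustering and age. The `GroupCounts` class and the function names are hypothetical; the counts are those reported in Tables 1, 3, and 4 of this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupCounts:
    """Aggregate counts for one comparison group (hypothetical container)."""
    exams: int     # screening examinations performed
    recalled: int  # examinations assessed as BI-RADS category 0, 3, 4, or 5
    cancers: int   # recalled examinations followed by a cancer diagnosis


def recall_rate(g: GroupCounts) -> float:
    """Recalled examinations as a percentage of all screening examinations."""
    return 100.0 * g.recalled / g.exams


def ppv1(g: GroupCounts) -> float:
    """Proportion of recalled examinations that resulted in a cancer diagnosis."""
    return g.cancers / g.recalled


def cdr_per_1000(g: GroupCounts) -> float:
    """Cancers detected per 1000 screening examinations."""
    return 1000.0 * g.cancers / g.exams


def crude_odds_ratio(a: GroupCounts, b: GroupCounts) -> float:
    """Odds of recall in group a divided by odds of recall in group b.

    Assumption: treats every examination as independent, unlike the GEE model.
    """
    odds_a = a.recalled / (a.exams - a.recalled)
    odds_b = b.recalled / (b.exams - b.recalled)
    return odds_a / odds_b


# Counts as reported in Tables 1, 3, and 4.
single = GroupCounts(exams=5758, recalled=449, cancers=25)
multiple = GroupCounts(exams=36741, recalled=2307, cancers=243)

for name, g in (("1 prior", single), (">=2 priors", multiple)):
    print(f"{name}: recall {recall_rate(g):.1f}%, "
          f"PPV1 {ppv1(g):.3f}, CDR {cdr_per_1000(g):.1f} per 1000")
print(f"crude OR of recall (>=2 vs 1): {crude_odds_ratio(multiple, single):.3f}")
```

With these counts the crude odds ratio is about 0.79, close to the unadjusted GEE estimate of 0.789 reported in Table 2; a small difference is expected because the crude calculation does not account for repeated examinations of the same woman.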
Comparison of PPV1 and CDR was made only between the group with a single prior examination and the group with multiple prior examinations, because these groups, by definition, consisted of incidence screening examinations only. The group without prior mammograms predominantly consisted of prevalence screening examinations and therefore would be expected to have a higher cancer yield relative to incidence screening examinations without prior examinations available for comparison [14]. Therefore, to avoid the confounding effect of prevalence screening mammograms on the PPV1 and the CDR, and to isolate the effect of prior comparison, we studied only the groups of mammograms for which a single prior examination or multiple prior examinations were available for comparison.

Results

A total of 3789 screening mammograms were interpreted without comparison with prior mammograms, 5758 examinations were interpreted with comparison with a single prior mammogram, and 36,741 examinations were interpreted with comparison with two or more prior mammograms (Table 1). The screening recall rate was 16.6% for mammograms interpreted without comparison with a prior examination, 7.8% for those interpreted with comparison with one prior examination, and 6.3% for those interpreted with comparison with two or more prior examinations (Table 1).

TABLE 1: Recall Rates for Screening Mammograms and Characteristics of 22,792 Patients, as Defined by the Number of Prior Mammographic Examinations Used for Comparison

| Characteristic | Group With 0 Prior Mammograms | Group With 1 Prior Mammogram | Group With ≥ 2 Prior Mammograms | All Groups |
|---|---|---|---|---|
| Screening mammograms | | | | |
| Performed, no. | 3789 | 5758 | 36,741 | 46,288 |
| Recalled | 631 (16.6) | 449 (7.8) | 2307 (6.3) | 3387 (7.3) |
| Time from last prior mammogram (mo), median (range) | NA | 26 (0.68–321) | 14.1 (0.06–251) | 14.9 (0.06–321) |
| Breast composition category | | | | |
| A or Bᵃ | 1835 (48) | 3259 (57) | 22,559 (61) | 27,653 (60) |
| C or Dᵇ | 1954 (52) | 2499 (43) | 14,182 (39) | 18,635 (40) |
| Age (y), mean ± SD | 49.5 ± 11.5 | 55 ± 11.7 | 61.5 ± 11.1 | 59 ± 12.2 |
| Personal history of breast cancer | | | | |
| Yes | 35 (0.9) | 178 (3.1) | 2995 (8.2) | 3208 (6.9) |
| No | 2559 (67.5) | 3919 (68.0) | 24,121 (65.7) | 30,599 (66.1) |
| Unknown | 1195 (31.5) | 1661 (28.8) | 9625 (26.1) | 12,481 (27.0) |
| Family history of breast cancer | | | | |
| Noneᶜ | 1935 (51.0) | 2875 (49.9) | 17,936 (48.8) | 22,746 (49.1) |
| Strongᵈ | 986 (26.0) | 1718 (29.8) | 11,688 (31.8) | 14,392 (31.0) |
| Unknown | 868 (22.9) | 1165 (20.2) | 7117 (19.4) | 9150 (19.8) |

Note. Except where otherwise indicated, data are the no. (%) of mammograms. NA = not applicable.
ᵃ Breast composition category A denotes almost entirely fatty breasts, whereas category B denotes scattered areas of fibroglandular density.
ᵇ Breast composition category C denotes heterogeneously dense breasts, whereas category D denotes extremely dense breasts.
ᶜ Patient has no first-degree relatives with breast cancer.
ᵈ Patient has at least one first-degree relative with breast cancer.

The estimated unadjusted OR of recall for mammograms with multiple prior examinations for comparison versus a single prior examination for comparison was 0.789 (95% CI, 0.711–0.877; p < 0.0001) (Table 2). However, because patients with a single prior or no prior mammogram were younger than patients with multiple prior mammograms (Table 1), we repeated our analysis while controlling for age as a confounding variable. After adjustment was made for patient age, the OR of recall for the group with multiple prior mammograms relative to the group with a single prior mammogram was 0.864 (95% CI, 0.776–0.962; p = 0.0074), a finding that was statistically significant (Table 2).

Statistically significant increases in the PPV1 and the CDR were also noted for screening mammograms compared with multiple prior mammograms versus a single prior mammogram (Tables 3 and 4). The PPV1 for the group with a single prior mammogram was 0.056 (95% CI, 0.035–0.077), whereas that for the group with multiple prior mammograms was 0.105 (95% CI, 0.093–0.118) (Table 3).
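The method used to compute the 95% CIs for PPV1 is not stated in the text. As a rough check, a normal-approximation (Wald) interval, which is an assumption here rather than the authors' documented method, gives bounds that agree with the reported values to within about 0.001. The function name `wald_ci` is hypothetical; the counts (25 cancers among 449 recalled examinations and 243 among 2307) are taken from Table 3.

```python
import math


def wald_ci(successes: int, total: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (proportion, lower, upper) using the normal approximation.

    Assumption: the Wald interval is used only as an illustrative check;
    the source does not say which interval method was applied.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, p - half_width, p + half_width


ppv1_single = wald_ci(25, 449)      # recalled examinations with 1 prior
ppv1_multiple = wald_ci(243, 2307)  # recalled examinations with >= 2 priors

for label, (p, lo, hi) in (("1 prior", ppv1_single), (">=2 priors", ppv1_multiple)):
    print(f"{label}: PPV1 {p:.3f} (95% CI, {lo:.3f} to {hi:.3f})")
```

The two intervals do not overlap, which is consistent with the significant difference in PPV1 reported for the Fisher exact test.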
For the test of independence on Table 3, two-tailed p = 0.0009, indicating strong evidence of a difference in PPV1 depending on whether the patient had a single prior examination or multiple prior examinations available. Of the patients with mammograms with a BI-RADS category 4 or 5 assessment, tissue diagnoses were available for all patients who had a single prior mammogram and for all but four patients who had multiple prior mammograms.

The CDR per 1000 examinations for the group with a single prior mammogram was 4.3 cases (95% CI, 2.8–6.4), whereas it was 6.6 cases (95% CI, 5.8–7.5) for the group with multiple prior mammograms (Table 4). For the test of independence on Table 4, two-tailed p = 0.0481, which indicated some evidence of a difference in the CDR depending on whether the patient had a single prior examination or multiple prior examinations available for comparison.

No statistically significant differences were noted in the characteristics of cancers detected in the group with a single prior mammogram versus those detected in the group with multiple prior mammograms (Table 5).

TABLE 2: Odds Ratio (OR) of Recall of Screening Mammogram as a Function of the Number of Prior Mammograms Used for Comparison, Both With and Without Adjustment for Age as a Confounding Variable

| No. of Comparisons | Unadjusted ORᵃ (95% CI) | p | Adjusted ORᵇ (95% CI) | p |
|---|---|---|---|---|
| 1 vs 0 | 0.430 (0.379–0.489) | < 0.0001 | 0.470 (0.413–0.536) | < 0.0001 |
| ≥ 2 vs 0 | 0.340 (0.309–0.374) | < 0.0001 | 0.406 (0.366–0.451) | < 0.0001 |
| ≥ 2 vs 1 | 0.789 (0.711–0.877) | < 0.0001 | 0.864 (0.776–0.962) | 0.0074 |

ᵃ Unadjusted OR refers to generalized estimating equation (GEE) logistic model estimates not adjusted for age as a confounding variable.
ᵇ Adjusted OR refers to GEE logistic model estimates adjusted for age as a confounding variable.

TABLE 3: Positive Predictive Value Level 1 (PPV1) of Screening Mammograms Interpreted by Comparison With a Single Prior Examination Versus Two or More Prior Examinations

| No. of Comparisons With Prior Mammograms | No. of True-Positive Findings | No. of False-Positive Findings | PPV1 (95% CI) |
|---|---|---|---|
| 1 | 25 | 424 | 0.056 (0.035–0.077) |
| ≥ 2 | 243 | 2064 | 0.105 (0.093–0.118) |
| Combined | 268 | 2488 | 0.097 (0.086–0.108) |

Note. For test of independence on the table, two-tailed p = 0.0009, by the Fisher exact test.

TABLE 4: Cancer Detection Rate (CDR) for Screening Mammograms Interpreted by Comparison With a Single Prior Examination Versus Two or More Prior Examinations

| No. of Comparisons | No. of Cancers | Total No. of Screening Examinations | CDR per 1000 Examinations (95% CI) |
|---|---|---|---|
| 1 | 25 | 5758 | 4.3 (2.8–6.4) |
| ≥ 2 | 243 | 36,741 | 6.6 (5.8–7.5) |
| Combined | 268 | 42,499 | 6.3 (5.6–7.1) |

Note. For test of independence on the table, two-tailed p = 0.0481, by the Fisher exact test.

Discussion

After adjustments were made for age, the present study showed a statistically significant 14% decrease in the odds of recall for screening mammograms that were compared with two or more prior mammograms relative to those that were compared with a single prior mammogram only. Although several previous studies showed that comparison with at least a single prior mammogram reduces the rate of recall of screening mammograms relative to no such comparison [5–12], no study, to our knowledge, has specifically evaluated the effect of comparing a screening mammogram with a single prior examination versus multiple prior examinations. The findings of the present study are relevant because the default approach in many practices is to compare the screening mammogram with a single prior mammogram. However, it has long been the standard in our own practice to make a comparison with at least two prior examinations.
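The Fisher exact p values quoted in the notes to Tables 3 and 4 can be approximated from the table counts alone. The sketch below is a standard-library implementation written for illustration; the function names are hypothetical, and it assumes the common two-sided convention of summing the probabilities of all tables no more likely than the observed one, which may differ slightly from the software the authors used.

```python
import math


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k), via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    k_min = max(0, row1 + col1 - total)
    k_max = min(row1, col1)
    log_denominator = log_comb(total, row1)

    def log_pmf(k: int) -> float:
        # Log probability that the top-left cell equals k, given fixed margins.
        return log_comb(col1, k) + log_comb(total - col1, row1 - k) - log_denominator

    cutoff = log_pmf(a) + 1e-7  # small tolerance for floating-point ties
    p_value = 0.0
    for k in range(k_min, k_max + 1):
        lp = log_pmf(k)
        if lp <= cutoff:
            p_value += math.exp(lp)
    return min(1.0, p_value)


# Rows: single prior, multiple priors. Columns: cancer, no cancer.
p_ppv1 = fisher_exact_two_sided(25, 424, 243, 2064)   # recalled examinations (Table 3)
p_cdr = fisher_exact_two_sided(25, 5733, 243, 36498)  # all examinations (Table 4)
print(f"PPV1: p = {p_ppv1:.4f}; CDR: p = {p_cdr:.4f}")
```

Working in log space keeps the binomial coefficients from overflowing for totals as large as the 42,499 examinations in Table 4. The reported values for comparison are p = 0.0009 (PPV1) and p = 0.0481 (CDR).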
We have based this approach on our experience, which has indicated that comparison with prior mammograms may either obviate the recall of borderline findings that have a stable appearance (Fig. 2) or facilitate recall for those cases that show progressive change. The appearance of normal breast tissue often changes from year to year because of variable positioning and technique when mammography is performed; therefore, the likelihood of establishing stability in the appearance of tissue increases with the number of prior examinations used for comparison, because the chance of identifying a prior examination in which the positioning and technique are very similar to those used in the current examination is increased. Similarly, the likelihood of identifying progressive change, even slow change, in a real lesion increases as the number of prior examinations used for comparison increases.

TABLE 5: Characteristics of Cancers Detected on Screening Mammography Examination Interpreted With Comparison With a Single Prior Examination Versus Two or More Prior Comparisons

| Characteristic | Group With 1 Prior Mammogram | Group With ≥ 2 Prior Mammograms | pᵃ |
|---|---|---|---|
| Invasive ductal or lobular carcinoma | 14 (56) | 134 (55) | 0.505 |
| Lesion size < 1 cm | 5 (35.7) | 68 (50.7) | |
| Node negative | 5 (35.7) | 73 (54.5) | |
| Ductal carcinoma in situ | 10 (40) | 71 (29) | 0.505 |
| Minimalᵇ | 15 (60) | 139 (57) | 0.648 |
| Unknown | 1 (4) | 38 (16) | |
| All | 25 | 243 | |

Note. Except where otherwise indicated, data are the no. (%) of cases.
ᵃ Two-tailed p value for test of independence, as calculated by the Fisher exact test.
ᵇ Invasive cancer ≤ 1 cm or ductal carcinoma in situ of any size [13].

In an environment where analog (film-screen) images were used, the ability to compare these mammograms with multiple prior examinations was limited by the need to hang the film-screen images used for comparison on an alternator, which required additional effort on the part of clerical staff or the radiologist plus the availability of more physical space on the alternator. However, in an environment where digital mammography was performed, comparison with multiple prior examinations was facilitated by the greater ease of display built into the digital workstation.
In our digital-based mammography practice, we have established a default hanging protocol that simultaneously displays the current examination and two prior examinations because this overview makes stability or change in the appearance of breast tissue over time more readily apparent (Fig. 1).

In the present study, the statistically significant reduction in the rate of recall of screening mammograms in the group with multiple prior examinations available was accompanied by statistically significant increases in the PPV1 and the CDR, compared with screening mammograms in the group with a single prior examination available. Previous studies that compared interpretation of screening mammograms with or without the use of prior examinations showed an increase in the biopsy yield of cancer when prior examinations were used [5–12]; however, none of these studies showed a concomitant increase in the CDR. This may reflect the fact that all but one of these previous studies included prevalence screening examinations in the group without prior mammograms, whereas we explicitly performed a comparison of two groups of patients with incidence screening examinations, which, by definition, consisted of studies for which prior examinations were available for comparison. The CDR will be higher for prevalence screening mammograms than for incidence screening examinations [14]. Therefore, to assess the true effect of comparison with a prior mammogram on the CDR, previous investigators would have had to distinguish between incidence screening examinations interpreted without prior mammograms and prevalence screening examinations. To address this issue, Burnside et al. [10] compared only incidence screening mammograms interpreted with and without the use of prior mammograms and found that prior comparisons did increase the CDR of diagnostic mammography but not that of screening mammography. Burnside and colleagues postulated that subtle changes over time might be viewed with greater suspicion in the diagnostic setting, thereby leading to increased sensitivity. However, the results of the present study suggest that similar effects may be observed in the screening environment as well. Indeed, certain signs of malignancy, such as a developing asymmetry, which by definition presents as an interval change, would not be detectable without comparison with prior examinations [15] (Fig. 3).

Fig. 2. 59-year-old woman with tissue asymmetry in upper right breast (arrows), who was not recalled at screening mammography because of long-term stability compared with multiple prior examinations. Mammograms obtained in right mediolateral oblique view are shown for years 1 (left), 3 (middle), and 4 (right).

Fig. 3. 75-year-old woman with infiltrating ductal carcinoma of right breast presenting as developing asymmetry (arrows) in medial breast. Screening mammograms obtained at years 1 (left), 2 (middle), and 3 (right) are shown.

The limitations of the present study include its retrospective design, which may have led to selection bias. Most of the mammograms in the present study were interpreted in comparison with two or more prior examinations (when these were available) because this is the standard approach in our practice. As a result, women in the group with a single prior mammogram were younger than those in the group with multiple prior mammograms. The younger women most likely had fewer prior examinations for comparison because they had undergone screening mammography for a shorter period than had the older women. Because younger age has been shown to be associated with higher recall rates [16], we controlled for age in our statistical analysis.
The reduction in the recall rate noted in the group with multiple prior examinations relative to the group with a single prior examination remained statistically significant after controlling for age, which confirmed that the number of prior examinations used for comparison is an independent predictor of recall. Although the higher CDR in the group with multiple prior mammograms could in part be associated with the more advanced age of women in that group, age alone would be unlikely to account for the magnitude of the increase in the CDR observed in the present study. We report a relative increase in the CDR (to 2.3 cases detected per 1000 screening examinations) in the group with multiple prior mammograms versus the group with a single prior examination, whereas the mean age difference between women in these groups was only 6 years (mean age, 55.5 years for women with a single prior mammogram vs 61.5 years for women with multiple prior mammograms). The age-specific incidence of breast cancer for women in the United States, as obtained from the 2008–2012 Surveillance, Epidemiology, and End Results program database, was 2.4 cases per 1000 screening examinations for women 50–59 years and 3.4 cases per 1000 screening examinations for women 60–64 years [17]. On the basis of these data from the Surveillance, Epidemiology, and End Results program, we would expect a relative increase in the CDR of approximately 1 case per 1000 examinations for the group with multiple prior mammograms, on the basis of age, relative to the group with a single prior examination; however, we observed an increase that was more than twice as high, which suggests that comparison with prior examinations has an effect on the CDR that is independent of age. Nevertheless, future studies involving the use of age-matched cohorts would be helpful to confirm our findings.
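The age argument above rests on a short piece of arithmetic, sketched below in Python. The variable names are illustrative only, and the sketch assumes that the 2.3 figure is the observed increase in the CDR, as the comparison in the text implies; it simply restates the numbers quoted above and is not a reanalysis of the study data.

```python
# Age-specific incidence per 1000, as quoted in the text from the
# 2008-2012 SEER data [17].
seer_rate_age_50_59 = 2.4  # band containing the mean age of 55.5 years
seer_rate_age_60_64 = 3.4  # band containing the mean age of 61.5 years

# Assumption: 2.3 per 1000 examinations is the observed CDR increase.
observed_cdr_increase = 2.3

# Increase that age alone would be expected to produce, per 1000.
expected_increase_from_age = round(seer_rate_age_60_64 - seer_rate_age_50_59, 1)

# How many times larger the observed increase is than the expected one.
ratio = observed_cdr_increase / expected_increase_from_age

print(f"Expected increase from age alone: {expected_increase_from_age} per 1000")
print(f"Observed increase is {ratio:.1f} times the expected increase")
```

Under these assumptions the observed increase exceeds twice the age-based expectation, which is the comparison the authors draw.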
In spite of the age difference between the women in the groups with single and multiple prior examinations, the absolute difference in the percentage of women with dense breasts was small (i.e., 43% vs 39% of women with single vs multiple prior mammograms, respectively). Therefore, breast density was not an important confounder of study outcomes. Approximately 31% of our study population had a strong family history of breast cancer, which was defined as having at least one first-degree relative with breast cancer. This likely contributed to the high overall CDR in our study. However, there was only a small absolute difference between the groups with single and multiple prior examinations with respect to family history of breast cancer (29.8% vs 31.8%, respectively). Therefore, family history was also not an important confounder of study outcomes, and we do not believe that this should limit the generalizability of our study findings.

Of note, a higher percentage of patients with a personal history of breast cancer was noted in the group with multiple prior mammograms, compared with the group with a single prior mammogram (8.2% vs 4.5%, respectively). At our institution, the lumpectomy protocol excludes patients with breast cancer diagnosed within the 5 years before screening mammography is performed, and these patients are triaged to diagnostic mammography. In addition, many breast cancer survivors seen at our institution choose continued follow-up with diagnostic examinations rather than screening examinations beyond 5 years after diagnosis. These factors would attenuate any disproportionate contribution of personal history of breast cancer to the CDR and the PPV1 in this cohort.

Another limitation of the present study is that we were able to obtain information regarding cancer diagnoses from the pathology database at our institution only, because we did not have linkage to a tumor registry for the study period. However, we cannot posit a sensible reason why the few additional cancers that might be identified in a tumor registry would be more or less likely to be included in the cohort with a single prior examination versus the cohort with multiple prior examinations.

Another confounding factor in the present study is the fact that the median interval from the most recent prior mammogram was 12 months longer in the group with a single prior mammogram compared with the group with multiple prior mammograms.
Therefore, we cannot exclude the possibility that the availability of more recent prior examinations may have contributed to the reduction in the recall rate in the group with multiple prior examinations. In a small reader study, Sumkin et al. [18] showed that specificity was significantly better when a single mammogram obtained 1 year earlier was used for comparison relative to a single mammogram obtained 2 years earlier. However, there is a paucity of data on this subject. One might speculate that if a prior study is not recent enough, it will bear little resemblance to the current study; in this situation, interpretation of the present study would be akin to reading a prevalence screening examination without the benefit of having prior examinations for comparison. However, when the examination available for comparison was obtained within a couple of years of the current examination, as was the case in the present study, one might speculate that longer-term stability of a finding for which there is a low suspicion of cancer would be more reassuring and thus would be more likely to avert recall than would stability of only 1 year. Further studies that use a more balanced dataset that controls for the time to the most recent examination would be necessary to clarify this issue.

A final limitation of the present study was the fact that, in our analysis, we did not differentiate between the number of examinations available for comparison within the group with multiple prior examinations. We sought only to establish that comparison with multiple prior examinations is better than comparison with a single prior examination. Future studies that establish the optimal number of reference examinations, whether it be two or more prior examinations, would be of value.
In summary, we have shown that the recall rate for screening mammography decreases, whereas the PPV1 and the CDR increase, when two or more prior examinations are used for comparison relative to comparison with a single prior examination. Our findings suggest that radiologists who make comparisons with more than one prior examination at screening mammography will have more true-positive outcomes and fewer false-positive outcomes.

References
1. Feig SA. Screening mammography benefit controversies: sorting the evidence. Radiol Clin North Am 2014; 52:455–480
2. Hubbard RA, Kerlikowske K, Flowers CI, Yankaskas BC, Zhu W, Miglioretti DL. Cumulative probability of false-positive recall or biopsy recommendation after 10 years of screening mammography: a cohort study. Ann Intern Med 2011; 155:481–492
3. Welch HG, Passow HJ. Quantifying the benefits and harms of screening mammography. JAMA Intern Med 2014; 174:448–454
4. United States Preventive Services Task Force (USPSTF). Breast cancer: screening. USPSTF website. www.uspreventiveservicestaskforce.org/page/Topic/recommendation-summary/breast-cancer-screening. Published 2009. Accessed May 23, 2016
5. Bassett LW, Shayestehfar B, Hirbawi I. Obtaining previous mammograms for comparison: usefulness and costs. AJR 1994; 163:1083–1086
6. Frankel SD, Sickles EA, Curpen BN, Sollitto RA, Ominsky SH, Galvin HB. Initial versus subsequent screening mammography: comparison of findings and their prognostic significance. AJR 1995; 164:1107–1109
7. Wilson TE, Nijhawan VK, Helvie MA. Normal mammograms and the practice of obtaining previous mammograms: usefulness and costs. Radiology 1996; 198:661–663
8. Callaway MP, Boggis CRM, Astley SA, Hutt I. The influence of previous films on screening mammographic interpretation and detection of breast carcinoma. Clin Radiol 1997; 52:527–529
9. Thurfjell MG, Vitak B, Azavedo E, Svane G, Thurfjell E. Effect on sensitivity and specificity of mammography screening with or without comparison of old mammograms. Acta Radiol 2000; 41:52–56
10. Burnside ES, Sickles EA, Sohlich RE, Dee KE. Differential value of comparison with previous examinations in diagnostic versus screening mammography. AJR 2002; 179:1173–1177
11. Roelofs AA, Karssemeijer N, Wedekind N, et al. Importance of comparison of current and prior mammograms in breast cancer screening. Radiology 2007; 242:70–77
12. Yankaskas BC, May RC, Matuszewski J, Bowling JM, Jarman MP, Schroeder BF. Effect of observing change from comparison mammograms on performance of screening mammography in a large community-based population. Radiology 2011; 261:762–770
13. Sickles EA, D'Orsi CJ. ACR BI-RADS follow-up and outcome monitoring. In: D'Orsi CJ, Sickles EA, Mendelson EB, Morris EA, et al., eds. ACR BI-RADS Atlas, Breast Imaging Reporting and Data System. Reston, VA: American College of Radiology, 2013
14. Yankaskas BC, Taplin SH, Ichikawa L, et al. Association between mammography timing and measures of screening performance in the United States. Radiology 2005; 234:363–373
15. Leung JW, Sickles EA. Developing asymmetry identified on mammography: correlation with imaging outcome and pathologic findings. AJR 2007; 188:667–675
16. Feig SA. Age-related accuracy of screening mammography: how should it be measured? Radiology 2000; 214:633–640
17. Howlader N, Noone AM, Krapcho M, et al., eds. SEER cancer statistics review, 1975–2012. National Cancer Institute Surveillance, Epidemiology, and End Results Program website. seer.cancer.gov/csr/1975_2012/. Published April 2015. Accessed June 1, 2016
18. Sumkin JH, Holbert BL, Herrmann JS, et al. Optimal reference mammography: a comparison of mammograms obtained 1 and 2 years before the present examination. AJR 2003; 180:343–346

FOR YOUR INFORMATION
This article is available for CME and Self-Assessment (SA-CME) credit that satisfies Part II requirements for maintenance of certification (MOC). To access the examination for this article, follow the prompts associated with the online version of the article.