Automated Volumetric Breast Density Measurements in the Era of the BI-RADS Fifth Edition: A Comparison With Visual Assessment


Medical Physics and Informatics, Original Research

Youk et al.
Automated Breast Density Measurement and BI-RADS Fifth Edition

Ji Hyun Youk, Hye Mi Gweon, Eun Ju Son, Jeong-Ah Kim

All authors: Department of Radiology, Gangnam Severance Hospital, Yonsei University College of Medicine, 211 Eonju-ro, Gangnam-Gu, Seoul 06273, South Korea. Address correspondence to J. H. Youk (jhyouk@yuhs.ac).

Youk JH, Gweon HM, Son EJ, Kim JA

Keywords: breast, computer-assisted radiographic image interpretation, digital radiography, mammography, software

DOI:10.2214/AJR.15.15472

Received August 24, 2015; accepted after revision October 29, 2015.

AJR 2016; 206:1056–1062

0361–803X/16/2065–1056

American Roentgen Ray Society

OBJECTIVE. The purpose of this study is to evaluate automated volumetric measurements in comparison with visual assessment of mammographic breast density by use of the fifth edition of BI-RADS.

MATERIALS AND METHODS. A total of 1185 full-field digital mammography examinations with standard views were retrospectively analyzed. All images were visually assessed by two blinded radiologists according to breast density category in the fifth edition of the BI-RADS lexicon. Automated volumetric breast density assessment was performed using two different software programs, Quantra and Volpara. A weighted kappa value was calculated to assess the degree of agreement among the visual and volumetric assessments of the density category. The volumes of fibroglandular tissue or total breast and the percentage breast density provided by the two software programs were compared.

RESULTS. Compared with a visual assessment, the agreement of density category ranged from moderate to substantial in Quantra (κ = 0.54–0.61) and fair to moderate in Volpara (κ = 0.32–0.43).
The distribution of density category was statistically significantly different among visual and volumetric measurements (p < 0.0001). Quantra assigned category A and B (43.5%) more frequently than did the radiologists (25.6%) or Volpara (16.0%). Volpara assigned category D (42.1%) more frequently than did the radiologists (19.5%) or Quantra (15.4%). Between the two software programs, the means of all volumetric data were statistically significantly different (p < 0.0001), but were well correlated (γ = 0.79–0.99; p < 0.0001).

CONCLUSION. More mammographic examinations were classified as nondense breast tissue using the Quantra software and as dense breast tissue using the Volpara software, as compared with visual assessments according to the BI-RADS fifth edition.

Several studies have shown that mammographic breast density is an important risk factor for developing breast cancer. It has been reported that women with dense breasts have a four- to sixfold higher risk of breast cancer than do women with fatty breasts [1–4]. The dense fibroglandular tissue may obscure breast cancers, and the sensitivity of screening mammography is decreased in women with dense breasts [5]. Several states in the United States have also passed breast density reporting laws as a result of the increasing importance of mammographic breast density. However, a visual assessment of breast density is limited by intra- or interobserver variability, irrespective of the assessment scale used. Although quantitative area-based breast density measurements have been developed, they are intrinsically subjective, not reliably reproducible, and require additional decision time even by skilled users [6].
To overcome those limitations, fully automated methods for volumetric breast density estimation from digital mammograms have been developed and shown to be highly reproducible; in addition, it is feasible to obtain quantitative measurements of dense breast tissue volume with results that are in good agreement with the BI-RADS breast density categories [7–14]. Two commercially available software programs for volumetric density measurement were compared and showed good correlation despite different results [15–17]. However, to our knowledge, volumetric breast density measurements have not been compared with a visual assessment of breast density according to the fifth edition of BI-RADS [18].

The current BI-RADS lexicon eliminates quartile ranges of percentage tissue density to define the descriptors of breast composition and suggests assigning an overall breast composition rating that describes the distribution of tissue density to convey the likelihood of having an obscured lesion. For instance, breasts with dense fibroglandular tissue accumulated in a single area with otherwise fatty tissue throughout are described as heterogeneously dense, to signify that this area may obscure a mass, although the percentage of the total breast area consisting of dense fibroglandular tissue is less than 50% [19]. It is hypothesized that this change could influence the clinical application of automated volumetric density results after correlating with a visual assessment by radiologists. Thus, the current study was performed to evaluate two commercially available automated volumetric breast density measurement systems, in comparison with a visual assessment of mammographic breast density by radiologists based on the fifth edition of BI-RADS.

AJR:206, May 2016

Materials and Methods

This retrospective study was conducted with institutional review board approval from Yonsei University College of Medicine and a waiver of the need for written informed consent from the participants. All patient records and information were made anonymous and deidentified before analysis.

Between May 2013 and July 2013, 1248 consecutive full-field digital mammography examinations with standard views were performed at our institution. Digital mammographic examinations were performed on a full-field digital mammography unit (Lorad Selenia, Hologic). This unit was equipped with 24 × 29-cm amorphous selenium detectors with pixel sizes of 70 μm. Standard craniocaudal and mediolateral oblique views made up the dataset in this study. Among the available datasets, only examinations for which all volumetric data were obtained and available from both software programs were included. The exclusion criteria were as follows: examinations of breasts iatrogenically altered by cancer surgery or reduction mammoplasty, examinations of augmented breasts, examinations performed when the subject was receiving neoadjuvant chemotherapy, or examinations for which any software failed to obtain the data, including data for breast volume, fibroglandular tissue volume, or breast density in DICOM imaging capture.

Fig. 1. 59-year-old woman who underwent screening mammography. A and B, Craniocaudal view (A) and mediolateral oblique view (B) mammograms are shown. By visual assessment, breast density was designated as BI-RADS category C. According to Volpara software (version 1.5.5, Mātakina Technology), breast density was assessed as Volpara density grade 3, and volumetric breast density was 12.1% for right breast and 10.6% for left breast. According to Quantra software (version 2.0, Hologic), breast density was assessed as category 2, and volumetric breast density was 12% for right breast and 13% for left breast.

Fig. 2. 41-year-old woman who underwent screening mammography. A and B, Craniocaudal view (A) and mediolateral oblique view (B) mammograms are shown. By visual assessment, breast density was assessed as BI-RADS category C. According to Volpara software (version 1.5.5, Mātakina Technology), breast density was assessed as Volpara density grade 4, and volumetric breast density was 18.0% for right breast and 13.6% for left breast. According to Quantra software (version 2.0, Hologic), breast density was assessed as category 3, and volumetric breast density was 19% for right breast and 19% for left breast.

Visual Mammographic Density Assessment by Radiologists

All mammographic images were downloaded to a soft-copy review workstation (Selenia Softcopy Workstation, Hologic) with soft-copy reading software (MeVis BreastCare version 6.0.5, MeVis Medical Solutions). Two radiologists with 9 and 13 years of experience in interpreting mammography and 8 and 9 years of experience in soft-copy review of digital mammography independently reviewed the images at the review workstation. Each radiologist was blinded to the assessment of the other radiologist and volumetric breast density. Each mammogram was assessed for breast density according to the BI-RADS breast density categories. The following BI-RADS categories for breast density were used for mammographic interpretations: category A, almost fatty; category B, scattered areas of fibroglandular densities; category C, heterogeneously dense; and category D, extremely dense [18, 19]. After the review of the results from the two radiologists, if the BI-RADS breast density category of the mammogram was different between the radiologists, consensus was reached by discussion.

Automated Volumetric Breast Density Assessment Using Two Different Software Programs

For automated volumetric analysis, both the Volpara software (version 1.5.5, Mātakina Technology) and the Quantra software (version 2.0, Hologic) were used. Both are software applications intended for use with images acquired using digital mammography systems. By using the DICOM for processing the image data generated by the digital mammography system, both software algorithms calculate an objective measurement of breast density using volumetric parameters. The algorithms determine and report the ratio of fibroglandular tissue as a percentage of total breast volume, by the following procedure: First, these algorithms estimate two volumes, the volume of fibroglandular tissue in cubic centimeters and the volume of the breast in cubic centimeters. They then divide the volumes to produce a volumetric fraction of breast fibroglandular tissue as a percentage reported as the volumetric breast density. For the Volpara software, the breast density information is provided per breast by averaging the craniocaudal and mediolateral oblique values. For each patient, a Volpara density grade (VDG) is also provided. The VDG is the result of mapping the average volumetric breast density for the patient corresponding to a BI-RADS breast density category.
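The two-step procedure just described (estimate two volumes, then divide) can be illustrated with a minimal Python sketch. The function names and the input volumes are hypothetical and are not taken from either vendor's software. Averaging the per-view percentages, rather than the per-view volumes, is an assumption made here for illustration.

```python
from statistics import mean


def volumetric_breast_density(fibroglandular_cm3: float, breast_cm3: float) -> float:
    """Return volumetric breast density (%) as fibroglandular volume / breast volume * 100."""
    if breast_cm3 <= 0:
        raise ValueError("breast volume must be positive")
    return 100.0 * fibroglandular_cm3 / breast_cm3


def per_breast_density(cc_percent: float, mlo_percent: float) -> float:
    """Average the craniocaudal (CC) and mediolateral oblique (MLO) values for one breast.

    Assumption: the per-view percentages are averaged directly.
    """
    return mean([cc_percent, mlo_percent])


# Hypothetical volumes for a single breast (illustrative only, not study data).
cc_vbd = volumetric_breast_density(60.0, 500.0)
mlo_vbd = volumetric_breast_density(55.0, 550.0)
breast_vbd = per_breast_density(cc_vbd, mlo_vbd)
print(f"CC {cc_vbd:.1f}%, MLO {mlo_vbd:.1f}%, per-breast {breast_vbd:.1f}%")
```

The sketch only shows the arithmetic that turns two volumes into a percentage. How each program estimates the volumes from the image data is not covered here.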
The VDG is graded according to the percentage volumetric breast density as follows: VDG 1, less than 4.5%; VDG 2, 4.5% to less than 7.5%; VDG 3, 7.5% to less than 15.5%; and VDG 4, 15.5% or more [11]. Similarly, the Quantra software segregates breast density into the BI-RADS-like breast composition categories 1 through 4. It provides numeric values of breast density per image, breast, and patient [7, 9].

Data and Statistical Analysis

Medical records were reviewed, and demographic data such as age and personal history of breast augmentation, breast-conserving surgery, mastectomy, or neoadjuvant treatment were compiled. Also, radiologic reports and mammograms were reviewed for calcifications larger than 1 mm or any other mass that could be used to sort patients into groups of those with or without lesions (mass or calcification) for intergroup comparisons.

A weighted kappa value (κ) was calculated to assess the proportion of agreement between the visual assessment and the two volumetric measurements of breast density according to the BI-RADS category. BI-RADS breast density categories A, B, C, and D were considered to correspond to a VDG of 1, 2, 3, and 4, and BI-RADS-like breast composition categories of 1, 2, 3, and 4, respectively. The kappa values were interpreted as suggested by Landis and Koch [20] as follows: κ ≤ 0.20 indicates slight agreement, κ = 0.21–0.40 indicates fair agreement, κ = 0.41–0.60 indicates moderate agreement, κ = 0.61–0.80 indicates substantial agreement, and κ = 0.81–1.00 indicates almost perfect agreement.

After the visual and volumetric assessments according to the BI-RADS category were compared, subjects were divided into either a concordant or discordant group, and the differences between them were analyzed according to age, mammogram findings, and volumetric density data. The volume of the fibroglandular tissue, the breast volume, and the volumetric breast density provided by the two software programs were compared.
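The Volpara density grade thresholds quoted earlier in this section amount to a simple lookup, sketched below. The function names are hypothetical. Taking the patient-level value as the plain mean of the available per-breast values is an assumption made for illustration. The example inputs are the Volpara percentages given in the captions of Figures 1 and 2.

```python
from statistics import mean
from typing import Sequence


def volpara_density_grade(vbd_percent: float) -> int:
    """Map a volumetric breast density (%) to VDG 1-4 using the thresholds in the text."""
    if vbd_percent < 0:
        raise ValueError("density cannot be negative")
    if vbd_percent < 4.5:
        return 1
    if vbd_percent < 7.5:
        return 2
    if vbd_percent < 15.5:
        return 3
    return 4


def patient_grade(per_breast_percent: Sequence[float]) -> int:
    """Grade a patient from one breast (unilateral) or two breasts (bilateral).

    Assumption: the patient-level density is the mean of the per-breast values.
    """
    return volpara_density_grade(mean(per_breast_percent))


# Volpara values from the figure captions: right and left breast.
print(patient_grade([12.1, 10.6]))  # Fig. 1, reported as VDG 3
print(patient_grade([18.0, 13.6]))  # Fig. 2, reported as VDG 4
```

Under these assumptions the sketch reproduces the grades reported in both captions. It says nothing about how Quantra assigns its categories 1 through 4, for which the text gives no thresholds.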
Statistical comparisons were performed with the independent or paired t test or the Pearson test for continuous variables and the chi-square or Fisher exact test for categoric variables. Statistical analyses were performed using statistical software programs (SPSS, version 20.0.0, IBM; and MedCalc, version 13.3.3.0, MedCalc Software). A p < 0.05 was considered to indicate statistical significance.

Results

Of the 1248 full-field digital mammography examinations, 1185 examinations of women aged 23–90 years (mean [± SD] age, 52.4 ± 10.0 years) were included in this study. The remaining 63 of 1185 examinations (5.3%) that were excluded from the study were examinations of breasts iatrogenically altered by bilateral cancer surgery (n = 33), reduction mammoplasty (n = 2), or augmented with foreign material injection (n = 3), autologous fat injection (n = 2), or silicone or saline bag insertion (n = 12); examinations that were performed while the patient was receiving neoadjuvant chemotherapy (n = 6); and examinations in which the software programs failed to obtain the data because of technical problems (n = 5; two in Volpara, two in Quantra, and one in both).
Among the 1185 examinations, 681 (57.5%) were examinations of the bilateral breasts and the remaining 504 (42.5%) were examinations of a unilateral breast (278 for the right and 226 for the left), because of a history of contralateral cancer surgery in 495 subjects and unilateral examination only performed in nine subjects. After reviewing radiologic reports and mammograms for calcifications larger than 1 mm or for other masses, 993 (83.8%) were negative examinations and 192 (16.2%) were positive examinations.

Table 1 summarizes the agreement among visual and volumetric assessments of breast density according to the BI-RADS breast density categories. The agreement of breast density categories was almost perfect between the radiologists (κ = 0.81–0.84), moderate to substantial between the visual assessment and volumetric assessment by Quantra (κ = 0.54–0.61), moderate between Volpara and the radiologists (κ = 0.45–0.54), and fair between the Quantra and Volpara software programs (κ = 0.32–0.40). The kappa values were similar between unilateral and bilateral examinations and between negative and positive examinations.

TABLE 1: Agreement Among Visual and Volumetric Assessments of Breast Density According to the BI-RADS Fifth Edition

Assessment              All                 Negative Examination   Positive Examination
Two reviewers
  Right breast          0.82 (0.79–0.85)    0.82 (0.79–0.85)       0.83 (0.76–0.90)
  Left breast           0.82 (0.79–0.85)    0.81 (0.78–0.85)       0.84 (0.78–0.91)
  Total                 0.81 (0.78–0.84)    0.81 (0.78–0.84)       0.83 (0.76–0.90)
Quantra vs reviewers
  Right breast          0.60 (0.56–0.64)    0.61 (0.57–0.65)       0.58 (0.49–0.67)
  Left breast           0.55 (0.51–0.58)    0.55 (0.51–0.59)       0.54 (0.45–0.62)
  Total                 0.61 (0.58–0.64)    0.61 (0.58–0.65)       0.60 (0.52–0.69)
Volpara vs reviewers
  Right breast          0.50 (0.46–0.54)    0.51 (0.47–0.55)       0.46 (0.36–0.55)
  Left breast           0.53 (0.49–0.57)    0.54 (0.49–0.58)       0.49 (0.40–0.58)
  Total                 0.50 (0.46–0.53)    0.51 (0.47–0.54)       0.45 (0.36–0.54)
Quantra vs Volpara
  Right breast          0.36 (0.33–0.40)    0.37 (0.33–0.40)       0.36 (0.28–0.44)
  Left breast           0.33 (0.29–0.36)    0.32 (0.29–0.36)       0.33 (0.26–0.40)
  Total                 0.38 (0.35–0.41)    0.38 (0.35–0.41)       0.40 (0.32–0.47)

Note: Data are weighted kappa value (95% CI). Quantra (version 2.0) is manufactured by Hologic and Volpara (version 1.5.5) is manufactured by Mātakina Technology.

The distribution of breast density category was statistically significantly different between the visual assessment and the two volumetric measurements, regardless of whether the examination was negative or positive (p < 0.0001; Table 2).

TABLE 2: Breast Density Categories According to the BI-RADS Fifth Edition Assessed by the Radiologists and the Two Different Volumetric Measurements

Density        Reviewers                                                            p
Category (a)   Reviewer 1   Reviewer 2   Both         Quantra      Volpara      Quantra        Volpara        Total
All (n = 1185)
  A            40 (3.4)     46 (3.9)     42 (3.5)     31 (2.6)     2 (0.2)      < 0.0001 (b)   < 0.0001 (b)   < 0.0001 (b,c)
  B            238 (20.1)   312 (26.3)   262 (22.1)   485 (40.9)   187 (15.8)   < 0.0001 (d)   < 0.0001 (d)   < 0.0001 (c,d)
  C            647 (54.6)   630 (53.2)   650 (54.9)   486 (41.0)   497 (41.9)   < 0.0001 (e)   < 0.0001 (e)   < 0.0001 (c,e)
  D            260 (21.9)   197 (16.6)   231 (19.5)   183 (15.4)   499 (42.1)   < 0.0001 (f)
Negative examination (n = 993)
  A            37 (3.7)     42 (4.2)     39 (3.9)     27 (2.7)     1 (0.1)      < 0.0001 (b)   < 0.0001 (b)   < 0.0001 (b,c)
  B            191 (19.2)   256 (25.8)   212 (21.3)   408 (41.1)   157 (15.8)   < 0.0001 (d)   < 0.0001 (d)   < 0.0001 (c,d)
  C            538 (54.1)   523 (52.7)   538 (54.2)   404 (40.7)   413 (41.6)   < 0.0001 (e)   < 0.0001 (e)   < 0.0001 (c,e)
  D            227 (22.9)   172 (17.3)   204 (20.5)   154 (15.5)   422 (42.5)   < 0.0001 (f)
Positive examination (n = 192)
  A            3 (1.6)      4 (2.1)      3 (1.6)      4 (2.1)      1 (0.5)      0.009 (b)      < 0.0001 (b)   < 0.0001 (b,c)
  B            47 (24.5)    56 (29.2)    50 (26.0)    77 (40.1)    30 (15.6)    0.075 (d)      < 0.0001 (d)   < 0.0001 (c,d)
  C            109 (56.8)   107 (55.7)   112 (58.3)   82 (42.7)    84 (43.8)    0.014 (e)      < 0.0001 (e)   < 0.0001 (c,e)
  D            33 (17.2)    25 (13.0)    27 (14.1)    29 (15.1)    77 (40.1)    < 0.0001 (f)

Note: Except for p values, data are number (%) of examinations.
(a) Reviewers assessed breast density according to the BI-RADS density categories, whereas the density grades assigned by the Quantra software (version 2.0, Hologic) refer to BI-RADS-like breast composition categories, and the Volpara software (version 1.5.5, Mātakina Technology) assigns density grades as Volpara density grade (VDG).
(b) Intrareader comparison for reviewer 1 only.
(c) Comparison among reviewers, Quantra, and Volpara.
(d) Intrareader comparison for reviewer 2 only.
(e) Consensus between both reviewers.
(f) Comparison between Quantra and Volpara.
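The agreement statistics in Table 1 rest on two steps: computing a weighted kappa from a cross-tabulation of two raters' categories, and labeling the result with the Landis and Koch bands given in the statistical analysis section. The sketch below illustrates both in plain Python. The text does not state which weighting scheme was used, so the linear weights here are an assumption. The 4 × 4 matrix is hypothetical and is not study data.

```python
from typing import List


def weighted_kappa(matrix: List[List[int]]) -> float:
    """Linear-weighted Cohen kappa for a square table of counts (rows: rater 1, columns: rater 2)."""
    k = len(matrix)
    if k < 2 or any(len(row) != k for row in matrix):
        raise ValueError("matrix must be square with at least two categories")
    n = sum(sum(row) for row in matrix)
    row_share = [sum(row) / n for row in matrix]
    col_share = [sum(matrix[i][j] for i in range(k)) / n for j in range(k)]
    observed = 0.0
    expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = 1.0 - abs(i - j) / (k - 1)  # full credit on the diagonal, partial nearby
            observed += weight * matrix[i][j] / n
            expected += weight * row_share[i] * col_share[j]
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa: float) -> str:
    """Label a kappa value using the bands quoted in the text."""
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical cross-tabulation of categories A-D (visual in rows, software in columns).
example = [
    [10, 5, 0, 0],
    [4, 40, 12, 1],
    [0, 20, 90, 25],
    [0, 1, 15, 30],
]
kappa = weighted_kappa(example)
print(f"kappa = {kappa:.2f} ({landis_koch_label(kappa)})")

# Labels for the ranges reported in Table 1.
for value in (0.81, 0.61, 0.54, 0.45, 0.32):
    print(value, landis_koch_label(value))
```

Weighted kappa gives partial credit to disagreements between adjacent categories, which suits an ordinal scale such as density categories A through D. A different weighting scheme (for example, quadratic weights) would yield different values from the same table.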
For the visual assessment by the radiologists, category C was the most frequently assigned designation, accounting for 54.9% of all examinations. For Quantra, categories B and C were assigned with nearly equal frequency, each accounting for approximately 41% of examinations (40.9% and 41.0%, respectively), and the proportion of category A and B designations (43.5%) was higher than that for radiologists (25.6%) or Volpara (16.0%) (Fig. 1). For Volpara, category A was scarcely assigned, with only two examinations in total (0.2%), and the proportion of subjects with a category D designation (42.1%) was much higher than that given by the radiologists (19.5%) or Quantra (15.4%) (Fig. 2).

Of the 1185 examinations, 810 (68.4%) were concordant and 375 (31.6%) were discordant in the breast density category between the visual assessment and the Quantra software. For Volpara, 711 (60.0%) were concordant and 474 (40.0%) were discordant in the breast density category compared with the visual assessment. For the discordant group, 83.2% (312/375) were assessed as having a less dense breast with the Quantra software and 93.0% (441/474) were assessed as having a denser breast with the Volpara software, as compared with the visual assessment.

According to subject age, mammogram findings, and volumetric density data, no statistically significant difference was found between the concordant and discordant groups with the Volpara software (Table 3). For the Quantra software, however, the discordant group was found to be older and to have a smaller volume of fibroglandular tissue, larger volume of total breast, and lower percentage of volumetric breast density than the concordant group when comparing the mean values (p < 0.05; Table 3).

Regarding volumetric density data obtained from the two different volumetric measurements (Table 4), the means of all volumetric data from Quantra were statistically significantly higher than those from Volpara (p < 0.0001), but the data were statistically significantly correlated with each other (γ = 0.74–0.99; p < 0.0001).
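The concordance figures above use two different denominators: all 1185 examinations for the concordant and discordant shares, and the discordant group alone for the direction of disagreement. A short Python sketch makes the arithmetic explicit. The counts are those reported in the text. The helper name and the dictionary layout are illustrative choices.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


TOTAL_EXAMINATIONS = 1185

# Counts reported in the text; "shifted" is the discordant subset that moved in the
# dominant direction (less dense for Quantra, denser for Volpara).
software = {
    "Quantra": {"concordant": 810, "discordant": 375, "shifted": 312},
    "Volpara": {"concordant": 711, "discordant": 474, "shifted": 441},
}

for name, counts in software.items():
    # Concordant and discordant groups must account for every examination.
    assert counts["concordant"] + counts["discordant"] == TOTAL_EXAMINATIONS
    print(
        name,
        "concordant", percent(counts["concordant"], TOTAL_EXAMINATIONS),
        "discordant", percent(counts["discordant"], TOTAL_EXAMINATIONS),
        "shifted among discordant", percent(counts["shifted"], counts["discordant"]),
    )
```

Keeping the denominator explicit matters when the same counts are later expressed as a share of all examinations rather than of the discordant group.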
For the presence of mammographic lesions (mass or calcification), no statistically significant difference in the mean volumetric breast density was found between the negative and positive examinations (16.9% ± 10.7% and 16.8% ± 9.7% in p AJR:206, May 2016 1059

Quantra, p = 0.970; 15.2% ± 7.8% and 14.7% ± 7.8% in Volpara, p = 0.460).

TABLE 3: Comparison of Patient Age, Mammographic Abnormality, and Volumetric Density Data Between Concordant and Discordant Groups

                                          Quantra                                      Volpara
Factor                                    Concordant       Discordant       p          Concordant       Discordant       p
Patient age (y)                           51.4 ± 9.7       54.3 ± 10.5      < 0.0001   52.0 ± 9.9       52.8 ± 10.2      0.150
Volume of fibroglandular tissue (cm³)
  Right breast                            74.3 ± 44.8      67.5 ± 46.4      0.036      55.2 ± 30.2      56.1 ± 28.0      0.651
  Left breast                             72.3 ± 40.3      65.0 ± 35.8      0.009      51.8 ± 25.6      52.0 ± 24.3      0.881
Volume of the breast (cm³)
  Right breast                            454.5 ± 239.0    514.9 ± 258.1    0.001      421.2 ± 232.3    397.2 ± 201.1    0.098
  Left breast                             465.2 ± 244.7    513.7 ± 248.2    0.007      420.4 ± 236.7    394.1 ± 187.8    0.077
Volumetric breast density (%)
  Right breast                            19.2 ± 12.2      13.6 ± 6.8       < 0.0001   16.0 ± 9.1       15.9 ± 7.0       0.929
  Left breast                             18.5 ± 11.7      13.9 ± 7.1       < 0.0001   15.2 ± 8.6       14.8 ± 6.3       0.403
Mammographic examination, no. (%)
  Negative                                675 (83.3)       318 (84.8)       0.524      603 (84.8)       390 (82.3)       0.247
  Positive                                135 (16.7)       57 (15.2)                   108 (15.2)       84 (17.7)

Note: Except where noted otherwise, data are mean ± SD. Quantra (version 2.0) is manufactured by Hologic and Volpara (version 1.5.5) is manufactured by Mātakina Technology.

TABLE 4: Comparison and Correlation of Volumetric Density Data Between Quantra and Volpara

Volumetric density data                  Quantra                    Volpara                      p          Correlation coefficient (a)
All
  Fibroglandular tissue, right (cm³)     72.2 ± 45.4 (7–402)        55.6 ± 29.3 (8.2–211.9)      < 0.0001   0.86
  Fibroglandular tissue, left (cm³)      70.1 ± 39.1 (7–280)        51.9 ± 25.1 (6.7–166.9)      < 0.0001   0.86
  Breast volume, right (cm³)             472.6 ± 246.4 (43–1688)    411.6 ± 220.5 (30.3–1554)    < 0.0001   0.99
  Breast volume, left (cm³)              479.9 ± 246.7 (41–1737)    410.1 ± 219.0 (34.1–1483.4)  < 0.0001   0.99
  Volumetric breast density, right (%)   17.5 ± 11.1 (3–84)         16.0 ± 8.3 (4.1–43.8)        < 0.0001   0.77
  Volumetric breast density, left (%)    17.1 ± 10.7 (3–78)         15.1 ± 7.8 (4.2–39.3)        < 0.0001   0.75
Negative examination
  Fibroglandular tissue, right (cm³)     70.7 ± 43.7 (7–369)        54.8 ± 29.0 (8.2–211.9)      < 0.0001   0.87
  Fibroglandular tissue, left (cm³)      69.7 ± 39.0 (7–280)        51.8 ± 24.9 (6.7–153.8)      < 0.0001   0.86
  Breast volume, right (cm³)             464.4 ± 239.9 (43–1688)    403.7 ± 215.2 (30.3–1554)    < 0.0001   0.99
  Breast volume, left (cm³)              475.1 ± 243.5 (41–1737)    404.9 ± 216.0 (34.1–1483.4)  < 0.0001   0.99
  Volumetric breast density, right (%)   17.6 ± 11.3 (3–84)         16.0 ± 8.3 (4.4–42.6)        < 0.0001   0.77
  Volumetric breast density, left (%)    17.3 ± 11.0 (3–78)         15.3 ± 7.8 (4.2–38.4)        < 0.0001   0.74
Positive examination
  Fibroglandular tissue, right (cm³)     79.2 ± 51.9 (12–402)       59.3 ± 30.5 (16.3–187.4)     < 0.0001   0.79
  Fibroglandular tissue, left (cm³)      71.6 ± 39.6 (11–265)       52.1 ± 26.0 (15.4–166.9)     < 0.0001   0.86
  Breast volume, right (cm³)             510.6 ± 271.7 (59–1484)    448.0 ± 240.8 (49.2–1287.3)  < 0.0001   0.99
  Breast volume, left (cm³)              499.9 ± 259.1 (81–1616)    431.5 ± 230.4 (56.5–1380.8)  < 0.0001   0.99
  Volumetric breast density, right (%)   17.5 ± 10.6 (5–66)         15.7 ± 8.5 (4.1–43.8)        0.001      0.75
  Volumetric breast density, left (%)    16.3 ± 9.6 (5–67)          14.3 ± 7.7 (4.6–39.3)        < 0.0001   0.81

Note: Except where noted otherwise, data are mean ± SD (range).
(a) Pearson correlation between Quantra (version 2.0, Hologic) and Volpara (version 1.5.5, Mātakina Technology) (p < 0.0001).

Discussion

The BI-RADS fifth edition has revised the mammographic breast density categories by excluding percentage quartiles for each of the four density categories to emphasize the text descriptions of breast density, which reflect the masking effect of dense fibroglandular tissue on mammographic depiction of noncalcified lesions. This more subjective assessment system will change the distribution of assigned density categories [18]. For instance, more mammograms might be categorized toward dense breast tissue when there is a localized dense tissue that would have been considered nondense breast tissue
according to the percentage quartile assessment [19]. Before the fifth edition, nondense and dense breast tissue were evenly distributed in the general screening population, with 10% having a fatty designation, 40% scattered, 40% heterogeneously dense, and 10% considered extremely dense [21, 22]. According to the revised BI-RADS used in the current study, however, dense breast tissue was more common than nondense breast tissue, as expected, with 3.5% having a fatty designation, 22.1% scattered, 54.9% heterogeneously dense, and 19.5% considered extremely dense. This is similar to a recent study that found 1.6%, 14.3%, 69.1%, and 15.0% of subjects in categories A, B, C, and D, respectively [23].

Regarding interobserver agreement of the BI-RADS density category, agreement was moderate to substantial with the fourth edition despite the wide variability of the kappa values (κ = 0.02–0.87), and a previous study found similarly substantial agreement (κ = 0.72) with the revised BI-RADS lexicon [11, 23, 24]. However, our results showed better agreement (κ = 0.81–0.84), which can be attributed both to the simplicity of the revised BI-RADS density categorization, which is based on the relative possibility of lesion obscuration rather than estimated density percentage quartiles, and to the relatively longer period during which radiologists have used the revised criteria in practice.

Considering the different distribution of breast density categories between the two editions of BI-RADS, neither vendor-provided breast density grading system for volumetric density measurement can be expected to give radiologists a consistent breast composition assessment that accurately reflects the new breast density categories.
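The agreement labels quoted above (moderate, substantial) appear to follow the scale of Landis and Koch [20]. As a minimal sketch, assuming that scale is the one intended, the mapping from a kappa value to its verbal label can be written as follows. The function name `landis_koch_label` is illustrative and does not come from the study.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to the Landis and Koch (1977) agreement label.

    Assumption: the article's verbal labels follow this scale
    (< 0 poor, 0-0.20 slight, 0.21-0.40 fair, 0.41-0.60 moderate,
    0.61-0.80 substantial, 0.81-1.00 almost perfect).
    """
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    if kappa < 0.0:
        return "poor"
    # Upper bounds are inclusive, so 0.60 is still "moderate".
    bands = ((0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial"))
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


if __name__ == "__main__":
    # Kappa values quoted in the text for interobserver agreement.
    for k in (0.02, 0.72, 0.81, 0.84, 0.87):
        print(f"kappa = {k:.2f}: {landis_koch_label(k)}")
```

On this scale, the interobserver values reported here (κ = 0.81–0.84) fall in the highest band, one band above the κ = 0.72 reported previously for the revised lexicon.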
The thresholds of the density grades in each program were set against visual assessment by radiologists according to the BI-RADS fourth edition and, in particular, quartiles of percentage density [7, 9, 11]. However, quartiles of percentage density are not the criteria for the revised density categories, so a density category derived from thresholds of percentage volumetric breast density may not correlate with a subjective assessment of lesion obscuration by dense breast tissue, as mentioned already. Compared with studies of breast density categories based on the BI-RADS fourth edition, which showed fair to substantial agreement (κ = 0.40–0.80) for the Volpara software and an intraclass correlation ranging from 0.63 to 0.73 for the Quantra software compared with visual assessment [9, 11, 13, 25], we found relatively lower agreement between visual and volumetric assessments according to the BI-RADS fifth edition. There was moderate agreement for the Volpara software (κ = 0.45–0.54) and moderate to substantial agreement for the Quantra software (κ = 0.54–0.61). Specifically, Quantra classified more mammographic examinations as having nondense breast tissue (43.5% vs 25.6%) and Volpara identified more mammographic examinations as category D, or dense breast tissue (42.1% vs 19.5%) (Figs. 1 and 2). With the Quantra software, 26.3% (312/1185) of subjects (83.2% of 375 discordant studies) were assessed at a lower category, and with the Volpara software, 37.2% (441/1185) of subjects (93.0% of 474 discordant studies) were assessed at a higher category. Because percentage quartiles for the density categories are eliminated in the revised BI-RADS atlas, examinations that would have been assessed as nondense breast tissue by Quantra tended to be identified as dense breast tissue by visual assessment because of the possibility of an obscured lesion according to the new BI-RADS lexicon.
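The discordance percentages above can be rechecked from the counts given in the text. The short sketch below does only that arithmetic. The variable names and the helper `pct` are illustrative, and the counts are copied from the paragraph above.

```python
# Counts quoted in the text (1185 examinations in total).
TOTAL = 1185
QUANTRA_DISCORDANT = 375   # examinations discordant with visual assessment
QUANTRA_LOWER = 312        # of those, assessed at a lower category by Quantra
VOLPARA_DISCORDANT = 474
VOLPARA_HIGHER = 441       # of those, assessed at a higher category by Volpara


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    print("Quantra lower, share of all examinations:",
          pct(QUANTRA_LOWER, TOTAL))
    print("Quantra lower, share of discordant examinations:",
          pct(QUANTRA_LOWER, QUANTRA_DISCORDANT))
    print("Volpara higher, share of all examinations:",
          pct(VOLPARA_HIGHER, TOTAL))
    print("Volpara higher, share of discordant examinations:",
          pct(VOLPARA_HIGHER, VOLPARA_DISCORDANT))
```

The two denominators answer different questions: dividing by 1185 gives how often the shift occurs in the whole cohort, whereas dividing by the number of discordant studies gives how one-sided the disagreement is.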
Similarly, the discordant group was associated with a smaller volume of fibroglandular tissue, a larger total breast volume, and a lower percentage volumetric breast density in Quantra, which favored a designation of nondense breast tissue by volumetric measurements (Table 3). In contrast, Volpara appeared to be more concordant with the revised BI-RADS atlas for the assignment of nondense breast tissue. However, VDG scores tended to be overestimated compared with the BI-RADS density scores. For example, category A was scarcely assigned and category D accounted for more than 40% of the designations. Even with the BI-RADS fourth edition, previous studies reported similar results: of discordant examinations, 81.5–92.6% (mean, 88.7%) were overestimated by Volpara compared with visual assessment, and 35.9–74.7% of examinations (mean, 58.4%) that were assessed as category 3 by visual assessment were designated as VDG 4 [11, 13, 14]. Likewise, according to the volumetric breast density data, there was a statistically significant difference between the two volumetric measurements (Table 4). Previous studies reported that Quantra gave larger values and a greater range of fibroglandular volume and percentage breast density than Volpara, although the two volumetric measurements showed good correlation [15, 16]. Our results are comparable to those of the previous studies. The reason that Volpara placed more breasts in the denser grades of its scale despite measuring a smaller percentage breast density than Quantra is unclear, but those authors speculated that the two software programs differ in the reference population used for their breast density grading systems and in their own criteria or cutoffs for grading according to the range of percentage breast density calculated by their own algorithms.
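The speculation about differing cutoffs can be made concrete with a small sketch. It shows how a program that reports a lower percentage density can still assign a higher grade when its cutoffs are lower. Both cutoff sets below are hypothetical values chosen only for illustration and are not the vendors' actual thresholds. The function name `density_grade` is likewise an assumption. The two input percentages are the mean right-breast values from Table 4, used only as plausible inputs.

```python
from bisect import bisect_right
from typing import Sequence


def density_grade(vbd_percent: float, cutoffs: Sequence[float]) -> int:
    """Return a grade from 1 to 4 for a volumetric breast density (%).

    `cutoffs` holds three ascending thresholds separating the four grades.
    A value exactly equal to a cutoff is placed in the higher grade.
    """
    if len(cutoffs) != 3 or list(cutoffs) != sorted(cutoffs):
        raise ValueError("cutoffs must be three ascending values")
    return 1 + bisect_right(cutoffs, vbd_percent)


# HYPOTHETICAL cutoffs, not taken from either vendor.
LOW_CUTOFFS = (5.0, 8.0, 15.0)     # program whose scale saturates early
HIGH_CUTOFFS = (10.0, 20.0, 35.0)  # program whose scale saturates late

if __name__ == "__main__":
    # Mean right-breast volumetric breast density from Table 4.
    print("16.0% with low cutoffs  -> grade", density_grade(16.0, LOW_CUTOFFS))
    print("17.5% with high cutoffs -> grade", density_grade(17.5, HIGH_CUTOFFS))
```

Under these assumed cutoffs, the smaller measurement (16.0%) lands in grade 4 while the larger one (17.5%) lands in grade 2, which is the qualitative pattern described above for Volpara and Quantra.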
The differences between the two software programs, both in the distribution of breast density categories relative to visual assessment and in the volumetric breast density data, might be attributable to an intrinsic difference in the algorithms' abilities to estimate fibroglandular tissue volume or total breast volume and to the established thresholds for density categories. Quantra estimates the 2D thickening distribution of fibroglandular tissue density after adding the fibroglandular tissue density estimated per pixel by using given image acquisition parameters, such as breast thickness [17, 19]. In contrast, Volpara estimates the fibroglandular tissue and total breast volume by adding the pixel values compared with a reference pixel of fat to determine the difference in x-ray attenuation and tissue composition, which is less dependent on accurate breast thickness [17, 19]. Another possible reason is that the skin is included in the breast volume estimated by Quantra but not in that estimated by Volpara, which would lead to a larger total breast volume measured by Quantra. Considering these discrepancies between visual assessment and volumetric breast density measurements, volumetric breast density measurements may be used in the future to evaluate the role of quantitative breast density data in predicting breast cancer risk, whereas subjective breast density assessment may be performed to evaluate mammographic sensitivity [23].

This study has some limitations. First, this is a single-institution study and all the mammograms in this study were obtained from a single mammographic unit. Further study is needed to generalize and validate our results. Second, volumetric breast density data were evaluated for the right and the left breast separately because unilateral examinations were included and Volpara provides volumetric density data per breast, rather than per patient.
Finally, a very high interobserver agreement shown for visual assessment in the current study could be because only two readers performed the study. Because several previous studies have shown a variable and, in many cases, a lower interobserver agreement [11, 24], it remains to be seen whether the results would change by increasing the number of observers.

In conclusion, more mammographic examinations were classified as nondense breast tissue using the Quantra software and as dense breast tissue using the Volpara software, as compared with a visual assessment according to the BI-RADS fifth edition.

References

1. Boyd NF, Guo H, Martin LJ, et al. Mammographic density and the risk and detection of breast cancer. N Engl J Med 2007; 356:227–236
2. McCormack VA, dos Santos Silva I. Breast density and parenchymal patterns as markers of breast cancer risk: a meta-analysis. Cancer Epidemiol Biomarkers Prev 2006; 15:1159–1169
3. Assi V, Warwick J, Cuzick J, Duffy SW. Clinical and epidemiological issues in mammographic density. Nat Rev Clin Oncol 2011; 9:33–40
4. Harvey JA, Bovbjerg VE. Quantitative assessment of mammographic breast density: relationship with breast cancer risk. Radiology 2004; 230:29–41
5. Carney PA, Miglioretti DL, Yankaskas BC, et al. Individual and combined effects of age, breast density, and hormone replacement therapy use on the accuracy of screening mammography. Ann Intern Med 2003; 138:168–175
6. Highnam R, Brady M, Yaffe M, Karssemeijer N, Harvey JA. Robust breast composition measurement: Volpara. In: Marti J, ed. International Workshop on Digital Mammography 2010. Heidelberg, Germany: Springer, 2010:342–349
7. Ciatto S, Bernardi D, Calabrese M, et al. A first evaluation of breast radiological density assessment by QUANTRA software as compared to visual classification. Breast 2012; 21:503–506
8. Skippage P, Wilkinson L, Allen S, Roche N, Dowsett M, A'Hern R. Correlation of age and HRT use with breast density as assessed by Quantra. Breast J 2013; 19:79–86
9. Singh JM, Fallenberg EM, Diekmann F, et al. Volumetric breast density assessment: reproducibility in serial examinations and comparison with visual assessment. Rofo 2013; 185:844–848
10. Seo JM, Ko ES, Han BK, Ko EY, Shin JH, Hahn SY. Automated volumetric breast density estimation: a comparison with visual assessment. Clin Radiol 2013; 68:690–695
11. Gweon HM, Youk JH, Kim JA, Son EJ. Radiologist assessment of breast density by BI-RADS categories versus fully automated volumetric assessment. AJR 2013; 201:692–697
12. Engelken F, Singh JM, Fallenberg EM, Bick U, Bottcher J, Renz DM. Volumetric breast composition analysis: reproducibility of breast percent density and fibroglandular tissue volume measurements in serial mammograms. Acta Radiol 2014; 55:32–38
13. Gubern-Mérida A, Kallenberg M, Platel B, Mann RM, Marti R, Karssemeijer N. Volumetric breast density estimation from full-field digital mammograms: a validation study. PLoS One 2014; 9:e85952
14. Ko SY, Kim EK, Kim MJ, Moon HJ. Mammographic density estimation with automated volumetric breast density measurement. Korean J Radiol 2014; 15:313–321
15. Schmachtenberg C, Hammann-Kloss S, Bick U, Engelken F. Intraindividual comparison of two methods of volumetric breast composition assessment. Acad Radiol 2015; 22:447–452
16. Morrish OW, Tucker L, Black R, Willsher P, Duffy SW, Gilbert FJ. Mammographic breast density: comparison of methods for quantitative evaluation. Radiology 2015; 275:356–365
17. Alonzo-Proulx O, Mawdsley GE, Patrie JT, Yaffe MJ, Harvey JA. Reliability of automated breast density measurements. Radiology 2015; 275:366–376
18. Sickles EA, D'Orsi CJ, Bassett LW, et al. ACR BI-RADS Mammography. In: D'Orsi CJ, Sickles EA, Mendelson EB, et al., eds. ACR BI-RADS Atlas, Breast Imaging Reporting and Data System. Reston, VA: American College of Radiology, 2013
19. Winkler NS, Raza S, Mackesy M, Birdwell RL. Breast density: clinical implications and assessment methods. RadioGraphics 2015; 35:316–324
20. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977; 33:159–174
21. Kerlikowske K, Zhu W, Hubbard RA, et al. Outcomes of screening mammography by frequency, breast density, and postmenopausal hormone therapy. JAMA Intern Med 2013; 173:807–816
22. Freer PE. Mammographic breast density: impact on breast cancer risk and implications for screening. RadioGraphics 2015; 35:302–315
23. Machida Y, Tozaki M, Shimauchi A, Yoshida T. Breast density: the trend in breast cancer screening. Breast Cancer 2015; 22:253–261
24. Winkel RR, von Euler-Chelpin M, Nielsen M, et al. Inter-observer agreement according to three methods of evaluating mammographic density and parenchymal pattern in a case-control study: impact on relative risk of breast cancer. BMC Cancer 2015; 15:274
25. Lee HN, Sohn YM, Han KH. Comparison of mammographic density estimation by Volpara software with radiologists' visual assessment: analysis of clinical-radiologic factors affecting discrepancy between them. Acta Radiol 2015; 56:1061–1068