Deep-Learning Based Semantic Labeling for 2D Mammography & Comparison of Complexity for Machine Learning Tasks

Deep-Learning Based Semantic Labeling for 2D Mammography & Comparison of Complexity for Machine Learning Tasks. Paul H. Yi, MD; Abigail Lin, BSE; Jinchi Wei, BSE; Haris I. Sair, MD; Ferdinand K. Hui, MD; Gregory D. Hager, PhD; & Susan C. Harvey, MD. Radiology Artificial Intelligence Lab (RAIL), Johns Hopkins University School of Medicine & Malone Center for Engineering in Healthcare, Johns Hopkins University Whiting School of Engineering

Introduction

Introduction

Introduction arXiv 2017: 112,000 frontal CXRs!

Introduction https://lukeoakdenrayner.wordpress.com/2018/04/30/theunreasonable-usefulness-of-deep-learning-in-medical-image-datasets/

Introduction Radiologist PACS workflow depends on accurate semantic labeling. Although DICOM stores metadata, its inclusion is inconsistent, can vary between equipment manufacturers, & can be inaccurate. An automated method for semantic labeling could: 1) improve radiologist workflow and 2) facilitate curation of medical imaging datasets for machine learning purposes.

How Many Images Do You Need? arXiv 2015: ~5,000 images per class. J Digit Imaging 2017: this number may depend on the difficulty of the training set.

Mammography as a Model? Mammography is recommended annually for all women over age 40 by the ACR. Due to strict national regulations for DICOM labeling, mammography serves as a potential model for exploring the nuances of developing semantic labeling algorithms. Lessons learned can be applied to other modalities and to more complex but analogous problems!

Purpose 1) Develop deep convolutional neural networks (DCNNs) for automated classification of 2D-mammography by: (a) view, (b) laterality, (c) breast density, and (d) normal/benign vs. malignant masses. 2) Compare the performance of DCNNs on these tasks of varying complexity.

Methods- Dataset Digital Database for Screening Mammography*: 3034 2D mammography images (2620 patients) from 4 USA hospitals (MGH, Wake Forest, Sacred Heart, WUSTL); normal and benign or malignant masses (pathology ground truth). Labels: mammographic view (craniocaudal [CC] vs. mediolateral oblique [MLO]), breast laterality (right vs. left), breast density (4 BI-RADS categories), and normal/benign mass vs. malignant mass. *Updated CBIS-DDSM: Rebecca Sawyer Lee, Francisco Gimenez, Assaf Hoogi, Daniel Rubin (2016). Curated Breast Imaging Subset of DDSM. The Cancer Imaging Archive.

Methods- Dataset Data split into training (70%), validation (10%), & testing (20%) sets.
Mammographic view - Total: CC 1429 (47%), MLO 1605 (53%); Training: CC 1000, MLO 1123; Validation: CC 143, MLO 161; Testing: CC 288, MLO 323.
Laterality - Total: Left 1560 (51%), Right 1474 (49%); Training: Left 1092, Right 1032; Validation: Left 156, Right 148; Testing: Left 314, Right 296.
Breast density (BI-RADS) - Total: A 416 (14%), B 1182 (39%), C 928 (31%), D 508 (16%); Training: A 291, B 827, C 649, D 355; Validation: A 42, B 119, C 93, D 51; Testing: A 85, B 238, C 188, D 104.
Benign vs. malignant - Total: Benign 1704 (56%), Malignant 1330 (44%); Training: Benign 1193, Malignant 931; Validation: Benign 171, Malignant 133; Testing: Benign 342, Malignant 268.
A = Fatty; B = Scattered fibroglandular; C = Heterogeneously dense; D = Dense
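The slides report only the 70/10/20 percentages, not how the split was performed; a minimal sketch of a stratified split with scikit-learn, where `image_paths` and `labels` are illustrative placeholders rather than names from the original work:

```python
from sklearn.model_selection import train_test_split

image_paths = [...]  # placeholder: list of mammogram file paths
labels = [...]       # placeholder: corresponding labels, e.g., "CC" / "MLO"

# Hold out 30% of the data, stratified by label, then split that 30%
# into validation (10% overall) and testing (20% overall).
train_paths, rest_paths, train_labels, rest_labels = train_test_split(
    image_paths, labels, test_size=0.30, stratify=labels, random_state=42)
val_paths, test_paths, val_labels, test_labels = train_test_split(
    rest_paths, rest_labels, test_size=2 / 3, stratify=rest_labels, random_state=42)
```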

Methods- DCNN Transfer learning was used to train, validate, and test a ResNet-50 DCNN pretrained on ImageNet on these mammography datasets. The last fully connected layer was fine-tuned using these datasets. During each training epoch, images were augmented via rotations, cropping, and horizontal flipping.
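The slide specifies an ImageNet-pretrained ResNet-50 with only the last fully connected layer fine-tuned, but not the framework; a minimal PyTorch/torchvision sketch under those assumptions (the rotation range, learning rate, batch size, and dataset paths are illustrative):

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Augmentation described on the slide: rotations, cropping, horizontal flips.
train_transform = transforms.Compose([
    transforms.RandomRotation(degrees=10),          # rotation range is an assumption
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),  # ImageNet stats
])

# ImageNet-pretrained ResNet-50; freeze everything except a new final layer.
model = models.resnet50(pretrained=True)            # newer torchvision prefers weights=...
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)       # e.g., CC vs. MLO

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)

# One training epoch over a hypothetical ImageFolder layout of mammograms.
train_data = datasets.ImageFolder("mammo/train", transform=train_transform)
loader = torch.utils.data.DataLoader(train_data, batch_size=32, shuffle=True)
for images, targets in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), targets)
    loss.backward()
    optimizer.step()
```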

Methods Receiver operating characteristic (ROC) curves with area under the curve (AUC) were generated. AUCs were compared between DCNNs using the DeLong nonparametric method (significance set at p < 0.05).
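A minimal sketch of generating the ROC curve and AUC with scikit-learn; the DeLong comparison itself is not part of scikit-learn and would require a separate implementation (`y_true` and `y_score` are illustrative):

```python
from sklearn.metrics import roc_curve, roc_auc_score

# y_true: ground-truth binary labels for the test set.
# y_score: predicted probability of the positive class (e.g., softmax output).
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)
print(f"AUC = {auc:.3f}")
```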

Methods Heatmaps were created through class activation mapping: http://cnnlocalization.csail.mit.edu
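Class activation mapping (the technique behind the linked page) weights the final convolutional feature maps by the fully connected weights of the predicted class; a sketch continuing the assumed PyTorch ResNet-50 example above, where `img_tensor` is a hypothetical preprocessed 3x224x224 input:

```python
import torch
import torch.nn.functional as F

# Capture the last convolutional feature maps (layer4 output: 1 x 2048 x 7 x 7).
features = {}
model.layer4.register_forward_hook(lambda module, inp, out: features.update(conv=out))

model.eval()
with torch.no_grad():
    logits = model(img_tensor.unsqueeze(0))          # img_tensor: (3, 224, 224)
    pred = logits.argmax(dim=1).item()
    # Weight the 2048 low-resolution feature maps by the fc weights of the predicted class.
    weights = model.fc.weight[pred]                  # shape (2048,)
    cam = (weights[:, None, None] * features["conv"][0]).sum(dim=0)
    cam = F.relu(cam)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)   # scale to [0, 1]
    # Upsample the 7x7 map to the input size for overlay on the mammogram.
    cam = F.interpolate(cam[None, None], size=(224, 224), mode="bilinear")[0, 0]
```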

Results: Simplest Task The DCNN trained to classify mammographic view achieved an AUC of 1.0.

Results: Slightly More Difficult Task The DCNN trained to classify breast laterality initially misclassified right and left breasts not infrequently (AUC 0.89, 77% accuracy). However, after discontinuing horizontal flips, the AUC significantly improved to 1.0 (p < 0.0001)!

Results: Laterality

Results: Most Difficult Tasks Classification of normal/benign vs. malignant masses proved more difficult: AUC of 0.75 (p < 0.0001 compared to both the view and laterality DCNNs). Similarly, breast density classification was less successful, with 68% accuracy.

Discussion Semantic labeling DCNNs achieved an AUC of 1.0 for mammographic view and laterality (obvious differences) using 2427 training/validation images. J Digit Imaging 2017: deep convolutional neural networks perform rather well in distinguishing images that have many obvious differences, such as chest vs. abdominal radiographs (AUC = 1.00), and require only a small amount of training data. 45 chest and 45 abdominal XRs were sufficient!

Discussion The DCNNs were less successful at more complex tasks, likely owing to increased subtlety within these categories. Datasets to train high-performing DCNNs for more complex tasks need to be larger than those used for simple tasks (e.g., semantic labeling).

Discussion More augmentation did not always improve performance! The laterality DCNNs demonstrated significantly improved performance when horizontal flips were omitted. ARRS 2018: interestingly, the network initially miscategorized large right for large left effusions; however, 100% accuracy was achieved for correct laterality identification after discontinuing horizontal flipping during data augmentation.
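In the assumed PyTorch setup sketched earlier, this fix amounts to dropping the horizontal flip from the augmentation pipeline for labels that mirroring would invert; a sketch, not the authors' code:

```python
from torchvision import transforms

# Augmentation for the laterality task: rotations and crops kept, horizontal flips omitted,
# since mirroring a mammogram turns a right breast into a plausible left breast.
laterality_transform = transforms.Compose([
    transforms.RandomRotation(degrees=10),           # rotation range is an assumption
    transforms.RandomResizedCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
```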

Discussion: Is More Better? Canonical ML wisdom is to always perform data augmentation during training to decrease overfitting. However, this should be performed thoughtfully! N.B. We don't know why certain techniques may help or hurt.

Discussion: How low can we go? We know that DCNNs for simpler tasks require less data! But how low can we go?
Work / Views / Imaging / Total # training & validation images / AUC:
- Rajkomar et al., J Digit Imaging 2017: frontal vs. lateral, chest X-ray, 150,000* images, AUC 1
- Yi et al. (current work): CC vs. MLO, mammography, 2427 images, AUC 1
- Lakhani et al., J Digit Imaging 2017: chest vs. abdomen, X-ray, 90 images, AUC 1
*Augmented. Future work?

Conclusions DCNN semantic labeling of 2D-mammography is feasible using relatively small image datasets. However, automated classification of more difficult tasks will likely require larger datasets. Risks of image augmentation?

Thank you!