ARTIFICIAL INTELLIGENCE FOR DIGITAL PATHOLOGY
Kyunghyun Paeng, Co-founder and Research Scientist, Lunit Inc.
AGENDA
1. BACKGROUND: DIGITAL PATHOLOGY
2. APPLICATIONS
   BREAST CANCER
   PROSTATE CANCER
3. DEMONSTRATIONS
4. CONCLUSION
BACKGROUND: DIGITAL PATHOLOGY
DIAGNOSTIC PROCEDURE
Patient → Detection (X-ray, CT, MRI, ...) → Diagnosis (biopsy, resection, ...) → Treatment
Radiology (detection), Pathology (diagnosis), Oncology (treatment)
BACKGROUND: DIGITAL PATHOLOGY
LIMITATIONS OF CONVENTIONAL PATHOLOGY
Diagnosis (biopsy, resection, ...): Pathology (slide, report)
(-) Archiving
(-) Workflow
(-) Analysis
BACKGROUND: DIGITAL PATHOLOGY
RISE OF DIGITAL PATHOLOGY
Diagnosis (biopsy, resection, ...): Pathology with digital pathology
(+) Archiving
(+) Workflow
(+) Analysis
BACKGROUND: DIGITAL PATHOLOGY
WHY DO WE NEED AI IN DIGITAL PATHOLOGY?
(+) Reproducibility
(+) Accuracy
(+) Workload reduction
25% disagreement among pathologists in breast biopsy reports.
Source: "Diagnostic Concordance Among Pathologists Interpreting Breast Biopsy Specimens," JAMA, 2015.
BACKGROUND: DIGITAL PATHOLOGY
CHALLENGES IN AI FOR DIGITAL PATHOLOGY
1. Gigapixel images (figure label: ~100,000 pixels)
2. Quality variation
3. Ambiguity in ground-truth definition (figure labels: "3!", "4!", "3?", "4?")
Figure labels: Grade 1, Grade 2, Grade 3
KEY APPLICATIONS
#1. Tumor proliferation score prediction in breast resection specimens.
#2. Gleason score prediction in prostate biopsy specimens.
APPLICATION #1: BREAST CANCER
WHAT IS TUMOR PROLIFERATION SCORE?
Breast resection specimen
Proliferation score: based on the number of mitoses in 10 consecutive HPFs (high-power fields)
Score 1: up to 6 mitoses
Score 2: 6 to 10 mitoses
Score 3: 10 or more mitoses
Prognosis ranges from good (Score 1) to bad (Score 3).
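The mapping from a mitosis count to a score can be sketched as below. The slide's ranges share their end points (6 and 10), so the boundary handling here is an assumption: counts of 5 or fewer map to Score 1, 6 through 10 to Score 2, and anything above 10 to Score 3. The function name `proliferation_score` is hypothetical.

```python
def proliferation_score(mitosis_count):
    """Map a mitosis count (over 10 consecutive HPFs) to a score of 1, 2 or 3.

    Boundary handling is an assumption; the slide only gives the
    approximate ranges "~6", "6~10" and "10~".
    """
    if mitosis_count < 0:
        raise ValueError("mitosis_count must be non-negative")
    if mitosis_count <= 5:
        return 1  # assumed: fewer than 6 mitoses
    if mitosis_count <= 10:
        return 2  # assumed: 6 through 10 mitoses
    return 3      # assumed: more than 10 mitoses


example_scores = [proliferation_score(n) for n in (3, 8, 15)]
```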
APPLICATION #1: BREAST CANCER
TUMOR PROLIFERATION SCORE PREDICTION: Data statistics
TUPAC16: Tumor Proliferation Assessment Challenge 2016 (MICCAI Grand Challenge)
Training dataset: 500 slides, labeled with proliferation score
Test dataset: 321 slides, labeled with proliferation score
Auxiliary dataset: 656 ROIs from 73 slides, labeled with mitosis locations (Mitosis #1 (x, y), ..., Mitosis #N (x, y))
APPLICATION #1: BREAST CANCER
TUMOR PROLIFERATION SCORE PREDICTION: System overview
Phase 1: Handling whole slide images
Whole slide image → Tissue region extraction → Patch extraction at x40 → ROI detection using cell density → Stain normalization
Phase 2: Mitosis detection
Mitosis Detection Network, with the auxiliary set for mitosis detection
Phase 3: Score prediction
(1) The number of mitoses, (2) the number of cells → Feature vector based on statistical information → Support Vector Machine → Proliferation score
APPLICATION #1: BREAST CANCER
TUMOR PROLIFERATION SCORE PREDICTION: Phase 1: Handling whole slide images
Tissue region extraction: resizing a whole slide image, finding a threshold, morphological operations.
Patch extraction at x40: patch extraction with 10 HPFs size.
ROI detection using cell density: cell detection in each patch, 30 ROIs selection.
Stain normalization.
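The ROI selection step can be sketched as picking the 30 patches with the most detected cells. This is only an illustration: it assumes all patches have the same size (so cell count stands in for cell density), and the function name `select_rois` and the dictionary input are hypothetical. The cell detector itself is not shown.

```python
import heapq


def select_rois(cell_counts, num_rois=30):
    """Return the ids of the patches with the highest cell counts.

    cell_counts: dict mapping a patch id to the number of detected cells.
    Assumes equally sized patches, so count is proportional to density.
    """
    return heapq.nlargest(num_rois, cell_counts, key=cell_counts.get)


# Toy input: 100 patches whose cell count grows with the patch id.
toy_counts = {patch_id: patch_id * 10 for patch_id in range(100)}
rois = select_rois(toy_counts)
```

The selected patches would then go through stain normalization before mitosis detection.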
APPLICATION #1: BREAST CANCER
TUMOR PROLIFERATION SCORE PREDICTION: Phase 2: Mitosis detection
Mitosis Detection Network (trained with the auxiliary set for mitosis detection)
Input: 128 x 128
conv 1, 3x3, 16
resblock 1.1 ... resblock 1.3, 3x3, 32
resblock 2.1 ... resblock 2.3, 3x3, 64
resblock 3.1 ... resblock 3.3, 3x3, 128
Global pooling layer
Output: mitosis / normal (figure labels: 16, 8)
Based on Residual Network (ResNet).
9 residual blocks = 21-layer architecture.
2-step training procedure.
Cropped global pooling layer.
Training step / inference step: [figures]
APPLICATION #1: BREAST CANCER
TUMOR PROLIFERATION SCORE PREDICTION: Phase 3: Score prediction
(1) The number of mitoses, (2) the number of cells → Feature vector based on statistical information → Support Vector Machine → Proliferation score
Converting each WSI to a 21-dim feature vector.
10-fold cross validation on the 500 training samples.
Feature selection based on cross-validation results.
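The 10-fold split over the 500 training slides can be sketched as follows. This is a minimal illustration, not the authors' protocol: the shuffling, the seed, and the function name `kfold_indices` are assumptions, and the slides do not say whether folds were stratified by score. Training the SVM on each split is omitted.

```python
import random


def kfold_indices(num_samples, num_folds=10, seed=0):
    """Split sample indices into (train, validation) pairs, one per fold."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # assumed: shuffle before splitting
    folds = [indices[i::num_folds] for i in range(num_folds)]
    splits = []
    for i, validation in enumerate(folds):
        # Training indices are all folds except the held-out one.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, validation))
    return splits


splits = kfold_indices(500)
```

Each candidate feature subset would be scored on the validation folds, and the subset with the best average result kept.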
APPLICATION #1: BREAST CANCER
TUMOR PROLIFERATION SCORE PREDICTION: Results
TUPAC16: Tumor Proliferation Assessment Challenge 2016 (MICCAI Grand Challenge)
APPLICATION #2: PROSTATE CANCER
WHAT IS GLEASON SCORE?
Prostate biopsy specimen
Core #1: 5+5
Core #2: 0
Core #3: 3+4
Core #4: 0
Figure: Gleason pattern illustration, Grade 1 to Grade 5
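A Gleason score such as 3+4 is the sum of the most prevalent grade and the second most prevalent grade in a core; when only one grade is present it is counted twice (5+5). The sketch below illustrates that convention only. The function name `gleason_score` and the area-based input are hypothetical, cancer-free cores (marked 0 on the slide) are returned as `None`, and clinical reporting rules cover more cases than shown here.

```python
def gleason_score(area_by_grade):
    """Return (primary, secondary, total) for one core, or None if no cancer.

    area_by_grade: dict mapping a Gleason grade (3, 4 or 5) to the tissue
    area assigned to that grade. Simplified convention, for illustration.
    """
    present = {grade: area for grade, area in area_by_grade.items() if area > 0}
    if not present:
        return None  # no cancer found in this core
    # Rank by area; ties are broken toward the higher grade (an assumption).
    ranked = sorted(present, key=lambda g: (present[g], g), reverse=True)
    primary = ranked[0]
    secondary = ranked[1] if len(ranked) > 1 else primary
    return primary, secondary, primary + secondary


core_1 = gleason_score({5: 12.0})
core_3 = gleason_score({3: 7.0, 4: 3.0})
```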
APPLICATION #2: PROSTATE CANCER
GLEASON SCORE PREDICTION: Data statistics
Dataset from medical centers
The number of patients: 385
The number of slides: 1152
The number of cores: 4907
The number of normal cores: 2872
The number of cancer cores: 2035
Training dataset: 900 slides, { Grade, Contours }
Test dataset: 50 slides, { Grade, Contours }
APPLICATION #2: PROSTATE CANCER
GLEASON SCORE PREDICTION: System overview
Patch-based classification
Gleason score classification network → Normal / Grade 3 / Grade 4 / Grade 5
Ranking loss with thermometer code: Normal 1000, Grade 3 1100, Grade 4 1110, Grade 5 1111
Memory network-based refinement (25 neighbors)
Embedding → Embedded memory vector ... and Query vector → Memory network → Refined output
APPLICATION #2: PROSTATE CANCER
GLEASON SCORE PREDICTION: Patch-based classification
Classes: Normal, Grade 3, Grade 4, Grade 5
Baseline settings (~75%)
ResNet 101 architecture.
512x512 patch with 75% overlap.
Softmax loss with 4 class classification.
Key features for improving performance
Normal patches from only fully normal slides. → +~5% gain
Ranking loss with thermometer code. → +2~3% gain
Not a classification problem! Ordering problem!
Thermometer code: Normal 1000, Grade 3 1100, Grade 4 1110, Grade 5 1111
Network decodes from the left-most bit to the right-most bit.
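The thermometer code and its left-to-right decoding can be sketched as below. The codes and class order come from the slide; the 0.5 threshold, the function names, and the fallback to Normal when the first bit is low are assumptions. The ranking loss used for training is not shown.

```python
CLASSES = ["Normal", "Grade 3", "Grade 4", "Grade 5"]


def encode(label):
    """Thermometer code of a class: Normal -> 1000, ..., Grade 5 -> 1111."""
    level = CLASSES.index(label)
    return [1 if i <= level else 0 for i in range(len(CLASSES))]


def decode(bits, threshold=0.5):
    """Read bits from the left and stop at the first one below threshold."""
    level = -1
    for bit in bits:
        if bit < threshold:
            break
        level += 1
    # Assumed fallback: if even the first bit is low, report Normal.
    return CLASSES[max(level, 0)]


codes = {label: encode(label) for label in CLASSES}
predicted = decode([0.9, 0.8, 0.3, 0.7])
```

Because decoding stops at the first low bit, a stray high value further right (0.7 above) cannot push the prediction to a higher grade, which reflects the ordering between classes.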
APPLICATION #2: PROSTATE CANCER
GLEASON SCORE PREDICTION: Memory network-based refinement (+ ~5% gain)
Patch-level outputs (25 neighbors) → Memory vector (25 x 4-dim); Query vector (1 x 4-dim)
Embedding → 25 x 1024 (memory), 1 x 1024 (query)
Inner product → Softmax → Attention vector (25 x 1) → Weighting
1D-CNN → Refined output
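The attention step (embedding, inner product, softmax, weighting) can be sketched with the dimensions from the slide. This is a rough illustration under stated assumptions: the embedding is a random linear map standing in for the learned one, the 1D-CNN and the final refined output are omitted, and all function names are hypothetical.

```python
import math
import random

EMBED_DIM = 1024    # embedding width shown on the slide
NUM_CLASSES = 4     # Normal, Grade 3, Grade 4, Grade 5
NUM_NEIGHBORS = 25  # neighboring patches stored in memory


def make_embedding(rng, in_dim=NUM_CLASSES, out_dim=EMBED_DIM):
    """Hypothetical random linear embedding (stand-in for the learned one)."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def embed(vec, weights):
    """Project a 4-dim patch output into the embedding space."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(memory, query, weights):
    """Return the attention over neighbors and the attention-weighted memory."""
    mem_emb = [embed(m, weights) for m in memory]  # 25 x 1024
    q_emb = embed(query, weights)                  # 1 x 1024
    scores = [sum(a * b for a, b in zip(m, q_emb)) for m in mem_emb]
    attention = softmax(scores)                    # 25 x 1
    context = [sum(a * m[j] for a, m in zip(attention, mem_emb))
               for j in range(len(q_emb))]
    return attention, context


rng = random.Random(0)
weights = make_embedding(rng)
memory = [[rng.random() for _ in range(NUM_CLASSES)] for _ in range(NUM_NEIGHBORS)]
query = [rng.random() for _ in range(NUM_CLASSES)]
attention, context = attend(memory, query, weights)
```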
APPLICATION #2: PROSTATE CANCER
GLEASON SCORE PREDICTION: Results
Patch-level performance
Baseline: 75%
+ Data cleansing: 80%
+ Ranking loss: 82.8%
+ Memnet refinement: 87.5%
Core-level performance
Normal or cancer core? AUC: 97.8%
Gleason score prediction? Only 1st major: 83%. Both: 76.7%
DEMO #1: BREAST CANCER
DEMO #2: PROSTATE CANCER
CONCLUSION
Lessons learned: artificial intelligence for digital pathology
Challenge #1. How to handle gigapixel images (i.e., whole slide images)?
- Consider how to sample patches (patch size, sampling step, ...), together with pathologists.
- Consider how to construct the whole pipeline from gigapixel images to diagnosis.
Challenge #2. How to handle quality variation between slides?
- Design image processing modules carefully.
- Do cross-validation to avoid overfitting.
Challenge #3. How to handle ambiguous ground truth?
- Design a task-specific loss.
- Sanitize the training dataset as much as you can.
- Don't be satisfied with patch-based results.
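As an illustration of how patch size and sampling step interact, the sketch below derives the step from an overlap ratio and lists patch origins on a regular grid. The values 512 and 75% are the baseline settings from the prostate application. The function names are hypothetical, and dropping border patches that do not fit is an assumption.

```python
def stride_from_overlap(patch_size, overlap):
    """Sampling step implied by an overlap ratio between adjacent patches."""
    return max(1, round(patch_size * (1.0 - overlap)))


def patch_origins(width, height, patch_size=512, overlap=0.75):
    """Top-left (x, y) corners of all patches that fit inside the image."""
    step = stride_from_overlap(patch_size, overlap)
    # Assumed: patches that would cross the right or bottom border are dropped.
    return [(x, y)
            for y in range(0, height - patch_size + 1, step)
            for x in range(0, width - patch_size + 1, step)]


step = stride_from_overlap(512, 0.75)
origins = patch_origins(1024, 512)
```

A 75% overlap means each 512-pixel patch advances by only 128 pixels, so the number of patches per slide grows roughly 16-fold compared with non-overlapping tiling.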
THANK YOU TEAM MEMBERS