Model-free machine learning methods for personalized breast cancer risk prediction -SWISS PROMPT

Similar documents
Breast Cancer Risk Assessment and Prevention

Package BCRA. April 25, 2018

BREAST CANCER EPIDEMIOLOGY MODEL:

Modelling cancer risk predictions:clinical practice perspective.

Hereditary breast cancer who to refer to a cancer genetics clinic and how to counsel patients with

Breast density: imaging, risks and recommendations

It s hard to predict!

Insights and Updates in Breast Cancer. No Disclosures. Learning Objectives. Mountain States Cancer Conference 2017 Regina Jeanise Brown MD

White Paper Estimating Complex Phenotype Prevalence Using Predictive Models

Risk modeling for Breast-Specific outcomes, CVD risk, and overall mortality in Alliance Clinical Trials of Breast Cancer

Development, validation and application of risk prediction models

HBOC Syndrome A review of BRCA 1/2 testing, Cancer Risk Assessment, Counseling and Beyond.

Roadmap for Developing and Validating Therapeutically Relevant Genomic Classifiers. Richard Simon, J Clin Oncol 23:

Twenty-five Years of Breast Cancer Risk Models and Their Applications

Investigating the performance of a CAD x scheme for mammography in specific BIRADS categories

Data Mining Techniques to Predict Survival of Metastatic Breast Cancer Patients

An Improved Algorithm To Predict Recurrence Of Breast Cancer

Statistics 202: Data Mining. c Jonathan Taylor. Final review Based in part on slides from textbook, slides of Susan Holmes.

Improved Hepatic Fibrosis Grading Using Point Shear Wave Elastography and Machine Learning

Breast Cancer Risk Assessment among Bahraini Women. Majida Fikree, MD, MSc* Randah R Hamadeh, BSc, MSc, D Phil (Oxon)**

ARTICLES. Validation of the Gail et al. Model of Breast Cancer Risk Prediction and Implications for Chemoprevention

BRCA Precertification Information Request Form

Melissa Hartman, DO Women s Health Orlando VA Medical Center

EECS 433 Statistical Pattern Recognition

The best way of detection of and screening for breast cancer in women with genetic or hereditary risk

Cancer Treatment Centers of America: Supercharge Your Knowledge: A Focus on Breast, Cervical and Prostate Screening Guidelines and Controversies

Survival Prediction Models for Estimating the Benefit of Post-Operative Radiation Therapy for Gallbladder Cancer and Lung Cancer

Generalized additive model for disease risk prediction

Mammogram Analysis: Tumor Classification

Computer Models for Medical Diagnosis and Prognostication

Patient Safety in the Age of Big Data

AN INFORMATION VISUALIZATION APPROACH TO CLASSIFICATION AND ASSESSMENT OF DIABETES RISK IN PRIMARY CARE

Development, Validation, and Application of Risk Prediction Models - Course Syllabus

Estimating Likelihood of Having a BRCA Gene Mutation Based on Family History of Cancers and Recommending Optimized Cancer Preventive Actions

Predicting Breast Cancer Survival Using Treatment and Patient Factors

Augmented Medical Decisions

BIOINFORMATICS ORIGINAL PAPER

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016

Management of BRCA Positive Breast Cancer. Archana Ganaraj, MD February 17, 2018 UPDATE ON WOMEN S HEALTH

WHAT DO GENES HAVE TO DO WITH IT? Breast Cancer Risk Assessment and Risk Reduction in 2016

Selection and Combination of Markers for Prediction

Prophylactic Mastectomy

PREDICTION OF BREAST CANCER USING STACKING ENSEMBLE APPROACH

Smarter Big Data for a Healthy Pennsylvania: Changing the Paradigm of Healthcare

Breast Cancer Screening

Risk Assessment, Genetics, and Prevention

Current issues and controversies in breast imaging. Kate Brown, South GP CME 2015

Supporting Information Identification of Amino Acids with Sensitive Nanoporous MoS 2 : Towards Machine Learning-Based Prediction

Overdiagnosis in. Outline

Downloaded from irje.tums.ac.ir at 20:01 IRDT on Friday March 22nd 2019

Introduction to Cost-Effectiveness Analysis

Breast Cancer Risk and Prevention

Data mining for Obstructive Sleep Apnea Detection. 18 October 2017 Konstantinos Nikolaidis

Talking about Cancer in Your Family Can Keep You and Your Family Healthy. Do you know your Kin Facts? A guide for and your family

Assigning B cell Maturity in Pediatric Leukemia Gabi Fragiadakis 1, Jamie Irvine 2 1 Microbiology and Immunology, 2 Computer Science

Shared Decision Making in Breast and Prostate Cancer Screening. An Update and a Patient-Centered Approach. Sharon K. Hull, MD, MPH July, 2017

Mammogram Analysis: Tumor Classification

Foundational funding sources allow BCCHP to screen and diagnose women outside of the CDC guidelines under specific circumstances in Washington State.

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

Keywords Artificial Neural Networks (ANN), Echocardiogram, BPNN, RBFNN, Classification, survival Analysis.

Big Data and Machine Learning in RCTs An overview

Does Cancer Run in Your Family?

Response to Mease and Wyner, Evidence Contrary to the Statistical View of Boosting, JMLR 9:1 26, 2008

Learning Objectives. ! At the end of the presentation,students will be able to:

Feature selection methods for early predictive biomarker discovery using untargeted metabolomic data

Analysis of Classification Algorithms towards Breast Tissue Data Set

Predicting Malignancy from Mammography Findings and Image Guided Core Biopsies

Different Roles, Same Goals: Preventing Sexual Abuse 2016 ATSA Conference Friday November 4 1:30 PM - 3:00 PM F-19

Chemo-endocrine prevention of breast cancer

MACHINE LEARNING BASED APPROACHES FOR PREDICTION OF PARKINSON S DISEASE

Effect of Feedforward Back Propagation Neural Network for Breast Tumor Classification

CHAPTER 2 MAMMOGRAMS AND COMPUTER AIDED DETECTION

Classification of benign and malignant masses in breast mammograms

Evidence Contrary to the Statistical View of Boosting: A Rejoinder to Responses

Panel: Machine Learning in Surgery and Cancer

Breast Cancer Risk Assessment: Evaluation of Screening Tools for Genetics Referral

Colon cancer survival prediction using ensemble data mining on SEER data

Untangling the Confusion: Multiple Breast Cancer Screening Guidelines and the Ones We Should Follow

MANAGEMENT OF DENSE BREASTS. Nichole K Ingalls, MD, MPH NW Surgical Specialists September 25, 2015

Overdiagnosis of Breast Cancer: Myths and Facts

Enhanced Surveillance vs. Risk Reducing Mastectomy for BRCA+ Patients with Advanced Ovarian Cancer: Surgical Futility or Valuable Prevention?

Gene Selection for Tumor Classification Using Microarray Gene Expression Data

breast cancer; relative risk; risk factor; standard deviation; strength of association

Breast Cancer Screening Clinical Practice Guideline. Kaiser Permanente National Breast Cancer Screening Guideline Development Team

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp

Ge elastography cpt codes

Module Overview. What is a Marker? Part 1 Overview

Breast Cancer Screening and Diagnosis

Increased Risk of Breast Cancer: Screening and Prevention. Elizabeth Pritchard, MD 4/5/2017

FEP Medical Policy Manual

Akosa, Josephine Kelly, Shannon SAS Analytics Day

Hybridized KNN and SVM for gene expression data classification

iprevent Breast Cancer Risk Assessment and Risk Management Decision Support Tool Kelly Anne Phillips MBBS MD FRACP

Machine-Learning on Prediction of Inherited Genomic Susceptibility for 20 Major Cancers

pearls to take back in the primary care setting of things you don't want to miss in terms of thinking

Prediction of heart disease using k-nearest neighbor and particle swarm optimization.

Accurate Diabetes Risk Stratification Using Machine Learning: Role of Missing Value and Outliers

International Journal of Computer Science Trends and Technology (IJCST) Volume 5 Issue 1, Jan Feb 2017

Anirudh Kamath 1, Raj Ramnani 2, Jay Shenoy 3, Aditya Singh 4, and Ayush Vyas 5 arxiv: v2 [stat.ml] 15 Aug 2017.

Transcription:

Model-free machine learning methods for personalized breast cancer risk prediction -SWISS PROMPT Chang Ming, 22.11.2017 University of Basel Swiss Public Health Conference 2017

Breast Cancer & personalized cancer prevention In Switzerland each year, about 5 250 women develop breast cancer and 1 350 die from it. 1 Prediction model in personalized cancer prevention: ability to forecast breast cancer risk or presence before clinical symptoms appear opportunity to act on the breast cancer through early intervention. guide surveillance and preventive treatment (such as increased frequency of mammography, prophylactic surgery, chemoprevention and medication) 2

What is a good prediction model? Calibration Does the model correctly predict the number of people will develop breast cancer? Discriminatory accuracy Does the model correctly predict exactly who will develop breast cancer? The Area Under an ROC Curve The area measures discrimination, that is, the ability of the test to correctly classify those with and without the disease.90-1 = excellent (A).80-.90 = good (B).70-.80 = fair (C).60-.70 = poor (D).50-.60 = fail (F). Chang Ming, 22.11.17 University of Basel 3

Population level or Personalized level? 3 Chang Ming, 22.11.17 University of Basel 4

Gail model Based on case-control data from 284,780 women. Risk factors included : Age Reproductive history Family history Personal history (Biopsies) Validated using data from NCI's Surveillance, Epidemiology, and End Results (SEER). Caucasian women and African American Asian and Pacific Islander Guideline based on Gail model From The American Society of Clinical Oncology (ASCO) five-year risk 1.67%: Clinical breast examination at least once per year, annual mammogram, consider high-risk counseling or risk reducing medication.(e.g. tamoxifen) 5 Universität Basel Chang Ming, 22.11.17

BOADICEA model Based on 2785 UK families Included Family pedigree and cancer history, Mutation BRCA 1&2, Ethnicity, and several Biomarkers Validated in a large series of families from UK genetics clinics In UK and several European countries, it is recommended as a risk assessment tool in clinical guideline In Newest UK guideline, Lifetime risk > 30% Screening starting at 30-35 yrs Consider annual MRI starting at 30 yrs Clinical breast exam (annual) Preventive treatment: Consider chemoprevention and preventive mastectomy 6

Methodology Machine learning -learn from experience Three characters: Learning Machine learning algorithms use computational methods to learn information directly from data without relying on a predetermined equation as a model. Learning more The algorithms adaptively improve their performance as the number of samples available for learning increases. Generate insight for prediction 7 University of Basel Chang Ming, 22.11.2017

Leaning techniques/algorithms --How should the machine search for pattern in data --Depends on whether known responses(disease presence) is part of the learning >>>Supervised Classification techniques predict discrete responses Binary vs. Multiclass Classification Logistic Regression k Nearest Neighbor (knn) Neural Network Bagged and Boosted Decision Trees Regression techniques predict continuous responses Generalized Linear Model Gaussian Process Regression Model 8

Study Setting and Materials ML v.s. Gail a random population-based US breast cancer patients and their cancer-free female relatives (N=1232) CDC ML v.s. BOADICEA a clinic-based sample Swiss breast cancer patients and cancer-free women seeking genetic evaluation and/or testing Geneva University Hospitals (N=1967 Families and 112,482 individual) collected since 1998 9

Results1 Always same input data for Gail v.s. ML 1. simulated, with no signal; N=800 Gail ML-ada acc 0.345 0.384 1. simulated, with artificial signal; N=800 Gail ML-ada Adapt boosting (ada) Linear discriminant (lda) Random forest (rf) Linear Model (lm) Logistic Regression (Logistic) k-nearest Neighbors algorithm (k-nn) Quadratic Discriminant (qda) acc 0.711 0.958 2. Real data N=1232 Gail ML-lda ML-rf MLada MLlogistic MLknn MLqda ML-lm acc 0.658 0.897 0.828 0.734 0.855 0.783 0.782 0.334 10

Results2 Always same input data for BOADICEA v.s. ML 1. simulated, with no signal; N=800 BOADICEA ML-Logistic ML-ada acc 0.279 0.301 0.234 1. simulated, with artificial signal; N=800 BOADICEA ML-Logistic ML-ada Adapt boosting (ada) Linear discriminant (lda) Random forest (rf) Linear Model (lm) Logistic Regression (Logistic) k-nearest Neighbors algorithm (k-nn) Quadratic Discriminant (qda) acc 0.699 0.953 0.939 2. Real data N=112,482 ML- BOADICEA ML-rf Logistic ML-lda MLada MLknn MLqda ML-lm acc 0.671 0.924 0.894 0.881 0.625 0.858 0.812 0.534 11

Conclusion and Next steps Advantages of ML: Big improvement in predictive discriminatory accuracy Not limited by various epidemiology assumptions Model-free: Free to add any risk factors, e.g Mammographic density The bigger the data, the better the prediction Easy adaption in application Limitations: If not having enough data SWISS PROMPT: first project internationally to apply machine-learning methods in individual breast cancer risk prediction and compares its predictive accuracy with existing models; first risk prediction model which is developed using primarily data from Swiss populations; will incorporate additional risk factors than existing models. E.g. Modifiable and non-modifiable risk factors 12

Reference 1. BOUCHARDY MAGNIN, Christine, LOREZ, Matthias, ARNDT, Volker. Effects of age and stage on breast cancer survival in Switzerland. In: Bulletin suisse du cancer 2. Gagnon, J. et al. Recommendations on Breast Cancer Screening and Prevention in the Context of Implementing Risk Stratification: Impending Changes to Current Policies. Current Oncology 23.6 (2016): e615 e625. PMC. Web. 20 Nov. 2017. 3. Usher-Smith, Juliet et al. Risk Prediction Tools for Cancer in Primary Care. British Journal of Cancer 113.12 (2015): 1645 1650. PMC. Web. 20 Nov. 2017. 4. Gail MH, Brinton LA, Byar DP, Corle DK, Green SB, Shairer C, Mulvihill JJ: Projecting individualized probabilities of developing breast cancer for white females who are being examined annually. J Natl Cancer Inst 81(24):1879-86, 1989. 5. W. L. Chao, J. J. Ding, Integrated Machine Learning Algorithms for Human Age Estimation, NTU, 2011. 6. Easton et al., Nature 2007; 447: 1087-1095 7. Cox et al., Nature Genetics 2007; 39: 352-358 8. Stacey et al., Nature Genetics 2007; 39: 865-869 9. Friedman, Hastie, and Tibshirani, Additive logistic regression: a statistical view of boosting, Annals of Statistics, 2000 10. Christopher Bishop, Neural Networks for Pattern Recognition, Oxford University Press, 1995 13

Thank you for your attention.