Roadmap for Developing and Validating Therapeutically Relevant Genomic Classifiers. Richard Simon, J Clin Oncol 23:

Roadmap for Developing and Validating Therapeutically Relevant Genomic Classifiers. Richard Simon, J Clin Oncol 23:7332-7341 Presented by Deming Mi 7/25/2006

Major reasons why few prognostic factors are used in clinical practice
- Lack of impact on, or guidance for, clinical practice. Prognostic factors should be therapeutically relevant: for example, a classifier that predicts which patients are unlikely to respond to a certain therapy may not be widely used if there is no good alternative treatment.
- Lack of well-defined patient populations with respect to disease status, therapy, etc.
- Lack of independent validation of prognostic markers
- Lack of detailed analysis planning and of a clear definition of participants and end points
- Lack of reproducibility studies

Key Steps in Development and Validation of Therapeutically Relevant Genomic Classifiers
- Step 1: Develop a classifier for addressing a specific, important therapeutic decision
  - Patients are sufficiently homogeneous and receive uniform treatment, so that results are therapeutically relevant
  - Treatment options and costs of misclassification are such that a classifier is likely to be used
- Step 2: Perform internal validation of the classifier to assess whether it appears sufficiently accurate, relative to standard prognostic factors, that it is worth further development
- Step 3: Translate the classifier to the platform that would be used for broad clinical application
- Step 4: Demonstrate that the classifier is reproducible
- Step 5: Independently validate the completely specified classifier in a prospectively planned study

Some issues about multigene classifiers
- A multigene expression signature classifier is a function that classifies a tumor based on the expression levels of its component genes.
- Genes regulating the same pathway are correlated.
- A simultaneous change in genes that regulate the same signaling pathway may be biologically significant without being statistically significant.
- Stringent statistical criteria for identifying differentially expressed genes reduce power and may miss such co-regulated genes.
- External validation of a classifier validates its usefulness in clinical practice, not the way the classifier was developed.

Step 1: Develop classifier for addressing a specific important therapeutic decision

Steps in classifier development
- Determine which features should be included as components, e.g. select the top 10 most differentially expressed genes that best discriminate between two classes of subjects
- Decide what mathematical form to use for combining the values of the different features, e.g. $l(x) = \sum_{i \in F} w_i x_i$, where $F$ is the set of selected features, $x_i$ is the value of feature $i$ and $w_i$ is its weight
- Decide what cutoff values to use for converting the continuous function into a discrete classification
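The three choices above (which features, how to combine them, where to cut) can be sketched in a few lines of Python. The gene names, weights and cutoff below are made-up illustrative values, not taken from any published classifier.

```python
def linear_score(expression, weights):
    """l(x) = sum over the selected features i in F of w_i * x_i."""
    return sum(w * expression[gene] for gene, w in weights.items())


def classify(expression, weights, cutoff):
    """Convert the continuous score into a discrete class label."""
    return "class 1" if linear_score(expression, weights) > cutoff else "class 2"


# Hypothetical feature set F (the dictionary keys), weights w_i and cutoff.
weights = {"geneA": 2.0, "geneB": -1.5, "geneC": 0.5}
cutoff = 1.0

# geneD is measured but is not in F, so it does not contribute to the score.
sample = {"geneA": 1.2, "geneB": 0.4, "geneC": 2.0, "geneD": 9.9}
print(linear_score(sample, weights), classify(sample, weights, cutoff))
```

Keeping the feature set, the weights and the cutoff as explicit inputs makes it clear that all three are part of the classifier and all three have to be fixed before validation.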

What kinds of classifiers are useful?
- Fisher linear discriminant analysis (FLDA)
- Maximum likelihood discriminant rules (DQDA, DLDA)
- Weighted voting method of Golub (Golub)
- Nearest neighbor classifiers (NN)
- Classification and regression trees (CART)
- CART with perturbed learning sets generated by bagging (Bag CART)
- CART with perturbed learning sets generated by sampling from MVN (MVN CART)
- CART with perturbed learning sets generated by convex pseudo-data (CPD CART)
- CART with perturbed learning sets generated by boosting (Boost CART)
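As an illustration of one of the simpler rules in this list, here is a minimal sketch of diagonal linear discriminant analysis (DLDA). It assumes equal class priors and a per-gene variance pooled across classes; the toy data and the function names `fit_dlda` and `predict_dlda` are illustrative only and do not come from the slides.

```python
def fit_dlda(X, y):
    """Estimate per-class gene means and a pooled per-gene variance.

    X is a list of samples (each a list of gene values), y the class labels.
    """
    classes = sorted(set(y))
    n, p = len(X), len(X[0])
    means = {}
    for k in classes:
        rows = [x for x, c in zip(X, y) if c == k]
        means[k] = [sum(r[j] for r in rows) / len(rows) for j in range(p)]
    # Diagonal covariance shared by all classes: one pooled variance per gene.
    var = [
        sum((x[j] - means[c][j]) ** 2 for x, c in zip(X, y)) / (n - len(classes))
        for j in range(p)
    ]
    return means, var


def predict_dlda(model, x):
    """Assign x to the class whose mean is closest in variance-scaled distance."""
    means, var = model

    def dist(k):
        return sum((x[j] - means[k][j]) ** 2 / var[j] for j in range(len(x)))

    return min(means, key=dist)


# Toy data: two genes, three samples per class (values are invented).
X = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [10.0, 10.0], [11.0, 11.0], [10.0, 11.0]]
y = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
model = fit_dlda(X, y)
print(predict_dlda(model, [0.5, 0.5]), predict_dlda(model, [10.5, 10.0]))
```

Because only means and per-gene variances are estimated, the rule has far fewer parameters than a full covariance model, which matters when there are many more genes than cases.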

What did Dudoit et al do?
- Leukemia dataset described by Golub et al (1999): 47 cases of ALL and 25 cases of AML
- p = 40 selected genes (the genes with the largest BSS/WSS)
- 2:1 ratio of training set to test set
- Repeat the entire procedure N = 150 times
- Compute the test-set error rate for each classifier
- Compute the observation-wise error rate
Dudoit S. J Am Stat Assoc 97:77-87, 2002
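The BSS/WSS criterion mentioned above is the ratio of the between-class to the within-class sum of squares for a single gene. A minimal sketch follows; the function names and the toy expression values are assumptions made for illustration, and a gene that is constant within every class would need separate handling because its WSS is zero.

```python
def bss_wss(values, labels):
    """Ratio of between-class to within-class sum of squares for one gene."""
    overall = sum(values) / len(values)
    bss = wss = 0.0
    for k in set(labels):
        group = [v for v, c in zip(values, labels) if c == k]
        m = sum(group) / len(group)
        bss += len(group) * (m - overall) ** 2
        wss += sum((v - m) ** 2 for v in group)
    return bss / wss


def top_genes(X, labels, p):
    """Indices of the p genes with the largest BSS/WSS; X is samples x genes."""
    n_genes = len(X[0])
    ratio = [bss_wss([row[g] for row in X], labels) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: ratio[g], reverse=True)[:p]


# Toy data: gene 0 separates the classes well, gene 1 does not.
labels = ["A", "A", "A", "B", "B", "B"]
X = [[1, 1], [2, 9], [3, 2], [7, 8], [8, 3], [9, 7]]
print(bss_wss([row[0] for row in X], labels), top_genes(X, labels, 1))
```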

Dudoit S. J Am Stat Assoc 97:77-87, 2002
- Nearest Neighbor (NN) and Diagonal Linear Discriminant Analysis (DLDA) classifiers have the smallest mean error rates
- Fisher's Linear Discriminant Analysis (FLDA) has the highest error rate

Dudoit S. J Am Stat Assoc 97:77-87, 2002
- NN and DLDA perform as well as or better than more complex classifiers
- FLDA performs poorly

- Because the number of cases is small relative to the number of genes, the more complex nonlinear classifiers cannot outperform the simplest linear classifiers
- In a small data set (which is probably the situation in most studies), there is not enough information to make effective use of complex classifiers

Step 2: Perform internal validation of classifier to assess whether it appears sufficiently accurate relative to standard prognostic factors that it is worth further development

Purpose of internal validation: provide a preliminary estimate of the predictive power of the classifier
Danger of using the error rate or predictive accuracy computed on the training set of the development study:
- Overfitting almost always occurs
- It is always possible to find classifiers with high predictive accuracy on the training set even when there is no relationship between the expression of any of the genes and outcome
- Such classifiers fit random variation or noise in the original data that does not represent true relationships holding for independent data

Cross Validation
- Holdout method, or split-sample validation
  - Withholding a substantial proportion of the sample from the training set may considerably reduce the performance of the prediction rule
  - Not an efficient way to use the available data
- K-fold cross validation
- Leave-one-out cross validation (K-fold cross validation with K = N)
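To make the relationship between K-fold and leave-one-out cross validation concrete, the sketch below generates the train/test index splits for both. The function name `k_fold_splits` and the fixed seed are illustrative choices, not part of the slides.

```python
import random


def k_fold_splits(n, k, seed=0):
    """Return k (train_indices, test_indices) pairs that partition range(n)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [sorted(idx[i::k]) for i in range(k)]
    return [(sorted(set(idx) - set(fold)), fold) for fold in folds]


five_fold = k_fold_splits(20, 5)       # each test fold holds 4 of the 20 samples
leave_one_out = k_fold_splits(20, 20)  # K = N: each test fold holds one sample

for train, test in five_fold:
    print(len(train), test)
```

Every sample is used for testing exactly once and for training K - 1 times, which is why cross validation uses the data more efficiently than a single holdout split.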

What does cross validation validate?
- Three steps in class prediction
  - Selection of informative genes
  - Computation of weights for the selected informative genes
  - Creation of a prediction rule
- It is important that all three steps undergo the cross-validation procedure
- The cross-validated prediction error is an estimate of the prediction error associated with applying the whole model-building algorithm
- Failure to cross-validate all steps may lead to substantial bias in the estimated error rate

What did Simon et al do?
- Each simulated data set is composed of 20 gene expression profiles, each profile consisting of 6000 gene expression measurements
- For each of the 20 profiles, the 6000 gene expression measurements are independent and identically distributed draws from the standard normal distribution (mu = 0, var = 1)
- Randomly assign 10 profiles to class 1 and the other 10 to class 2
- Generate 2000 such simulated data sets
Simon R, J Natl Cancer Inst 95:14-18, 2003

Steps in class prediction study
- Gene selection (10 most differentially expressed genes based on two-sample t-test)
- Computation of a weight for each gene (univariate two-sample t-statistic)
- Computation of prediction rule (classification threshold)
$l(x) = \sum_{i \in F} w_i x_i$

Resubstitution (no cross-validation)
All three steps are carried out on the full data set, and the error rate is computed on the same samples:
- Gene selection (10 most differentially expressed genes based on two-sample t-test)
- Computation of a weight for each gene (univariate two-sample t-statistic)
- Computation of prediction rule (classification threshold)

Cross-validation after gene selection
Genes are selected once on the full data set; only the last two steps are repeated inside the cross-validation loop:
- Gene selection (10 most differentially expressed genes based on two-sample t-test)
- Computation of a weight for each gene (univariate two-sample t-statistic)
- Computation of prediction rule (classification threshold)

Cross-validation prior to gene selection
All three steps are repeated inside the cross-validation loop, using only the training samples of each fold:
- Gene selection (10 most differentially expressed genes based on two-sample t-test)
- Computation of a weight for each gene (univariate two-sample t-statistic)
- Computation of prediction rule (classification threshold)
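The three ways of estimating the error rate can be compared on pure-noise data with a small simulation. This is a sketch under stated assumptions, not the authors' code: the classification threshold is taken as the midpoint between the two class mean scores, leave-one-out is used as the cross-validation scheme, and the demo run uses 500 genes rather than 6000 to keep the pure-Python run short. All function names are illustrative.

```python
import random


def t_stat(a, b):
    """Pooled-variance two-sample t-statistic."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    return (ma - mb) / ((ss / (na + nb - 2)) * (1 / na + 1 / nb)) ** 0.5


def gene_t(X, y, g):
    a = [x[g] for x, c in zip(X, y) if c == 1]
    b = [x[g] for x, c in zip(X, y) if c == 2]
    return t_stat(a, b)


def select_genes(X, y, k):
    """Indices of the k genes with the largest absolute t-statistic."""
    return sorted(range(len(X[0])), key=lambda g: -abs(gene_t(X, y, g)))[:k]


def fit(X, y, genes):
    """Weights = t-statistics; threshold = midpoint of the class mean scores."""
    w = {g: gene_t(X, y, g) for g in genes}

    def score(x):
        return sum(w[g] * x[g] for g in genes)

    s1 = [score(x) for x, c in zip(X, y) if c == 1]
    s2 = [score(x) for x, c in zip(X, y) if c == 2]
    cut = (sum(s1) / len(s1) + sum(s2) / len(s2)) / 2
    return lambda x: 1 if score(x) > cut else 2


def error_rates(X, y, k=10):
    """Return (resubstitution, CV after selection, CV before selection) error."""
    n = len(X)
    genes_all = select_genes(X, y, k)          # selection on the full data
    clf = fit(X, y, genes_all)
    resub = sum(clf(x) != c for x, c in zip(X, y)) / n
    partial = full = 0
    for i in range(n):                          # leave-one-out loop
        Xtr, ytr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        partial += fit(Xtr, ytr, genes_all)(X[i]) != y[i]
        full += fit(Xtr, ytr, select_genes(Xtr, ytr, k))(X[i]) != y[i]
    return resub, partial / n, full / n


def simulate(n_samples=20, n_genes=6000, seed=0):
    """Pure-noise data: i.i.d. N(0, 1) values, classes assigned at random."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]
    y = [1] * (n_samples // 2) + [2] * (n_samples - n_samples // 2)
    rng.shuffle(y)
    return X, y


X, y = simulate(n_genes=500, seed=1)
print(error_rates(X, y))
```

Since the data contain no real class signal, an honest estimate should be near 0.5; averaging the three estimates over many simulated data sets is what reveals how far each one falls below that.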

Fig 1. The effect of various levels of cross validation on the estimated error rate of a predictor Simon, R. J Clin Oncol; 23:7332-7341 2005 Copyright American Society of Clinical Oncology

The improperly cross-validated methods seriously underestimate the error rate, probably largely because the predictor is overfitted to the specific data set

How to report the error rate
- Report a properly cross-validated error rate
- Report the error rate derived from a sufficiently large independent validation set
- Report the error rate from validation on all the classes for which the classifier was developed
- Report a confidence interval or the statistical significance of the error rate
  - Especially when the test set is small, the point estimate of the error rate may be small while the confidence interval is so wide that it covers 0.5
- Consider unclassified specimens
  - Unclassified specimens are those that could not confidently be assigned to any of the examined classes by the classifier
  - Simply ignoring the unclassified specimens overstates the performance of the classifier
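The point about interval width can be checked numerically. The sketch below uses the Wilson score interval for a binomial proportion; that choice of interval, the function name and the example counts are assumptions for illustration, since the slides do not say which interval to use.

```python
from statistics import NormalDist


def wilson_interval(errors, n, level=0.95):
    """Wilson score confidence interval for an error rate of errors / n."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = errors / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * (p * (1 - p) / n + z * z / (4 * n * n)) ** 0.5 / denom
    return center - half, center + half


# Same observed error rate (30%), very different amounts of evidence.
small = wilson_interval(3, 10)     # hypothetical test set of 10 specimens
large = wilson_interval(30, 100)   # hypothetical test set of 100 specimens
print(small, large)
```

With 10 specimens the interval extends past 0.5, so the result is compatible with a classifier that does no better than chance on two balanced classes; with 100 specimens it does not.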

Does your classifier perform better than standard prognostic factors?
- The new classifier needs to outperform the existing prognostic methods, or add predictive power to them, in order to justify the money and time invested in external validation in a much larger trial

Rosenwald A et al, The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med 346:1937-1947, 2002

- 670 of 7399 microarray features were significantly associated with a good or a bad survival outcome in the training set (N=160)
- Selected 16 genes that were highly variable in expression out of the 670 differentially expressed genes, in order to minimize the number of genes in the outcome predictor
- Categorized the 16 genes based on their physiological function
- outcome-predictor score = (0.241 x the average value of the proliferation signature) + (0.310 x the value for BMP6) - (0.290 x the average value of the germinal-center B-cell signature) - (0.311 x the average value of the major-histocompatibility-complex [MHC] class II signature) - (0.249 x the average value of the lymph-node signature)
- A high score indicates a poor outcome
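The outcome-predictor score is a plain weighted sum, which a short sketch makes explicit. The coefficients are those quoted above; the minus signs on the germinal-center B-cell, MHC class II and lymph-node terms are an assumption here (they follow the published formula as best it can be reconstructed, and are consistent with a high score indicating a poor outcome). The function name and the input values are invented for illustration.

```python
def outcome_predictor_score(proliferation, bmp6, gcb, mhc_class_ii, lymph_node):
    """Weighted sum of signature averages; a higher score means a poorer outcome.

    Signs of the last three terms are assumed (see the lead-in).
    """
    return (
        0.241 * proliferation
        + 0.310 * bmp6
        - 0.290 * gcb
        - 0.311 * mhc_class_ii
        - 0.249 * lymph_node
    )


# Invented signature values, for illustration only.
score = outcome_predictor_score(
    proliferation=1.0, bmp6=1.0, gcb=1.0, mhc_class_ii=1.0, lymph_node=1.0
)
print(round(score, 3))
```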

Will the gene-expression-based outcome-predictor score add predictive accuracy to the standard International Prognostic Index (IPI)?
- Rosenwald et al used a validation set of N=80 to see whether the predictor score adds predictive power to the IPI

Fig 2. Survival curves for diffuse large-B-cell lymphoma patients by gene expression classifier, stratified by three levels of International Prognostic Index (IPI) score: (A) IPI scores 0-1; (B) IPI scores 2-3; (C) IPI scores 4-5. Four prognostic classes were defined based on gene expression risk score. Simon, R. J Clin Oncol; 23:7332-7341 2005 Copyright American Society of Clinical Oncology

Step 3 & 4: Translate classifier to platform that would be used for broad clinical application and demonstrate that the classifier is reproducible
- RT-PCR can be performed on formalin-fixed paraffin-embedded (FFPE) tissue
- Standardization of assay protocols is crucial to achieve satisfactory inter- and intra-laboratory reproducibility

Step 5: Independent external validation of the completely specified classifier on a prospectively planned study
- The objective is to determine whether use of a completely specified diagnostic classifier for therapeutic decision making in a defined clinical context results in patient benefit
- The objective is not to see whether the same genes are prognostic or whether the same classifier is obtained

Example
Paik S, Shak S, Tang G, et al: A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 351:2817-2826, 2004
- Constructed a recurrence-score algorithm based on a training set of N=447 patients
- Classified patients into low risk (recurrence score < 18), intermediate risk (18 <= recurrence score < 31), and high risk (recurrence score >= 31)
- Want to test the strategy of withholding cytotoxic chemotherapy from the subset of patients classified as low risk
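The cutoffs that turn the continuous recurrence score into three risk groups can be written as a small function. Only the thresholds 18 and 31 come from the slide; the function name is illustrative, and the computation of the recurrence score itself is not shown.

```python
def risk_group(recurrence_score):
    """Map a recurrence score to the risk categories used by Paik et al."""
    if recurrence_score < 18:
        return "low"
    if recurrence_score < 31:
        return "intermediate"
    return "high"


for rs in (5, 17.9, 18, 30, 31, 60):
    print(rs, risk_group(rs))
```

Writing the boundaries as half-open intervals means every score falls into exactly one group, including the boundary values 18 and 31.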

Design 1
Patients -> randomization
- Treatment without guidance from the classifier: tamoxifen + chemo -> follow-up -> end point
- Treatment guided by the classifier (RT-PCR assay + classifier):
  - high risk: tamoxifen + chemo -> follow-up -> end point
  - low risk: tamoxifen -> follow-up -> end point
Patients categorized as high risk receive the same treatment (tamoxifen + chemo) whichever arm they are assigned to
Requires a huge sample size, especially when high-risk patients account for the majority of the cohort

Design 2
Patients -> RT-PCR assay + classifier -> low risk -> randomization
- tamoxifen + chemo -> follow-up -> end point
- tamoxifen -> follow-up -> end point
Pros: Requires fewer patients and is more efficient than the previous design, which randomizes all patients
Cons: The recurrence rate is low for low-risk patients, so the trial still needs many patients to have the power to address a small difference between the arms and support a claim of equivalence
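A rough calculation shows why Design 1 needs so many more randomized patients than Design 2. The sketch uses the standard normal-approximation sample size formula for comparing two proportions, which is a superiority-type calculation and only a stand-in for a proper equivalence design. Every input rate below is hypothetical and chosen purely for illustration; none comes from Paik et al or from the slides.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients per arm to detect p1 vs p2 (two-sided test)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(z * z * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2)


# Hypothetical inputs, for illustration only.
low_fraction = 0.5     # share of patients the classifier calls low risk
rec_low_chemo = 0.05   # recurrence rate, low risk, tamoxifen + chemo
rec_low_tam = 0.07     # recurrence rate, low risk, tamoxifen alone
rec_high = 0.30        # recurrence rate, high risk (tamoxifen + chemo in both arms)

# Design 1: all patients are randomized, so high-risk patients, who are
# treated identically in both arms, dilute the difference between arms.
arm_unguided = low_fraction * rec_low_chemo + (1 - low_fraction) * rec_high
arm_guided = low_fraction * rec_low_tam + (1 - low_fraction) * rec_high
design1 = n_per_arm(arm_unguided, arm_guided)

# Design 2: only patients classified as low risk are randomized.
design2 = n_per_arm(rec_low_chemo, rec_low_tam)

print(design1, design2)
```

Under these assumed rates the between-arm difference in Design 1 is the low-risk difference multiplied by the low-risk fraction, and the required sample size grows roughly with the inverse square of that diluted difference.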

Design 3
Patients -> RT-PCR assay + classifier -> low risk -> tamoxifen -> follow-up -> end point
- Single-arm
- Requires the smallest number of patients
- If, after long follow-up, the recurrence rate is very low, the classifier is considered validated as providing clinical benefit, because it enables the identification of patients whose prognosis is so good with tamoxifen monotherapy that they can be spared cytotoxic chemotherapy

Paik S, et al N Engl J Med 351:2817-2826, 2004

- Using a genomic classifier to focus a clinical trial on the population that is more likely to benefit can reduce the required sample size dramatically
- When long-term follow-up is difficult, a prospectively planned validation can be conducted using archived specimens from patients treated in a previously conducted prospective multicenter clinical trial
- The study protocol should be developed prospectively

In Paik S, et al 2004, they stated: The prospectively defined assay methods and end points were finalized in a protocol signed on August 27, 2003. RT-PCR analysis was initiated on September 5, 2003, and RT-PCR data were transferred to the NSABP for analysis on September 29, 2003.