AIMS: Absolute Assignment of Breast Cancer Intrinsic Molecular Subtype

Similar documents
Package AIMS. June 29, 2018

MethylMix An R package for identifying DNA methylation driven genes

metaseq: Meta-analysis of RNA-seq count data

GeneOverlap: An R package to test and visualize

Introduction to antiprofiles

RNA preparation from extracted paraffin cores:

Using the DART package: Denoising Algorithm based on Relevance network Topology

Using Messina. Mark Pinese. October 13, Introduction The problem Example: Designing a colon cancer screening test...

Breast cancer classification: beyond the intrinsic molecular subtypes

splicer: An R package for classification of alternative splicing and prediction of coding potential from RNA-seq data

Infer mirna-mrna interactions using paired expression data from a single sample

Supervised analysis of MS images using Cardinal

Comparison of Triple Negative Breast Cancer between Asian and Western Data Sets

Immunohistochemical classification of breast tumours

Contemporary Classification of Breast Cancer

Detecting gene signature activation in breast cancer in an absolute, single-patient manner

CLINICOPATHOLOGIC FEATURES AND MOLECULAR SUBTYPES OF BREAST CANCER IN FEZ-MEKNES REGION (MOROCCO): A STUDY OF 390 PATIENTS

Concordance among Gene-Expression Based Predictors for Breast Cancer

TITAN: Subclonal copy number and LOH prediction from whole genome sequencing of tumours

Breast cancer in elderly patients (70 years and older): The University of Tennessee Medical Center at Knoxville 10 year experience

Breast cancer: Molecular STAGING classification and testing. Korourian A : AP,CP ; MD,PHD(Molecular medicine)

The LiquidAssociation Package

FISH mcgh Karyotyping ISH RT-PCR. Expression arrays RNA. Tissue microarrays Protein arrays MS. Protein IHC

Genomic landscape of breast cancer

Reporting of Breast Cancer Do s and Don ts

J Clin Oncol 24: by American Society of Clinical Oncology INTRODUCTION

Molecular subclasses of breast cancer: how do we define them? The IMPAKT 2012 Working Group Statement

How To Use SubpathwayGMir

Implications of Progesterone Receptor Status for the Biology and Prognosis of Breast Cancers

Applying Metab. Raphael Aggio. October 30, This document describes how to use the function included in the R package Metab.

Heather M. Gage, MD, Avanti Rangnekar, Robert E. Heidel, PhD, Timothy Panella, MD, John Bell, MD, and Amila Orucevic, MD, PhD

OVERVIEW OF GENE EXPRESSION-BASED TESTS IN EARLY BREAST CANCER

ThePrognosticEaseandDifficultyofInvasiveBreast Carcinoma

Relevancia práctica de la clasificación de subtipos intrínsecos en cáncer de mama Miguel Martín Instituto de Investigación Sanitaria Gregorio Marañón

10/15/2012. Biologic Subtypes of TNBC. Topics. Topics. Histopathology Molecular pathology Clinical relevance

SubLasso:a feature selection and classification R package with a. fixed feature subset

Profili Genici e Personalizzazione del trattamento adiuvante nel carcinoma mammario G. RICCIARDI

Predictive Assays in Radiation Therapy

Table S2. Expression of PRMT7 in clinical breast carcinoma samples

Histological Type. Morphological and Molecular Typing of breast Cancer. Nottingham Tenovus Primary Breast Cancer Study. Survival (%) Ian Ellis

Package AbsFilterGSEA

Computer Science, Biology, and Biomedical Informatics (CoSBBI) Outline. Molecular Biology of Cancer AND. Goals/Expectations. David Boone 7/1/2015

A new way of looking at breast cancer tumour biology

CLINICAL IMPLEMENTATION OF BREAST CANCER GENOMICS. Joel Stanley Parker

30 years of progress in cancer research

Quality assessment of TCGA Agilent gene expression data for ovary cancer

Case Studies on High Throughput Gene Expression Data Kun Huang, PhD Raghu Machiraju, PhD

Multigene Testing in NCCN Breast Cancer Treatment Guidelines, v1.2011

Role of Genomic Profiling in (Minimally) Node Positive Breast Cancer

Classification of Breast Cancer Subtypes by combining Gene Expression and DNA Methylation Data

CHARLES M. PEROU. The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA

Supplementary Data for:

Response to Paclitaxel in Node-positive Triple Negative Breast Cancer

Gene Signatures in Breast Cancer: Moving Beyond ER, PR, and HER2? Lisa A. Carey, M.D. University of North Carolina USA

Prosigna BREAST CANCER PROGNOSTIC GENE SIGNATURE ASSAY

Prosigna BREAST CANCER PROGNOSTIC GENE SIGNATURE ASSAY

Extent of ductal carcinoma in situ according to breast cancer subtypes: a population-based cohort study

A Retrospective Analysis of Clinical Utility of AJCC 8th Edition Cancer Staging System for Breast Cancer

Genes and functions from breast cancer signatures

Package MethPed. September 1, 2018

Principles of breast radiation therapy

Question 1 A. ER-, PR-, HER+ B. ER+, PR+, HER2- C. ER-, PR+, HER2- D. ER-, PR-, HER2- E. ER-, PR+, HER2+

A fatty liver study on Mus musculus

A fatty liver study on Mus musculus

Prognostic implications of the intrinsic molecular subtypes in male breast cancer

38 Int'l Conf. Bioinformatics and Computational Biology BIOCOMP'16

How To Use MiRSEA. Junwei Han. July 1, Overview 1. 2 Get the pathway-mirna correlation profile(pmset) and a weighting matrix 2

Module 3: Pathway and Drug Development

Updates in Molecular Classification of Breast Cancer. Emmanuel Agosto-Arroyo, MD Assistant Member Department of Anatomic Pathology

Genetic alterations of histone lysine methyltransferases and their significance in breast cancer

Good Old clinical markers have similar power in breast cancer prognosis as microarray gene expression profilers q

Alternate Gene Signatures, or Not

ncounter Assay Automated Process Immobilize and align reporter for image collecting and barcode counting ncounter Prep Station

Poly ADP-ribose Polymerase PARP Staining for Immunohistological Investigation of Primary Breast Cancer

Molecular Classification of Triple-Negative Breast Cancer (TNBC) Aleix Prat, MD

Microarray bioinformatics in cancer- a review

Accelerating Innovation in Statistical Design

Multiplexed Cancer Pathway Analysis

Application of Artificial Neural Network-Based Survival Analysis on Two Breast Cancer Datasets

A Prospective Comparison of the 21-Gene Recurrence Score and the PAM50-Based Prosigna in Estrogen Receptor-Positive Early-Stage Breast Cancer

Molecular Classification of Iraqi Breast Cancer Patients and Its Correlation with Patients Profile

Predicting outcome in metastatic breast cancer

Commentary The promise of microarrays in the management and treatment of breast cancer Jenny C Chang, Susan G Hilsenbeck and Suzanne AW Fuqua

SUPPLEMENTAY FIGURES AND TABLES

Proteomic Biomarker Discovery in Breast Cancer

Morphological and Molecular Typing of breast Cancer

CME. CK5 Is More Sensitive Than CK5/6 in Identifying the Basal-like Phenotype of Breast Carcinoma

FAQs for UK Pathology Departments

Molecular classification of breast cancer in Egyptian cohort: 10 years follow-up and survival

Classification of Human Breast Cancer Using Gene Expression Profiling as a Component of the Survival Predictor Algorithm

Clinical pathological and epidemiological study of triple negative breast cancer

Mirsynergy: detect synergistic mirna regulatory modules by overlapping neighbourhood expansion

T. R. Golub, D. K. Slonim & Others 1999

Long non-coding RNA expression profiles predict metastasis in lymph node-negative breast cancer independently of traditional prognostic markers

CoINcIDE: A framework for discovery of patient subtypes across multiple datasets

Microarrays in breast cancer research and clinical practice the future lies ahead

She counts on your breast cancer expertise at the most vulnerable time of her life.

Breast Cancer. Molecular Classification of Breast Cancer: Limitations and Potential. The Oncologist 2006;11:

Emerging Data on Multianalyte Algorithm Assays in Breast Cancer: A Clinical Review

User Guide. Association analysis. Input

Transcription:

AIMS: Absolute Assignment of Breast Cancer Intrinsic Molecular Subtype Eric R. Paquet (eric.r.paquet@gmail.com), Michael T. Hallett (michael.t.hallett@mcgill.ca) 1 1 Department of Biochemistry, Breast cancer informatics, McGill University, Montreal, Canada October 30, 17 Contents 1 Introduction 2 2 Case Study: Assigning breast cancer subtype to a dataset of breast cancer microarray data 3 3 Session Info 8 1

1 Introduction Technologies to measure gene expression in a massively parallel fashion have facilitated a more complete molecular understanding of breast cancer (BC) and a deeper appreciation for the heterogeneity of this disease[1, 2, 3]. Although the subtypes defined by Estrogen Receptor (ER) and Human Epidermal growth factor Receptor 2 (Her2) status have long been recognized as distinct forms of the disease, early genomic studies emphasized the magnitude of the differences between the subtypes at the molecular level [4, 5]. Unbiased bioinformatic analyses of expression profiles provided the so-called intrinsic subtyping scheme, consisting of the Luminal A (LumA), Luminal B (LumB) and Normal-like (NormL) subtypes enriched for ER+ tumors, the Her2-enriched (Her2E) subtype containing many Her2+ tumors and the Basal-like (BasalL) subtype enriched for ER-/Her2- tumors [1, 2]. The subtypes have differing clinicopathological attributes, prognostic characteristics, and treatment options, providing sufficient clinical utility as to support their inclusion within international guidelines for treatment[6]. In fact, the ProsignaTM PAM50-based risk of recurrence (ROR) score generated from the intrinsic subtypes has recently received FDA approved for clinical use [7]. Several alternative omics-based subtyping schemes promise similar utility[8, 9, 10, 11]. Subtyping tools built in the manner of PAM50 have severe shortcomings[12, 13, 14]. The application of normalization procedures and gene-centering techniques are necessary to remove batch effects and insure comparability between expression levels of genes across patients[12, 13]. However, different normalization procedures have been shown to lead to different subtype classifications for patients[12, 13, 14, 15]. In lieu of absolute estimations of mrna copy number per gene per individual, gene-centering techniques are used to adjust expression measurements to represent the change in abundance of a specific mrna species relative to the cohort of patients. However, these steps make such methods sensitive to the composition of patients in the dataset. Both differences in the ratio of ER+ to ER- samples and differences in the frequency of each intrinsic subtype within a dataset have been shown to influence subtype assignments[12]. It remains unknown as to the degree of imbalance necessary to cause such subtype instability. We present a bioinformatics approach entitled Absolute Intrinsic Molecular Subtyping AIMS for estimating patient subtype that circumvents these shortcomings. The method does not require a panel of gene expression samples. As such, it is the first method that can accurately assign subtype to a single patient whilst being insensitive to changes in normalization procedures or the relative frequencies of ER+ tumors, of subtypes or of other clinicopathological patient attributes. The stability and accuracy of AIMS is explored for the intrinsic subtyping scheme across several datasets generated via microarrays or RNA-Seq. More detail about AIMS could be found in Paquet et al. (in review at the Journal of the National Cancer Institute). The AIMS package is providing the necessary functions to assign the five intrinsic subtypes of breast cancer to either a single gene expression experiment or to a dataset of gene expression data [16]. 2

2 Case Study: Assigning breast cancer subtype to a dataset of breast cancer microarray data We first need to load the package AIMS and our example dataset. In this case study we will use a fraction of the McGill dataset describe in the paper and also breastcancervdx. > library(aims) > data(mcgillexample) To get breast cancer subtypes for a dataset we need to provide the expression values that have not been gene centered. It means all the expression values will be positive. In the case of a two-colors array the user should select only the channel that contains the tumor sample (usually the Cy5 channel). In the previous code we have shown the size of the expression matrix for the McGill example as well as the first expression values and the characters vector of Entrez ids. AIMS require the use of Entrez ids to prevent any confusion related to unstable gene symbols. > mcgill.subtypes <- applyaims(mcgillexample$d, + mcgillexample$entrezid) > names(mcgill.subtypes) [1] "cl" "prob" "all.probs" "rules.matrix" [5] "data.used" "EntrezID.used" > head(mcgill.subtypes$cl) 1 "Normal" 2 "LumA" 3 "LumB" 4 "Her2" 5 "LumA" 6 "LumB" > head(mcgill.subtypes$prob) 1 0.7784899 2 1.0000000 3 0.9929477 4 0.9722196 5 0.9998480 6 0.9533438 3

> table(mcgill.subtypes$cl) Basal Her2 LumA LumB Normal 52 65 98 54 52 applyaims is the function used to assign AIMS to both single sample and dataset of gene expression data. The first parameter should be a numerical matrix composed of positiveonly values. Rows represent genes and columns samples. The second argument represents the EntrezIds corresponding to genes in the first parameter. AIMS will deal with duplicated EntrezId so you should leave them there. applyaims will return a list of arguments. cl represents the molecular assignment in (Basal,Her2, LumA (Luminal A), LumB (Luminal B) and Normal). The variable prob corresponds to a matrix with five columns and number of rows corresponding to the number of samples in D. This matrix contains the posterior probabilities returned from the Naive Bayes classifier. > mcgill.first.sample.subtype <- applyaims(mcgillexample$d[,1,drop=false], + mcgillexample$entrezid) > names(mcgill.first.sample.subtype) [1] "cl" "prob" "all.probs" "rules.matrix" [5] "data.used" "EntrezID.used" > head(mcgill.first.sample.subtype$cl) 1 "Normal" > head(mcgill.first.sample.subtype$prob) 1 0.7784899 > table(mcgill.first.sample.subtype$cl) Normal 1 This is the same example as before except now we are assigning subtype to only one sample. 4

> library(breastcancervdx) > library(hgu133a.db) > data(vdx) > hgu133a.entrez <- as.character(as.list(hgu133aentrezid)[featurenames(vdx)]) > vdx.subtypes <- applyaims(vdx, + hgu133a.entrez) > names(vdx.subtypes) [1] "cl" "prob" "all.probs" "rules.matrix" [5] "data.used" "EntrezID.used" > head(vdx.subtypes$cl) VDX_3 "Normal" VDX_5 "LumB" VDX_6 "Basal" VDX_7 "Basal" VDX_8 "LumB" VDX_9 "Her2" > head(vdx.subtypes$prob) VDX_3 1.0000000 VDX_5 0.8993610 VDX_6 1.0000000 VDX_7 1.0000000 VDX_8 1.0000000 VDX_9 0.6806823 > table(vdx.subtypes$cl) Basal Her2 LumA LumB Normal 98 65 66 88 27 Here we are assigning AIMS subtypes on the vdx dataset. We are using ( hgu133a.db) to obtain EntrezIds corresponding to the probes of the HG-133A array. AIMS has been designed to be applicable on this platform. 5

References [1] Perou CM, Sorlie T, Eisen MB, et al. Molecular portraits of human breast tumours. Nature 00;406(6797):747-52. [2] Sorlie T, Perou CM, Tibshirani R, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 01;98(19):10869-74. [3] Weigelt B, Baehner FL, Reis-Filho JS. The contribution of gene expression profiling to breast cancer classification, prognostication and prediction: a retrospective of the last decade. J Pathol 10;2(2):263-80. [4] Gruvberger S, Ringner M, Chen Y, et al. Estrogen receptor status in breast cancer is associated with remarkably distinct gene expression patterns. Cancer Res 01;61(16):5979-84. [5] Pusztai L, Ayers M, Stec J, et al. Gene expression profiles obtained from fine-needle aspirations of breast cancer reliably identify routine prognostic markers and reveal large-scale molecular differences between estrogen-negative and estrogen-positive tumors. Clin Cancer Res 03;9(7):2406-15. [6] Goldhirsch A, Wood WC, Coates AS, et al. Strategies for subtypes dealing with the diversity of breast cancer: highlights of the St. Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 11. Ann Oncol 11;22(8):1736-47. [7] Harbeck N, Sotlar K, Wuerstlein R, et al. Molecular and protein markers for clinical decision making in breast cancer: Today and tomorrow. Cancer Treat Rev 13; 10.1016/j.ctrv.13.09.014. [8] Guedj M, Marisa L, de Reynies A, et al. A refined molecular taxonomy of breast cancer. Oncogene 12;31(9):1196-6. [9] Lehmann BD, Bauer JA, Chen X, et al. Identification of human triple-negative breast cancer subtypes and preclinical models for selection of targeted therapies. J Clin Invest 11;121(7):2750-67. [10] Jonsson G, Staaf J, Vallon-Christersson J, et al. Genomic subtypes of breast cancer identified by array-comparative genomic hybridization display distinct molecular and clinical characteristics. Breast Cancer Res 10;12(3):R42. [11] Curtis C, Shah SP, Chin SF, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 12;486(7403):346-52. [12] Lusa L, McShane LM, Reid JF, et al. Challenges in projecting clustering results across gene expression-profiling datasets. J Natl Cancer Inst 07;99(22):1715-23. [13] Sorlie T, Borgan E, Myhre S, et al. The importance of gene-centring microarray data. Lancet Oncol 10;11(8):719-; author reply 7-1. 6

[14] Weigelt B, Mackay A, A Hern R, et al. Breast cancer molecular profiling with single sample predictors: a retrospective analysis. Lancet Oncol 10;11(4):339-49. [15] Perou CM, Parker JS, Prat A, et al. Clinical implementation of the intrinsic subtypes of breast cancer. Lancet Oncol 10;11(8):718-9; author reply 7-1. [16] Parker JS, Mullins M, Cheang MCU, et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. Journal of clinical oncology 09;27:1160-7. 7

3 Session Info ˆ R Under development (unstable) (17-10- r73567), x86_64-pc-linux-gnu ˆ Locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=C, LC_MONETARY=en_US.UTF-8, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.UTF-8, LC_IDENTIFICATION=C ˆ Running under: Ubuntu 16.04.3 LTS ˆ Matrix products: default ˆ BLAS: /home/biocbuild/bbs-3.7-bioc/r/lib/librblas.so ˆ LAPACK: /home/biocbuild/bbs-3.7-bioc/r/lib/librlapack.so ˆ Base packages: base, datasets, grdevices, graphics, methods, parallel, stats, stats4, utils ˆ Other packages: AIMS 1.11.0, AnnotationDbi 1.41.0, Biobase 2.39.0, BiocGenerics 0.25.0, IRanges 2.13.0, S4Vectors 0.17.0, breastcancervdx 1.15.0, e1071 1.6-8, hgu133a.db 3.2.3, org.hs.eg.db 3.4.2 ˆ Loaded via a namespace (and not attached): DBI 0.7, RSQLite 2.0, Rcpp 0.12.13, bit 1.1-12, bit64 0.9-7, blob 1.1.0, class 7.3-14, compiler 3.5.0, digest 0.6.12, memoise 1.1.0, pkgconfig 2.0.1, rlang 0.1.2, tibble 1.3.4, tools 3.5.0 8