PERSPECTIVE 1
LARGE SCALE DATASET EXAMPLES
MolEcular Taxonomy of BReast cancer International Consortium (METABRIC)
Samuel Aparicio, PhD FRCPath
Nan and Lorraine Robertson Chair of Breast Cancer Research
BC Cancer Agency, Vancouver / UBC
saparicio@bccrc.ca
People
Cambridge: Carlos Caldas (PI); Christina Curtis (lead bioinformatician); James Brenton (Co-PI); Irene Papatheodorou (database bioinformatician)
Vancouver: Sam Aparicio (PI); Sohrab Shah (lead bioinformatician)
Manitoba, Canada: Lee Murphy (Co-PI)
Guy's, UK: Arnie Purushotham (Co-PI)
Nottingham: Ian Ellis
METABRIC (MolEcular Taxonomy of BReast cancer International Consortium)
5 centres: 2 Canadian, 3 UK
Canada: Vancouver (Sam Aparicio), Manitoba
UK: Cambridge (Carlos Caldas), Nottingham, Guy's
Develop a robust multi-modality molecular prognostic classification of breast cancer
Establish a platform of resources for future validation studies
Target discovery in 2000 frozen breast cancers annotated with 10 year outcomes
HOW MANY DISEASES MAKE UP BREAST CANCER?
[Diagram: breast cancer subdivided by ER and HER2 status; recoverable labels follow]
~65% ER+; ER- ~35%
HER2- ER+: low mitotic rate; high mitotic rate; special types (usually low mitotic rate): lobular, tubular. Endocrine therapy +/- cytotoxic chemo
~15% HER2+: trastuzumab
HER2- ER-: express basal markers; non basal. Heterogeneous group, younger patients, fewer therapeutic options
Breast cancer is heterogeneous and we cannot fully explain the variation (Caldas & Aparicio, Nature 2002)
HOW MANY DISEASES IS BREAST CANCER?
LUMINAL: ER+, PR+, HER2 non-amp
TRIPLE NEGATIVE: HER2 non-amp, ER-, PR-
HER2+, LUMINAL (ER+)
HER2+, ER-, PR-
(Cheang et al., Clin. Cancer Res. 2008)
Clinical and biological questions being addressed by METABRIC
Can a good prognosis group of women with node-negative disease at very low risk of relapse be clearly identified (and hence for whom the risk-benefit ratio might be in favour of withholding chemotherapy)?
Can women with aggressive disease at risk of local or distant relapse be more clearly identified?
Can surrogate molecular markers of nodal status be developed?
What are the underlying molecular drivers of the subtypes?
Frustrations that led to METABRIC
The expression datasets are too small (a few hundred patients at most) to fully represent the breast cancer subtypes
The current expression sets have been measured on different platforms, with different probesets and platform variance
Highly curated long term clinical outcomes data are missing for most datasets
Fixed expression changes driven by copy number have not been systematically mapped
Mapping of mutation space has not been done
METABRIC core dataset
SNP6.0 and Illumina arrays in > 2100 breast cancers with > 5 yrs clinical follow up
1000 node -ve, no systemic therapy
500 node +ve, ER +ve, hormone therapy alone
500 node +ve, ER -ve, with systemic therapy
450 matched normals
Multipoint high resolution genomic analysis
High density aCGH
Expression profiling including alternative splicing
microRNA/epigenomic analysis
Resequencing of hotspot genes (p53, PIK3CA) in 2000 cases
Platform resources (amplified DNA, RNA, TMA)
microRNA expression (in progress)
Whole genome next gen sequencing of a subset
Augmented HMM distinguishes CNVs from somatic changes
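As a rough illustration of HMM-based copy number calling, the sketch below runs Viterbi decoding over a short log2 ratio profile with three states (loss, neutral, gain). It is a plain textbook HMM, not the augmented model referred to on the slide, and it does not attempt to separate germline CNVs from somatic changes. The state means, noise level, transition probability, and the example profile are all illustrative assumptions, not METABRIC values.

```python
import math

# Hypothetical three-state model; means are illustrative log2 ratios,
# not parameters taken from METABRIC.
STATES = ["loss", "neutral", "gain"]
MEANS = {"loss": -0.5, "neutral": 0.0, "gain": 0.5}
SIGMA = 0.2        # assumed probe noise (standard deviation)
STAY_PROB = 0.9    # assumed probability of staying in the same state


def log_gauss(x, mean, sigma):
    """Log density of a normal distribution evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)


def log_trans(prev, cur):
    """Log transition probability; the remaining mass is split evenly."""
    if prev == cur:
        return math.log(STAY_PROB)
    return math.log((1 - STAY_PROB) / (len(STATES) - 1))


def viterbi(log_ratios):
    """Return the most likely state label for each probe."""
    if not log_ratios:
        return []
    # Uniform prior over the starting state.
    score = {s: math.log(1 / len(STATES)) + log_gauss(log_ratios[0], MEANS[s], SIGMA)
             for s in STATES}
    back = []
    for x in log_ratios[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_trans(p, cur))
            new_score[cur] = (score[best_prev] + log_trans(best_prev, cur)
                              + log_gauss(x, MEANS[cur], SIGMA))
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy profile: neutral, a gained segment, neutral, a lost segment, neutral.
profile = [0.02, -0.05, 0.01, 0.55, 0.48, 0.60, 0.52,
           0.03, -0.02, 0.04, -0.52, -0.47, -0.55, -0.50, 0.0, 0.02]
calls = viterbi(profile)
print(list(zip(profile, calls)))
```

The sticky transition probability is what turns noisy per-probe values into contiguous segments: a state change is only accepted when several adjacent probes support it.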
Segmental copy number instability dominates the landscape in breast cancers
Loci labelled: IGF1R, ERBB2, ESR1, FRS2, DUSP6, RASSF9
Small subgroups become apparent with 1000+ cases
Loci labelled: EGFR (7p12), IGF1R (15q26), K-Ras (12p12), Mir21 (7q21)
Homozygous deletions form part of the landscape
Loci labelled: PTEN, BRCA2, CDKN2A, UTX
Approximately 30% of genes have cis-acting copy number induced expression
[Figure: HER2 locus]
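One simple way to look for cis-acting copy number effects is to correlate, gene by gene, copy number against expression across tumours and count the genes above a cutoff. The sketch below does this with a Pearson correlation. The gene names, values, and the 0.5 threshold are hypothetical, and this is not necessarily how the ~30% figure on the slide was derived.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cis_fraction(copy_number, expression, threshold=0.5):
    """Return genes whose expression tracks their own copy number, and their fraction.

    threshold is an assumed cutoff on the correlation coefficient.
    """
    genes = [g for g in copy_number if g in expression]
    cis = [g for g in genes if pearson(copy_number[g], expression[g]) >= threshold]
    return cis, len(cis) / len(genes)


# Toy data: one value per tumour, same tumour order in both tables.
# Gene names and numbers are made up for illustration.
copy_number = {
    "GENE_A": [2, 2, 3, 4, 6],
    "GENE_B": [2, 3, 2, 4, 2],
    "GENE_C": [1, 2, 2, 3, 2],
}
expression = {
    "GENE_A": [1.0, 1.1, 1.6, 2.2, 3.1],
    "GENE_B": [2.0, 1.9, 2.1, 2.0, 2.0],
    "GENE_C": [0.9, 0.4, 1.2, 0.5, 1.0],
}

cis_genes, fraction = cis_fraction(copy_number, expression)
print(cis_genes, round(fraction, 2))
```

In this toy table only GENE_A has expression that rises with its own copy number. A real analysis would also need to correct for multiple testing and for tumour purity.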
The era of cancer genome sequencing is upon us: how should we traverse the landscape?
Individual breast cancer patients have multiple diseases: mutational heterogeneity
[Diagram: a tumour made up of lymphocytes, supporting normal cells, and malignant cells with or without mutations; primary and recurrence separated by a 9 year interval]
5 somatic mutations were present at dominant frequencies in the primary: PALB2, HAUS3, SLC24A4, ABCB11, SNX4
6 somatic mutations were present at low frequencies (1-13%) in the primary: KIF1C, USP28, MYH8, MORC1, KIAA1468, RNASEH2A
19 of the 32 somatic coding nonsynonymous mutations were not detectable in the primary tumour cell population
Challenges for the network modeling
We have disparate data types: continuous expression measurements and discrete types
The genome aberration data have a sparse distribution
There is intra-tumoural heterogeneity
Ongoing work
Genome sequencing of 200 triple negative breast cancers, incorporating revised copy number/expression classification
Clinical trial-associated sequencing of tumours (PARP)
Profiling of drug resistance by sequencing and genome-wide siRNA screens