A Comprehensive Pan-Cancer Molecular Study of Gynecologic and Breast Cancers

Article

Highlights
Integrated analysis finds molecular features characteristic of gynecologic tumors
Subtypes with high leukocyte infiltration, a marker for immune response, identified
Gene-lncRNA interaction network of ESR1, DKC1, and lncRNAs TERC, NEAT1, and TUG1 identified
Decision tree to group patients into clinically relevant prognostic subtypes proposed

Authors
Ashton C. Berger, Anil Korkut, Rupa S. Kanchi, ..., Gordon B. Mills, Douglas A. Levine, Rehan Akbani

Correspondence
jweinste@mdanderson.org (J.N.W.), gmills@mdanderson.org (G.B.M.), douglas.levine@nyumc.org (D.A.L.), rakbani@mdanderson.org (R.A.)

In Brief
By performing molecular analyses of 2,579 TCGA gynecological (OV, UCEC, CESC, and UCS) and breast tumors, Berger et al. identify five prognostic subtypes using 16 key molecular features and propose a decision tree based on six clinically assessable features that classifies patients into the subtypes.

Berger et al., 2018, Cancer Cell 33, April 9, 2018 © 2018 Elsevier Inc.

Cancer Cell
Article

Ashton C. Berger,1,36 Anil Korkut,2,36 Rupa S. Kanchi,2,36 Apurva M. Hegde,2 Walter Lenoir,2 Wenbin Liu,2 Yuexin Liu,2 Huihui Fan,3 Hui Shen,3 Visweswaran Ravikumar,2 Arvind Rao,2 Andre Schultz,2 Xubin Li,2 Pavel Sumazin,4 Cecilia Williams,5 Pieter Mestdagh,6 Preethi H. Gunaratne,7,8 Christina Yau,9,10 Reanne Bowlby,11 A. Gordon Robertson,11 Daniel G. Tiezzi,12 Chen Wang,13,14 Andrew D. Cherniack,1,15 Andrew K. Godwin,16 Nicole M. Kuderer,17 Janet S. Rader,18 Rosemary E. Zuna,19 Anil K. Sood,20 Alexander J. Lazar,21,22,23 Akinyemi I. Ojesina,24 Clement Adebamowo,25,26 Sally N. Adebamowo,25 Keith A. Baggerly,2 Ting-Wen Chen,4,27 Hua-Sheng Chiu,4 Steve Lefever,6 Liang Liu,28 Karen MacKenzie,29 Sandra Orsulic,30 Jason Roszik,22,31 Carl Simon Shelley,32 Qianqian Song,28 Christopher P. Vellano,33 Nicolas Wentzensen,34 The Cancer Genome Atlas Research Network, John N. Weinstein,2,33,* Gordon B. Mills,33,* Douglas A. Levine,35,* and Rehan Akbani2,37,*

1 The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
2 Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
3 Center for Epigenetics, Van Andel Research Institute, 333 Bostwick Avenue NE, Grand Rapids, MI 49503, USA
4 Texas Children's Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA
5 Department of Protein Sciences, CBH, KTH Royal Institute of Technology, Science for Life Laboratory, Tomtebodavägen 23, Solna, Sweden
6 Department of Pediatrics and Medical Genetics, Ghent University, Ghent, Belgium
7 Department of Biology & Biochemistry, UH-Sequencing Core, University of Houston, Houston, TX 77204, USA
8 Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
9 Buck Institute of Research on Aging, Novato, CA 94945, USA
10 Department of Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
11 BC Cancer Agency, Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
(Affiliations continued below)

SUMMARY

We analyzed molecular data on 2,579 tumors from The Cancer Genome Atlas (TCGA) of four gynecological types plus breast. Our aims were to identify shared and unique molecular features, clinically significant subtypes, and potential therapeutic targets. We found 61 somatic copy-number alterations (SCNAs) and 46 significantly mutated genes (SMGs). Eleven SCNAs and 11 SMGs had not been identified in previous TCGA studies of the individual tumor types. We found functionally significant estrogen receptor-regulated long non-coding RNAs (lncRNAs) and gene/lncRNA interaction networks. Pathway analysis identified subtypes with high leukocyte infiltration, raising potential implications for immunotherapy.
Using 16 key molecular features, we identified five prognostic subtypes and developed a decision tree that classified patients into the subtypes based on just six features that are assessable in clinical laboratories.

12 Breast Disease and Gynecologic Oncology Division - Department of Gynecology and Obstetrics, Ribeirão Preto Medical School, University of São Paulo, 3900 Bandeirantes Avenue, Ribeirão Preto, SP, Brazil
13 Department of Health Sciences Research, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905, USA
14 Department of Obstetrics and Gynecology, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905, USA
15 Department of Medical Oncology, Dana Farber Cancer Institute, Boston, MA 02215, USA
16 Department of Pathology and Laboratory Medicine, The University of Kansas Medical Center, 3901 Rainbow Boulevard, Kansas City, KS 66160, USA
17 Advanced Cancer Research Group, Seattle, Washington, and Center for Cancer Innovation, Department of Medicine, University of Washington, WA 98195, USA
18 Department of Obstetrics and Gynecology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
19 Pathology Department, University of Oklahoma Health Sciences Center, Oklahoma City, OK 73104, USA
20 Department of Gynecologic Oncology and Reproductive Medicine, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
21 Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
22 Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
23 Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
24 Department of Epidemiology and Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, AL 35294, USA
25 Department of Epidemiology and Public Health, Institute of Human Virology and Greenebaum Comprehensive Cancer Center, University of Maryland School of Medicine, Baltimore, MD 21201, USA
26 Institute of Human Virology, Abuja, Nigeria
27 Bioinformatics Center, Molecular Medicine Research Center, Chang Gung University, Taoyuan, Taiwan
28 Department of Cancer Biology, Wake Forest Baptist Health Center, Winston Salem, NC 27157, USA
29 School of Women's and Children's Health, University of New South Wales, Sydney, Australia
30 Women's Cancer Program, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
31 Department of Melanoma Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
32 Leukemia Therapeutics, LLC, Hull, MA 02045, USA
33 Department of Systems Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
34 Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
35 Gynecologic Oncology, Perlmutter Cancer Center, New York University Langone Health, New York, NY 10016, USA
36 These authors contributed equally
37 Lead Contact
*Correspondence: jweinste@mdanderson.org (J.N.W.), gmills@mdanderson.org (G.B.M.), douglas.levine@nyumc.org (D.A.L.), rakbani@mdanderson.org (R.A.)

Cancer Cell 33, April 9, 2018 © 2018 Elsevier Inc. This is an open access article under the CC BY-NC-ND license.

Significance

Gynecologic and breast (Pan-Gyn) cancers have a projected incidence of more than 350,000 cases in the United States in 2017, with much larger numbers worldwide. Despite recent clinical advances, more comprehensive information on molecular characteristics of the tumors is a priority. As part of The Cancer Genome Atlas (TCGA) Pan-Cancer Atlas project, we present here an integrated analysis of 2,579 patients' Pan-Gyn cancers at the DNA, RNA, protein, histopathological, and clinical levels. We highlight shared characteristics and unique molecular features of the tumors, identifying clinically significant subtypes and suggesting potential therapeutic targets. Finally, we present a practical decision tree with only six laboratory-assessable molecular features, which classifies patient samples into one of five prognostic molecular subtypes.

INTRODUCTION

Gynecologic cancers share a variety of characteristics: they arise from similar embryonic origins in the Müllerian ducts, their development is influenced by female hormones, and they are managed by a particular medical specialty, gynecologic oncology, as reflected in the departmental organizations of academic medical centers (Mullen and Behringer, 2014). Recently, similarities at the molecular level have been identified across gynecologic and breast cancers in a comprehensive analysis of all 33 TCGA tumor types (Hoadley et al., 2018). Despite the commonalities, however, the various gynecologic cancer types do differ from each other in a variety of intriguing and important ways. The principal aims of the present study are to highlight both similarities and differences among types and subtypes of gynecologic cancers, in addition to the ways in which they differ from non-gynecologic cancers. Because breast tumors share most of the generic characteristics listed above, we have chosen to include them in the analysis. The study focuses on the following five TCGA tumor types: high-grade serous ovarian cystadenocarcinoma (OV), uterine corpus endometrial carcinoma (UCEC), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), uterine carcinosarcoma (UCS), and invasive breast carcinoma (BRCA).
Although each Pan-Gyn organ site is subject to a variety of uncommon histologic cancer subtypes not studied by TCGA, the most frequent and/or aggressive tumors are represented. Despite impressive recent advances in diagnosis and management, these tumors share unmet needs for effective treatment. The analyses here can provide background biological information and prompt hypotheses about therapeutic choices or provide evidence for pre-existing hypotheses. Taken together, the Pan-Gyn cohort reflects a projected incidence of more than 350,000 cases in the United States in 2017 (Siegel et al., 2017), with many more worldwide. Many of the commonalities and differences among cancer types and subtypes presented here were not identified in the individual TCGA disease-type projects (Cancer Genome Atlas Research Network, 2011, 2012, 2017; Cancer Genome Atlas Research Network et al., 2013; Cherniack et al., 2017).

RESULTS

We used data generated from 2,579 TCGA patient samples (the Pan-Gyn cohort; n = 1,087 BRCA, 308 CESC, 579 OV, 548 UCEC, and 57 UCS) using fresh-frozen primary samples prior to any chemotherapy or radiation therapy. All sample collections were approved by local institutional review boards. We analyzed data of multiple types, including clinical, somatic copy-number alterations (SCNAs), mutations, DNA methylation, and expression of mRNA, microRNA (miRNA), long non-coding RNA (lncRNA), and proteins. The data were adjusted for batch effects before further analysis (see the STAR Methods). Here, we (1) present results that distinguish Pan-Gyn from the rest of the TCGA tumor types, (2) summarize platform-specific analysis results, and (3) propose cross-tumor type subtypes with potential prognostic and therapeutic value.

Figure 1. Genomic Features that Distinguish Pan-Gyn from Other Tumor Types
(A) Heatmap showing the frequencies of mutations (green) in 23 genes across all 33 TCGA tumor types and frequencies of amplifications (red) in 23 genes across all 33 TCGA tumor types.
(B) Amplification (red) and deletion (blue) q values from GISTIC2.0 for SCNA peaks of significant copy-number gain and loss plotted for Pan-Gyn versus non-gyn cohorts. Genes named are the suspected targets of amplification or deletion, if identifiable. Otherwise, peaks are labeled with the nearest cytoband's designation. Peaks found in only one cohort were assigned values of NS (not significant) in the other cohort.
Panel A row labels, mutated genes: ARID1A, BCORL1, CCND1, CDH1, CHD4, CTCF, ERBB3, FBXW7, GATA3, KMT2C, MAP3K1, MED12, NSD1, PIK3CA, PIK3R1, PPP2R1A, PTEN, RB1, RNF43, RPL22, SPOP, TP53, ZFHX3. Panel A row labels, amplified genes (as extracted): BRAF, BRD4, CCND1, CCNE1, CD274, ERBB2, FGFR3, IGF1R, JAK2, KAT6A, KAT6B, KEAP1, MCL1, MYC, NEDD9, RECQL5, RIT1, TERC, TERT, YAP1, YEATS4, ZNF217. Panel B axes: Pan-Gyn cancers q value versus non-Gyn cancers q value.
See also Figure S1 and Table S1.
Molecular Features that Distinguish Pan-Gyn from Other Tumor Types

We identified molecular features that differed in frequency among the five Pan-Gyn tumor types and the remaining 28 TCGA non-gynecologic (non-gyn) tumor types. After adjusting for sample size per tumor type, we found 23 genes (including ARID1A, ERBB3, BRCA1, FBXW7, KMT2C, PIK3CA, PIK3R1, PPP2R1A, PTEN, and TP53) that were mutated at higher frequencies across the Pan-Gyn tumor types than across the non-gyn types (false discovery rate [FDR] < 0.01, Fisher's exact test) (Figure 1A). Eighteen of those genes were found to be significantly mutated genes (SMGs) in the Pan-Gyn cohort (as described later).
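The frequency comparison described above pairs a per-gene Fisher's exact test with a false discovery rate correction. The following is a minimal sketch of that idea under stated assumptions: it uses a one-sided test and the Benjamini-Hochberg procedure (the text names neither explicitly), it does not reproduce the adjustment for sample size per tumor type, and the gene names and counts are hypothetical placeholders rather than values from the study.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Row 1 = cohort of interest, row 2 = comparison cohort;
    column 1 = altered samples, column 2 = unaltered samples.
    Returns P(X >= a) under the hypergeometric null of no association.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical (altered, unaltered) counts per cohort; NOT values from the study.
toy_counts = {
    "GENE_A": ((120, 880), (150, 3850)),
    "GENE_B": ((30, 970), (118, 3882)),
}
genes = sorted(toy_counts)
pvals = [fisher_exact_greater(*toy_counts[g][0], *toy_counts[g][1]) for g in genes]
for gene, q in zip(genes, benjamini_hochberg(pvals)):
    print(gene, f"FDR = {q:.3g}")
```

Exact integer arithmetic via `math.comb` keeps the tail probability stable for cohort-sized counts; a statistics library would normally replace both helpers.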

Next, we used GISTIC2.0 (Mermel et al., 2011) to identify statistically significant recurring SCNAs in the Pan-Gyn cohort and, separately, in the non-gyn cohort. We identified 61 significant regions in the Pan-Gyn tumors, 27 amplifications and 34 deletions, of which 12 amplifications and 6 deletions were not found in the non-gyn cohort, suggesting a relative specificity for Pan-Gyn tumors (Figures 1B and S1A; Table S1). Two of the 12 uniquely Pan-Gyn amplifications and one of the 6 deletions had not previously been reported in single-disease TCGA studies of the same tumor types (Cancer Genome Atlas Research Network, 2011, 2012, 2017; Cancer Genome Atlas Research Network et al., 2013; Cherniack et al., 2017). One of the previously unreported amplifications was a focal region in 1q42.3 covering IRF2BP2, which encodes an interferon regulatory factor binding protein that is implicated in cellular differentiation, proliferation, and survival processes (Stadhouders et al., 2015). The other unreported amplification, located in 10p15.1, included an intergenic non-coding region downstream of KLF6 that bears striking resemblance to known oncogenic super-enhancer regions (Zhang et al., 2016) and PFKFB3, a gene that is being investigated as a therapeutic target in various cancers (Cantelmo et al., 2016; Li et al., 2017; Peng et al., 2018). The deletion consisted of a 7 Mb region in 9q34.3 that contains the tumor suppressor genes TSC1 and NOTCH1.

Figures 1B and S1A depict suspected targets of the significant SCNAs in the Pan-Gyn and non-gyn cohorts without adjusting for sample size per tumor type. MECOM, KAT6A, BRD4, NEDD9, MYCL1, and KAT6B were selectively amplified in the Pan-Gyn cohort, whereas SOX2, EGFR, CDK4, MDM4, and CDK6 were selectively amplified in the non-gyn cohort.
MAP2K4 and NF1 were notable tumor suppressor genes with recurring copy-number losses specific to Pan-Gyn tumors, whereas PTPRD, RBFOX1, and TP53 were among the tumor suppressors more commonly deleted in non-gyn samples. Significantly recurring deletions were found in known or putative fragile site genes, including LRRN3 (7q31.1) in non-gyn, ANKS1B (12q23.1) in Pan-Gyn, and RAD51B (14q24) in both cohorts (McAvoy et al., 2007; Miron et al., 2015).

Adjusting for sample size per tumor type, we identified 23 oncogenes among the genes in the 27 Pan-Gyn amplification regions that were consistently more frequently amplified across the five Pan-Gyn tumor types than across the non-gyn types (FDR < 0.05, Fisher's exact test) (Figure 1A). We found no known tumor suppressors within the 34 somatic deletion regions that were more frequently deleted across the Pan-Gyn tumor types than across the non-gyn types. In addition, we identified 197 genes that were statistically significantly hyper- or hypomethylated at different frequencies in the two cohorts (Figure S1B).

We performed bootstrapping-based analyses to investigate whether there were greater numbers of shared mutated or copy-number altered genes among the five Pan-Gyn tumor types versus random sets of five tumor types. The results showed that 23 mutated genes were enriched in the Pan-Gyn tumor types versus only 6 mutated genes expected by random chance (p = 0.10) (Figure S1C), whereas 122 SCNA genes were enriched in Pan-Gyn versus 2 by random chance (p < ) (Figure S1D).

Individual Data Platform Analyses

Mutation Analysis

We analyzed 2,258 patient samples with mutation data from TCGA for SMGs and operative mutational processes across the Pan-Gyn tumor types. The types of mutations in the Pan-Gyn cohort are summarized in Table S2. The average mutation load varied widely by tumor type, with CESC samples having the highest median frequency (5.3 mutations/Mbp).
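The resampling comparison reported earlier in this section (shared altered genes among the five Pan-Gyn tumor types versus random sets of five tumor types) can be illustrated with a small sketch. This is one plausible reading only: the statistic (genes present in every selected type's altered-gene set), the random draw of tumor-type subsets, the empirical p-value formula, and all tumor-type and gene names below are assumptions for illustration, not the procedure or data of the study.

```python
import random


def shared_gene_count(type_to_genes, types):
    """Number of genes found in the altered-gene set of every listed tumor type."""
    return len(set.intersection(*(type_to_genes[t] for t in types)))


def resampling_pvalue(type_to_genes, focus_types, n_draws=2000, seed=0):
    """Compare the focus set with random tumor-type sets of the same size.

    Returns (observed count, mean count over random sets, empirical p-value).
    """
    rng = random.Random(seed)
    all_types = sorted(type_to_genes)
    k = len(focus_types)
    observed = shared_gene_count(type_to_genes, focus_types)
    null = [shared_gene_count(type_to_genes, rng.sample(all_types, k))
            for _ in range(n_draws)]
    exceed = sum(1 for x in null if x >= observed)
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, sum(null) / n_draws, (exceed + 1) / (n_draws + 1)


# Hypothetical toy input: five "focus" types share three genes, three others share none.
toy = {t: {"g1", "g2", "g3", f"private_{t}"} for t in "ABCDE"}
toy.update({t: {f"private_{t}"} for t in "FGH"})
observed, expected, p = resampling_pvalue(toy, list("ABCDE"))
print(observed, round(expected, 3), round(p, 4))
```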
UCEC samples showed a bimodal distribution due to a subset of hypermutators described previously (Cancer Genome Atlas Research Network et al., 2013). There were 46 SMGs based on the intersection of those genes identified by MutSigCV v.1.4 (Lawrence et al., 2013) and those identified by previous methods (Vogelstein et al., 2013) (Figure 2A). The top five most frequently mutated genes were TP53 (44% of samples mutated), PIK3CA (32%), PTEN (20%), ARID1A (14%), and PIK3R1 (11%). Eleven of the 46 SMGs had not been previously reported in any of the TCGA gynecologic or breast marker papers (Cancer Genome Atlas Research Network, 2011, 2012, 2017; Cancer Genome Atlas Research Network et al., 2013; Cherniack et al., 2017) (Table S3). Among them, ACVR2A, a member of the transforming growth factor β superfamily that functions in pathways implicated in both tumor progression and suppression (Ikushima and Miyazono, 2010), was the most frequently mutated (in 4.8% of the cohort). LATS1 was the next most frequently mutated (3.8%) and functions in the Hippo signaling pathway, which controls organ size, restricts proliferation, promotes apoptosis, and has been implicated in multiple cancer types (Yu et al., 2015; Deng et al., 2017). CCAR1 was mutated at 3.6%; its protein product functions as a p53 coactivator and plays roles in cell proliferation, apoptosis, and, in breast cancer, estrogen-dependent growth (Kim et al., 2008; Muthu et al., 2015). We found 220 patients (10%) who had no detectable SMGs.

Mutation Signatures

Mutation signatures have provided insight into mechanisms underlying tumor development and have informed patient therapy (Helleday et al., 2014). Analysis by non-negative matrix factorization on the Pan-Gyn dataset suggested that 10 mutation signatures could explain nearly 90% of the variability observed in the original mutation/sample matrix (Figures S2A and S2B).
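Non-negative matrix factorization approximates a mutation-category by sample matrix V as a product W H, where the columns of W act as signatures and H holds each sample's exposure to them. The sketch below uses the classic multiplicative update rules for the Frobenius objective; the choice of update rule, the definition of explained variability as 1 minus the ratio of squared residual to squared total, and the toy matrix are all assumptions, since the text does not state how the factorization was run.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)) ** 0.5


def nmf(v, k, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative m x n matrix V into W (m x k) and H (k x n)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(n_iter):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)] for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(m)]
    return w, h


def explained_fraction(v, w, h):
    """1 - ||V - WH||^2 / ||V||^2 (an assumed definition of explained variability)."""
    total = sum(x * x for row in v for x in row)
    return 1.0 - frobenius_error(v, w, h) ** 2 / total


# Hypothetical toy counts: 4 mutation categories (rows) x 5 samples (columns).
toy_v = [[8, 1, 7, 0, 4], [2, 9, 3, 8, 5], [6, 0, 5, 1, 3], [1, 7, 2, 6, 4]]
toy_w, toy_h = nmf(toy_v, k=2)
print(round(explained_fraction(toy_v, toy_w, toy_h), 3))
```

In practice the number of signatures k is chosen by repeating the factorization over a range of k and inspecting stability and reconstruction error.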
The 10 Pan-Gyn signatures (S1 to S10) variably correlated with the 30 COSMIC signatures (Forbes et al., 2011) (Figure 2B). S1 correlated strongly with COSMIC signature 13 (r = 0.99) and S2 correlated with COSMIC signature 2 (r = 0.95); both signatures suggest activity of the AID/APOBEC family of cytidine deaminases. S3 correlated with COSMIC signature 1 (r = 0.94), indicating an endogenous process initiated by spontaneous deamination of 5-methylcytosine. S4 and the ultramutator COSMIC signature 10 were highly correlated (r = 0.97), presumably reflecting altered activity of POLE. A smaller correlation was found between S10 and COSMIC signature 3 (r = 0.58), associated with germline and somatic BRCA1 and BRCA2 mutations. All of the correlations were statistically significant (FDR < 0.05).

Unsupervised hierarchical clustering based on the contribution of each signature divided the Pan-Gyn samples into 10 clusters that showed associations with various molecular/clinical features (Figures 2C and S2C; Tables S4 and S5). Cluster C1

Figure 2. Landscape of Mutations in Pan-Gyn Tumor Types
(A) Mutation profiles of 2,029 Pan-Gyn samples (columns) in which at least one somatic mutation occurred in at least one of the 46 significantly mutated genes (SMGs). Top: mutation burdens per sample, divided into synonymous and non-synonymous mutation types. Middle: types of mutations in each of the 46 SMGs per sample. Bottom: covariate bars showing the mutation cluster, genomic alterations in six genes from the DNA damage-response pathway, and tumor type for each sample.
(B) Clustered heatmap showing correlations between 10 of our mutation signatures (rows labeled S1 to S10) and 30 COSMIC signatures (columns).
(C) Clustered heatmap of the mutation signatures (rows) present in each sample (columns) showing ten clusters. The dendrogram is color coded by predominant COSMIC signature.
See also Figure S2 and Tables S2, S3, S4, and S5.

was highly enriched with OV samples (and basal BRCA and UCEC to a lesser extent) and contributed strongly to S10, a signature associated with germline and somatic BRCA1 and BRCA2 mutations that correlate with responsiveness to PARP inhibitors and platinum-based therapy (Konecny and Kristeleit, 2016). C1 also had samples with frequent TP53 mutations and homozygous deletions, supporting the association with an ineffective DNA double-strand break repair COSMIC signature. C2, which contained BRCA, OV, and UCEC samples, was associated with transcriptional strand bias for T > C substitutions,

Figure 3. Clustered Heatmap of Significantly Recurring SCNAs as Determined by GISTIC2.0 Analysis across Pan-Gyn Cancers
The heatmap shows SCNAs in tumor samples (columns) plotted by chromosomal location (rows). Red and blue indicate amplifications and deletions, respectively. See also Figure S3 and Tables S4 and S5.

whereas C3, which contained BRCA and OV samples, was associated with transcriptional strand bias for T > A mutations. C4 consisted principally of breast samples and contributed to S8, the signature most associated with COSMIC 5 (etiology unknown). C5, principally composed of UCEC tumors with high microsatellite instability and mutations in MLH1, MSH2, MSH3, or MSH6, contributed most strongly to signature S6. S6 is correlated with COSMIC signatures 6, 15, and 20, which are associated with defective DNA mismatch repair (suggesting possible sensitivity to immune checkpoint inhibitors). C9 comprised CESC and BRCA samples and represented the AID/APOBEC signatures S1 and S2, providing further evidence for enrichment of APOBEC mutagenesis in these cancers (Roberts et al., 2013). C10 was associated with POLE-mutant UCEC samples.

Somatic Copy-Number Alterations

Unsupervised hierarchical clustering of the Pan-Gyn cohort using in silico admixture removal-corrected (Zack et al., 2013) segmentation data revealed six clusters with distinct copy-number profiles (Figure 3; Tables S4 and S5). Prominent features that distinguished the clusters included SCNAs in chr 8, 16q, and 1q, among others. OV, serous UCEC, UCS, and basal-like, HER2+, and luminal B BRCA tumors clustered almost exclusively into C4 and C6. Conversely, luminal A BRCA and endometrioid UCEC samples were divided among all clusters, providing evidence for additional tumor subtypes beyond the traditional clinical classifications (Cancer Genome Atlas Research Network et al., 2013).
C4 and C6 showed a high degree of genomic copy-number instability, consistent with their prevailing TP53 mutation signatures (Ciriello et al., 2013), and contained the highest numbers of advanced-stage cancers (Figure S3A). Unlike other clusters, more than 50% of the samples in C4 and C6 had undergone at least one whole-genome doubling event. C3 accounted for the largest proportion of CESC samples and uniquely exhibited a focal 11q22 amplification containing the oncogene YAP1. C2, with 74% endometrioid UCEC, contained a majority of the POLE-mutant cases and exhibited a quiet SCNA landscape with few broad-level gains or losses. C1 and C5 consisted primarily of endometrioid UCEC and luminal A BRCA tumors, accounting for 85% and 72% of the samples in the two clusters, respectively. Both clusters had similar alteration profiles genome wide, except in the frequencies of 1q and chr 8 gains (p < , Fisher's exact test); the former occurred twice as frequently in C1 and the latter seven times as frequently in C5. Overall, gain of 1q was the most frequent chromosomal arm-level event, occurring in 49.5% of samples across all five Pan-Gyn cancer types. Other frequently recurring arm-level events included gain of 3q, 8q, and chr 20, and loss of 4p, 13q, 16q, 17p, and 22q.

DNA Methylation

Unsupervised clustering of 2,586 cancer-specific, hypermethylated loci across all Pan-Gyn tumors revealed heterogeneity of DNA methylation patterns (Figure S3B; Tables S4 and S5). Unsurprisingly, tumor samples from the same tissue of origin (e.g., OV, UCS, or CESC) clustered together with the exception of two major groups, which were found to be highly robust via cluster stability analysis (83% and 90% for left and right branches, respectively) (Figures S3C and S3D). The left branch with lower degrees of hypermethylation consisted of the majority of OV and UCS, normal and basal-like BRCA, and microsatellite-stable UCECs

Figure 4. mRNA Expression Clusters and their Association with Overall Survival
(A) Unsupervised hierarchical clustering of previously reported cancer genes identifies nine mRNA-based subtypes/clusters. Clinical and molecular features are indicated by the annotation bars above the heatmap.
(B) Overall survival for each of the gene expression clusters (chi-square test p < , adjusted for differences in tumor type survival rates).
(C) Overall survival for endometrial cancer (UCEC) patients in the gene expression clusters (log rank test p < ).
(D) Differential expression of ESR1, AR, SOX2, and CDH1 in different clusters (Kruskal-Wallis test p < for all four genes). The bars represent mean expression of the gene (log2 scale) in each cluster, together with the upper or lower 95% confidence interval (whiskers above or below the bars, respectively).
Annotation tracks in (A): tumor type, tumor purity, EC grade, EC histology, CESC histology, BRCA histology, PAM50, and cluster (C1 to C9). Panels (B) and (C) plot OS percentage against months; panel (D) plots relative expression by mRNA cluster.
See also Tables S4 and S5.

(both endometrioid and serous subtype). The hypermethylator (right) cluster included most CESC tumors, the majority of BRCA, and microsatellite-unstable UCEC. The seven-cluster resolution was retained when perturbing samples across all of the TCGA Pan-Can cohort (Figure S3E), with a small subset of UCEC samples reassigned.
C7 (mostly CESC) had the highest degree of hypermethylation across all tumor types in the study, followed by a luminal B BRCA-rich C4, which also consisted of HER2+ and a small fraction of basal-like BRCA. Within tumor subtype (e.g., endometrioid UCEC), the heterogeneity of DNA methylation patterns identified samples that showed greater deficiency in DNA mismatch repair pathways (via MLH1 silencing). Hypermethylation and concomitant downregulation of two genes in the homologous repair pathway, BRCA1 and RAD51C, were observed almost exclusively in OV (12.7% and 3.0%, respectively) and basal-like BRCA cancers (2.8% and 2.6%, respectively).

mRNA Analysis

Unsupervised hierarchical clustering of 1,860 previously defined cancer genes (Sadelain et al., 2012) in 2,296 Pan-Gyn samples resulted in the identification of nine mRNA clusters with distinct clinicopathologic characteristics (Figure 4A; Tables S4 and S5). Both C1 and C2 were BRCA enriched, and C2 consisted of the majority of HER2+ and normal-like tumors. C2 was also significantly enriched with infiltrating lobular carcinomas, whereas over 95% of cases in C4 were basal-like ductal BRCA. C5 consisted mainly of OV and serous-like UCEC, a similarity noted previously (Cancer Genome Atlas Research Network et al., 2013). Over 50% of cases in C7 were UCS and, given its high EMT signature (Cherniack et al., 2017), C7 therefore likely exhibits EMT characteristics. Overall, the Pan-Gyn mRNA subtypes showed prognostic value, even after adjusting for lineage (p < , chi-square test) (Figure 4B). UCEC, in particular, appeared in five of the nine clusters and exhibited significant differences in overall survival, depending on cluster membership (Figure 4C). We investigated which genes were differentially expressed among the clusters (Figure 4D). ESR1 and AR were significantly higher in C1 and C2 than in others, whereas C3 had high expression of SOX2.
C3 consisted of cervical cancer samples with squamous histology, characterized by 3q26 amplification (the SOX2 gene locus). C7 had significantly lower expression of the classical epithelial marker CDH1, which is consistent with an EMT signature.

Proteomic Analysis

Unsupervised hierarchical clustering of protein expression data for 1,967 samples across 216 proteins identified 5 clusters (Figure S4A and Tables S4 and S5). C1 principally consisted of non-basal BRCA, C3 was enriched with endometrioid UCEC, and C4 was enriched with OV. Interestingly, C2 and C5 contained a mixture of samples across multiple disease types. C2 had high levels of caveolin1, MYH11, and HSP70 proteins, which have previously been identified as biomarkers for the reactive subtype found in luminal BRCA (Cancer Genome Atlas Research Network, 2012). In addition to luminal BRCA samples, C2 included some basal-like BRCA, CESC, OV, and UCEC samples (but not UCS). Cluster C5 contained most of the basal-like BRCA, squamous CESC, serous UCEC, UCS, and 10% of the serous OV samples. It had a low hormone receptor pathway score (Akbani et al., 2014) and high levels of cell-cycle and

Figure 5. lncRNA Clusters and Gene/lncRNA Interaction Networks
(A) Clustered heatmap based on expression of cancer lncRNA regulators. The rows have 1,986 lncRNAs, whereas the columns have 1,597 samples. L1 to L6 indicate the six clusters and their association with protein clusters is shown (p < 0.05, Fisher's exact test).
(B) Schematic illustration of dual-layer ER-competing endogenous RNA regulation of BRCA1. ERs transcriptionally regulate both BRCA1 and non-coding TUG1 in ER-positive breast cancer. Those RNAs subsequently compete for miRNA binding.
(legend continued on next page)

DNA damage-response activity, features that could indicate sensitivity to drugs that target DNA damage repair.

miRNA Analysis

Unsupervised hierarchical clustering of the 293 most variable miRNAs in 2,417 samples grouped samples largely by disease type (Figures S4B-S4D; Tables S4 and S5). The miRNA profile for OV, however, was especially distinct from other Pan-Gyn tumor types. Basal-like BRCA samples were more similar to CESC (C6), and UCEC and UCS samples (C4 and C5), than they were to the non-basal BRCA subtypes in C2 and C3.

lncRNAs

We processed raw RNA sequencing data to extract 1,986 lncRNAs that were predicted to regulate the 216 cancer-related proteins profiled by TCGA across 4 of the 5 tumor types (UCS did not have sufficient samples for the lncRNA extraction). An unsupervised consensus clustering of the data revealed six clusters (L1 to L6) that coincided significantly with protein-based clusters (C1 to C5) (p < 0.05, Fisher's exact test) (Figures 5A and S4A; Tables S4 and S5). BRCA and CESC had very similar lncRNA profiles and grouped together in clusters L2 and L3. UCEC (in L5) and OV (in L6) each had lncRNA profiles very distinct from those of BRCA and CESC. Portions of the OV (31%) and UCEC (11%) samples were both present in cluster L4. Previous studies have suggested that estrogen receptors (ERs) regulate BRCA1 expression, dyskerin (DKC1) expression (a binding partner of the lncRNA TERC), and the lncRNA TUG1 (Figure 5B) (Jonsson et al., 2015; Hurtado et al., 2011). ERs bind to regulatory regions of DKC1, either to induce or to repress multiple lncRNAs (Figure 5C). In the present study, our analysis has revealed significant Pearson's correlation (t test p < 0.05) between key lncRNAs and their regulator genes' transcripts, ESR1, OIP5, and DKC1, in a context-specific manner (Figure 5D).
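The correlation test mentioned above (Pearson's correlation with a t test) converts the coefficient r from n paired expression values into a t statistic with n - 2 degrees of freedom, t = r * sqrt((n - 2) / (1 - r^2)). The following is a minimal standard-library sketch; the numerical integration of the t density stands in for a statistics library, and the expression vectors are hypothetical placeholders rather than study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value of Student's t, via Simpson's rule on the density."""
    t = abs(t)
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central)


def correlation_test(x, y):
    """Return (r, two-sided p-value) for the null hypothesis of zero correlation."""
    r = pearson_r(x, y)
    if abs(r) >= 1.0:
        return r, 0.0
    df = len(x) - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return r, t_two_sided_p(t, df)


# Hypothetical expression values for a regulator gene and an lncRNA in five samples.
regulator = [1, 2, 3, 4, 5]
lncrna = [1, 3, 2, 5, 4]
r, p = correlation_test(regulator, lncrna)
print(f"r = {r:.2f}, p = {p:.3f}")
```

The numerical integral loses precision for extremely small p-values, so a dedicated statistics routine would be preferable when many gene pairs are screened and corrected for multiple testing.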
Using gene set enrichment analysis, we found 12.04% of the 1,537 gene ontology gene sets to be significantly enriched (FDR < 0.05) with TERC-correlated genes across all four cancer types (Figure S5). Included were gene sets associated with TERT and telomere maintenance and packaging as well as gene sets linked to MYC. The latter result supports earlier findings of TERC binding peaks in the MYC promoter region (Chu et al., 2011).

Pathway Analysis
We performed PARADIGM pathway analysis (Vaske et al., 2010) followed by unsupervised consensus clustering of pathway scores, which grouped samples primarily by tissue type, with a few notable exceptions (Figures 6A and 6B; Tables S4 and S5). A subset of basal-like BRCA cancers co-clustered with a subset of UCEC and UCS in C2, whereas the remaining basal-like BRCA samples clustered with non-basal BRCA in C4. In contrast to the transcriptomic analysis, pathway analysis clustered approximately half of the basal-like BRCA cancer samples together with the HER2+ and luminal B samples. All PARADIGM clusters had distinct patterns of high or low immune-related signaling, assessed by inferred activation (Figure 6A) and pathway enrichment (Figure 6B), suggesting an important role for immune response in subsets of Pan-Gyn cancers. Interestingly, the two basal-like BRCA subtypes differed in inferred activation of immune-related signaling pathways. Enrichment with adhesion-related proteins, such as the integrins, matrix metalloproteinases, and syndecans, also distinguished the two basal-like subtypes, suggesting distinctive tumor microenvironments. As with basal-like BRCA, UCEC split into two clusters (C2 and C3) that did not correspond to obvious variations in UCEC histology. These clusters were mainly differentiated by proliferation, Notch signaling, and immune activity levels.
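The enrichment result reported at the start of this section relies on a false discovery rate (FDR) cutoff of 0.05; 12.04% of 1,537 gene sets corresponds to roughly 185 sets. The text does not state how the FDR was estimated, and gene set enrichment tools often derive it from permutation nulls. As a simpler stand-in, the sketch below applies the Benjamini-Hochberg adjustment to a list of p values. The function name and the p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical nominal p values for four gene sets.
p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)
significant = [i for i, q in enumerate(q_values) if q < 0.05]
print(q_values, significant)
```

In this toy case three gene sets have nominal p < 0.05, but only the first remains below 0.05 after adjustment, which is why an FDR threshold typically selects fewer sets than an unadjusted one.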
Integrated Analysis across Pan-Gyn Tumor Types
We used cluster assignments from the six major TCGA platforms (mutations, SCNA, DNA methylation, mRNA, miRNA, and protein) to perform integrated clustering across the Pan-Gyn cohort using the CoCA algorithm (Figure S6A). The resulting CoCA clusters were heavily dominated by tumor type because the intrinsic gene expression patterns were lineage dependent. The association with tumor type was especially prominent in the DNA methylation, mRNA, miRNA, and protein clusters. Therefore, we turned to an alternative method (described next) to define subtypes that would span the Pan-Gyn tumor types and emphasize high-level similarities among them.

Subtypes across the Pan-Gyn Tumors
We present molecular subtypes that illuminate commonalities and distinguishing features across the Pan-Gyn tumor types, with the potential to inform future cross-tumor-type therapies. We first identified 16 features (listed in the STAR Methods) across 1,956 samples that were either (1) currently used in the clinic for at least 1 of the 5 tumor types, or (2) identified as informative in previous TCGA gynecologic and breast cancer studies. Next, we clustered the feature matrix and obtained 5 clusters (Figure 7A; Tables S4 and S5). SCNA load was the predominant feature and produced the first division. In the low-SCNA-load group, we found two clusters, non-hypermutator (C1) and hypermutator (C2). The non-hypermutator cluster had virtually no hypermutators but had high levels of ER+, PR+, and/or AR+ samples, indicating potential susceptibility to hormone therapies. C2, the hypermutator cluster, could be further subdivided into four subclusters (C2A–C2D). C2A was enriched with POLE mutations, which have previously been associated with ultramutators and their extremely high mutation rates (>100 mutations/Mbp) (Cancer Genome Atlas Research Network et al., 2013).
C2B showed enrichment with MSI-high samples and C2C showed high immune-infiltration levels. C2D was depleted of hypermutators and showed enrichment with high immune-infiltration and HPV-positive samples. The high-SCNA-load group consisted of three clusters: immune high (C3), AR or PR low (C4), and AR or PR high (C5). The immune high cluster showed low levels of hormone receptors and enrichment with HPV-positive samples. Interestingly, samples with ERBB2 amplification fell into two main clusters; those in clusters C3 (n = 39) and C4 (n = 30) showed high and low immune infiltration levels, respectively (purple and black rectangles in Figure 7A). C3 displayed a tendency toward better survival than C4 (hazard ratio = 2.8), with a p value that trended toward significance (p = 0.087) (Figure S6B). C4 showed low levels of AR and PR and had a subcluster with BRCA1 or BRCA2 somatic mutations. C5 had high levels of at least one of the three hormone receptors, again suggesting sensitivity to hormone therapies. Each cluster had varying levels of representation of samples from each disease, mitigating tissue specificity (Figure 7B).

Figure 5 (continued)
(C) ERs modulate the TERC-DKC1 complex and its transcriptional activity. Estradiol (E2)-activated ERs bind to cis-regulatory DNA regions of both DKC1 and TERC and regulate their activity. Further, ERs bind to regulatory regions of DKC1-regulated lncRNAs (listed on the right) and modulate their expression.
(D) Gene/lncRNA interaction networks in the overall Pan-Gyn lncRNA cohort and each of the four individual disease types. The nodes represent genes (green) or lncRNAs (burgundy), whereas each edge represents statistically significant Pearson's correlation coefficient between the connected nodes.
See also Figures S4 and S5; Tables S4 and S

We then performed overall survival analysis on the five clusters and obtained very significant survival differences among them (p < , log rank test) (Figure 7C). The 5-year survival rate ranged from 83% (C1) to 44% (C4), and the 10-year survival rate ranged from 64% (C2) to 20% (C4). We assessed the statistical significance of the added prognostic value of the 16-feature clusters after accounting for tumor type differences to control for effects that may be due to individual tumor type contributions; the resulting p value was still significant (p = , log rank test). Finally, we used dichotomous decision tree methodology (Quinlan, 1983) to reduce the number of assessed molecular variables needed to classify patients into 1 of the 5 subtypes. The resulting tree required specification of only 6 of the original 16 features (Figure 7D). The tree had an accuracy of 82% predicting the original 16-feature-based clusters, with a receiver-operator characteristic area under the curve of We repeated the same type of survival analysis for the clusters predicted by the decision tree as we did for the original clusters (Figure 7E).
Log rank test p values for the tumor type-unadjusted and -adjusted methods were both highly significant (p < ), showing that the decision tree-based clusters retained prognostic value despite not having 100% accuracy. The survival rates were comparable with those of the original clusters, with a 5-year survival rate ranging from 85% (C1) to 39% (C4), and a 10-year survival rate ranging from 67% (C1) to 14% (C4).

DISCUSSION
We performed an integrative, multi-platform analysis of the TCGA Pan-Gyn tumors based on 2,579 clinical cases. In addition to confirming the robustness of many observations cited in previous TCGA publications on the individual tumor types, our approaches also provided a considerable number of additional findings:
(1) multiple genomic and epigenomic features that help to distinguish gynecologic and breast tumors from the other 28 TCGA tumor types;
(2) 61 somatic copy-number peaks in the Pan-Gyn cohort, 11 not previously reported by TCGA;
(3) 3 somatic copy-number alterations (containing genes of potential therapeutic relevance) unique to gynecologic cancers among the 33 TCGA tumor types;
(4) 46 SMGs in the Pan-Gyn cohort, 11 not previously reported by TCGA;
(5) 10 predominant mutation signatures, with 10% of the samples lacking identified SMGs;
(6) analyses of the 10 mutation signatures in relation to the 30 COSMIC signatures, demonstrating relationships between the two sets of signatures;
(7) similar miRNA profiles shared among most of the Pan-Gyn tumor types; the exception, OV, was extremely different from the rest, and, unexpectedly, the miRNA profiles of basal-like BRCA cancers closely resembled those of CESC cancers;
(8) some OV and UCEC samples exhibited the reactive proteomic signature previously identified and shown to be prognostically relevant in BRCA;
(9) identification of a subtype with low protein expression of ERs and AR (important markers for hormone therapy) that spanned all five tumor types;
(10) large-scale lncRNA analysis not performed previously for any of the TCGA gynecologic or breast marker papers (our findings included several ER-regulated lncRNAs and an ER-TERC/DKC1-NEAT1/OIP5-AS1-TUG1 gene/lncRNA network);
(11) similar lncRNA profiles in BRCA and CESC, in contrast to the very distinct profiles in UCEC and OV;
(12) lineage-specific gene expression patterns and lineage-related (but not always cancer type-specific) features revealed by multi-platform clustering of tumor samples;
(13) pathway analyses that revealed subsets of BRCA, OV, and UCEC samples with high levels of leukocyte infiltration, a primary marker of immune response and possible susceptibility to immunotherapy (most of the CESC samples, but virtually none of the UCS samples, showed high leukocyte infiltration);
(14) roughly half of the basal-like BRCA samples resembled luminal/HER2+ BRCA samples at the pathway level (but not the gene expression level; this pattern suggests convergence of independent gene expression changes to drive a limited number of pathway outputs and could prove useful with respect to development and selection of therapies across BRCA subtypes);
(15) five cross-Pan-Gyn subtypes defined by multi-platform clustering of 16 molecular features; these five clusters have possible clinical implications and predictive value for survival beyond that of tumor type alone;
(16) reduction of the 16 molecular features to six in the form of a binary decision tree that retained prognostic value.

From a potential therapeutic perspective, two of the Pan-Gyn clusters (C1 and C5) in (15) showed high levels of hormone receptors (ERs, PR, and/or AR), suggesting possible responsiveness to hormone therapy. C3 showed high levels of immune markers, warranting further exploration for possible value in selecting patients for immunotherapy. C2 included hypermutators and ultramutators, which have been associated with relatively good survival on conventional therapy.
A subset of C4 showed ERBB2 amplification, suggesting possible responsiveness to HER2-targeted therapy. ERBB2 mutation and amplification were mutually exclusive, but both sets of tumors might benefit from HER2-targeted therapy.

Figure 6. Pathways-Based Clusters
(A) Consensus-clustered heatmap based on PARADIGM integrated pathway levels. Selected pathway features with characteristic patterns of inferred activation across clusters are labeled on the rows. Samples are in columns.
(B) Constituent pathways with differential single-sample gene set enrichment analysis (ssGSEA) scores across PARADIGM clusters. A comparison of ssGSEA scores of constituent pathways integrated by the PARADIGM algorithm identified 263 differentially enriched pathways across clusters. Samples are arranged in the same order as in (A), and differentially expressed pathways are arranged based on unsupervised clustering of their ssGSEA scores. Dominant themes within subgroupings of differential pathways across PARADIGM clusters are labeled. Examples of immune-related pathways include interleukin-12 (IL-12), IL-23, IL-27, IFNG, STAT, and T cell receptor signaling pathways. Proliferation and DNA damage repair-related pathways include FOXM1, PLK2, cyclins, MYC, E2F, ATM, ATR, BARD1, and Fanconi anemia pathways.
See also Tables S4 and S

The decision tree we propose could potentially enable clinicians to classify patients more easily into one of the five Pan-Gyn subtypes. The tree is based on six features, three of which (ER, PR, and AR status) are already routinely used in the clinic. Widely available CLIA-certified gene-panel assays can estimate SCNA and mutation loads, and immune infiltration can be assessed by standard immunohistochemistry or new imaging technologies. Therefore, after further study and validation, our decision tree might be able to aid in assignment of patients to treatment groups. It should be understood, however, that all of the clinically interesting possibilities illuminated by a project like Pan-Gyn should be considered as hypothesis-generators, yielding clues to be tested and, if possible, validated in follow-up studies.

DNA methylation data revealed large high- and low-methylation clusters. CESC, as well as luminal B and HER2+ BRCA tumors, showed high levels of DNA methylation, suggesting epigenetics as a driving force in those tumor types. Clustering based on DNA methylation separated MLH1-silenced (i.e., hypermutator) endometrioid UCEC samples from the non-MLH1-silenced ones, suggesting that MLH1 may not be specifically targeted for epigenetic silencing but, instead, may be silenced by a more generic mechanism that silences multiple genes.

Gene sets associated with myeloid and stem cell development suggest that TERC activity, initially identified in zebrafish, might play a role in human development as well (Chiu et al., 2018). In the present study, CESC and OV showed positive correlation of TERC with MYC, TERT, telomere maintenance targets, miR-21, and CTNNB1 gene targets.
However, serous UCEC showed a unique pattern of negative correlation with TERT targets, positive correlation with miR-21 targets, and no correlation with MYC, CTNNB1, or telomere maintenance targets. In luminal A BRCA, miR-21 targets were positively correlated with TERC.

Pathway and subtype analyses revealed an important role for immune markers. OV, basal-like BRCA, luminal BRCA, and HER2+ BRCA cancer samples split into immune-high and immune-low subtypes. Immune-high HER2+ tumors showed a trend toward longer survival than their immune-low counterparts, but the difference was not quite statistically significant for the sample size available. Most of the CESC samples showed high immune marker signatures, likely due to their almost 100% prevalence of HPV. In contrast, most of the UCEC and UCS samples showed little immune infiltration. The high-immune subsets might benefit from immunotherapy.

Pathway analysis unexpectedly showed that approximately half of the basal-like BRCA cancers clustered together with the HER2+ and luminal B samples, whereas the other half did not, suggesting pathway-level similarities not detected at the level of single RNAs. The similarities included higher inferred activation of AR signaling and lower enrichment of FOXA1, FOXA2, and XBP1/2, as well as the WNT and SHH pathways. Those observations are consistent with convergence of diverse transcriptional events on a limited number of functional pathways. Additional study will be required to test the robustness of those observations.

In summary, this integrative, multi-platform Pan-Gyn analysis has confirmed similarities previously identified across the five tumor types and identified relationships not observed in previous studies of the individual diseases. A number of the observations have possible prognostic and/or therapeutic relevance.
Our capture of major molecular information content using a simple six-parameter binary decision tree could facilitate the clinical use of Pan-Gyn molecular subtypes and may help in selection for and administration of therapeutic trials across the Pan-Gyn spectrum. However, all of the clinical possibilities illuminated by this study will require extensive additional research, particularly functional validation (which is beyond the intended scope of TCGA studies), before they would be ready for practical application. In addition to its particular observations, this study presents a broad-based, curated atlas of Pan-Gyn molecular features that we believe will be useful as a starting point for many researchers in the field.

STAR★METHODS
Detailed methods are provided in the online version of this paper and include the following:
• KEY RESOURCES TABLE
• CONTACT FOR RESOURCE SHARING
• SUBJECT DETAILS
  ◦ Human Data and Tumor Data Selection
• METHOD DETAILS
  ◦ Sample Processing
  ◦ DNA Sequencing Data
  ◦ Molecular Features that Distinguished Pan-Gyn from Other Tumor Types
  ◦ Copy Number Alteration (CNA) Analysis
  ◦ DNA Methylation
  ◦ mRNA Analysis
  ◦ Pathways Analysis
• QUANTIFICATION AND STATISTICAL ANALYSIS
  ◦ Statistical Tests for Distinguishing Pan-Gyn from Other Tumor Types

Figure 7. Cross-Tumor Type Pan-Gyn Subtypes with Prognostic Significance
(A) Clustered heatmap of 16 features across 1,956 Pan-Gyn samples. Cluster 2 is split further into four subclusters, 2A–2D. Purple rectangles highlight HER2+ samples that have high immune infiltration scores; black rectangles highlight HER2+ samples with low immune infiltration scores.
(B) Cross-tabulation showing the distribution of Pan-Gyn tumor types across the five clusters.
(C) Kaplan-Meier curves showing differences in overall survival among the five clusters (with 5- and 10-year survival rates shown).
Before adjusting for tumor type differences in overall survival rates, the log rank test p < , and after adjusting for tumor type differences, p = (chi-square test).
(D) Decision tree that predicts clusters using just 6 of the 16 features. The predicted clusters are shown in a covariate bar in the heatmap in (A).
(E) Kaplan-Meier curves showing differences in overall survival among the five decision tree-based predicted clusters (with 5- and 10-year survival rates shown). Log rank test p < , before (log rank test) and after (chi-square test) adjusting for tumor type differences in overall survival rates.
See also Figure S6; Tables S4 and S
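For readers less familiar with the survival estimates summarized in (C) and (E), the following sketch shows how a Kaplan-Meier curve and a fixed-time survival rate can be computed. It is a minimal pure-Python illustration under stated assumptions: the follow-up times, event flags, and function names are hypothetical and do not come from the Pan-Gyn cohort, and the log rank comparison between clusters is not implemented here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times  -- follow-up time per patient (for example, in years)
    events -- 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j = i
        deaths = 0
        # Gather every patient whose follow-up ends at time t.
        while j < len(data) and data[j][0] == t:
            deaths += data[j][1]
            j += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= j - i  # deaths and censored patients both leave the risk set
        i = j
    return curve


def survival_at(curve, t):
    """Estimated probability of surviving beyond time t (step function)."""
    estimate = 1.0
    for time, surv in curve:
        if time > t:
            break
        estimate = surv
    return estimate


# Hypothetical follow-up in years for six patients (1 = death, 0 = censored).
times = [1, 2, 3, 4, 5, 6]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
five_year = survival_at(curve, 5)
print(f"5-year survival estimate: {five_year:.3f}")
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate here (5/6 × 3/4 × 2/3 = 5/12) differs from the naive fraction of patients known to be alive at 5 years. Per-cluster rates such as those quoted in the text would come from running an estimator of this kind separately on each cluster.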

  ◦ Mutation Analysis
  ◦ Determining Significant Patterns of Somatic Copy Number Aberration
  ◦ mRNA Analysis
  ◦ lncRNA Statistical Analysis
  ◦ Pathway Differences between Pan-Gynecological PARADIGM Clusters
  ◦ Subtypes across the Pan-Gyn Tumors Survival Analysis
• DATA AND SOFTWARE AVAILABILITY

SUPPLEMENTAL INFORMATION
Supplemental Information includes details of the list of members of The Cancer Genome Atlas Research Network for this project, six figures, and five tables and can be found with this article online at

ACKNOWLEDGMENTS
This work was supported by the following NCI grants: U54 HG003273, U54 HG003067, U54 HG003079, U24 CA143799, U24 CA143835, U24 CA143840, U24 CA143843, U24 CA143845, U24 CA143848, U24 CA143858, U24 CA143866, U24 CA143867, U24 CA143882, U24 CA143883, U24 CA144025, U24 CA210949, U24 CA210950, P30 CA016672, P50 CA

AUTHOR CONTRIBUTIONS
Sample freeze list: R.S.K., W. Li, C.P.V., D.A.L., and R.A. Pan-Gyn versus non-Gyn analysis: A.K., A.C.B., R.A., V.R., A.R., and A.D.C. Mutations: D.G.T., S.C.N.A., A.C.B., C.W., A.I.O., and A.D.C. DNA methylation: H.F. and H.S. mRNA: Y.L. and W. Le. miRNA: R.B. and A.G.R. RPPA: A.M.H., R.S.K., C.P.V., W. Li, G.B.M., and R.A. lncRNA: P.S., C.W., P.M., L.L., A.K.S., K.M., T.W.C., S.L., H.S.C. and P.H.G. PARADIGM: C.Y. Pan-Gyn Subtyping: W. Le, A.D.C., A.K.G., K.A.B., N.M.K., J.S.R., R.E.Z., D.A.L., A.R., A.S. and R.A. Clinical and Pathology: A.K.G., A.I.O., N.M.K., J.S.R., R.E.Z., A.K.S., A.J.L., J.N.W., G.B.M., and D.A.L. General discussion and feedback: C.A., S.N.A., S.O., J.R., C.S.S., Q.S., and N.W. Editing team: A.C.B., A.K., A.K.S., A.J.L., N.M.K., A.S., G.B.M., D.A.L., J.N.W., and R.A. Project leadership: J.N.W., G.B.M., D.A.L., and R.A.

DECLARATION OF INTEREST
Michael Seiler, Peter G. Smith, Ping Zhu, Silvia Buonamici, and Lihua Yu are employees of H3 Biomedicine, Inc. Parts of this work are the subject of a patent application: WO titled Splice variants associated with neomorphic SF3B1 mutants.
Shouyoung Peng, Anant A. Agrawal, James Palacino, and Teng Teng are employees of H3 Biomedicine, Inc. Andrew D. Cherniack, Ashton C. Berger, and Galen F. Gao receive research support from Bayer Pharmaceuticals. Gordon B. Mills serves on the External Scientific Review Board of Astrazeneca. Anil Sood is on the Scientific Advisory Board for Kiyatec and is a shareholder in BioPath. Jonathan S. Serody receives funding from Merck, Inc. Kyle R. Covington is an employee of Castle Biosciences, Inc. Preethi H. Gunaratne is founder, CSO, and shareholder of NextmiRNA Therapeutics. Christina Yau is a part-time employee/consultant at NantOmics. Franz X. Schaub is an employee and shareholder of SEngine Precision Medicine, Inc. Carla Grandori is an employee, founder, and shareholder of SEngine Precision Medicine, Inc. Robert N. Eisenman is a member of the Scientific Advisory Boards and shareholder of Shenogen Pharma and Kronos Bio. Daniel J. Weisenberger is a consultant for Zymo Research Corporation. Joshua M. Stuart is the founder of Five3 Genomics and shareholder of NantOmics. Marc T. Goodman receives research support from Merck, Inc. Andrew J. Gentles is a consultant for Cibermed. Charles M. Perou is an equity stock holder, consultant, and Board of Directors member of BioClassifier and GeneCentric Diagnostics and is also listed as an inventor on patent applications on the Breast PAM50 and Lung Cancer Subtyping assays. Matthew Meyerson receives research support from Bayer Pharmaceuticals; is an equity holder in, consultant for, and Scientific Advisory Board chair for OrigiMed; and is an inventor of a patent for EGFR mutation diagnosis in lung cancer, licensed to LabCorp. Eduard Porta-Pardo is an inventor of a patent for domainxplorer. Han Liang is a shareholder and scientific advisor of Precision Scientific and Eagle Nebula. 
Da Yang is an inventor on a pending patent application describing the use of antisense oligonucleotides against specific lncRNA sequence as diagnostic and therapeutic tools. Yonghong Xiao was an employee and shareholder of TESARO, Inc. Bin Feng is an employee and shareholder of TESARO, Inc. Carter Van Waes received research funding for the study of IAP inhibitor ASTX660 through a Cooperative Agreement between NIDCD, NIH, and Astex Pharmaceuticals. Raunaq Malhotra is an employee and shareholder of Seven Bridges, Inc. Peter W. Laird serves on the Scientific Advisory Board for AnchorDx. Joel Tepper is a consultant at EMD Serono. Kenneth Wang serves on the Advisory Board for Boston Scientific, Microtech, and Olympus. Andrea Califano is a founder, shareholder, and advisory board member of DarwinHealth, Inc. and a shareholder and advisory board member of Tempus, Inc. Toni K. Choueiri serves as needed on advisory boards for Bristol-Myers Squibb, Merck, and Roche. Lawrence Kwong receives research support from Array BioPharma. Sharon E. Plon is a member of the Scientific Advisory Board for Baylor Genetics Laboratory. Beth Y. Karlan serves on the Advisory Board of Invitae.

Received: October 26, 2017
Revised: February 22, 2018
Accepted: March 12, 2018
Published: April 2, 2018

REFERENCES
Akbani, R., Ng, P.K., Werner, H.M., Shahmoradgoli, M., Zhang, F., Ju, Z., Liu, W., Yang, J.Y., Yoshihara, K., Li, J., et al. (2014). A pan-cancer proteomic perspective on The Cancer Genome Atlas. Nat. Commun. 5,
Alexandrov, L.B., Nik-Zainal, S., Wedge, D.C., Aparicio, S.A., Behjati, S., Biankin, A.V., Bignell, G.R., Bolli, N., Borg, A., Børresen-Dale, A.L., et al. (2013). Signatures of mutational processes in human cancer. Nature 500,
Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25,
Barbie, D.A., Tamayo, P., Boehm, J.S., Kim, S.Y., Moody, S.E., Dunn, I.F., Schinzel, A.C., Sandy, P., Meylan, E., Scholl, C., et al. (2009). Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature 462,
Cancer Genome Atlas Research Network. (2011). Integrated genomic analyses of ovarian carcinoma. Nature 474,
Cancer Genome Atlas Research Network. (2012). Comprehensive molecular portraits of human breast tumours. Nature 490,
Cancer Genome Atlas Research Network, Kandoth, C., Schultz, N., Cherniack, A.D., Akbani, R., Liu, Y., Shen, H., Robertson, A.G., Pashtan, I., Shen, R., et al. (2013). Integrated genomic characterization of endometrial carcinoma. Nature 497,
Cancer Genome Atlas Research Network. (2017). Integrated genomic and molecular characterization of cervical cancer. Nature 543,
Cantelmo, A.R., Conradi, L.C., Brajic, A., Goveia, J., Kalucka, J., Pircher, A., Chaturvedi, P., Hol, J., Thienpont, B., Teuwen, L.A., et al. (2016). Inhibition of the glycolytic activator PFKFB3 in endothelium induces tumor vessel normalization, impairs metastasis, and improves chemotherapy. Cancer Cell 30,
Carter, S.L., Cibulskis, K., Helman, E., McKenna, A., Shen, H., Zack, T., Laird, P.W., Onofrio, R.C., Winckler, W., Weir, B.A., et al. (2012). Absolute quantification of somatic DNA alterations in human cancer. Nat. Biotechnol. 30,
Cherniack, A.D., Shen, H., Walter, V., Stewart, C., Murray, B.A., Bowlby, R., Hu, X., Ling, S., Soslow, R.A., Broaddus, R.R., et al. (2017). Integrated molecular characterization of uterine carcinosarcoma. Cancer Cell 31,

Chiu, H.-S., Somvanshi, S., Patel, E., Chen, T.-W., Singh, V.P., Zorman, B., Patil, S.L., Pan, Y., Chatterjee, S.S., The Cancer Genome Atlas Network, et al. (2018). Pan-cancer analysis of lncRNA regulation supports their targeting of cancer genes in each tumor context. Cell Rep. celrep
Chu, C., Qu, K., Zhong, F.L., Artandi, S.E., and Chang, H.Y. (2011). Genomic maps of long noncoding RNA occupancy reveal principles of RNA-chromatin interactions. Mol. Cell 44,
Chu, J., Sadeghi, S., Raymond, A., Jackman, S.D., Nip, K.M., Mar, R., Mohamadi, H., Butterfield, Y.S., Robertson, A.G., and Birol, I. (2014). BioBloom tools: fast, accurate and memory-efficient host species sequence screening using bloom filters. Bioinformatics 30,
Cibulskis, K., Lawrence, M.S., Carter, S.L., Sivachenko, A., Jaffe, D., Sougnez, C., Gabriel, S., Meyerson, M., Lander, E.S., and Getz, G. (2013). Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31,
Cibulskis, K., McKenna, A., Fennell, T., Banks, E., DePristo, M., and Getz, G. (2011). ContEst: estimating cross-contamination of human samples in next-generation sequencing data. Bioinformatics 27,
Ciriello, G., Miller, M.L., Aksoy, B.A., Senbabaoglu, Y., Schultz, N., and Sander, C. (2013). Emerging landscape of oncogenic signatures across human cancers. Nat. Genet. 45,
Deng, J., Zhang, W., Liu, S., An, H., Tan, L., and Ma, L. (2017). LATS1 suppresses proliferation and invasion of cervical cancer. Mol. Med. Rep. 15,
Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29,
Forbes, S.A., Bindal, N., Bamford, S., Cole, C., Kok, C.Y., Beare, D., Jia, M., Shepherd, R., Leung, K., Menzies, A., et al. (2011). COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res. 39, D945–D950.
Frank, E., Hall, M.A., and Witten, I.H. (2016). The WEKA Workbench. Online Appendix for "Data Mining: Practical Machine Learning Tools and Techniques", Fourth Edition (Morgan Kaufmann).
Gehring, J.S., Fischer, B., Lawrence, M., and Huber, W. (2015). SomaticSignatures: inferring mutational signatures from single-nucleotide variants. Bioinformatics 31,
Gonzalez-Angulo, A.M., Hennessy, B.T., Meric-Bernstam, F., Sahin, A., Liu, W., Ju, Z., Carey, M.S., Myhre, S., Speers, C., Deng, L., et al. (2011). Functional proteomics can define prognosis and predict pathologic complete response in patients with breast cancer. Clin. Proteomics 8, 11.
Hänzelmann, S., Castelo, R., and Guinney, J. (2013). GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics 14, 7.
Helleday, T., Eshtad, S., and Nik-Zainal, S. (2014). Mechanisms underlying mutational signatures in human cancers. Nat. Rev. Genet. 15,
Hoadley, K.A., Yau, C., Wolf, D.M., Cherniack, A.D., Tamborero, D., Ng, S., Leiserson, M.D., Niu, B., McLellan, M.D., Uzunangelov, V., et al. (2014). Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin. Cell 158,
Hoadley, K.A., Yau, C., Hinoue, T., Wolf, D.M., Lazar, A.J., Drill, E., Shen, R., et al. (2018). Cell-of-origin patterns dominate molecular classification of 10,000 tumors from 33 types of cancer. Cell
Hurtado, A., Holmes, K.A., Ross-Innes, C.S., Schmidt, D., and Carroll, J.S. (2011). FOXA1 is a key determinant of estrogen receptor function and endocrine response. Nat. Genet. 43,
Ikushima, H., and Miyazono, K. (2010). TGFbeta signalling: a complex web in cancer progression. Nat. Rev. Cancer 10,
Johnson, W.E., Rabinovic, A., and Li, C. (2007). Adjusting batch effects in microarray expression data using Empirical Bayes methods. Biostatistics 8,
Jonsson, P., Coarfa, C., Mesmar, F., Raz, T., Rajapakshe, K., Thompson, J.F., Gunaratne, P.H., and Williams, C. (2015). Single-molecule sequencing reveals estrogen-regulated clinically relevant lncRNAs in breast cancer. Mol. Endocrinol. 29,
Kim, J.H., Yang, C.K., Heo, K., Roeder, R.G., An, W., and Stallcup, M.R. (2008). CCAR1, a key regulator of mediator complex recruitment to nuclear receptor transcription complexes. Mol. Cell 22,
Konecny, G.E., and Kristeleit, R.S. (2016). PARP inhibitors for BRCA1/2-mutated and sporadic ovarian cancer: current practice and future directions. Br. J. Cancer 115,
Korn, J.M., Kuruvilla, F.G., McCarroll, S.A., Wysoker, A., Nemesh, J., Cawley, S., Hubbell, E., Veitch, J., Collins, P.J., Darvishi, K., et al. (2008). Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat. Genet. 40,
Kostic, A.D., Ojesina, A.I., Pedamallu, C.S., Jung, J., Verhaak, R.G., Getz, G., and Meyerson, M. (2011). PathSeq: software to identify or discover microbes by deep sequencing of human tissue. Nat. Biotechnol. 29,
Lawrence, M.S., Stojanov, P., Polak, P., Kryukov, G.V., Cibulskis, K., Sivachenko, A., Carter, S.L., Stewart, C., Mermel, C.H., Roberts, S.A., et al. (2013). Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499,
Li, S., Dai, W., Mo, W., Li, J., Feng, J., Wu, L., Liu, T., Yu, Q., Xu, S., Wang, W., et al. (2017). By inhibiting PFKFB3, aspirin overcomes sorafenib resistance in hepatocellular carcinoma. Int. J. Cancer 141,
Li, B., and Dewey, C.N. (2011). RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323.
Li, C., and Hung Wong, W. (2001). Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application. Genome Biol. 2, RESEARCH0032.
McAvoy, S., Ganapathiraju, S.C., Ducharme-Smith, A.L., Pritchett, J.R., Kosari, F., Perez, D.S., Zhu, Y., James, C.D., and Smith, D.I. (2007). Nonrandom inactivation of large common fragile site genes in different cancers. Cytogenet. Genome Res. 118,
McCarroll, S.A., Kuruvilla, F.G., Korn, J.M., Cawley, S., Nemesh, J., Wysoker, A., Shapero, M.H., de Bakker, P.I., Maller, J.B., Kirby, A., et al. (2008). Integrated detection and population-genetic analysis of SNPs and copy number variation. Nat. Genet. 40,
McPherson, A., Hormozdiari, F., Zayed, A., Giuliany, R., Ha, G., Sun, M.G., Griffith, M., Heravi Moussavi, A., Senz, J., Melnyk, N., et al. (2011). deFuse: an algorithm for gene fusion discovery in tumor RNA-Seq data. PLoS Comput. Biol. 7, e
Mermel, C.H., Schumacher, S.E., Hill, B., Meyerson, M.L., Beroukhim, R., and Getz, G. (2011). GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12, R41.
Miron, K., Golan-Lev, T., Dvir, R., Ben-David, E., and Kerem, B. (2015). Oncogenes create a unique landscape of fragile sites. Nat. Commun. 6,
Monti, S., Tamayo, P., Mesirov, J., and Golub, T. (2003). Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data. Machine Learn. 52,
Mullen, R.D., and Behringer, R.R. (2014). Molecular genetics of Müllerian duct formation, regression, and differentiation. Sex. Dev. 8,
Muthu, M., Cheriyan, V.T., and Rishi, A.K. (2015). CARP-1/CCAR1: a biphasic regulator of cancer cell growth and apoptosis. Oncotarget 6,
Olshen, A.B., Venkatraman, E.S., Lucito, R., and Wigler, M. (2004). Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5,
Peng, F., Li, Q., Sun, J.Y., Luo, Y., Chen, M., and Bao, Y. (2018). PFKFB3 is involved in breast cancer proliferation, migration, invasion and angiogenesis. Int. J. Oncol. 52,
Quinlan, R. (1983). Learning efficient classification procedures. In Machine Learning: An Artificial Intelligence Approach, R.S. Michalski, J.G. Carbonell, and T.M. Mitchell, eds. (Morgan Kaufmann Publishers), pp

17 Radenbaugh, A.J., Ma, S., Ewing, A., Stuart, J.M., Collisson, E., Zhu, J., and Haussler, D. (2014). RADIA: RNA and DNA integrated analysis for somatic mutation detection. PLoS One 9, e Ramos, A.H., Lichtenstein, L., Gupta, M., Lawrence, M.S., Pugh, T.J., Saksena, G., Meyerson, M., and Getz, G. (2015). Oncotator: cancer variant annotation tool. Hum. Mutat. 36, E2423 E2429. Ratan, A., Olson, T.L., Loughran, T.P., Jr., and Miller, W. (2015). Identification of indels in next-generation sequencing data. BMC Bioinformatics 16, 42. Reich, M., Liefeld, T., Gould, J., Lerner, J., Tamayo, P., and Mesirov, J.P. (2006). GenePattern 2.0. Nat. Genet. 38, Roberts, S.A., Lawrence, M.S., Klimczak, L.J., Grimm, S.A., Fargo, D., Stojanov, P., Kiezun, A., Kryukov, G.V., Carter, S.L., Saksena, G., et al. (2013). An APOBEC cytidine deaminase mutagenesis pattern is widespread in human cancers. Nat. Genet. 45, Robertson, G., Schein, J., Chiu, R., Corbett, R., Field, M., Jackman, S.D., Mungall, K., Lee, S., Okada, H.M., Qian, J.Q., et al. (2010). De novo assembly and analysis of RNA-seq data. Nat. Methods 7, Rosenthal, R., McGranahan, N., Herrero, J., Taylor, B.S., and Swanton, C. (2016). DeconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution. Genome Biol. 17, 31. Sadelain, M., Papapetrou, E.P., and Bushman, F.D. (2012). Safe harbours for the integration of new DNA in the human genome. Nat. Rev. Cancer 12, Saunders, C.T., Wong, W.S., Swamy, S., Becq, J., Murray, L.J., and Cheetham, R.K. (2012). Strelka: accurate somatic small-variant calling from sequenced tumor-normal sample pairs. Bioinformatics 28, Sedgewick, A.J., Benz, S.C., Rabizadeh, S., Soon-Shiong, P., and Vaske, C.J. (2013). Learning subgroup-specific regulatory interactions and regulator independence with PARADIGM. Bioinformatics 29, i62 i70. Siegel, R.L., Miller, K.D., and Jemal, A. (2017). Cancer statistics, CA Cancer J. Clin. 
67, Simpson, J.T., Wong, K., Jackman, S.D., Schein, J.E., Jones, S.J., and Birol, I. (2009). ABySS: a parallel assembler for short read sequence data. Genome Res. 19, Smith, T.C., and Frank, E. (2016). Introducing machine learning concepts with WEKA. In Statistical Genomics: Methods and Protocols, E. Mathé and S. Davis, eds. (Humana Press), pp Stadhouders, R., Cico, A., Tharshana, S., Thongjuea, S., Kolovos, P., Baymaz, H.I., Yu, X., Demmers, J., Bezstarosti, K., Maas, A., et al. (2015). Control of developmentally primed erythroid genes by combinatorial co-repressor actions. Nat. Commun. 6, Tibes, R., Qiu, Y., Lu, Y., Hennessy, B., Andreeff, M., Mills, G.B., and Kornblau, S.M. (2006). Reverse phase protein array: validation of a novel proteomic technology and utility for analysis of primary leukemia specimens and hematopoietic stem cells. Mol. Cancer Ther. 5, Torres-Garcia, W., Zheng, S., Sivachenko, A., Vegesna, R., Wang, Q., Yao, R., Berger, M.F., Weinstein, J.N., Getz, G., and Verhaak, R.G. (2014). PRADA: pipeline for RNA sequencing data analysis. Bioinformatics 30, Totoki, Y., Tatsuno, K., Covington, K.R., Ueda, H., Creighton, C.J., Kato, M., Tsuji, S., Donehower, L.A., Slagle, B.L., Nakamura, H., et al. (2014). Transancestry mutational landscape of hepatocellular carcinoma genomes. Nat. Genet. 46, Trapnell, C., Pachter, L., and Salzberg, S.L. (2009). TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, Vaske, C.J., Benz, S.C., Sanborn, J.Z., Earl, D., Szeto, C., Zhu, J., Haussler, D., and Stuart, J.M. (2010). Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM. Bioinformatics 26, i237 i245. Vogelstein, B., Papadopoulos, N., Velculescu, V.E., Zhou, S., Diaz, L.A., Jr., and Kinzler, K.W. (2013). Cancer genome landscapes. Science 339, Wang, K., Singh, D., Zeng, Z., Coleman, S.J., Huang, Y., Savich, G.L., He, X., Mieczkowski, P., Grimm, S.A., Perou, C.M., et al. (2010). 
MapSplice: accurate mapping of RNA-seq reads for splice junction discovery. Nucleic Acids Res. 38, e178.
Wilkerson, M.D., and Hayes, D.N. (2010). ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 26,
Yu, T., Bachman, J., and Lai, Z.C. (2015). Mutation analysis of large tumor suppressor genes LATS1 and LATS2 supports a tumor suppressor role in human cancer. Protein Cell 6,
Zack, T.I., Schumacher, S.E., Carter, S.L., Cherniack, A.D., Saksena, G., Tabak, B., Lawrence, M.S., Zhang, C.Z., Wala, J., Mermel, C.H., et al. (2013). Pan-cancer patterns of somatic copy number alteration. Nat. Genet. 45,
Zhang, X., Choi, P.S., Francis, J.M., Imielinski, M., Watanabe, H., Cherniack, A.D., and Meyerson, M. (2016). Identification of focally amplified lineage-specific super-enhancers in human epithelial cancers. Nat. Genet. 48,

18 STAR+METHODS KEY RESOURCES TABLE REAGENT or RESOURCE SOURCE IDENTIFIER Antibodies RPPA antibodies RPPA Core Facility, MD Anderson Cancer Center; Tibes et al., 2006; Gonzalez-Angulo et al., research-resources/core-facilities/functionalproteomics-rppa-core.html Biological Samples Primary tumor samples Multiple tissue source sites, processed through the Biospecimen Core Resource See Methods: Subject Details, Method Details Critical Commercial Assays AmpFISTR Identifier kit Applied Biosystems Cat: A30737 DNA/RNA AllPrep kit Qiagen Cat: Genome-Wide Human SNP Array 6.0 ThermoFisher Scientific Cat: HumanMethylation450 Illumina Cat: HM450 Illumina Barcoded Paired-End Library Preparation Kit Illumina sequencing/ngs-library-prep.html Infinium HumanMethylation450 BeadChip Kit Illumina Cat: WG mirvana mirna Isolation kit Ambion Phusion High-Fidelity PCR Master Mix with New England Biolabs Cat: M0531L HF Buffer QiaAmp blood midi kit Qiagen Cat: RNA6000 Nano Assay Agilent Cat: SureSelect Human All Exon 50 Mb Agilent Cat: G3370J TruSeq PE Cluster Generation Kit Illumina Cat: PE TruSeq RNA Library Prep Kit Illumina Cat: RS VECTASTAIN Elite ABC HRP Kit (Peroxidase, Standard) Vector Lab Cat: PK-6100 Deposited Data Digital pathology images Genomic Data Commons; Cancer Digital Slide Archive Raw and processed clinical, array, and Genomic Data Commons sequencing data Software and Algorithms ABSOLUTE Carter et al., 2012 PMID: ABySS v1.3.4 Simpson et al., 2009 PMID: ABySS v1.4.8 Robertson et al., 2010 PMID: BioBloomTools(v1.2.4.b) Chu et al., 2014 PMID: Birdseed Korn et al., 2008 PMID: Blastn (v2.2.23) Altschul et al., 1997 PMID: CARNAC Totoki et al., 2014 PMID: Circular Binary Segmentation Olshen et al., 2004 PMID: ConsensusClusterPlus Wilkerson and Hayes, 2010 PMID: ContEst Cibulskis et al., 2011 PMID: deconstructsigs Rosenthal et al., 2016 PMID: defuse McPherson et al., 2011 PMID: FireHose The Broad Institute of MIT & Harvard cga/firehose GenePattern Reich et al., 2006 PMID: 
GISTIC2.0 Mermel et al., 2011 PMID:

Indelocator Ratan et al., 2015 PMID: MAP-RSeq research/maprseq MapSplice Wang et al., 2010 PMID: MuTect Cibulskis et al., 2013 PMID: MutSigCV v1.4 Lawrence et al., 2013 PMID: Next-Generation Clustered Heatmap MD Anderson Cancer Center TCGA/NGCHMPortal/ Oncotator Ramos et al., 2015 PMID: PARADIGM Vaske et al., 2010 PMID: PathSeq Kostic et al., 2011 PMID: Picard The Broad Institute of MIT & Harvard PRADA Torres-Garcia et al., 2014 PMID: RADIA Radenbaugh et al., 2014 PMID: RSEM Li and Dewey, 2011 PMID: SNPFileCreator Li and Hung Wong, 2001 PMID: SomaticSignatures Gehring et al., 2015 PMID: STAR Dobin et al., 2013 PMID: Strelka Saunders et al., 2012 PMID: TopHat v2.0.8 Trapnell et al., 2009 PMID: WEKA Smith and Frank, 2016 PMID:

CONTACT FOR RESOURCE SHARING

Further information and requests for resources should be directed to and will be fulfilled by the Lead Contact, Rehan Akbani.

SUBJECT DETAILS

Human Data and Tumor Data Selection

Molecular data were obtained from patients who had not received prior treatment for their disease (ablation, chemotherapy, or radiation therapy) and who had provided informed consent as part of The Cancer Genome Atlas Project (TCGA). Local Institutional Review Boards (IRBs) at the tissue source sites reviewed protocols to approve submission of cases. We selected samples from five TCGA projects to represent gynecologic and breast cancers: breast invasive carcinoma (BRCA), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), high-grade serous ovarian cystadenocarcinoma (OV), uterine corpus endometrial carcinoma (UCEC), and uterine carcinosarcoma (UCS). Sample selection was based on availability of data and suitability of genomic features. Eight CESC samples designated as UCEC-like using mRNA data and 14 OV cases lacking TP53 mutations were excluded. The final Pan-Gyn cohort comprised 2,579 cases: 1,087 BRCA, 579 OV, 548 UCEC, 308 CESC, and 57 UCS.
TCGA Project Management collected the necessary human subjects documentation to ensure that the project complies with 45-CFR-46 (the "Common Rule"). The program has obtained documentation from every contributing clinical site to verify that IRB approval has been obtained to participate in TCGA. Such documented approval may include one or more of the following:

- An IRB-approved protocol with Informed Consent specific to TCGA or a substantially similar program. In the latter case, if the protocol was not TCGA-specific, the clinical site PI provided a further finding from the IRB that the already-approved protocol is sufficient to participate in TCGA.
- A TCGA-specific IRB waiver has been granted.
- A TCGA-specific letter stating that the IRB considers one of the exemptions in 45-CFR-46 applicable. The two most common exemptions cited were that the research falls under (f)(2) or (b)(4). Both exempt requirements for informed consent, because the received data and material do not contain directly identifiable private information.
- A TCGA-specific letter stating that the IRB does not consider the use of these data and materials to be human subjects research. This was most common for collections in which the donors were deceased.

METHOD DETAILS

Sample Processing

Cases were staged according to the American Joint Committee on Cancer (AJCC) staging system. Each frozen primary tumor specimen had a companion normal tissue specimen (blood or blood components, including DNA extracted at the tissue source site). Adjacent tissue was submitted for some cases. Specimens were shipped overnight using a cryoport that maintained an average temperature of less than −180 °C.

RNA and DNA were extracted from tumor and adjacent normal tissue specimens using a modification of the DNA/RNA AllPrep kit (Qiagen). The flow-through from the Qiagen DNA column was processed using a mirVana miRNA Isolation Kit (Ambion). This latter step generated RNA preparations that included RNA <200 nt suitable for miRNA analysis. DNA was extracted from blood using the QiaAmp blood midi kit (Qiagen). Each specimen was quantified by measuring Abs260 with a UV spectrophotometer or by PicoGreen assay. DNA specimens were resolved by 1% agarose gel electrophoresis to confirm high molecular weight fragments. A custom Sequenom SNP panel or the AmpFlSTR Identifiler kit (Applied Biosystems) was used to verify that tumor DNA and germline DNA were derived from the same patient. Five hundred nanograms of each tumor and normal DNA were sent to Qiagen for REPLI-g whole genome amplification using a 100 µg reaction scale. Only specimens yielding a minimum of 6.9 µg of tumor DNA, 5.15 µg of RNA, and 4.9 µg of germline DNA were included in this study. RNA was analyzed via the RNA6000 Nano assay (Agilent) to determine an RNA Integrity Number (RIN), and only cases with RIN >7.0 were included in this study. Reasons for rejection are described at

DNA Sequencing Data

Exome capture was performed using Agilent SureSelect Human All Exon 50 Mb according to the manufacturer's instructions. Briefly, micrograms of DNA from each sample were used to prepare the sequencing library through shearing of the DNA followed by ligation of sequencing adaptors.
All whole exome (WES) and whole genome (WGS) sequencing was performed on the Illumina HiSeq platform. Paired-end sequencing (2 x 101 bp for WGS and 2 x 76 bp for WES) was carried out using HiSeq sequencing instruments, and the resulting data were analyzed with the current Illumina pipeline. Basic alignment and sequence QC were done with the Picard and Firehose pipelines at the Broad Institute. Sequencing data were processed using two consecutive pipelines:

1) Sequencing data processing pipeline ("Picard pipeline"). Picard uses the reads and qualities produced by the Illumina software for all lanes and libraries generated for a single sample (either tumor or normal) and produces a single BAM file representing the sample. The final BAM file stores all reads and calibrated qualities along with their alignments to the genome.

2) Cancer genome analysis pipeline ("Firehose pipeline"). Firehose takes the BAM files for the tumor and patient-matched normal samples and performs analyses including quality control, local realignment, mutation calling, small insertion and deletion identification, rearrangement detection, coverage calculations and others, as described briefly below. The pipeline represents a set of tools for analyzing massively parallel sequencing data for both tumor DNA samples and their patient-matched normal DNA samples. Firehose uses GenePattern (Reich et al., 2006) as its execution engine for pipelines and modules based on input files specified by Firehose. The pipeline contains the following steps:

a. Quality control. This step confirms the identity of individual tumor and normal samples to avoid mix-ups between tumor and normal data for the same individual.

b. Local realignment of reads. This step realigns reads at sites that potentially harbor small insertions or deletions in either the tumor or the matched normal, to decrease the number of false positive single nucleotide variations caused by misaligned reads.

c. Identification of somatic single nucleotide variations (SSNVs). This step detects candidate SSNVs using a statistical analysis of the bases and qualities in the tumor and normal BAMs, using MuTect (Cibulskis et al., 2013).

d. Identification of somatic small insertions and deletions. In this step, putative somatic events were first identified within the tumor BAM file and then filtered using the corresponding normal data, using Indelocator (Ratan et al., 2015).

Molecular Features that Distinguished Pan-Gyn from Other Tumor Types

Mutations and CNVs

We identified the mutation and CNV events that are enriched in gynecologic cancers (BRCA, UCEC, CESC, UCS, and OV) compared to all other cancers. The analysis included 617 oncogenes listed in the COSMIC census list. A multi-step statistical enrichment analysis method was devised for this purpose and applied to mutation and CNV data separately (see the methods subsection dedicated to CNV data below for reference). The analysis involves creating a contingency table for altered vs. unaltered cases in Pan-Gyn vs. other cancers. First, the bias from differing sample sizes across cancer types is removed by normalizing the alteration counts in each cancer type by its sample size. For this purpose, an expected gynecological alteration count per gene is calculated. For each gene, the mutation count or CNV high-level amplification count (i.e., GISTIC thresholded CN value of +2) in each gynecological cancer type is divided by the number of samples in the associated disease type and multiplied by the total number of samples (after normalization to one hundred samples per disease for mutation counts) in the gynecological cancers. The same normalization is performed for non-gynecological cancers. This step is critical to prevent cancer types with large sample sizes (e.g., BRCA, N = 982 vs. UCS, N = 57) from dominating the analysis. Genes with adjusted p value < 0.01 for mutations and adjusted p value < 0.05 for CNVs are visualized in Figures 1A and 1B (see Statistical Analysis section for details on calculation of p values).

We addressed the question of whether the Pan-Gyn tumor types (BRCA, OV, UCS, UCEC, CESC) share a significantly larger number of enriched mutated genes than expected from a null distribution of enriched mutated genes in five randomly selected disease types. The bootstrapping analysis in Figures S1C and S1D is an iterative process of randomly selecting five cancer types from the non-gyn cancers (N = 28) and calculating the number of enriched genes in the randomly selected group using the same criteria (Fisher exact test with FDR-adjusted p value < 0.01) that we used for generating Figure 1A. The iterations were performed 10,000 times to generate the null distribution. Following the same strategy, we performed CNA analysis using gene-level CNA results from GISTIC2.0 (Mermel et al., 2011).

DNA Methylation

The aim of this section was to identify genes that are differentially methylated in gynecological tumors (BRCA, UCEC, CESC, UCS, OV) versus the other tumor types. We used two different approaches. We first mapped all the probes from the methylation array platforms to unique genes. For genes with more than one probe mapping to the promoter, the median beta value was used. For the first analysis, a threshold beta value of 0.3 was used to call the methylation status of genes. Having converted our data to binary form, we then computed, for each tumor type, the percentage of samples in which the gene was in the methylated state.
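The thresholding step just described (gene-level median beta values, a 0.3 cutoff, and per-tumor-type percentages) can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the function names `gene_level_beta` and `percent_methylated`, the array layout (rows are probes or genes, columns are samples), and the assumption that no values are missing are all choices made for the example.

```python
import numpy as np

def gene_level_beta(probe_beta, probe_genes):
    """Collapse probes to genes by taking the median beta value per gene.

    probe_beta: (n_probes, n_samples) array of beta values.
    probe_genes: gene symbol for each probe (hypothetical input layout).
    Returns (sorted gene names, (n_genes, n_samples) array).
    """
    probe_beta = np.asarray(probe_beta, dtype=float)
    probe_genes = np.asarray(probe_genes)
    genes = np.unique(probe_genes)
    collapsed = np.vstack(
        [np.median(probe_beta[probe_genes == g], axis=0) for g in genes]
    )
    return genes, collapsed

def percent_methylated(gene_beta, tumor_types, threshold=0.3):
    """Percentage of samples per tumor type in which each gene is methylated.

    A gene is called methylated in a sample when its beta value exceeds
    the threshold (0.3 in the text).
    """
    gene_beta = np.asarray(gene_beta, dtype=float)
    tumor_types = np.asarray(tumor_types)
    methylated = gene_beta > threshold  # binary methylation calls
    return {
        str(t): 100.0 * methylated[:, tumor_types == t].mean(axis=1)
        for t in np.unique(tumor_types)
    }
```

Working with percentages puts every tumor type on the same 0 to 100 scale regardless of how many samples it contributes.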
By taking percentages rather than raw case counts for each tumor type, we could correct for variation in the number of samples available for each type. For example, whereas 966 breast cancer (BRCA) cases were available, only 36 cases were available for cholangiocarcinoma (CHOL). To ensure that our analysis was not skewed by this variation in sample sizes, we normalized the number of samples for each tumor type to 100. We then grouped samples into gyn vs. non-gyn cancers, and again adjusted the size of each population to 100. Refer to the Statistical Analysis section for details on the identification of significant genes.

To obtain more robust results, we performed a second analysis to identify significantly differentially methylated genes. We logit-transformed the beta values into M-values, z-normalized the scores across all samples for a given gene, and took the median across all member samples as the methylation score for each tumor type. We then dichotomized the dataset into gyn and non-gyn populations and identified the statistically significant genes between the two populations (see Statistical Analysis section for details). We then compared the lists of statistically significant genes from the two analyses. A total of 197 genes were called significantly differentially methylated between the two populations by both analyses. The median beta values of these genes across member samples of each tumor type were then plotted as a heatmap, with the z-normalized M-values used for hierarchical clustering of genes using Euclidean distances and Ward's linkage.

Mutation Analysis

We used clinical information from 2579 women with gynecological (Pan-Gyn) cancer in the TCGA database (1097 breast carcinomas (BRCA), 579 ovary carcinomas (OV), 308 uterine cervical carcinomas (CESC), 548 endometrial carcinomas (UCEC) and 57 uterine carcinosarcomas (UCS)). The mutation data include 2,271 gynecologic tumor samples.
We used the pancan.merged.v filtered.maf and applied two different approaches to identify the most significantly mutated genes across all Pan-Gyn samples (see Statistical Analysis section for details). The mutation calls used in all of our analyses were somatic mutations only, not germline, so tumor purity differences had minimal impact. We considered genes identified by both methods to be drivers, and classified each as a potential oncogene or tumor suppressor gene based on the inferred scores.

Generation of Mutational Signatures

We used the pancanmerged.v0.2.4.sorted.maf file to analyze the operative mutational processes in Pan-Gyn samples. We selected all SNVs, created a GRanges object in R for every substitution, and converted all mutations into a matrix of substitution contexts. For every pyrimidine substitution (C>A, C>G, C>T, T>A, T>C and T>G) we used the 5' and 3' flanking bases according to the hg19 human reference genome, creating 96 possible mutation contexts as described by Alexandrov et al. (2013). We used the SomaticSignatures package for R to implement an algorithm that uses non-negative matrix factorization (NMF) to decompose the original matrix into the minimal set of mutation signatures. This algorithm estimates the contribution of each signature across the samples. These contributions were used to perform unsupervised hierarchical clustering to identify samples that share similar mutational spectra (Gehring et al., 2015).

Copy Number Alteration (CNA) Analysis

Data Generation and Processing

Tumor sample DNA was hybridized to Affymetrix SNP6.0 arrays by the Genome Analysis Platform at the Broad Institute as previously described (McCarroll et al., 2008). The resulting probe intensities were normalized and combined using SNPFileCreator (Li and Hung Wong, 2001) and then processed with Birdseed (Korn et al., 2008) to yield preliminary copy-number estimates.
The preliminary copy-number estimates were refined using tangent normalization (B. Tabak et al., unpublished data) and then underwent Circular Binary Segmentation (Olshen et al., 2004) to yield segmented relative copy-number profiles. The processed SNP intensities, Birdseed cluster files, and segmented copy-number profiles were input to HAPSEG to create haplotyped copy-number data, which were then used with MC3 mutation calls to obtain tumor heterogeneity and ploidy estimates from ABSOLUTE (Carter et al., 2012). CNAs were assessed as deviations in the tumor sample from the paired normal tissue sample, so they reflected only somatic changes. However, the amplitude of CNA signals can be suppressed in tumor samples with normal cell contamination. We therefore used ABSOLUTE-derived tumor purity and ploidy estimates for In Silico Admixture Removal (ISAR) of the segmentation data (Zack et al., 2013) to correct for any signal dampening that may have occurred before proceeding to analyze somatic copy number alterations.

Identification and Analysis of Significant Somatic Copy Number Alterations

There were 2,246 gynecologic samples and 7,707 non-gynecologic samples used for downstream copy-number analyses. To adjust for tumor heterogeneity and ploidy in both the gynecologic and non-gynecologic cohorts, the segmented relative copy-number data were ISAR-corrected (Zack et al., 2013). GISTIC2.0 (Mermel et al., 2011) was run on the resulting purity- and ploidy-adjusted data for both cohorts to obtain genome-wide estimates of significant broad and focal somatic copy number alterations. The frequency of high-level copy-number amplifications in the amplification lesion gene targets (i.e., gene targets with thresholded values of +2 produced by GISTIC) was calculated for each tumor type and plotted (Figure 1A) to visualize the differences between the gynecologic and non-gynecologic cancers. The q-values of all significant GISTIC amplification and deletion alterations in the gynecologic and non-gynecologic cohorts were plotted against each other (Figure 1B), and the alterations that were exclusive to each cohort were also visualized by plotting the amplification and deletion lesion region boundaries in genomic coordinates, using the lesion q-values as lesion amplitudes (Figure S1A).

Unsupervised hierarchical clustering, using Ward's objective function and a Euclidean distance metric, was performed on the amplification and deletion lesions predicted by GISTIC2.0 across the gynecologic cancers. The six resulting cluster groups were visualized with copy number data (Figure 3) and with various other metrics such as gene-level mutations (Figure S3A).
P values were calculated to determine significant differences across the various metrics between the cluster groups (see Statistical Analysis section for details). GISTIC2.0 was also run on the ISAR-corrected copy-number data within each cluster group in order to compare amplification and deletion lesions between groups.

DNA Methylation Data Preprocessing

Illumina Infinium DNA methylation arrays (both HumanMethylation27 (HM27) and HumanMethylation450 (HM450)) were used to assay 2,566 pan-gynecological tumor and 167 normal samples in total, including 1,074 BRCA, 573 OV, 555 UCEC, 307 CESC and 57 UCS primary tumor samples. Level 3 data from the two generations of Illumina Infinium DNA methylation arrays were combined and further normalized between platforms using a probe-by-probe proportional rescaling method, as outlined below, to yield a final common set of 22,601 probes with comparable methylation levels between platforms. During data generation, a single technical replicate of the same cell line control sample from either of two different DNA extractions (TCGA /TCGA-AV-A03D) was included on each plate as a control, and measured 44/198 times and 12/169 times on HM27 and HM450, respectively. These repeated measurements were therefore used for rescaling the HM27 data to be comparable to HM450. For each probe within each platform, we computed the median beta value across all technical replicates of each of the two TCGA IDs. We then combined the two extractions by taking the mean of the two medians obtained for the two replicate TCGA IDs, and obtained a single summarized DNA methylation readout (beta value) for the corresponding probe $i$ on each platform, denoted here $\bar{\beta}_{\mathrm{hm27},i}$ and $\bar{\beta}_{\mathrm{hm450},i}$, respectively. We then applied a constrained (within the range of 0 to 1 for beta values) linear rescaling of the HM27 data for each probe and for each patient sample using $\bar{\beta}_{\mathrm{hm27},i}$ and $\bar{\beta}_{\mathrm{hm450},i}$.

When the HM27 beta value $\beta_{\mathrm{hm27},i,j}$ of a patient sample $j$ for probe $i$ was smaller than the mean of median replicate values on HM27 for that probe, we linearly rescaled it within the interval $(0, \bar{\beta}_{\mathrm{hm27},i})$; when it was greater, we linearly rescaled it within the interval $(\bar{\beta}_{\mathrm{hm27},i}, 1)$. This translates into the following computation:

$$
\beta_{\mathrm{hm450},i,j} =
\begin{cases}
\beta_{\mathrm{hm27},i,j}\,\dfrac{\bar{\beta}_{\mathrm{hm450},i}}{\bar{\beta}_{\mathrm{hm27},i}}, & \text{if } \beta_{\mathrm{hm27},i,j} < \bar{\beta}_{\mathrm{hm27},i}, \\[2ex]
1 - \left(1 - \beta_{\mathrm{hm27},i,j}\right)\dfrac{1 - \bar{\beta}_{\mathrm{hm450},i}}{1 - \bar{\beta}_{\mathrm{hm27},i}}, & \text{if } \beta_{\mathrm{hm27},i,j} > \bar{\beta}_{\mathrm{hm27},i}.
\end{cases}
$$

Both branches give $\bar{\beta}_{\mathrm{hm450},i}$ when $\beta_{\mathrm{hm27},i,j} = \bar{\beta}_{\mathrm{hm27},i}$, so the mapping is continuous.

After the between-platform normalization, we further excluded 779 probes that still showed a consistent platform difference (mean beta value difference greater than or equal to 0.1) in six or more tumor types. To minimize the influence of normal tissue contamination and leukocyte infiltration on the DNA methylation data, we chose probes that were unmethylated in all relevant normal tissues and blood cells, thereby removing methylation signals from possible confounding factors. To deal with tumor samples of low purity, we further chose cancer-specific probes by requiring those unmethylated probes to be methylated (defined as beta value > 0.3) in more than 5% of samples per tumor type, and then applied a dichotomized clustering methodology for cluster analysis.

DNA Methylation Analysis

Unsupervised, dichotomized clustering was performed on a set of cancer-specific autosomal loci, defined as unmethylated in normal tissues and blood cells (mean beta value < 0.2 for each tissue type) but methylated (beta value > 0.3) in more than 5% of samples of each tumor type included in this analysis. For tumor types with fewer than 100 samples, we required the proportion of methylated samples to be greater than 10% instead.
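The constrained between-platform rescaling of HM27 beta values described in the DNA Methylation Data Preprocessing text can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names `replicate_reference` and `rescale_hm27_to_hm450` are hypothetical, arrays are laid out as probes by samples, and the per-probe reference values are assumed to lie strictly between 0 and 1 so that neither ratio divides by zero.

```python
import numpy as np

def replicate_reference(rep_extraction_a, rep_extraction_b):
    """Per-probe reference value on one platform.

    Each input is a (n_probes, n_replicates) array for one of the two
    control DNA extractions. The reference is the mean of the two
    per-extraction medians, as described in the text.
    """
    med_a = np.median(np.asarray(rep_extraction_a, dtype=float), axis=1)
    med_b = np.median(np.asarray(rep_extraction_b, dtype=float), axis=1)
    return (med_a + med_b) / 2.0

def rescale_hm27_to_hm450(beta_hm27, ref_hm27, ref_hm450):
    """Piecewise-linear rescaling of HM27 beta values onto the HM450 scale.

    beta_hm27: (n_probes, n_samples) patient beta values on HM27.
    ref_hm27, ref_hm450: per-probe reference values (length n_probes),
    assumed to be strictly inside (0, 1).
    """
    beta = np.asarray(beta_hm27, dtype=float)
    r27 = np.asarray(ref_hm27, dtype=float)[:, None]
    r450 = np.asarray(ref_hm450, dtype=float)[:, None]
    # Below the reference: stretch [0, r27] onto [0, r450].
    lower = beta * (r450 / r27)
    # At or above the reference: stretch [r27, 1] onto [r450, 1].
    upper = 1.0 - (1.0 - beta) * ((1.0 - r450) / (1.0 - r27))
    return np.where(beta < r27, lower, upper)
```

Because each branch maps its interval endpoints to endpoints, the rescaled values stay within 0 to 1 without any clipping.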
To generate a set of high-confidence probes, we further removed 3373 probes showing standard deviations greater than 0.05 across 97 technical replicates run along with the breast and gynecological samples. To minimize the influence of tumor purity, we dichotomized the data into 0s and 1s with a beta value cutoff of 0.3, and used Ward's method to cluster the distance matrix computed with the Jaccard index. Heatmaps are colored using methylation beta values but ordered according to the above clustering procedure. Pre-defined clusters (k=7) were generated with the cutree function in R.
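The dichotomized clustering described above (binarize at beta 0.3, Jaccard distance between samples, Ward linkage, cut into k groups) can be sketched with SciPy as follows. The source performed this step in R; the Python version here is only an assumed equivalent, and the function name `cluster_dichotomized` and the probes-by-samples layout are choices made for the example. SciPy's `linkage` accepts a precomputed condensed distance vector with `method="ward"` but, as with R's `hclust`, leaves it to the caller to accept that Jaccard distances are not Euclidean.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def cluster_dichotomized(beta, k, cutoff=0.3):
    """Cluster samples on binarized methylation calls.

    beta: (n_probes, n_samples) array of beta values.
    k: number of flat clusters to cut the tree into (7 in the text).
    Returns an integer cluster label per sample.
    """
    binary = np.asarray(beta, dtype=float) > cutoff   # 0/1 methylation calls
    distances = pdist(binary.T, metric="jaccard")     # sample-by-sample, condensed
    tree = linkage(distances, method="ward")          # Ward linkage on Jaccard distances
    # Analogous to cutree(tree, k) in R.
    return fcluster(tree, t=k, criterion="maxclust")
```

Binarizing before computing distances means that a sample with diluted signal from low purity is still grouped by which loci are methylated, not by how strongly.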

Epigenetic silencing status for the genes BRCA1 (measured by probe cg for both platforms), MLH1 (measured by probe cg for both platforms) and RAD51C (measured by probes cg and cg for platforms HM27 and HM450, respectively) was computed based on empirical beta value cutoffs of 0.3, 0.1 and 0.15, respectively, with beta values above the cutoff considered silenced.

mRNA Analysis

Identification of mRNA Gene Expression-Based Subtypes and Analysis

A combined list of available functionally defined cancer genes was first obtained from the literature (Sadelain et al., 2012). The previously reported gene expression profiles of a total of 2296 breast and gynecological tumors (1097 BRCA, 305 CESC, 305 OV, 532 UCEC and 57 UCS) were further filtered to eliminate unreliably measured genes and to limit the clustering to relevant genes (Cancer Genome Atlas Research Network et al., 2013). Genes not present in the TCGA data set were first removed. We then filtered out genes with missing values in any of the samples. Next, we filtered out genes with small expression values in at least one-third of the samples. These filters resulted in 1,860 unique genes with reliably measured expression and with cancer characteristics. The gene expression data were then median centered and log transformed. Next, we applied unsupervised hierarchical clustering to the preprocessed gene expression data. The distance metric was one minus the Pearson's correlation coefficient, and Ward's method was used as the linkage algorithm. This unsupervised approach clustered samples and identified nine robust gene expression-based subtypes. The nine subtypes and their gene expression patterns were viewed using the next-generation clustered heat map (NG-CHM), a tool that was developed at the University of Texas MD Anderson Cancer Center.
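The gene filtering and correlation-based clustering steps above can be sketched in Python as follows. Several details are assumptions made only for illustration: the text does not state what counts as a "small" expression value, so `min_expr` is a hypothetical parameter; the log transform is taken as log2(x + 1) and applied before median centering; and the function names and genes-by-samples layout are not from the source.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def filter_genes(expr, min_expr):
    """Boolean mask of genes to keep.

    Drops genes with any missing value, and genes whose expression is
    below min_expr (hypothetical cutoff) in at least one-third of samples.
    """
    expr = np.asarray(expr, dtype=float)
    has_missing = np.isnan(expr).any(axis=1)
    low_count = (expr < min_expr).sum(axis=1)
    too_low = low_count * 3 >= expr.shape[1]   # integer test for ">= 1/3"
    return ~(has_missing | too_low)

def cluster_expression(expr, k, min_expr=1.0):
    """Cluster samples using 1 - Pearson correlation and Ward linkage."""
    expr = np.asarray(expr, dtype=float)
    kept = expr[filter_genes(expr, min_expr)]
    logged = np.log2(kept + 1.0)                               # assumed transform
    centered = logged - np.median(logged, axis=1, keepdims=True)
    dist = 1.0 - np.corrcoef(centered.T)                       # sample-by-sample
    dist = np.clip((dist + dist.T) / 2.0, 0.0, 2.0)            # remove rounding noise
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="ward")
    return fcluster(tree, t=k, criterion="maxclust")
```

The same distance and linkage choice (1 minus Pearson correlation with Ward linkage) is what the text reports for the RPPA protein data, so a call with k=5 on a median-centered protein matrix would follow the same pattern.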
See Statistical Analysis section for details on calculation of statistically significant correlations and differences between the subtypes. mirna analysis To identify Samples from different tissues that had similar mirna expression profiles, we used hierarchical clustering with pheatmap v1.0.2 in R. The input was a batch-corrected, mirna-seq mature strand data matrix that contained normalized (RPM) abundance for the 293 mature strands that were the union of the most-variant 200 mature strands for each cancer, in 2417 tumor samples from UCEC (n = 524), UCS (56), CESC (306), BRCA (1057), and OV (474). We transformed each row of the matrix by log10(rpm + 1), then used pheatmap to scale the rows. We used Ward.D2 for the clustering method, and correlation and Euclidean as the distance measures for clustering the columns and rows, respectively. Proteomic Analysis Batch effects corrected protein expression data (generated using the RPPA platform) were clustered using the hierarchical clustering function hclust() in the R language. We used 1-Pearson s correlation coefficient as our distance metric with Ward linkage to cluster both the rows and the columns. The data matrix consisted of a total of 1967 samples across 217 antibodies. The matrix was mediancentered in both directions prior to clustering. Clusters were separated by using the cutree() function with k=5 clusters. lncrna Analysis We used ConsensusClusterPlus (Wilkerson and Hayes, 2010) package in R to perform consensus clustering (Monti et al., 2003) and discover the best partition of samples. The K-medoids method, a modification of the K-means algorithm, first randomly selects k data points (or medoids) that are used to form k clusters, where k is a user supplied variable. Then, all remaining data points are iteratively partitions to minimize the distance between the medoids and all other data points in the same cluster. 
Once all data points are assigned, a medoid is selected for each cluster and the process is repeated until it converges or until a maximum number of iterations is reached. We used the Partitioning Around Medoids (PAM) algorithm to implement the K-medoids method, with Pearson's correlation coefficient as the measure of similarity between data points. We used bootstrapping to select k. For each of 1,000 bootstraps, we selected 80% of the samples and 80% of the lncRNA genes to investigate how frequently samples were grouped in the same cluster for each k. The best k value between 2 and 15 was selected by the Silhouette index, a clustering validation measure that evaluates the level of similarity within a cluster and dissimilarity between clusters. The standard deviation produced by this bootstrapping computation was used to compare the Silhouette index across choices of k.

Batch Effects Analysis

We investigated batch effects first within individual disease types, and then across tumor types. Specifically, we investigated the effects of multiple confounding factors, including differences in: (i) the batches in which the samples were processed, (ii) the tissue source sites from which the samples were obtained, (iii) the dates on which the samples were shipped to the data generation centers, (iv) the instruments on which the samples were processed, and (v) the centers that generated the data. The results from individual tumor type analyses can be found online at: ( We assessed the magnitude of batch effects using the following approaches: (i) clustered heat maps, (ii) enhanced PCA plots, and (iii) box plots. Whenever batch effects were observed, we corrected them using (i) ComBat (Johnson et al., 2007) or an enhanced version of it, (ii) Replicates Based Normalization (RBN) (Akbani et al., 2014), or (iii) removal of bad gene/probe data. Using those methods, we corrected the mRNA, miRNA, DNA methylation and protein expression data.
The mutations and copy number data were already discretized and corrected for background loads.
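As an aside on the lncRNA Analysis above, the Silhouette index used there to compare choices of k can be sketched in a few lines. This is a minimal illustration that assumes 1 - Pearson's correlation as the distance, matching the similarity measure named in the text. It is not the ConsensusClusterPlus or PAM implementation, and the function names are hypothetical.

```python
import math


def pearson_distance(x, y):
    """Return 1 - Pearson correlation; assumes neither profile is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def silhouette_index(profiles, labels):
    """Mean silhouette width over all samples (requires at least 2 clusters)."""
    n = len(profiles)
    widths = []
    for i in range(n):
        sums, counts = {}, {}
        for j in range(n):
            if i == j:
                continue
            d = pearson_distance(profiles[i], profiles[j])
            sums[labels[j]] = sums.get(labels[j], 0.0) + d
            counts[labels[j]] = counts.get(labels[j], 0) + 1
        own = labels[i]
        if own not in counts:
            widths.append(0.0)  # singleton cluster: width is defined as 0
            continue
        a = sums[own] / counts[own]  # mean distance within own cluster
        b = min(sums[c] / counts[c] for c in counts if c != own)  # nearest other
        widths.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(widths) / n
```

Values near 1 indicate compact, well-separated clusters; values near or below 0 indicate samples that sit closer to another cluster than to their own.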

Pathways Analysis

PARADIGM Integrated Pathway Analysis from Copy Number and Expression Data

We used the PARADIGM algorithm (Vaske et al., 2010; Sedgewick et al., 2013) to infer the activities of 19K pathway features based on expression, copy number and pathway interaction data for 9829 tumor samples, including 2173 Pan-Gynecological cancers. Platform-corrected expression data and gene-level copy number data were obtained from Synapse (syn and syn ). Whitelisted samples assayed on both platforms were identified. One was added to all expression values, which were then log2 transformed and median-centered across samples for each gene. The log2 transformed, median-centered mRNA data were rank transformed based on the global ranking across all samples and all genes, and discretized (+1 for values with ranks in the highest tertile, -1 for values with ranks in the lowest tertile, and 0 otherwise) prior to PARADIGM analysis. Pathways were obtained in BioPax Level 3 format, and included the NCIPID and BioCarta databases from and the Reactome database from Gene identifiers were unified by UniProt ID and then converted to Human Genome Nomenclature Committee's HUGO symbols using mappings provided by HGNC ( Altogether, 1524 pathways were obtained. Interactions from all of these sources were then combined into a merged Superimposed Pathway (SuperPathway). Genes, complexes, and abstract processes (e.g., "cell cycle" and "apoptosis") were retained and are henceforth referred to collectively as pathway features. The resulting pathway structure contained a total of features, representing 7369 proteins, 9354 complexes, 2092 families, 82 RNAs, 15 miRNAs and 592 abstract processes. The PARADIGM algorithm infers an integrated pathway level (IPL) for each feature that reflects the log likelihood of the probability that it is activated (vs. inactivated). PARADIGM IPLs of the features within the SuperPathway are available on Synapse (syn ).
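The expression preprocessing described above (add one, log2 transform, median-center per gene, then discretize by global tertile rank) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the handling of tied values at the tertile boundaries is an assumption, since the text does not specify it.

```python
import math
from statistics import median


def log2_median_center(matrix):
    """matrix maps gene -> list of sample values; returns centered log2(x + 1)."""
    centered = {}
    for gene, values in matrix.items():
        logged = [math.log2(v + 1.0) for v in values]
        m = median(logged)
        centered[gene] = [v - m for v in logged]
    return centered


def discretize_by_global_tertile(centered):
    """Map values to +1 / -1 / 0 using ranks pooled over all genes and samples.

    Ties at a boundary all receive the boundary's label (an assumption).
    """
    flat = sorted(v for values in centered.values() for v in values)
    k = len(flat) // 3  # number of entries in each outer tertile
    if k == 0:
        return {gene: [0] * len(values) for gene, values in centered.items()}
    low_cut, high_cut = flat[k - 1], flat[len(flat) - k]

    def label(v):
        if v <= low_cut:
            return -1
        if v >= high_cut:
            return 1
        return 0

    return {gene: [label(v) for v in values] for gene, values in centered.items()}
```

Because the ranking is pooled across all genes and samples rather than computed per gene, a gene with little variation can end up with every sample labeled 0.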
We also computed the single sample gene set enrichment (ssGSEA) score, as described by Barbie et al. (2009), of the constituent pathways forming the SuperPathway structure from the PARADIGM IPL data using the GSVA package in R (Hänzelmann et al., 2013). Of the 1524 pathways obtained, only 1387 have pathway members within the interconnected SuperPathway structure; their ssGSEA scores are available on Synapse (syn ).

Consensus Clustering based on PARADIGM Inferred Pathway Activation

Consensus clustering based on the 4876 most varying features (i.e., IPLs with variance within the highest quartile) was used to identify Pan-Gynecological subtypes implicated by shared patterns of pathway inference. Consensus clustering was implemented with the ConsensusClusterPlus package in R (Wilkerson and Hayes, 2010). Specifically, median-centered IPLs were used to compute the squared Euclidean distance between samples, and this metric was used as the input to the ConsensusClusterPlus algorithm. Hierarchical clustering using Ward's minimum variance method (i.e., the ward inner linkage option) with 80% subsampling was performed over 1000 iterations, and the final consensus matrix was clustered using average linkage. The number of clusters was selected by considering the relative change in the area under the empirical cumulative distribution function (CDF) curve. We selected k=8 because further separation provided minimal change. The heatmap display of the top varying IPLs was generated using the heatmap.plus package in R. See the Statistical Analysis section for details on the identification of significant pathway differences between the resulting clusters.

Integrated Analysis across Pan-Gyn Tumor Types

Cluster of Cluster Assignments (CoCA) analysis was performed using the cluster assignments from each of the 6 major platforms (mutations, CNV, DNA methylation, mRNA, miRNA, and protein).
Cluster assignments defined from each platform were coded into a series of indicator variables of the form <platform>-<cluster number>, with samples belonging to the particular platform/cluster having a value of 1 and all other samples having a value of 0. The matrix of 1s and 0s was then clustered using hierarchical clustering with the hclust() function in R, with Euclidean distance and Ward linkage.

Subtypes across the Pan-Gyn Tumors

For the subtype analysis, we identified features that were either (i) currently used in the clinic for at least one of the five tumor types or (ii) identified as informative in previous TCGA gynecologic and breast cancer studies (Cancer Genome Atlas Research Network, 2011, 2012, 2017; Cancer Genome Atlas Research Network et al., 2013; Cherniack et al., 2017; Hoadley et al., 2014; Akbani et al., 2014; Cherniack et al., 2017, 2017). Features in group (i) were protein expression of ER and PR, BRCA1/BRCA2 mutation status, ERBB2 amplification status, and HPV status. Features in group (ii) were MSI status, hyper-mutator status (>10 mutations/Mbp), SCNA load, AR protein expression, leukocyte infiltration score based on DNA methylation, and mutation status of PTEN, TP53, H-RAS/K-RAS/N-RAS, ERBB2, PIK3CA, and POLE. We initially selected a total of 19 features for the analysis. We then combined N-RAS, H-RAS, and K-RAS mutations into a single feature (using OR logic), and BRCA1 and BRCA2 mutations into another single feature (again using OR logic), yielding a final tally of 16 features. MSI status was available only for UCEC and UCS, and HPV status was available only for CESC, so we treated those features as missing for the remaining tumor types. We dichotomized the 16 features into present/absent (for discrete features such as mutations) and high/low (for continuous features) in each sample.
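The indicator coding used for the CoCA analysis described at the start of this passage can be sketched as follows. This is a minimal illustration with hypothetical names and toy data. It assumes that a sample lacking a call on a platform simply receives 0 for all of that platform's columns, which the text does not state.

```python
def coca_indicator_matrix(assignments):
    """Build the 0/1 CoCA input matrix.

    assignments maps platform -> {sample: cluster number}.
    Returns (columns, rows): column names of the form "<platform>-<cluster>",
    and a dict mapping each sample to its list of 0/1 indicators.
    """
    columns = sorted(
        {
            f"{platform}-{cluster}"
            for platform, calls in assignments.items()
            for cluster in calls.values()
        }
    )
    samples = sorted({s for calls in assignments.values() for s in calls})
    rows = {}
    for sample in samples:
        member_of = {
            f"{platform}-{calls[sample]}"
            for platform, calls in assignments.items()
            if sample in calls
        }
        rows[sample] = [1 if col in member_of else 0 for col in columns]
    return columns, rows


# Toy example with two platforms and two samples (values are illustrative).
columns, rows = coca_indicator_matrix(
    {"mRNA": {"s1": 1, "s2": 2}, "protein": {"s1": 3, "s2": 3}}
)
```

The rows of this matrix are what would then be clustered with Euclidean distance and Ward linkage.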
Eleven of those features were discrete (all the mutations, MSI status, hyper-mutator status, ERBB2 amplifications, and HPV status), whereas the remaining 5 features (CNV load, immune score, and ER, PR, and AR protein expression) were continuous. The CNV load and immune score thresholds were obtained by modeling each score with a bimodal Gaussian distribution and using the value between the two modes as the threshold. For ER and PR, the thresholds were identified by maximizing the area under the curve (using the Youden index) when the continuous-valued protein expression was used to predict the binary ER/PR status obtained from immunohistochemistry (IHC) in BRCA. The AR cutoff was obtained similarly, using the continuous-valued AR protein expression level to model the binary AR status between TCGA prostate cancer (PRAD) data and UCEC, BRCA, OV, CESC, and UCS. Samples without protein expression data were removed, leaving 1,956 samples out of an original 2,579. Once all the features were binarized, we constructed a samples × features matrix in which each cell had a 1 if the sample had that feature (or had high levels of that feature), and 0 otherwise. The resulting matrix was clustered using hierarchical clustering with the hclust() function in R, with 1 - Pearson's correlation and Ward linkage. The clusters were separated using the cutree() function with k=5 clusters. We used the J48 decision tree function in the Weka package (Frank et al., 2016) to construct a pruned decision tree, using the feature matrix as input and the cluster assignments as the class variable. The software output the decision tree shown in Figure 7D. For the survival analysis shown in Figures 7C and 7E, we used the survival library in R and used the survfit() function followed by plot() to generate the figures. The survival data were fitted using the cluster variable. For p value computation, see the Quantification and Statistical Analysis section.

QUANTIFICATION AND STATISTICAL ANALYSIS

Statistical Tests for Distinguishing Pan-Gyn from Other Tumor Types

For both mutations and high-level amplifications, the normalized mutated vs. non-mutated counts in gynecological vs. non-gynecological cancers formed the 2×2 contingency table for each gene. A Fisher's exact test was applied for each gene to determine significant differences in enrichment between the two populations, and the resulting p values were adjusted for false discovery rate (FDR) with the Benjamini-Hochberg (BH) method. For DNA methylation, we used a combination of two approaches. For the first approach, we used the proportions test to see whether any genes were significantly differentially methylated in one population versus the other. We performed FDR correction using the BH method, and considered genes with adjusted one-sided p values of less than 0.05 to be significant.
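The per-gene Fisher's exact test followed by Benjamini-Hochberg adjustment can be sketched as below. This is an illustrative standard-library sketch, not the authors' pipeline: it takes raw integer counts (the text mentions normalized counts, whose normalization is not described here), uses the common two-sided definition that sums all tables no more probable than the observed one, and the function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, one p value would be computed per gene from its gyn vs. non-gyn table, and the full list passed to `benjamini_hochberg` before applying the significance cutoff.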
For the second approach, a Mann-Whitney U test was used to identify genes with significant differences between the median methylation levels in the gyn vs. non-gyn populations, with the resulting p values adjusted for FDR using BH correction.

Mutation Analysis

We applied two different approaches to identify the most significantly mutated genes across all PanGyn samples. First, we used the method described by Vogelstein et al. (2013) to estimate the oncogene (ONG) and tumor suppressor gene (TSG) scores. The ONG score was estimated as the ratio of recurrent mutations (defined as missense and in-frame mutations that affected the same codon of the annotated transcript). The TSG score used the ratio of inactivating mutations (nonsense and frameshift mutations, and variants that affected splice sites) in a specific transcript. Genes with an ONG or TSG score > 0.2 were classified as carrying putative driver mutations (Table S3). For the second method, we used MutSigCV v1.4 ( to infer significantly mutated cancer genes across all PanGyn samples (Lawrence et al., 2013). We found 46 significantly mutated genes based on the intersection of the genes identified by MutSigCV v1.4 and those identified by the Vogelstein et al. method. For the ten mutation signatures identified by NMF, we calculated correlations between the ten mutation signatures and the 30 COSMIC gene signatures ( P values were calculated for these correlations and corrected to FDR values.

Determining Significant Patterns of Somatic Copy Number Aberration

Genomic regions undergoing significant levels of copy number alteration, and the significant targets of these somatic copy number alterations along with their q-values, were identified using GISTIC2.0 (Mermel et al., 2011) for both the gyn and non-gyn sample cohorts. High-level amplifications for a gene were defined as having a thresholded copy number value of +2 as estimated by GISTIC.
Broad-level copy number contributions estimated by GISTIC with a value greater than +1 or less than -1 were classified as broad-level amplifications and broad-level deletions, respectively. Testing for significant differences between the six resulting Pan-Gyn SCNA cluster groups was done using binomial tests for broad-level chromosomal alterations (identified through GISTIC), Kruskal-Wallis tests for continuous variables such as number of segments and tumor purity (identified through ABSOLUTE), and Chi-squared tests for independence comparing discrete variables such as gene mutations and tumor pathologic stage.

mRNA Analysis

This unsupervised approach clustered samples and identified nine robust gene expression-based subtypes. Chi-squared tests were used to evaluate the correlation between mRNA clusters and tumor type, grade, histology, or molecular subtypes as determined by individual diseases. The log-rank test and Kaplan-Meier survival curves were used to compare overall survival (OS) between different clusters of patients (Cancer Genome Atlas Research Network et al., 2013). To adjust for lineage differences, the log-likelihood ratio (LR) statistic was calculated for a Cox proportional hazards model built using just tumor type information. We then added mRNA cluster information to the model and recomputed the LR statistic. We calculated the difference between the two LR statistics and computed its p value using the chi-squared test (Hoadley et al., 2014). For the discriminatory genes analysis, we used the Kruskal-Wallis test to identify the top genes that discriminated between the mRNA clusters.

lncRNA Statistical Analysis

We used Pearson's correlation as the metric when we performed unsupervised consensus clustering of the lncRNA data by K-medoids with bootstrapping. Silhouette analysis suggested 6 as the optimal number of clusters (L1 to L6).
We compared cluster membership with membership of the five protein-based clusters by performing Fisher's exact tests and corrected the resulting p values to FDR values. We determined significance using a FDR-corrected p value of < We also calculated p values for Pearson's correlations between expression of key lncRNAs and their regulators and used a p value of 0.05 as the cutoff for significance. Gene Set Enrichment Analysis (GSEA) was used to determine significant enrichment (with a cutoff of FDR < 0.05) of gene sets containing TERC-correlated genes.

Pathway Differences between Pan-Gynecological PARADIGM Clusters

Pathway biomarkers of each PARADIGM cluster were identified by comparing one cluster vs. all others using the t-test and the Wilcoxon rank-sum test with Benjamini-Hochberg (BH) false discovery rate (FDR) correction. An initial minimum variation filter (at least 1 sample with absolute activity > 0.05) was applied, and only the features passing this filter were considered in the analysis. Features deemed significant (FDR-corrected p value < 0.05) by both tests and showing an absolute difference in group means > 0.05 were selected. The selected pathway features were assessed for interconnectivity, and constituent pathways enriched among the interconnected differential features were identified using a modified Fisher's test with BH FDR correction. We also compared ssGSEA scores of the constituent pathways in one cluster vs. all others; pathways that had differential ssGSEA scores and were enriched among the interconnected differential features were selected for display in a heatmap.

Subtypes across the Pan-Gyn Tumors Survival Analysis

Survival analysis of the subtype groups was done using the R package survival. The log-rank test was used to compute the p value (unadjusted for tumor type). The p value adjusted for tumor type was computed by first constructing a Cox proportional hazards model using both cluster and tumor type as the fitting variables. Then, a second Cox proportional hazards model was constructed using just the tumor type variable. The test statistic from the second model was subtracted from the test statistic from the first model.
The resulting difference in the test statistics followed a χ² distribution with 4 degrees of freedom (because there were 5 clusters), and was a measure of the additional prognostic value provided by the clusters above and beyond the information provided by tumor type alone. The p value for the difference in the test statistics is shown in Figures 7C and 7E as the tumor type adjusted p value.

DATA AND SOFTWARE AVAILABILITY

The raw data, processed data and clinical data can be found at the legacy archive of the GDC ( and the PancanAtlas publication page ( The mutation data can be found here ( TCGA data can also be explored through the Broad Institute FireBrowse portal ( and the Memorial Sloan Kettering Cancer Center cBioPortal ( Details for software availability are in the Key Resources Table.
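As a worked illustration of the tumor type adjusted p value described in the survival analysis above, the difference between the two Cox model test statistics can be referred to a chi-squared distribution with (number of clusters - 1) degrees of freedom. The sketch below uses the closed-form upper tail that exists for even degrees of freedom, so it needs only the standard library. The function names and the example statistics are hypothetical; in practice the statistics would come from the fitted Cox models.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable with even df.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def adjusted_p_value(stat_cluster_plus_type, stat_type_only, n_clusters=5):
    """P value for the gain from adding cluster to a tumor-type-only model."""
    diff = stat_cluster_plus_type - stat_type_only
    return chi2_sf_even_df(diff, n_clusters - 1)


# Hypothetical statistics: a difference of about 9.49 on 4 df gives p near 0.05.
p = adjusted_p_value(29.4877, 20.0)
```

With 5 clusters the degrees of freedom are 4, so the tail reduces to exp(-d/2) * (1 + d/2), where d is the difference in test statistics.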

27 Cancer Cell, Volume 33 Supplemental Information A Comprehensive Pan-Cancer Molecular Study of Gynecologic and Breast Cancers Ashton C. Berger, Anil Korkut, Rupa S. Kanchi, Apurva M. Hegde, Walter Lenoir, Wenbin Liu, Yuexin Liu, Huihui Fan, Hui Shen, Visweswaran Ravikumar, Arvind Rao, Andre Schultz, Xubin Li, Pavel Sumazin, Cecilia Williams, Pieter Mestdagh, Preethi H. Gunaratne, Christina Yau, Reanne Bowlby, A. Gordon Robertson, Daniel G. Tiezzi, Chen Wang, Andrew D. Cherniack, Andrew K. Godwin, Nicole M. Kuderer, Janet S. Rader, Rosemary E. Zuna, Anil K. Sood, Alexander J. Lazar, Akinyemi I. Ojesina, Clement Adebamowo, Sally N. Adebamowo, Keith A. Baggerly, Ting-Wen Chen, Hua-Sheng Chiu, Steve Lefever, Liang Liu, Karen MacKenzie, Sandra Orsulic, Jason Roszik, Carl Simon Shelley, Qianqian Song, Christopher P. Vellano, Nicolas Wentzensen, The Cancer Genome Atlas Research Network, John N. Weinstein, Gordon B. Mills, Douglas A. Levine, and Rehan Akbani

28 The members of The Cancer Genome Atlas Research Network for this project are: NCI/NHGRI Project Team: Samantha J. Caesar-Johnson, John A. Demchok, Ina Felau, Melpomeni Kasapi, Martin L. Ferguson, Carolyn M. Hutter, Heidi J. Sofia, Roy Tarnuzzer, Zhining Wang, Liming Yang, Jean C. Zenklusen, Jiashan (Julia) Zhang TCGA DCC: Sudha Chudamani, Jia Liu, Laxmi Lolla, Rashi Naresh, Todd Pihl, Qiang Sun, Yunhu Wan, Ye Wu Genome Data Analysis Centers (GDACs) The Broad Institute: Juok Cho, Timothy DeFreitas, Scott Frazer, Nils Gehlenborg, Gad Getz, David I. Heiman, Jaegil Kim, Michael S. Lawrence, Pei Lin, Sam Meier, Michael S. Noble, Gordon Saksena, Doug Voet, Hailei Zhang Institute for Systems Biology: Brady Bernard, Nyasha Chambwe, Varsha Dhankani, Theo Knijnenburg, Roger Kramer, Kalle Leinonen, Yuexin Liu, Michael Miller, Sheila Reynolds, Ilya Shmulevich, Vesteinn Thorsson, Wei Zhang MD Anderson Cancer Center: Rehan Akbani, Bradley M. Broom, Apurva M. Hegde, Zhenlin Ju, Rupa S. Kanchi, Anil Korkut, Jun Li, Han Liang, Shiyun Ling, Wenbin Liu, Yiling Lu, Gordon B. Mills, Kwok-Shing Ng, Arvind Rao, Michael Ryan, Jing Wang, John N. Weinstein, Jiexin Zhang Memorial Sloan Kettering Cancer Center: Adam Abeshouse, Joshua Armenia, Debyani Chakravarty, Walid K. Chatila, Ino de Bruijn, Jianjiong Gao, Benjamin E. Gross, Zachary J. Heins, Ritika Kundra, Konnor La, Marc Ladanyi, Augustin Luna, Moriah G. Nissan, Angelica Ochoa, Sarah M. Phillips, Ed Reznik, Francisco Sanchez-Vega, Chris Sander, Nikolaus Schultz, Robert Sheridan, S. Onur Sumer, Yichao Sun, Yichao Sun, Barry S. Taylor, Jioajiao Wang, Hongxin Zhang Oregon Health and Science University: Pavana Anur, Myron Peto, Paul Spellman University of California Santa Cruz: Christopher Benz, Joshua M. Stuart, Christopher K. Wong, Christina Yau University of North Carolina at Chapel Hill: D. Neil Hayes, Joel S. Parker, Matthew D. 
Wilkerson Genome Characterization Centers (GCC) BC Cancer Agency: Adrian Ally, Miruna Balasundaram, Reanne Bowlby, Denise Brooks, Rebecca Carlsen, Eric Chuah, Noreen Dhalla, Robert Holt, Steven J.M. Jones, Katayoon Kasaian, Darlene Lee, Yussanne Ma, Marco A. Marra, Michael Mayo, Richard A. Moore, Andrew J. Mungall, Karen Mungall, A. Gordon Robertson, Sara Sadeghi, Jacqueline E. Schein, Payal Sipahimalani, Angela Tam, Nina Thiessen, Kane Tse, Tina Wong

29 The Broad Institute: Ashton C. Berger, Rameen Beroukhim, Andrew D. Cherniack, Carrie Cibulskis, Stacey B. Gabriel, Galen F. Gao, Gavin Ha, Matthew Meyerson, Gordon Saksena, Steven E. Schumacher, Juliann Shih Harvard: Melanie H. Kucherlapati, Raju S. Kucherlapati Johns Hopkins: Stephen Baylin, Leslie Cope, Ludmila Danilova University of Southern California: Moiz S. Bootwalla, Phillip H. Lai, Dennis T. Maglinte, David J. Van Den Berg, Daniel J. Weisenberger University of North Carolina at Chapel Hill: J. Todd Auman, Saianand Balu, Tom Bodenheimer, Cheng Fan, D. Neil Hayes, Katherine A. Hoadley, Alan P. Hoyle, Stuart R. Jefferys, Corbin D. Jones, Shaowu Meng, Piotr A. Mieczkowski, Lisle E. Mose, Joel S. Parker, Amy H. Perou, Charles M. Perou, Jeffrey Roach, Yan Shi, Janae V. Simons, Tara Skelly, Matthew G. Soloway, Donghui Tan, Umadevi Veluvolu, Matthew D. Wilkerson Van Andel Research Institute: Huihui Fan, Toshinori Hinoue, Peter W. Laird, Hui Shen, Wanding Zhou Genome Sequencing Centers (GSC) Baylor College of Medicine: Michelle Bellair, Kyle Chang, Kyle Covington, Chad J. Creighton, Huyen Dinh, HarshaVardhan Doddapaneni, Lawrence A. Donehower, Jennifer Drummond, Richard A. Gibbs, Robert Glenn, Walker Hale, Yi Han, Jianhong Hu, Viktoriya Korchina, Sandra Lee, Lora Lewis, Wei Li, Xiuping Liu, Margaret Morgan, Donna Morton, Donna Muzny, Jireh Santibanez, Margi Sheth, Eve Shinbrot, Linghua Wang, Min Wang, David A. Wheeler, Liu Xi, Fengmei Zhao The Broad Institute: Carrie Cibulskis, Stacy B. Gabriel, Julian Hess Washington University at St. Louis: Elizabeth L. Appelbaum, Matthew Bailey, Matthew G. Cordes, Li Ding, Catrina C. Fronick, Lucinda A. Fulton, Robert S. Fulton, Cyriac Kandoth, Elaine R. Mardis, Michael D. McLellan, Christopher A. Miller, Heather K. Schmidt, Richard K. 
Wilson Bio specimen Core Resource: The International Genomics Consortium: Daniel Crain, Erin Curley, Johanna Gardner, Kevin Lau, David Mallery, Scott Morris, Joseph Paulauskis, Robert Penny, Candace Shelton, Troy Shelton, Mark Sherman, Eric Thompson, Peggy Yena Nationwide Children s Organization:

30 Jay Bowen, Julie M. Gastier-Foster, Mark Gerken, Kristen M. Leraas, Tara M. Lichtenberg, Nilsa C. Ramirez, Lisa Wise, Erik Zmuda Tissue Source Sites Australian Prostate Cancer Research Center: Niall Corcoran, Tony Costello, Christopher Hovens Barretos Cancer Hospital: Andre L. Carvalho, Ana C. de Carvalho, José H. Fregnani, Adhemar Longatto-Filho, Rui M. Reis, Cristovam Scapulatempo-Neto, Henrique C. S. Silveira, Daniel O. Vidal Barrow Neurological Institute: Andrew Burnette, Jennifer Eschbacher, Beth Hermes, Ardene Noss, Rosy Singh Baylor College of Medicine: Matthew L. Anderson, Patricia D. Castro, Michael Ittmann BC Cancer Agency: David Huntsman BioreclamationIVT: Bernard Kohl, Xuan Le, Richard Thorp Boston Medical Center: Chris Andry, Elizabeth R. Duffy Botkin Hospital: Vladimir Lyadov, Oxana Paklina, Galiya Setdikova, Alexey Shabunin, Mikhail Tavobilov Brain Tumor Center at the University of Cincinnati Gardner Neuroscience Institute: Christopher McPherson, Ronald Warnick Brigham and Women's Hospital: Ross Berkowitz, Daniel Cramer, Colleen Feltmate, Neil Horowitz, Adam Kibel, Michael Muto, Chandrajit P. Raut Capital Biosciences, Inc.: Andrei Malykh Case Comprehensive Cancer Center: Jill S. Barnholtz-Sloan, Wendi Barrett, Karen Devine, Jordonna Fulop, Quinn T. Ostrom, Kristen Shimmel, Yingli Wolinsky Case Western Reserve School of Medicine: Andrew E. Sloan

31 Catholic University of the Sacred Heart: Agostino De Rose, Felice Giuliante Cedars-Sinai Medical Center: Marc Goodman, Beth Y. Karlan Central Arkansas Veterans Healthcare System: Curt H. Hagedorn Centura Health: John Eckman, Jodi Harr, Jerome Myers, Kelinda Tucker, Leigh Anne Zach Chan Soon-Shiong Institute of Molecular Medicine at Windber: Brenda Deyarmin, Hai Hu, Leonid Kvecher, Caroline Larson, Richard J. Mural, Stella Somiari Charles University: Ales Vicha, Tomas Zelinka Christiana Care Health System: Joseph Bennett, Mary Iacocca, Brenda Rabeno, Patricia Swanson CHU of Montreal: Mathieu Latour CHU of Quebec: Louis Lacombe, Bernard Têtu CHU of Quebec, Laval University Research Center of Chus: Alain Bergeron Cleveland Clinic Foundation: Mary McGraw, Susan M. Staugaitis Columbia University: John Chabot, Hanina Hibshoosh, Antonia Sepulveda, Tao Su, Timothy Wang Cureline, Inc.: Olga Potapova, Olga Voronina Curie Institute: Laurence Desjardins, Odette Mariani, Sergio Roman-Roman, Xavier Sastre, Marc-Henri Stern Dana-Farber Cancer Institute: Feixiong Cheng, Sabina Signoretti Dignity Health Mercy Gilbert Medical Center: Jennifer Eschbacher

32 Duke University Medical Center: Andrew Berchuck, Darell Bigner, Eric Lipp, Jeffrey Marks, Shannon McCall, Roger McLendon, Angeles Secord, Alexis Sharp Emory University: Madhusmita Behera, Daniel J. Brat, Amy Chen, Keith Delman, Seth Force, Fadlo Khuri, Fadlo Khuri, Kelly Magliocca, Shishir Maithel, Jeffrey J. Olson, Taofeek Owonikoko, Alan Pickens, Suresh Ramalingam, Dong M. Shin, Gabriel Sica, Gabriel Sica, Erwin G. Van Meir, Erwin G. Van Meir, Hongzheng Zhang Erasmus Medical Center: Wil Eijckenboom, Ad Gillis, Esther Korpershoek, Leendert Looijenga, Wolter Oosterhuis, Hans Stoop, Kim E. van Kessel, Ellen C. Zwarthoff Foundation of the Carlo Besta Neurological Institute, IRCCS: Chiara Calatozzolo, Lucia Cuppini, Stefania Cuzzubbo, Francesco DiMeco, Gaetano Finocchiaro, Luca Mattei, Alessandro Perin, Bianca Pollo Fred Hutchinson Cancer Research Center: Chu Chen, John Houck, Pawadee Lohavanichbutr Friedrich-Alexander-University: Arndt Hartmann, Christine Stoehr, Robert Stoehr, Helge Taubert, Sven Wach, Bernd Wullich Greater Poland Cancer Center: Witold Kycler, Dawid Murawa, Maciej Wiznerowicz Greenville Health System Institute for Translational Oncology Research: Ki Chung, W. Jeffrey Edenfield, Julie Martin Gustave Roussy institute: Eric Baudin Harvard University: Glenn Bubley, Raphael Bueno, Assunta De Rienzo, William G. Richards Henry Ford Health System: Ana decarvalho, Steven Kalkanis, Tom Mikkelsen, Tom Mikkelsen, Houtan Noushmehr, Lisa Scarpace Hospices Civils de Lyon: Nicolas Girard Hospital Clinic: Marta Aymerich, Elias Campo, Eva Giné, Armando López Guillermo Hue Central Hospital: Nguyen Van Bang, Phan Thi Hanh, Bui Duc Phu Human Tissue Resource Network:

33 Yufang Tang Huntsman Cancer Institute: Howard Colman, Kimberley Evason Icahn School of Medicine at Mount Sinai: Peter R. Dottino, John A. Martignetti Imperial College London: Hani Gabra Indivumed GmbH: Hartmut Juhl Institute of Human Virology Nigeria: Teniola Akeredolu Institute of Urgent Medicine: Serghei Stepa John Wayne Cancer Institute: Dave Hoon Keimyung University: Keunsoo Ahn, Koo Jeong Kang Ludwich Maximilians University Munich: Felix Beuschlein Maine Medical Center: Anne Breggia Massachusetts General Hospital: Michael Birrer Mayo Clinic: Debra Bell, Mitesh Borad, Alan H. Bryce, Erik Castle, Vishal Chandan, John Cheville, John A. Copland, Michael Farnell, Thomas Flotte, Nasra Giama, Thai Ho, Michael Kendrick, Jean-Pierre Kocher, Karla Kopp, Catherine Moser, David Nagorney, Daniel O'Brien, Brian Patrick O'Neill, Tushar Patel, Gloria Petersen, Gloria Petersen, Florencia Que, Michael Rivera, Lewis Roberts, Robert Smallridge, Robert Smallridge, Thomas Smyrk, Thomas Smyrk, Melissa Stanton, R. Houston Thompson, Michael Torbenson, Ju Dong Yang, Lizhi Zhang, Lizhi Zhang McGill University Health Center: Fadi Brimo MD Anderson Cancer Center:

34 Jaffer A. Ajani, Ana Maria Angulo Gonzalez, Carmen Behrens, Jolanta Bondaruk, Russell Broaddus, Bradley Broom, Bogdan Czerniak, Bita Esmaeli, Junya Fujimoto, Jeffrey Gershenwald, Charles Guo, Alexander J. Lazar, Christopher Logothetis, Funda Meric-Bernstam, Funda Meric-Bernstam, Cesar Moran, Lois Ramondetta, David Rice, Anil Sood, Pheroze Tamboli, Timothy Thompson, Patricia Troncoso, Patricia Troncoso, Anne Tsao, Ignacio Wistuba Melanoma Institute Australia: Candace Carter, Lauren Haydu, Peter Hersey, Valerie Jakrot, Hojabr Kakavand, Richard Kefford, Kenneth Lee, Georgina Long, Graham Mann, Michael Quinn, Robyn Saw, Richard Scolyer, Kerwin Shannon, Andrew Spillane, Jonathan Stretch, Maria Synott, John Thompson, James Wilmott Memorial Sloan Kettering Cancer Center: Hikmat Al-Ahmadie, Timothy A. Chan, Ronald Ghossein, Anuradha Gopalan, Douglas A. Levine, Victor Reuter, Samuel Singer, Bhuvanesh Singh Ministry of Health of Vietnam: Nguyen Viet Tien Molecular Response: Thomas Broudy, Cyrus Mirsaidi, Praveen Nair Nancy N. and J.C. Lewis Cancer & Research Pavilion at St. Joseph's/Candler: Paul Drwiega, Judy Miller, Jennifer Smith, Howard Zaren National Cancer Center Korea: Joong-Won Park National Cancer Hospital of Vietnam: Nguyen Phi Hung National Cancer Institute: Electron Kebebew, W. Marston Linehan, Adam R. Metwalli, Karel Pacak, Peter A. Pinto, Mark Schiffman, Laura S. Schmidt, Cathy D. Vocke, Nicolas Wentzensen, Robert Worrell, Hannah Yang Norfolk & Norwich University Hospital: Marc Moncrieff NYU Langone Medical Center: Chandra Goparaju, Jonathan Melamed, Harvey Pass Oncology Institute: Natalia Botnariuc, Irina Caraman, Mircea Cernat, Inga Chemencedji, Adrian Clipca, Serghei Doruc, Ghenadie Gorincioi, Sergiu Mura, Maria Pirtac, Irina Stancul, Diana Tcaciuc Ontario Tumour Bank: Monique Albert, Iakovina Alexopoulou, Angel Arnaout, John Bartlett, Jay Engel, Sebastien Gilbert, Jeremy Parfitt, Harman Sekhon

35 Oregon Health & Science University: George Thomas Papworth Hospital NHS Foundation Trust: Doris M. Rassl, Robert C. Rintoul Providence Health and Services: Carlo Bifulco, Raina Tamakawa, Walter Urba QIMR Berghofer Medical Research Institute: Nicholas Hayward Radboud Medical University Center: Henri Timmers Regina Elena National Cancer Institute: Anna Antenucci, Francesco Facciolo, Gianluca Grazi, Mirella Marino, Roberta Merola Reinier de Graaf Hospital: Ronald de Krijger René Descartes University: Anne-Paule Gimenez-Roqueplo Research Center of Chus Sherbrooke, Québec: Alain Piché Research Institute of the McGill University Health Centre: Simone Chevalier, Ginette McKercher Rockefeller University: Kivanc Birsoy Rose Ella Burkhardt Brain Tumor and Neuro-Oncology Center: Gene Barnett, Cathy Brewer, Carol Farver, Theresa Naska, Nathan A. Pennell, Daniel Raymond, Cathy Schilero, Kathy Smolenski, Felicia Williams Roswell Park Cancer Institute: Carl Morrison Rush University: Jeffrey A. Borgia, Michael J. Liptay, Mark Pool, Christopher W. Seder Saarland University: Kerstin Junker Sage Bionetworks: Larsson Omberg

36 Saint-Petersburg City Clinical Oncology Hospital: Mikhail Dinkin, George Manikhas Sapienza University of Rome: Domenico Alvaro, Maria Consiglia Bragazzi, Vincenzo Cardinale, Guido Carpino, Eugenio Gaudio Spectrum Health: David Chesla, Sandra Cottingham St. Petersburg Academic University RAS: Michael Dubina, Fedor Moiseenko Stanford University: Renumathy Dhanasekaran Technical University of Munich: Karl-Friedrich Becker, Klaus-Peter Janssen, Julia Slotta-Huspenina The International Genomics Consortium: Daniel Crain, Erin Curley, Johanna Gardner, David Mallery, Scott Morris, Joseph Paulauskis, Robert Penny, Candace Shelton, Troy Shelton, Eric Thompson The Ohio State University: Mohamed H. Abdel-Rahman, Dina Aziz, Sue Bell, Colleen M. Cebulla, Amy Davis, Rebecca Duell, J. Bradley Elder, Joe Hilty, Bahavna Kumar, James Lang, Norman L. Lehman, Randy Mandt, Phuong Nguyen, Robert Pilarski, Karan Rai, Lynn Schoenfield, Kelly Senecal, Paul Wakely The Oregon Clinic: Paul Hansen The Research Institute at Nationwide Children's Hospital: Nilsa Ramirez Tufts Medical Center: Ronald Lechan, James Powers, Arthur Tischler University of Alabama at Birmingham Medical Center: William E. Grizzle, Katherine C. Sexton UC Cancer Institute: Alison Kastl UCSF-Helen Diller Family Comprehensive Cancer Center: Joel Henderson, Sima Porten University Hospital of Giessen and Marburg: Jens Waldmann

University Hospital in Wurzburg, Germany: Martin Fassnacht
University Health Network: Sylvia L. Asa
University Hospital Essen: Dirk Schadendorf
University Hospitals Case Medical Center: Marta Couce
University Medical Center Hamburg-Eppendorf: Markus Graefen, Hartwig Huland, Guido Sauter, Thorsten Schlomm, Ronald Simon, Pierre Tennstedt
University of Abuja Teaching Hospital: Oluwole Olabode
University of Arizona: Mark Nelson
University of Calgary: Oliver Bathe
University of California: Peter R. Carroll, June M. Chan, Philip Disaia, Pat Glenn, Robin K. Kelley, Charles N. Landen, Joanna Phillips, Michael Prados, Jeff Simko, Jeffry Simko, Karen Smith-McCune, Scott VandenBerg
University of Chicago Medicine: Kevin Roggin
University of Cincinnati: Ashley Fehrenbach, Ady Kendler
University of Cincinnati Cancer Institute: Suzanne Sifri, Ruth Steele
University of Colorado Cancer Center: Antonio Jimeno
University of Dundee: Francis Carey, Ian Forgie
University of Florence: Massimo Mannelli
University of Hawaii Cancer Center: Michael Carney, Brenda Hernandez

University of Heidelberg: Benito Campos, Christel Herold-Mende, Christin Jungk, Andreas Unterberg, Andreas von Deimling
University of Iowa Hospital & Clinics: Aaron Bossler, Joseph Galbraith, Laura Jacobus, Michael Knudson, Tina Knutson, Deqin Ma, Mohammed Milhem, Rita Sigmund
University of Kansas Medical Center: Andrew K. Godwin, Rashna Madan, Howard G. Rosenthal
University of Maryland School of Medicine: Clement Adebamowo, Sally N. Adebamowo
University of Melbourne: Alex Boussioutas
University of Michigan: David Beer, Thomas Giordano
University of Montreal: Anne-Marie Mes-Masson, Fred Saad
University of New Mexico: Therese Bocklage
University of Oklahoma: Lisa Landrum, Robert Mannel, Kathleen Moore, Katherine Moxley, Russel Postier, Joan Walker, Rosemary Zuna
University of Pennsylvania: Michael Feldman, Federico Valdivieso
University of Pittsburgh: Rajiv Dhir, James Luketich
University of Puerto Rico: Edna M. Mora Pinero, Mario Quintero-Aguilo
University of São Paulo: Carlos Gilberto Carlotti Junior, Jose Sebastião Dos Santos, Rafael Kemp, Ajith Sankarankuty, Daniela Tirapelli
University of Sheffield Western Bank: James Catto
University of Washington: Kathy Agnew, Elizabeth Swisher

University of Western Australia: Jenette Creaney, Bruce Robinson
University of Wisconsin School of Medicine and Public Health: Carl Simon Shelley
University of Kansas Cancer Center: Eryn M. Godwin, Sara Kendall, Cassaundra Shipman
University of Michigan: Carol Bradford, Thomas Carey, Andrea Haddad, Jeffey Moyer, Lisa Peterson, Mark Prince, Laura Rozek, Gregory Wolf
UQ Thoracic Research Centre: Rayleen Bowman, Kwun M. Fong, Ian Yang
Valley Health System: Robert Korst
Vanderbilt University Medical Center: W. Kimryn Rathmell
Walter Reed National Medical Center: J. Leigh Fantacone-Campbell, Jeffrey A. Hooke, Albert J. Kovatich, Craig D. Shriver
Washington University: John DiPersio, Bettina Drake, Ramaswamy Govindan, Sharon Heath, Timothy Ley, Brian Van Tine, Peter Westervelt
Weill Cornell Medical College: Mark A. Rubin
Yonsei University College of Medicine: Jung Il Lee
Institution Not Provided: Natália D. Aredes, Armaz Mariamidze

Institution Addresses

Australian Prostate Cancer Research Center, Epworth Hospital, VIC, Australia
Australian Prostate Cancer Research Center, Epworth Hospital, VIC, Australia
Barretos Cancer Hospital, Av: Antenor Duarte Villela, 1331, Barretos, São Paulo, Brazil
Barrow Neurological Institute, St. Joseph's Hospital and Medical Center, Phoenix, Arizona
Barrow Neurological Institute, St. Joseph's Hospital and Medical Center, Phoenix, Arizona 85013
Baylor College of Medicine One Baylor Plaza, Houston, TX
BC Cancer Agency, 675 W 10th Ave, Vancouver, BC V5Z 1L3, Canada
Beth Israel Deaconess Medical Center Harvard University Medical School Boston Mass
BioreclamationIVT, 99 Talbot Blvd Chestertown, MD
Boston Medical Center, Boston MA
Botkin Hospital, 2-y Botkinskiy pr-d, 5, Moskva, Russia
Brain Tumor and Neuro-oncology Center, Department of Neurosurgery, University Hospitals Case Medical Center, Case Western Reserve School of Medicine, Euclid Ave, Cleveland, Ohio
Brain Tumor Center at the University of Cincinnati Gardner Neuroscience Institute, and Department of Neurosurgery, University of Cincinnati College of Medicine, and Mayfield Clinic, 260 Stetson Street, Suite 2200, Cincinnati, Ohio
Brain Tumor Center at the University of Cincinnati Neuroscience Institute, and Department of Neurosurgery, University of Cincinnati College of Medicine, and Mayfield Clinic, 234 Goodman Street, Cincinnati, Ohio
Brigham and Women's Hospital, 75 Francis St, Boston MA
Capital Biosciences, Inc., 900 Clopper Rd, Suite 120, Gaithersburg, MD
Case Comprehensive Cancer Center, Euclid Ave - Wearn 152, Cleveland, OH
Cedars-Sinai Medical Center, 8700 Beverly Boulevard, Suite 290 West MOT, Los Angeles, CA
Center for Liver Cancer, National Cancer Center Korea, 323 Ilsan-ro, Ilsan dong-gu, Goyang, Gyeonggi 10408, South Korea
Central Arkansas Veterans Healthcare System, Little Rock, AR
CHU of Quebec, Laval University Research Center of Chus 2705, boul. Laurier Bureau TR72 QUÉBEC, Quebec G1V 4G2
Centura Health 9100 E Mineral Cir, Centennial, CO
Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA
Charles University, Czech Republic
CHU of Quebec, Hôtel-Dieu de Quebec-University Laval, 11 cote du palais, Quebec City, G1R 2J6
CHUM, Montreal, Qc, Canada
Clinic of Urology and Pediatric Urology, Saarland University, Homburg, Germany
Clinical Breast Care Project, Murtha Cancer Center, Uniformed Services University / Walter Reed National Military Medical Center, Bethesda, MD
Comprehensive Cancer Center Tissue Procurement Shared Resource, Cooperative Human Tissue Network Midwestern Division, Dept. of Pathology, Human Tissue Resource Network, The Ohio State University, 410 West 10th Ave, Doan Hall, Room E413A, Columbus, OH
Cureline, Inc., 290 Utah Ave, Ste 300, South San Francisco, CA 94080, USA
Dana-Farber Cancer Institute, 450 Brookline Ave, Boston MA
Dardinger Neuro-Oncology Center, Department of Neurosurgery, James Comprehensive Cancer Center and The Ohio State University Medical Center, 320 W 10th Ave, Columbus, Ohio
Department of Cardiovascular and Thoracic Surgery. Suite 774 Professional Office Building W. Harrison St., Chicago, IL
Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore MD
Department of Genetics & Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY
Department of Hematology and Medical Oncology, Mayo Clinic Arizona, 5779 E. Mayo Blvd, Phoenix AZ
Department of Medicine, University of Wisconsin School of Medicine and Public Health, 1685 Highland Avenue, Madison, WI 53705

Department of Medicine, Washington University in St. Louis, 660 S. Euclid Ave., CB 8066, St. Louis, MO
Department of Medicine, Yonsei University College of Medicine, Seoul, Republic of Korea
Department of Neurological Surgery Department of Neurosurgery, Emory University School of Medicine, 1365 Clifton Road, NE, Atlanta, GA
Department of Obstetrics and Gynecology, Baylor College of Medicine, One Baylor Plaza, Houston, Texas
Department of Obstetrics/Gynecology and Reproductive Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Place, New York, NY
Department of Orthopedic Surgery, University of Kansas Medical Center 3901 Rainbow Boulevard, Kansas City, KS
Department of Pathology and Cell Biology, Columbia University, New York, NY10032
Department of Pathology and Immunology, Baylor College of Medicine, One Baylor Plaza, Houston, TX
Department of Pathology and Laboratory Medicine, University of Kansas Medical Center, Kansas City, KS
Department of Pathology, Department of Cell and Molecular Medicine. 570 Jelke South center, 1750 W. Harrison St., Chicago, IL
Department of Pathology, Duke University School of Medicine, Durham, NC
Department of Pathology, Spectrum Health, 35 Michigan NE, Grand Rapids, MI
Department of Pathology, The Ohio State University School of Medicine, N308 Doan Hall, 410 W 10th Ave, Columbus, OH
Department of Pathology, The Ohio State University Wexner Medical Center (Doan Hall N337B, 410 West 10th Ave., Columbus, OH 43210)
Department of Pathology. 570 Jelke South center, 1750 W. Harrison St., Chicago, IL
Department of Surgery and Anatomy, Ribeirão Preto Medical School - FMRP, University of São Paulo, Brazil
Department of Surgery and Cancer, Imperial College London, Du Cane Road London W12 0NN, UK
Department of Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
Department of Surgery, Columbia University, New York, NY
Department of Surgery, University of Michigan, Ann Arbor MI
Department of Urology and Pediatric Urology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nuremberg, Erlangen, Germany
Department of Urology, Mayo Clinic Arizona, 5779 E. Mayo Blvd, Phoenix AZ
Departments of Neurosurgery and Hematology and Medical Oncology, School of Medicine and Winship Cancer Institute, 1365C Clifton Rd. N.E., Emory University, Atlanta, GA
Departments of Pathology & Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd--Unit 85, Houston, Texas, USA
Dept. of Pathology & Laboratory Medicine, University of Cincinnati, UC Health University Hospital, 234 Goodman Street, Cincinnati, OH
Dept. of Pathology, Robert J. Tomsich Pathology & Laboratory Medicine Institute, Lerner Research Inst, Cleveland Clinic Foundation, Cleveland, OH
Dept. of Surgery, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str. 22, Munich, Germany
Dignity Health Mercy Gilbert Medical Center 3555 S Val Vista Dr, Gilbert, AZ
Division Molecular Urology, Department of Urology and Pediatric Urology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nuremberg, Erlangen, Germany
Division of Cancer Epidemiology and Genetics, National Cancer Institute, 9609 Medical Center Dr. Bethesda USA
Division of Neurosurgical Research, Dpt. Neurosurgery, University of Heidelberg, INF 400, Heidelberg, Germany
Division of Surgical Oncology, Department of Surgery, Brigham and Women's Hospital, 75 Francis Street, Boston, MA
Dpt. Neuropathology, University of Heidelberg, INF 224, Heidelberg, Germany
Dpt. Neurosurgery, University of Heidelberg, INF 400, Heidelberg, Germany
Duke University Duke University Medical Center 177 MSRB Box 3156 Durham, NC 27710

Duke University Medical Center, Gynecologic Oncology, Box 3079, Durham, NC USA
Emory University, 1365 Clifton Road, NE Atlanta GA
Erasmus MC, Wytemaweg 80, 3015 CN, Rotterdam, The Netherlands
Erasmus Medical Center Erasmus University Medical Center Rotterdam, Cancer Institute, Wytemaweg 80, 3015CN, Rotterdam, the Netherlands
The Foundation of the Carlo Besta Neurological Institute, IRCCS via Celoria 11
Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA
Greater Poland Cancer Center, Garbary 15, Poznań Poland
Greenville Health System Institute for Translational Oncology Research 900 West Faris Road Greenville SC
Harvard University Cambridge, MA
Havener Eye Institute, The Ohio State University Wexner Medical Center 915 Olentangy River Rd, Columbus, OH
Henry Ford Hospital 2799 West Grand Blvd Detroit MI USA
Hepatobiliary Surgery Unit, A. Gemelli Hospital, Catholic University of the Sacred Heart, Largo Agostino Gemelli 8, Rome, Italy
Hermelin Brain Tumor Center, Henry Ford Health System, 2799 W Grand Blvd, Detroit, MI
Hospices Civils de Lyon, CARDIOBIOTEC, Lyon F-69677, France
Hospital Clinic, Villarroel 180, Barcelona, Spain
Hue Central Hospital, Hue, Vietnam
Human Tissue Resource Network, Dept. of Pathology, College of Medicine, 1615 Polaris Innovation Ctr, 2001 Polaris, Columbus
Huntsman Cancer Institute, Univ. of Utah, 2000 Circle of Hope, Salt Lake City, UT
Indivumed GmbH, Hamburg, Germany
René Descartes University, Hospital Européen Georges Pompidou, 20 rue Leblanc, 75015, Paris, France
Curie Institute, 26 rue Ulm, Paris, France
Gustave Roussy Institute of Oncology, 39 Rue Camille Desmoulins 94805, Villejuif, France
Institute of Human Virology Nigeria, Abuja, Nigeria
Institute of Pathology, Technical University of Munich, Trogerstr. 18, Munich, Germany
Institute of Pathology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nuremberg, Erlangen, Germany
Institute of Urgent Medicine, Republic of Moldova
Regina Elena National Cancer Institute Irccs - Ifo, Via Elio Chianesi 53, 00144, Rome, Italy
John Wayne Cancer Institute, 2200 Santa Monica Blvd, Santa Monica, CA
Keimyung University, Daegu, South Korea
Knight Comprehensive Cancer Institute, Oregon Health & Science University
Ludwig Maximilians University Munich, Ziemssenstrasse 1, D-80336, Munich, Germany
Maine Medical Center, 22 Bramhall St., Portland, ME
Martini-Clinic, Prostate Cancer Center, University Medical Center Hamburg-Eppendorf, Martinistr. 52, D Hamburg, Germany
Massachusetts General Hospital 55 Fruit Street Boston Ma
Mayo Clinic 5777 E Mayo Blvd, Phoenix, Arizona
Mayo Clinic 4500 San Pablo Road Jacksonville, FL
Mayo Clinic, 200 First St. SW, Rochester, MN
Mayo Clinic, Rochester, MN
McGill University Health Center Decarie Blvd, Montreal, QC, Canada H4A 3J1
MD Anderson Cancer Center 1515 Holcombe Blvd. Unit 0085 Houston, TX
MD Anderson Cancer Center, Department of Pathology, Unit 085; 1515 MD Anderson Cancer Center Life Science Plaza Building 2130 W. Holcombe Blvd, Unit 2951 Houston, TX Office: LSP
Melanoma Institute Australia, North Sydney, NSW, Australia 2060
Memorial Sloan Kettering Cancer Center Department of Pathology, 1275 York Avenue, New York, NY
Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065

Memorial Sloan Kettering Cancer Center, Center for Molecular Oncology, 1275 York Avenue, New York, NY
Ministry of Health of Vietnam, Hanoi, Vietnam
Molecular Pathology Shared Resource of Herbert Irving Comprehensive Cancer Center of Columbia University, New York, NY10032
Molecular Response Torreyana Road San Diego, CA
Murtha Cancer Center, Uniformed Services University / Walter Reed National Military Medical Center, Bethesda, MD
Nancy N. and J.C. Lewis Cancer & Research Pavilion at St. Joseph's/Candler, 225 Candler Drive, Savannah, GA
National Cancer Hospital of Vietnam
National Cancer Institute, 31 Center Dr, Bethesda, MD
National Cancer Institute, Bethesda, MD
Norfolk & Norwich University Hospital, Norwich, UK. NR4 7UY
NYU Langone Medical Center, Cardiothoracic Surgery, 530 first Avenue, 9V, New York, NY
Oncology Institute, Republic of Moldova
Ontario Tumor Bank - Hamilton site, St. Joseph's Healthcare Hamilton, Hamilton, Ontario L8N 3Z5, Canada
Ontario Tumor Bank - Kingston site, Kingston General Hospital, Kingston, Ontario K7L 5H6, Canada
Ontario Tumor Bank Ottawa site, The Ottawa Hospital, Ottawa, Ontario K1H 8L6, Canada
Ontario Tumor Bank, London Health Sciences Centre, London, Ontario N6A 5A5, Canada
Ontario Tumor Bank, Ontario Institute for Cancer Research, Toronto, Ontario M5G 0A3, Canada
Orbital Oncology & Ophthalmic Plastic Surgery Department of Plastic Surgery M.D. Anderson Cancer Center 1515 Holcombe Blvd, Unit 1488 Houston, Texas
Papworth Hospital NHS Foundation Trust, UK
Pathology, St. Joseph's/Candler, 5353 Reynolds St., Savannah, GA
Professor, Division of Neuropathology, Department of Pathology, University Hospitals Case Medical Center
Program in Epidemiology, Fred Hutchinson Cancer Research Center, Seattle, WA
Providence Health and Services
QIMR Berghofer Medical Research Institute, Herston, QLD, Australia
Radboud Medical University Center, Geert Grooteplein-Zuid 10, Nijmegen, the Netherlands
Regina Elena National Cancer Institute, Rome, Italy
Reinier de Graaf Hospital, Reinier de Graafweg 5, 2625AD, Delft, the Netherlands
Research Institute of the McGill University Health Centre, McGill University, Montréal, Québec, Canada
Research Center Of Chus Sherbrooke, Québec aile 9, porte 6, e Avenue Nord, Sherbrooke, QC J1H 5N4, Canada
Rockefeller University 1230 York Ave New York, NY
Rose Ella Burkhardt Brain Tumor and Neuro-Oncology Center ND4-52A, Cleveland Clinic Foundation, 9500 Euclid Ave, Cleveland, OH
Rose Ella Burkhardt Brain Tumor and Neuro-Oncology Center, 9500 Euclid Avenue - CA51, Cleveland, OH
Rose Ella Burkhardt Brain Tumor and Neuro-Oncology Center, Department of Neurosurgery, Neurological and Taussig Cancer Institute, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, Ohio
Roswell Park Cancer Institute. Elm & Carlton Streets, Buffalo NY
Sage Bionetworks, Seattle, WA
Saint-Petersburg City Clinical Oncology Hospital, 56 Veteranov prospect, Saint-Petersburg, Russia
Sapienza University of Rome, Piazzale Aldo Moro 5, Rome, Italy
School of Medicine, National Center for Asbestos Related Research, University of Western Australia, Nedlands, WA, Australia 6009
Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, 3050, Victoria, Australia
St. Petersburg Academic University RAS, 8/3 Khlopin Str., St. Petersburg, Russia
Stanford University, Palo Alto, CA, USA
Stephenson Cancer Center, University of Oklahoma, Oklahoma City, OK USA
Tayside Tissue Bank, University of Dundee, Scotland UK DD1 9SY
The International Genomics Consortium, 445 N. 5th Street, Phoenix, Arizona
The Ohio State University, Columbus, OH
The Ohio State University Comprehensive Cancer Center, 320 W 10th Avenue, Columbus, OH 43210

The Ohio State University Wexner Medical Center (2012 Kenny Rd, Columbus, OH 43221)
The Oregon Clinic 1111 NE 99th Ave, Portland, OR
The Prince Charles Hospital, UQ Thoracic Research Centre, Australia 4032
The Research Institute at Nationwide Children's Hospital 700 Children's Drive Columbus Ohio
Tufts Medical center, 800 Washington St. Boston MA
UABMC 401 Beacon Pkwy W Birmingham AL
UC Cancer Institute, 200 Albert Sabin Way, Suite 1012, Cincinnati, OH
UCSF-Helen Diller Family Comprehensive Cancer Center, th St., Mission Hall WS 6532 Box 3211, San Francisco, CA
University Hospital of Giessen and Marburg, Badingerstrasse 3, 35044, Marburg, Germany
University Hospital in Wurzburg, Germany, Oberdürrbacher Strasse 6, 97080, Würzburg, Germany
University Health Network, 200 Elizabeth Street, Toronto ON M5G 2C4 Canada
University Hospital Essen, University Duisburg-Essen, German Cancer Consortium, Hufelandstr. 55; Essen, Germany
University Medical Center Hamburg-Eppendorf, Martinistr. 52, D Hamburg, Germany
University of Abuja Teaching Hospital, Gwagalada, FCT, Nigeria
University of Arizona Tucson Arizona
University of Calgary, Departments of Surgery and Oncology, th St NW, Calgary, AB, T2N 4N2
University of California San Francisco, 2340 Sutter St Rm S 229, San Francisco CA
University of California, Irvine 333 City Boulevard West Suite 1400 Orange CA
University of Chicago Medicine 5841 S. Maryland Ave. Room G-216, MC 5094 Chicago, IL
University of Cincinnati Cancer Institute, Brain Tumor Clinical Trials, 200 Albert Sabin Way Suite 1012, Cincinnati, OH
University of Cincinnati Cancer Institute, Holmes Bldg., 200 Albert Sabin Way, Ste 1002, Cincinnati, OH
University of Colorado Cancer Center, Aurora, CO, 80111, USA
University of Dundee, Scotland UK DD1 9SY
University of Florence, Viale Pieraccini 6, Firenze, Italy
University of Hawaii Cancer Center
University of Iowa Hospital & Clinics, 200 Hawkins Drive, Clinical Trials-Data Management, PFP, Iowa City, IA
University of Iowa Hospital & Clinics, 200 Hawkins Drive, Hematology/Oncology, C32 GH, Iowa City, IA
University of Iowa Hospital & Clinics, 200 Hawkins Drive, ICTS-Informatics, 272 MRF, Iowa City, IA
University of Iowa Hospital & Clinics, 200 Hawkins Drive, Medicine Administration, 380 MRC, Iowa City, IA
University of Iowa Hospital & Clinics, 200 Hawkins Drive, Molecular Pathology, B606 GH, Iowa City, IA
University of Iowa Hospital & Clinics, 200 Hawkins Drive, Pathology, SW247 GH, Iowa City, IA
University of Kansas Cancer Center, 3901 Rainbow Blvd, Kansas City, KS
University of Kansas Medical Center Kansas City KS
University of Michigan 500 S State St, Ann Arbor, MI
University of Montreal 2900 Edouard Mont petit Blvd, Montreal, QC H3T 1J4, Canada
University of New Mexico Albuquerque, New Mexico
University of Pennsylvania Philadelphia, PA
University of Pittsburgh, Department of Cardiothoracic Surgery, 200 Lothrop St, Suite C-800, Pittsburgh, Pennsylvania
University of Pittsburgh, Department of Pathology, Pittsburgh, Pennsylvania
University of Sheffield Western Bank, Sheffield S10 2TN, UK
University of Washington Seattle, WA
UPR Comprehensive Cancer Center Biobank; University of Puerto Rico Comprehensive Cancer Center, Celso Barbosa St. Medical Center Area, San Juan, PR
Urologic Oncology Branch, Center for Cancer Research, National Cancer Institute, Building 10, Room , Bethesda, MD
Valley Health System, 1 Valley Health Plaza, Paramus, NJ
Vanderbilt University Medical Center 1211 Medical Center Dr, Nashville, TN
Washington University School of Medicine, 600 S. Taylor Ave, St. Louis, MO 63110

Weill Cornell Medical College, New York, NY 10065

[Figure S1, panels A-D. Panel A plots amplification and deletion peaks against FDR Q-value; * marks non-gynecologic and ** marks gynecologic peaks. Panel B columns: BRCA, CESC, OV, UCEC, UCS, ACC, BLCA, CHOL, COAD, DLBC, ESCA, GBM, HNSC, KICH, KIRC, KIRP, LAML, LGG, LIHC, LUAD, LUSC, MESO, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, TGCT, THCA, THYM, UVM. Panels C and D plot number of occurrences against number of enriched genes.]

Panel A peak labels as extracted, first group: 1p34.2 [3]**; 4q12 PDGFRA [1]*; 5q35.3 FLT4 [45]*; 6p22.3 SOX4, E2F3 [4]*; 6p21.1 VEGFA [2]*; 6q21 [3]**; 7p22.1 [4]*; 7p11.2 EGFR [1]*; 7q21.1 CDK6 [6]*; 7q36.3 [21]**; 9p13.3 [4]*; 10p15.1 [27]**; 10q22.2 KAT6B [8]**; 12q14.1 CDK4 [5]*; 14q13.3 NKX2-1, NKX2-8 [6]*; 18q11.2 GATA6 [1]*; 19p13.12 BRD4 [2]**.

Panel A peak labels as extracted, second group: p22.3 NUDT1 [4]**; 7q31.1 [2]*; 7q36.3 [11]*; 9p23 PTPRD [1]*; 12q23.1 [2]**; 12q24.33 POLE [16]*; 16p13.3 RBFOX1 [1]*; 16q24.2 [23]*; 17p13.1 TP53 [2]*; 17p12 MAP2K4 [3]**; 17q11.2 NF1 [5]**; 21q22.3 TMPRSS2 [1]*.

Figure S1. (Related to Figure 1) (A) Significant GISTIC2.0 SCNA amplification and deletion peaks unique to either Pan-Gyn or non-gyn cancers plotted across the genome. Numbers in brackets are the numbers of genes located within the peak; genes listed are the suspected target oncogenes or tumor suppressors in the peak (if identifiable). (B) Clustered heat map showing 197 genes that are differentially methylated between the Pan-Gyn and non-gyn tumor types. The genes (rows) are clustered; the tumor types (columns) are not. The five Pan-Gyn tumor types are in the five left-most columns of the heat map. (C) Bootstrapping based analysis where five non-gyn cancer types (from 28 types) were selected at random 10,000 times and each time the number of mutated genes enriched in the cohort was computed (Fisher exact test with FDR-adjusted p < 0.01). The distribution across 10,000 samples is plotted. Also plotted (red line) is the number of mutated genes (23) enriched in the Pan-Gyn cohort, and a p value based on the proportion of iterations that had the same number of or more genes than Pan-Gyn (p = 0.10). Median of number of genes in the null distribution is 6. (D) Same analysis as in (C) using copy-number altered, instead of mutated, genes. The p value is < for Pan-Gyn tumor types with 122 genes altered, whereas the median number of genes altered in the null distribution is 2.
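
The resampling described for panels (C) and (D) amounts to an empirical p value: the share of random five-type cohorts whose enriched-gene count is at least as large as the Pan-Gyn count. The following is a minimal Python sketch of that calculation, not the authors' code. The function names, the toy per-type scores, and the choice to draw tumor types without replacement are assumptions made for illustration; a real analysis would replace `toy_count` with the Fisher exact test and FDR filter named in the caption.

```python
import random


def empirical_p_value(null_counts, observed):
    """Proportion of null iterations whose count is >= the observed count."""
    hits = sum(1 for c in null_counts if c >= observed)
    return hits / len(null_counts)


def bootstrap_null(types, k, n_iter, count_enriched, seed=0):
    """Draw k tumor types at random n_iter times and score each draw."""
    rng = random.Random(seed)
    return [count_enriched(rng.sample(types, k)) for _ in range(n_iter)]


# Hypothetical stand-in for the enrichment test: each made-up tumor type
# carries an arbitrary score, and a cohort's count is the sum of its scores.
toy_scores = {f"T{i}": i % 7 for i in range(28)}


def toy_count(cohort):
    return sum(toy_scores[t] for t in cohort)


# 10,000 random cohorts of five types drawn from 28, as in the caption.
null = bootstrap_null(list(toy_scores), k=5, n_iter=10_000, count_enriched=toy_count)
p = empirical_p_value(null, observed=23)
print(f"empirical p = {p:.4f}")
```

Because the null distribution is built from a finite number of draws, the smallest non-zero p value this approach can report is one over the number of iterations.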

Figure S2. (Related to Figure 2) (A) Residual sum of squares (RSS) and Explained Variance vs. number of mutation signatures. The plots show how the RSS and Explained Variance (shown as a percentage on the y-axis) vary with respect to the number of signatures. (B) The 10 mutation signatures that explain 90% of the variance are shown in the same order as in A. (C) Pan-Gyn sample distribution within the mutational clusters with corresponding clinical and pathological features. Each row represents a feature and the color scale represents the proportion of samples.

Table S2. (Related to Figure 2) Numbers of types of mutations and their percentages present in the Pan-Gyn cohort.

Genes                          21,109
Missense mutations             621,875 (50%)
Nonsense mutations             56,780 (4.6%)
Deletions                      32,358 (2.6%)
Insertions                     10,297 (0.8%)
Silent mutations               233,337 (19%)
Non-exonic region mutations    291,065 (23%)
Total                          1,245,712 (100%)
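
The percentages in Table S2 can be checked against the counts with a few lines of Python. This is only an arithmetic sketch using the counts transcribed from the table; the variable and function names are chosen here for illustration.

```python
# Mutation counts transcribed from Table S2 (the gene count is not a mutation class).
counts = {
    "Missense": 621_875,
    "Nonsense": 56_780,
    "Deletions": 32_358,
    "Insertions": 10_297,
    "Silent": 233_337,
    "Non-exonic": 291_065,
}


def percentages(table):
    """Return the total and each category's share of it, in percent."""
    total = sum(table.values())
    return total, {name: 100.0 * n / total for name, n in table.items()}


total, shares = percentages(counts)
print(f"Total: {total:,}")
for name, share in shares.items():
    print(f"{name:<12}{share:6.1f}%")
```

The six classes sum to the stated total of 1,245,712, and each share rounds to the percentage given in the table.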

[Figure S3, panels A-E. Annotation tracks: tumor type, BRCA PAM50 subtype, UCEC histology, CESC pathology, number of segments, aneuploidy scores 1 and 2, ABSOLUTE purity, ploidy, genome doublings, advanced stage, high grade, MSI status, HPV status, tobacco history, POLE status, and alteration status of BRCA1, BRCA2, TP53, PTEN, ERBB2, CDKN2A, MLH1, MSH2, MSH3, and MSH6; silencing of RAD51C, BRCA1, and MLH1; variants of TP53, PTEN, and MLH1; Pan-Gyn cluster assignments (k=2 and k=7). Consensus cluster stability values of 0.83 and 0.90 appear in the panel.]

Figure S3. (Related to Figure 3) (A) Various clinical and genomic features visualized alongside the copy number cluster groups. (B) Cancer-specific DNA methylation heterogeneity within the Pan-Gyn cohort. Unsupervised and dichotomized clustering on cancer-specific autosomal loci is shown with multiple column annotations horizontally listed above and beneath the heat maps (with normal heat map plotted on the left and tumor heat map on the right). Color codes for column annotations are vertically listed on the right. Clusters are assigned labels 1-7 from left to right. DNA methylation level (beta values) ranges from 0 to 1, with 0 representing low methylation (blue) and 1 high methylation (red). (C) Consensus clustering stability of the left branch (low degree of hypermethylation, black) and right branch (high degree of hypermethylation, brown) was evaluated by randomly sub-sampling 80% of Pan-Gyn samples repeatedly. (D) Robustness assessment of k=7 cluster assignments. (E) Consensus clustering assignments of the Pan-Gyn samples across the whole Pan-Can cohort by their Pan-Gyn cluster membership. Samples colored white are non-gyn samples.

[Figure S4, panels A-D. Panel B scale: row-scaled log10(RPM+1); clusters C1-C7 with a purity track. Panel C categories: OV; BRCA non-basal lobular; BRCA non-basal ductal; BRCA non-basal mixed/other; BRCA basal; UCEC endometrioid; UCEC serous; UCEC mixed; UCS; CESC adenocarcinoma; CESC squamous. Panel D rows: PTEN altered (%), TP53 altered (%).]

Figure S4. (Related to Figure 5) (A) Clustered heat map of 1,967 samples (columns) across 216 proteins (rows) using the protein expression data showing five clusters. Covariate bars above the heat map show association with UCS and UCEC histologic types, breast cancer subtypes, and purity. Proteins whose expressions are high (red) or low (blue) in various subtypes are highlighted. (B) Unsupervised hierarchical clustering of miRNA data across 2,417 Pan-Gyn tumor samples (columns) yields seven clusters. A normalized abundance heat map of the 293 most-variable miRNA 5p or 3p strands (rows) is shown. The scale bar shows row-scaled log10(RPM+1) abundances. (C) Distribution of cancer subtypes across the seven clusters. (D) Percent of cases with altered PTEN and TP53 in clusters C4 and C5, for UCEC endometrioid, UCEC serous, UCS and CESC adenocarcinomas.
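
The heat map scale in panel (B) is described as row-scaled log10(RPM+1). A small Python sketch of one way to compute such values is given below. It assumes that RPM means reads per million total reads in a sample and that "row-scaled" means z-scoring each miRNA across samples; neither detail is spelled out in the caption, and the function names and toy counts are hypothetical.

```python
import math
import statistics


def log10_rpm(counts):
    """counts[i][j] = reads of miRNA i in sample j; returns log10(RPM + 1)."""
    totals = [sum(col) for col in zip(*counts)]  # total reads per sample
    return [
        [math.log10(c * 1e6 / t + 1) for c, t in zip(row, totals)]
        for row in counts
    ]


def row_scale(matrix):
    """Z-score each row across samples; constant rows become all zeros."""
    scaled = []
    for row in matrix:
        mean = statistics.fmean(row)
        sd = statistics.pstdev(row)
        scaled.append([(x - mean) / sd if sd else 0.0 for x in row])
    return scaled


# Toy read counts: two miRNAs (rows) in three samples (columns).
toy_counts = [
    [10, 90, 50],
    [990, 910, 950],
]
scaled = row_scale(log10_rpm(toy_counts))
for row in scaled:
    print([round(x, 3) for x in row])
```

Scaling by row puts every miRNA on a comparable color range, so the heat map shows relative differences between samples and not absolute abundance.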

Figure S5. (Related to Figure 5) Circos plot showing a subset of the gene sets identified through GSEA on TERC targets from the Pan-Gyn datasets. Spearman correlation values were calculated for TERC targets identified in this study, as well as up- and downstream TERC neighboring genes in UCEC, OV, BRCA, and CESC. Gene sets associated with reported TERC functions and having significant FDR values in all tissue types are displayed. For each gene set, all of the leading-edge genes (i.e., genes contributing to the gene set enrichment) shared by at least three of the cancer types in Pan-Gyn are shown.

[Circos plot not reproduced. Gene sets labeled in the figure: REACTOME_TELOMERE_MAINTENANCE, BILD_CTNNB1_ONCOGENIC_SIGNATURE, DANG_MYC_TARGETS_UP, DAIRKEE_TERT_TARGETS_UP, GABRIELY_MIR21_TARGETS. Tracks: ncRNA/mRNA correlation, gene set FDR, shared leading-edge genes. Datasets (all TCGA): uterine corpus endometrial carcinoma (tumor, serous subgroup); uterine corpus endometrial carcinoma (tumor, endometrioid subgroup); breast invasive carcinoma (tumor, luminal A subgroup); breast invasive carcinoma (tumor, luminal B subgroup); ovarian serous cystadenocarcinoma (tumor); cervical squamous cell carcinoma and endocervical adenocarcinoma (tumor). Individual leading-edge gene labels cannot be assigned to their gene sets from the extracted text and are omitted.]
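The Spearman correlation used in Figure S5 to relate TERC expression to candidate target genes can be sketched as below. This is a generic textbook formulation (Pearson correlation of average ranks, which handles ties), not code from the study; the function names and the expression values are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation applied to the ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values for TERC and one candidate target in 5 tumors.
terc_expr = [1.2, 3.4, 2.2, 5.0, 4.1]
target_expr = [10.0, 40.0, 20.0, 80.0, 50.0]
rho = spearman(terc_expr, target_expr)
```

Because only the ordering of values matters, the coefficient is insensitive to monotone transforms such as log scaling of the expression data.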

Supplementary Figure 1. Copy Number Alterations TP53 Mutation Type. C-class TP53 WT. TP53 mut. Nature Genetics: doi: /ng.

Supplementary Figure 1. Copy Number Alterations TP53 Mutation Type. C-class TP53 WT. TP53 mut. Nature Genetics: doi: /ng. Supplementary Figure a Copy Number Alterations in M-class b TP53 Mutation Type Recurrent Copy Number Alterations 8 6 4 2 TP53 WT TP53 mut TP53-mutated samples (%) 7 6 5 4 3 2 Missense Truncating M-class

More information

Plasma-Seq conducted with blood from male individuals without cancer.

Plasma-Seq conducted with blood from male individuals without cancer. Supplementary Figures Supplementary Figure 1 Plasma-Seq conducted with blood from male individuals without cancer. Copy number patterns established from plasma samples of male individuals without cancer

More information

The Cancer Genome Atlas Pan-cancer analysis Katherine A. Hoadley

The Cancer Genome Atlas Pan-cancer analysis Katherine A. Hoadley The Cancer Genome Atlas Pan-cancer analysis Katherine A. Hoadley Department of Genetics Lineberger Comprehensive Cancer Center The University of North Carolina at Chapel Hill What is TCGA? The Cancer Genome

More information

Expanded View Figures

Expanded View Figures Molecular Systems iology Tumor CNs reflect metabolic selection Nicholas Graham et al Expanded View Figures Human primary tumors CN CN characterization by unsupervised PC Human Signature Human Signature

More information

Exploring TCGA Pan-Cancer Data at the UCSC Cancer Genomics Browser

Exploring TCGA Pan-Cancer Data at the UCSC Cancer Genomics Browser Exploring TCGA Pan-Cancer Data at the UCSC Cancer Genomics Browser Melissa S. Cline 1*, Brian Craft 1, Teresa Swatloski 1, Mary Goldman 1, Singer Ma 1, David Haussler 1, Jingchun Zhu 1 1 Center for Biomolecular

More information

Supplementary Figure 1: LUMP Leukocytes unmethylabon to infer tumor purity

Supplementary Figure 1: LUMP Leukocytes unmethylabon to infer tumor purity Supplementary Figure 1: LUMP Leukocytes unmethylabon to infer tumor purity A Consistently unmethylated sites (30%) in 21 cancer types 174,696

More information

The 16th KJC Bioinformatics Symposium Integrative analysis identifies potential DNA methylation biomarkers for pan-cancer diagnosis and prognosis

The 16th KJC Bioinformatics Symposium Integrative analysis identifies potential DNA methylation biomarkers for pan-cancer diagnosis and prognosis The 16th KJC Bioinformatics Symposium Integrative analysis identifies potential DNA methylation biomarkers for pan-cancer diagnosis and prognosis Tieliu Shi tlshi@bio.ecnu.edu.cn The Center for bioinformatics

More information

Machine-Learning on Prediction of Inherited Genomic Susceptibility for 20 Major Cancers

Machine-Learning on Prediction of Inherited Genomic Susceptibility for 20 Major Cancers Machine-Learning on Prediction of Inherited Genomic Susceptibility for 20 Major Cancers Sung-Hou Kim University of California Berkeley, CA Global Bio Conference 2017 MFDS, Seoul, Korea June 28, 2017 Cancer

More information

Genomic and Functional Approaches to Understanding Cancer Aneuploidy

Genomic and Functional Approaches to Understanding Cancer Aneuploidy Article Genomic and Functional Approaches to Understanding Cancer Aneuploidy Graphical Abstract Cancer-Type Specific Aneuploidy Patterns in TCGA Samples CRISPR Transfection and Selection Immortalized Cell

More information

Supplemental Information. Integrated Genomic Analysis of the Ubiquitin. Pathway across Cancer Types

Supplemental Information. Integrated Genomic Analysis of the Ubiquitin. Pathway across Cancer Types Cell Reports, Volume 23 Supplemental Information Integrated Genomic Analysis of the Ubiquitin Pathway across Zhongqi Ge, Jake S. Leighton, Yumeng Wang, Xinxin Peng, Zhongyuan Chen, Hu Chen, Yutong Sun,

More information

Clinical Grade Genomic Profiling: The Time Has Come

Clinical Grade Genomic Profiling: The Time Has Come Clinical Grade Genomic Profiling: The Time Has Come Gary Palmer, MD, JD, MBA, MPH Senior Vice President, Medical Affairs Foundation Medicine, Inc. Oct. 22, 2013 1 Why We Are Here A Shared Vision At Foundation

More information

User s Manual Version 1.0

User s Manual Version 1.0 User s Manual Version 1.0 #639 Longmian Avenue, Jiangning District, Nanjing,211198,P.R.China. http://tcoa.cpu.edu.cn/ Contact us at xiaosheng.wang@cpu.edu.cn for technical issue and questions Catalogue

More information

Identification of Tissue Independent Cancer Driver Genes

Identification of Tissue Independent Cancer Driver Genes Identification of Tissue Independent Cancer Driver Genes Alexandros Manolakos, Idoia Ochoa, Kartik Venkat Supervisor: Olivier Gevaert Abstract Identification of genomic patterns in tumors is an important

More information

Relationship between genomic features and distributions of RS1 and RS3 rearrangements in breast cancer genomes.

Relationship between genomic features and distributions of RS1 and RS3 rearrangements in breast cancer genomes. Supplementary Figure 1 Relationship between genomic features and distributions of RS1 and RS3 rearrangements in breast cancer genomes. (a,b) Values of coefficients associated with genomic features, separately

More information

Distinct cellular functional profiles in pan-cancer expression analysis of cancers with alterations in oncogenes c-myc and n-myc

Distinct cellular functional profiles in pan-cancer expression analysis of cancers with alterations in oncogenes c-myc and n-myc Honors Theses Biology Spring 2018 Distinct cellular functional profiles in pan-cancer expression analysis of cancers with alterations in oncogenes c-myc and n-myc Anne B. Richardson Whitman College Penrose

More information

Ahrim Youn 1,2, Kyung In Kim 2, Raul Rabadan 3,4, Benjamin Tycko 5, Yufeng Shen 3,4,6 and Shuang Wang 1*

Ahrim Youn 1,2, Kyung In Kim 2, Raul Rabadan 3,4, Benjamin Tycko 5, Yufeng Shen 3,4,6 and Shuang Wang 1* Youn et al. BMC Medical Genomics (2018) 11:98 https://doi.org/10.1186/s12920-018-0425-z RESEARCH ARTICLE Open Access A pan-cancer analysis of driver gene mutations, DNA methylation and gene expressions

More information

Nicholas Borcherding, Nicholas L. Bormann, Andrew Voigt, Weizhou Zhang 1-4

Nicholas Borcherding, Nicholas L. Bormann, Andrew Voigt, Weizhou Zhang 1-4 SOFTWARE TOOL ARTICLE TRGAted: A web tool for survival analysis using protein data in the Cancer Genome Atlas. [version 1; referees: 1 approved] Nicholas Borcherding, Nicholas L. Bormann, Andrew Voigt,

More information

Nature Getetics: doi: /ng.3471

Nature Getetics: doi: /ng.3471 Supplementary Figure 1 Summary of exome sequencing data. ( a ) Exome tumor normal sample sizes for bladder cancer (BLCA), breast cancer (BRCA), carcinoid (CARC), chronic lymphocytic leukemia (CLLX), colorectal

More information

Nature Genetics: doi: /ng Supplementary Figure 1. Workflow of CDR3 sequence assembly from RNA-seq data.

Nature Genetics: doi: /ng Supplementary Figure 1. Workflow of CDR3 sequence assembly from RNA-seq data. Supplementary Figure 1 Workflow of CDR3 sequence assembly from RNA-seq data. Paired-end short-read RNA-seq data were mapped to human reference genome hg19, and unmapped reads in the TCR regions were extracted

More information

Identification of Potential Therapeutic Targets by Molecular and Genomic Profiling of 628 Cases of Uterine Serous Carcinoma

Identification of Potential Therapeutic Targets by Molecular and Genomic Profiling of 628 Cases of Uterine Serous Carcinoma Identification of Potential Therapeutic Targets by Molecular and Genomic Profiling of 628 Cases of Uterine Serous Carcinoma Nathaniel L Jones 1, Joanne Xiu 2, Sandeep K. Reddy 2, Ana I. Tergas 1, William

More information

Molecular Subtyping of Endometrial Cancer: A ProMisE ing Change

Molecular Subtyping of Endometrial Cancer: A ProMisE ing Change Molecular Subtyping of Endometrial Cancer: A ProMisE ing Change Charles Matthew Quick, M.D. Associate Professor of Pathology Director of Gynecologic Pathology University of Arkansas for Medical Sciences

More information

Conditional Selection of Genomic Alterations Dictates Cancer Evolution and Oncogenic Dependencies

Conditional Selection of Genomic Alterations Dictates Cancer Evolution and Oncogenic Dependencies Article Conditional Selection of Genomic Alterations Dictates Cancer Evolution and Oncogenic Dependencies Graphical Abstract Authors Marco Mina, Franck Raynaud, Daniele Tavernari,..., Nikolaus Schultz,

More information

Targeted Agent and Profiling Utilization Registry (TAPUR ) Study. February 2018

Targeted Agent and Profiling Utilization Registry (TAPUR ) Study. February 2018 Targeted Agent and Profiling Utilization Registry (TAPUR ) Study February 2018 Precision Medicine Therapies designed to target the molecular alteration that aids cancer development 30 TARGET gene alterations

More information

Nature Medicine: doi: /nm.3967

Nature Medicine: doi: /nm.3967 Supplementary Figure 1. Network clustering. (a) Clustering performance as a function of inflation factor. The grey curve shows the median weighted Silhouette widths for varying inflation factors (f [1.6,

More information

Biology of cancer development in the GI tract

Biology of cancer development in the GI tract 1 Genesis and progression of GI cancer a genetic disease Colorectal cancer Fearon and Vogelstein proposed a genetic model to explain the stepwise formation of colorectal cancer (CRC) from normal colonic

More information

Transcriptional Profiles from Paired Normal Samples Offer Complementary Information on Cancer Patient Survival -- Evidence from TCGA Pan-Cancer Data

Transcriptional Profiles from Paired Normal Samples Offer Complementary Information on Cancer Patient Survival -- Evidence from TCGA Pan-Cancer Data Transcriptional Profiles from Paired Normal Samples Offer Complementary Information on Cancer Patient Survival -- Evidence from TCGA Pan-Cancer Data Supplementary Materials Xiu Huang, David Stern, and

More information

Computer Science, Biology, and Biomedical Informatics (CoSBBI) Outline. Molecular Biology of Cancer AND. Goals/Expectations. David Boone 7/1/2015

Computer Science, Biology, and Biomedical Informatics (CoSBBI) Outline. Molecular Biology of Cancer AND. Goals/Expectations. David Boone 7/1/2015 Goals/Expectations Computer Science, Biology, and Biomedical (CoSBBI) We want to excite you about the world of computer science, biology, and biomedical informatics. Experience what it is like to be a

More information

Pan-cancer Alterations of the MYC Oncogene and Its Proximal Network across the Cancer Genome Atlas

Pan-cancer Alterations of the MYC Oncogene and Its Proximal Network across the Cancer Genome Atlas Article Pan-cancer Alterations of the MYC Oncogene and Its Proximal Network across the Cancer Genome Atlas Graphical Abstract Authors Franz X. Schaub, Varsha Dhankani, Ashton C. Berger,..., Brady Bernard,

More information

SUPPLEMENTARY APPENDIX

SUPPLEMENTARY APPENDIX SUPPLEMENTARY APPENDIX 1) Supplemental Figure 1. Histopathologic Characteristics of the Tumors in the Discovery Cohort 2) Supplemental Figure 2. Incorporation of Normal Epidermal Melanocytic Signature

More information

Genetic alterations of histone lysine methyltransferases and their significance in breast cancer

Genetic alterations of histone lysine methyltransferases and their significance in breast cancer Genetic alterations of histone lysine methyltransferases and their significance in breast cancer Supplementary Materials and Methods Phylogenetic tree of the HMT superfamily The phylogeny outlined in the

More information

Ovarian Cancers: Evolving Paradigms in Research and Care Report Release Wednesday, March 2, 2016

Ovarian Cancers: Evolving Paradigms in Research and Care Report Release Wednesday, March 2, 2016 Ovarian Cancers: Evolving Paradigms in Research and Care Report Release Wednesday, March 2, 2016 Committee Membership Jerome F. Strauss, III, (Chair), Virginia Commonwealth University School of Medicine

More information

LncRNA TUSC7 affects malignant tumor prognosis by regulating protein ubiquitination: a genome-wide analysis from 10,237 pancancer

LncRNA TUSC7 affects malignant tumor prognosis by regulating protein ubiquitination: a genome-wide analysis from 10,237 pancancer Original Article LncRNA TUSC7 affects malignant tumor prognosis by regulating protein ubiquitination: a genome-wide analysis from 10,237 pancancer patients Xiaoshun Shi 1 *, Yusong Chen 2,3 *, Allen M.

More information

Nature Genetics: doi: /ng Supplementary Figure 1. SEER data for male and female cancer incidence from

Nature Genetics: doi: /ng Supplementary Figure 1. SEER data for male and female cancer incidence from Supplementary Figure 1 SEER data for male and female cancer incidence from 1975 2013. (a,b) Incidence rates of oral cavity and pharynx cancer (a) and leukemia (b) are plotted, grouped by males (blue),

More information

National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2009 Formula Grant

National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2009 Formula Grant National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2009 Formula Grant Reporting Period July 1, 2012 June 30, 2013 Formula Grant Overview The National Surgical

More information

HALLA KABAT * Outreach Program, mircore, 2929 Plymouth Rd. Ann Arbor, MI 48105, USA LEO TUNKLE *

HALLA KABAT * Outreach Program, mircore, 2929 Plymouth Rd. Ann Arbor, MI 48105, USA   LEO TUNKLE * CERNA SEARCH METHOD IDENTIFIED A MET-ACTIVATED SUBGROUP AMONG EGFR DNA AMPLIFIED LUNG ADENOCARCINOMA PATIENTS HALLA KABAT * Outreach Program, mircore, 2929 Plymouth Rd. Ann Arbor, MI 48105, USA Email:

More information

Nature Methods: doi: /nmeth.3115

Nature Methods: doi: /nmeth.3115 Supplementary Figure 1 Analysis of DNA methylation in a cancer cohort based on Infinium 450K data. RnBeads was used to rediscover a clinically distinct subgroup of glioblastoma patients characterized by

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION doi:10.1038/nature10866 a b 1 2 3 4 5 6 7 Match No Match 1 2 3 4 5 6 7 Turcan et al. Supplementary Fig.1 Concepts mapping H3K27 targets in EF CBX8 targets in EF H3K27 targets in ES SUZ12 targets in ES

More information

Supplementary Figure 1: High-throughput profiling of survival after exposure to - radiation. (a) Cells were plated in at least 7 wells in a 384-well

Supplementary Figure 1: High-throughput profiling of survival after exposure to - radiation. (a) Cells were plated in at least 7 wells in a 384-well Supplementary Figure 1: High-throughput profiling of survival after exposure to - radiation. (a) Cells were plated in at least 7 wells in a 384-well plate at cell densities ranging from 25-225 cells in

More information

LinkedOmics. A Web-based platform for analyzing cancer-associated multi-dimensional data. Manual. First edition 4 April 2017 Updated on July 3, 2017

LinkedOmics. A Web-based platform for analyzing cancer-associated multi-dimensional data. Manual. First edition 4 April 2017 Updated on July 3, 2017 LinkedOmics A Web-based platform for analyzing cancer-associated multi-dimensional data Manual First edition 4 April 2017 Updated on July 3, 2017 LinkedOmics is a publicly available portal (http://linkedomics.org/)

More information

MolEcular Taxonomy of BReast cancer International Consortium (METABRIC)

MolEcular Taxonomy of BReast cancer International Consortium (METABRIC) PERSPECTIVE 1 LARGE SCALE DATASET EXAMPLES MolEcular Taxonomy of BReast cancer International Consortium (METABRIC) BC Cancer Agency, Vancouver Samuel Aparicio, PhD FRCPath Nan and Lorraine Robertson Chair

More information

S1 Appendix: Figs A G and Table A. b Normal Generalized Fraction 0.075

S1 Appendix: Figs A G and Table A. b Normal Generalized Fraction 0.075 Aiello & Alter (216) PLoS One vol. 11 no. 1 e164546 S1 Appendix A-1 S1 Appendix: Figs A G and Table A a Tumor Generalized Fraction b Normal Generalized Fraction.25.5.75.25.5.75 1 53 4 59 2 58 8 57 3 48

More information

DOES THE BRCAX GENE EXIST? FUTURE OUTLOOK

DOES THE BRCAX GENE EXIST? FUTURE OUTLOOK CHAPTER 6 DOES THE BRCAX GENE EXIST? FUTURE OUTLOOK Genetic research aimed at the identification of new breast cancer susceptibility genes is at an interesting crossroad. On the one hand, the existence

More information

Supplementary Figures

Supplementary Figures Supplementary Figures Supplementary Figure 1. Pan-cancer analysis of global and local DNA methylation variation a) Variations in global DNA methylation are shown as measured by averaging the genome-wide

More information

10/15/2012. Biologic Subtypes of TNBC. Topics. Topics. Histopathology Molecular pathology Clinical relevance

10/15/2012. Biologic Subtypes of TNBC. Topics. Topics. Histopathology Molecular pathology Clinical relevance Biologic Subtypes of TNBC Andrea L. Richardson M.D. Ph.D. Brigham and Women s Hospital Dana-Farber Cancer Institute Harvard Medical School Boston, MA Topics Histopathology Molecular pathology Clinical

More information

Fluxion Biosciences and Swift Biosciences Somatic variant detection from liquid biopsy samples using targeted NGS

Fluxion Biosciences and Swift Biosciences Somatic variant detection from liquid biopsy samples using targeted NGS APPLICATION NOTE Fluxion Biosciences and Swift Biosciences OVERVIEW This application note describes a robust method for detecting somatic mutations from liquid biopsy samples by combining circulating tumor

More information

Evaluation of BRCA1/2 and homologous recombination defects in ovarian cancer and impact on clinical outcomes

Evaluation of BRCA1/2 and homologous recombination defects in ovarian cancer and impact on clinical outcomes Evaluation of BRCA1/2 and homologous recombination defects in ovarian cancer and impact on clinical outcomes Melinda S. Yates, PhD Department of Gynecologic Oncology & Reproductive Medicine University

More information

Computational Investigation of Homologous Recombination DNA Repair Deficiency in Sporadic Breast Cancer

Computational Investigation of Homologous Recombination DNA Repair Deficiency in Sporadic Breast Cancer University of Massachusetts Medical School escholarship@umms Open Access Articles Open Access Publications by UMMS Authors 11-16-2017 Computational Investigation of Homologous Recombination DNA Repair

More information

The Tandem Duplicator Phenotype Is a Prevalent Genome-Wide Cancer Configuration Driven by Distinct Gene Mutations

The Tandem Duplicator Phenotype Is a Prevalent Genome-Wide Cancer Configuration Driven by Distinct Gene Mutations Article The Tandem Duplicator Phenotype Is a Prevalent Genome-Wide Cancer Configuration Driven by Distinct Gene Mutations Graphical Abstract Authors Francesca Menghi, Floris P. Barthel, Vinod Yadav,...,

More information

Biomarker development in the era of precision medicine. Bei Li, Interdisciplinary Technical Journal Club

Biomarker development in the era of precision medicine. Bei Li, Interdisciplinary Technical Journal Club Biomarker development in the era of precision medicine Bei Li, 23.08.2016 Interdisciplinary Technical Journal Club The top ten highest-grossing drugs in the United States help between 1 in 25 and 1 in

More information

Molecular classification of breast cancer implications for pathologists. Sarah E Pinder

Molecular classification of breast cancer implications for pathologists. Sarah E Pinder Molecular classification of breast cancer implications for pathologists Sarah E Pinder Courtesy of CW Elston Histological types Breast Cancer Special Types 17 morphological special types 25-30% of all

More information

Nature Medicine: doi: /nm.4439

Nature Medicine: doi: /nm.4439 Figure S1. Overview of the variant calling and verification process. This figure expands on Fig. 1c with details of verified variants identification in 547 additional validation samples. Somatic variants

More information

Biomarkers in Imunotherapy: RNA Signatures as predictive biomarker

Biomarkers in Imunotherapy: RNA Signatures as predictive biomarker Biomarkers in Imunotherapy: RNA Signatures as predictive biomarker Joan Carles, MD PhD Director GU, CNS and Sarcoma Program Department of Medical Oncology Vall d'hebron University Hospital Outline Introduction

More information

MSI positive MSI negative

MSI positive MSI negative Pritchard et al. 2014 Supplementary Figure 1 MSI positive MSI negative Hypermutated Median: 673 Average: 659.2 Non-Hypermutated Median: 37.5 Average: 43.6 Supplementary Figure 1: Somatic Mutation Burden

More information

Case presentation 04/13/2017. Genomic/morphological classification of endometrial carcinoma

Case presentation 04/13/2017. Genomic/morphological classification of endometrial carcinoma Genomic/morphological classification of endometrial carcinoma Robert A. Soslow, MD soslowr@mskcc.org architecture.about.com Case presentation 49 year old woman with vaginal bleeding Underwent endometrial

More information

Supplemental Information. Molecular, Pathological, Radiological, and Immune. Profiling of Non-brainstem Pediatric High-Grade

Supplemental Information. Molecular, Pathological, Radiological, and Immune. Profiling of Non-brainstem Pediatric High-Grade Cancer Cell, Volume 33 Supplemental Information Molecular, Pathological, Radiological, and Immune Profiling of Non-brainstem Pediatric High-Grade Glioma from the HERBY Phase II Randomized Trial Alan Mackay,

More information

Integration of Cancer Genome into GECCO- Genetics and Epidemiology of Colorectal Cancer Consortium

Integration of Cancer Genome into GECCO- Genetics and Epidemiology of Colorectal Cancer Consortium Integration of Cancer Genome into GECCO- Genetics and Epidemiology of Colorectal Cancer Consortium Ulrike Peters Fred Hutchinson Cancer Research Center University of Washington U01-CA137088-05, PI: Peters

More information

Figure S4. 15 Mets Whole Exome. 5 Primary Tumors Cancer Panel and WES. Next Generation Sequencing

Figure S4. 15 Mets Whole Exome. 5 Primary Tumors Cancer Panel and WES. Next Generation Sequencing Figure S4 Next Generation Sequencing 15 Mets Whole Exome 5 Primary Tumors Cancer Panel and WES Get coverage of all variant loci for all three Mets Variant Filtering Sequence Alignments Index and align

More information

Nature Genetics: doi: /ng Supplementary Figure 1. Mutational signatures in BCC compared to melanoma.

Nature Genetics: doi: /ng Supplementary Figure 1. Mutational signatures in BCC compared to melanoma. Supplementary Figure 1 Mutational signatures in BCC compared to melanoma. (a) The effect of transcription-coupled repair as a function of gene expression in BCC. Tumor type specific gene expression levels

More information

Genomic tests to personalize therapy of metastatic breast cancers. Fabrice ANDRE Gustave Roussy Villejuif, France

Genomic tests to personalize therapy of metastatic breast cancers. Fabrice ANDRE Gustave Roussy Villejuif, France Genomic tests to personalize therapy of metastatic breast cancers Fabrice ANDRE Gustave Roussy Villejuif, France Future application of genomics: Understand the biology at the individual scale Patients

More information

Session 4 Rebecca Poulos

Session 4 Rebecca Poulos The Cancer Genome Atlas (TCGA) & International Cancer Genome Consortium (ICGC) Session 4 Rebecca Poulos Prince of Wales Clinical School Introductory bioinformatics for human genomics workshop, UNSW 20

More information

National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2009 Formula Grant

National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2009 Formula Grant National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2009 Formula Grant Reporting Period July 1, 2011 June 30, 2012 Formula Grant Overview The National Surgical

More information

SUPPLEMENTARY FIGURES: Supplementary Figure 1

SUPPLEMENTARY FIGURES: Supplementary Figure 1 SUPPLEMENTARY FIGURES: Supplementary Figure 1 Supplementary Figure 1. Glioblastoma 5hmC quantified by paired BS and oxbs treated DNA hybridized to Infinium DNA methylation arrays. Workflow depicts analytic

More information

Genomic Medicine: What every pathologist needs to know

Genomic Medicine: What every pathologist needs to know Genomic Medicine: What every pathologist needs to know Stephen P. Ethier, Ph.D. Professor, Department of Pathology and Laboratory Medicine, MUSC Director, MUSC Center for Genomic Medicine Genomics and

More information

OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies

OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies 2017 Contents Datasets... 2 Protein-protein interaction dataset... 2 Set of known PPIs... 3 Domain-domain interactions...

More information

Solving Problems of Clustering and Classification of Cancer Diseases Based on DNA Methylation Data 1,2

Solving Problems of Clustering and Classification of Cancer Diseases Based on DNA Methylation Data 1,2 APPLIED PROBLEMS Solving Problems of Clustering and Classification of Cancer Diseases Based on DNA Methylation Data 1,2 A. N. Polovinkin a, I. B. Krylov a, P. N. Druzhkov a, M. V. Ivanchenko a, I. B. Meyerov

More information

Supplementary Figure 1: Comparison of acgh-based and expression-based CNA analysis of tumors from breast cancer GEMMs.

Supplementary Figure 1: Comparison of acgh-based and expression-based CNA analysis of tumors from breast cancer GEMMs. Supplementary Figure 1: Comparison of acgh-based and expression-based CNA analysis of tumors from breast cancer GEMMs. (a) CNA analysis of expression microarray data obtained from 15 tumors in the SV40Tag

More information

Clustered mutations of oncogenes and tumor suppressors.

Clustered mutations of oncogenes and tumor suppressors. Supplementary Figure 1 Clustered mutations of oncogenes and tumor suppressors. For each oncogene (red dots) and tumor suppressor (blue dots), the number of mutations found in an intramolecular cluster

More information

Improving quality of care for patients with ovarian and endometrial cancer Eggink, Florine

Improving quality of care for patients with ovarian and endometrial cancer Eggink, Florine University of Groningen Improving quality of care for patients with ovarian and endometrial cancer Eggink, Florine IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if

More information

Summary... 2 TRANSLATIONAL RESEARCH Tumour gene expression used to direct clinical decision-making for patients with advanced cancers...

Summary... 2 TRANSLATIONAL RESEARCH Tumour gene expression used to direct clinical decision-making for patients with advanced cancers... ESMO 2016 Congress 7-11 October, 2016 Copenhagen, Denmark Table of Contents Summary... 2 TRANSLATIONAL RESEARCH... 3 Tumour gene expression used to direct clinical decision-making for patients with advanced

More information

Session 4 Rebecca Poulos

Session 4 Rebecca Poulos The Cancer Genome Atlas (TCGA) & International Cancer Genome Consortium (ICGC) Session 4 Rebecca Poulos Prince of Wales Clinical School Introductory bioinformatics for human genomics workshop, UNSW 28

More information

Secuenciación masiva: papel en la toma de decisiones

Secuenciación masiva: papel en la toma de decisiones Secuenciación masiva: papel en la toma de decisiones Cancer is a Genetic Disease Development of cancer is driven by the acquisition of somatic genetic alterations: Nonsynonymous point mutations: missense.

More information

The Avatar System TM Yields Biologically Relevant Results

The Avatar System TM Yields Biologically Relevant Results Application Note The Avatar System TM Yields Biologically Relevant Results Liquid biopsies stand to revolutionize the cancer field, enabling early detection and noninvasive monitoring of tumors. In the

More information

SureSelect Cancer All-In-One Custom and Catalog NGS Assays

SureSelect Cancer All-In-One Custom and Catalog NGS Assays SureSelect Cancer All-In-One Custom and Catalog NGS Assays Detect all cancer-relevant variants in a single SureSelect assay SNV Indel TL SNV Indel TL Single DNA input Single AIO assay Single data analysis

More information

Jennifer Hauenstein Oncology Cytogenetics Emory University Hospital Atlanta, GA

Jennifer Hauenstein Oncology Cytogenetics Emory University Hospital Atlanta, GA Comparison of Genomic Coverage using Affymetrix OncoScan Array and Illumina TruSight Tumor 170 NGS Panel for Detection of Copy Number Abnormalities in Clinical GBM Specimens Jennifer Hauenstein Oncology

More information

Development of Carcinoma Pathways

Development of Carcinoma Pathways The Construction of Genetic Pathway to Colorectal Cancer Moriah Wright, MD Clinical Fellow in Colorectal Surgery Creighton University School of Medicine Management of Colon and Diseases February 23, 2019

More information

Looking Beyond the Standard-of- Care : The Clinical Trial Option

Looking Beyond the Standard-of- Care : The Clinical Trial Option 1 Looking Beyond the Standard-of- Care : The Clinical Trial Option Terry Mamounas, M.D., M.P.H., F.A.C.S. Medical Director, Comprehensive Breast Program UF Health Cancer Center at Orlando Health Professor

More information

Cancer develops as a result of the accumulation of somatic

Cancer develops as a result of the accumulation of somatic Cancer-mutation network and the number and specificity of driver mutations Jaime Iranzo a,1, Iñigo Martincorena b, and Eugene V. Koonin a,1 a National Center for Biotechnology Information, National Library

More information

Comprehensive analyses of tumor immunity: implications for cancer immunotherapy

Comprehensive analyses of tumor immunity: implications for cancer immunotherapy Comprehensive analyses of tumor immunity: implications for cancer immunotherapy The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters.

More information

Next Generation Sequencing in Clinical Practice: Impact on Therapeutic Decision Making

Next Generation Sequencing in Clinical Practice: Impact on Therapeutic Decision Making Next Generation Sequencing in Clinical Practice: Impact on Therapeutic Decision Making November 20, 2014 Capturing Value in Next Generation Sequencing Symposium Douglas Johnson MD, MSCI Vanderbilt-Ingram

More information

Only Estrogen receptor positive is not enough to predict the prognosis of breast cancer

Only Estrogen receptor positive is not enough to predict the prognosis of breast cancer Young Investigator Award, Global Breast Cancer Conference 2018 Only Estrogen receptor positive is not enough to predict the prognosis of breast cancer ㅑ Running head: Revisiting estrogen positive tumors

Dr Yvonne Wallis Consultant Clinical Scientist West Midlands Regional Genetics Laboratory

Personalised Therapy/Precision Medicine: Selection of a therapeutic drug based on the presence or absence of a specific

Neoadjuvant treatment: Residual disease as a marker of resistance

Carlos L. Arteaga, MD, Vanderbilt-Ingram Cancer Center, Vanderbilt University. Neoadjuvant (preoperative) therapy Surgery Systemic

Comparative DNA methylome analysis of endometrial carcinoma reveals complex and distinct deregulation of cancer promoters and enhancers

Zhang et al. BMC Genomics 2014, 15:868. Research article, Open Access.

Clinically Useful Next Generation Sequencing and Molecular Testing in Gliomas MacLean P. Nasrallah, MD PhD

MacLean P. Nasrallah, MD PhD, Neuropathology Fellow, Division of Neuropathology, Center for Personalized Diagnosis (CPD). Glial

PREPARED FOR: U.S. Army Medical Research and Materiel Command Fort Detrick, Maryland

AWARD NUMBER: W81XWH-13-1-0421. TITLE: The Fanconi Anemia BRCA Pathway as a Predictor of Benefit from Bevacizumab in a Large Phase III Clinical Trial in Ovarian Cancer. PRINCIPAL INVESTIGATOR: Elizabeth

IPATIMUP Translational Research Unit: Matching IPATIMUP's Expertise with Company's Research/Clinical Strategy

Innovative Projects, Questions & Needs. André Albergaria. INDUSTRY: Pharma R&D, Clinical Research

What is the status of the technologies of "precision medicine"?

Session 2. Gideon Blumenthal, MD, Clinical Team Leader, Thoracic and Head/Neck Oncology, Center for Drug Evaluation and Research (CDER), U.S.

Integrated Molecular Characterization of Uterine Carcinosarcoma

Article. Highlights: Uterine carcinosarcomas (UCSs) have frequent TP53 mutations; UCSs demonstrate a strong and varied degree of epithelial-mesenchymal

TCGA. The Cancer Genome Atlas

TCGA: History and Goal. History: Started in 2005 by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) with $110 million to catalogue

Supplementary. properties of. network types. randomly sampled. subsets (75%

Supplementary Information. Gene co-expression network analysis reveals common system-level prognostic genes across cancer types. Supplementary Figure 1: The robustness and overlap of prognostic

Broad H3K4me3 is associated with increased transcription elongation and enhancer activity at tumor suppressor genes

Kaifu Chen, Zhong Chen, Dayong Wu, Lili Zhang, Xueqiu Lin,

September 20, Submitted electronically to: Cc: To Whom It May Concern:

September 20, 2012. Re: Request for Information (RFI): Building a National Resource to Study Myelodysplastic Syndromes (MDS): The MDS Cohort Natural History Study (NOT-HL-12-147).

ENDOMETRIAL CARCINOMA: SPECIAL & NOT SO SPECIAL VARIANTS

Pacific Northwest Society of Pathologists, Vancouver, B.C., September 26, 2015. Teri A. Longacre, M.D., longacre@stanford.edu, Stanford University, Stanford,

Clonal evolution of human cancers

Pathology-based microdissection and genetic analysis precisely demonstrates molecular evolution of neoplastic clones. Hiroaki Fujii, MD, Ageo Medical Laboratories, Yashio

Identification and clinical detection of genetic alterations of pre-neoplastic lesions Time for the PML ome? David Sidransky MD Johns Hopkins

David Sidransky, MD, Johns Hopkins. February 3-5, 2016, Lansdowne Resort, Leesburg, VA. Molecular

Breast Cancer: ASCO Poster Review

Carmen Criscitiello, MD, PhD, Istituto Europeo di Oncologia, Milano. HER2+ subtype: research questions in early HER2+ BC. De-escalation of toxicity without compromising efficacy

Predictive biomarker profiling of > 1,900 sarcomas: Identification of potential novel treatment modalities

Sujana Movva, Wenhsiang Wen, Wangjuh Chen, Sherri Z. Millis, Margaret von Mehren, Zoran

SUPPLEMENTARY INFORMATION

doi:10.1038/nature15260. Supplementary Data 1: Gene expression in individual basal/stem, luminal, and luminal progenitor cells. Box plots show expression levels for each gene from the 49-gene differentiation

Nature Genetics: doi: /ng Supplementary Figure 1. Sample selection procedure and TL ratio across cancer.

(a) Flowchart of sample selection. After exclusion of unsuitable samples, 18,430 samples remained. Tumor and normal samples
