Nature Medicine: doi: /nm.4439

Figure S1. Overview of the variant calling and verification process. This figure expands on Fig. 1c with details of verified variant identification in 547 additional validation samples. Somatic variants (SNVs, indels, focal and chromosome-arm-level CNVs, and fusion products) were first called in 197 diagnostic samples with remission DNA (for germline) using a Complete Genomics custom Whole Genome Sequencing (WGS) variant calling pipeline. Complete Genomics calls were optimized at the start of the TARGET project using 100 independently verified variants in WGS samples. Matched tumor and remission samples in 153 cases were used for somatic variant calling by both WGS and targeted capture sequencing (TCS) of genes recurrently impacted in the WGS samples. 72% of WGS SNVs and 76% of WGS indels were confirmed by TCS (red and green text in figures). For focal copy number (CN) alterations spanning fewer than 7 genes, 75% of recurrent WGS deletion/loss calls and 85% of gain/amplification calls matched recurrent alterations discovered by SNP6 arrays in 96 matching samples. For chromosomal junctions, we integrated WGS, clinical and RNA-seq data by majority vote, and confirmed 89% of WGS calls. An additional 29 samples from the WGS discovery cohort were verified by TCS of diagnostic cases only, as part of 146 tumors without matched remission (see top portion of the figure). The remainder of these 146 cases were not used for discovery or validation purposes; rather, we simply identified recurrence of variants that were observed and verified in other samples.

Figure S2. Cellular processes and pathways commonly impacted in pediatric AML. The height of each bar indicates the percentage of samples with verified fusions (green), SNVs/indels (grey), or focal CNVs (gold) in recurrently impacted genes within 684 pediatric AML samples. See Table S2b for a list of the impacted genes.

Figure S3. Data type overlap for TARGET and TCGA diagnostic samples. UpSet plots (http://www.caleydo.org/tools/upset/) showing the set overlaps for whole genome sequencing (WGS), whole exome (WXS), mRNA sequencing, DNA methylation arrays (CpGmeth), miRNA sequencing and targeted capture sequencing (TCS) in the TARGET and TCGA cohorts. The numbers of assays analyzed for each type are indicated by the horizontal bar graphs, and the number in each set intersection is shown in the vertical bar graphs. The Clinical category includes samples comprising the entire TARGET AML dataset, including those in TARGET AML subprojects (e.g. previously reported WXS analysis 6). Data from these samples are included in the chromosomal arm level and karyotype based assessments of copy loss and fusions. (a) All TARGET AML project samples available. (b) All TCGA samples used for comparisons to TARGET. (c) Assay type overlaps for TCGA and TARGET data combined. Panel titles: (a) TARGET AML assay overlap (n=1023; clinical annotations include ISCN karyotype); (b) TCGA assay overlap (n=177); (c) combined TARGET and TCGA assay data overlap (n=1200).

Figure S4. Clonality estimates are consistent by age across cohorts. Both TCGA and TARGET AML cohorts contain affected individuals between the ages of 15 and 39 (adolescent and young adult, or AYA). Mutational and karyotypic clonality were assessed in AYA patients with whole-genome or whole-exome sequencing from either cohort, resulting in estimates from 40 TARGET AML subjects and 22 TCGA AML subjects in this age group. No significant association between cohort and mutational clonality estimate (p = 0.79613, Fisher's exact test) or karyotypic clonality (p = 0.180302, Fisher's exact test) is observed (TCGA AYA cases are older and more likely to have normal karyotype, though not significantly so). A multivariate Poisson model similarly shows little evidence for a significant cohort-wise effect. The strongest predictor of (decreasing) mutational clonality in AYAs is age at diagnosis (p = 0.28).
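As an illustration of the Fisher's exact test named in this legend, the following is a minimal sketch of a two-sided test on a 2x2 table using only the Python standard library. The group sizes (40 TARGET and 22 TCGA AYA subjects) come from the legend; the split of each group into clonality categories is hypothetical, chosen only to show the call, and the function name is an assumption rather than anything from the authors' analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of every table no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical split: rows are cohorts (TARGET n=40, TCGA n=22),
# columns are "2+ clones detected" vs "single clone detected".
p_value = fisher_exact_two_sided(28, 12, 15, 7)
print(f"two-sided p = {p_value:.3f}")
```

The "no more likely than observed" rule is one common definition of the two-sided p-value; other conventions exist and can give slightly different values.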

Figure S5. The context of genome-wide mutation burden in pediatric AML. The mutational burden of SNVs and indels is low in pediatric AML (blue), with a median of 10 mutations/case across the 197-sample WGS cohort. This places pediatric AML, along with other pediatric malignancies (rhabdoid tumor, Ewing sarcoma, medulloblastoma) and adult AML (red), among the least mutated of human cancers. Figure reproduced from the raw data reported by Lawrence and colleagues 51, updated to reflect TARGET AML results and plotted using the ggplot2 package in the R statistical environment.

Figure S6. A simplified visualization of common genomic variants in TARGET and TCGA AML data. Selected small variants are grouped by those that appear distinctive from core binding factor (CBF; t(8;21) and inv(16)) and KMT2A (aka MLL) fusions (grp1: mutations of WT1, NPM1, PTPN11, GATA2, CEBPA) and those that frequently co-occur with CBF alterations (grp2: mutations of KIT or ASXL2, loss of chr X). C, chromosomal alteration; J, junction/translocation; M, mutation; I, ITD. Panels: Pediatric and Adult; rows include CBF, KMT2A, FLT3, NRAS, KRAS, ZEB2, MBNL1, DNMT3A, IDH1, IDH2, TET2, TP53, grp1.var and grp2.var.

Figure S7. Adult-Pediatric mutational contrasts in AML. Lollipop plots generated with ProteinPaint (https://pecan.stjude.org/#/proteinpaint) highlight differences in frequency, type, and location of sequence variants in pediatric and adult AML. The plotted data reflect all somatic coding variants identified at presentation in 177 TCGA cases and 815 TARGET AML cases (WGS + TCS). Mutations are coded by functional class: blue, missense; brown, insertion; gray, deletion; red, frameshifting; orange, stop-gain; green, tandem duplication. Panels: (a) MYC and (b) GATA2 (TARGET); (c) KRAS, (d) FLT3, (e) NRAS and (f) KIT (TARGET and TCGA).

Figure S8. The impact of pediatric gene fusions on clinical outcome. (a) 199 patients evaluated for CBFA2T3-GLIS2 fusion had clinical outcome data available for analysis. Those with the fusion (n=9) had significantly worse overall survival than patients without the fusion (n=190) (p=0.0101). (b) 824 patients were evaluated for fusions involving ETS family transcription factors (ETV6, FUS, or ERG) through karyotype and/or transcriptome sequencing and had clinical outcome data available for analysis. Those with fusions (n=20) had significantly worse event-free survival than patients without a fusion (n=804) (p=0.0060). (c) 824 patients were evaluated for fusions involving KAT6A through karyotype and/or transcriptome sequencing and had clinical outcome data available for analysis. Those with fusions (n=8) had significantly worse overall survival than patients without a fusion (n=816) (p=0.0195). Differences in outcome were assessed by log-rank test. EFS, event free survival; OS, overall survival.

Figure S9. Pediatric CBL Exonic Deletions Detected by cDNA Fragment Length Analysis. Representative examples of CBL wild-type and deletion transcripts detected by capillary electrophoresis of cDNA. The horizontal axis depicts the size of the PCR fragment (bp), while the vertical axis indicates signal strength. The WT size (full-length transcript) is 685 bp, the product with deletion of exon 8 only is 563 bp, and the product with deletion of exons 8 and 9 is 354 bp.
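The fragment sizes in this legend lend themselves to a simple lookup. The sketch below assigns an observed peak size to the nearest expected product; the three expected sizes are taken from the legend, while the function name and the 3 bp tolerance are hypothetical choices made for illustration only.

```python
# Expected amplicon sizes (bp) as stated in the legend.
EXPECTED_SIZES = {
    "wild-type": 685,
    "exon 8 deletion": 563,
    "exon 8-9 deletion": 354,
}


def classify_fragment(size_bp, tolerance_bp=3):
    """Return the label of the nearest expected product, or None if none is close.

    tolerance_bp is an assumed sizing tolerance, not a value from the source.
    """
    label, expected = min(EXPECTED_SIZES.items(), key=lambda kv: abs(kv[1] - size_bp))
    return label if abs(expected - size_bp) <= tolerance_bp else None


# Length removed from the amplicon by each deletion (arithmetic on the legend's sizes).
exon8_loss_bp = EXPECTED_SIZES["wild-type"] - EXPECTED_SIZES["exon 8 deletion"]
exon9_loss_bp = EXPECTED_SIZES["exon 8 deletion"] - EXPECTED_SIZES["exon 8-9 deletion"]

for peak in (684, 563, 600):
    print(peak, classify_fragment(peak))
```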

Figure S10. Mutational frequency differences in key myeloid genes. (a) ECOG comparison 4. (b) TCGA comparison, balanced by cytogenetic subtypes (see online Methods). Error bars indicate the empirical SD from the resampling procedure. Panels compare TARGET with ECOG (a) and TARGET with TCGA (b).

Figure S11. Mutational co-occurrence in KMT2A rearranged childhood AML. We identified single copy segmental deletions of ZEB2 and/or MBNL1 in 14 patients, 6 of whom had concurrent KMT2A fusions (p=0.035, Fisher's exact test). The row entitled KMT2A (clinical) shows the manually curated classification of the tumor primary cytogenetic type by combining results from clinical, genomic and RNA-seq assays. By this measure, all samples are classified as belonging to the KMT2A fusion cytogenetic group. The row entitled KMT2A (WGS) shows KMT2A variants found by WGS alone. Note that 2 samples have copy number alterations as well as fusions impacting KMT2A. C, copy number alteration; J, junction/translocation; M, mutation; I, ITD. Rows shown: KMT2A (clinical), KMT2A (WGS), MLLT3, NRAS, FLT3, KRAS, MLLT10, MBNL1, TMEM14E, ZEB2.

Figure S12. Clonality at presentation in pediatric AML. (a) Mutation-based inference of clonality in 197 TARGET AML cases with WGS and 177 TCGA AML cases identifies 2 or more detectable clones in the majority of patients across age ranges. (b) A similar pattern with overall fewer detectable clones was observed by karyotypic inference of clonal relationships at presentation. Age groups: infants (age <3), children (age 3-15), AYA (age 15-40), adults (age >40). Axis labels: mutational clones detected at diagnosis (a); karyotypic clones detected at diagnosis (b).

Figure S13. Gene variants alone and in combination impact pediatric AML outcomes. (a) 963 patients from the TARGET dataset with clinical results for FLT3 internal tandem duplication (ITD), NPM1, WT1, and NUP98-NSD1 fusion had clinical outcome data for analysis. Patients with a combination of FLT3 ITD and WT1 or NUP98-NSD1 exhibit significantly decreased overall survival compared with those with FLT3 ITD alone or in combination with NPM1 mutation (p<0.001). Similar results were found for COG trial AAML0531 (b), COG trial CCG-2961 (c), and the Dutch Childhood Oncology Group (DCOG) (d). In each trial those with FLT3 ITD plus WT1 and/or NUP98-NSD1 fusion exhibit significantly worse overall survival. The numbers of patients in each subgroup, and the total number of evaluable patients per cohort, are given in the table below. ITD, FLT3-ITD.

Genotype group                    TARGET  AAML0531  CCG-2961  DCOG
ITD-                                 687       651       435   225
ITD- NPM1+                            37        41        41    14
ITD- WT1+                             56        43        27    14
ITD- NPM1+ WT1+                        7         5         2     0
ITD- WT1+ NUP98-NSD1+                  4         3         0     1
ITD- NUP98-NSD1+                       0         0         0     1
ITD+                                  72        67        17    28
ITD+ NPM1+                            27        28         8     9
ITD+ WT1+                             27        21        11     9
ITD+ NPM1+ WT1+                        7         3         2     0
ITD+ WT1+ NUP98-NSD1+                 17        12         4     4
ITD+ NUP98-NSD1+                      21        13         9     9
ITD- NPM1+ NUP98-NSD1+                 1         1         0     0
Total                                963       888       556   314
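The subgroup counts in the Figure S13 table can be tallied directly. The sketch below transcribes the counts from the legend, checks that each cohort's subgroups sum to its stated total, and computes the share of FLT3-ITD-positive patients per cohort. The variable names are illustrative, and the assignment of the flattened numbers to genotype columns follows the order of the extracted header, which is an assumption about the original table layout.

```python
GROUPS = [
    "ITD-", "ITD- NPM1+", "ITD- WT1+", "ITD- NPM1+ WT1+",
    "ITD- WT1+ NUP98-NSD1+", "ITD- NUP98-NSD1+",
    "ITD+", "ITD+ NPM1+", "ITD+ WT1+", "ITD+ NPM1+ WT1+",
    "ITD+ WT1+ NUP98-NSD1+", "ITD+ NUP98-NSD1+",
    "ITD- NPM1+ NUP98-NSD1+",
]

# Counts per cohort, in the same order as GROUPS (transcribed from the legend).
COUNTS = {
    "TARGET":   [687, 37, 56, 7, 4, 0, 72, 27, 27, 7, 17, 21, 1],
    "AAML0531": [651, 41, 43, 5, 3, 0, 67, 28, 21, 3, 12, 13, 1],
    "CCG-2961": [435, 41, 27, 2, 0, 0, 17, 8, 11, 2, 4, 9, 0],
    "DCOG":     [225, 14, 14, 0, 1, 1, 28, 9, 9, 0, 4, 9, 0],
}
STATED_TOTALS = {"TARGET": 963, "AAML0531": 888, "CCG-2961": 556, "DCOG": 314}

totals = {cohort: sum(values) for cohort, values in COUNTS.items()}
itd_positive = {
    cohort: sum(n for group, n in zip(GROUPS, values) if group.startswith("ITD+"))
    for cohort, values in COUNTS.items()
}

for cohort in COUNTS:
    share = 100 * itd_positive[cohort] / totals[cohort]
    print(f"{cohort}: total={totals[cohort]}, ITD+={itd_positive[cohort]} ({share:.1f}%)")
```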

Figure S14. Remission rates vary for pediatric AML with FLT3-ITD according to cooperating mutations. The CCG-2961, AAML0531 and DCOG cohorts were combined to compare complete remission (CR) rates after one cycle of induction therapy for groups with FLT3-ITD cooperating mutations, as shown. CR rates are consistent with the survival outcomes (Figs. 3c and S13) among these studies: the poorest outcome group, containing FLT3-ITD and a cooperating WT1 mutation and/or NUP98-NSD1 fusion, had the lowest CR rate, at 54.8%. The most favorable group, FLT3-ITD positive and NPM1 positive, had the highest CR rate, at 93.0% (groupwise p<0.0001, Kruskal-Wallis).

Figure S15. Novel ZEB2 and MBNL1 Deletions. (a, b) Short (<500 kbp) deletion segments along chromosome 2 (panel a, ZEB2) and chromosome 3 (panel b, MBNL1) in TARGET discovery cohort samples (n=197). (c) With the exception of one ZEB2-deleted sample (red point at top right of panel c), samples with ZEB2 and MBNL1 deletions are not impacted by large numbers of other CNVs.

Figure S16. Novel ELF1 focal deletions in the TARGET discovery cohort. (a) Genome browser view of segmental deletions covering the ELF1 locus. Patients (n=197) are in rows; blue bars indicate the length of the deletion in that genomic region. (b) Genomic deletions were confirmed in a secondary assay using the nCounter CNV assay (NanoString Technologies), with verification (boxed specimens with low probe signals, shown as green signals in the heatmap below) of all ELF1 deletions initially identified by WGS. (c) Expression values (RPKM) of ELF1 differ between those with the deletion and those with wild-type copy number (p=0.0077). (d) Unsupervised clustering of 63 differentially expressed genes (p<0.01) between patients with and without ELF1 deletion shows many genes are upregulated in the samples with ELF1 deletions. Orange labels on the y axis indicate patients with an ELF1 deletion.

Figure S17. Summary view of the key fusion classes in pediatric AML. Each colored region represents a fusion family. Descriptive labels are written adjacent to each family. The fusion partner genes for each family are indicated by their HGNC symbols. The lines connecting gene symbols indicate fusion partners. The thickness of each line reflects the frequency of the observed fusion.

Figure S18. Varying the age cutoff for infants (<3 years in Figure 4b) vs. children, to <2 or even <1, does not substantially alter conclusions about fusion prevalence. Panel c is the same as Fig. 4b (reproduced here for comparison). Panels a and b show how samples shift between age groups if the infant-child threshold is reduced to <1 year (a) or <2 years (b). Fusions are listed in the same order as in Fig. 4b and use the same color scheme. Age groups by panel: (a) infants <1, children 1-15; (b) infants <2, children 2-15; (c) infants <3, children 3-15; each panel also shows AYA and adults.

Figure S19. Co-occurring mutations with CEBPA. (a) Oncoprint (http://www.cbioportal.org) showing all TARGET samples with functionally validated CSF3R mutations 20. Green indicates samples with mutations. (b) CEBPA and GATA2 mutations combinatorially impact event-free survival (EFS). Panel b (normal-cytogenetics cases; percent survival vs. EFS in days): GATA2+, CEBPA- (N=7); CEBPA+, GATA2+ (N=16); CEBPA+, GATA2- (N=13); wild-type (N=143); P=0.0177.

Figure S20. Patterns of mutual co-occurrence and mutual exclusion among somatic pediatric AML variants. (a) Patterns of co-occurrence and (b) mutual exclusion among variants in the TARGET cohort were evaluated using CoMEt (see online Methods). Line thickness represents log(p-value) for the observed co-occurrence rates. Orange boxes indicate cytogenetic groups. Except for copy number alterations at the top right, which were only evaluated within the 197 samples with WGS, all other relations are among 684 samples with TCS. (c) An alternative derivation of conditional gene-gene relationships using a penalized Ising model yields similar conditional dependencies.

Figure S21. Anti-correlated DNA methylation and reduced transcription potential. By scanning 2000 bp upstream and 200 bp downstream of the transcription start site (TSS) for all known ENSEMBL isoforms of ~8000 expressed genes in AML, we fit segmented regression models of DNA methylation (X axis) against asinh (transcripts per million, TPM, Y axis) of each transcript or gene. Hyperbolic arcsine (asinh) is similar to log transformation but is defined at all points along the real number line. Since large batch effects confound the biological differences between TARGET pediatric AML and TCGA adult AML mRNA data, we took the within-cohort median expression for samples with 10% or less methylation at a CpG locus as the reference level. The silencing threshold at the locus corresponding to the gene of interest was then defined as the methylation fraction beyond which no sample in a cohort exceeded the median unmethylated expression level (from samples with <= 10% methylation) within its cohort. Any locus where healthy progenitors or myeloid cells showed >= 10% methylation was omitted from consideration. After these filtering steps, the most significantly associated locus (ideally correlated with r > 0.8 against its neighboring loci) was selected as a tag CpG for the downstream transcript(s). A tag CpG for HumanMethylation450 arrays and either the same locus or (if not present) the best surrogate locus for HumanMethylation27 arrays passing the filters was retained for silencing calls. If no suitable HumanMethylation27 locus could be found, only samples with HumanMethylation450 data were assayed for silencing of a given gene. This method identified 119 genes with recurrent silencing by promoter hypermethylation within the TARGET and TCGA datasets. Examples below include THRB and WDR35 (components of NMF signatures 2 and 13, respectively), CDKN2B, and ULBP1, ULBP2 and ULBP3 (NK ligands). The red line marks the empirically determined silencing threshold (% methylation).
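The silencing-threshold definition in this legend can be sketched for a single locus and cohort as follows. This is a simplified reading of the description, not the authors' pipeline: it skips the segmented regression, the healthy-tissue filter and the tag-CpG selection, and the function name, input layout and toy values are all hypothetical.

```python
from math import asinh
from statistics import median


def silencing_threshold(samples, unmethylated_max=0.10):
    """Estimate a silencing threshold for one locus within one cohort.

    samples: iterable of (methylation_fraction, tpm) pairs.
    Returns (threshold, reference_expression), or None if no call can be made.
    """
    samples = list(samples)
    # Reference level: median asinh(TPM) among samples with <= 10% methylation.
    unmethylated = [asinh(tpm) for meth, tpm in samples if meth <= unmethylated_max]
    if not unmethylated:
        return None
    reference = median(unmethylated)
    # Highest methylation at which any sample still exceeds the reference level.
    still_expressed = [meth for meth, tpm in samples if asinh(tpm) > reference]
    threshold = max(still_expressed, default=unmethylated_max)
    # Require at least one sample beyond the threshold; otherwise make no call.
    if not any(meth > threshold for meth, _ in samples):
        return None
    return threshold, reference


# Toy cohort (hypothetical values): methylation fraction, TPM.
toy = [(0.02, 50), (0.05, 30), (0.08, 40), (0.30, 45), (0.60, 5), (0.80, 1)]
result = silencing_threshold(toy)
print(result)
print(asinh(0.0))  # asinh is defined at zero, unlike log
```

In the toy data, the samples at 60% and 80% methylation fall below the unmethylated median, so the threshold lands at the most methylated sample that still exceeds it.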

Figure S22. Integrative analysis of gene mutations, deletion, and transcriptional silencing by promoter methylation. Silencing (gold) or mutation/deletion events (gray) for each gene (rows) are displayed for all assayed patients (columns), with the marginal total of events per patient illustrated in the upper histogram. The plotted data reflect 172 TCGA cases and 284 TARGET cases at 119 genes and are outlined in Tables S8-S9. These data represent a complete illustration of the subset shown in Fig. 5a, with differences in row/column ordering based on differing clustering solutions for greater numbers of samples and genes.

Figure S23. NMF Deconvolution of genome-wide methylation patterns. DNA methylation signatures derived by non-negative matrix factorization (NMF) and in silico purification. Samples are ordered by hierarchical clustering of signatures (labeled at right) and demonstrate the relative similarity of methylation features from samples within cytogenetic categories (top ribbon). The plotted data are outlined in Table S10 and represent a complete illustration of those shown in Fig. 5b.
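For readers unfamiliar with NMF, the following is a minimal sketch of non-negative matrix factorization using the standard multiplicative update rules of Lee and Seung, written in plain Python. It is a generic textbook variant, not the method or software used for these figures, and it omits the in silico purification step; the toy matrix, rank and function names are hypothetical.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of V - W @ H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx)) ** 0.5


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor non-negative V (features x samples) as W (features x rank) @ H (rank x samples)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [frobenius_error(V, W, H)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
        errors.append(frobenius_error(V, W, H))
    return W, H, errors


# Toy matrix with two obvious blocks (hypothetical data): rows are features, columns samples.
V = [[5, 5, 0, 0], [5, 5, 0, 0], [0, 0, 3, 3], [0, 0, 3, 3]]
W, H, errors = nmf(V, rank=2)
# Assign each sample to the signature with the largest weight in its column of H.
dominant = [max(range(len(H)), key=lambda k: H[k][j]) for j in range(len(V[0]))]
print(dominant, round(errors[0], 3), round(errors[-1], 3))
```

Each sample's column of H gives its signature weights, which is the kind of quantity a signature heatmap displays; real analyses would use an optimized library and choose the rank with a stability criterion.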

Figure S24. Two DNA methylation signatures mark poor prognosis. Kaplan-Meier plots for signatures 2 and 13. After stratifying by cohort and adjusting for both TP53 mutation status and white blood cell count, these two signatures predict significantly (p < 0.05) poorer event-free survival in both pediatric and adult patients with above-median scores. Panel titles: DNA methylation signature #2; DNA methylation signature #13.

Figure S25. Unsupervised Nonnegative Matrix Factorization (NMF) Clustering of miRNA Expression. This figure is a fully annotated version of Fig. 6A in the main text. Unsupervised NMF clustering of miRNA expression patterns in pediatric AML samples revealed 4 discrete pediatric subgroups (marked by the numbered colored rectangles at the top) that were correlated with specific genomic alterations (indicated by blue bars in the gray annotation rows below the race and FAB category annotations, near the top). Panels show the consensus matrix and expression z-scores.

Figure S26. Kaplan-Meier plots for samples expressing low and high levels of miRs let-7a-3p, let-7b-5p and 30a-3p. The expression (RPM) cut point between high and low expression groups for each miRNA was defined using the X-tile method 77, where all separation points between patients are considered and the selected cut point is the one that provided the optimal (lowest) EFS log-rank p-value. OS, overall survival. Panel p-values as shown in the figure: P<0.0001, P=0.0001, P<0.0001.
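The cut-point search described in this legend (try every separation point, keep the one with the lowest log-rank p-value) can be sketched as below. This is an illustration of the idea only, not the X-tile software: the function names, the minimum group size and the toy data are hypothetical, and the p-value uses the usual chi-square approximation with one degree of freedom.

```python
from math import erfc, sqrt


def logrank_p(times, events, in_group_a):
    """Two-group log-rank test; events[i] is 1 for an event, 0 for censoring."""
    observed = expected = variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)                              # at risk overall
        n_a = sum(1 for ti, g in zip(times, in_group_a) if ti >= t and g)  # at risk in group A
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)        # events at t
        d_a = sum(1 for ti, e, g in zip(times, events, in_group_a) if ti == t and e and g)
        observed += d_a
        expected += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = (observed - expected) ** 2 / variance
    return erfc(sqrt(chi2 / 2))  # survival function of chi-square with 1 d.f.


def best_cutpoint(expression, times, events, min_group=2):
    """Scan every separation point; return (cut, p) with the lowest log-rank p-value.

    Samples with expression <= cut form the low group.
    """
    best = None
    for cut in sorted(set(expression))[:-1]:
        low = [x <= cut for x in expression]
        if min(sum(low), len(low) - sum(low)) < min_group:
            continue
        p = logrank_p(times, events, low)
        if best is None or p < best[1]:
            best = (cut, p)
    return best


# Toy data (hypothetical): high expression paired with short event-free times.
expression = [1, 2, 3, 4, 5, 6, 7, 8]
times = [100, 110, 120, 130, 10, 12, 14, 16]
events = [1] * 8
cut, p = best_cutpoint(expression, times, events)
print(cut, round(p, 4))
```

Because the cut point is chosen to minimize the p-value, the selected p-value is optimistic unless it is corrected or validated in independent data.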

Figure S27. Kaplan-Meier plots for samples expressing low and high levels of miRs 155-5p, 3614-5p, 4662-5p and 26a-2-3p. The expression (RPM) cut point between high and low expression groups for each miRNA was defined using the X-tile method 77. OS, overall survival.

Figure S28. High expression levels of miRs 133a-3p, 212-3p, and 29c-5p have deleterious effects on event-free survival (EFS). The expression (RPM) cut point between high and low expression groups for each miRNA was defined using the X-tile method 77. EFS, event-free survival.