doi:10.1038/nature10866
[Supplementary Figure 1, panels a and b. Panel b (Turcan et al. Supplementary Fig. 1, Concepts mapping): seven gene sets, numbered 1-7: H3K27 targets in EF, CBX8 targets in EF, H3K27 targets in ES, SUZ12 targets in ES, EED targets in EF, SUZ12 targets in EF, PRC targets in ES. Legend: Match, No Match. x-axis: -log10(Q-value), 0-40.]
Supplementary Figure 1 Differential methylation in IDH mutant human astrocytes.
(a) Box plots of relative methylation levels in passage 40 (P40) astrocytes expressing wild-type IDH1, mutant IDH1 (R132H), or neither. The average methylation levels of variant genes are shown. Data from two technical replicates for each of two biological replicates are shown. Error bars indicate 1 standard deviation (SD).
(b) Concordance between PRC2 targets and genes methylated following introduction of mutant IDH1. Concepts transcriptional module mapping of genes with PRC2 occupancy to loci that are hypermethylated in astrocytes expressing mutant IDH1. Data are from passage 40. Each row shows an individual PRC2 occupancy gene set. Red=matching gene. EF=embryonic fibroblasts. ES=embryonic stem cells.
WWW.NATURE.COM/NATURE
[Supplementary Figure 2, panels a-h. Panel a: three plots titled Consensus CDF (x-axis: consensus index value; y-axis: CDF), Delta area (x-axis: # clusters; y-axis: proportion increase), and Change in Gini (x-axis: # clusters; y-axis: delta(gini)). Panel c: two stacked column charts (0-100%); left chart compares promoter-associated probes with hypermethylated promoter-associated probes by CpG context (CpG island, CpG shore, CpG shelf, not defined); right chart compares promoter-associated probes with hypermethylated gene body-associated probes by gene region (TSS <200bp, TSS 200-1500bp, 5' UTR, body, 3' UTR, not defined). Panel d: distribution of CpG sites among hypermethylated genes.]
Supplementary Figure 2 K-means statistical output and validation of differentially methylated regions in human tumors.
(a) Statistical output from K-means consensus clustering analysis on 81 glioma samples. The plot shows the cumulative distribution function (CDF) for k=2 to k=5. The shape of the CDF plot allows one to determine the ideal k. Delta area curves are shown for each cluster number, labeled as in the legend. The corresponding Gini coefficient is listed in each case. The change in Gini curve shows the change in Gini coefficient as a function of cluster number. Gini index reflects the area of the Lorenz curve.
(b) Relative methylation of the genes analyzed in Fig. 2b in G-CIMP+ versus G-CIMP- tumors. Whisker box plots summarizing each data set are shown (mean ± standard deviation (SD)). The boxes delineate the 25th to 75th percentile range. The P value indicates significance determined using ANOVA. Data are from the top 2% most variant probes.
(c) The plot on the left shows the breakdown of hypermethylated probes by location: CpG islands, shores, shelves, or not defined. The plot on the right shows the breakdown of gene body-associated probes. TSS=transcriptional start site, UTR=untranslated region.
(d) The pie chart indicates the relative locations (and overlap) of the hypermethylated CpG sites. The percentages of hypermethylated CpGs located in shores, islands, both, or neither are shown.
EpiTYPER was used to validate CpG island methylation states of parental and mutant IDH1 astrocytes at passages 2 and 50 (e); in CIMP+ and CIMP- LGGs (f); and shore methylation states in parental and mutant IDH1 astrocytes and in CIMP+/CIMP- LGGs (g). Overall, these examples include (1) unmethylated genes that have undergone de novo methylation following introduction of mutant IDH1 in astrocytes, (2) genes with low levels of methylation that develop high methylation levels following introduction of mutant IDH1, and (3) representative genes from the two groups above that are methylated in CIMP+ LGGs. Each circle indicates a CpG dinucleotide. The percentage of methylation is shown by the color scale in the legend. The genomic location is noted. PAR, parental; Mut, IDH1 mutant; IVD, in vitro methylated DNA (methylated control); WGA, whole genome amplified DNA (unmethylated control).
(h) Frequencies of IDH1 and IDH2 mutations in G-CIMP+ and G-CIMP- tumors.
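The delta area curve described for panel (a) can be illustrated with a minimal sketch. It assumes a common consensus-clustering convention that the legend does not spell out: the area under the consensus CDF, A(k), is computed for each k, A(2) itself is plotted for k=2, and the relative increase (A(k) - A(k-1)) / A(k-1) is plotted for larger k. The function names and the example area values are hypothetical and are not taken from the study.

```python
def cdf_area(consensus_values):
    """Area under the empirical CDF of consensus index values on [0, 1]."""
    if not consensus_values:
        raise ValueError("need at least one consensus value")
    xs = sorted(consensus_values)
    if xs[0] < 0.0 or xs[-1] > 1.0:
        raise ValueError("consensus values must lie in [0, 1]")
    n = len(xs)
    area = 0.0
    prev = 0.0
    for i, x in enumerate(xs):
        # On [prev, x) exactly i values are <= c, so the CDF equals i / n.
        area += (x - prev) * (i / n)
        prev = x
    # From the largest value up to 1 the CDF equals 1.
    area += 1.0 - prev
    return area


def delta_area(area_by_k):
    """Relative increase in CDF area between successive cluster numbers.

    Assumed convention: the smallest k reports its own area; every later k
    reports (A(k) - A(k_prev)) / A(k_prev).
    """
    ks = sorted(area_by_k)
    deltas = {ks[0]: area_by_k[ks[0]]}
    for prev_k, k in zip(ks, ks[1:]):
        deltas[k] = (area_by_k[k] - area_by_k[prev_k]) / area_by_k[prev_k]
    return deltas


# Hypothetical areas for k = 2..5 (illustration only, not study data).
example_areas = {2: 0.50, 3: 0.60, 4: 0.63, 5: 0.64}
example_deltas = delta_area(example_areas)
```

Under this convention, the k after which the delta area drops close to zero is usually read as the point where adding clusters stops improving consensus.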
[Supplementary Figure 3: Kaplan-Meier survival curves. x-axis: Time (years), 0-6. p<0.001.]
Supplementary Figure 3 Survival benefit of CIMP+ phenotype in MSKCC dataset. Kaplan-Meier survival curves of MSKCC patients (n=72) with astrocytomas (n=31) (left) and grade 3 gliomas (n=43) (right). P values calculated by log-rank test.
[Supplementary Figure 4: MSKCC Cohort CIMP by Age. Stacked columns (0-100%) of CIMP Negative and CIMP Positive for the age groups <40 yrs, 40-50 yrs, 50-60 yrs, and >60 yrs. P=0.005.]
Supplementary Figure 4 CIMP distribution by age of patient. Stacked column chart showing CIMP phenotype by age in 81 patients. CIMP positive phenotype is associated with younger age. P value calculated by chi-squared test.
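The Kaplan-Meier curves referred to in Supplementary Figure 3 are built from the product-limit estimator. The sketch below shows that estimator in plain Python on made-up follow-up data. The function name and the data are hypothetical, and the legend does not say which software the authors used.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    order = sorted(zip(times, events))
    n_at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(order) and order[i][0] == t:
            deaths += order[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up in years (illustration only, not study data).
example_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients lower the number at risk without producing a step in the curve, which is why the second patient in the example changes later step sizes but not the curve at year 2.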
[Supplementary Figure 5: MSKCC Cohort CIMP by Histology. Stacked columns (0-100%) of CIMP Negative and CIMP Positive. P<0.005.]
Supplementary Figure 5 CIMP distribution by histology. Stacked column chart showing CIMP status by histology in 81 patients. Oligodendroglioma histology is associated with the greatest incidence of CIMP positive phenotype. P value calculated by chi-squared test.
[Supplementary Figure 6: CIMP by Histology and Grade. Stacked columns (0-100%) of CIMP Negative and CIMP Positive.]
Supplementary Figure 6 CIMP distribution by histology and grade. Stacked column chart showing CIMP status by grade and histology.
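The chi-squared tests cited for Supplementary Figures 4 and 5 compare observed counts in a contingency table (for example, CIMP status by age group or histology) with the counts expected under independence. A minimal sketch of the statistic follows. The function name and the counts are hypothetical and are not the study's data.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom.

    table : list of rows of observed counts, e.g. one row per CIMP status
            and one column per age group or histology.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical 2 x 2 counts (illustration only, not study data).
example_statistic, example_dof = chi_squared_statistic([[10, 20], [20, 10]])
```

Turning the statistic into a P value requires the chi-squared distribution with the returned degrees of freedom. A statistics library such as SciPy (`scipy.stats.chi2_contingency`) provides this, although that function applies a continuity correction to 2 x 2 tables by default.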
[Supplementary Figure 7: GSEA enrichment plots and statistics. Left panel: MSKCC LGG samples. Right panels: TCGA GBM samples (idh1 mut vs idh1 wt; idh1 wt vs idh1 mut).]

MSKCC LGG samples
# of perm.   ES   NES    Nom p-val   FDR q-val   FWER p-val
10000        1    1.74   0           0           0

TCGA GBM samples
Comparison            # of perm.   ES   NES     Nom p-val   FDR q-val   FWER p-val
idh1 mut vs idh1 wt   1000         3    1.96    0           0           0
idh1 mut vs idh1 wt   10000        3    1.94    0           0           0
idh1 wt vs idh1 mut   1000         -3   -1.95   8.9E-4      8.9E-4      3.0E-4
idh1 wt vs idh1 mut   10000        -3   -1.94   0           0           0

Supplementary Figure 7 Enrichment of hypermethylated probes derived from IDH1 mutant astrocytes. 730 of the most hypermethylated probes in astrocytes expressing mutant IDH1 were significantly enriched in IDH1 mutant MSKCC lower grade gliomas (left, 1 panel). The GSEA enrichment scores and their significance are noted for 10,000 permutations. The same hypermethylated probes were positively enriched in IDH1 mutant TCGA GBM tumor samples when compared to IDH1 wild-type GBM samples. Similarly, these probes are not methylated in IDH1 wild-type TCGA GBM tumors when compared to IDH1 mutant TCGA GBM tumors (right, 2 panels). GSEA enrichment scores for 1,000 and 10,000 permutations are shown. Perm, permutations; ES, enrichment score; NES, normalized enrichment score; Nom, nominal; FDR, false discovery rate; FWER, family-wise error rate.
[Supplementary Figure 8: clustered heatmap of genes (rows) by glioma samples (columns), annotated with CIMP status (CIMP+, CIMP-).]
Supplementary Figure 8 Accuracy of 17-gene signature in predicting CIMP. Two-dimensional unsupervised clustering using the probes derived from the 17-gene signature divides the MSKCC cohort into two clusters, representing a predicted CIMP+ and a predicted CIMP- phenotype based on expression data. The blue and red labels on the dendrogram represent CIMP status based on methylation data.
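The enrichment score (ES) reported for Supplementary Figure 7 comes from a running-sum statistic over a ranked list. The sketch below is a simplified illustration of that idea, not the GSEA software itself: it walks down a ranked list of probes, steps up at members of the probe set (weighted by the absolute ranking score raised to a power p) and steps down otherwise, then returns the largest deviation from zero. The function name, the weighting exponent default, and the example data are assumptions. Normalization to NES and permutation-based significance are not shown.

```python
def enrichment_score(ranked_ids, scores, probe_set, p=1.0):
    """Running-sum enrichment score for probe_set in a ranked list.

    ranked_ids : identifiers ordered from most to least associated
    scores     : ranking metric for each identifier, in the same order
    probe_set  : set of identifiers being tested for enrichment
    p          : weighting exponent (assumed default of 1.0)
    """
    hits = [probe in probe_set for probe in ranked_ids]
    n_hits = sum(hits)
    n_miss = len(ranked_ids) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("need both members and non-members in the list")
    hit_norm = sum(abs(s) ** p for s, h in zip(scores, hits) if h)
    if hit_norm == 0:
        raise ValueError("member scores sum to zero; cannot weight hits")
    running = 0.0
    best = 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** p / hit_norm   # step up at a set member
        else:
            running -= 1.0 / n_miss             # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list (illustration only, not study data).
example_es = enrichment_score(["a", "b", "c", "d"], [4, 3, 2, 1], {"a", "b"})
```

A set concentrated at the top of the ranking gives a positive score and a set concentrated at the bottom gives a negative one, which matches the opposite signs of the two TCGA comparisons in the legend.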
[Supplementary Figure 9: clustered heatmap of genes (rows) by glioma samples (columns), annotated with CIMP status (Predicted CIMP+, Predicted CIMP-).]
Supplementary Figure 9 Identification of predicted CIMP groups in Rembrandt validation cohort. Two-dimensional unsupervised clustering using the probes derived from the 17-gene signature clusters the 115 patients with grade II and grade III glioma from the Rembrandt database into two groups, classified as predicted CIMP+ and predicted CIMP- phenotype.
[Supplementary Figure 10: Kaplan-Meier curves for Cluster A and Cluster B in three panels: Astrocytoma, Age >50, Grade 2. y-axis: Cumulative Survival; x-axis: Time (years). P values shown on the panels: p=0.003, P<0.001, P<0.001.]
Supplementary Figure 10 Survival benefit of CIMP+ phenotype in Rembrandt dataset. Kaplan-Meier survival curves show an improvement in survival in patients predicted to be CIMP+ in the Rembrandt validation dataset. Displayed are survival curves demonstrating the survival benefit in astrocytoma, in tumors with grade II histology, and in patients aged >50. P values calculated by log-rank test.
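The log-rank P values quoted for the survival comparisons can be illustrated with a two-group sketch. At each event time it compares the deaths observed in one group with the number expected if both groups shared the same hazard, then refers the resulting statistic to a chi-squared distribution with one degree of freedom. The function name and the data are hypothetical, and the authors' actual software is not stated in the legend.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-squared statistic, P value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = 0.0
    expected_a = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still under observation just before time t.
        n_a = sum(1 for u, _, g in data if g == 0 and u >= t)
        n_b = sum(1 for u, _, g in data if g == 1 and u >= t)
        d_a = sum(1 for u, e, g in data if g == 0 and u == t and e)
        d_b = sum(1 for u, e, g in data if g == 1 and u == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the deaths in group A at time t.
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    if variance == 0:
        raise ValueError("not enough events to compare the groups")
    statistic = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical follow-up in years (illustration only, not study data).
example_stat, example_p = logrank_test([1, 2], [1, 1], [3, 4], [1, 1])
```

The closed-form P value via `math.erfc` only holds for the two-group case shown here; comparing more than two groups would need a chi-squared distribution with more degrees of freedom.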
[Supplementary Figure 11, panels a and b. Panel a: heatmap of genes (rows) by proneural TCGA samples (columns), annotated with IDH1 status (wild-type, mutant) and cluster (A, B). Panel b: Kaplan-Meier curves for Cluster A and Cluster B in the proneural, neural, classical, and mesenchymal subgroups. y-axis: Survival (probability); x-axis: Time (months), 0-30. P values shown on the panels: p=0.001, p=0.132, p=92, p=04.]
Supplementary Figure 11 Classification of TCGA GBM data using mutant IDH1 gene signature.
(a) Gene signature (17 genes) derived from mutant IDH1 expressing astrocytes identifies two distinct groups within the proneural group of GBMs (TCGA tumor set). Heatmap for proneural GBMs generated from unsupervised 2D hierarchical clustering. Color scale indicates normalized expression level.
(b) Kaplan-Meier survival curves for GBMs that either have (red) or do not have (black) the 17-gene signature for IDH1 mutation. Results are shown for each GBM transcriptional subgroup. Subgroup 1 (green, from a), which contains all IDH1 mutants, was associated with improved survival. No subgroups associated with differing prognosis were identified in other TCGA subgroups.
Supplementary Figure 12 Concordance of transcriptional programs regulated by mutant IDH1 in astrocytes and G-CIMP in LGGs. Analyses of enriched transcriptional programs in G-CIMP LGGs and mutant IDH1 astrocytes by PANTHER as described in Methods. P value for significance is shown along the x-axis. Yellow lines indicate threshold of significance (P=0.05).
[Supplementary Figure 13, panels a and b. Panel a: IDH1 WT and IDH1 R132H. Panel b: expression fold change (IDH1 mutant / parental) for BMI1, RIF1, DICER1, LIG4, and NES.]
Supplementary Figure 13 Expression of mutant IDH1 promotes a neurosphere phenotype and induces expression of stem cell-associated genes.
(a) Astrocytes (P15) that express IDH1 R132H or isogenic controls were grown in media that supports neural stem cell growth as in Methods. Experiment was performed in triplicate.
(b) Introduction of mutant IDH1 into astrocytes induces expression of multiple glioma stem cell-associated markers. Data from expression microarrays are shown. Error bars indicate 1 SD. Yellow line marks equality. Passage 27 cells were used in each case.
[Supplementary Figure 14: flow cytometry plots. Gating plot: FLAG versus FSC, with TET2 negative and TET2 positive gates. Histogram overlay: % of Max versus 5-hydroxymethylcytosine.]
Supplementary Figure 14 Increased 5-methylcytosine hydroxylation by TET2. TET2-transfected HEK 293T cells were gated as TET2 positive or TET2 negative by FLAG antibody. 5-hydroxymethylcytosine levels are overlaid as histograms for both TET2 positive and negative populations. Increased 5-hydroxymethylcytosine levels are observed in the TET2 positive population.