doi:.38/nature8975

SUPPLEMENTAL TEXT

Unique association of HOTAIR with patient outcome

To determine whether the expression of HOX lincRNAs other than HOTAIR can predict patient outcome, we measured the expression levels of 43 different HOX lincRNAs and all 39 HOX coding genes in 78 primary breast tumors from the NKI 295 breast cancer patient cohort. The results confirm the widespread dysregulation of HOX lincRNAs in breast cancer (Supplementary Fig. 2). Our tiling array had identified a distinct set of HOX coding genes and lincRNAs, including HOTAIR, that are variably overexpressed in primary tumors and frequently overexpressed in metastatic samples (Fig. b). We used this large qRT-PCR data set of HOX coding gene and lincRNA expression to determine whether other transcripts highlighted in Fig. b [including HOXC, HOXC, HOXC3, and nc-hoxc-24 (shown by EST mapping to also comprise transcripts labeled nc-hoxc-26A and nc-hoxc-27a)] were linked to patient outcome. For each transcript, patients with high versus low expression showed no statistically significant difference in overall survival or metastasis-free survival (Supplementary Table, Supplementary Fig. 3).

A HOTAIR PRC-2 gene set signature can predict patient outcome

To determine whether the 854-gene set representing promoters with an increase in PRC-2 occupancy upon HOTAIR overexpression (Fig. 3A) can be used as a diagnostic fingerprint for patient outcome, we extracted the expression of these 854 genes from the microarray data set of all 295 primary breast tumors in the NKI 295 patient cohort. Unsupervised hierarchical clustering of these data revealed a subset of patients that showed a distinct relative downregulation of genes from the larger gene set (Supplementary Fig. ). This signature was predictive of overall survival (p = .3).
TABLE S: Lack of association of other HOX transcripts with patient outcome (a)

                   DEATH      METASTASIS
                   P value    P value
High HOXC          .92        .4
High HOXC          .582       .325
High HOXC3         .853       .972
High nchoxc-24     .487       .6

a RNA expression data (as measured by qRT-PCR) from seventy-eight primary breast tumors were used to determine association. nchoxc-24, -26A, -26B are believed to represent exons of one lincRNA and are represented by nchoxc-24 here.

TABLE S2: Multivariate analysis of risk factors for death and metastasis as the first recurrence event in early breast cancer

                                                    DEATH                      METASTASIS
                                                    Hazard Ratio   P value     Hazard Ratio   P value
High HOTAIR expression (a)                          3.33           .           3.468          .
Age                                                 .754           .468        .745           .374
Diameter of tumor, per cm                           .65            .84         .524           .75
Lymph node status, per positive node                3.25           .33         3.348          .4
Tumor grade                                         2.6            .6          .668           .33
Vascular invasion                                   .935           .           .726           .
Estrogen receptor status, positive vs. negative     .329           .2          .695           .422
No adjuvant therapy vs. chemo or hormonal therapy   .824           .29         .376           .498

a Modeled as a binary variable, with High HOTAIR expression defined as primary breast tumors with relative HOTAIR expression 25 fold above normal (representing the minimum level of expression seen in a panel of metastatic tumors). Using this criterion, High HOTAIR expression represents 44/32 primary breast tumors surveyed.
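The binary High HOTAIR variable described in footnote a of Table S2 can be illustrated with a minimal sketch. The function name, sample identifiers and expression values below are hypothetical and are not data from the study. The cutoff is passed in as a parameter (the example simply reuses the fold-change figure as printed in the footnote), and treating "above" as strictly greater than the cutoff is an assumption.

```python
def classify_high_hotair(relative_expression, fold_cutoff):
    """Map each sample to True if its HOTAIR level (relative to normal)
    is strictly above fold_cutoff, else False.

    Assumption: "above" is read as strictly greater than the cutoff.
    """
    return {sample: value > fold_cutoff
            for sample, value in relative_expression.items()}


# Hypothetical relative HOTAIR expression values (fold over normal).
tumors = {"T1": 3.0, "T2": 40.0, "T3": 25.0, "T4": 310.5}

flags = classify_high_hotair(tumors, fold_cutoff=25)
n_high = sum(flags.values())
print(f"{n_high}/{len(flags)} tumors classified as High HOTAIR")
```

The resulting True/False flag is the kind of binary covariate that could then enter a multivariate survival model alongside the other risk factors listed in the table.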
TABLE S3: PRC-2 ChIP mapping data procured for gene set analysis

Gene Set                                 Species  Platform                                Description
SUZ2_Prostate Cancer Cell Line           Human    Avia System Biology hu6k promoter set   SUZ2 occupancy, PC-3 and LNCaP cell lines
EZH2_Prostate Cancer Cell Line           Human    Custom 2K cDNA microarray               EZH2 transcriptional targets, RWPE prostate cell line
PRC_2 Prostate Cancer 2                  Human    Agilent human proximal promoter set     PRC-2 occupancy, metastatic prostate cancer tissue
SUZ2_Colon Cancer Cell Line 3            Human    Nimblegen-Roche 5K promoter set         SUZ2 occupancy, SW48 colon carcinoma line
SUZ2_Breast Cancer Cell Line 3           Human    Nimblegen-Roche 5K promoter set         SUZ2 occupancy, MCF-7 breast carcinoma line
PRC-2_Embryonic Fibroblast Cell Line 4   Human    Nimblegen-Roche promoter tiling array   PRC-2 occupancy, embryonic lung TIG3 line
H3K27_Lung Fibroblast Cells 5            Human    Nimblegen-Roche 5 K promoter set        H3K27 occupancy, neonatal primary lung fibroblasts
SUZ2_Foreskin Fibroblast Cells a         Human    Nimblegen-Roche HG8 two array set       SUZ2 occupancy, neonatal primary foreskin fibroblasts
H3K27_Foreskin Fibroblast Cells 5        Human    Nimblegen-Roche 5K promoter set         H3K27 occupancy, neonatal primary foreskin fibroblasts
H3K27_Embryonic Stem Cell 6              Human    Agilent Whole Genome Array              H3K27 occupancy, WA9 embryonic stem cells
SUZ2_Embryonic Stem Cell 6               Human    Agilent Whole Genome Array              SUZ2 occupancy, WA9 embryonic stem cells
PRC-2_Embryonic Stem Cell 6              Human    Agilent Whole Genome Array              PRC-2 occupancy, WA9 embryonic stem cells
SUZ2_Embryonic Stem Cell 7               Mouse    Agilent Mouse Promoter Array Set        SUZ2 occupancy, mouse embryonic stem cells
SUZ2_Embryonic Stem Cell 3               Mouse    Nimblegen-Roche .5 kb promoter set      SUZ2 occupancy, mouse embryonic stem cells

1 J. Yu, Q. Cao, R. Mehra et al., Cancer Cell 2 (5), 49 (27).
2 J. Yu, J. Yu, D. R. Rhodes et al., Cancer Research 67 (22), 657 (27).
3 S. L. Squazzo, H. O'Geen, V. M. Komashko et al., Genome Research 6 (7), 89 (26).
4 A. P. Bracken, N. Dietrich, D. Pasini et al., Genes & Development 2 (9), 23 (26).
5 H. O'Geen, S. L. Squazzo, S. Iyengar et al., PLoS Genetics 3 (6), e89 (27).
6 T. I. Lee, R. G. Jenner, L. A. Boyer et al., Cell 25 (2), 3 (26).
7 L. A. Boyer, K. Plath, J. Zeitlinger et al., Nature 44 (79), 349 (26).
a Current Study

TABLE S4: PCR primer pairs for qRT-PCR

Gene Name   Forward                    Reverse
HOTAIR      GGTAGAAAAAGCAACCACGAAGC    ACATAAACCTCTGTCTGTGAGTGCC
GAPDH       CCGGGAAACTGTGGCGTGATGG     AGGTGGAGGAGTGGGTGTCGCTGTT
LAMB3       GCCACATTCTCTACTCGGTGA      CCAAGCCTGAGACCTACTGC
SNAIL       TGACCTGTCTGCAAATGCTC       CAGACCCTGGTTGCTTCAA
LAMC2       CTCTGCTTCTCGCTCCTCC        TCTGTGAAGTTCCCGATCAA
ABL2        GGACACTTCACTTTGCTGCC       TAGTGCCTGGGGTTCAACAT
JAM2        TCTTTTGGGGCAGAAAAC         AAGATGGCGAGGAGG
PCDH        CCCGTCTACACTGTGTCCCT       GGAGTACACGACCTCACCGT
PCDHB5      AGGTGTGTTTGACCGGAGAC       TCCCTATTTCTTCACCAGCG

TABLE S5: PCR primer pairs for ChIP verification

Gene Name   Forward                    Reverse
JAM2        ACCTGACTTCCAGCACGAGT       CCAACTCCTTTCTTCCCCTC
HOXD        GCTGAGGCGCTTTAATGAAC       GGTCCCAGAAACTCTGACCA
PR          TCTCCAACTTCTGTCCGAGG       CACGAGTTTGATGCCAGAGA
EphA        ATATGACAAACACGGCCCAT       GGTGGTTAACTTGGGGAACA
PCDH        ACCAGGCTCTGTTCTGTTCG       TCTTGGGTCATAGGGGTCTG
PCDHB5      AGACCGGCAATTTGCTTCTA       TCTGGGGCATGGTCATTTAT
[Heat map panels]
i. Normal >> Primary, Metastatic CA. Samples: M3 M6 P3 P2 M5 M2 M4 M P6 P5 P8 P7 P P4. Transcripts: HOXA5 ex2, nc-hoxa5-68a, nc-hoxa5-67a, HOXA4 ex2, nc-hoxa-56a, nc-hoxa-57a, nc-hoxa-55a, HOXD ex2, nc-hoxd4-26a, HOXD ex2, HOXD ex2, nc-hoxa-58a, HOXA9 ex2, nc-hoxb-59a, HOXB6 ex, HOXB9 ex, HOXB6 ex2, HOXD ex.
ii. Primary >> Metastatic CA. Samples: M3 M6 P3 P2 M5 M2 M4 M P6 P5 P8 P7 P P4. Transcripts: nc-hoxb2-64a, nc-hoxb2-6a, HOXB7 ex, nc-hoxb9-99, HOXC9 ex2.

Figure S. Higher resolution of subsets i and ii from the heat map depicted in Figure a, identifying transcripts that show (i) higher expression in normal compared to cancer samples and (ii) higher expression in primary compared to metastatic samples.
GUPTA et al. FIG. S2

[Heat map: quantitative PCR (88 samples). Column groups: Primary Tumors (n = 78), Metastases (n = 5), Normal (n = 5). Row groups: HOX coding, HOX ncRNA. Color scale: log2, -4.6 to 4.6.]

Figure S2. Heat map (supervised hierarchical clustering) representing the relative expression values of a filtered subset of HOX coding genes and lincRNAs as determined by qRT-PCR. RNA from 88 samples (5 normal breast organoid, 78 primary breast tumors from the NKI 295 Cohort, and 5 metastatic breast tumors) was assayed for the expression of 43 HOX lincRNAs and 39 HOX coding genes by qRT-PCR. Transcripts were filtered for significant differences in expression (SAM, 3 permutations, FDR<5%).
[Bar chart. y-axis: Relative HOTAIR Abundance.]

Figure S3. Relative levels of HOTAIR RNA by qRT-PCR in the indicated breast carcinoma cell lines. Values are expressed relative to HOTAIR abundance in human adult foot fibroblast cells; error bars represent s.d. (n=3).

[Bar charts. y-axis: HOTAIR Level. Left panel: MCF-A, SK-BR3 and MDA-MB-23 cells, each without (-) and with (+) pLZRS HOTAIR. Right panel: Primary Breast Tumors.]

Figure S4. Levels of HOTAIR RNA following enforced expression in MCF-A, SK-BR3, and MDA-MB-23 cells in relation to HOTAIR RNA levels in the 32 primary breast tumors screened in Figure d (measured on the same scale in both left and right panels to allow direct comparison); error bars represent s.d. (n=3).
[Bar charts, Vector vs. HOTAIR at each FBS concentration (2% or %); * marks statistical significance. HCC954 and SK-BR3 panels, y-axis: Soft agar colonies (Number > 5 µm diameter). MDA-MB-23 panel, y-axis: Soft agar colonies (Number > 25 µm diameter).]

Figure S5. Soft agar colony counts (in either 2% or % FBS) in HCC954, SK-BR3 and MDA-MB-23 cells after transduction with vector or HOTAIR. Assays were repeated in triplicate, and mean ± s.e.m. is shown. Statistical significance (highlighted by *) was determined by paired t-test (p values: HCC954 2% =.7, HCC954 % =.3, SK-BR3 2% =.3, SK-BR3 % =., MDA-MB-23 2% =.).
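The paired t-test named in the Figure S5 legend compares matched measurements, here vector versus HOTAIR colony counts from the same replicate. The following is a minimal sketch of the test statistic only, using the Python standard library. The function name and the colony counts are hypothetical and are not data from the study. Converting the statistic to a p value requires the t distribution with the returned degrees of freedom, which is not implemented here.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(control, treated):
    """Return (t, degrees_of_freedom) for a paired t-test.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the per-pair differences
    (treated minus control) and sd is the sample standard deviation.
    """
    if len(control) != len(treated) or len(control) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [t - c for c, t in zip(control, treated)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical colony counts from three paired replicates.
vector_counts = [10.0, 12.0, 11.0]
hotair_counts = [15.0, 18.0, 15.0]

t_stat, dof = paired_t_statistic(vector_counts, hotair_counts)
print(f"t = {t_stat:.3f}, df = {dof}")
```

With only three replicates the test has two degrees of freedom, so a large t value is needed to reach conventional significance thresholds.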
[Bar charts. y-axis: Relative HOTAIR Abundance. Panel a: siGFP, siHOTAIR (pool), siHOTAIR #, siHOTAIR #2. Panel b: Vector and EZH2, each with siGFP or siHOTAIR.]

Figure S6. a, Relative levels of HOTAIR (by qRT-PCR) in the MCF-7 line after transfection with siRNA duplexes targeting HOTAIR (either pooled or two individual duplexes). Matrix invasion of the same cells is shown in Figure 2b. b, Relative levels of HOTAIR (by qRT-PCR) in the H6N2 cell line infected with retroviral vector or EZH2, before and after transfection with siRNA duplexes targeting HOTAIR (either pooled or two individual duplexes). Matrix invasion of the same cells is shown in Figure 4e. Error bars represent s.d. (n=3).
GUPTA et al. FIG. S7

[Panel A: in situ hybridization images, HOTAIR AS Riboprobe and HOTAIR S Riboprobe. Panel B: RT-PCR gel; rows HOTAIR and GAPDH; sample lanes labeled as MDA-MB-23 HOTAIR or Vector, cell line or lung micrometastasis.]

Figure S7. a, In situ hybridization of HOTAIR using either HOTAIR antisense (AS) or sense (S) riboprobes on a micrometastatic lesion in mouse lung following tail vein injection of MDA-MB-23 HOTAIR cells, showing retention of HOTAIR expression post-injection. b, RT-PCR of HOTAIR and GAPDH from RNA isolated from micrometastatic lesions in mouse lung following tail vein injection of MDA-MB-23 HOTAIR or Vector cells.
[Bar chart: Top 5 Enriched Gene Ontologies. Categories: Multicellular organismal process, Cell part, Cell communication, Multicellular organismal development, Developmental process. x-axis: p-value (-log).]

Figure S8. Top 5 enriched Gene Ontologies of the 854 genes with a gain of PRC2 occupancy and H3K27me3 following enforced expression of HOTAIR.
[Bar charts, Vector vs. HOTAIR, for the promoters HOXD, PR, JAM2, EphA, PCDH and PCDHB5. y-axes: SUZ2 Occupancy [log2 (ChIP/IgG) by qPCR]; H3K27 Occupancy [log2 (ChIP/IgG) by qPCR]; EZH2 Occupancy [log2 (ChIP/IgG) by qPCR].]

Figure S9. SUZ2, H3K27, and EZH2 occupancy measured by ChIP-qPCR in vector or HOTAIR cells for the indicated gene promoters. Mean ± s.d. are shown (n=3).
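The occupancy axis in Figure S9, log2 (ChIP/IgG), together with the mean ± s.d. summary, can be sketched as follows. The function names and the signal values are hypothetical, and computing the mean and s.d. on the log2 values of each replicate is an assumption; the legend does not say at which step the replicates were averaged.

```python
import math
from statistics import mean, stdev


def log2_enrichment(chip_signal, igg_signal):
    """log2 ratio of the ChIP signal over the IgG control signal."""
    if chip_signal <= 0 or igg_signal <= 0:
        raise ValueError("signals must be positive")
    return math.log2(chip_signal / igg_signal)


def summarize_occupancy(replicates):
    """Return (mean, sample s.d.) of log2(ChIP/IgG) over replicate pairs.

    Assumption: replicates are summarized after the log2 transform.
    """
    values = [log2_enrichment(chip, igg) for chip, igg in replicates]
    return mean(values), stdev(values)


# Hypothetical qPCR signals (arbitrary units) for one promoter, n = 3,
# given as (ChIP, IgG) pairs.
replicates = [(8.0, 1.0), (16.0, 1.0), (32.0, 1.0)]

occupancy_mean, occupancy_sd = summarize_occupancy(replicates)
print(f"log2(ChIP/IgG) = {occupancy_mean:.2f} ± {occupancy_sd:.2f}")
```

On this scale a value of 0 means the ChIP signal equals the IgG background, and each unit corresponds to a two-fold enrichment.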
GUPTA et al. FIG. S

[Upper panel, heat map: Relative Expression of 854 gene set with gain of PRC-2 occupancy with HOTAIR in NKI Primary Breast Tumor Dataset. Patient groups: HOTAIR-PRC-2 targets DOWN, HOTAIR-PRC-2 targets UP. Lower panel, Kaplan-Meier plot: y-axis Probability; x-axis survival (death); curves HOTAIR PRC-2 targets UP and HOTAIR PRC-2 targets DOWN; p = .3.]

Figure S. (upper panel) Heat map representing unsupervised hierarchical clustering of the relative expression values from 295 primary breast tumors (NKI 295 Cohort) of the 854 promoter set (from Fig. 3A) that shows a gain of PRC-2 occupancy upon enforced HOTAIR expression in MDA-MB-23 cells. A subset of patients (blue bar, HOTAIR-PRC-2 targets DOWN) was identified that shows a consistent down-regulation (relative silencing) of a subset of genes from this set. (lower panel) Kaplan-Meier curves showing overall survival in patients with the expression signature of HOTAIR PRC-2 targets UP (red) or DOWN (blue), as delineated in the upper panel heat map.
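A Kaplan-Meier curve such as the one in the lower panel is built from follow-up times and an indicator of whether death was observed or the patient was censored. The following is a minimal sketch of the product-limit estimator in plain Python. The function name and the follow-up data are hypothetical and are not taken from the NKI cohort. The comparison between the two patient groups (the p value in the figure) needs a separate test, which the legend does not name and which is not implemented here.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) at each death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (years) and event indicators.
followup = [2, 3, 3, 5, 8]
died = [1, 1, 0, 1, 0]

km_curve = kaplan_meier(followup, died)
for t, s in km_curve:
    print(f"t = {t}: S(t) = {s:.2f}")
```

Censored patients contribute to the number at risk up to their last follow-up time but never cause a drop in the curve, which is why the estimate only changes at observed death times.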