doi: 10.1038/nature08645

[Figure: two stacked bar charts, one for the HCC samples (HCC1187, HCC1395, HCC1599, HCC1937, HCC1954, HCC2157, HCC2218, HCC38, HCC1143) and one for the PD samples (PD3664a, PD3666a, PD3668a, PD367a, PD3672a, PD3688a, PD369a, PD3695a, PD3665a, PD3667a, PD3669a, PD3671a, PD3687a, PD3689a, PD3693a). Y-axis: Reads (millions). Bar segments: Neither end mapped, One end mapped, Chimaeras, Correct. Each bar is annotated with its physical coverage (x haploid genomes).]

Supplementary Figure 1 Haploid physical coverage of breast cancer samples. Physical coverage indicates the number of DNA fragments, of which both ends have been sequenced, that on average overlie any position in the genome.
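The definition of physical coverage in the caption above can be illustrated with a small calculation. The sketch below assumes that physical coverage is estimated as the number of correctly mapped read pairs multiplied by the mean fragment length and divided by the haploid genome size. The function names and all numbers in the usage lines are hypothetical and are not taken from the study.

```python
def physical_coverage(n_pairs: int, mean_fragment_bp: float, haploid_genome_bp: float) -> float:
    """Average number of sequenced-at-both-ends fragments spanning a genome position.

    Assumption: every fragment covers its full length, including the
    unsequenced middle, so coverage = pairs * fragment length / genome size.
    """
    return n_pairs * mean_fragment_bp / haploid_genome_bp


def sequence_coverage(n_pairs: int, read_bp: int, haploid_genome_bp: float) -> float:
    """Average number of sequenced bases per genome position (two reads per pair)."""
    return n_pairs * 2 * read_bp / haploid_genome_bp


# Hypothetical illustration values, not data from the paper.
pairs = 50_000_000          # correctly mapped read pairs
fragment = 400              # mean fragment length in bp
read_len = 37               # length of each end read in bp
genome = 3_000_000_000      # approximate haploid human genome size in bp

print(f"physical coverage: {physical_coverage(pairs, fragment, genome):.2f}x")
print(f"sequence coverage: {sequence_coverage(pairs, read_len, genome):.2f}x")
```

With short end reads, physical coverage is much higher than sequence coverage, which is why paired-end data can detect rearrangements even when few bases at each position are actually sequenced.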
[Figure: circos plots, one per sample: HCC2157, HCC1395, HCC1954, HCC1187, HCC1937, HCC1143, HCC2218, HCC1599, HCC38, PD3667a, PD3671a, PD3689a, PD3664a, PD3668a, PD3672a, PD369a, PD3669a, PD3687a, PD3665a, PD3693a, PD3666a, PD367a, PD3688a, PD3695a.]

Supplementary Figure 2 Genome-wide circos plots of somatic rearrangements in all 24 breast cancers in the study.
[Figure: five panels. (a) Genomic PCR, lanes T and N. (b) RT-PCR, lanes T and N. (c) Dual colour FISH. (d) Domain schematic of NFIA (exons 1-2) joined to EHF (exons 5-9), with domains labelled CTF/NF1, MH1 and ETS. (e) Sequence spanning NFIA exon 2 and EHF exon 5.]

Supplementary Figure 3 NFIA-EHF, an expressed, in-frame fusion gene caused by an interchromosomal rearrangement in breast cancer cell line HCC1937. (a) Genomic PCR across the rearrangement breakpoint, yielding a product in tumour DNA (T) but not in normal DNA (N); (b) RT-PCR of RNA between NFIA exon 2 and EHF exon 5 to confirm the presence of a chimeric expressed transcript; (c) Representative picture of dual colour FISH confirming a translocation in HCC1937. Red probe corresponds to BAC RP11-364M11, chromosome 1: 61,64,196-61,228,554. Green probe corresponds to BAC RP11-277N8, chromosome 11: 34,772,14-34,965,946. (d) Schematic diagram of the protein domains fused in the predicted NFIA/EHF fusion protein. Domains from NFIA are blue, domains from EHF are red. (e) Sequence from RT-PCR product shown in (b) confirming NFIA exon 2 fused to EHF exon 5.
[Figure: bar charts, one panel per sample: HCC2157, HCC1954, HCC1937, HCC1599, HCC1395, HCC1187, HCC1143, HCC38, HCC2218, PD3664a, PD3665a, PD3666a, PD3667a, PD3668a, PD3669a, PD367a, PD3671a, PD3672a, PD3687a, PD3688a, PD3689a, PD369a, PD3693a, PD3695a. Y-axis: No of rearrangements.]

Supplementary Figure 4 SLC26A6-PRKAR2A, an expressed, in-frame fusion gene generated by a tandem duplication in the breast cancer cell line HCC38. (a) Genomic PCR across the rearrangement breakpoint, yielding a product in tumour DNA (T) but not in normal DNA (N); (b) RT-PCR of RNA between SLC26A6 exon 17 and PRKAR2A exon 4 to confirm the presence of a chimeric expressed transcript; (c) Dual colour FISH confirming the 3p21.31 tandem duplication in HCC38. Green-labelled BAC RP11-148G2 is within the tandem duplication. Red-labelled BAC RP11-527M1 is located ~3 Mb telomeric of the tandem duplication. (d) Schematic diagram of the protein domains in the predicted SLC26A6-PRKAR2A fusion protein. Domains from SLC26A6 are blue, domains from PRKAR2A are red. (e) Sequence from RT-PCR product shown in (b) confirming SLC26A6 exon 17 fused to PRKAR2A exon 4.
[Figure: five panels. (a) Genomic PCR, lanes T and N. (b) RT-PCR, lanes T and N. (c) Dual colour FISH of the 3p21.31 tandem duplication. (d) Domain schematic of SLC26A6 (exons 1-17) joined to PRKAR2A (exons 4-11), with domains labelled Sulphate Transporter, cNMP and cAMP/cGMP Kinase. (e) Sequence spanning SLC26A6 exon 17 and PRKAR2A exon 4.]

Supplementary Figure 5 Prevalence of architectures of rearrangements in all 24 breast cancers in the study: deletion (dark blue), tandem duplication (red), inverted orientation (green), interchromosomal (light blue), breakpoint(s) within an amplicon (orange).
[Figure: bar charts, one panel per sample: HCC2157, HCC1954, HCC1937, HCC1599, HCC1395, HCC1187, HCC1143, HCC38, HCC2218, PD3664a, PD3665a, PD3666a, PD3667a, PD3668a, PD3669a, PD367a, PD3671a, PD3672a, PD3687a, PD3688a, PD3689a, PD369a, PD3693a, PD3695a. Y-axis: No of rearrangements. X-axis: binned microhomology length, with a final bin of >5.]

Supplementary Figure 6 Number of nucleotides of overlapping microhomology at rearrangement junctions in the 24 breast cancers.
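As an illustration of what "overlapping microhomology" at a junction means, the sketch below counts the bases around a junction that could have come from either of the two joined loci. It assumes the reference sequence is known on both sides of each breakpoint. The function names, argument names and example sequences are hypothetical, and this is not presented as the procedure used in the study.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Length of the shared prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def microhomology_length(a_kept: str, a_lost: str, b_lost: str, b_kept: str) -> int:
    """Count junction bases that match both source loci.

    a_kept: reference sequence retained to the left of the junction
    a_lost: reference sequence that followed a_kept at its original locus
    b_lost: reference sequence that preceded b_kept at its original locus
    b_kept: reference sequence retained to the right of the junction
    """
    # Bases ending a_kept that are identical to the bases ending b_lost.
    left = common_prefix_len(a_kept[::-1], b_lost[::-1])
    # Bases starting b_kept that are identical to the bases starting a_lost.
    right = common_prefix_len(b_kept, a_lost)
    return left + right


# Hypothetical junction: both loci carry "CA" immediately left of the breakpoint.
print(microhomology_length("ACGTCA", "TTCCAA", "GGGGCA", "CCTTAA"))
```

A result of zero corresponds to a blunt join with no shared bases, while larger values indicate that the exact breakpoint position is ambiguous by that many nucleotides.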
[Figure: bar chart. Y-axis: Number of rearrangements. X-axis: Non-templated sequence (bps), binned.]

Supplementary Figure 7 Nucleotides of non-templated DNA sequence at rearrangement breakpoints in the 24 breast cancers.