Nature Genetics: doi: /ng Supplementary Figure 1. Workflow of CDR3 sequence assembly from RNA-seq data.


Supplementary Figure 1. Workflow of CDR3 sequence assembly from RNA-seq data. Paired-end short-read RNA-seq data were mapped to the human reference genome hg19, and unmapped reads in the TCR regions were extracted for pairwise comparison. CDR3 sequences were assembled from disjoint read sets and annotated using IMGT nomenclature.

Supplementary Figure 2. Number of reads/contigs at each step of the CDR3 assembly method. For a selected sample, we show the number of reads or contigs kept at each step of our method. The numbers are included at the bottom of each text box. The selected sample represents the median library size of TCGA tumors with the median number of assembled CDR3 sequences.

Supplementary Figure 3. Method evaluation using TCGA tumors profiled with both TCR sequencing and RNA-seq. Left, relationship between CDR3 transcripts called from TCR-seq and RNA-seq data. Middle, distribution of clonal frequencies of RNA-seq assemblies and TCR-seq transcripts. Right, another visualization of the clonal frequency distribution: the x axis shows the quantiles of clonal frequencies from immunoSEQ data, and the y axis shows the fraction of above-quantile TCR transcripts called from RNA-seq data.
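The quantity plotted in the right panel can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the clonotype labels and the frequencies are hypothetical, and a clonotype is treated as above-quantile when its frequency is at or above a nearest-rank threshold, a rule the caption does not specify.

```python
def fraction_called_above_quantile(tcr_freq, rna_called, quantiles):
    """For each quantile q of TCR-seq clonal frequencies, return the fraction
    of clonotypes at or above that quantile that were also called from RNA-seq.

    tcr_freq: dict mapping clonotype -> clonal frequency from TCR-seq
    rna_called: set of clonotypes recovered from RNA-seq
    quantiles: iterable of values in [0, 1)
    """
    ordered = sorted(tcr_freq.values())
    n = len(ordered)
    result = {}
    for q in quantiles:
        # Nearest-rank threshold (assumption: the source does not define the rule).
        threshold = ordered[min(n - 1, int(q * n))]
        above = [c for c, f in tcr_freq.items() if f >= threshold]
        hits = sum(1 for c in above if c in rna_called)
        result[q] = hits / len(above)
    return result


# Hypothetical toy repertoire: five clonotypes, two of them seen in RNA-seq.
toy_tcr = {"c1": 0.5, "c2": 0.2, "c3": 0.15, "c4": 0.1, "c5": 0.05}
toy_rna = {"c1", "c2"}
toy_curve = fraction_called_above_quantile(toy_tcr, toy_rna, [0.0, 0.5])
```

A curve that rises with the quantile indicates that the RNA-seq calls are concentrated among the most abundant clonotypes.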

Supplementary Figure 4. Schematics of two simulation approaches used to validate the method developed in this work. Descriptions for each approach can be found in the Online Methods.

Supplementary Figure 5. Performance evaluation of the CDR3 assembly algorithm using an in silico mixture experiment. Our method was applied to data sets produced by the second simulation approach (Online Methods), and the CDR3 calls were compared to the gold standard TCR sequencing reads. (a) At different levels of T cell infiltration, our method recovered 4-6% of the infiltrating T cell repertoire, with 94-98% accuracy. (b) The called CDR3 sequences (infiltration level of 60%) were enriched for T cells with high clonal frequency. (c) Quantile-quantile plot showing that the clonal frequency for called CDR3 sequences is skewed to the higher end in comparison to the background distribution.

Supplementary Figure 6. Evaluation of the CDR3 assembly algorithm at high coverage and comparison with iSSAKE. Both methods were applied to the data sets produced by the first simulation approach (Online Methods). Called CDR3 sequences were compared to the 100 simulated transcripts, and true and false positive rates were calculated. False positive calls were defined as contigs that did not contain the CDR3 region. Standard deviation was estimated using 100 simulations at each given level of coverage. The true positive rate was the number of unique correct calls divided by 100, and the false positive rate was the number of unique incorrect calls divided by the total number of CDR3 calls. (a,b) These results were visualized as box plots for our method (a) and iSSAKE (b). We did not include a precision-recall curve at each coverage setting because our algorithm has no continuous threshold that would affect its performance.
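The two rates defined in this caption can be written out as a short sketch. It rests on assumptions: the function name and sequence labels are hypothetical, a call is treated as correct when it exactly matches a simulated sequence (the caption instead defines false positives as contigs lacking the CDR3 region), and the denominator of the false positive rate is taken to be the number of unique calls.

```python
def evaluate_calls(called, truth):
    """Return (true positive rate, false positive rate) for one simulation run.

    called: iterable of called CDR3 sequences (duplicates allowed)
    truth: iterable of simulated CDR3 sequences (100 in the caption)
    """
    called_unique = set(called)
    truth_set = set(truth)
    correct = called_unique & truth_set
    incorrect = called_unique - truth_set
    # True positive rate: unique correct calls over the number of simulated transcripts.
    tpr = len(correct) / len(truth_set)
    # False positive rate: unique incorrect calls over all unique calls (assumption).
    fpr = len(incorrect) / len(called_unique) if called_unique else 0.0
    return tpr, fpr


# Hypothetical run: 100 simulated sequences, 60 recovered, 2 spurious calls.
simulated = [f"SIM{i:03d}" for i in range(100)]
calls = simulated[:60] + ["SPURIOUS1", "SPURIOUS2"]
example_tpr, example_fpr = evaluate_calls(calls, simulated)
```

Repeating this over 100 simulations per coverage level would give the distributions summarized by the box plots.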

Supplementary Figure 7. Differential usage of TRAV and TRBV genes in lower-grade glioma and kidney clear cell cancer. (a-c) Bar plots of TRAV and TRBV gene usage in glioma (a,b) and kidney tumors (c) are presented. TRAV and TRBV genes are in the same order as in Figure 1a,b, and the fractions were calculated in the same way.

Supplementary Figure 8. Distribution of read counts for assembled CDR3 contigs. Read counts for each CDR3 contig were obtained from the assembly algorithm. When a read was shared across multiple contigs, its count was split evenly among those contigs.
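The even split of shared reads can be illustrated with a small sketch. The function name and the read and contig labels are hypothetical, and exact fractions are used only to keep the arithmetic transparent.

```python
from collections import defaultdict
from fractions import Fraction


def split_read_counts(read_to_contigs):
    """Assign each read a total weight of 1, split evenly among its contigs.

    read_to_contigs: dict mapping read ID -> list of contig IDs containing it
    Returns a dict mapping contig ID -> fractional read count.
    """
    counts = defaultdict(Fraction)
    for contigs in read_to_contigs.values():
        unique = set(contigs)
        if not unique:
            continue  # a read assigned to no contig contributes nothing
        share = Fraction(1, len(unique))
        for contig in unique:
            counts[contig] += share
    return dict(counts)


# Hypothetical assignment: r2 is shared by two contigs, r4 by three.
toy_assignment = {
    "r1": ["A"],
    "r2": ["A", "B"],
    "r3": ["B"],
    "r4": ["A", "B", "C"],
}
toy_counts = split_read_counts(toy_assignment)
```

Because every read contributes a total weight of one, the contig counts always sum to the number of assigned reads.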

Supplementary Figure 9. Association of CPK with genes involved in cytolysis. The expression levels of previously defined cytolytic genes (refs. 19, 27) were associated with CPK. The heat map displays values from partial Spearman's correlation corrected for tumor purity. Cancers with fewer than ten samples were excluded. Statistical significance was evaluated using the partial Spearman's correlation test.
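A first-order partial Spearman coefficient, with tumor purity as the single covariate, can be sketched with the standard library alone. This is an illustration under assumptions: all function names are hypothetical, the coefficient is computed with the usual partial-correlation formula applied to rank correlations, and the significance test mentioned in the caption is not reproduced.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_spearman(x, y, z):
    """Spearman correlation of x and y after removing the rank-linear effect of z."""
    rx, ry, rz = average_ranks(x), average_ranks(y), average_ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values: x stands for CPK, y for gene expression, z for purity.
example_rho = partial_spearman([1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

In the toy input x and y are perfectly rank-correlated, so the partial coefficient stays at 1 regardless of the covariate.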

Supplementary Figure 10. Scatterplot between CPK and mutation load. Each point on the plot represents a cancer sample, with color referring to the corresponding disease type. The statistical significance of the association was evaluated using Spearman's correlation. This represents a complementary analysis to Figure 4b.

Supplementary Figure 11. Fraction of public β-CDR3 sequences across cancer types. For each cancer type, the fraction was calculated as the number of β-CDR3 sequences in the final public sequence set divided by the total number of distinct β-CDR3 sequences. All fractions were then mean centered, with the mean being the total number of public β-CDR3 sequences divided by the total number of distinct β-CDR3 sequences. Significance was evaluated using the binomial test, with the mean as the expected frequency and the counts of public and total β-CDR3 sequences for each cancer as observations.
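The centering and test described here can be sketched as follows. The function names, cancer labels and counts are hypothetical; the exact two-sided form of the binomial test is an assumption, since the caption does not state the sidedness; and the direct probability sum is only practical for small counts.

```python
import math


def binomial_two_sided_p(k, n, p):
    """Exact two-sided binomial p value: total probability of all outcomes
    that are no more likely than the observed count k."""
    pmf = [math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


def public_fraction_summary(counts):
    """counts maps cancer type -> (public sequences, distinct sequences)."""
    total_public = sum(public for public, _ in counts.values())
    total_distinct = sum(distinct for _, distinct in counts.values())
    # Pooled mean: total public over total distinct, as in the caption.
    mean_fraction = total_public / total_distinct
    summary = {}
    for cancer, (public, distinct) in counts.items():
        summary[cancer] = {
            "centered_fraction": public / distinct - mean_fraction,
            "p_value": binomial_two_sided_p(public, distinct, mean_fraction),
        }
    return mean_fraction, summary


# Hypothetical counts for two cancer types.
toy_mean, toy_summary = public_fraction_summary(
    {"TYPE_A": (30, 100), "TYPE_B": (10, 100)}
)
```

Note that the pooled mean weights each cancer type by its number of distinct sequences, so the centered fractions need not sum to zero.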

Supplementary Figure 12. MHC I binding predictions for SPAG5 and TSSK6 protein sequences. Complete amino acid sequences for SPAG5 and TSSK6 were obtained from the NCBI protein database. All tiling nine-amino-acid sequences were analyzed by NetMHC 4.0 for MHC I binding predictions. The peptides with strong binding (rank <0.5%) to an MHC I allele are underlined, and the corresponding MHC I allele is labeled by color. Only common MHC I alleles (HLA-A01:01, HLA-A02:01, HLA-A03:01, HLA-A07:02 and HLA-B08:01) with high population frequencies are displayed in the plot for visualization purposes.
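Tiling a protein into overlapping nine-amino-acid peptides, the input step for the prediction, can be sketched as below. The function name is hypothetical, and the 17-residue example window is reassembled from the overlapping PRAMEF peptides listed in this supplement's prediction output rather than taken from SPAG5 or TSSK6, whose sequences are not reproduced here. The NetMHC call itself is not shown.

```python
def tile_peptides(sequence, length=9):
    """Return (position, peptide) pairs for every window of the given length,
    stepping one residue at a time. Positions are 0-based."""
    return [
        (pos, sequence[pos:pos + length])
        for pos in range(len(sequence) - length + 1)
    ]


# Example window (assumption: reassembled from overlapping peptides in the table).
window = "SCLKTSLKVLTITNCVL"
tiles = tile_peptides(window)
```

A sequence of n residues yields n - 8 nine-residue peptides, so the 17-residue window gives nine peptides at positions 0 to 8.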

Supplementary Figure 13. MHC II binding predictions for peptides produced from PRAMEF4 F300V. MHC II binding was predicted by NetMHC-II 2.2. Fifteen-amino-acid sequences are the standard input for the web server, and one mutated peptide was predicted to bind to three MHC II alleles with high affinity.

Supplementary Figure 14. Box plot of PRAMEF4 expression levels in multiple cancer types and paired normal tissues. Testicular cancer (TGCT) is highlighted by the blue box. Numbers of outliers are included in red along the top of the plot.

Columns: Cancer; Survival (HR); Age (Correlation); Gender (FC, M/F); Race (FC, A/W; FC, B/W)
*: FDR<0.1
HR: Hazard Ratio from Cox proportional hazard model
M/F: Male/Female
A/W: Asian/White
B/W: Black/White

BLCA  5.60E-07  -  *  0.96
BRCA  1.90E-22  -0.1286*  NA
CESC  9.20E-  NA  NA  1.1
COAD  1.10E-  NA  0.85
GBM   1.90E+20  -  NA  NA  NA
HNSC  5.80E-50  -  NA  1
KIRC  2.50E-31  -  *  NA  0.97
LGG   3.2e+153*  NA  NA
LIHC  5.10E-  NA
LUAD  2.20E-42  -  *  NA  1.09
LUSC  1.20E  NA  0.94
MESO  5.90E-05  NA  NA  NA  NA
PAAD  8.70E  *  NA  NA
SARC  4.40E-77  -  NA  NA
SKCM  1.70E-06  -  NA  NA
KIRP  NA  -  NA  1.06
PCPG  NA  -  NA  NA
PRAD  NA  NA  NA  NA
TGCT  NA  -  NA  NA  NA
THCA  NA  -
UCEC  NA  -  NA  NA  1.17

Columns: Pos, Peptide, ID, then nM, Rank and Core for each HLA allele, followed by H_Avg_Ranks and N_binders.
HLA alleles: HLA-A0101, HLA-A0201, HLA-A0301, HLA-A2402, HLA-A2601, HLA-A3001, HLA-B0702, HLA-B0801, HLA-B1501, HLA-B2705, HLA-B3901, HLA-B4001, HLA-B5801, HLA-C0401, HLA-C0602, HLA-C0702
For every peptide and allele, the Core entry is identical to the peptide.

Peptides (Pos 0 to 8) by ID:
PRAMEF: SCLKTSLKV, CLKTSLKVL, LKTSLKVLT, KTSLKVLTI, TSLKVLTIT, SLKVLTITN, LKVLTITNC, KVLTITNCV, VLTITNCVL
PRAMEF4-WT: SCLKTSLKF, CLKTSLKFL, LKTSLKFLT, KTSLKFLTI, TSLKFLTIT, SLKFLTITN, LKFLTITNC, KFLTITNCV, FLTITNCVL
MUC: TGDTTPLPD, GDTTPLPDT, DTTPLPDTD, TTPLPDTDT, TPLPDTDTS, PLPDTDTSS, LPDTDTSSA, PDTDTSSAS, DTDTSSAST
MUC4-WT: TGDTTPLPV, GDTTPLPVT, DTTPLPVTD, TTPLPVTDT, TPLPVTDTS, PLPVTDTSS, LPVTDTSSA, PVTDTSSAS, VTDTSSAST
MUC5B: THITEPSTG, HITEPSTGT, ITEPSTGTS, TEPSTGTSH, EPSTGTSHT, PSTGTSHTP, STGTSHTPA, TGTSHTPAA, GTSHTPAAT
MUC5B-WT: THITEPSTV, HITEPSTVT, ITEPSTVTS, TEPSTVTSH, EPSTVTSHT, PSTVTSHTP, STVTSHTPA, TVTSHTPAA, VTSHTPAAT

Sample ID     Cancer  HLA-A genotype    HLA-B genotype    HLA-C genotype    CDR3 motif  Mutation
TCGA-4K-AA1I  TGCT    A*01:01 A*30:01   B*13:02 B*13:02   C*06:02 C*06:02   GESEQY      PRAMEF4 F300V
TCGA-2G-AAGE  TGCT    A*01:01 A*01:01   B*08:01 B*57:01   C*07:01 C*07:01   GESEQY      PRAMEF4 F300V
TCGA-2G-AAKO  TGCT    A*01:01 A*02:01   B*08:01 B*18:01   C*07:01 C*12:03   GESEQY      PRAMEF4 F300V
TCGA-B        KIRC    A*68:02 A*74:01   B*53:01 B*53:01   C*04:01 C*04:01   GLAEQY      MUC4 V4195D
TCGA-G        DLBC    A*02:02 A*68:02   B*35:01 B*58:01   C*04:01 C*07:18   GLAEQY      MUC4 V4195D
TCGA-FF-8041  DLBC    A*01:01 A*02:07   B*15:02 B*15:17   C*07:01 C*08:01   GLAEQY      MUC4 V4195D
TCGA-BH-A0HA  BRCA    A*03:01 A*31:01   B*07:02 B*07:02   C*07:02 C*07:02   RDNSYEQY    MUC5B V3490G
TCGA-G        DLBC    A*01:01 A*33:03   B*08:01 B*50:01   C*06:02 C*07:01   RDNSYEQY    MUC5B V3490G
TCGA-FF-A7CR  DLBC    A*24:07 A*24:10   B*15:02 B*27:06   C*03:04 C*07:02   RDNSYEQY    MUC5B V3490G


More information

Nature Genetics: doi: /ng Supplementary Figure 1

Nature Genetics: doi: /ng Supplementary Figure 1 Supplementary Figure 1 Expression deviation of the genes mapped to gene-wise recurrent mutations in the TCGA breast cancer cohort (top) and the TCGA lung cancer cohort (bottom). For each gene (each pair

More information

Comparison of open chromatin regions between dentate granule cells and other tissues and neural cell types.

Comparison of open chromatin regions between dentate granule cells and other tissues and neural cell types. Supplementary Figure 1 Comparison of open chromatin regions between dentate granule cells and other tissues and neural cell types. (a) Pearson correlation heatmap among open chromatin profiles of different

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Supplementary Notes 1: accuracy of prediction algorithms for peptide binding affinities to HLA and Mamu alleles For each HLA and Mamu allele we have analyzed the accuracy of four predictive algorithms

More information

MRHCA: a nonparametric statistics based method for hub and co-expression module identification in large gene co-expression network

MRHCA: a nonparametric statistics based method for hub and co-expression module identification in large gene co-expression network Quantitative Biology 2018, 6(1): 40 55 https://doi.org/10.1007/s40484-018-0131-z RESEARCH ARTICLE MRHCA: a nonparametric statistics based method for hub and co-expression module identification in large

More information

DIRECT IDENTIFICATION OF NEO-EPITOPES IN TUMOR TISSUE

DIRECT IDENTIFICATION OF NEO-EPITOPES IN TUMOR TISSUE DIRECT IDENTIFICATION OF NEO-EPITOPES IN TUMOR TISSUE Eustache Paramithiotis PhD Vice President, Biomarker Discovery & Diagnostics 17 March 2016 PEPTIDE PRESENTATION BY MHC MHC I Antigen presentation by

More information

Nature Immunology: doi: /ni Supplementary Figure 1. RNA-Seq analysis of CD8 + TILs and N-TILs.

Nature Immunology: doi: /ni Supplementary Figure 1. RNA-Seq analysis of CD8 + TILs and N-TILs. Supplementary Figure 1 RNA-Seq analysis of CD8 + TILs and N-TILs. (a) Schematic representation of the tumor and cell types used for the study. HNSCC, head and neck squamous cell cancer; NSCLC, non-small

More information

ACE ImmunoID Biomarker Discovery Solutions ACE ImmunoID Platform for Tumor Immunogenomics

ACE ImmunoID Biomarker Discovery Solutions ACE ImmunoID Platform for Tumor Immunogenomics ACE ImmunoID Biomarker Discovery Solutions ACE ImmunoID Platform for Tumor Immunogenomics Precision Genomics for Immuno-Oncology Personalis, Inc. ACE ImmunoID When one biomarker doesn t tell the whole

More information

Supplementary Tables. Supplementary Figures

Supplementary Tables. Supplementary Figures Supplementary Files for Zehir, Benayed et al. Mutational Landscape of Metastatic Cancer Revealed from Prospective Clinical Sequencing of 10,000 Patients Supplementary Tables Supplementary Table 1: Sample

More information

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq Philipp Bucher Wednesday January 21, 2009 SIB graduate school course EPFL, Lausanne ChIP-seq against histone variants: Biological

More information

Genetic alterations of histone lysine methyltransferases and their significance in breast cancer

Genetic alterations of histone lysine methyltransferases and their significance in breast cancer Genetic alterations of histone lysine methyltransferases and their significance in breast cancer Supplementary Materials and Methods Phylogenetic tree of the HMT superfamily The phylogeny outlined in the

More information

Pan-cancer Alterations of the MYC Oncogene and Its Proximal Network across the Cancer Genome Atlas

Pan-cancer Alterations of the MYC Oncogene and Its Proximal Network across the Cancer Genome Atlas Article Pan-cancer Alterations of the MYC Oncogene and Its Proximal Network across the Cancer Genome Atlas Graphical Abstract Authors Franz X. Schaub, Varsha Dhankani, Ashton C. Berger,..., Brady Bernard,

More information

Immuno-Oncology Therapies and Precision Medicine: Personal Tumor-Specific Neoantigen Prediction by Machine Learning

Immuno-Oncology Therapies and Precision Medicine: Personal Tumor-Specific Neoantigen Prediction by Machine Learning Immuno-Oncology Therapies and Precision Medicine: Personal Tumor-Specific Neoantigen Prediction by Machine Learning Yi-Hsiang Hsu, MD, SCD Sep 16, 2017 yihsianghsu@hsl.harvard.edu Director & Associate

More information

Pan-cancer analysis of expressed somatic nucleotide variants in long intergenic non-coding RNA

Pan-cancer analysis of expressed somatic nucleotide variants in long intergenic non-coding RNA Pan-cancer analysis of expressed somatic nucleotide variants in long intergenic non-coding RNA Travers Ching 1,2, Lana X. Garmire 1,2 1 Molecular Biosciences and Bioengineering Graduate Program, University

More information

Nature Genetics: doi: /ng Supplementary Figure 1. Assessment of sample purity and quality.

Nature Genetics: doi: /ng Supplementary Figure 1. Assessment of sample purity and quality. Supplementary Figure 1 Assessment of sample purity and quality. (a) Hematoxylin and eosin staining of formaldehyde-fixed, paraffin-embedded sections from a human testis biopsy collected concurrently with

More information

Nature Medicine: doi: /nm.3967

Nature Medicine: doi: /nm.3967 Supplementary Figure 1. Network clustering. (a) Clustering performance as a function of inflation factor. The grey curve shows the median weighted Silhouette widths for varying inflation factors (f [1.6,

More information

fl/+ KRas;Atg5 fl/+ KRas;Atg5 fl/fl KRas;Atg5 fl/fl KRas;Atg5 Supplementary Figure 1. Gene set enrichment analyses. (a) (b)

fl/+ KRas;Atg5 fl/+ KRas;Atg5 fl/fl KRas;Atg5 fl/fl KRas;Atg5 Supplementary Figure 1. Gene set enrichment analyses. (a) (b) KRas;At KRas;At KRas;At KRas;At a b Supplementary Figure 1. Gene set enrichment analyses. (a) GO gene sets (MSigDB v3. c5) enriched in KRas;Atg5 fl/+ as compared to KRas;Atg5 fl/fl tumors using gene set

More information

Nature Structural & Molecular Biology: doi: /nsmb Supplementary Figure 1

Nature Structural & Molecular Biology: doi: /nsmb Supplementary Figure 1 Supplementary Figure 1 Frequency of alternative-cassette-exon engagement with the ribosome is consistent across data from multiple human cell types and from mouse stem cells. Box plots showing AS frequency

More information

underlying metastasis and recurrence in HNSCC, we analyzed two groups of patients. The

underlying metastasis and recurrence in HNSCC, we analyzed two groups of patients. The Supplementary Figures Figure S1. Patient cohorts and study design. To define and interrogate the genetic alterations underlying metastasis and recurrence in HNSCC, we analyzed two groups of patients. The

More information

OncoLnc: linking TCGA survival data to mrnas, mirnas, and lncrnas

OncoLnc: linking TCGA survival data to mrnas, mirnas, and lncrnas OncoLnc: linking TCGA survival data to mrnas, mirnas, and lncrnas Jordan Anaya Omnesres.com, Charlottesville, United States ABSTRACT OncoLnc is a tool for interactively exploring survival correlations,

More information

MODULE 4: SPLICING. Removal of introns from messenger RNA by splicing

MODULE 4: SPLICING. Removal of introns from messenger RNA by splicing Last update: 05/10/2017 MODULE 4: SPLICING Lesson Plan: Title MEG LAAKSO Removal of introns from messenger RNA by splicing Objectives Identify splice donor and acceptor sites that are best supported by

More information

Supplementary Figure S1. Gene expression analysis of epidermal marker genes and TP63.

Supplementary Figure S1. Gene expression analysis of epidermal marker genes and TP63. Supplementary Figure Legends Supplementary Figure S1. Gene expression analysis of epidermal marker genes and TP63. A. Screenshot of the UCSC genome browser from normalized RNAPII and RNA-seq ChIP-seq data

More information

DiffVar: a new method for detecting differential variability with application to methylation in cancer and aging

DiffVar: a new method for detecting differential variability with application to methylation in cancer and aging Genome Biology This Provisional PDF corresponds to the article as it appeared upon acceptance. Fully formatted PDF and full text (HTML) versions will be made available soon. DiffVar: a new method for detecting

More information

Nature Biotechnology: doi: /nbt Supplementary Figure 1. Binding capacity of DNA-barcoded MHC multimers and recovery of antigen specificity

Nature Biotechnology: doi: /nbt Supplementary Figure 1. Binding capacity of DNA-barcoded MHC multimers and recovery of antigen specificity Supplementary Figure 1 Binding capacity of DNA-barcoded MHC multimers and recovery of antigen specificity (a, b) Fluorescent-based determination of the binding capacity of DNA-barcoded MHC multimers (+barcode)

More information

Data mining with Ensembl Biomart. Stéphanie Le Gras

Data mining with Ensembl Biomart. Stéphanie Le Gras Data mining with Ensembl Biomart Stéphanie Le Gras (slegras@igbmc.fr) Guidelines Genome data Genome browsers Getting access to genomic data: Ensembl/BioMart 2 Genome Sequencing Example: Human genome 2000:

More information

Package mirlab. R topics documented: June 29, Type Package

Package mirlab. R topics documented: June 29, Type Package Type Package Package mirlab June 29, 2018 Title Dry lab for exploring mirna-mrna relationships Version 1.10.0 Date 2016-01-05 Author Thuc Duy Le, Junpeng Zhang Maintainer Thuc Duy Le

More information

Supplement to SCnorm: robust normalization of single-cell RNA-seq data

Supplement to SCnorm: robust normalization of single-cell RNA-seq data Supplement to SCnorm: robust normalization of single-cell RNA-seq data Supplementary Note 1: SCnorm does not require spike-ins, since we find that the performance of spike-ins in scrna-seq is often compromised,

More information

Current practice, needs and future directions in immuno-oncology research testing

Current practice, needs and future directions in immuno-oncology research testing Current practice, needs and future directions in immuno-oncology research testing Jose Carlos Machado IPATIMUP - Porto, Portugal ESMO 2017- THERMO FISHER SCIENTIFIC SYMPOSIUM Immune Therapies are Revolutionizing

More information

Broad H3K4me3 is associated with increased transcription elongation and enhancer activity at tumor suppressor genes

Broad H3K4me3 is associated with increased transcription elongation and enhancer activity at tumor suppressor genes Broad H3K4me3 is associated with increased transcription elongation and enhancer activity at tumor suppressor genes Kaifu Chen 1,2,3,4,5,10, Zhong Chen 6,10, Dayong Wu 6, Lili Zhang 7, Xueqiu Lin 1,2,8,

More information

How do they reconcile their data with Raghu Khalluri's data in Nature Cell Biology?

How do they reconcile their data with Raghu Khalluri's data in Nature Cell Biology? Reviewers' comments: Reviewer #1_Cancer Metabolism (Remarks to the Author): This paper demonstrates that cancers undergo a tissue-specific metabolic rewiring, which converges on downregulation of mitochondrial

More information

OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies

OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies 2017 Contents Datasets... 2 Protein-protein interaction dataset... 2 Set of known PPIs... 3 Domain-domain interactions...

More information

Supplementary Figure 1. IDH1 and IDH2 mutation site sequences on WHO grade III

Supplementary Figure 1. IDH1 and IDH2 mutation site sequences on WHO grade III Supplementary Materials: Supplementary Figure 1. IDH1 and IDH2 mutation site sequences on WHO grade III patient samples. Genomic DNA samples extracted from punch biopsies from either FFPE or frozen tumor

More information

Supplementary Materials

Supplementary Materials 1 Supplementary Materials Rotger et al. Table S1A: Demographic characteristics of study participants. VNP RP EC CP (n=6) (n=66) (n=9) (n=5) Male gender, n(%) 5 (83) 54 (82) 5 (56) 3 (60) White ethnicity,

More information

a) List of KMTs targeted in the shrna screen. The official symbol, KMT designation,

a) List of KMTs targeted in the shrna screen. The official symbol, KMT designation, Supplementary Information Supplementary Figures Supplementary Figure 1. a) List of KMTs targeted in the shrna screen. The official symbol, KMT designation, gene ID and specifities are provided. Those highlighted

More information

Breast and ovarian cancer in Serbia: the importance of mutation detection in hereditary predisposition genes using NGS

Breast and ovarian cancer in Serbia: the importance of mutation detection in hereditary predisposition genes using NGS Breast and ovarian cancer in Serbia: the importance of mutation detection in hereditary predisposition genes using NGS dr sc. Ana Krivokuća Laboratory for molecular genetics Institute for Oncology and

More information

Nature Structural & Molecular Biology: doi: /nsmb.2419

Nature Structural & Molecular Biology: doi: /nsmb.2419 Supplementary Figure 1 Mapped sequence reads and nucleosome occupancies. (a) Distribution of sequencing reads on the mouse reference genome for chromosome 14 as an example. The number of reads in a 1 Mb

More information

An innovative multi-dimensional NGS approach to understanding the tumor microenvironment and evolution

An innovative multi-dimensional NGS approach to understanding the tumor microenvironment and evolution An innovative multi-dimensional NGS approach to understanding the tumor microenvironment and evolution James H. Godsey, Ph.D. Vice President, Research & Development Clinical Sequencing Division (CSD) Life

More information

Supplementary Table S1. List of PTPRK-RSPO3 gene fusions in TCGA's colon cancer cohort. Chr. # of Gene 2. Chr. # of Gene 1

Supplementary Table S1. List of PTPRK-RSPO3 gene fusions in TCGA's colon cancer cohort. Chr. # of Gene 2. Chr. # of Gene 1 Supplementary Tale S1. List of PTPRK-RSPO3 gene fusions in TCGA's colon cancer cohort TCGA Case ID Gene-1 Gene-2 Chr. # of Gene 1 Chr. # of Gene 2 Genomic coordiante of Gene 1 at fusion junction Genomic

More information

Supplemental Data. Integrating omics and alternative splicing i reveals insights i into grape response to high temperature

Supplemental Data. Integrating omics and alternative splicing i reveals insights i into grape response to high temperature Supplemental Data Integrating omics and alternative splicing i reveals insights i into grape response to high temperature Jianfu Jiang 1, Xinna Liu 1, Guotian Liu, Chonghuih Liu*, Shaohuah Li*, and Lijun

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION doi:10.1038/nature22976 Supplementary Discussion The adaptive immune system uses a highly diverse population of T lymphocytes to selectively recognize and respond to antigenic

More information

SRARP and HSPB7 are epigenetically regulated gene pairs that function as tumor suppressors and predict clinical outcome in malignancies

SRARP and HSPB7 are epigenetically regulated gene pairs that function as tumor suppressors and predict clinical outcome in malignancies SRARP and HSPB7 are epigenetically regulated gene pairs that function as tumor suppressors and predict clinical outcome in malignancies Ali Naderi Cancer Biology Program, University of Hawaii Cancer Center,

More information

Immuno-Oncology Therapies and Precision Medicine: Personal Tumor-Specific Neoantigen Prediction by Machine Learning

Immuno-Oncology Therapies and Precision Medicine: Personal Tumor-Specific Neoantigen Prediction by Machine Learning Immuno-Oncology Therapies and Precision Medicine: Personal Tumor-Specific Neoantigen Prediction by Machine Learning Yi-Hsiang Hsu, MD, SCD Sep 16, 2017 yihsianghsu@hsl.harvard.edu HSL GeneticEpi Center,

More information

The Immune Landscape of Cancer

The Immune Landscape of Cancer Resource The Immune Landscape of Cancer Graphical Abstract Authors Vésteinn Thorsson, David L. Gibbs, Scott D. Brown,..., Mary L. Disis, Benjamin G. Vincent, llya Shmulevich Correspondence vesteinn.thorsson@systemsbiology.org

More information

Cancer Informatics Lecture

Cancer Informatics Lecture Cancer Informatics Lecture Mayo-UIUC Computational Genomics Course June 22, 2018 Krishna Rani Kalari Ph.D. Associate Professor 2017 MFMER 3702274-1 Outline The Cancer Genome Atlas (TCGA) Genomic Data Commons

More information

A Comprehensive Pan-Cancer Molecular Study of Gynecologic and Breast Cancers

A Comprehensive Pan-Cancer Molecular Study of Gynecologic and Breast Cancers Article A Comprehensive Pan-Cancer Molecular Study of Gynecologic and Breast Cancers Highlights d Integrated analysis finds molecular features characteristic of gynecologic tumors d d d Subtypes with high

More information

Supplementary Figure 1: Classification scheme for non-synonymous and nonsense germline MC1R variants. The common variants with previously established

Supplementary Figure 1: Classification scheme for non-synonymous and nonsense germline MC1R variants. The common variants with previously established Supplementary Figure 1: Classification scheme for nonsynonymous and nonsense germline MC1R variants. The common variants with previously established classifications 1 3 are shown. The effect of novel missense

More information

Hands-On Ten The BRCA1 Gene and Protein

Hands-On Ten The BRCA1 Gene and Protein Hands-On Ten The BRCA1 Gene and Protein Objective: To review transcription, translation, reading frames, mutations, and reading files from GenBank, and to review some of the bioinformatics tools, such

More information

Supplementary Figure 1: High-throughput profiling of survival after exposure to - radiation. (a) Cells were plated in at least 7 wells in a 384-well

Supplementary Figure 1: High-throughput profiling of survival after exposure to - radiation. (a) Cells were plated in at least 7 wells in a 384-well Supplementary Figure 1: High-throughput profiling of survival after exposure to - radiation. (a) Cells were plated in at least 7 wells in a 384-well plate at cell densities ranging from 25-225 cells in

More information

Introduction. Introduction

Introduction. Introduction Introduction We are leveraging genome sequencing data from The Cancer Genome Atlas (TCGA) to more accurately define mutated and stable genes and dysregulated metabolic pathways in solid tumors. These efforts

More information

Analysis with SureCall 2.1

Analysis with SureCall 2.1 Analysis with SureCall 2.1 Danielle Fletcher Field Application Scientist July 2014 1 Stages of NGS Analysis Primary analysis, base calling Control Software FASTQ file reads + quality 2 Stages of NGS Analysis

More information

Supplemental Figure legends

Supplemental Figure legends Supplemental Figure legends Supplemental Figure S1 Frequently mutated genes. Frequently mutated genes (mutated in at least four patients) with information about mutation frequency, RNA-expression and copy-number.

More information

Expanded View Figures

Expanded View Figures Solip Park & Ben Lehner Epistasis is cancer type specific Molecular Systems Biology Expanded View Figures A B G C D E F H Figure EV1. Epistatic interactions detected in a pan-cancer analysis and saturation

More information

Nature Structural & Molecular Biology: doi: /nsmb Supplementary Figure 1

Nature Structural & Molecular Biology: doi: /nsmb Supplementary Figure 1 Supplementary Figure 1 U1 inhibition causes a shift of RNA-seq reads from exons to introns. (a) Evidence for the high purity of 4-shU-labeled RNAs used for RNA-seq. HeLa cells transfected with control

More information

Unsupervised Identification of Isotope-Labeled Peptides

Unsupervised Identification of Isotope-Labeled Peptides Unsupervised Identification of Isotope-Labeled Peptides Joshua E Goldford 13 and Igor GL Libourel 124 1 Biotechnology institute, University of Minnesota, Saint Paul, MN 55108 2 Department of Plant Biology,

More information

Abstract. Optimization strategy of Copy Number Variant calling using Multiplicom solutions APPLICATION NOTE. Introduction

Abstract. Optimization strategy of Copy Number Variant calling using Multiplicom solutions APPLICATION NOTE. Introduction Optimization strategy of Copy Number Variant calling using Multiplicom solutions Michael Vyverman, PhD; Laura Standaert, PhD and Wouter Bossuyt, PhD Abstract Copy number variations (CNVs) represent a significant

More information

Cancer develops as a result of the accumulation of somatic

Cancer develops as a result of the accumulation of somatic Cancer-mutation network and the number and specificity of driver mutations Jaime Iranzo a,1, Iñigo Martincorena b, and Eugene V. Koonin a,1 a National Center for Biotechnology Information, National Library

More information

Tutorial: RNA-Seq Analysis Part II: Non-Specific Matches and Expression Measures

Tutorial: RNA-Seq Analysis Part II: Non-Specific Matches and Expression Measures : RNA-Seq Analysis Part II: Non-Specific Matches and Expression Measures March 15, 2013 CLC bio Finlandsgade 10-12 8200 Aarhus N Denmark Telephone: +45 70 22 55 09 Fax: +45 70 22 55 19 www.clcbio.com support@clcbio.com

More information

Supplementary Figure 1. Metabolic landscape of cancer discovery pipeline. RNAseq raw counts data of cancer and healthy tissue samples were downloaded

Supplementary Figure 1. Metabolic landscape of cancer discovery pipeline. RNAseq raw counts data of cancer and healthy tissue samples were downloaded Supplementary Figure 1. Metabolic landscape of cancer discovery pipeline. RNAseq raw counts data of cancer and healthy tissue samples were downloaded from TCGA and differentially expressed metabolic genes

More information