[Figure: (a) histogram of VAFs, y-axis Frequency (%); (b) VAF distributions of individual actionable variants in ALK, HRAS, IDH, MET, KIT, SMAD4, IDH1, CTNNB1, AKT1, ERBB, NRAS, PTEN, BRAF, EGFR, PIK3CA, and KRAS.]
Supplementary Figure 1. VAF distributions for actionable SNVs and indels in our cohort. (a) Histogram of VAFs for actionable variants. (b) VAF distributions of individual actionable variants. Only variants observed more than once are displayed. NFS, non-frameshift; SS, splicing site; FS, frameshift.
[Figure: panels a-e, each annotated with the VAFs of EGFR T790M and EGFR C797S.]
Supplementary Figure 2. EGFR C797S mutations in refractory lung cancer samples. All EGFR C797S mutations occurred in cis with the EGFR T790M mutation and at lower VAFs.
[Figure: (a) kernel density of VAFs, y-axis Density, colored by sampling time (pre-treatment, post-treatment); (b) per-variant VAF distributions colored by sampling time.]
Supplementary Figure 3. Comparison of VAF distributions for actionable SNVs in samples classified as pre- and post-treatment (chemotherapy). (a) VAF distributions for the two groups after kernel density smoothing. There is no significant difference between the two distributions (p >., Wilcoxon rank sum test). (b) VAF distributions of individual actionable variants, colored by sampling time. Only mutations observed more than once are shown.
[Figure: variant count (y-axis) by diagnosis (x-axis) for gastric cancer (n=348), colorectal cancer (n=74), lung cancer (n=64), and breast cancer (n=13), colored by sampling time (pre-treatment, post-treatment).]
Supplementary Figure 4. Comparison of pre- and post-treatment variant counts in four cancer types. Only samples profiled on the V platform were used; the p values were calculated using the Wilcoxon rank sum test. ns, not significant.
[Figure: panels a-c, y-axis Proportion (%), one panel each for Panel, WES, and WGS.]
Supplementary Figure 5. Comparison of genome coverage distributions for panel, whole-exome sequencing (WES), and whole-genome sequencing (WGS) data in normal samples. (a) 1 normal cell lines, sequenced by the 381-gene CancerSCAN panel. (b) 31 normal blood samples, sequenced by WES (SureSelect XT Human All Exon v kit). (c) 4 normal blood samples, sequenced by WGS (TruSeq Nano kit). All sequencing was performed on Illumina HiSeq. Coverage of each sample was normalized by the median depth of the samples processed on each platform. Samples used are listed in Supplementary Data and Supplementary Table 1.
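The normalization described in the caption can be illustrated with a minimal sketch. It assumes that the "median depth of the samples" is the median of the per-sample median depths on one platform; the function name and the input layout are hypothetical and are not taken from the study's pipeline.

```python
from statistics import median


def normalize_by_platform_median(depths_by_sample):
    """Normalize per-position depths for samples from ONE platform.

    depths_by_sample: dict mapping sample name -> list of per-position depths.
    Assumption: the platform reference depth is the median of the
    per-sample median depths.
    """
    sample_medians = {s: median(d) for s, d in depths_by_sample.items()}
    platform_median = median(sample_medians.values())
    if platform_median == 0:
        raise ValueError("platform median depth is zero; cannot normalize")
    return {
        s: [x / platform_median for x in d]
        for s, d in depths_by_sample.items()
    }


# Toy example with two samples on the same platform.
example = {"s1": [100, 200, 300], "s2": [50, 100, 150]}
normalized = normalize_by_platform_median(example)
```

Applying the function separately to the panel, WES, and WGS sample sets puts the three coverage distributions on a common, unitless scale.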
[Figure: heatmap of genes (KRAS, PIK3CA, BRAF, PTEN, ERBB, MET, FGFR1, PDGFRA, NRAS, MAPK1, IRS, EGFR) by stage IV colorectal cancer patients, grouped as cetuximab resistant or cetuximab responsive. Legend: variant type (SNV/indel, amplification), VAF (%) as circle size, LOD as background color.]
Supplementary Figure 6. Limit of detection estimates for genomic alterations. Providing limit of detection (LOD) estimates based on observed sequencing depths can be informative. For example, anti-EGFR therapy (cetuximab) is given to stage IV colorectal cancer patients only if they are wild-type for RAS. The heatmap shows the mutations in genes associated with response to anti-EGFR therapy (ref. 1), with the size of the circles corresponding to the VAFs and the LODs indicated by the background color. Annotating the detection results with LODs makes it easier to distinguish true negatives from false negatives.
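How an LOD depends on sequencing depth can be illustrated with a simple binomial model. This is a sketch under stated assumptions, not the procedure used in this study, where LODs were estimated from dilution assays: a variant is assumed to be called whenever at least `min_alt_reads` reads support it, and sequencing error is ignored. The function names, the read threshold, and the sensitivity value are all hypothetical.

```python
from math import comb


def detection_probability(depth, vaf, min_alt_reads):
    """P(at least min_alt_reads variant reads) with X ~ Binomial(depth, vaf)."""
    p_below = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below


def lod_vaf(depth, min_alt_reads, sensitivity, tol=1e-6):
    """Smallest VAF detected with the requested sensitivity at this depth.

    Returns None when the depth is too low to ever reach min_alt_reads.
    Uses bisection, since detection probability increases with VAF.
    """
    if depth < min_alt_reads:
        return None
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if detection_probability(depth, mid, min_alt_reads) >= sensitivity:
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical settings: 5 supporting reads required, 90% sensitivity.
for depth in (100, 500, 1000):
    print(depth, lod_vaf(depth, min_alt_reads=5, sensitivity=0.9))
```

Under this model a negative call at a low-depth position carries little weight, because the LOD there may be higher than the VAF of a clinically relevant subclonal mutation.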
[Figure: read depth at hot spot positions (x-axis: Hot spot positions) for FFPE and fresh samples, covering hotspots in MTOR, NRAS, DDR, ALK, IDH1, CTNNB1, PIK3CA, PDGFRA, KIT, EGFR, MET, SMO, BRAF, JAK, GNAQ, ABL1, RET, PTEN, HRAS, KRAS, FLT3, AKT1, IDH, ERBB, SMAD4, and GNA11.]
Supplementary Figure 7. Difference in sequencing coverage between FFPE and fresh samples. Distributions of read depths at 41 SNV hotspot positions (Tier 1) are shown for 9 samples (1 FFPE samples and 83 fresh samples). The average coverage is generally higher for fresh samples, but there is substantial variability across genes.
[Figure: proportions of patients by tumor type (gastric, lung, breast, colorectal, and pancreatic cancer, and others) in three categories: patients with actionable alterations, patients with non-actionable, known alterations, and patients without known alterations.]
Supplementary Figure 8. Proportion of patients (panel V, n=398) with actionable alterations, with known but non-actionable alterations, and without known alterations. The overall distribution is shown in Fig. 4c. Here, the proportions are shown separately for the five most common tumor types and for all others. Known alterations are those found in COSMIC.
[Figure: treatment timeline with CT and EGD images: conventional chemotherapy, high-depth panel sequencing of an EGD biopsy (PIK3CA E4K, VAF 4.1%), and an AKT inhibitor trial, with response assessments (PR, PD).]
Supplementary Figure 9. Second example of a patient with a clinically relevant low-allele-fraction mutation. A 7-year old female patient (SGI_CS_769) had metastatic gastric cancer with peritoneal seeding. After failing to respond to 8 cycles of capecitabine/oxaliplatin chemotherapy, she underwent an esophagogastroduodenoscopy (EGD) biopsy. Genomic profiling of the biopsy tissue revealed a PIK3CA E4K mutation with 4.1% VAF. The variant was validated by dPCR (Supplementary Data 4). The patient was enrolled in an AKT inhibitor trial and has achieved partial remission for months. Arrows on the CT (computed tomography) images indicate the location of the tumor, and the dotted circles indicate regions of peritoneal seeding. AGC, advanced gastric cancer; PD, progressive disease; PR, partial response.
[Figure: log-rank p value (y-axis) against the VAF cutoff for survival analysis (%) (x-axis).]
Supplementary Figure 10. Robustness to the VAF cut-off for survival analysis. In Fig. e, % VAF was used as the cut-off value for comparing the survival curves of low- vs high-VAF cases. This plot shows that the p values of the log-rank test are non-significant for all possible divisions into two groups.
[Figure: panels a-f, each plotting Sensitivity (%) and Mean PPV (%) against the mean depth of simulation samples, before and after filtering.]
Supplementary Figure 11. LOD and positive predictive value (PPV) for different SNV caller combinations/parameters. (a) Combination of MuTect (high-confidence (HC) mode and default contamination fraction) and LoFreq (default). (b) Combination of MuTect (HC mode and contamination fraction (.)) and LoFreq (default). (c) MuTect (HC mode and contamination fraction.). (d) MuTect (HC mode and contamination fraction.1). (e) MuTect (HC mode and default contamination fraction). (f) LoFreq (default). Vertical dotted lines indicate the read depth needed to achieve the LOD (9% sensitivity). In each case, we also show the results with and without a custom filter, which increases PPV in exchange for a small increase in the depth necessary for a given LOD. Although option (c) shows excellent performance, it does not achieve perfect sensitivity because some variants are removed by its stringent filtering criteria. Therefore, we chose to use (a), where LoFreq rescues those filtered variants.
[Figure: Sensitivity (%) (y-axis) against VAF (%) (x-axis) for two methods, in silico and manual.]
Supplementary Figure 12. Comparison of detection sensitivity between the manual and the in silico dilution assays. The lines were interpolated using the probit function. LODs at allele fractions 4%,%, 1%, and % are 19X, 44X, 11X, and 7X for the manual dilution (experimental mixing of cell lines) and 18X, 4X, 94X, and 94X for the in silico dilution, respectively.
[Figure: two histograms, y-axis Count: (a) VAFs; (b) depths.]
Supplementary Figure 13. Distributions of VAFs and depths at heterozygous SNPs in NA12878. (a) Distribution of VAFs. At these SNPs in the target exonic regions of the panel, the VAFs are centered near 0.5, but there is variability due to variation in depth. (b) Distribution of depths. Some SNPs are in hard-to-capture regions and do not yield many reads.
[Figure: schematic of a heterozygous SNP of NA12878 sequenced by NGS, followed by base replacement of variant bases with probability P (variant bases kept with probability 1-P).]
Supplementary Figure 14. Scheme of in silico dilution assay. The in silico dilution assay is based on the heterozygous SNPs in NA12878. Reads harboring the variant base at the SNP positions are sampled; the variant base is replaced by the reference base with a specified probability to achieve the desired VAF.
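The base-replacement step in this scheme can be sketched in a few lines. This is a minimal illustration operating on a list of base calls at one SNP position rather than on aligned reads; the function name is hypothetical, and the choice of replacement probability (one minus the ratio of target to observed VAF, so that the expected VAF equals the target) is an assumption rather than a formula stated in the caption.

```python
import random


def in_silico_dilute(base_calls, ref_base, alt_base, target_vaf, rng=None):
    """Lower the VAF at one heterozygous SNP by random base replacement.

    base_calls: list of single-character base calls covering the SNP.
    Each variant base is replaced by the reference base with probability
    P = 1 - target_vaf / observed_vaf (assumed; gives E[VAF] = target_vaf).
    """
    rng = rng or random.Random()
    n_total = len(base_calls)
    n_alt = sum(1 for b in base_calls if b == alt_base)
    if n_total == 0 or n_alt == 0:
        return list(base_calls)
    observed_vaf = n_alt / n_total
    # No replacement is needed if the site is already at or below the target.
    replace_prob = max(0.0, 1.0 - target_vaf / observed_vaf)
    return [
        ref_base if (b == alt_base and rng.random() < replace_prob) else b
        for b in base_calls
    ]


# Toy heterozygous site: 500 reference and 500 variant reads, diluted to ~5%.
calls = ["A"] * 500 + ["G"] * 500
diluted = in_silico_dilute(calls, "A", "G", 0.05, random.Random(0))
print(diluted.count("G") / len(diluted))
```

Because depth is left unchanged and only the variant fraction is lowered, the diluted sites keep the depth variability of real data, which is the property that makes the in silico assay comparable to the manual one.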
Supplementary Table 1. Normal HapMap cell lines used for the manual dilution assay.

Sample name   Dilution assay   Platform
NA714         Manual           Panel V
NA184         Manual           Panel V
NA189         Manual           Panel V
NA1897        Manual           Panel V
NA18488       Manual           Panel V
NA1811        Manual           Panel V
NA18867       Manual           Panel V
NA1894        Manual           Panel V
NA1918        Manual           Panel V
NA19114       Manual           Panel V
Supplementary Table 2. Classification of variants into three tiers based on the actionable information.

Tier 1
  SNV, indel, and CNV: Alterations listed as targets of cancer therapeutics by the Korean Food and Drug Administration (KFDA)/United States Food and Drug Administration (USFDA), or reported to be a candidate for clinical trials.
  Fusion: Gene-gene fusions of a fusion target gene with a known partner reported in COSMIC.

Tier 2
  SNV, indel, and CNV: Any alterations reported in COSMIC, except tier 1.
  Fusion: Gene-gene fusions of a fusion target gene with a novel partner.

Tier 3
  SNV, indel, and CNV: Any novel alterations, not reported in COSMIC.
  Fusion: Gene-gene fusions at a non-fusion target region.
REFERENCES
1. Bertotti A, et al. The genomic landscape of response to EGFR blockade in colorectal cancer. Nature 526, 263-267 (2015).