SUPPLEMENTARY INFORMATION
|
|
- Georgiana Simon
- 6 years ago
- Views:
Transcription
1 doi: /nature13394 Genome sequencing identifies major causes of severe intellectual disability Table of Contents Participants...4 Patient recruitment... 4 Patient selection... 4 Cohort representativeness... 7 Counseling and sample acquisition... 7 Whole genome sequencing and data analysis...8 Whole genome sequencing... 8 Validation of WGS variant calling... 8 Identification of de novo SNVs...8 Proof-of-principle experiments for de novo SNV detection Prioritization of candidate (de novo) mutations Prioritization of autosomal inherited SNVs for recessive inheritance Prioritization of maternally-inherited X-linked recessive SNVs in male patients Validation of candidate (de novo) SNVs Identification of (de novo) CNV and SV event Prioritization for clinically relevant dominant (de novo) and recessive CNV and SVs Validation of clinically relevant CNV and SV events
2 Cross variant type analysis for clinically relevant recessive mutations Generation of gene lists involved in ID Assessment of rare missense burden of known ID genes, candidate ID genes and genes with de novo mutations Statistical analysis for overrepresentation of deleterious mutations in patients with severe ID Overrepresentation of de novo mutations Statistical analysis of clinically relevant CNV and SV events RNA confirmation studies for IQSEC2-TENM3 gene fusion Amplicon-based deep sequencing using Ion Torrent Clinical interpretation of mutations using established diagnostic criteria Mutation impact of SNVs Functional relevance of SNVs Mutation impact and functional relevance of (de novo) CNV and SVs Classification of (de novo) variants as cause of disease Phenotypic re-evaluation Statistical evaluation of diagnostic criteria using de novo mutations identified in control individuals Calculation of de novo mutations as cause of severe ID Supplementary tables Supplementary Table 1: Detailed cohort description of 50 trios studied by WGS
3 Supplementary Table 2: Average genome sequencing statistics per individual Supplementary Table 3: Comparison of variants between WES and WGS identified in regions targeted by exome enrichment kit Supplementary Table 4: Overview of putative de novo SNVs detected through WGS Supplementary Table 5: Validation rates of autosomal candidate de novo SNVs grouped by genomic location and confidence level Supplementary Table 6: Distribution of de novo SNVs in coding sequence identified through WGS Supplementary Table 7: De novo substitution rates and type of mutations identified through WGS Supplementary Table 8: Coding de novo SNVs identified in 50 patients with severe ID Supplementary Table 9: Coding de novo mutation rates of published studies Supplementary Table 10: List of 528 known ID genes used for prioritization of variants Supplementary Table 11: List of 628 Candidate ID genes used for prioritization of variants Supplementary Table 12: Summary of de novo non-coding SNVs identified in 528 known ID genes Supplementary Table 13: De novo SNVs in non-coding sequence identified in patients with severe ID Supplementary Table 14: Phenotypic comparison of patients with mutations in known and candidate ID genes Supplementary Table 15: Molecular diagnosis per patient after WGS References
4 Participants Patient recruitment The Department of Human Genetics from the Radboud university medical center is a tertiary referral center for clinical genetics. Approximately 350 patients with unexplained intellectual disability (ID) are referred annually to our clinic for diagnostic evaluation. Prior to this referral, the majority of patients have received extensive clinical evaluations by other medical specialists such as pediatricians and (pediatric) neurologists. After referral, all patients are clinically evaluated by a clinical geneticist and, in most cases, receive a routine genetic diagnostic assessment, including a 250K Affymetrix SNP microarray analysis for the detection of copy number variations and dedicated DNA diagnostic single-gene tests based on clinical suspicion of specific syndromes. The majority of patients also receive a routine metabolic screen in serum and urine. Patient selection The clinical data of all patients tested with a SNP microarray are stored in an in-house database, which contains information of >5,600 patients with ID and/or multiple congenital anomalies. 31 The vast majority of this cohort (92%) consists of isolated patients from non-consanguineous families. In total, 62% of patients are reported to have an IQ<50. Of these, we previously recruited 100 patients with severe unexplained ID (IQ 50) for family-based exome sequencing. 32 In this series, we reported 16 patients to have a conclusive diagnostic report, and another 19 patients to have a de novo mutation in a candidate ID gene. Since the publication of this report, we performed additional experiments to increase the diagnostic yield in this series, including (i) familial segregation studies for X-linked variants identified in 3 patients; (ii) bioinformatic re-analysis of WES data in all 100 patients using new software (LifeScope v2.1) for mutations in a set of 495 known ID genes (as reported in Supplementary Table 12), (iii) CNV calling using CoNIFER 33 as described by de Ligt et al and (iv) systematic searches for additional patients carrying de novo mutations in candidate ID genes by e.g. literature reports and/or in-house diagnostic WES analyses. A summary of these analyses is presented in Table 1. Combined, this re-evaluated molecular screening increased the number of conclusive diagnoses for ID to 27/100 patients (27%). A possible cause for ID remains in 11% of patients. For a total of 62 patients WES did not reveal a genetic cause for ID. From this series of 62 WES-negative patients, 50 patients were recruited for the current study on family-based whole genome sequencing. 4
5 Table 1: Re-evaluated molecular diagnosis in 100 patients with severe ID after publication of diagnostic exome sequencing in patients with severe ID. Patient ID in de Ligt et al Molecular diagnosis 4 Highly likely Based on [gene] (inheritance) PDHA1 (X-linked) Re-evalutated molecular diagnosis Based on [gene] (inheritance) NO - Change due to: X-linked variant does not segregate 5 Possibly PHIP (de novo) Highly likely PHIP (de novo) In house 2nd case identified through diagnostic WES in patient with severe ID 6 Possibly PSMA7 (de novo) Highly likely 14 NO Highly likely EHMT1 deletion (de novo) DDX3X (de novo) CNV calling on WES data identified deletion of EHMT1 (de novo) Bioinformatic re-analysis of WES data identified de novo mutation in DDX3X; A second patient was subsequently identified by diagnostic WES in a patient with ID 18 Highly likely ARHGEF9 (X-linked) NO X-linked variant does not segregate 40 Possibly WAC/ MIB1 (de novo) Highly likely 60 NO Highly likely WAC (de novo) TCF4 (de novo) In house 2nd case identified through diagnostic WES in patient with severe ID Bioinformatic re-analysis identified a de novo mutation in TCF4 72 Possibly PPP2R5D (de novo) Highly likely PPP2R5D (de novo) 2nd case identified in house in patient with severe ID 74 Possibly KIF5C (de novo) Highly likely KIF5C (de novo) Literature reports additional patient with de novo mutation in KIF5C 77 NO Highly likely 85 NO Highly likely STXBP1 (de novo) SCN2A (de novo) Bioinformatic re-analysis identified a de novo mutation STXBP1 Bioinformatic re-analysis identified a de novo mutation in SCN2A 87 Possibly RAPGEF1 (de novo) Highly likely MECP2 (de novo) Bioinformatic re-analysis identified MECP2 de novo mutation 91 Possibly EEF1A2 (de novo) Highly likely EEF1A2 (de novo) 2nd patient in literature with exact same de novo mutation 92 Possibly MYT1L (de novo) Highly likely MYT1L (de novo) Overlapping small CNV containing MYT1L described in literature 96 NO Highly likely ADNP (de novo) Bioinformatic re-analysis of WES data identified a de novo 5
6 mutation in ADNP; A second patient was subsequently identified by diagnostic WES in a patient with ID 6
7 Cohort representativeness Details on the clinical characteristics of all patients are provided by de Ligt et al For the purpose of clarity, Table 2 provides the patient identifiers used in this study as well as the ones used in the previously published WES study. 32 Based on the extensive clinical and diagnostic work-up performed for this series, including genomic microarray analysis (250K Affymetrix SNP microarray) as well as diagnostic exome sequencing our series represents the end point of current diagnostic strategies for patients with unexplained severe ID. Table 2: Cross-references for patient numbers used in de Ligt et al. and the current study Trio ID current study Trio ID de Ligt et al Trio ID current study Trio ID de Ligt et al Trio ID current study Trio ID de Ligt et al Counseling and sample acquisition All 50 patients were part of our previous study on diagnostic exome sequencing in patients with severe ID and were seen and counseled by clinical specialist (MHW, BvB, TK, BBAdV or HB). This study was approved by the Medical Ethics Committee of the Radboud university medical center (NL ; NL ; NL ). All participants or their legal representatives signed informed consent. 7
8 Whole genome sequencing and data analysis Whole genome sequencing Whole Genome Sequencing (WGS) was performed by Complete Genomics (Mountain View, California, USA) as previously described by Drmanac et al. 35 Prior to sequencing, a minimum set of quality controls was performed including DNA measurement using picogreen and gel electrophoreses to exclude sample degradation. Sequence reads were mapped to the reference genome (GRCh37) and variants were called in the aligned genome according to the methods described by Carnevali et al. using version 2.4 of the Complete Genomics software. 36,37 All samples passed internal Complete Genomics quality control parameters and a gender check and 96-SNP genotyping assay were performed to exclude sample-swaps. For additional quality control metrics Complete Genomics calculated the ratio of variants reported in dbsnpv137 to all variants identified (on average 97.3% of all called variants and 98.5% of high confidence variants). Validation of WGS variant calling As proof-of-principle experiments, we performed WGS on samples from 8 patients with known pathogenic mutations. All eight known pathogenic mutations were accurately identified (Table 3a). In addition, for all 50 trios (totaling 150 samples), we compared the overlap between variants previously called by WES and WGS. To compare the performance of both platforms, we first filtered the WGS variants to only include variants from within the exome targets (Agilent SureSelect version 2). The differences in variants identified could be explained by annotation differences between WGS and WES variant calling pipelines. For example, WGS might have called a three base substitution (e.g. ACG > TCC) whereas WES called two separate variants (e.g. A > T and G > C). In addition, we annotated all variants with the average exome target coverage for the same sample. Variants not called by WES but only called by WGS could be found predominantly in regions of low WES coverage. Vice versa, variants only identified in WES could be found more frequently in regions of above average exome coverage, likely indicating alignment problems due to repeats (Supplementary Table 3). Additionally, we identified a bias in our initial variant calling algorithm that only calle variants if they were found in reads on both strands, thereby missing some of the de novo variants. Identification of de novo SNVs Variants were identified by Complete Genomics reported in var format. The initial set of candidate de novo mutations, including both SNVs and insertion/deletion events, was generated using the cgatools calldiff program (Complete Genomics). This program compares 8
9 Table 3: De novo pathogenic and non-pathogenic variants previously detected using WES used to optimize de novo variant detection Trio ID Gene Genomic position (GRCh37) Protein level A: Known pathogenic mutations in patients with severe ID identified using WES Variant in WGS data Identified as de novo mutation? Pos. control 1 TCF4 Chr18(GRCh37):g C>T p.(arg576gln) + n.a. Pos. control 2 SCN2A Chr2(GRCh37):g G>A p.(trp1398*) + n.a Pos. control 3 GRIN2B Chr12(GRCh37):g G>A p.(pro553leu) + n.a Pos. control 4 GRIN2A Chr16(GRCh37):g G>C p.(pro522arg) + n.a Pos. control 5 PDHA1 ChrX(GRCh37):g delinsAGA p.(pro110argfs*71) + n.a Pos. control 6 GATAD2B Chr1(GRCH37):g G>A p.(gln470*) + n.a Pos. control 7 CTNNB1 Chr3(GRCh37):g del p.(ser425thrfs*11) + n.a Pos. control 8 LRP2 Chr2(GRCh37):g del p.(gly4146glufs*2) + n.a B: De novo mutations identified using WES 2 that are not considered to be the cause of ID 2 TRIO Chr5(GRCh37):g A>T p.(aps1368val) C15orf40 Chr15(GRCh37):g C>T p.(ala89thr) RB1 Chr13(GRCh37):g T>C p.(tyr173his) CNGA3 Chr2(GRCh37):g G>T p.(trp335cys) CXXC11 Chr2(GRCh37):g G>A p.(gly412asp) CLRN3 Chr10(GRCh37):g A>T p.(ile22asn) OSBPL9 Chr1(GRCh37):g C>T p.(pro276leu) MFAP3 Chr5(GRCh37):g A>G p.(ser53gly) FHDC1 Chr4(GRCh37):g G>C p.(leu297phe) ALG13 ChrX(GRCh37):g A>G p.(asn107ser) KRT32 Chr17(GRCh37):g C>T p.(glu89lys) HADHA Chr2(GRCh37):g C>T p.(=) ZNF831 Chr20(GRCh37):g G>A p.(arg1615his) USP8 Chr15(GRCh37):g G>A p.(=) NEK1 Chr4(GRCh37):g T>G p.(lys901asn) + -* 36 ATP7B Chr13(GRCh37):g G>A p.(=) EFS Chr14(GRCh37):g G>A p.(=) ZFYVE16 Chr5(GRCh37):g C>T p.(ala380val) C20orf26 Chr20(GRCh37):g A>C p.(thr790pro) RBL2 Chr16(GRCh37):g G>T p.(val99phe) ABCC8 Chr11(GRCh37):g G>A p.(=) + + +: Identified; -: not identified; n.a.: not assessed. *Variant was detected as somatic variant in the patient after deep-sequencing (see also Figure 1). Trio ID refer to numbers used in this study. 9
10 two variant files to determine where and how two genomes differ and returns a confidence score for each comparison (e.g. for de novo mutations, two scores are assigned one for the comparison of the child to his/her father and one for the comparison to his/her mother). This program can be used under the assumption of a non-diploid genome (suited for the calling of somatic mutations) as well as under the assumption of a diploid genome (suited for the calling of germline mutations). In our analysis for candidate de novo mutations, we combined these approaches; first, we compared the set of all initial candidate autosomal variants in the child separately to those of each of the parents under the assumption of a non-diploid genome. To this end, calldiff thus determined whether a variant is truly found in only the child s genome by gathering variants into superloci and performing a local refinement of variant calls. Next, we performed the same analysis under the explicit assumption of a diploid genome. For both analyses, the individual comparisons to each parent were merged and filtered for variants with varquality = PASS. The above analyses resulted in two separate lists of candidate de novo variants, diploid and non-diploid, which were merged into a single list. Note that the diploid calls were for the most part a subset of the non-dipolid calls. Variants occurring in both lists were assigned scores from the non-diploid list. Based on the rank order of the two confidence scores for each candidate de novo mutation, we binned the variants in three groups: Low confidence: at least one score < 0 Medium confidence: both scores 0 but at least one score < 5 High confidence: both scores 5 On average, 82 high, 99 medium and 1,279 low confidence de novo SNVs were called on the autosomes of each patient (Supplementary Table 4). As calldiff requires the same ploidy state for comparison of variants, we applied a different method for calling of de novo variants on sex chromosomes, including a separate analysis for male and female offspring. Variants on the X chromosome in female patients were compared to the maternal genome (using the diploid assumption) and binning of candidate de novo variants was based on a single confidence score (i.e. comparison to the single parent rather than both as performed for the autosomes). For male patients, variants were prioritized based only on variant call quality (i.e. varfilter = PASS). Proof-of-principle experiments for de novo SNV detection In the cohort of 50 patients selected for this study, WES had not revealed a de novo mutation as the cause for ID. Nonetheless, 21 de novo mutations were detected in these 50 patients that most likely represent background mutations explained by the human per-generation de novo 10
11 mutation rate. The de novo substitution rate in this set of patients (0.42) is lower than the full series of 100 patients. Analysis of previous WES for quality metrics, including coverage, however, did not reveal a significant difference between this subset of 50 patients compared to the 50 patients that were not selected for our current study (median WES coverage for patients in WGS study: 62.9-fold, and for patients not part of this study 62.2-fold; Mann-Whitney U-test, P=0.85). To further verify de novo mutation detection, we assessed whether these de novo mutations were accurately detected using WGS. Using the method described above, 21/21 mutations were present in the patient s genome (present in var file). In the downstream analysis for de novo mutation detection, 19/21 mutations were routinely detected as a de novo (Table 3b). Further evaluation of why 2/21 mutations were not identified as a potential de novo mutation revealed that one mutation (in CXXC11) was absent from the de novo candidate list due to low QC values after comparison to parental samples ( PASS;SQLOW, i.e. a somatic score < -10). Interestingly, the other mutation (in NEK1) was not correctly annotated as candidate de novo since the variant was suggested to be present in mosaic state in the patient (confidence scores at for comparison to parental genomes were -4 and -4). Follow-up experiments with deep-sequencing up to >25,000 sequence depth using Ion Torrent (Life Technologies) indeed confirmed this variant to be present at 32%, thereby confirming its somatic origin and explaining why it was not correctly identified in our pipeline. Prioritization of candidate (de novo) mutations All variants reported in the candidate de novo lists were annotated using our custom in-house bioinformatic pipeline, including information on their genomic position (exon, splice site, intron, UTR, promoter region), their predicted effect on protein level, and their frequency in variant databases (in-house; dbsnpv37; Exome variant server (ESP), 38 the genome of the Netherlands (GoNL), 39 and the Wellderly dataset 40 ). We then proceeded to systematically validate all variants that according to RefSeq were: within the coding sequence and canonical splice sites (+2bp/-2bp) and that o had a high or medium confidence of being de novo o disrupted the gene by causing a frameshift or stop mutation o were within a known ID gene (Supplementary Table 10) within a acceptor (+20bp) or donor (8bp) splice site and that had a high or medium confidence of being de novo outside of the coding regions, had a high confidence of being de novo and were within: o the introns of a known ID gene o the UTR of a known ID gene
12 o the promoter region of a known ID gene according to ENCODE information (Chromatin state segments of nine human cell types (Broad ChromHMM) and transcription factor binding sites (Txn Factor ChIP)). 41 This prioritization scheme resulted in 382 potential de novo autosomal mutations for further validation using Sanger sequencing. For the X-chromosome, candidate de novo mutations were selected in a similar way as autosomal variants, albeit that only a single confidence score was used (see Identification of de novo SNV mutations ). For male patients we selected variants with VarFilter = PASS, followed by the same filter steps as used in female patients. The systematic filtering process identified 36 candidate de novo mutations on the X-chromosome for further validation using Sanger sequencing. Prioritization of autosomal inherited SNVs for recessive inheritance Using the Complete Genomics software cgatools listvariants and cgatools testvariants, variants identified in the patient were annotated with the parent-of-origin. Using these annotated files, we selected coding variants (or affecting canonical splice sites) in known ID genes (Supplementary Table 10) that segregated according to a recessive inheritance pattern, including both homozygous variants (genotype calls for patient, father, mother respectively) as well as compound heterozygous mutations (two variants in same gene, one showing genotype call and one for patient, father, mother respectively). To further prioritize for clinically relevant mutations, variants were filtered for a minor allele frequency <=2% using our in-house variant database, GoNL 39, dbsnpv137, and the Wellderly data set 40. Additionally, synonymous variants as well as missense mutations with a PhyloP score <3.5 were excluded for analysis. 42 This analysis did not identify recessive SNV mutations, either compound heterozygous or homozygous. Prioritization of maternally-inherited X-linked recessive SNVs in male patients X-linked variants that segregated from mother to male offspring were identified using the Complete Genomics cgatools listvariants and testvariants programs by genotype calls (genotype child, father, mother). We selected according to an X-linked inheritance assuming a pattern of a heterozygous variant identified in the mother and the same hemizygous variant in the male offspring. Variants were subsequently annotated using a custom annotation pipeline and prioritized for coding and canonical splice-site variants located in known ID genes (Supplementary Table 10). To further prioritize for clinically relevant mutations, variants were filtered for a minor allele frequency <=2% using our in-house variant server, GoNL 39,
13 dbsnpv137, and the Wellderly data set. 40 Additionally, synonymous variants as well as missense mutations with a PhyloP score 5.0 were excluded for analysis. 42 This filter process revealed one X-linked variant in patient 24, for which segregation analysis has so far been inconclusive in determining its relevance to the patient s phenotype. Validation of candidate (de novo) SNVs Validation and de novo testing for all candidate de novo mutations was performed using standard Sanger sequencing approaches. Primers were designed to surround the candidate mutation, and PCR reactions were performed using RedTaq Readymix PCR reaction mix (Sigma-Aldrich, St. Louis, MO, USA), AmpliTaqGold 360 Master mix (Life Technologies, Bleiswijk, NL) and/or Phire Hot Start Master mix (Finnzymes, Bioke, NL). Primer sequences and PCR conditions are available upon request. Of 382 autosomal candidate de novo mutations, 136 variants (35.6%) were confirmed by Sanger sequencing of DNA of the patient. Analyses of parental DNA samples subsequently established de novo occurrence for 131 of these (34.3%). Of note, validation rates correlated well with the confidence levels, being 3.0%, 20.0% and 78.9% for low, medium and high confidence autosomal candidate de novo variants respectively (Supplementary Table 5). For candidate de novo SNVs on the X-chromosome, 8/36 (22%) were validated and shown to be of de novo origin using Sanger sequencing. Identification of (de novo) CNV and SV event Two different approaches were used to detect copy number variation, i.e. one based on read depth information (referred to as CNV calling), and one based on discordant read pairs (referred to as SV calling), both provided by Complete Genomics. The latter method (SV calling) was used both to provide additional evidence to support for CNV calls, as well as to detect copy neutral events such as genomic inversions and/or balanced translocations. That is, a CNV identified using the read depth approach (with a resolution determined by the window bin-size) could be fine-mapped using the SV algorithm to single nucleotide level (using the discordant read pairs). In addition, the SV detection method was able to detect copy number variants below the bin-size used by the read depth method. In total, 13,799 CNVs were detected by Complete Genomics based on deviations in the sequencing read depth in comparison to a baseline using 2kb, GC-corrected windows and a Hidden Markov Model. 37 The called CNV segments were filtered on the ploidity score > 10 and subsequently filtered based on frequency in the population and likely functional impact. Annotations were based on gene components (as per RefSeq), overlap with CNVs from GoNL release 1 ref 39, overlap with non-related parents from the dataset, as well as Wellderly data set 40 using both a 80%, 50% reciprocal and 1bp overlap. Similarly regions were annotated for overlap
14 with the ID gene list (Supplementary Table 10) using a 1bp overlap. Subsequently CNVs were filtered based on frequency in the population and likely functional impact. CNVs were considered rare when i) they were found in less than 1% of the Wellderly population, ii) less than or equal to two non-related parents out of a subset of 48 parents, iii) less than or equal to three out of 250 GoNL trios. A copy number aware reciprocal overlap of 80% was used that considered two segments to overlap when the control CNV had the same copy number or higher than the patient s CNV gains or the same copy number or lower than the patient s CNV losses. In addition, only male individuals were included as control samples for chromosome X. 551 rare CNV segments were identified outside hypervariable regions, of which 130 overlaped with a coding region of the genome, these segements were then considered for prioritization based on inheritance. CNVs were visualized using Biodiscovery Nexus software. Secondly, SV calling was performed using uniquely mapped discordant read pairs. Discordant read pairs were clustered into junctions based on genomic location and mapping orientation and finally merged into 194,913 events: tandem duplications (n=11,422), distal duplications (n=6,532), deletions (n=99,471), inversions (n=68,461), translocations and complex events (n=8,980). The junction breakpoints were refined where possible by local reassembly of neighboring reads. 37 After SV calling, filtering was performed on all duplication types, deletions, inversions and translocations to identify rare de novo SVs occurring, i) in 1% of Wellderly samples 43, ii) occurring in neither parent, iii) the least frequent junction involved in the event should have a Complete Genomics baseline frequency of 0.1 and iv) mean read depth of at least 10 reads across the event. SV events were considered overlapping when all genomic regions contributing to the event combined overlapped reciprocally by more than 80% or had matching breakpoints within a 50 bp window. In total 1,350 rare events were identified of which 67 had at least one high confidence junction and were considered for prioritization based on inheritance. Prioritization for clinically relevant dominant (de novo) and recessive CNV and SVs A candidate list of CNV events was created to include regions overlapping with RefSeq genes and segments with a higher copy number gain or a lower copy number loss than the patient s own parents. After filtering, the CNVs were analyzed for de novo occurrence, autosomal recessive inheritance, as well as X-linked inheritance and compound heterozygosity. Losses or gains in the patient were considered de novo when overlapping parental segments (80% reciprocal) were copy number neutral. Potential autosomal recessive inheritance CNVs were extracted by identifying CNVs in which the parents had overlapping segments of the same type (i.e. gain or loss) thereby extracting homozygous deletions and triplications. X-linked inheritance was considered for male offspring when the mother had an overlapping segment of
15 the same CNV type. Finally, compound events were investigated by comparing events that overlap the same RefSeq gene when one CNV was maternally inherited and the other was paternally inherited. SVs were annotated for overlap with coding sequence or to occur in non-coding sequencing of a known ID gene (Supplementary Table 10). To identify genic SVs with autosomal recessive or X- linked inheritance the filtering was adjusted to include events overlapping SVs from the same SV class segregating from parent to patient. To detect SVs occurring via X-linked inheritance SVs were identified occurring in male offspring and their mothers. Finally compound SV-SV events were investigated by comparing events that overlap the same RefSeq gene. Variants were only paired when one variant was maternally inherited and the other paternally. The resulting compound events were filtered to identify high confidence rare events (occurring <= 1% in the Wellderly population and having at least 10 supporting read pairs). Validation of clinically relevant CNV and SV events All candidate pathogenic CNVs and SVs were validated using the Affymetrix CytoscanHD microarray platform and/or breakpoint spanning PCRs. Affymetrix arrays were used according the manufacturer s instructions. For breakpoint spanning PCRs, primer sequences were chosen to be located ~ base pairs from the predicted junction. Primer sequences are available upon request. PCR reactions were performed using RedTaq Readymix PCR reaction mix (Sigma-Aldrich, St. Louis, MO, USA), AmpliTaqGold 360 Master mix (Life Technologies, Bleiswijk, NL) and/or Phire Hot Start Master mix (Finnzymes, Bioke, NL). Afterwards, Sanger sequencing was used to verify the junction fragments. Cross variant type analysis for clinically relevant recessive mutations Cross variant compound events were considered when two variants of different mutation types affected the same gene but were inherited from different parents (e.g. an inherited SNV from father and an inherited CNV from mother, both affecting the same gene). Cross variant compound analysis was performed between all variant types called by Complete Genomics. SNVs that were heterozygous in the child and found in heterozygous state in only one of the parents were considered for the compound analysis. Such SNVs were extracted by filtering the output of cgatools listvariants and cgatools testvariants by genotype calls or for patient, father, mother, respectively. CNVs that were considered for compound analysis must have only one parent with an overlapping CNV of the same type with more than 80% reciprocal overlap. Likewise, SVs were considered for compound analysis if overlap was identified in only one of the parents using a reciprocal overlap of 80% or more. Of note, the
16 analysis between SVs and CNVs excluded regions that had an reciprocal overlap of 50% or matching breakpoints within a 200bp window, to avoid predicting events that were in fact the same variant. All resulting cross compound variants were filtered on frequency in the control population (see above) and ranked on likely functional importance, e.g. coding exons having a higher priority than introns. This analysis did not reveal any candidates of potential clinical relevance. Generation of gene lists involved in ID To systematically search for genes involved in ID phenotypes and to classify them as known, or candidate ID gene the following actions were performed: 1. All protein-coding variants reported in HGMD (Professional th December 2013) that were reported to occur in patients with a phenotype of intellectual disability or mental retardation (n=368 genes). HGMD attempts to catalog known (published) gene lesions responsible for human genetic disease. Recent publications however have shown that some mutations collated do not in fact represent disease-causing mutations, hence, we subsequently annotated and prioritized all mutations based on mutation type (synonymous, missense, nonsense, splice site, insertion/deletion, gross CNV) and genomic conservation (phylop 3.5). We considered mutations to be pathogenic when they were either (i) deleterious (nonsense, splice, insertion/deletion and gross CNV) or (ii) represented missense mutation with a disease mutation (DM) annotation. Next, we counted the number of patients known to HGMD with pathogenic mutations in each of the genes. 2. We systematically collected all de novo mutations reported by large-scale exome sequencing studies that focused on the detection of de novo mutations in patients with intellectual disability. 32,44,45 These data were supplemented with de novo mutations identified in patients with epilepsy 46, autism and schizophrenia because of phenotypic overlap with ID. To prioritize likely pathogenic mutations, variants were prioritized according to the following: The variant was identified in a proband The variant was validated by an alternative technique and proven of de novo origin De novo variants that were previously identified by de Ligt et al Ref 32 and occurring in the current series of 50 patients with severe ID were excluded (to prevent over-interpretation of genes Table 3b in supplement) - Only non-synonymous mutations were taken into account The PhyloP conservation of missense mutations was
17 For all genes (n=462) that remained after this selection, we subsequently interrogated HGMD to see how many additional patients were reported to have mutations in these genes. 3. We performed a systematic literature search in PubMed using the search terms Whole exome sequencing AND Intellectual Disability (23 rd January 2014). This search returned 110 papers describing an individual gene and its relation to ID as well as 24 studies describing multiple genes. All genes (n=154) were manually extracted from these publications and subsequently annotated with the number of patients reported. Of note, for publications describing families with autosomal recessive ID, each family was counted as one patient. 4. Within the diagnostic section of our Department of Human Genetics, we routinely screen patient-parents trios with severe intellectual disability using exome sequencing. During data analysis, prioritization of (de novo) variants is based on a list of ID genes previously reported in patients with ID (n=494 genes). Due to the phenotypic overlap of ID with epilepsy and/or several metabolic disorders, this gene lists represents a wide spectrum of genes involved in ID. We prioritized this gene list in based on the number of patients with pathogenic mutations using the variants from action 1 as well as from our in-house diagnostic and research databases collecting patients with de novo mutations. Based on published studies screening large patient cohorts we calculated that 1,476 patients were systematically screened. Including the single gene studies, we therefore estimated the total number of patients to be roughly 2,500. We then calculated the required number of patients with a de novo mutation in a single gene to achieve a 0.1/18,424 (the number of genes being tested) probability of a false positive. More than 90% of all 18,424 genes have a coding size of 3,400 bases (or less). We subsequently used this as a conservative gene size estimate. Applying a Poisson distribution showed that mutations in a single gene in 5 patients or more are required for it to be a ID gene, whereas genes with mutations in one or more but less than 5 patients were used as candidate ID gene. Table 4: Classification of genes according to the frequency of reported pathogenic mutations Total genes Known ID gene Candidate ID gene HGMD with phenotype ID or MR PubMed search WES+ID Diagnostic gene list Large-scale sequencing studies with overlapping phenotypes Total unique genes
18 Based on these calculations, the four lists of known ID genes and four lists of candidate ID genes were combined (Table 4), checked for overlap and adjusted for gene category when necessary (when after combining multiple searches 5 patients were reported). Both lists were then filtered for unique genes. These efforts collectively resulted in a known ID gene list containing 528 genes (Supplementary Table 10), and a candidate ID gene list containing 628 genes (Supplementary Table 11). Assessment of rare missense burden of known ID genes, candidate ID genes and genes with de novo mutations Similar to Petrovski et al. 54 and Kurana et al. 55 we evaluated the tolerance of genes for variation by assessing variation burden in the normal population. We downloaded all variants called in 6,503 exomes from the Washington variation server (ESP6500SI) 38 from which we selected all missense and synonymous variants not associated to disease, hence removing 1) insertions and deletions 2) nonsense variants 3) variants that incorporated multiple bases or alleles 4) variants of which the effect on protein could not be inferred using refseq annotation and 5) variants that were reported in the HGMD database. 56 This resulted in a total of 1,118,178 SNVs (696,013 missense, 422,165 synonymous) across 18,424 refseq genes. Based on the hypothesis that functionally more important genes harbor less variation we calculated for each gene two separate features: the number of positions where a rare variant in the population (<1%) results in a missense change, per 1000 base pairs of coding sequence (Ns/Kb) and the ratio of the number of positions where a variant in the population results in a missense change to the number of positions resulting in a rare synonymous change (Ns/Ss). For a given set of genes (e.g. all RefSeq genes) we calculated the overall measures by taking the median of scores of the individual genes in the set. Gene lengths were calculated by summing the lengths of the individual coding exons of a gene. We tested for significant differences in tolerance to varation between gene sets by permutation testing against all RefSeq genes; the median values for each gene sets was compared to that from generating 1,000,000 gene sets of the same size with randomly chosen genes from the complete list of refseq genes. As a proof-of-principle, we validated the method by comparing rare missense burden of the known ID genes (Supplementary Table 10) to genes that are known to be loss-of-function tolerant. 57 From the list by MacArthur et al. 57 we excluded genes in which variants were identified that (1) do not lead to nonsense-mediated decay (i.e. truncating variants located within the last exon), (2) that truncated only specific transcripts of a gene, (3) were not validated by an alternative technique, or are otherwise indicated as potential false positives. This resulted in a list of 242 variants compromising 225 unique genes of which 169 had a refseq annotation. We found a significant lower tolerance for rare missense variants in known ID genes than expected (Median Ns/Kb =
19 17.68, P < 1.0E-06; Median Ns/Ss = 1.5, P<1.0E-06) as well as in candidate ID genes (Median Ns/Kb = 17.61, P<1.0E-06; Median Ns/Ss = 1.545, P<1.0E-06), whereas we found a significantly higher tolerance in loss-of-function tolerant genes (Median Ns/Kb = 26.49, P<1.0E-06; Median Ns/Ss = 2.214, P<1.0E-06) (Extended Data Figure 1). We repeated this analysis for the 9 known and 10 candidate ID genes in which we found de novo mutations in our patients. These analyses showed a significant lower tolerance for rare missense variants for the 9 known ID genes (Median Ns/Kb = 9.45, P = ; Median Ns/Ss = 0.71, P=2.03E-04) as well as for the 10 candidate ID genes (Median Ns/Kb = 12.8, P = 0.013; Median Ns/Ss = 1.27, P=0.037). P-values from both scores were combined using Fisher s combined probability test. Statistical analysis for overrepresentation of deleterious mutations in patients with severe ID We compared the proportion of de novo loss-of-function (LoF) mutations (n=18, including nonsense, frameshift and canonical splice site) identified in our cohort to frequencies reported in healthy controls from multiple published studies (Table 5). 45,47,49-51,53 To test significance, we used a two-sided Fisher s exact test as implemented in R. 29 Overrepresentation of de novo mutations Overrepresentation of de novo mutations in known and candidate ID gene lists was calculated by first calculating the total coding size in the genome based on refseq annotation (32,342,354) and comparing the coding de novo rate to the coding de novo rate within the coding sequence Table 5: Overview of de novo mutations in healthy individuals Number of controls (c) or siblings (s) Total LoF Missense Synonymous Rauch et al. 20 (c) O'Roak et al. 50 (s) Sanders et al.* 200 (s) Iossifov et al.** 343 (s) Gulsuner et al. 84 (c) Xu et al.*** 34 (c) Total Putative LoF includes nonsense, frameshift and canonical splice site mutations based on gene annotation; * study did not consider indels; ** not all de novo mutations were validated by Sanger sequencing; ***excluded splice site variants if not in canonical di-nucleotide of splice site. For some studies the numbers deviate from the numbers given in the main publication, numbers considered here were retrieved from de novo mutation overviews from supplementary tables
20 of the gene list using a two-sided Fisher s Exact test. Overrepresentation for loss-of-function mutations was subsequently calculated using a two-sided Fisher s exact test and control data described in Table 4. Statistical analysis of clinically relevant CNV and SV events Enrichment analysis for known ID genes was performed in comparison to de novo CNVs identified in controls 58,59 using Fisher s Exact test. In addition, CNV and SV events identified as clinically relevant were tested for the occurrence of exonic CNVs within the same gene in a larger cohort of both inhouse individuals affected with ID (n=7,743) and public as well as inhouse control individuals (n=4,056 of which 2,108 were male). 60,61 Microarray experiments were performed on these cohorts to identify CNVs using the Affymetrix CytoScanHD (n=10,544) and Affymetrix 6.0 (n=1,255) microarray platforms. All experiments passed quality control (MAPD 0.25 and SNP QC 15), after which CNVs were identified using Affymetrix Power Tools v The frequency of CNVs identified in these two groups was then compared and odds ratios calculated. Multiple testing correction was performed using Benjamini Hochberg with a FDR of 0.1. In the case of events occurring on the X-Chromosome only male controls were used in the comparison. RNA confirmation studies for IQSEC2-TENM3 gene fusion To assess the presence of a gene fusion of IQSEC2-TENM3 in patient 31, Epstein Barr Virus transformed lymphoblastoid cells (EBV-LCLs) were cultured with and without the addition of cyclohexamide. As negative control, EBV-LCLs from a non-related control sample was used. Total RNA was isolated using a RNeasy mini columns (QIAGEN) according the instructions of the manufacturer. A total of 2 µg was subsequently processed for cdna generation using the M MLV Reverse Transcriptase mix (Invitrogen). As a result of the inverted insertion of TENM3- ex22-27 duplication in intron 2 IQSEC2, it was predicted that an inframe cdna were to be generated. As such, forward primers were chosen to be located in exon 2 of IQSEC2, whereas the reverse primers were chosen to be located in exon 23 of TENM3. PCRs were then performed using AmpliTaqGold 360 Master mix (Life Technologies, Bleiswijk, NL). Generated PCR products were Sanger sequenced and mapped against a the predicted fusion gene sequence using Vector NTI (Life Technologies)
21 Amplicon-based deep sequencing using Ion Torrent To confirm the somatic presence of mutations, PCR amplicons were generated using standard PCR protocols and used for deep-sequencing on an Ion PGM sequencer (Life Technologies). PCR products were sheared to 200-bp fragments by use of Covaris E210 (Covaris Inc.). Library preparation was performed using the Ion Xpress Plus Fragment Library Kit in combination with the Ion Xpress Barcode adapters 1-96 kit. Emulsion PCR and enrichment were carried out on the Ion OneTouch System using the Ion PGM TemplateOT2 200 Kit as described by the manufacturer. Sequencing was performed using the 200 kit v2, with Ion Sphere particles loaded to an Ion314 or Ion316 chip (Life Technologies). Raw sequencing reads were mapped to the reference genome (hg19) using the Torrent Server (part of the PGM software) with use of the Torrent suite software. FASTQ files were subsequently analyzed and visualized in SeqNext (JSI Medical systems). Clinical interpretation of mutations using established diagnostic criteria For each mutation a standardized pipeline was applied to interpret gene function and mutation impact (Supplementary Table 8). This pipeline was previously described by de Ligt et al and is in concordance with protocols for variant interpretation by Bell et al and Berg et al Essential steps in classification and interpretation are described in more detail below. As a consequence of this systematic analysis, de novo mutations in known/candidate ID genes may be classified as not relevant due to lack of mutation impact. Also, other seemingly severely disruptive mutations may be classified as not relevant for the ID phenotype as they either (i) are involved in phenotypes not related to ID (e.g. cardiomyopathies) or (ii) evidence for biological functioning related to ID is (currently) absent. Despite these potential pitfalls, the classification and interpretation as described below has been shown to be very powerful and is used on daily basis in our diagnostic laboratory of the department (accredited to the CCKL Code of practice, which is based on EN/ISO (2003) (Registration numbers R114/R115, Accreditation numbers 095/103)). Furthermore, the use of this classification system in defining new candidate ID genes (e.g. for genes in which, to our knowledge, this report describes the first de novo mutation in a patient with ID), has proven successful as demonstrated in Table 1 of this supplement (page 5); For approximately 66% of all candidate genes we previously reported 32,44, additional de novo mutations in patients with overlapping clinical phenotypes have since been identified
22 Mutation impact of SNVs Mutation impact for (de novo) SNVs on protein level was assessed in two different ways. Firstly, mutations were classified as deleterious (reported as D in Supplementary Table 8) if the mutation resulted in (i) a nonsense mutation, (ii) frameshift, (iii) inframe insertion/deletion event, or (iv) splice defect (assessed using Interactive Biosoftware Alamut version 2.3-2). For missense mutations prediction of pathogenicity was evaluated using Condel 64 (Referred to as P in Supplementary Table 8). As an additional indicator for mutation severity, the genomic conservation was used, assessed as a PhyloP 3.5 (referred to as C, Supplementary Table 8). 42,44 This specific PhyloP score is based on comparison of the PhyloP score for all missense mutations listed in HGMD 56 (a catalog of disease causing mutation) and dbsnp (as proxy for variants in healthy individuals). Whereas no PhyloP score is mutually exclusive to either type of database, the relative frequency of missense variants with a PhyloP score 3.5 is higher in HGMD than in dbsnp. 44 Functional relevance of SNVs A systematic literature study using various resources such as PubMed was conducted for all mutated genes to identify their potential link to ID (referred to as F, Supplementary Table 8). Additional functional support as provided by enrichment for Gene Ontology (GO) Biological Processes and/or mouse phenotypes (MP) known to be involved in known ID genes 32. To determine which GO and MP terms were enriched, the list of 528 known ID genes (Supplementary Table 10) was used. Next, all GO and MP terms attributed to genes with de novo mutations were systematically compared to those ID enrichmed terms; Mutations scored positive for these analysis if there was overlap for the enriched terms (referred to as G and M in Supplementary Table 8, respectively). In addition, the brain expression profile for each gene was assessed using data derived from the Human Brain Transcriptome database (version 2-dec-2013), which provides spatio-temporal information on genes in human brain development. 65 A gene was considered to be brain expressed if the Log2 intensity was 6 for a minimum of three periods as described by Neale et al. 48 (referred to as B in Supplementary Table 8). When no brain expression data was available, the expression was considered to be unknown (referred to as u ). Of note, variants that were previously identified using WES were re-evaluated in this process. Mutation impact and functional relevance of (de novo) CNV and SVs Mutation impact of (de novo) CNVs and SVs was evaluated using established clinical diagnostic methods. 31,66,67 In brief, genomic regions affected by copy number alterations are evaluated for the frequency in the normal population and/or occurrence in other patients with clinical
Variant Classification. Author: Mike Thiesen, Golden Helix, Inc.
Variant Classification Author: Mike Thiesen, Golden Helix, Inc. Overview Sequencing pipelines are able to identify rare variants not found in catalogs such as dbsnp. As a result, variants in these datasets
More informationGlobal variation in copy number in the human genome
Global variation in copy number in the human genome Redon et. al. Nature 444:444-454 (2006) 12.03.2007 Tarmo Puurand Study 270 individuals (HapMap collection) Affymetrix 500K Whole Genome TilePath (WGTP)
More informationIntroduction to genetic variation. He Zhang Bioinformatics Core Facility 6/22/2016
Introduction to genetic variation He Zhang Bioinformatics Core Facility 6/22/2016 Outline Basic concepts of genetic variation Genetic variation in human populations Variation and genetic disorders Databases
More informationNature Genetics: doi: /ng Supplementary Figure 1. PCA for ancestry in SNV data.
Supplementary Figure 1 PCA for ancestry in SNV data. (a) EIGENSTRAT principal-component analysis (PCA) of SNV genotype data on all samples. (b) PCA of only proband SNV genotype data. (c) PCA of SNV genotype
More informationCNV Detection and Interpretation in Genomic Data
CNV Detection and Interpretation in Genomic Data Benjamin W. Darbro, M.D., Ph.D. Assistant Professor of Pediatrics Director of the Shivanand R. Patil Cytogenetics and Molecular Laboratory Overview What
More informationNature Biotechnology: doi: /nbt.1904
Supplementary Information Comparison between assembly-based SV calls and array CGH results Genome-wide array assessment of copy number changes, such as array comparative genomic hybridization (acgh), is
More informationNature Genetics: doi: /ng Supplementary Figure 1
Supplementary Figure 1 Illustrative example of ptdt using height The expected value of a child s polygenic risk score (PRS) for a trait is the average of maternal and paternal PRS values. For example,
More informationSUPPLEMENTARY INFORMATION
doi:10.1038/nature13908 Supplementary Tables Supplementary Table 1: Families in this study (.xlsx) All families included in the study are listed. For each family, we show: the genders of the probands and
More informationAbstract. Optimization strategy of Copy Number Variant calling using Multiplicom solutions APPLICATION NOTE. Introduction
Optimization strategy of Copy Number Variant calling using Multiplicom solutions Michael Vyverman, PhD; Laura Standaert, PhD and Wouter Bossuyt, PhD Abstract Copy number variations (CNVs) represent a significant
More informationAnalysis with SureCall 2.1
Analysis with SureCall 2.1 Danielle Fletcher Field Application Scientist July 2014 1 Stages of NGS Analysis Primary analysis, base calling Control Software FASTQ file reads + quality 2 Stages of NGS Analysis
More informationNo mutations were identified.
Hereditary High Cholesterol Test ORDERING PHYSICIAN PRIMARY CONTACT SPECIMEN Report date: Aug 1, 2017 Dr. Jenny Jones Sample Medical Group 123 Main St. Sample, CA Kelly Peters Sample Medical Group 123
More informationIdentifying Mutations Responsible for Rare Disorders Using New Technologies
Identifying Mutations Responsible for Rare Disorders Using New Technologies Jacek Majewski, Department of Human Genetics, McGill University, Montreal, QC Canada Mendelian Diseases Clear mode of inheritance
More informationInvestigating rare diseases with Agilent NGS solutions
Investigating rare diseases with Agilent NGS solutions Chitra Kotwaliwale, Ph.D. 1 Rare diseases affect 350 million people worldwide 7,000 rare diseases 80% are genetic 60 million affected in the US, Europe
More informationMEDICAL GENOMICS LABORATORY. Next-Gen Sequencing and Deletion/Duplication Analysis of NF1 Only (NF1-NG)
Next-Gen Sequencing and Deletion/Duplication Analysis of NF1 Only (NF1-NG) Ordering Information Acceptable specimen types: Fresh blood sample (3-6 ml EDTA; no time limitations associated with receipt)
More informationDNA-seq Bioinformatics Analysis: Copy Number Variation
DNA-seq Bioinformatics Analysis: Copy Number Variation Elodie Girard elodie.girard@curie.fr U900 institut Curie, INSERM, Mines ParisTech, PSL Research University Paris, France NGS Applications 5C HiC DNA-seq
More informationDr Rick Tearle Senior Applications Specialist, EMEA Complete Genomics Complete Genomics, Inc.
Dr Rick Tearle Senior Applications Specialist, EMEA Complete Genomics Topics Overview of Data Processing Pipeline Overview of Data Files 2 DNA Nano-Ball (DNB) Read Structure Genome : acgtacatgcattcacacatgcttagctatctctcgccag
More informationCITATION FILE CONTENT/FORMAT
CITATION For any resultant publications using please cite: Matthew A. Field, Vicky Cho, T. Daniel Andrews, and Chris C. Goodnow (2015). "Reliably detecting clinically important variants requires both combined
More informationSupplementary Figure 1. Estimation of tumour content
Supplementary Figure 1. Estimation of tumour content a, Approach used to estimate the tumour content in S13T1/T2, S6T1/T2, S3T1/T2 and S12T1/T2. Tissue and tumour areas were evaluated by two independent
More informationMutation Detection and CNV Analysis for Illumina Sequencing data from HaloPlex Target Enrichment Panels using NextGENe Software for Clinical Research
Mutation Detection and CNV Analysis for Illumina Sequencing data from HaloPlex Target Enrichment Panels using NextGENe Software for Clinical Research Application Note Authors John McGuigan, Megan Manion,
More informationSNP Array NOTE: THIS IS A SAMPLE REPORT AND MAY NOT REFLECT ACTUAL PATIENT DATA. FORMAT AND/OR CONTENT MAY BE UPDATED PERIODICALLY.
SAMPLE REPORT SNP Array NOTE: THIS IS A SAMPLE REPORT AND MAY NOT REFLECT ACTUAL PATIENT DATA. FORMAT AND/OR CONTENT MAY BE UPDATED PERIODICALLY. RESULTS SNP Array Copy Number Variations Result: GAIN,
More informationPerformance Characteristics BRCA MASTR Plus Dx
Performance Characteristics BRCA MASTR Plus Dx with drmid Dx for Illumina NGS systems Manufacturer Multiplicom N.V. Galileïlaan 18 2845 Niel Belgium Table of Contents 1. Workflow... 4 2. Performance Characteristics
More informationLTA Analysis of HapMap Genotype Data
LTA Analysis of HapMap Genotype Data Introduction. This supplement to Global variation in copy number in the human genome, by Redon et al., describes the details of the LTA analysis used to screen HapMap
More informationBWA alignment to reference transcriptome and genome. Convert transcriptome mappings back to genome space
Whole genome sequencing Whole exome sequencing BWA alignment to reference transcriptome and genome Convert transcriptome mappings back to genome space genomes Filter on MQ, distance, Cigar string Annotate
More informationMerging single gene-level CNV with sequence variant interpretation following the ACMGG/AMP sequence variant guidelines
Merging single gene-level CNV with sequence variant interpretation following the ACMGG/AMP sequence variant guidelines Tracy Brandt, Ph.D., FACMG Disclosure I am an employee of GeneDx, Inc., a wholly-owned
More informationSNP Array NOTE: THIS IS A SAMPLE REPORT AND MAY NOT REFLECT ACTUAL PATIENT DATA. FORMAT AND/OR CONTENT MAY BE UPDATED PERIODICALLY.
SAMPLE REPORT SNP Array NOTE: THIS IS A SAMPLE REPORT AND MAY NOT REFLECT ACTUAL PATIENT DATA. FORMAT AND/OR CONTENT MAY BE UPDATED PERIODICALLY. RESULTS SNP Array Copy Number Variations Result: LOSS,
More informationConcurrent Practical Session ACMG Classification
Variant Effect Prediction Training Course 6-8 November 2017 Prague, Czech Republic Concurrent Practical Session ACMG Classification Andreas Laner / Anna Benet-Pagès 1 Content 1. Background... 3 2. Aim
More informationDetection of aneuploidy in a single cell using the Ion ReproSeq PGS View Kit
APPLICATION NOTE Ion PGM System Detection of aneuploidy in a single cell using the Ion ReproSeq PGS View Kit Key findings The Ion PGM System, in concert with the Ion ReproSeq PGS View Kit and Ion Reporter
More informationVariant Detection & Interpretation in a diagnostic context. Christian Gilissen
Variant Detection & Interpretation in a diagnostic context Christian Gilissen c.gilissen@gen.umcn.nl 28-05-2013 So far Sequencing Johan den Dunnen Marja Jakobs Ewart de Bruijn Mapping Victor Guryev Variant
More informationMultiple Copy Number Variations in a Patient with Developmental Delay ASCLS- March 31, 2016
Multiple Copy Number Variations in a Patient with Developmental Delay ASCLS- March 31, 2016 Marwan Tayeh, PhD, FACMG Director, MMGL Molecular Genetics Assistant Professor of Pediatrics Department of Pediatrics
More informationAD (Leave blank) TITLE: Genomic Characterization of Brain Metastasis in Non-Small Cell Lung Cancer Patients
AD (Leave blank) Award Number: W81XWH-12-1-0444 TITLE: Genomic Characterization of Brain Metastasis in Non-Small Cell Lung Cancer Patients PRINCIPAL INVESTIGATOR: Mark A. Watson, MD PhD CONTRACTING ORGANIZATION:
More informationUsing large-scale human genetic variation to inform variant prioritization in neuropsychiatric disorders
Using large-scale human genetic variation to inform variant prioritization in neuropsychiatric disorders Kaitlin E. Samocha Hurles lab, Wellcome Trust Sanger Institute ACGS Summer Scientific Meeting 27
More informationReporting TP53 gene analysis results in CLL
Reporting TP53 gene analysis results in CLL Mutations in TP53 - From discovery to clinical practice in CLL Discovery Validation Clinical practice Variant diversity *Leroy at al, Cancer Research Review
More informationCHROMOSOMAL MICROARRAY (CGH+SNP)
Chromosome imbalances are a significant cause of developmental delay, mental retardation, autism spectrum disorders, dysmorphic features and/or birth defects. The imbalance of genetic material may be due
More informationCHR POS REF OBS ALLELE BUILD CLINICAL_SIGNIFICANCE
CHR POS REF OBS ALLELE BUILD CLINICAL_SIGNIFICANCE is_clinical dbsnp MITO GENE chr1 13273 G C heterozygous - - -. - DDX11L1 chr1 949654 A G Homozygous 52 - - rs8997 - ISG15 chr1 1021346 A G heterozygous
More informationJULY 21, Genetics 101: SCN1A. Katie Angione, MS CGC Certified Genetic Counselor CHCO Neurology
JULY 21, 2018 Genetics 101: SCN1A Katie Angione, MS CGC Certified Genetic Counselor CHCO Neurology Disclosures: I have no financial interests or relationships to disclose. Objectives 1. Review genetic
More informationAssessing Laboratory Performance for Next Generation Sequencing Based Detection of Germline Variants through Proficiency Testing
Assessing Laboratory Performance for Next Generation Sequencing Based Detection of Germline Variants through Proficiency Testing Karl V. Voelkerding, MD Professor of Pathology University of Utah Medical
More informationAVENIO ctdna Analysis Kits The complete NGS liquid biopsy solution EMPOWER YOUR LAB
Analysis Kits The complete NGS liquid biopsy solution EMPOWER YOUR LAB Analysis Kits Next-generation performance in liquid biopsies 2 Accelerating clinical research From liquid biopsy to next-generation
More informationProposal form for the evaluation of a genetic test for NHS Service Gene Dossier/Additional Provider
Proposal form for the evaluation of a genetic test for NHS Service Gene Dossier/Additional Provider TEST DISORDER/CONDITION POPULATION TRIAD Submitting laboratory: Exeter RGC Approved: Sept 2013 1. Disorder/condition
More information6/12/2018. Disclosures. Clinical Genomics The CLIA Lab Perspective. Outline. COH HopeSeq Heme Panels
Clinical Genomics The CLIA Lab Perspective Disclosures Raju K. Pillai, M.D. Hematopathologist / Molecular Pathologist Director, Pathology Bioinformatics City of Hope National Medical Center, Duarte, CA
More informationCRISPR/Cas9 Enrichment and Long-read WGS for Structural Variant Discovery
CRISPR/Cas9 Enrichment and Long-read WGS for Structural Variant Discovery PacBio CoLab Session October 20, 2017 For Research Use Only. Not for use in diagnostics procedures. Copyright 2017 by Pacific Biosciences
More informationCURRENT GENETIC TESTING TOOLS IN NEONATAL MEDICINE. Dr. Bahar Naghavi
2 CURRENT GENETIC TESTING TOOLS IN NEONATAL MEDICINE Dr. Bahar Naghavi Assistant professor of Basic Science Department, Shahid Beheshti University of Medical Sciences, Tehran,Iran 3 Introduction Over 4000
More informationSUPPLEMENTARY INFORMATION
doi:10.1038/nature11396 A total of 2078 samples from a large sequencing project at decode were used in this study, 219 samples from 78 trios with two grandchildren who were not also members of other trios,
More informationPersonalis ACE Clinical Exome The First Test to Combine an Enhanced Clinical Exome with Genome- Scale Structural Variant Detection
Personalis ACE Clinical Exome The First Test to Combine an Enhanced Clinical Exome with Genome- Scale Structural Variant Detection Personalis, Inc. 1350 Willow Road, Suite 202, Menlo Park, California 94025
More informationProposal form for the evaluation of a genetic test for NHS Service Gene Dossier
Proposal form for the evaluation of a genetic test for NHS Service Gene Dossier Test Disease Population Triad Disease name Leber congenital amaurosis OMIM number for disease 204000 Disease alternative
More informationvariant led to a premature stop codon p.k316* which resulted in nonsense-mediated mrna decay. Although the exact function of the C19L1 is still
157 Neurological disorders primarily affect and impair the functioning of the brain and/or neurological system. Structural, electrical or metabolic abnormalities in the brain or neurological system can
More informationGenerating Spontaneous Copy Number Variants (CNVs) Jennifer Freeman Assistant Professor of Toxicology School of Health Sciences Purdue University
Role of Chemical lexposure in Generating Spontaneous Copy Number Variants (CNVs) Jennifer Freeman Assistant Professor of Toxicology School of Health Sciences Purdue University CNV Discovery Reference Genetic
More informationAVENIO family of NGS oncology assays ctdna and Tumor Tissue Analysis Kits
AVENIO family of NGS oncology assays ctdna and Tumor Tissue Analysis Kits Accelerating clinical research Next-generation sequencing (NGS) has the ability to interrogate many different genes and detect
More informationNovember 9, Johns Hopkins School of Medicine, Baltimore, MD,
Fast detection of de-novo copy number variants from case-parent SNP arrays identifies a deletion on chromosome 7p14.1 associated with non-syndromic isolated cleft lip/palate Samuel G. Younkin 1, Robert
More informationProposal form for the evaluation of a genetic test for NHS Service Gene Dossier
Proposal form for the evaluation of a genetic test for NHS Service Gene Dossier Test Disease Population Triad Disease name Epileptic encephalopathy, early infantile 4. OMIM number for disease 612164 Disease
More informationUsing the Bravo Liquid-Handling System for Next Generation Sequencing Sample Prep
Using the Bravo Liquid-Handling System for Next Generation Sequencing Sample Prep Tom Walsh, PhD Division of Medical Genetics University of Washington Next generation sequencing Sanger sequencing gold
More informationImplementation of the DDD/ClinGen OGT (CytoSure v3) Microarray
Implementation of the DDD/ClinGen OGT (CytoSure v3) Microarray OGT UGM Birmingham 08/09/2016 Dom McMullan Birmingham Women's NHS Trust WM chromosomal microarray (CMA) testing Population of ~6 million (10%)
More informationRACP Congress 2017 Genetics of Intellectual Disability and Autism: Past Present and Future 9 th May 2017
RACP Congress 2017 Genetics of Intellectual Disability and Autism: Past Present and Future 9 th May 2017 Why causation? Explanation for family Prognosis Recurrence risk and reproductive options Guide medical
More informationTOWARDS ACCURATE GERMLINE AND SOMATIC INDEL DISCOVERY WITH MICRO-ASSEMBLY. Giuseppe Narzisi, PhD Bioinformatics Scientist
TOWARDS ACCURATE GERMLINE AND SOMATIC INDEL DISCOVERY WITH MICRO-ASSEMBLY Giuseppe Narzisi, PhD Bioinformatics Scientist July 29, 2014 Micro-Assembly Approach to detect INDELs 2 Outline 1 Detecting INDELs:
More informationNature Genetics: doi: /ng Supplementary Figure 1. Mutational signatures in BCC compared to melanoma.
Supplementary Figure 1 Mutational signatures in BCC compared to melanoma. (a) The effect of transcription-coupled repair as a function of gene expression in BCC. Tumor type specific gene expression levels
More informationRecurrent de novo mutations in neurodevelopmental disorders: properties and clinical implications
Wilfert et al. Genome Medicine (2017) 9:101 DOI 10.1186/s13073-017-0498-x REVIEW Recurrent de novo mutations in neurodevelopmental disorders: properties and clinical implications Amy B. Wilfert 1, Arvis
More informationAnalysis of Massively Parallel Sequencing Data Application of Illumina Sequencing to the Genetics of Human Cancers
Analysis of Massively Parallel Sequencing Data Application of Illumina Sequencing to the Genetics of Human Cancers Gordon Blackshields Senior Bioinformatician Source BioScience 1 To Cancer Genetics Studies
More informationSCALPEL MICRO-ASSEMBLY APPROACH TO DETECT INDELS WITHIN EXOME-CAPTURE DATA. Giuseppe Narzisi, PhD Schatz Lab
SCALPEL MICRO-ASSEMBLY APPROACH TO DETECT INDELS WITHIN EXOME-CAPTURE DATA Giuseppe Narzisi, PhD Schatz Lab November 14, 2013 Micro-Assembly Approach to detect INDELs 2 Outline Scalpel micro-assembly pipeline
More informationAdvance Your Genomic Research Using Targeted Resequencing with SeqCap EZ Library
Advance Your Genomic Research Using Targeted Resequencing with SeqCap EZ Library Marilou Wijdicks International Product Manager Research For Life Science Research Only. Not for Use in Diagnostic Procedures.
More informationCHAPTER IV RESULTS Microcephaly General description
47 CHAPTER IV RESULTS 4.1. Microcephaly 4.1.1. General description This study found that from a previous study of 527 individuals with MR, 48 (23 female and 25 male) unrelated individuals were identified
More informationGenomic structural variation
Genomic structural variation Mario Cáceres The new genomic variation DNA sequence differs across individuals much more than researchers had suspected through structural changes A huge amount of structural
More informationNGS panels in clinical diagnostics: Utrecht experience. Van Gijn ME PhD Genome Diagnostics UMCUtrecht
NGS panels in clinical diagnostics: Utrecht experience Van Gijn ME PhD Genome Diagnostics UMCUtrecht 93 Gene panels UMC Utrecht Cardiovascular disease (CAR) (5 panels) Epilepsy (EPI) (11 panels) Hereditary
More informationComputational Systems Biology: Biology X
Bud Mishra Room 1002, 715 Broadway, Courant Institute, NYU, New York, USA L#4:(October-0-4-2010) Cancer and Signals 1 2 1 2 Evidence in Favor Somatic mutations, Aneuploidy, Copy-number changes and LOH
More informationVan test naar diagnose naar
Van test naar diagnose naar V therapie op maat Marjolein Kriek, LUMC Joris Veltman, RUNMC Exome diagnostics in genetically heterogeneous disease Joris Veltman, PhD Department of Human Genetics Radboud
More informationBenefits and pitfalls of new genetic tests
Benefits and pitfalls of new genetic tests Amanda Krause Division of Human Genetics, NHLS and University of the Witwatersrand Definition of Genetic Testing the analysis of human DNA, RNA, chromosomes,
More informationIdentification of regions with common copy-number variations using SNP array
Identification of regions with common copy-number variations using SNP array Agus Salim Epidemiology and Public Health National University of Singapore Copy Number Variation (CNV) Copy number alteration
More informationStructural Variation and Medical Genomics
Structural Variation and Medical Genomics Andrew King Department of Biomedical Informatics July 8, 2014 You already know about small scale genetic mutations Single nucleotide polymorphism (SNPs) Deletions,
More informationIdentification of genomic alterations in cervical cancer biopsies by exome sequencing
Chapter- 4 Identification of genomic alterations in cervical cancer biopsies by exome sequencing 105 4.1 INTRODUCTION Athough HPV has been identified as the prime etiological factor for cervical cancer,
More informationGolden Helix s End-to-End Solution for Clinical Labs
Golden Helix s End-to-End Solution for Clinical Labs Steven Hystad - Field Application Scientist Nathan Fortier Senior Software Engineer 20 most promising Biotech Technology Providers Top 10 Analytics
More informationEpi4K. Epi4K Consortium. Epi4K: gene discovery in 4,000 genomes, Epilepsia, 2012 Aug;53(8):
Epi4K Epi4K Consortium. Epi4K: gene discovery in 4,000 genomes, Epilepsia, 2012 Aug;53(8):1457-67. Genetics of Epileptic Encephalopathies Infantile Spasms (IS) 1 in 3000 live births and onset between 4-12
More informationPREPARED FOR: U.S. Army Medical Research and Materiel Command Fort Detrick, Maryland
AD Award Number: W81XWH-12-1-0298 TITLE: MTHFR Functional Polymorphism C677T and Genomic Instability in the Etiology of Idiopathic Autism in Simplex Families PRINCIPAL INVESTIGATOR: Xudong Liu, PhD CONTRACTING
More informationCharacterisation of structural variation in breast. cancer genomes using paired-end sequencing on. the Illumina Genome Analyser
Characterisation of structural variation in breast cancer genomes using paired-end sequencing on the Illumina Genome Analyser Phil Stephens Cancer Genome Project Why is it important to study cancer? Why
More informationA complete next-generation sequencing workfl ow for circulating cell-free DNA isolation and analysis
APPLICATION NOTE Cell-Free DNA Isolation Kit A complete next-generation sequencing workfl ow for circulating cell-free DNA isolation and analysis Abstract Circulating cell-free DNA (cfdna) has been shown
More informationApplications of Chromosomal Microarray Analysis (CMA) in pre- and postnatal Diagnostic: advantages, limitations and concerns
Applications of Chromosomal Microarray Analysis (CMA) in pre- and postnatal Diagnostic: advantages, limitations and concerns جواد کریمزاد حق PhD of Medical Genetics آزمايشگاه پاتوبيولوژي و ژنتيك پارسه
More informationNature Genetics: doi: /ng Supplementary Figure 1. Rates of different mutation types in CRC.
Supplementary Figure 1 Rates of different mutation types in CRC. (a) Stratification by mutation type indicates that C>T mutations occur at a significantly greater rate than other types. (b) As for the
More informationMultiplex target enrichment using DNA indexing for ultra-high throughput variant detection
Multiplex target enrichment using DNA indexing for ultra-high throughput variant detection Dr Elaine Kenny Neuropsychiatric Genetics Research Group Institute of Molecular Medicine Trinity College Dublin
More informationCytogenetics 101: Clinical Research and Molecular Genetic Technologies
Cytogenetics 101: Clinical Research and Molecular Genetic Technologies Topics for Today s Presentation 1 Classical vs Molecular Cytogenetics 2 What acgh? 3 What is FISH? 4 What is NGS? 5 How can these
More informationHow many disease-causing variants in a normal person? Matthew Hurles
How many disease-causing variants in a normal person? Matthew Hurles Summary What is in a genome? What is normal? Depends on age What is a disease-causing variant? Different classes of variation Final
More informationUnderstanding The Genetics of Diamond Blackfan Anemia
Understanding The Genetics of Diamond Blackfan Anemia Jason Farrar, MD jefarrar@ About Me Assistant Professor of Pediatrics at University of Arkansas for Medical Sciences & Arkansas Children s Hospital
More informationTitle:Exome sequencing helped the fine diagnosis of two siblings afflicted with atypical Timothy syndrome (TS2)
Author's response to reviews Title:Exome sequencing helped the fine diagnosis of two siblings afflicted with atypical Timothy syndrome (TS2) Authors: Sebastian Fröhler (Sebastian.Froehler@mdc-berlin.de)
More informationWhole Genome and Transcriptome Analysis of Anaplastic Meningioma. Patrick Tarpey Cancer Genome Project Wellcome Trust Sanger Institute
Whole Genome and Transcriptome Analysis of Anaplastic Meningioma Patrick Tarpey Cancer Genome Project Wellcome Trust Sanger Institute Outline Anaplastic meningioma compared to other cancers Whole genomes
More informationChapter 1 : Genetics 101
Chapter 1 : Genetics 101 Understanding the underlying concepts of human genetics and the role of genes, behavior, and the environment will be important to appropriately collecting and applying genetic
More information38 Int'l Conf. Bioinformatics and Computational Biology BIOCOMP'16
38 Int'l Conf. Bioinformatics and Computational Biology BIOCOMP'16 PGAR: ASD Candidate Gene Prioritization System Using Expression Patterns Steven Cogill and Liangjiang Wang Department of Genetics and
More informationp.arg119gly p.arg119his p.ala179thr c.540+1g>a c.617_633+6del Prediction basis structure
a Missense ATG p.arg119gly p.arg119his p.ala179thr p.ala189val p.gly206trp p.gly206arg p.arg251his p.ala257thr TGA 5 UTR 1 2 3 4 5 6 7 3 UTR Splice Site, Frameshift b Mutation p.gly206trp p.gly4fsx50 c.138+1g>a
More informationMEDICAL GENOMICS LABORATORY. Non-NF1 RASopathy panel by Next-Gen Sequencing and Deletion/Duplication Analysis of SPRED1 (NNP-NG)
Non-NF1 RASopathy panel by Next-Gen Sequencing and Deletion/Duplication Analysis of SPRED1 (NNP-NG) Ordering Information Acceptable specimen types: Blood (3-6ml EDTA; no time limitations associated with
More informationWelcome to the Genetic Code: An Overview of Basic Genetics. October 24, :00pm 3:00pm
Welcome to the Genetic Code: An Overview of Basic Genetics October 24, 2016 12:00pm 3:00pm Course Schedule 12:00 pm 2:00 pm Principles of Mendelian Genetics Introduction to Genetics of Complex Disease
More informationMEDICAL GENOMICS LABORATORY. Peripheral Nerve Sheath Tumor Panel by Next-Gen Sequencing (PNT-NG)
Peripheral Nerve Sheath Tumor Panel by Next-Gen Sequencing (PNT-NG) Ordering Information Acceptable specimen types: Blood (3-6ml EDTA; no time limitations associated with receipt) Saliva (OGR-575 DNA Genotek;
More informationThe Meaning of Genetic Variation
Activity 2 The Meaning of Genetic Variation Focus: Students investigate variation in the beta globin gene by identifying base changes that do and do not alter function, and by using several CD-ROM-based
More informationNature Neuroscience: doi: /nn Supplementary Figure 1. Missense damaging predictions as a function of allele frequency
Supplementary Figure 1 Missense damaging predictions as a function of allele frequency Percentage of missense variants classified as damaging by eight different classifiers and a classifier consisting
More informationCopy Number Varia/on Detec/on. Alex Mawla UCD Genome Center Bioinforma5cs Core Tuesday June 16, 2015
Copy Number Varia/on Detec/on Alex Mawla UCD Genome Center Bioinforma5cs Core Tuesday June 16, 2015 Today s Goals Understand the applica5on and capabili5es of using targe5ng sequencing and CNV calling
More informationMost severely affected will be the probe for exon 15. Please keep an eye on the D-fragments (especially the 96 nt fragment).
SALSA MLPA probemix P343-C3 Autism-1 Lot C3-1016. As compared to version C2 (lot C2-0312) five reference probes have been replaced, one reference probe added and several lengths have been adjusted. Warning:
More informationLecture 17: Human Genetics. I. Types of Genetic Disorders. A. Single gene disorders
Lecture 17: Human Genetics I. Types of Genetic Disorders A. Single gene disorders B. Multifactorial traits 1. Mutant alleles at several loci acting in concert C. Chromosomal abnormalities 1. Physical changes
More informationNature Neuroscience: doi: /nn Supplementary Figure 1
Supplementary Figure 1 Illustration of the working of network-based SVM to confidently predict a new (and now confirmed) ASD gene. Gene CTNND2 s brain network neighborhood that enabled its prediction by
More informationProposal form for the evaluation of a genetic test for NHS Service Gene Dossier
Proposal form for the evaluation of a genetic test for NHS Service Gene Dossier Test Disease Population Triad Disease name Choroideremia OMIM number for disease 303100 Disease alternative names please
More informationPractical challenges that copy number variation and whole genome sequencing create for genetic diagnostic labs
Practical challenges that copy number variation and whole genome sequencing create for genetic diagnostic labs Joris Vermeesch, Center for Human Genetics K.U.Leuven, Belgium ESHG June 11, 2010 When and
More informationAnalysis of Genetic Inheritance in a Family Quartet by Whole-Genome Sequencing
www.sciencemag.org/cgi/content/full/science.1186802/dc1 Supporting Online Material for Analysis of Genetic Inheritance in a Family Quartet by Whole-Genome Sequencing Jared C. Roach, Gustavo Glusman, Arian
More informationAssociating Copy Number and SNP Variation with Human Disease. Autism Segmental duplication Neurobehavioral, includes social disability
Technical Note Associating Copy Number and SNP Variation with Human Disease Abstract The Genome-Wide Human SNP Array 6.0 is an affordable tool to examine the role of copy number variation in disease by
More informationThe Promise of Epilepsy Genetics A Personal & Scientific Perspective December 3, 2012
The Promise of Epilepsy Genetics A Personal & Scientific Perspective December 3, 2012 Tracy Dixon-Salazar, Ph.D. University of California, San Diego American Epilepsy Society Annual Meeting 1 Disclosure
More informationSVIM: Structural variant identification with long reads DAVID HELLER MAX PLANCK INSTITUTE FOR MOLECULAR GENETICS, BERLIN JUNE 2O18, SMRT LEIDEN
SVIM: Structural variant identification with long reads DAVID HELLER MAX PLANCK INSTITUTE FOR MOLECULAR GENETICS, BERLIN JUNE 2O18, SMRT LEIDEN Structural variation (SV) Variants larger than 50bps Affect
More informationA guide to understanding variant classification
White paper A guide to understanding variant classification In a diagnostic setting, variant classification forms the basis for clinical judgment, making proper classification of variants critical to your
More informationNational Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2011 Formula Grant
National Surgical Adjuvant Breast and Bowel Project (NSABP) Foundation Annual Progress Report: 2011 Formula Grant Reporting Period July 1, 2012 June 30, 2013 Formula Grant Overview The NSABP Foundation
More information