Golden Helix s End-to-End Solution for Clinical Labs Steven Hystad - Field Application Scientist Nathan Fortier Senior Software Engineer 20 most promising Biotech Technology Providers Top 10 Analytics Solution Providers Hype Cycle for Life sciences
Questions during the presentation Use the Questions pane in your GoToWebinar window
Golden Helix Golden Helix is a global bioinformatics company founded in 1998. Variant Calling Filtering and Annotation Clinical Reports CNV Analysis Pipeline: Run Workflows Variant Warehouse Centralized Annotations Hosted Reports Sharing and Integration GWAS Genomic Prediction Large-N-Population Studies RNA-Seq Large-N CNV-Analysis
Cited in over 1,100 peer-reviewed publications
Over 350 customers globally
Golden Helix Who We Are When you choose a Golden Helix solution, you get more than just software REPUTATION TRUST EXPERIENCE INDUSTRY FOCUS THOUGHT LEADERSHIP COMMUNITY TRAINING SUPPORT RESPONSIVENESS INNOVATION and SPEED CUSTOMIZATIONS
Secondary Analysis
Why Sentieon? Pipeline Improvements Why Improve Broad s GATK Haplotype Caller? GATK Haplotype Caller is the most accurate DNA analysis tool Problem: GATK Haplotype Caller is too Slow Solution Consistent: No down-sampling or thread dependency = no run-to-run differences = More accurate No Javascript: low level C More Robust: Can handle joint genotyping on over 100K samples simultaneously without needing a gvcf Software-only Solution: Easily deployable, easily scalable, and easily upgraded. Two Product Lines DNAseq BWA-GATK Haplotype Caller Tnseq MuTect and MuTec2
Performance: Sentieon vs Broad DNAseq TNseq
Proven Accuracy Precision FDA Consistency Challenge - Processed single sample consistency - Single biological sample x two library preps - Top overall performance - Highest reproducibility Truth Challenge - Determine accuracy in Truth Set - Highest indel precision - Highest SNP recall
Proven Accuracy - ICGC-TCGA DREAM Mutation Calling Challenge Evaluates Somatic Variant Call Accuracy SNV INDEL Structural Variants Sentieon 98.57% Sentieon 98.14% Sentieon 100% Bina/Roche 97.57% Bina/Roche 97.01% Genowis 99.82% Genowis 96.92% OICR GSI 86.99% Gridss 99.63% Sentieon leads in all categories
Customizable secondary pipeline CNV caller Sequencer FASTQ Sentieon BAM Linux VCF -Annotate & filter -Clinical reports
CNV and LoH Detection VarSeq Calls CNVs on - Gene Panel Data - Exome Data - Whole Genome Data Faster than CMA and MLPA Utilizes existing coverage data Calls CNVs of every size - Small Single Exon Events - Large Cytogenetic Events
CNV Detection via NGS CNVs are called from coverage data Challenges - Coverage varies between samples - Coverage fluctuates between targets - Systematic biases impact coverage Solutions - Data Normalization - Reference Sample Comparison
CNV calling in VarSeq Reference samples used for normalization Metrics - Z-score: number of standard deviations from reference sample mean - Ratio: sample coverage divided by reference sample mean - VAF: Variant Allele Frequency For Gene Panels and Exomes - Probabilistic model used to call CNVs - Segmentation identifies large cytogenetic events For Whole Genome Data - Targets segmented using Z-scores - Events called based on Z-score and Ratio thresholds
VarSeq Full Clinical Stack
VarSeq Suite Flexible Simple VarSeq Scalable Variant annotation, filtering, and interpretation Rich visualizations with GenomeBrowse built-in Powerful GUI and command-line interfaces Repeatable workflows
Example Workflows Illumina TruSight Cancer Sequencing Panel - Oncogenes & Tumor Suppressor Genes - Drug Targeting Information & Ongoing Clinical Trials Exome Trio Workflow - Casual Variants Associated with Phenotype - Incidental Findings Hereditary Risk Sequencing Panel - Comprehensive coverage of 175 genes with known associations to inherited cardiac conditions, cancers, and other inherited diseases.
Sample Data for Cancer Gene Panels Illumina TruSight Cancer Sequencing Panel - Comprehensive coverage of 154 genes designed to target exons of key tumor suppressor genes and frequently cited oncogenes. - BAM and VCF files for each replicate are available VSReports
VarSeq Demonstration
Sample Data for Exome Trio Workflows NA1278 17-year old Caucasian Women - Admitted for Symptoms of Acute Gastroenteritis. - Serum testing revealed elevated pancreatic enzymes. - Genetic testing revealed Heterozygosity for CFTR gene mutation Δ F508 Incidental Findings NA12892 NA12891 NA12878
VarSeq Demonstration
VSWarehouse Variant Warehouse Server A place to archive full VCFs of every sequenced sample Centralized genomic data hosting, integration with other systems Ask the Variant Warehouse: - Have I ever seen this variant in my previous test samples? - At what frequency? (counts as well) - Does this gene contain other rare variants in my cohort? - Did I provide a pathogenicity assessment for this variant? Has that changed? - Has ClinVar changed since that assessment was initially made? - Have I put this variant into a clinical report for any previous samples?
Integration with VarSeq VarSeq can Export to Warehouse - Create a new Project - Current project template is used - Add Variants to Existing (data only) Annotate with Warehouse Counts - Allele / Genotype counts computed on warehouse variants (or other sources in template). Remote Assessment Catalogs - Customize databases of variant classifications or internal flags like false-positives. Remote Reports - Central versioning of report - Queryable database of all saved reports, with all data in form ready for EMR/LIMS integration
VSWarehouse Centralized Collaboration Project A Project B VSWarehouse Server Group A Group B Collaborators
VSWarehouse Centralized Collaboration
VarSeq Full Clinical Stack
Questions or more info: Email info@goldenhelix.com Request an evaluation of the software at www.goldenhelix.com