NGS, Cancer and Bioinformatics. 20/10/15 Yannick Boursin

1 NGS, Cancer and Bioinformatics

2 NGS and Clinical Oncology
NGS in hereditary cancer genome testing:
- BRCA1/2 (breast/ovary cancer)
- XPC (melanoma)
- ERCC1 (colorectal cancer)
NGS for personalized cancer treatment:
- Clinical trials: MOSCATO (GR), SAFIR (GR), SHIVA (Curie)
- Ipilimumab (anti-CTLA4), Nivolumab (anti-PD1), Trastuzumab (anti-HER2), Cetuximab (anti-EGFR)
Detection of chimeric transcripts:
- Chronic Myeloid Leukemia: Philadelphia chromosome (BCR/ABL)
- Non-Small-Cell Lung Cancer: EML4-ALK

3 NGS and Oncology
NGS is now widely used as:
- A research tool to screen large numbers of cancer samples
- A clinical/diagnostic tool in daily practice
These projects require a dedicated bioinformatics integration effort to access and analyse this huge amount of data.

4 Why do we need computers for NGS?
[Figure: evolution of sequencing data size]
Needs to address:
- Store petabytes of data (1 PB is 1000 TB)
- Share data around the world through networks
- Analyze huge amounts of data with complex algorithms

5 Bioinformatics and Oncology
Problem: finding, extracting, and presenting relevant information.
Partial solution: designing workflows to ease data analysis.

6 Interdisciplinary collaboration
Bioinformatics acts as a hub between the different fields. Trust between partners is needed, and so is training, for efficient mutual understanding.
Partners:
- Biology knowledge: knowledge modeling, ...
- Medical staff: clinicians, specialists, ...
- Biological staff: biologists, geneticists, ...
- Technological platforms: sequencing, microarrays, immunochemistry, ...
Bioinformatics:
- Raw data storage
- Integration of biological and clinical data
- Quality control
- Data analysis
- Clinical biostatistics
- Report for biological/medical staff

7 Standard Workflow for NGS Analysis
A typical NGS workflow (depends on the NGS application):
Sequencing & Primary Analysis -> Raw Reads -> Reads Cleaning -> Reads Mapping -> Data Analysis
Quality-control checkpoints along the way: QC 1, QC 2, QC 3

8 Step 1: Quality Check and Improvements

9 NGS Data: what do they look like?
A raw data file (.fastq, .sff, .fa, .csfasta/.qual) contains millions of short reads, either all of the same size (SOLiD, HiSeq) or of different sizes (Ion PGM/Proton).
[Figure: enlarged view of the reads in a FASTQ file]

10 FASTQ format
1 sequence = 1 read = 4 lines in the file
First line = sequence identifier

11 FASTQ format
Fourth line = quality, ASCII-encoded (reduces the file size)
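The four-line layout described above can be sketched in Python as follows. This is a minimal illustration, not a reference parser: the names `FastqRecord` and `parse_fastq` and the example read are hypothetical, and the sketch assumes unwrapped records (exactly four lines per read).

```python
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    identifier: str  # line 1, without the leading "@"
    sequence: str    # line 2, the bases
    quality: str     # line 4, one ASCII character per base


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield one record per group of four lines (unwrapped FASTQ only)."""
    rows = (line.rstrip("\r\n") for line in lines)
    for header in rows:
        if not header:
            continue  # tolerate blank lines between records
        try:
            sequence, separator, quality = next(rows), next(rows), next(rows)
        except StopIteration:
            raise ValueError("truncated FASTQ record") from None
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record: " + header)
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield FastqRecord(header[1:], sequence, quality)


# Hypothetical single-read example.
example_lines = ["@read_1\n", "ACTGATTAGT\n", "+\n", "IIIIIHHHGG\n"]
records = list(parse_fastq(example_lines))
print(records[0].identifier, len(records[0].sequence))
```

In practice the same function can be fed an open file handle, since a text file iterates line by line.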

12 Sequence quality encoding
Phred scores (Q): Q scores are defined as a property that is logarithmically related to the base-calling error probability (P).
Q = -10 log10(P)
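As a small worked sketch of the formula above, the Python functions below convert between Q and P and decode an ASCII quality string. The function names are illustrative, and the ASCII offset of 33 is an assumption (it is the common Sanger-style encoding; the slides do not say which offset is used).

```python
import math
from typing import List


def phred_to_error_probability(q: float) -> float:
    """Invert Q = -10 log10(P), giving P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Apply Q = -10 log10(P) directly."""
    return -10 * math.log10(p)


def decode_quality(quality_line: str, offset: int = 33) -> List[int]:
    """Turn the fourth FASTQ line into one Q score per base.

    The offset of 33 is an assumption; older encodings used other offsets.
    """
    return [ord(character) - offset for character in quality_line]


print(phred_to_error_probability(20))  # one error expected in 100 base calls
print(decode_quality("II5+"))
```

Under this relation, each gain of 10 in Q divides the error probability by 10.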

13 Quality controls on raw reads: let's start after sequencing
A raw read is characterized by three parameters:
- Its length
- Its sequence
- Its per-base quality
Raw reads:
ACTGATTAGTCTGAATTAGANNGATAGGAT
GATCGATGCATAGCGATCAGCATCGATACG
CGGCGCTCCGCTCTCGAAACTAGCACTGAC
AGCATCAGGATCTACGATCTAGCGAACTGAC
ACTACTTACGACATCGAGGTTAGGAGCATCA
ACTAGGCATCGGCATCACGGACNNNNNNNN
ACTAGCTATCGAGCTATCAGCGAGCATCTATC
ACTAGCTACTATCGAGCGAGCGATCATCGAC
CTGACTACTATCGAGCGAGCTACTAACTGAC
ACTATCAGCTAGCGCTTCAGCATTACCGT
ACTANNGACTAGGAATTAGCTACTGAGCTAC
ACTAGCAGCTATATGAGCTACTAGCACTGAC
NNNNNNNNNNNNNNNNNNNNNNNNNNNNN

14 Why look at sequencing quality?
Data quality is very important for various downstream analyses:
- Sequence assembly or mapping
- Variant detection
- Gene expression studies
- ...
If the quality of the data is poor:
- Try to find a reason
- Can we correct/improve the quality?
- Poor-quality data may lead to erroneous conclusions

15 Quality controls on raw reads: which metrics to check?
Mainly:
- Quality score per base and over the reads
But also:
- Read length distribution
- Sequence content per base and % of GC
- K-mer content
- Overrepresented sequences
- Duplicated reads
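Several of the metrics listed above can be computed with a few lines of Python. The sketch below is only illustrative (the function name `raw_read_metrics` and the toy reads are assumptions); dedicated QC tools report many more metrics and handle large files efficiently.

```python
from collections import Counter
from typing import Dict, List


def raw_read_metrics(reads: List[str]) -> Dict[str, object]:
    """Compute a few simple QC metrics on a list of read sequences."""
    total_bases = sum(len(read) for read in reads)
    gc_bases = sum(read.count("G") + read.count("C") for read in reads)
    n_bases = sum(read.count("N") for read in reads)
    copies = Counter(reads)
    return {
        # read length -> number of reads with that length
        "length_distribution": dict(Counter(len(read) for read in reads)),
        "gc_percent": 100.0 * gc_bases / total_bases if total_bases else 0.0,
        "n_percent": 100.0 * n_bases / total_bases if total_bases else 0.0,
        # reads that are exact copies of an earlier read
        "duplicated_reads": sum(count - 1 for count in copies.values()),
    }


# Toy input: two identical reads and one read containing uncalled bases.
metrics = raw_read_metrics(["ACGT", "ACGT", "GGNN"])
print(metrics)
```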

16 Quality scores
- Per base (box-and-whisker plot) -> to see whether base calls fall into low quality (commonly towards the end of a read)
- Per sequence (mean quality distribution) -> to see whether a subset of your sequences has universally low quality values

17 Quality scores
[Figures: quality score plots for PGM run A and PGM run B]

18 Quality scores
[Figures: quality score plots for Illumina run C and Illumina run D]

19 Quality control on raw reads: adapter removal
An adapter is a small piece of known DNA located at the end of the reads.
Adapter roles:
- Attach the read to the sequencer flowcell
- Allow specific PCR enrichment of the reads carrying an adapter
- Used in multiplex sequencing (samples pooled in a mix)
Available tools to trim adapters:
- Cutadapt
- SeqPrep
- RmAdapter
[Figure: in blue, adapters; in orange, the informative part of the read]
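To make the idea of adapter trimming concrete, here is a deliberately simplified Python sketch. It is not the algorithm used by Cutadapt, SeqPrep or RmAdapter: it only looks for exact matches of a 3' adapter (or of an adapter prefix at the very end of the read), whereas real tools tolerate sequencing errors. The function name and the adapter sequence `ACTGAC` are hypothetical.

```python
def trim_3prime_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove an exact 3' adapter, or an adapter prefix ending the read."""
    # Case 1: the full adapter is present; keep what precedes it.
    position = read.find(adapter)
    if position != -1:
        return read[:position]
    # Case 2: only the beginning of the adapter was sequenced at the end
    # of the read. Try the longest possible overlap first.
    longest = min(len(adapter), len(read)) - 1
    for overlap in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:-overlap]
    return read  # no adapter found


ADAPTER = "ACTGAC"  # hypothetical adapter sequence
print(trim_3prime_adapter("GGGTTTACTGAC", ADAPTER))  # full adapter
print(trim_3prime_adapter("GGGTTTACTG", ADAPTER))    # partial adapter
print(trim_3prime_adapter("GGGTTT", ADAPTER))        # nothing to trim
```

The `min_overlap` parameter is a design choice: very short overlaps match by chance, so trimming them would remove genuine bases.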

20 Quality controls on raw reads: let's start after sequencing
A first quality control of raw reads is mandatory and can be set up according to the application ('N', adapter sequences, barcode, contamination, etc.)
Processed reads (blue parts are to be kept, green and red parts are to be removed):
ACTGATTAGTCTGAATTAGANNGATAGGAT
GATCGATGCATAGCGATCAGCATCGATACG
CGGCGCTCCGCTCTCGAAACTAGCATCGAC ACTGAC
AGCATCAGGATCTACGATCTAGCGAACTGAC ACTGAC
ACTACTTACGACATCGAGGTTAGGAGCATCA
ACTAGGCATCGGCATCACGGACNNNNNNNN
ACTAGCTATCGAGCTATCAGCGAGCATCTATC
ACTAGCTACTATCGAGCGAGCGATCATCGAC
CTGACTACTATCGAGCGAGCTACTAACTGAC ACTGAC
ACTATCAGCTAGCGCTTCAGCATTACCGT
ACTANNGACTAGGAATTAGCTACTGAGCTAC
ACTAGCAGCTATATGAGCTACTAGCACTGAC ACTGAC
NNNNNNNNNNNNNNNNNNNNNNNNNNNNN

21 Quality controls: Standard Workflow for NGS Analysis
A typical NGS workflow (depends on the NGS application):
Sequencing & Primary Analysis -> Raw Reads -> Reads Cleaning -> Reads Mapping -> Data Analysis
Quality-control checkpoints along the way: QC 1, QC 2, QC 3

22 Step 2: Short Reads Alignment

23 Reads alignment - Vocabulary
- Alignment (mapping): read alignment aims at transforming the information carried by individual reads into an organized and reduced set of information.
- Mismatch: disagreement between two aligned nucleotides.
- Reference genome: a known sequence, supposed to be as close as possible to the input genome, which is used as an anchor to organize the information carried by individual reads.
- Gap: a bridge within the read alignment (i.e. a small insertion/deletion).
- Mappability: uniqueness of a region (repeated region = low mappability, unique region = good mappability).
- Indels: insertions/deletions relative to the reference genome.

24 Reads alignment - Two strategies
Read alignment aims at transforming the information carried by individual reads into an organized and reduced set of information. Two strategies can be applied:
- De novo read assembly: used when no reference genome is available. It aims at reconstructing long scaffolds from the individual reads.
- Alignment on a reference genome: the reads are directly compared to a known reference genome.

25 Alignment on a reference genome
The reference genome is a known sequence, supposed to be as close as possible to the input genome, which is used as an anchor to organize the information carried by individual reads.
[Figure: alignment of reads against the reference genome sequence]

26 Alignment on a reference genome
The reference genome is a known sequence, supposed to be as close as possible to the input genome, which is used as an anchor to organize the information carried by individual reads.
[Figure: alignment of reads against the reference genome sequence, showing a homozygous polymorphism (T/C)]

27 Alignment on a reference genome - Challenges
New alignment algorithms must address the requirements and characteristics of NGS reads:
- Millions of reads per run (30x genome coverage)
- Reads of different sizes (35 bp - 200 bp)
- Different types of reads (single-end, paired-end, mate-pair, etc.)
- Base-calling quality factors
- Sequencing errors (~1%)
- Repetitive regions
- Sequenced organism vs. reference genome
- Must adjust to evolving sequencing technologies and data formats
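To give an order of magnitude for "millions of reads per run", mean coverage is commonly estimated as (number of reads x read length) / genome size. The Python sketch below applies that relation; the function names are illustrative and the genome size of 3 billion bases is a round-number assumption for a human genome, not a figure from the slides.

```python
import math


def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of reads covering each base of the genome."""
    return n_reads * read_length / genome_size


def reads_needed(coverage: float, read_length: int, genome_size: int) -> int:
    """Number of reads required to reach a target mean coverage."""
    return math.ceil(coverage * genome_size / read_length)


GENOME_SIZE = 3_000_000_000  # assumed round figure for a human genome

for read_length in (35, 100, 200):
    print(read_length, "bp reads:", reads_needed(30, read_length, GENOME_SIZE))
```

This is a mean over the whole genome: the local depth varies a lot, which is why coverage is checked again after alignment.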

28 Alignment on a reference genome - Bioinformatics tools
[Figure: mappers timeline (since 2001)]

29 Finding the best alignment - Rationale
Given a reference and a set of reads, report at least one good local alignment for each read if one exists.
What is good? For now, we concentrate on the following (each alignment is shown as reference / read):
- Fewer mismatches is better:
  TGATCATA... / GATCAA is better than TGATCATA... / GAGAAT
- Failing to align a low-quality base is better than failing to align a high-quality base:
  TGATATTA... / GATcaT is better than TGATcaTA... / GTACAT
This is based on a scoring system, for example: score for a match (1), mismatch (MM) penalty (3), gap open penalty (5), gap extension penalty (2). The best alignment is the one with the highest score.
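The scoring system above can be illustrated with a short Python sketch that scores an already-aligned pair of sequences. The function name is hypothetical, and one convention is assumed here: a gap of length k costs the gap open penalty plus k times the gap extension penalty (aligners differ on this detail).

```python
def alignment_score(reference: str, read: str, match: int = 1,
                    mismatch: int = 3, gap_open: int = 5,
                    gap_extend: int = 2) -> int:
    """Score two aligned strings of equal length; '-' marks a gap."""
    if len(reference) != len(read):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    open_gap = None  # which sequence currently carries a gap, if any
    for ref_base, read_base in zip(reference, read):
        if ref_base == "-" or read_base == "-":
            side = "reference" if ref_base == "-" else "read"
            if side != open_gap:
                score -= gap_open  # a new gap starts here
            score -= gap_extend    # assumed: every gap position pays this
            open_gap = side
        else:
            score += match if ref_base == read_base else -mismatch
            open_gap = None
    return score


# The two candidate alignments from the "fewer mismatches" example.
print(alignment_score("GATCAT", "GATCAA"))  # one mismatch
print(alignment_score("GATCAT", "GAGAAT"))  # two mismatches
print(alignment_score("GATCAT", "GA--AT"))  # one gap of length two
```

With these penalties the one-mismatch alignment gets the highest score, which matches the preference stated on the slide.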

30 Alignment key parameters - Repeats
Approximately 50% of the human genome is comprised of repeats.
Treangen T.J. and Salzberg S.L., Nature Reviews Genetics 13

31 Alignment key parameters - Repeats
Close proximity with genes: intergenic and intragenic positions.
[Figure: BRCA2, a mosaic of repeated regions]

32 Alignment key parameters - Repeats: 3 strategies
- 1. Report only unique alignments
- 2. Report best alignments and randomly assign reads across equally good loci
- 3. Report all (best) alignments
[Figure: reads assigned to repeat copies A and B under each strategy]
Treangen T.J. and Salzberg S.L., Nature Reviews Genetics 13
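The three reporting strategies can be sketched in Python as below. This is an illustration under simple assumptions: each candidate alignment of a read is represented as a hypothetical `(locus, score)` pair, and "unique" is interpreted as "the read has exactly one best-scoring locus". Real aligners expose these choices through their own options.

```python
import random
from typing import List, Optional, Tuple

Hit = Tuple[str, int]  # (locus name, alignment score), illustrative only


def best_hits(hits: List[Hit]) -> List[Hit]:
    """Keep the hits that share the highest score."""
    if not hits:
        return []
    top = max(score for _, score in hits)
    return [hit for hit in hits if hit[1] == top]


def report(hits: List[Hit], strategy: str,
           rng: Optional[random.Random] = None) -> List[Hit]:
    """Apply one of the three strategies to the hits of a single read."""
    best = best_hits(hits)
    if strategy == "unique":   # 1. drop reads with several best loci
        return best if len(best) == 1 else []
    if strategy == "random":   # 2. pick one of the equally good loci
        rng = rng or random.Random()
        return [rng.choice(best)] if best else []
    if strategy == "all":      # 3. report every best locus
        return best
    raise ValueError("unknown strategy: " + strategy)


# A read matching repeat copies A and B equally well, and C poorly.
hits = [("A", 20), ("B", 20), ("C", 5)]
for name in ("unique", "random", "all"):
    print(name, report(hits, name, random.Random(0)))
```

Strategy 1 loses the read entirely, strategy 2 keeps depth roughly balanced across copies at the cost of arbitrary placement, and strategy 3 leaves the decision to downstream analysis.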

33 Alignment key parameters - Using single or paired-end reads?
The type of sequencing (i.e. single or paired-end reads) is often driven by the application. Example: finding large indels, genomic rearrangements, ...
However, in most cases, the pair information can improve the mapping specificity.
[Figure: alignment of reads against the reference genome. Single-end alignment: the read ACGACTC falls in a repeated sequence. Paired-end alignment: the pair ACGACTC / GGCCAAC forms a unique sequence.]

34 Key points - Alignment on a reference genome
- The alignment is a crucial step of the NGS analysis.
- The reference genome has to be carefully chosen.
- The mappability of the region of interest has to be taken into account (primer design).
- The scoring method has to be chosen according to the sequencing error rate and the quality of the raw reads.
- The alignment parameters have to be set properly.

35 Limitations of Alignment Tools
Even though we now have some nice tools to align reads on a reference genome, several issues remain important:
- Homopolymer mapping
- Efficient alignment of small indels
- Alignment on several genomes
- Alignment on repeated sequences

36 Alignment formats
Many formats exist:
- SAM
- BAM
- ELAND (Illumina-specific)
- MAQ map
SAM and BAM are now the standard for aligned data.

37 SAM format
SAM stands for Sequence Alignment/Map.
- Tab-delimited text file
- 1 line per read
- Each line is composed of at least 11 fields
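As an illustration of the 11 mandatory fields, the Python sketch below splits one alignment line into named columns. The field names follow the SAM specification; the function name, the example read name, position, mapping quality and CIGAR string are hypothetical values chosen for the example.

```python
from typing import Dict, List

SAM_FIELDS: List[str] = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
                         "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
INTEGER_FIELDS = {"FLAG", "POS", "MAPQ", "PNEXT", "TLEN"}


def parse_sam_line(line: str) -> Dict[str, object]:
    """Split one alignment line into its 11 mandatory fields plus tags."""
    columns = line.rstrip("\r\n").split("\t")
    if len(columns) < len(SAM_FIELDS):
        raise ValueError("a SAM alignment line needs at least 11 fields")
    record: Dict[str, object] = {}
    for name, value in zip(SAM_FIELDS, columns):
        record[name] = int(value) if name in INTEGER_FIELDS else value
    record["TAGS"] = columns[len(SAM_FIELDS):]  # optional fields, if any
    return record


# Hypothetical alignment line (tab-separated).
example = "\t".join(["read_1", "0", "chr1", "100", "60", "20M", "*", "0", "0",
                     "AAGAGATCTGGAACCATAGA", "DGDFCDGFFGBEFFGFDEEF",
                     "NM:i:0"])
record = parse_sam_line(example)
print(record["RNAME"], record["POS"], record["CIGAR"], record["TAGS"])
```

Header lines, which start with "@" in a SAM file, are not alignment lines and would need to be skipped before calling this function.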

38 SAM format
11695_6 0 chr M * 0 0 AAGAGATCTGGAACCATAGA DGDFCDGFFGBEFFGFDEEF XA:i:0 MD:Z:20 NM:i:0 XX:i:
_1 0 chr M * 0 0 AGAGATCTGGAACCATAGA IIIIIIIIIIIIIIIIIII XA:i:0 MD:Z:19 NM:i:0 XX:i:
_1 0 chr M * 0 0 TCTGCAAGGCAAAAGACACTGT GHHHHHGHGHHHGHHHHBHBGG XA:i:0 MD:Z:22 NM:i:0 XX:i:
_1 0 chr M * 0 0 AAGAAAGAGAACTTCAGACC GGGG+GGGGGGIIIIIBHII XA:i:0 MD:Z:20 NM:i:0 XX:i:
_1 0 chr M * 0 0 GGGACTCAGCAGAACTTAGGA ?@GGGDGGGG>DDGGGGGGDB XA:i:0 MD:Z:21 NM:i:0 XX:i:
_1 0 chr M * 0 0 AGTCTGAACAGGTTAGAGGGTGC IIIIIIEGIHIGID<DBDGDBGB XA:i:0 MD:Z:23 NM:i:0 XX:i:

39 SAM format
The second field (the FLAG) can be used to quickly sort through the file, with Samtools (command line) and its -f and -F options.
Useful webpage: http://broadinstitute.github.io/picard/explain-flags.html
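The FLAG field is a sum of bit values, each with a meaning defined by the SAM specification. The Python sketch below decodes a flag and mimics the selection logic of the -f (all listed bits must be set) and -F (none of the listed bits may be set) options; the function names and short labels are illustrative, not part of Samtools.

```python
from typing import List

# Bit values from the SAM specification; the labels are shortened here.
FLAG_BITS = [
    (0x1, "paired"), (0x2, "proper_pair"), (0x4, "unmapped"),
    (0x8, "mate_unmapped"), (0x10, "reverse"), (0x20, "mate_reverse"),
    (0x40, "first_in_pair"), (0x80, "second_in_pair"),
    (0x100, "secondary"), (0x200, "qc_fail"), (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def explain_flag(flag: int) -> List[str]:
    """List the properties encoded in a FLAG value."""
    return [label for bit, label in FLAG_BITS if flag & bit]


def passes_filter(flag: int, require: int = 0, exclude: int = 0) -> bool:
    """Keep a read if all 'require' bits are set and no 'exclude' bit is."""
    return (flag & require) == require and (flag & exclude) == 0


print(explain_flag(99))
print(passes_filter(99, require=0x2, exclude=0x4))  # properly paired, mapped
print(passes_filter(4, exclude=0x4))                # unmapped read is dropped
```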

40 BAM format
BAM stands for Binary Alignment/Map.
- Corresponds to the SAM format compressed as BGZF
- Reduces the size of the alignment file about 5 times
- Not directly readable, unlike the SAM format: requires Samtools
- Best format for sharing alignment files
- Coupled with an index file (BAI), which avoids a sequential read of the complete file

41 Quality controls on aligned data: Standard Workflow for NGS Analysis
A typical NGS workflow (depends on the NGS application):
Sequencing & Primary Analysis -> Raw Reads -> Reads Cleaning -> Reads Mapping -> Data Analysis
Quality-control checkpoints along the way: QC 1, QC 2, QC 3

42 QC 3: which metrics to check?
In practice, how do I validate my alignment?
- Be aware of the mapping strategy used
- Look at simple descriptive statistics:
  - Number of aligned reads
  - Coverage/depth
  - Mapping quality
  - Number of normal/abnormal pairs for paired-end data
  - Strand bias
  - ...
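Coverage and depth, two of the descriptive statistics listed above, can be sketched as follows in Python. This is a simplified illustration: alignments are given as hypothetical `(start, length)` pairs on a small region, with 0-based starts, and gaps or clipping described by the CIGAR string are ignored.

```python
from typing import Dict, List, Tuple


def depth_profile(alignments: List[Tuple[int, int]],
                  region_length: int) -> List[int]:
    """Count, for each position of the region, the reads covering it."""
    depth = [0] * region_length
    for start, length in alignments:
        first = max(start, 0)
        last = min(start + length, region_length)  # exclusive end
        for position in range(first, last):
            depth[position] += 1
    return depth


def summarize(depth: List[int]) -> Dict[str, float]:
    """Mean depth, and breadth (fraction of positions covered at least once)."""
    if not depth:
        return {"mean_depth": 0.0, "breadth": 0.0}
    covered = sum(1 for value in depth if value > 0)
    return {"mean_depth": sum(depth) / len(depth),
            "breadth": covered / len(depth)}


# Two overlapping 4-base reads on an 8-base region.
profile = depth_profile([(0, 4), (2, 4)], region_length=8)
print(profile)
print(summarize(profile))
```

Looking at both numbers matters: a good mean depth can hide regions that are not covered at all.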

43 Paired-end mapping - Insert-size checking
- % of "All Good" = both reads in the pair have aligned and the pair is "properly aligned", meaning that the reads mapped within a proper distance from each other
- % of "All Bad" = neither the read nor its mate mapped
- % of "Only one read maps" = only one read in the pair is mapped
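These categories can be derived from the SAM FLAG of the first read of each pair, as in the Python sketch below. The category labels and function names are illustrative. One extra category, for pairs where both reads map but not as a proper pair, is an addition made here so that every pair is classified; it is not one of the three categories on the slide.

```python
from collections import Counter
from typing import Dict, Iterable

PROPER_PAIR, UNMAPPED, MATE_UNMAPPED = 0x2, 0x4, 0x8  # SAM flag bits


def classify_pair(flag: int) -> str:
    """Classify a pair from the FLAG of one of its reads."""
    read_unmapped = bool(flag & UNMAPPED)
    mate_unmapped = bool(flag & MATE_UNMAPPED)
    if read_unmapped and mate_unmapped:
        return "all_bad"
    if read_unmapped or mate_unmapped:
        return "one_read_maps"
    if flag & PROPER_PAIR:
        return "all_good"
    return "both_mapped_not_proper"  # extra category, see the text above


def pair_percentages(first_read_flags: Iterable[int]) -> Dict[str, float]:
    """Percentage of pairs falling in each category."""
    counts = Counter(classify_pair(flag) for flag in first_read_flags)
    total = sum(counts.values())
    return {category: 100.0 * count / total
            for category, count in counts.items()}


# Flags of the first read of four hypothetical pairs.
print(pair_percentages([99, 99, 77, 73]))
```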

44 NGS Analysis: how can I work with my NGS data?
Difficult on a personal computer (lack of resources):
- 1 alignment = 4 processors + 15 GB of RAM (to be multiplied by the number of samples)
- Impossible to open the files in software such as a text editor
- Need for a very large storage capacity
- Data backup administration
Application server connected to a computing cluster and a storage array:
- Commercial solutions (CLC Bio, NextGene, ...)
- Galaxy server: https://galaxy.gustaveroussy.fr/galaxyprod

45 Data analysis
A typical NGS workflow (depends on the NGS application):
Sequencing & Primary Analysis -> Raw Reads -> Reads Cleaning -> Reads Mapping -> Data Analysis
Quality-control checkpoints along the way: QC 1, QC 2, QC 3

46 Data Analyses in Cancer
- Chimeric transcript search
- Alternative transcripts study
- Differential expression study
- Methylation study
- Detection of genomic variants
- Detection of copy-number variation

47 Chimeric transcripts
Do the tumoral cells express any chimeric transcript?
[Figure: history of the BCR-ABL fusion]

48 Alternative transcripts

49 Differential expression
Are there genes that are strongly expressed in one kind of tumor but not in the other?
Can we group tumors according to their expression profiles?
[Figure: clustering of differential expression in breast tumours]

50 Methylome
Is there any difference between DNA methylation in tumors and in normal cells?
How does methylation promote cancer?

51 Detection of copy-number variations
Are there any copy-number alterations (gain or loss of chromosomal regions, amplifications) that could explain tumorigenesis?
[Figure: copy-number variations in cancer. MYC and KRAS are amplified.]

52 Detection of genomic variants
Are there mutational events that are specific to the tumoral genome?
Could the tumorigenesis be explained by those?
Is there any drug targeting those mutations?
[Figure: pancreas adenocarcinoma, from normal cells to tumoral cells]

53 Limitations: Detection of genomic variants
Between 1.4 and 8.9% of the variants are technology-specific.

54 Limitations: Detection of genomic variants
[Figure: common genomic variants between different variant callers]
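Comparing the output of several variant callers comes down to set operations on the variants they report. The Python sketch below shows the idea; the caller names, the variants and the representation of a variant as `(chromosome, position, reference allele, alternate allele)` are all hypothetical, and real comparisons also need variant normalization before matching.

```python
from typing import Dict, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)


def compare_callers(
    calls: Dict[str, Set[Variant]]
) -> Tuple[Set[Variant], Dict[str, Set[Variant]]]:
    """Return the variants shared by all callers and those private to each."""
    shared = set.intersection(*calls.values()) if calls else set()
    private = {}
    for name, variants in calls.items():
        others = set().union(*(found for other, found in calls.items()
                               if other != name))
        private[name] = variants - others
    return shared, private


# Hypothetical calls from two callers on the same sample.
calls = {
    "caller_a": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "caller_b": {("chr1", 100, "A", "T"), ("chr3", 300, "C", "G")},
}
shared, private = compare_callers(calls)
print(len(shared), {name: len(found) for name, found in private.items()})
```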

55 Conclusion
Nowadays, NGS is widely used in cancer centers to categorize cancers and to link patients with personalized treatments (Precision Medicine).
NGS is also used in cancer research, to discover new oncogenetic mechanisms, to understand how a treatment works, and to link biological and genetic traits.
Due to technical and how-the-universe-works-related issues, using NGS might not solve your problems. It is important to know that the technique is limited:
- A) by the question you asked at first. If a cancer cannot be explained by mutational events, it might be explained by other mechanisms. In that case, nothing will be found in the data.
- B) by technical issues. Sequencers and software are prone to errors. Statistically, there will be at least one error in your analysis. You can often reduce the impact of this limitation by using biological and technical replicates.

56 Galaxy: a web-based genome analysis platform
Galaxy is an open-source framework for integrating various computational tools and databases into a cohesive workspace: https://main.g2.bx.psu.edu/
- A web-based service that provides and integrates many popular tools and resources for comparative genomics
- A completely self-contained application for building your own Galaxy-style sites
29 January 2015, NGS & Cancer training - Exome analyses

57 Galaxy: the instant web-based tool and data resource integration platform
- Open-source downloadable package that can be deployed in individual labs
- Modularized:
  - Add new tools
  - Integrate new data sources
  - Easy to plug in your own components
- Straightforward to run your own private Galaxy server

58 Galaxy: the one-stop shop for genome analysis
Analyze:
- Retrieve data shared between Galaxy users or upload your own
- Interactively manipulate genomic data with a comprehensive and expanding best-practices toolset
- Galaxy is designed to work with many different datatypes: http://wiki.galaxyproject.org/learn/datatypes
Visualize:
- Visual analysis environment for your data and your analysis workflows
Publish and Share:
- Results and step-by-step analysis record (Data Libraries and Histories)
- Customizable pipelines (Workflows)
- Complete protocols/documentation (Pages)

60 Data libraries
Datasets are accessible from Galaxy or for download.

61 History
Histories record all the steps in the process and the settings used. Histories can be imported into your session and rerun as is or modified.

62 Workflows
Workflows specify the steps in a process (a suite of ordered tools). Workflows are analyses that are meant to be run repeatedly, each time with different user-provided datasets.

63 User account
Galaxy public Main or Test instances:
- An account is not required to access them
- But if one is used, the data quota is increased and full functionality across sessions opens up, such as naming, saving, sharing, and publishing Galaxy objects (Histories, Workflows, Datasets, Pages)
GR: https://galaxy.gustaveroussy.fr/galaxyprod
- An account is required to access it
- Full functionality across sessions opens up, such as naming, saving, sharing, and publishing Galaxy objects (Histories, Workflows, Datasets, Pages)


Single-strand DNA library preparation improves sequencing of formalin-fixed and paraffin-embedded (FFPE) cancer DNA www.impactjournals.com/oncotarget/ Oncotarget, Supplementary Materials 2016 Single-strand DNA library preparation improves sequencing of formalin-fixed and paraffin-embedded (FFPE) DNA Supplementary Materials

More information

Chip Seq Peak Calling in Galaxy

Chip Seq Peak Calling in Galaxy Chip Seq Peak Calling in Galaxy Chris Seward PowerPoint by Pei-Chen Peng Chip-Seq Peak Calling in Galaxy Chris Seward 2018 1 Introduction This goals of the lab are as follows: 1. Gain experience using

More information

Multiplex target enrichment using DNA indexing for ultra-high throughput variant detection

Multiplex target enrichment using DNA indexing for ultra-high throughput variant detection Multiplex target enrichment using DNA indexing for ultra-high throughput variant detection Dr Elaine Kenny Neuropsychiatric Genetics Research Group Institute of Molecular Medicine Trinity College Dublin

More information

NGS in Cancer Pathology After the Microscope: From Nucleic Acid to Interpretation

NGS in Cancer Pathology After the Microscope: From Nucleic Acid to Interpretation NGS in Cancer Pathology After the Microscope: From Nucleic Acid to Interpretation Michael R. Rossi, PhD, FACMG Assistant Professor Division of Cancer Biology, Department of Radiation Oncology Department

More information

SCALPEL MICRO-ASSEMBLY APPROACH TO DETECT INDELS WITHIN EXOME-CAPTURE DATA. Giuseppe Narzisi, PhD Schatz Lab

SCALPEL MICRO-ASSEMBLY APPROACH TO DETECT INDELS WITHIN EXOME-CAPTURE DATA. Giuseppe Narzisi, PhD Schatz Lab SCALPEL MICRO-ASSEMBLY APPROACH TO DETECT INDELS WITHIN EXOME-CAPTURE DATA Giuseppe Narzisi, PhD Schatz Lab November 14, 2013 Micro-Assembly Approach to detect INDELs 2 Outline Scalpel micro-assembly pipeline

More information

ACE ImmunoID Biomarker Discovery Solutions ACE ImmunoID Platform for Tumor Immunogenomics

ACE ImmunoID Biomarker Discovery Solutions ACE ImmunoID Platform for Tumor Immunogenomics ACE ImmunoID Biomarker Discovery Solutions ACE ImmunoID Platform for Tumor Immunogenomics Precision Genomics for Immuno-Oncology Personalis, Inc. ACE ImmunoID When one biomarker doesn t tell the whole

More information

Assessing Laboratory Performance for Next Generation Sequencing Based Detection of Germline Variants through Proficiency Testing

Assessing Laboratory Performance for Next Generation Sequencing Based Detection of Germline Variants through Proficiency Testing Assessing Laboratory Performance for Next Generation Sequencing Based Detection of Germline Variants through Proficiency Testing Karl V. Voelkerding, MD Professor of Pathology University of Utah Medical

More information

Small RNA Sequencing. Project Workflow. Service Description. Sequencing Service Specification BGISEQ-500 SERVICE OVERVIEW SAMPLE PREPARATION

Small RNA Sequencing. Project Workflow. Service Description. Sequencing Service Specification BGISEQ-500 SERVICE OVERVIEW SAMPLE PREPARATION BGISEQ-500 SERVICE OVERVIEW Small RNA Sequencing Service Description Small RNAs are a type of non-coding RNA (ncrna) molecules that are less than 200nt in length. They are often involved in gene silencing

More information

Clinical Utility of Actionable Genome Information in Precision Oncology Clinic

Clinical Utility of Actionable Genome Information in Precision Oncology Clinic Indian Ocean Rim 2017 Laboratory Haematology Congress 2017. 6.18-19, Singapore Clinical Utility of Actionable Genome Information in Precision Oncology Clinic Reimbursement program for NGS panel tests in

More information

Case Studies on High Throughput Gene Expression Data Kun Huang, PhD Raghu Machiraju, PhD

Case Studies on High Throughput Gene Expression Data Kun Huang, PhD Raghu Machiraju, PhD Case Studies on High Throughput Gene Expression Data Kun Huang, PhD Raghu Machiraju, PhD Department of Biomedical Informatics Department of Computer Science and Engineering The Ohio State University Review

More information

Review: Genome assembly Reads

Review: Genome assembly Reads Assembly validation Review: Genome assembly Reads Contigs Scaffolds Chromosome Review: Mate pair data Overlap-Layout-Consensus AMOS project: A Modular Open Source assembler Importing data to an AMOS bank

More information

SubLasso:a feature selection and classification R package with a. fixed feature subset

SubLasso:a feature selection and classification R package with a. fixed feature subset SubLasso:a feature selection and classification R package with a fixed feature subset Youxi Luo,3,*, Qinghan Meng,2,*, Ruiquan Ge,2, Guoqin Mai, Jikui Liu, Fengfeng Zhou,#. Shenzhen Institutes of Advanced

More information

Session 4 Rebecca Poulos

Session 4 Rebecca Poulos The Cancer Genome Atlas (TCGA) & International Cancer Genome Consortium (ICGC) Session 4 Rebecca Poulos Prince of Wales Clinical School Introductory bioinformatics for human genomics workshop, UNSW 20

More information

ACE ImmunoID. ACE ImmunoID. Precision immunogenomics. Precision Genomics for Immuno-Oncology

ACE ImmunoID. ACE ImmunoID. Precision immunogenomics. Precision Genomics for Immuno-Oncology ACE ImmunoID ACE ImmunoID Precision immunogenomics Precision Genomics for Immuno-Oncology Personalis, Inc. A universal biomarker platform for immuno-oncology Patient response to cancer immunotherapies

More information

Bigomics : Challenges and promises in large scale sequencing projects

Bigomics : Challenges and promises in large scale sequencing projects Bigomics : Challenges and promises in large scale sequencing projects Theodore M. Wong Ken Yocum MSST 2014 2010 Illumina, Inc. All rights reserved. Illumina, illuminadx, Solexa, Making Sense Out of Life,

More information

ncounter Data Analysis Guidelines for Copy Number Variation (CNV) Molecules That Count NanoString Technologies, Inc.

ncounter Data Analysis Guidelines for Copy Number Variation (CNV) Molecules That Count NanoString Technologies, Inc. ncounter Data Analysis Guidelines for Copy Number Variation (CNV) NanoString Technologies, Inc. 530 Fairview Ave N Suite 2000 Seattle, Washington 98109 www.nanostring.com Tel: 206.378.6266 888.358.6266

More information

cn.mops - Mixture of Poissons for CNV detection in NGS data Günter Klambauer Institute of Bioinformatics, Johannes Kepler University Linz

cn.mops - Mixture of Poissons for CNV detection in NGS data Günter Klambauer Institute of Bioinformatics, Johannes Kepler University Linz Software Manual Institute of Bioinformatics, Johannes Kepler University Linz cn.mops - Mixture of Poissons for CNV detection in NGS data Günter Klambauer Institute of Bioinformatics, Johannes Kepler University

More information

BWA alignment to reference transcriptome and genome. Convert transcriptome mappings back to genome space

BWA alignment to reference transcriptome and genome. Convert transcriptome mappings back to genome space Whole genome sequencing Whole exome sequencing BWA alignment to reference transcriptome and genome Convert transcriptome mappings back to genome space genomes Filter on MQ, distance, Cigar string Annotate

More information

Genome. Institute. GenomeVIP: A Genomics Analysis Pipeline for Cloud Computing with Germline and Somatic Calling on Amazon s Cloud. R. Jay Mashl.

Genome. Institute. GenomeVIP: A Genomics Analysis Pipeline for Cloud Computing with Germline and Somatic Calling on Amazon s Cloud. R. Jay Mashl. GenomeVIP: the Genome Institute at Washington University A Genomics Analysis Pipeline for Cloud Computing with Germline and Somatic Calling on Amazon s Cloud R. Jay Mashl October 20, 2014 Turnkey Variant

More information

Valida5on of a Microsatellite Instability Assay by NGS

Valida5on of a Microsatellite Instability Assay by NGS Valida5on of a Microsatellite Instability Assay by NGS Mark R. Miglarese, Ph.D. VP R&D 1 Caris Life Sciences 230,000+ tests performed in 2016 Headquarters: Irving, Texas Laboratory: Phoenix, Arizona 66,000

More information

Calling DNA variants SNVs, CNVs, and SVs. Steve Laurie Variant Effect Predictor Training Course Prague, 6 th November 2017

Calling DNA variants SNVs, CNVs, and SVs. Steve Laurie Variant Effect Predictor Training Course Prague, 6 th November 2017 1 Calling DNA variants SNVs, CNVs, and SVs Steve Laurie Variant Effect Predictor Training Course Prague, 6 th November 2017 Calling DNA variants SNVs, CNVs, SVs 2 1. What is a variant? 2. Paired End read

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION doi:10.1038/nature11396 A total of 2078 samples from a large sequencing project at decode were used in this study, 219 samples from 78 trios with two grandchildren who were not also members of other trios,

More information

Fluxion Biosciences and Swift Biosciences Somatic variant detection from liquid biopsy samples using targeted NGS

Fluxion Biosciences and Swift Biosciences Somatic variant detection from liquid biopsy samples using targeted NGS APPLICATION NOTE Fluxion Biosciences and Swift Biosciences OVERVIEW This application note describes a robust method for detecting somatic mutations from liquid biopsy samples by combining circulating tumor

More information

Exercises: Differential Methylation

Exercises: Differential Methylation Exercises: Differential Methylation Version 2018-04 Exercises: Differential Methylation 2 Licence This manual is 2014-18, Simon Andrews. This manual is distributed under the creative commons Attribution-Non-Commercial-Share

More information

New Drug development and Personalized Therapy in The Era of Molecular Medicine

New Drug development and Personalized Therapy in The Era of Molecular Medicine New Drug development and Personalized Therapy in The Era of Molecular Medicine Ramesh K. Ramanathan MD Virginia G. Piper Cancer Center Translational Genomics Research Institute Scottsdale, AZ Clinical

More information

38 Int'l Conf. Bioinformatics and Computational Biology BIOCOMP'16

38 Int'l Conf. Bioinformatics and Computational Biology BIOCOMP'16 38 Int'l Conf. Bioinformatics and Computational Biology BIOCOMP'16 PGAR: ASD Candidate Gene Prioritization System Using Expression Patterns Steven Cogill and Liangjiang Wang Department of Genetics and

More information

Lyon, 1 3 February 2012 Auditorium

Lyon, 1 3 February 2012 Auditorium Forty-eighth Session 04/11/2011 Lyon, 1 3 February 2012 Auditorium OPEN SESSION ON SCIENTIFIC TOPICS OF IMPORTANCE TO IARC The advice from the Scientific Council on the two scientific topics selected by

More information

STREAMLINED MUTATION ANALYSIS FOR CLINICAL NEXT GENERATION SEQUENCING DATA

STREAMLINED MUTATION ANALYSIS FOR CLINICAL NEXT GENERATION SEQUENCING DATA STREAMLINED MUTATION ANALYSIS FOR CLINICAL NEXT GENERATION SEQUENCING DATA An Interactive Qualifying Project Report Submitted to the Faculty of WORCESTER POLYTECHNIC INSTITUTE In partial fulfillment of

More information

Illuminating the genetics of complex human diseases

Illuminating the genetics of complex human diseases Illuminating the genetics of complex human diseases Michael Schatz Sept 27, 2012 Beyond the Genome @mike_schatz / #BTG2012 Outline 1. De novo mutations in human diseases 1. Autism Spectrum Disorder 2.

More information

Tumor mutational burden and its transition towards the clinic

Tumor mutational burden and its transition towards the clinic Tumor mutational burden and its transition towards the clinic G C C A T C A C Wolfram Jochum Institute of Pathology Kantonsspital St.Gallen CH-9007 St.Gallen wolfram.jochum@kssg.ch 30th European Congress

More information

Nature Methods: doi: /nmeth.3115

Nature Methods: doi: /nmeth.3115 Supplementary Figure 1 Analysis of DNA methylation in a cancer cohort based on Infinium 450K data. RnBeads was used to rediscover a clinically distinct subgroup of glioblastoma patients characterized by

More information

QIAGEN Complete Solutions for Liquid Biopsy Molecular Testing

QIAGEN Complete Solutions for Liquid Biopsy Molecular Testing QIAGEN Complete Solutions for Liquid Biopsy Molecular Testing Christopher Swagell, PhD Market Development Manager, Advanced Molecular Pathology QIAGEN 1 Agenda QIAGEN Solid Tumor Testing and Liquid Biopsy

More information

Session 4 Rebecca Poulos

Session 4 Rebecca Poulos The Cancer Genome Atlas (TCGA) & International Cancer Genome Consortium (ICGC) Session 4 Rebecca Poulos Prince of Wales Clinical School Introductory bioinformatics for human genomics workshop, UNSW 28

More information

User Guide. Association analysis. Input

User Guide. Association analysis. Input User Guide TFEA.ChIP is a tool to estimate transcription factor enrichment in a set of differentially expressed genes using data from ChIP-Seq experiments performed in different tissues and conditions.

More information

RNA-Seq Preparation Comparision Summary: Lexogen, Standard, NEB

RNA-Seq Preparation Comparision Summary: Lexogen, Standard, NEB RNA-Seq Preparation Comparision Summary: Lexogen, Standard, NEB CSF-NGS January 22, 214 Contents 1 Introduction 1 2 Experimental Details 1 3 Results And Discussion 1 3.1 ERCC spike ins............................................

More information

A complete next-generation sequencing workfl ow for circulating cell-free DNA isolation and analysis

A complete next-generation sequencing workfl ow for circulating cell-free DNA isolation and analysis APPLICATION NOTE Cell-Free DNA Isolation Kit A complete next-generation sequencing workfl ow for circulating cell-free DNA isolation and analysis Abstract Circulating cell-free DNA (cfdna) has been shown

More information

OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies

OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies 2017 Contents Datasets... 2 Protein-protein interaction dataset... 2 Set of known PPIs... 3 Domain-domain interactions...

More information

Introduction. Introduction

Introduction. Introduction Introduction We are leveraging genome sequencing data from The Cancer Genome Atlas (TCGA) to more accurately define mutated and stable genes and dysregulated metabolic pathways in solid tumors. These efforts

More information

Hands-On Ten The BRCA1 Gene and Protein

Hands-On Ten The BRCA1 Gene and Protein Hands-On Ten The BRCA1 Gene and Protein Objective: To review transcription, translation, reading frames, mutations, and reading files from GenBank, and to review some of the bioinformatics tools, such

More information

TOWARDS ACCURATE GERMLINE AND SOMATIC INDEL DISCOVERY WITH MICRO-ASSEMBLY. Giuseppe Narzisi, PhD Bioinformatics Scientist

TOWARDS ACCURATE GERMLINE AND SOMATIC INDEL DISCOVERY WITH MICRO-ASSEMBLY. Giuseppe Narzisi, PhD Bioinformatics Scientist TOWARDS ACCURATE GERMLINE AND SOMATIC INDEL DISCOVERY WITH MICRO-ASSEMBLY Giuseppe Narzisi, PhD Bioinformatics Scientist July 29, 2014 Micro-Assembly Approach to detect INDELs 2 Outline 1 Detecting INDELs:

More information

Data mining with Ensembl Biomart. Stéphanie Le Gras

Data mining with Ensembl Biomart. Stéphanie Le Gras Data mining with Ensembl Biomart Stéphanie Le Gras (slegras@igbmc.fr) Guidelines Genome data Genome browsers Getting access to genomic data: Ensembl/BioMart 2 Genome Sequencing Example: Human genome 2000:

More information

Detection of copy number variations in PCR-enriched targeted sequencing data

Detection of copy number variations in PCR-enriched targeted sequencing data Detection of copy number variations in PCR-enriched targeted sequencing data German Demidov Parseq Lab, Saint-Petersburg University of Russian Academy of Sciences, current: Center for Genomic Regulation

More information

RNA-seq Introduction

RNA-seq Introduction RNA-seq Introduction DNA is the same in all cells but which RNAs that is present is different in all cells There is a wide variety of different functional RNAs Which RNAs (and sometimes then translated

More information

Personalised medicine: Past, present and future

Personalised medicine: Past, present and future Kathmandu, Bir Hospital visit, August 2018 Personalised medicine: Past, present and future Rodney J. Scott University of Newcastle, NSW, Australia & Hunter Area Pathology Service Current Medical Care Started

More information

5 th July 2016 ACGS Dr Michelle Wood Laboratory Genetics, Cardiff

5 th July 2016 ACGS Dr Michelle Wood Laboratory Genetics, Cardiff 5 th July 2016 ACGS Dr Michelle Wood Laboratory Genetics, Cardiff National molecular screening of patients with lung cancer for a national trial of multiple novel agents. 2000 NSCLC patients/year (late

More information

The mutations that drive cancer. Paul Edwards. Department of Pathology and Cancer Research UK Cambridge Institute, University of Cambridge

The mutations that drive cancer. Paul Edwards. Department of Pathology and Cancer Research UK Cambridge Institute, University of Cambridge The mutations that drive cancer Paul Edwards Department of Pathology and Cancer Research UK Cambridge Institute, University of Cambridge Previously on Cancer... hereditary predisposition Normal Cell Slightly

More information

RNA SEQUENCING AND DATA ANALYSIS

RNA SEQUENCING AND DATA ANALYSIS RNA SEQUENCING AND DATA ANALYSIS Download slides and package http://odin.mdacc.tmc.edu/~rverhaak/package.zip http://odin.mdacc.tmc.edu/~rverhaak/rna-seqlecture.zip Overview Introduction into the topic

More information

PRECISION INSIGHTS. Liquid GPS. Blood-based tumor profiling and quantitative monitoring. Reveal more with cfdna + cfrna.

PRECISION INSIGHTS. Liquid GPS. Blood-based tumor profiling and quantitative monitoring. Reveal more with cfdna + cfrna. PRECISION INSIGHTS Liquid GPS Blood-based tumor profiling and quantitative monitoring Reveal more with cfdna + cfrna www.nanthealth.com Why Blood-Based Tumor Profiling? Although tissue-based molecular

More information

Using the Bravo Liquid-Handling System for Next Generation Sequencing Sample Prep

Using the Bravo Liquid-Handling System for Next Generation Sequencing Sample Prep Using the Bravo Liquid-Handling System for Next Generation Sequencing Sample Prep Tom Walsh, PhD Division of Medical Genetics University of Washington Next generation sequencing Sanger sequencing gold

More information

The Cancer Genome Atlas & International Cancer Genome Consortium

The Cancer Genome Atlas & International Cancer Genome Consortium The Cancer Genome Atlas & International Cancer Genome Consortium Session 3 Dr Jason Wong Prince of Wales Clinical School Introductory bioinformatics for human genomics workshop, UNSW 31 st July 2014 1

More information

Ginkgo Interactive analysis and quality assessment of single-cell CNV data

Ginkgo Interactive analysis and quality assessment of single-cell CNV data Ginkgo Interactive analysis and quality assessment of single-cell CNV data @RobAboukhalil Robert Aboukhalil, Tyler Garvin, Jude Kendall, Timour Baslan, Gurinder S. Atwal, Jim Hicks, Michael Wigler, Michael

More information

Colorspace & Matching

Colorspace & Matching Colorspace & Matching Outline Color space and 2-base-encoding Quality Values and filtering Mapping algorithm and considerations Estimate accuracy Coverage 2 2008 Applied Biosystems Color Space Properties

More information

SVIM: Structural variant identification with long reads DAVID HELLER MAX PLANCK INSTITUTE FOR MOLECULAR GENETICS, BERLIN JUNE 2O18, SMRT LEIDEN

SVIM: Structural variant identification with long reads DAVID HELLER MAX PLANCK INSTITUTE FOR MOLECULAR GENETICS, BERLIN JUNE 2O18, SMRT LEIDEN SVIM: Structural variant identification with long reads DAVID HELLER MAX PLANCK INSTITUTE FOR MOLECULAR GENETICS, BERLIN JUNE 2O18, SMRT LEIDEN Structural variation (SV) Variants larger than 50bps Affect

More information

The Epigenome Tools 2: ChIP-Seq and Data Analysis

The Epigenome Tools 2: ChIP-Seq and Data Analysis The Epigenome Tools 2: ChIP-Seq and Data Analysis Chongzhi Zang zang@virginia.edu http://zanglab.com PHS5705: Public Health Genomics March 20, 2017 1 Outline Epigenome: basics review ChIP-seq overview

More information

VirusDetect pipeline - virus detection with small RNA sequencing

VirusDetect pipeline - virus detection with small RNA sequencing VirusDetect pipeline - virus detection with small RNA sequencing CSC webinar 16.1.2018 Eija Korpelainen, Kimmo Mattila, Maria Lehtivaara Big thanks to Jan Kreuze and Jari Valkonen! Outline Small interfering

More information

TCGA. The Cancer Genome Atlas

TCGA. The Cancer Genome Atlas TCGA The Cancer Genome Atlas TCGA: History and Goal History: Started in 2005 by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) with $110 Million to catalogue

More information

Nature Biotechnology: doi: /nbt.1904

Nature Biotechnology: doi: /nbt.1904 Supplementary Information Comparison between assembly-based SV calls and array CGH results Genome-wide array assessment of copy number changes, such as array comparative genomic hybridization (acgh), is

More information

VARIANT PRIORIZATION AND ANALYSIS INCORPORATING PROBLEMATIC REGIONS OF THE GENOME ANIL PATWARDHAN

VARIANT PRIORIZATION AND ANALYSIS INCORPORATING PROBLEMATIC REGIONS OF THE GENOME ANIL PATWARDHAN VARIANT PRIORIZATION AND ANALYSIS INCORPORATING PROBLEMATIC REGIONS OF THE GENOME ANIL PATWARDHAN Email: apatwardhan@personalis.com MICHAEL CLARK Email: michael.clark@personalis.com ALEX MORGAN Email:

More information

NGS in tissue and liquid biopsy

NGS in tissue and liquid biopsy NGS in tissue and liquid biopsy Ana Vivancos, PhD Referencias So, why NGS in the clinics? 2000 Sanger Sequencing (1977-) 2016 NGS (2006-) ABIPrism (Applied Biosystems) Up to 2304 per day (96 sequences

More information

Transform genomic data into real-life results

Transform genomic data into real-life results CLINICAL SUMMARY Transform genomic data into real-life results Biomarker testing and targeted therapies can drive improved outcomes in clinical practice New FDA-Approved Broad Companion Diagnostic for

More information

Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM. Bioinformatics, 2010

Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM. Bioinformatics, 2010 Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM. Bioinformatics, 2010 C.J.Vaske et al. May 22, 2013 Presented by: Rami Eitan Complex Genomic

More information