Expert-guided Visual Exploration (EVE) for patient stratification. Hamid Bolouri, Lue-Ping Zhao, Eric C. Holland

1 Expert-guided Visual Exploration (EVE) for patient stratification Hamid Bolouri, Lue-Ping Zhao, Eric C. Holland

2 Oncoscape.sttrcancer.org Paul Lisa Ken Jenny Desert Eric

3 The challenge
Given:
- patient clinical records, and
- genome-wide data for one or more of:
  - mRNA, miRNA, protein expression
  - DNA single nucleotide and copy number alterations (SNAs, CNAs)
  - methylated DNA regions per gene
Identify clinically relevant patient sub-groups.

4 Classical approaches to finding disease sub-types:
1) Supervised stratification
- Divide patients by known criteria (e.g. fusion genes)
- Test for enrichment of markers (e.g. mutations) within groups
(limited applicability)
2) Guided clustering
- Cluster samples by most variable or candidate genes/probes
- Find optimum number of clusters and cluster boundaries
- Find genes/probes that best distinguish among clusters
3) Biomarker-based
- Find genes/probes correlated with one or more phenotypes
(sophisticated methods, complex to generalize)

5 Motivation:
- Empower disease experts, exploit their biomedical knowledge/insights
- Leverage users' pattern recognition and reasoning skills
- Use methods that can easily be adapted to all cancers
- Use methods applicable to all genome-wide data types
- Enable integration of large-scale and in-house data
- Allow for nonlinear gene-gene interactions
Overview of EVE
Step 1: Reduce genome-wide data to informative features (gene/probe sets)
Step 2: Calculate sample similarities/distances using each feature
Step 3: Visualize feature-based distance matrices as 2D/3D scatter plots
Step 4: Simultaneously color-in samples with shared properties across all plots
Useful side effects
- Can use published gene/probe sets
- Methods available for automated feature and parameter searching
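Steps 1 and 2 of the overview can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (a dictionary of per-sample gene values), the function name feature_distance_matrices, the toy gene sets and the choice of Euclidean distance are all hypothetical and are not taken from the slides. Steps 3 and 4 (embedding each matrix as a scatter plot and coloring shared samples) are not shown.

```python
import math

def feature_distance_matrices(profiles, features):
    """Build one sample-by-sample distance matrix per feature (gene set).

    profiles: {sample_id: {gene: value}}      (hypothetical layout)
    features: {feature_name: [gene, ...]}     (hypothetical gene sets)
    Returns (samples, {feature_name: matrix}) with rows/columns in `samples` order.
    """
    samples = sorted(profiles)
    matrices = {}
    for name, genes in features.items():
        # Step 1: reduce each sample to the genes in this feature.
        vectors = [[profiles[s][g] for g in genes] for s in samples]
        # Step 2: pairwise distances (Euclidean here, as an assumption).
        matrices[name] = [[math.dist(u, v) for v in vectors] for u in vectors]
    return samples, matrices

# Toy values, for illustration only.
toy_profiles = {
    "s1": {"g1": 0.0, "g2": 0.0, "g3": 1.0},
    "s2": {"g1": 3.0, "g2": 4.0, "g3": 1.0},
}
toy_features = {"setA": ["g1", "g2"], "setB": ["g3"]}
samples, matrices = feature_distance_matrices(toy_profiles, toy_features)
```

Each feature yields its own matrix, so two samples can be far apart under one gene set and identical under another, which is what makes the side-by-side plots informative.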

6 Feature-based pattern-recognition
Discover/define features -> cluster by features -> find informative clusters
End point: cluster archetypes (representatives)
Two types of features:
- local (e.g. eye color)
- global (e.g. head shape)
Example feature relationships:
- distance between eyes
- relative size of mouth/nose

7 Molecular features to cluster by: sets of genes/probes
- measuring expression, methylation, DNA-variation, genomic location, ...
- associated with a particular clinical phenotype, cellular process, cell type, ...
[Figure: MGMT probe methylation levels compared to all probes genome-wide (in a single GBM patient); series: promoter (<1 Kbp of TSS), all MGMT probes]

8 Example available gene sets (similarity/distance features)

9 Example distance measures

10
1) similaritysna: inner product of per-gene indicator vectors
2) similaritycna: inner product of per-gene thresholded GISTIC scores
3) joint.sna.cna: sum of normalized SNA and CNA similarity scores
4) joint.exp.cna: inner product of per-gene expression Z-scores & thresholded GISTIC scores
5) me.top8000: top 8000 methylation probes by Median Absolute Deviation
6) me.marker.probes: 55 probes reported to be differentially methylated in gliomas (14 genes)
7) me.cimp: 1444 of 1503 G-CIMP classifier probes (TCGA 2010)
8) me.ingene: 291,151 me-probes within gene bodies
9) me.promotercpg: 120,560 me-probes within CpG islands <1000 bp of transcription start sites
10) corr.me.exp: correlation(top 8000 MAD(per gene methylation, per gene expression))
11) glioma.gene.exp: 162 glioma-associated genes from the literature
12) stemness.exp: 396 stemness marker genes
13) metabolic.exp: PCA(1157/1240 KEGG Hs. metabolic genes excl. GBM subtype classifiers)
14) GSEA.C2: Manhattan distance(ssGSEA of MSigDB C2 (pathways) gene set)
15) GSEA.C7: Manhattan distance(ssGSEA of MSigDB C7 (immunologic) gene set)
16) exp.os: PCA(expression of 46 genes reported to predict Overall Survival)
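Several of these features select the most variable probes by Median Absolute Deviation (MAD). A minimal sketch of that selection step follows; the function names, the input layout (probe ID mapped to values across samples) and the toy numbers are assumptions made for illustration, and real inputs would be far larger.

```python
from statistics import median

def mad(values):
    """Median Absolute Deviation: median of |x - median(x)|."""
    center = median(values)
    return median([abs(v - center) for v in values])

def top_by_mad(probe_values, k):
    """Return the k probe IDs with the largest MAD across samples.

    probe_values: {probe_id: [value per sample]}  (hypothetical layout)
    Ties are broken by probe ID so the result is deterministic.
    """
    ranked = sorted(probe_values, key=lambda p: (-mad(probe_values[p]), p))
    return ranked[:k]

# Toy methylation values, for illustration only.
toy_probes = {
    "cg1": [0.1, 0.1, 0.1, 0.1],   # no variation
    "cg2": [0.0, 0.5, 1.0, 0.5],   # highly variable
    "cg3": [0.4, 0.5, 0.6, 0.5],   # mildly variable
}
most_variable = top_by_mad(toy_probes, 2)
```

MAD is less sensitive to a few extreme samples than the standard deviation, which is one common reason to rank probes this way.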

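11 Example (user-defined) sample similarity measures

$$I_{SNA}(\text{gene}_i) = \begin{cases} 1 & \text{if gene } i \text{ is mutated} \\ 0 & \text{otherwise} \end{cases}$$

$$I_{CNA}(\text{gene}_i) = \begin{cases} -2 & \text{if ploidy} = 0 \\ -1 & \text{if ploidy} = 1 \\ 0 & \text{if ploidy} = 2 \\ +1 & \text{if ploidy} = 3 \\ +2 & \text{if ploidy} > 3 \end{cases}$$

$$s_i = \text{sample SNA vector} = \begin{pmatrix} I_{SNA}(\text{gene}_1) \\ I_{SNA}(\text{gene}_2) \\ \vdots \\ I_{SNA}(\text{gene}_{20K}) \end{pmatrix}, \qquad c_i = \text{sample CNA vector} = \begin{pmatrix} I_{CNA}(\text{gene}_1) \\ I_{CNA}(\text{gene}_2) \\ \vdots \\ I_{CNA}(\text{gene}_{20K}) \end{pmatrix}$$

$$\text{SNA similarity}(s_1, s_2) = s_1 \cdot s_2$$

$$\text{CNA similarity}(c_1, c_2) = c_1 \cdot c_2$$

$$\text{Joint SNA:CNA similarity} = \frac{s}{\mathrm{sum}(s)} + \frac{c}{\mathrm{sum}(c)}$$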
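The indicator vectors and inner-product similarities on this slide translate directly into code. The sketch below is illustrative only: the function names and toy inputs are hypothetical, and the joint score reads "s/sum(s) + c/sum(c)" as dividing each list of pairwise similarity scores by its total before adding, which is one possible interpretation rather than the authors' confirmed definition.

```python
def sna_indicator(mutated_genes, genes):
    """1 if the gene is mutated in the sample, 0 otherwise."""
    return [1 if g in mutated_genes else 0 for g in genes]

def cna_indicator(ploidy):
    """Map ploidy to -2, -1, 0, +1, +2 as defined on the slide."""
    if ploidy > 3:
        return 2
    return {0: -2, 1: -1, 2: 0, 3: 1}[ploidy]

def dot(u, v):
    """Inner product, used for both SNA and CNA similarity."""
    return sum(a * b for a, b in zip(u, v))

def joint_similarity(sna_scores, cna_scores):
    """Assumed reading: normalize each score list by its sum, then add.

    CNA scores can be negative, so a zero total is possible and is rejected.
    """
    s_total, c_total = sum(sna_scores), sum(cna_scores)
    if s_total == 0 or c_total == 0:
        raise ValueError("cannot normalize similarity scores that sum to zero")
    return [s / s_total + c / c_total for s, c in zip(sna_scores, cna_scores)]

# Toy example with four hypothetical genes.
genes = ["geneA", "geneB", "geneC", "geneD"]
s1 = sna_indicator({"geneA", "geneB"}, genes)      # [1, 1, 0, 0]
s2 = sna_indicator({"geneB", "geneC"}, genes)      # [0, 1, 1, 0]
c1 = [cna_indicator(p) for p in [2, 2, 1, 4]]      # [0, 0, -1, 2]
c2 = [cna_indicator(p) for p in [2, 3, 1, 3]]      # [0, 1, -1, 1]
sna_sim = dot(s1, s2)
cna_sim = dot(c1, c2)
```

Because the CNA indicator is signed, two samples that share a deletion or share an amplification both add to the similarity, while a deletion in one and a gain in the other subtracts from it.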

12 I'll use data for 1105 TCGA glioma samples as an example:
- Single Nucleotide Alterations (SNAs) from exome-sequencing
- Copy Number Alterations (CNAs) from SNP6.0 arrays
- DNA methylation from Infinium 450K arrays
- mRNA-seq
- Clinical data, but
  - ~2/3 of lower grade glioma patients were alive at data collection
  - ~1/5 have no status information

13 Example views of sample similarity
[Figure labels: Lower Grade Glioma (LGG); GBM (grade 4 glioma); published CpG-island methylator phenotype (CIMP) GBM]

14 non-CIMP LGGs have GBM-like survival
[Figure labels: CIMP, non-CIMP]

15 non-CIMP LGGs have no del(1p19q) & GBM-like freq(CNVs) in CDKN2A/2B, EGFR, PDGFA, MET
[Figure: mutation frequency; groups: CIMP LGGs, non-CIMP LGGs, GBMs; other labels: PTEN, EGFR, "17 authors"]

16 non-CIMP LGGs are GBM-like in DNA sequence, but distinct in expression & a subset of DNA-me
[Figure label: extent of anti-correlation between promoter-me & expression]

17 Gliomas can be divided into eight distinct genomic clusters (empirical p-value <0.0001)

18 P-value calculation for GBM SNA:CNA cluster 1
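The transcription does not say how the empirical p-value was computed, so the following is only a generic resampling sketch, not the authors' procedure. It assumes a hypothetical test statistic (mean pairwise distance within the cluster) and a hypothetical null model (random sample sets of the same size); the function names are likewise illustrative.

```python
import random
from itertools import combinations

def mean_pairwise(dist, members):
    """Mean pairwise distance among the given sample indices."""
    pairs = list(combinations(sorted(members), 2))
    return sum(dist[i][j] for i, j in pairs) / len(pairs)

def empirical_p_value(dist, cluster, n_draws=1000, seed=0):
    """Fraction of random same-size sample sets at least as tight as `cluster`.

    dist: full symmetric distance matrix (list of lists).
    cluster: indices of the samples in the cluster being tested.
    The +1 terms keep the estimate strictly above zero.
    """
    rng = random.Random(seed)
    observed = mean_pairwise(dist, cluster)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(range(len(dist)), len(cluster))
        if mean_pairwise(dist, draw) <= observed:
            hits += 1
    return (hits + 1) / (n_draws + 1)

# Toy one-dimensional positions: three tight samples plus seven scattered ones.
positions = [0.0, 0.1, 0.2, 10, 20, 30, 40, 50, 60, 70]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
p = empirical_p_value(toy_dist, [0, 1, 2], n_draws=200, seed=1)
```

With this estimator the smallest reportable value is 1/(n_draws + 1), so supporting a claim of p < 0.0001 requires at least 10,000 random draws.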

19 Density based clustering verifies genomic clusters
1) density-based clustering
2) density contours
3) points inside a selected contour
4) cluster members within selected contour

20 274 cancer-causing genes are sufficient to capture sample similarities; top 20 genes not sufficient
[Figure panels: 97 genes shared by Vogelstein et al & UW Oncoplex; 274 genes from Vogelstein et al & UW Oncoplex; impact of leaving 1 of 274 genes out, measured as change in sum(all distances); impact of leaving ATRX out. Gene labels: IDH1, TP53, ATRX]
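The leave-one-gene-out experiment on this slide measures each gene's impact as the change in the sum of all pairwise sample distances. A minimal sketch of that idea follows, assuming Manhattan distance over per-gene indicator values; the distance choice, function names and toy data are assumptions, since the slide does not specify them.

```python
from itertools import combinations

def sum_all_distances(vectors, skip=None):
    """Sum of Manhattan distances over all sample pairs, optionally ignoring one gene index."""
    total = 0
    for u, v in combinations(vectors, 2):
        total += sum(abs(a - b) for k, (a, b) in enumerate(zip(u, v)) if k != skip)
    return total

def leave_one_gene_out(vectors, genes):
    """Impact of each gene = baseline sum of distances minus the sum with that gene removed."""
    baseline = sum_all_distances(vectors)
    return {g: baseline - sum_all_distances(vectors, skip=k) for k, g in enumerate(genes)}

# Toy indicator vectors for three samples and three hypothetical genes.
toy_genes = ["geneA", "geneB", "geneC"]
toy_vectors = [[1, 0, 1], [0, 0, 1], [1, 0, 0]]
impact = leave_one_gene_out(toy_vectors, toy_genes)
```

A gene that never varies across samples (geneB in the toy data) has zero impact, while genes that separate samples account for the largest changes.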

21 Robustness: the location of a tumor can be estimated from its similarity to other samples (sample leave-one-out experiment)
[Figure: sample locations vs. centroid of 3 nearest neighbors; left-out sample's location estimated by 3-neighbor centroid]
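The 3-neighbor centroid estimate can be sketched as below. The sketch assumes that every retained sample already has 2D plot coordinates and that the distances from the left-out sample to the retained samples are known; the function name and input layout are hypothetical.

```python
def estimate_location(coords, dist_to_others, k=3):
    """Estimate a left-out sample's plot position from its k most similar samples.

    coords: {sample_id: (x, y)} for the retained samples.
    dist_to_others: {sample_id: distance from the left-out sample}.
    Returns the centroid of the k nearest retained samples.
    """
    nearest = sorted(dist_to_others, key=dist_to_others.get)[:k]
    n = len(nearest)  # may be smaller than k for very small data sets
    x = sum(coords[s][0] for s in nearest) / n
    y = sum(coords[s][1] for s in nearest) / n
    return (x, y)

# Toy coordinates and distances, for illustration only.
toy_coords = {"a": (0.0, 0.0), "b": (3.0, 0.0), "c": (0.0, 3.0), "d": (10.0, 10.0)}
toy_dists = {"a": 1.0, "b": 2.0, "c": 2.5, "d": 9.0}
estimated = estimate_location(toy_coords, toy_dists)
```

Comparing each estimated position with the sample's actual position, one sample at a time, gives a simple check of how stable the layout is.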

22 Clusters 1-8 can be used to identify patient groups highly enriched for candidate drugs, e.g. cluster 3 is ~3-fold enriched for EGFR signaling
[Figure: % of HER2+ or HER2p+ samples, cluster 3 vs. all; panels: HER2+, HER2p+, combined]

23 Genomic clusters have distinct DNA-methylation & gene expression profiles
[Figure panels a-d: differential methylation analysis; clustering by 4,500 methylation probes; labels: HDAC1, REST/NRSF, g5,6, g7,8; axes: mRNA expression z-score]

24 Automated searching identifies distinct groups of marker-genes (example 1: ZNF821)

25 Automated searching identifies distinct groups of marker-genes (example 2: GPG5)

26 Expression clusters were ranked by the number of misclassifications by a linear SVM
[Figure: PCA of all samples for 396 stemness marker genes; axes: PC1, PC2; colors: low mRNA, high mRNA]

27 631 genes whose low/high expression segregate metabolic space cluster into 3 groups. Expression levels of groups 1 & 3 are ~anti-correlated. Group 2 is orthogonal to 1 and 3.
[Figure axes: angle of direction vector; frequency in 631 discovered genes]

28 Expression levels of some genes delineate specific features (example: CFL1)
"High cofilin-1 levels correlate with cisplatin resistance in lung adenocarcinomas." Becker M, et al. Tumour Biol, 2014

29 To come: combine clustering & marker detection to auto-rank gene sets
[Figure: cluster detection; axes: local density vs. dist. to > density (c.f. Science, 27 June 2014); compared with hierarchical (complete) and K-means]

30 The self-correcting nature of EVE

31 EVE detection of a sequencing batch effect in TCGA lung adenocarcinoma SNA data

32 Sample purity confounding effects in TCGA prostate adenocarcinomas
[Figure panels: mRNA, DNA-me, miRNA]

33 Methylation batch effects were corrected with Functional Normalization (R/Bioconductor package minfi).
Expression batch effects were corrected with ComBat (R/Bioconductor package swamp).
[Figure: p-value (linear fit of per-gene expression to batch ID), before batch-effect correction and after ComBat batch-effect correction]

34 15 OS marker genes; 1 (TAZ) is on ChrX (4 out of 59 probes)
[Figure: same plot with TAZ removed]

35 Expert-guided Visual Exploration (EVE) for patient stratification Hamid Bolouri, Lue-Ping Zhao, Eric C. Holland

37
1) density-based clustering
2) density contours
3) points inside a selected (green) contour (10%) above
4) density-based cluster members within selected contour

38 non-CIMP LGGs are GBM-like
[Figure legend: CIMP LGG; non-CIMP LGG; non-CIMP GBM; DNA-me verified; CIMP GBM; GBM, no DNA-me data]


More information

Supplementary Figure 1

Supplementary Figure 1 Supplementary Figure 1 Supplementary Fig. 1: Quality assessment of formalin-fixed paraffin-embedded (FFPE)-derived DNA and nuclei. (a) Multiplex PCR analysis of unrepaired and repaired bulk FFPE gdna from

More information

The Cancer Genome Atlas Research Network* abstract

The Cancer Genome Atlas Research Network* abstract The new england journal of medicine established in 1812 June 25, 2015 vol. 372 no. 26 Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas The Cancer Genome Atlas Research Network*

More information

CS2220 Introduction to Computational Biology

CS2220 Introduction to Computational Biology CS2220 Introduction to Computational Biology WEEK 8: GENOME-WIDE ASSOCIATION STUDIES (GWAS) 1 Dr. Mengling FENG Institute for Infocomm Research Massachusetts Institute of Technology mfeng@mit.edu PLANS

More information

SSM signature genes are highly expressed in residual scar tissues after preoperative radiotherapy of rectal cancer.

SSM signature genes are highly expressed in residual scar tissues after preoperative radiotherapy of rectal cancer. Supplementary Figure 1 SSM signature genes are highly expressed in residual scar tissues after preoperative radiotherapy of rectal cancer. Scatter plots comparing expression profiles of matched pretreatment

More information

SALSA MLPA probemix P315-B1 EGFR

SALSA MLPA probemix P315-B1 EGFR SALSA MLPA probemix P315-B1 EGFR Lot B1-0215 and B1-0112. As compared to the previous A1 version (lot 0208), two mutation-specific probes for the EGFR mutations L858R and T709M as well as one additional

More information

Supplementary Figure 1: High-throughput profiling of survival after exposure to - radiation. (a) Cells were plated in at least 7 wells in a 384-well

Supplementary Figure 1: High-throughput profiling of survival after exposure to - radiation. (a) Cells were plated in at least 7 wells in a 384-well Supplementary Figure 1: High-throughput profiling of survival after exposure to - radiation. (a) Cells were plated in at least 7 wells in a 384-well plate at cell densities ranging from 25-225 cells in

More information

Cytogenetics 101: Clinical Research and Molecular Genetic Technologies

Cytogenetics 101: Clinical Research and Molecular Genetic Technologies Cytogenetics 101: Clinical Research and Molecular Genetic Technologies Topics for Today s Presentation 1 Classical vs Molecular Cytogenetics 2 What acgh? 3 What is FISH? 4 What is NGS? 5 How can these

More information

Predicting Kidney Cancer Survival from Genomic Data

Predicting Kidney Cancer Survival from Genomic Data Predicting Kidney Cancer Survival from Genomic Data Christopher Sauer, Rishi Bedi, Duc Nguyen, Benedikt Bünz Abstract Cancers are on par with heart disease as the leading cause for mortality in the United

More information

Nature Genetics: doi: /ng Supplementary Figure 1. Rates of different mutation types in CRC.

Nature Genetics: doi: /ng Supplementary Figure 1. Rates of different mutation types in CRC. Supplementary Figure 1 Rates of different mutation types in CRC. (a) Stratification by mutation type indicates that C>T mutations occur at a significantly greater rate than other types. (b) As for the

More information

Results and Discussion of Receptor Tyrosine Kinase. Activation

Results and Discussion of Receptor Tyrosine Kinase. Activation Results and Discussion of Receptor Tyrosine Kinase Activation To demonstrate the contribution which RCytoscape s molecular maps can make to biological understanding via exploratory data analysis, we here

More information

Identification of Tissue Independent Cancer Driver Genes

Identification of Tissue Independent Cancer Driver Genes Identification of Tissue Independent Cancer Driver Genes Alexandros Manolakos, Idoia Ochoa, Kartik Venkat Supervisor: Olivier Gevaert Abstract Identification of genomic patterns in tumors is an important

More information

Mathematical Modeling of PDGF-Driven Glioblastoma Reveals Optimized Radiation Dosing Schedules

Mathematical Modeling of PDGF-Driven Glioblastoma Reveals Optimized Radiation Dosing Schedules Mathematical Modeling of PDGF-Driven Glioblastoma Reveals Optimized Radiation Dosing Schedules Kevin Leder, Ken Pittner, Quincey LaPlant, Dolores Hambardzumyan, Brian D. Ross, Timothy A. Chan, Eric C.

More information

Phenotype prediction based on genome-wide DNA methylation data

Phenotype prediction based on genome-wide DNA methylation data Wilhelm BMC Bioinformatics 2014, 15:193 METHODOLOGY ARTICLE Open Access Phenotype prediction based on genome-wide DNA methylation data Thomas Wilhelm Abstract Background: DNA methylation (DNAm) has important

More information

Introduction. Introduction

Introduction. Introduction Introduction We are leveraging genome sequencing data from The Cancer Genome Atlas (TCGA) to more accurately define mutated and stable genes and dysregulated metabolic pathways in solid tumors. These efforts

More information

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq Philipp Bucher Wednesday January 21, 2009 SIB graduate school course EPFL, Lausanne ChIP-seq against histone variants: Biological

More information

Fibroblasts cell lines misclassified as cancer cell lines

Fibroblasts cell lines misclassified as cancer cell lines Fibroblasts cell lines misclassified as cancer cell lines Antoine de Weck 1, Hans Bitter 2, Audrey Kauffmann 1 1. Novartis Institutes for Biomedical Research, Basel CH-4002, Switzerland. 2. Novartis Institutes

More information

Nature Genetics: doi: /ng Supplementary Figure 1. Somatic coding mutations identified by WES/WGS for 83 ATL cases.

Nature Genetics: doi: /ng Supplementary Figure 1. Somatic coding mutations identified by WES/WGS for 83 ATL cases. Supplementary Figure 1 Somatic coding mutations identified by WES/WGS for 83 ATL cases. (a) The percentage of targeted bases covered by at least 2, 10, 20 and 30 sequencing reads (top) and average read

More information

microrna Presented for: Presented by: Date:

microrna Presented for: Presented by: Date: microrna Presented for: Presented by: Date: 2 micrornas Non protein coding, endogenous RNAs of 21-22nt length Evolutionarily conserved Regulate gene expression by binding complementary regions at 3 regions

More information

MethylMix An R package for identifying DNA methylation driven genes

MethylMix An R package for identifying DNA methylation driven genes MethylMix An R package for identifying DNA methylation driven genes Olivier Gevaert May 3, 2016 Stanford Center for Biomedical Informatics Department of Medicine 1265 Welch Road Stanford CA, 94305-5479

More information

Genomic Analyses across Six Cancer Types Identify Basal-like Breast Cancer as a Unique Molecular Entity

Genomic Analyses across Six Cancer Types Identify Basal-like Breast Cancer as a Unique Molecular Entity Genomic Analyses across Six Cancer Types Identify Basal-like Breast Cancer as a Unique Molecular Entity Aleix Prat, Barbara Adamo, Cheng Fan, Vicente Peg, Maria Vidal, Patricia Galván, Ana Vivancos, Paolo

More information

Interactive analysis and quality assessment of single-cell copy-number variations

Interactive analysis and quality assessment of single-cell copy-number variations Interactive analysis and quality assessment of single-cell copy-number variations Tyler Garvin, Robert Aboukhalil, Jude Kendall, Timour Baslan, Gurinder S. Atwal, James Hicks, Michael Wigler, Michael C.

More information

A mathematical model for short-term vs. long-term survival in patients with. glioma

A mathematical model for short-term vs. long-term survival in patients with. glioma SUPPLEMENTARY DATA A mathematical model for short-term vs. long-term survival in patients with glioma Jason B. Nikas 1* 1 Genomix, Inc., Minneapolis, MN 55364, USA * Correspondence: Dr. Jason B. Nikas

More information

Nature Medicine: doi: /nm.4439

Nature Medicine: doi: /nm.4439 Figure S1. Overview of the variant calling and verification process. This figure expands on Fig. 1c with details of verified variants identification in 547 additional validation samples. Somatic variants

More information

Classification. Methods Course: Gene Expression Data Analysis -Day Five. Rainer Spang

Classification. Methods Course: Gene Expression Data Analysis -Day Five. Rainer Spang Classification Methods Course: Gene Expression Data Analysis -Day Five Rainer Spang Ms. Smith DNA Chip of Ms. Smith Expression profile of Ms. Smith Ms. Smith 30.000 properties of Ms. Smith The expression

More information

TCGA. The Cancer Genome Atlas

TCGA. The Cancer Genome Atlas TCGA The Cancer Genome Atlas TCGA: History and Goal History: Started in 2005 by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) with $110 Million to catalogue

More information

The Tail Rank Test. Kevin R. Coombes. July 20, Performing the Tail Rank Test Which genes are significant?... 3

The Tail Rank Test. Kevin R. Coombes. July 20, Performing the Tail Rank Test Which genes are significant?... 3 The Tail Rank Test Kevin R. Coombes July 20, 2009 Contents 1 Introduction 1 2 Getting Started 1 3 Performing the Tail Rank Test 2 3.1 Which genes are significant?..................... 3 4 Power Computations

More information

Module 3: Pathway and Drug Development

Module 3: Pathway and Drug Development Module 3: Pathway and Drug Development Table of Contents 1.1 Getting Started... 6 1.2 Identifying a Dasatinib sensitive cancer signature... 7 1.2.1 Identifying and validating a Dasatinib Signature... 7

More information

Integration of Genetic and Genomic Approaches for the Analysis of Chronic Fatigue Syndrome Implicates Forkhead Box N1

Integration of Genetic and Genomic Approaches for the Analysis of Chronic Fatigue Syndrome Implicates Forkhead Box N1 Integration of Genetic and Genomic Approaches for the Analysis of Chronic Fatigue Syndrome Implicates Forkhead Box N1 Angela Presson, Jeanette Papp, Eric Sobel, and Steve Horvath Biostatistics and Human

More information

Proceedings of the UGC Sponsored National Conference on Advanced Networking and Applications, 27 th March 2015

Proceedings of the UGC Sponsored National Conference on Advanced Networking and Applications, 27 th March 2015 Brain Tumor Detection and Identification Using K-Means Clustering Technique Malathi R Department of Computer Science, SAAS College, Ramanathapuram, Email: malapraba@gmail.com Dr. Nadirabanu Kamal A R Department

More information

Statistical Analysis of Single Nucleotide Polymorphism Microarrays in Cancer Studies

Statistical Analysis of Single Nucleotide Polymorphism Microarrays in Cancer Studies Statistical Analysis of Single Nucleotide Polymorphism Microarrays in Cancer Studies Stanford Biostatistics Workshop Pierre Neuvial with Henrik Bengtsson and Terry Speed Department of Statistics, UC Berkeley

More information