Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM. Bioinformatics, 2010

1 Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM. Bioinformatics, 2010. C. J. Vaske et al. May 22, 2013. Presented by: Rami Eitan

2 Complex Genomic Rearrangements Cancer tissues experience molecular changes. Varied genomic data are available: copy number variations, mutations, gene expression. Stratification of cancers can improve: diagnosis, prognosis, risk assessment, response to treatment.

3 Complex Genomic Rearrangements Genetic alterations differ between patients. The affected pathways are often common.

4 Pathways What is a pathway? Figure : The P53 pathway

5 Pathways A set of interactions between entities, logically grouped together around a biological process. Entities: protein-coding genes, small molecules, complexes, gene families, abstract processes. Available databases: Reactome, KEGG, NCI.

6 Motivation Integrative analysis of cancer genome data: copy number variations, gene expression. Leverage pathway information to find frequently occurring pathway perturbations: NCI Pathway Interaction Database, KEGG, etc.

7 Observed Data Figure : Gene expression Figure : Copy number

8 Motivation Pathway information contains information on how genes are supposed to behave.

9 Input Infer integrated pathway activity (IPA). Produce a matrix A, where A_ij is the inferred activity of entity i in patient j.

10 PARADIGM

11 Factor graph A factor graph is a probabilistic graphical model made up of variables and factors. Figure: A simple factor graph
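
The figure is not reproduced here, so the following is a minimal sketch of what a factor graph computes. It assumes two binary variables A and B and two hand-picked factors; the names `f_a`, `f_ab` and all numbers are illustrative and not taken from the slides. The marginal of B is the normalized sum of the factor product over all assignments.

```python
from itertools import product

# Illustrative factors over two binary variables A and B (assumed values).
def f_a(a):
    # Unary factor: prefers A = 1.
    return 0.7 if a == 1 else 0.3

def f_ab(a, b):
    # Pairwise factor: prefers A and B to agree.
    return 0.9 if a == b else 0.1

def marginal_b():
    """Marginal of B by brute-force summation of the factor product."""
    weights = {0: 0.0, 1: 0.0}
    for a, b in product([0, 1], repeat=2):
        weights[b] += f_a(a) * f_ab(a, b)
    z = sum(weights.values())  # normalization constant
    return {b: w / z for b, w in weights.items()}
```

Brute-force enumeration only works for tiny graphs; real pathway graphs need message passing or another approximate inference scheme.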

12 PARADIGM Model Factor graph representation of various entities corresponding to a single gene

13 PARADIGM Model: Gene Interactions

14 PARADIGM Model: A factor graph for a pathway

15 Model Specification Convert an NCI pathway into a factor graph. NCI pathway to Bayesian network: directed network. Each variable takes values of -1 (deactivation), 0 (normal), 1 (activation). mRNA: overexpression for activation. Copy number variations: more than two copies for activation. Probability distribution of each node. Labeled edges for positive/negative interactions. Set the value of the child node as weighted votes from its parents.
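
The weighted-vote rule can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function name `expected_child_state`, the uniform default weights, and the rule that a tied vote yields 0 are all assumptions.

```python
def expected_child_state(parent_states, edge_signs, weights=None):
    """Predict a child's state from its parents by a signed, weighted vote.

    parent_states: values in {-1, 0, 1}
    edge_signs: +1 for a positive (activating) edge, -1 for a negative one
    weights: optional per-parent vote weights (assumed uniform if omitted)
    """
    if weights is None:
        weights = [1.0] * len(parent_states)
    vote = sum(w * s * x for w, s, x in zip(weights, edge_signs, parent_states))
    if vote > 0:
        return 1
    if vote < 0:
        return -1
    return 0  # tie or no active parent: assumed to mean "normal"
```

For example, an active parent connected by a negative edge votes for deactivation of the child.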

16 Model Specification Converting the Bayesian network to a factor graph. Assign a factor to each group of variables consisting of a node and its parents. Z: normalization constant. ε = 0.001
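
The formula on this slide did not survive transcription. A reconstruction consistent with the symbols that remain (Z and ε), offered as an assumption rather than the slide's exact content, is given below. Here m is the number of factors, X_j is the set of variables in factor j, and Pa(x_i) denotes the parents of x_i; these names are introduced for illustration.

```latex
P(X) = \frac{1}{Z} \prod_{j=1}^{m} \phi_j(X_j), \qquad
\phi_i\bigl(x_i, \mathrm{Pa}(x_i)\bigr) =
\begin{cases}
1 - \varepsilon & \text{if } x_i \text{ is the state predicted by its parents,} \\
\varepsilon / 2 & \text{otherwise.}
\end{cases}
```

With three states per variable, the factor values for a fixed parent configuration sum to (1 - ε) + 2(ε/2) = 1, so a small ε such as 0.001 leaves only a little probability for states that disagree with the parents.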

17 Inference Observed variables: copy number variations, gene expression. Unobserved variables: protein, protein activity, overall pathway activity state. Learn models with the EM algorithm. E step: infer the probabilities of the unobserved variables. M step: change parameters to maximize the likelihood given the probabilities.
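
The E and M steps can be illustrated on a deliberately tiny model: one hidden state h in {-1, 0, 1} with a fixed uniform prior, and one discretized observation o per sample. This is a toy sketch under those assumptions, not PARADIGM's learning procedure, which runs inference over a whole pathway factor graph; all function names are hypothetical.

```python
import math

STATES = (-1, 0, 1)

def e_step(obs, prior, theta):
    """E step: posterior over the hidden state for each observation."""
    posteriors = []
    for o in obs:
        joint = {h: prior[h] * theta[h][o] for h in STATES}
        total = sum(joint.values())
        posteriors.append({h: joint[h] / total for h in STATES})
    return posteriors

def m_step(obs, posteriors):
    """M step: re-estimate P(o | h) from expected counts."""
    theta = {}
    for h in STATES:
        denom = sum(q[h] for q in posteriors)
        theta[h] = {
            o: sum(q[h] for q, x in zip(posteriors, obs) if x == o) / denom
            for o in STATES
        }
    return theta

def log_likelihood(obs, prior, theta):
    """Log-likelihood of the observations with the hidden state summed out."""
    return sum(
        math.log(sum(prior[h] * theta[h][o] for h in STATES)) for o in obs
    )

def run_em(obs, n_iter=10):
    """Alternate E and M steps, recording the log-likelihood each round."""
    prior = {h: 1.0 / 3.0 for h in STATES}
    # Initial guess: observations usually match the hidden state.
    theta = {h: {o: 0.8 if o == h else 0.1 for o in STATES} for h in STATES}
    history = [log_likelihood(obs, prior, theta)]
    for _ in range(n_iter):
        theta = m_step(obs, e_step(obs, prior, theta))
        history.append(log_likelihood(obs, prior, theta))
    return theta, history
```

The recorded log-likelihood should never decrease from one iteration to the next, which is the property an EM convergence plot tracks.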

18 Expectation Maximization Figure : EM algorithm

19 Log-likelihood Ratio Test Test statistic for assessing entity i's activity given data D. The probabilities can be obtained by performing inference on the factor graph.
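
The statistic itself is missing from the transcription. One form that matches the description, stated here as an assumption about what the slide showed, compares the likelihood of the data when entity i is in state a against when it is not. The second equality follows from Bayes' rule and shows why posterior and prior probabilities from the factor graph are sufficient.

```latex
L(i, a) = \log \frac{P(D \mid x_i = a)}{P(D \mid x_i \neq a)}
        = \log \frac{P(x_i = a \mid D)}{P(x_i \neq a \mid D)}
        - \log \frac{P(x_i = a)}{P(x_i \neq a)}
```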

20 Significance assessment Permute the labels of the observed data. Within permutation: choosing random genes from the same pathway. Any permutation: choosing any random genes. 1000 permutations of each type are used to determine the null distribution.
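
The two permutation types and the resulting null distribution can be sketched as below. The data layout (a dict from gene name to its observed data), drawing replacement genes with replacement, and the two-sided empirical p-value with a +1 correction are assumptions made for illustration; the slides do not specify them.

```python
import random

def permuted_sample(sample, pathway_genes, within=True, rng=None):
    """Give each pathway gene the data of a randomly chosen gene.

    within=True draws from genes in the same pathway ("within" permutation);
    within=False draws from every gene in the sample ("any" permutation).
    """
    rng = rng or random.Random()
    pool = list(pathway_genes) if within else list(sample)
    return {g: sample[rng.choice(pool)] for g in pathway_genes}

def empirical_p_value(observed, null_scores):
    """Share of null scores at least as extreme as the observed score."""
    hits = sum(1 for s in null_scores if abs(s) >= abs(observed))
    return (hits + 1) / (len(null_scores) + 1)
```

In use, each permuted sample would be run through the same inference as the real data, and the resulting scores would form `null_scores`.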

21 Decoy paths Create decoy paths by replacing genes with random genes. Maintain the same structure. All complexes and abstract processes remain the same.
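
Decoy construction can be sketched as a relabeling of gene nodes that leaves edges and non-gene entities untouched. The edge representation (source, label, target), the function name `make_decoy`, and the choice to draw replacement genes from outside the pathway without reuse are assumptions for illustration.

```python
import random

def make_decoy(edges, genes, gene_pool, seed=None):
    """Swap every gene for a random gene while keeping the graph structure.

    edges: list of (source, label, target) tuples
    genes: set of node names that are genes; other nodes (complexes,
           abstract processes) are left unchanged
    gene_pool: candidate replacement genes; needs at least len(genes)
               entries that are not already in the pathway
    """
    rng = random.Random(seed)
    candidates = [g for g in gene_pool if g not in genes]
    replacements = rng.sample(candidates, len(genes))
    mapping = dict(zip(sorted(genes), replacements))
    return [
        (mapping.get(src, src), label, mapping.get(dst, dst))
        for src, label, dst in edges
    ]
```

Because only node names change, a decoy has exactly the same topology and edge labels as the real pathway it was derived from.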

22 Log-likelihood Ratio Test Aggregating over the multiple values entity i takes
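
The aggregation rule is missing from the transcription. A plausible sketch, assuming the signed score is taken from whichever of the activated or deactivated states has the strictly largest log-likelihood ratio (and 0 otherwise), is shown below. The function names are hypothetical, and all probabilities are assumed to lie strictly between 0 and 1.

```python
import math

def llr(posterior, prior, a):
    """Posterior log-odds of state a minus its prior log-odds."""
    post_odds = posterior[a] / (1.0 - posterior[a])
    prior_odds = prior[a] / (1.0 - prior[a])
    return math.log(post_odds) - math.log(prior_odds)

def signed_activity(posterior, prior):
    """Collapse the three per-state scores into one signed activity value."""
    scores = {a: llr(posterior, prior, a) for a in (-1, 0, 1)}
    if scores[1] > scores[-1] and scores[1] > scores[0]:
        return scores[1]      # evidence favors activation
    if scores[-1] > scores[1] and scores[-1] > scores[0]:
        return -scores[-1]    # evidence favors deactivation
    return 0.0                # normal state, or no clear winner
```

Applying `signed_activity` to every entity in every patient would fill the matrix A of inferred activities.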

23 Dataset Breast cancer copy number and gene expression data. TCGA glioblastoma copy number and gene expression data. Pathways from the NCI Pathway Interaction Database (PID).

24 Results - breast cancer Breast cancer dataset: IPAs (7%) found to be significantly higher. 497 significant entities per patient on average. 103 out of 127 pathways had at least one entity altered in 20% or more of the patients.

25 Results - GBM GBM dataset: IPAs (9%) found to be significantly higher. 616 significant entities per patient on average. 110 out of 127 pathways had at least one entity altered in 20% or more of the patients.

26 EM Convergence Original data vs. permuted data. Red: real data. Green: permuted data.

27 Results - decoy paths Distinguishing decoy from real pathways Figure : PARADIGM vs SPIA: FP rate

28 Results - decoy paths Distinguishing decoy from real pathways Breast cancer AUC: PARADIGM: SPIA: GBM AUC: PARADIGM: SPIA: 0.604

29 Top PARADIGM Pathways of Breast Cancer

30 Top PARADIGM Pathways of Glioblastoma

31 Glioblastoma Subtypes

32 Survival Rates for Each Subtype

33 Results - Patient vs permutation Figure: Patient vs permuted IPAs

34 Results - Patient vs permutation Figure: Patient vs permuted IPAs. Source: Broad Institute/Dana-Farber Cancer Institute/Harvard Medical School

35 Summary PARADIGM integrates different types of data, including gene expression, copy number variation, and pathway databases, in order to infer pathway activities for individual cancer patients. Factor graph model for representing pathways and modeling datasets. Pathway activities inferred by PARADIGM can be used to identify cancer subtypes.

36 Questions

37 Discussion Can the method be successfully expanded to more observed data? Instead of using the pathways as is, can this method be used to find new pathways and interactions?

Introduction to Gene Sets Analysis

Introduction to Gene Sets Analysis Introduction to Svitlana Tyekucheva Dana-Farber Cancer Institute May 15, 2012 Introduction Various measurements: gene expression, copy number variation, methylation status, mutation profile, etc. Main

More information

Characteriza*on of Soma*c Muta*ons in Cancer Genomes

Characteriza*on of Soma*c Muta*ons in Cancer Genomes Characteriza*on of Soma*c Muta*ons in Cancer Genomes Ben Raphael Department of Computer Science Center for Computa*onal Molecular Biology Soma*c Muta*ons and Cancer Clonal Theory (Nowell 1976) Passenger

More information

Paradigm. Some slide sources: Josh Stuart (UCSC) AACR 12 slides Carl Edward Rasmussen (Cambridge) Factor graphs slides.

Paradigm. Some slide sources: Josh Stuart (UCSC) AACR 12 slides Carl Edward Rasmussen (Cambridge) Factor graphs slides. Paradigm Some slide sources: Josh Stuart (UCSC) AACR 12 slides Carl Edward Rasmussen (Cambridge) Factor graphs slides ABDBM Ron Shamir 1 Inference of patient-specific pathway activities from multi-dimensional

More information

Structured Association Advanced Topics in Computa8onal Genomics

Structured Association Advanced Topics in Computa8onal Genomics Structured Association 02-715 Advanced Topics in Computa8onal Genomics Structured Association Lasso ACGTTTTACTGTACAATT Gflasso (Kim & Xing, 2009) ACGTTTTACTGTACAATT Greater power Fewer false posi2ves Phenome

More information

Understanding Genotype- Phenotype relations in Cancer via Network Approaches

Understanding Genotype- Phenotype relations in Cancer via Network Approaches AlgoCSB Algorithmic Methods in Computational and Systems Biology Understanding Genotype- Phenotype relations in Cancer via Network Approaches Teresa Przytycka NIH / NLM / NCBI Phenotypes Journal Wisla

More information

Classifica4on. CSCI1950 Z Computa4onal Methods for Biology Lecture 18. Ben Raphael April 8, hip://cs.brown.edu/courses/csci1950 z/

Classifica4on. CSCI1950 Z Computa4onal Methods for Biology Lecture 18. Ben Raphael April 8, hip://cs.brown.edu/courses/csci1950 z/ CSCI1950 Z Computa4onal Methods for Biology Lecture 18 Ben Raphael April 8, 2009 hip://cs.brown.edu/courses/csci1950 z/ Binary classifica,on Given a set of examples (x i, y i ), where y i = + 1, from unknown

More information

Gene-microRNA network module analysis for ovarian cancer

Gene-microRNA network module analysis for ovarian cancer Gene-microRNA network module analysis for ovarian cancer Shuqin Zhang School of Mathematical Sciences Fudan University Oct. 4, 2016 Outline Introduction Materials and Methods Results Conclusions Introduction

More information

StratomeX Visual Analysis of Large-Scale Heterogeneous Genomics Data for Cancer Subtype Characterization

StratomeX Visual Analysis of Large-Scale Heterogeneous Genomics Data for Cancer Subtype Characterization StratomeX Visual Analysis of Large-Scale Heterogeneous Genomics Data for Cancer Subtype Characterization Alexander Lex 1, Marc Streit 2, Hans-Jörg Schulz 3, Christian Partl 1, Dieter Schmalstieg 1, Peter

More information

Using Bayesian Networks to Analyze Expression Data. Xu Siwei, s Muhammad Ali Faisal, s Tejal Joshi, s

Using Bayesian Networks to Analyze Expression Data. Xu Siwei, s Muhammad Ali Faisal, s Tejal Joshi, s Using Bayesian Networks to Analyze Expression Data Xu Siwei, s0789023 Muhammad Ali Faisal, s0677834 Tejal Joshi, s0677858 Outline Introduction Bayesian Networks Equivalence Classes Applying to Expression

More information

Outline. What s inside this paper? My expectation. Software Defect Prediction. Traditional Method. What s inside this paper?

Outline. What s inside this paper? My expectation. Software Defect Prediction. Traditional Method. What s inside this paper? Outline A Critique of Software Defect Prediction Models Norman E. Fenton Dongfeng Zhu What s inside this paper? What kind of new technique was developed in this paper? Research area of this technique?

More information

MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks

MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1 Lecture 27: Systems Biology and Bayesian Networks Systems Biology and Regulatory Networks o Definitions o Network motifs o Examples

More information

Identifying the Zygosity Status of Twins Using Bayes Network and Estimation- Maximization Methodology

Identifying the Zygosity Status of Twins Using Bayes Network and Estimation- Maximization Methodology Identifying the Zygosity Status of Twins Using Bayes Network and Estimation- Maximization Methodology Yicun Ni (ID#: 9064804041), Jin Ruan (ID#: 9070059457), Ying Zhang (ID#: 9070063723) Abstract As the

More information

Module 3: Pathway and Drug Development

Module 3: Pathway and Drug Development Module 3: Pathway and Drug Development Table of Contents 1.1 Getting Started... 6 1.2 Identifying a Dasatinib sensitive cancer signature... 7 1.2.1 Identifying and validating a Dasatinib Signature... 7

More information

Translational Bioinformatics: Connecting Genes with Drugs

Translational Bioinformatics: Connecting Genes with Drugs Translational Bioinformatics: Connecting Genes with Drugs Aik Choon Tan, Ph.D. Associate Professor of Bioinformatics Division of Medical Oncology Department of Medicine aikchoon.tan@ucdenver.edu 11/14/2017

More information

Copy Number Variations and Association Mapping Advanced Topics in Computa8onal Genomics

Copy Number Variations and Association Mapping Advanced Topics in Computa8onal Genomics Copy Number Variations and Association Mapping 02-715 Advanced Topics in Computa8onal Genomics SNP and CNV Genotyping SNP genotyping assumes two copy numbers at each locus (i.e., no CNVs) CNV genotyping

More information

Exploring TCGA Pan-Cancer Data at the UCSC Cancer Genomics Browser

Exploring TCGA Pan-Cancer Data at the UCSC Cancer Genomics Browser Exploring TCGA Pan-Cancer Data at the UCSC Cancer Genomics Browser Melissa S. Cline 1*, Brian Craft 1, Teresa Swatloski 1, Mary Goldman 1, Singer Ma 1, David Haussler 1, Jingchun Zhu 1 1 Center for Biomolecular

More information

Results and Discussion of Receptor Tyrosine Kinase. Activation

Results and Discussion of Receptor Tyrosine Kinase. Activation Results and Discussion of Receptor Tyrosine Kinase Activation To demonstrate the contribution which RCytoscape s molecular maps can make to biological understanding via exploratory data analysis, we here

More information

Bayesian (Belief) Network Models,

Bayesian (Belief) Network Models, Bayesian (Belief) Network Models, 2/10/03 & 2/12/03 Outline of This Lecture 1. Overview of the model 2. Bayes Probability and Rules of Inference Conditional Probabilities Priors and posteriors Joint distributions

More information

Journal: Nature Methods

Journal: Nature Methods Journal: Nature Methods Article Title: Network-based stratification of tumor mutations Corresponding Author: Trey Ideker Supplementary Item Supplementary Figure 1 Supplementary Figure 2 Supplementary Figure

More information

Bayes Linear Statistics. Theory and Methods

Bayes Linear Statistics. Theory and Methods Bayes Linear Statistics Theory and Methods Michael Goldstein and David Wooff Durham University, UK BICENTENNI AL BICENTENNIAL Contents r Preface xvii 1 The Bayes linear approach 1 1.1 Combining beliefs

More information

Assessing Functional Neural Connectivity as an Indicator of Cognitive Performance *

Assessing Functional Neural Connectivity as an Indicator of Cognitive Performance * Assessing Functional Neural Connectivity as an Indicator of Cognitive Performance * Brian S. Helfer 1, James R. Williamson 1, Benjamin A. Miller 1, Joseph Perricone 1, Thomas F. Quatieri 1 MIT Lincoln

More information

Predicting Breast Cancer Recurrence Using Machine Learning Techniques

Predicting Breast Cancer Recurrence Using Machine Learning Techniques Predicting Breast Cancer Recurrence Using Machine Learning Techniques Umesh D R Department of Computer Science & Engineering PESCE, Mandya, Karnataka, India Dr. B Ramachandra Department of Electrical and

More information

Genetic alterations of histone lysine methyltransferases and their significance in breast cancer

Genetic alterations of histone lysine methyltransferases and their significance in breast cancer Genetic alterations of histone lysine methyltransferases and their significance in breast cancer Supplementary Materials and Methods Phylogenetic tree of the HMT superfamily The phylogeny outlined in the

More information

From mirna regulation to mirna - TF co-regulation: computational

From mirna regulation to mirna - TF co-regulation: computational From mirna regulation to mirna - TF co-regulation: computational approaches and challenges 1,* Thuc Duy Le, 1 Lin Liu, 2 Junpeng Zhang, 3 Bing Liu, and 1,* Jiuyong Li 1 School of Information Technology

More information

Common Data Elements: Making the Mass of NIH Measures More Useful

Common Data Elements: Making the Mass of NIH Measures More Useful Common Data Elements WG Common Data Elements: Making the Mass of NIH Measures More Useful Jerry Sheehan Assistant Director for Policy Development Na?onal Library of Medicine Gene/c Alliance Webinar Series

More information

Package diggitdata. April 11, 2019

Package diggitdata. April 11, 2019 Type Package Title Example data for the diggit package Version 1.14.0 Date 2014-08-29 Author Mariano Javier Alvarez Package diggitdata April 11, 2019 Maintainer Mariano Javier Alvarez

More information

Propensity Score. Overview:

Propensity Score. Overview: Propensity Score Overview: What do we use a propensity score for? How do we construct the propensity score? How do we implement propensity score es

More information

Risk-prediction modelling in cancer with multiple genomic data sets: a Bayesian variable selection approach

Risk-prediction modelling in cancer with multiple genomic data sets: a Bayesian variable selection approach Risk-prediction modelling in cancer with multiple genomic data sets: a Bayesian variable selection approach Manuela Zucknick Division of Biostatistics, German Cancer Research Center Biometry Workshop,

More information

Supplementary Materials for

Supplementary Materials for www.sciencetranslationalmedicine.org/cgi/content/full/7/303/303ra139/dc1 Supplementary Materials for Magnetic resonance image features identify glioblastoma phenotypic subtypes with distinct molecular

More information

Predicting Kidney Cancer Survival from Genomic Data

Predicting Kidney Cancer Survival from Genomic Data Predicting Kidney Cancer Survival from Genomic Data Christopher Sauer, Rishi Bedi, Duc Nguyen, Benedikt Bünz Abstract Cancers are on par with heart disease as the leading cause for mortality in the United

More information

Graphical Modeling Approaches for Estimating Brain Networks

Graphical Modeling Approaches for Estimating Brain Networks Graphical Modeling Approaches for Estimating Brain Networks BIOS 516 Suprateek Kundu Department of Biostatistics Emory University. September 28, 2017 Introduction My research focuses on understanding how

More information

Deep Learning Analytics for Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations

Deep Learning Analytics for Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations Deep Learning Analytics for Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations Andy Nguyen, M.D., M.S. Medical Director, Hematopathology, Hematology and Coagulation Laboratory,

More information

Expanded View Figures

Expanded View Figures Solip Park & Ben Lehner Epistasis is cancer type specific Molecular Systems Biology Expanded View Figures A B G C D E F H Figure EV1. Epistatic interactions detected in a pan-cancer analysis and saturation

More information

Recording ac0vity in intact human brain. Recording ac0vity in intact human brain

Recording ac0vity in intact human brain. Recording ac0vity in intact human brain Recording ac0vity in intact human brain Recording ac0vity in intact human brain BIONB 4910 April 8, 2014 BIONB 4910 April 8, 2014 Objec0ves: - - review available recording methods EEG and MEG single unit

More information

OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies

OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies 2017 Contents Datasets... 2 Protein-protein interaction dataset... 2 Set of known PPIs... 3 Domain-domain interactions...

More information

A Versatile Algorithm for Finding Patterns in Large Cancer Cell Line Data Sets

A Versatile Algorithm for Finding Patterns in Large Cancer Cell Line Data Sets A Versatile Algorithm for Finding Patterns in Large Cancer Cell Line Data Sets James Jusuf, Phillips Academy Andover May 21, 2017 MIT PRIMES The Broad Institute of MIT and Harvard Introduction A quest

More information

Computational Approach for Deriving Cancer Progression Roadmaps from Static Sample Data

Computational Approach for Deriving Cancer Progression Roadmaps from Static Sample Data Computational Approach for Deriving Cancer Progression Roadmaps from Static Sample Data Yijun Sun,2,3,5,, Jin Yao, Le Yang 2, Runpu Chen 2, Norma J. Nowak 4, Steve Goodison 6, Department of Microbiology

More information

Identification of Tissue Independent Cancer Driver Genes

Identification of Tissue Independent Cancer Driver Genes Identification of Tissue Independent Cancer Driver Genes Alexandros Manolakos, Idoia Ochoa, Kartik Venkat Supervisor: Olivier Gevaert Abstract Identification of genomic patterns in tumors is an important

More information

A Network Partition Algorithm for Mining Gene Functional Modules of Colon Cancer from DNA Microarray Data

A Network Partition Algorithm for Mining Gene Functional Modules of Colon Cancer from DNA Microarray Data Method A Network Partition Algorithm for Mining Gene Functional Modules of Colon Cancer from DNA Microarray Data Xiao-Gang Ruan, Jin-Lian Wang*, and Jian-Geng Li Institute of Artificial Intelligence and

More information

A probabilistic method for food web modeling

A probabilistic method for food web modeling A probabilistic method for food web modeling Bayesian Networks methodology, challenges, and possibilities Anna Åkesson, Linköping University, Sweden 2 nd international symposium on Ecological Networks,

More information

Session 4 Rebecca Poulos

Session 4 Rebecca Poulos The Cancer Genome Atlas (TCGA) & International Cancer Genome Consortium (ICGC) Session 4 Rebecca Poulos Prince of Wales Clinical School Introductory bioinformatics for human genomics workshop, UNSW 20

More information

CHAPTER 6 HUMAN BEHAVIOR UNDERSTANDING MODEL

CHAPTER 6 HUMAN BEHAVIOR UNDERSTANDING MODEL 127 CHAPTER 6 HUMAN BEHAVIOR UNDERSTANDING MODEL 6.1 INTRODUCTION Analyzing the human behavior in video sequences is an active field of research for the past few years. The vital applications of this field

More information

Bayesian Latent Subgroup Design for Basket Trials

Bayesian Latent Subgroup Design for Basket Trials Bayesian Latent Subgroup Design for Basket Trials Yiyi Chu Department of Biostatistics The University of Texas School of Public Health July 30, 2017 Outline Introduction Bayesian latent subgroup (BLAST)

More information

Nature Genetics: doi: /ng Supplementary Figure 1. SEER data for male and female cancer incidence from

Nature Genetics: doi: /ng Supplementary Figure 1. SEER data for male and female cancer incidence from Supplementary Figure 1 SEER data for male and female cancer incidence from 1975 2013. (a,b) Incidence rates of oral cavity and pharynx cancer (a) and leukemia (b) are plotted, grouped by males (blue),

More information

Learning Deterministic Causal Networks from Observational Data

Learning Deterministic Causal Networks from Observational Data Carnegie Mellon University Research Showcase @ CMU Department of Psychology Dietrich College of Humanities and Social Sciences 8-22 Learning Deterministic Causal Networks from Observational Data Ben Deverett

More information

The 16th KJC Bioinformatics Symposium Integrative analysis identifies potential DNA methylation biomarkers for pan-cancer diagnosis and prognosis

The 16th KJC Bioinformatics Symposium Integrative analysis identifies potential DNA methylation biomarkers for pan-cancer diagnosis and prognosis The 16th KJC Bioinformatics Symposium Integrative analysis identifies potential DNA methylation biomarkers for pan-cancer diagnosis and prognosis Tieliu Shi tlshi@bio.ecnu.edu.cn The Center for bioinformatics

More information

Nature Methods: doi: /nmeth.3115

Nature Methods: doi: /nmeth.3115 Supplementary Figure 1 Analysis of DNA methylation in a cancer cohort based on Infinium 450K data. RnBeads was used to rediscover a clinically distinct subgroup of glioblastoma patients characterized by

More information

Package pathifier. August 17, 2018

Package pathifier. August 17, 2018 Type Package Title Quantify deregulation of pathways in cancer Version 1.19.0 Date 2013-06-27 Author Yotam Drier Package pathifier August 17, 2018 Maintainer Assif Yitzhaky

More information

Integration of high-throughput biological data

Integration of high-throughput biological data Integration of high-throughput biological data Jean Yang and Vivek Jayaswal School of Mathematics and Statistics University of Sydney Meeting the Challenges of High Dimension: Statistical Methodology,

More information

Simultaneous Identification of Multiple Driver Pathways in Cancer

Simultaneous Identification of Multiple Driver Pathways in Cancer Simultaneous Identification of Multiple Driver Pathways in Cancer Mark D. M. Leiserson 1, Dima Blokh 2, Roded Sharan 2., Benjamin J. Raphael 1. * 1 Department of Computer Science and Center for Computational

More information

56:134 Process Engineering

56:134 Process Engineering 56:134 Process Engineering Homework #2 Solutions Your role as a process analyst is to reduce cycle time of the process in Figure 1. The duration of each activity is as follows: Release PCB1 2 minutes Release

More information

Sets, Logic, and Probability As Used in Decision Support Systems

Sets, Logic, and Probability As Used in Decision Support Systems Sets, Logic, and Probability As Used in Decision Support Systems HAP 752 Advanced Health Informa6on Systems Janusz Wojtusiak, PhD George Mason University Spring 2014 Part of the inhumanity of the computer

More information

Application of Tree Structures of Fuzzy Classifier to Diabetes Disease Diagnosis

Application of Tree Structures of Fuzzy Classifier to Diabetes Disease Diagnosis , pp.143-147 http://dx.doi.org/10.14257/astl.2017.143.30 Application of Tree Structures of Fuzzy Classifier to Diabetes Disease Diagnosis Chang-Wook Han Department of Electrical Engineering, Dong-Eui University,

More information

DawnRank: discovering personalized driver genes in cancer

DawnRank: discovering personalized driver genes in cancer Hou and Ma Genome Medicine 2014, 6:56 METHOD Open Access DawnRank: discovering personalized driver genes in cancer Jack P Hou 1,2 and Jian Ma 1,3* Abstract Large-scale cancer genomic studies have revealed

More information

Supplement 2. Use of Directed Acyclic Graphs (DAGs)

Supplement 2. Use of Directed Acyclic Graphs (DAGs) Supplement 2. Use of Directed Acyclic Graphs (DAGs) Abstract This supplement describes how counterfactual theory is used to define causal effects and the conditions in which observed data can be used to

More information

A review of approaches to identifying patient phenotype cohorts using electronic health records

A review of approaches to identifying patient phenotype cohorts using electronic health records A review of approaches to identifying patient phenotype cohorts using electronic health records Shivade, Raghavan, Fosler-Lussier, Embi, Elhadad, Johnson, Lai Chaitanya Shivade JAMIA Journal Club March

More information

An Efficient Attribute Ordering Optimization in Bayesian Networks for Prognostic Modeling of the Metabolic Syndrome

An Efficient Attribute Ordering Optimization in Bayesian Networks for Prognostic Modeling of the Metabolic Syndrome An Efficient Attribute Ordering Optimization in Bayesian Networks for Prognostic Modeling of the Metabolic Syndrome Han-Saem Park and Sung-Bae Cho Department of Computer Science, Yonsei University 134

More information

Mul$ Voxel Pa,ern Analysis (fmri) Mul$ Variate Pa,ern Analysis (more generally) Magic Voxel Pa,ern Analysis (probably not!)

Mul$ Voxel Pa,ern Analysis (fmri) Mul$ Variate Pa,ern Analysis (more generally) Magic Voxel Pa,ern Analysis (probably not!) Mul$ Voxel Pa,ern Analysis (fmri) Mul$ Variate Pa,ern Analysis (more generally) Magic Voxel Pa,ern Analysis (probably not!) all MVPA really shows is that there are places where, in most people s brain,

More information

INTEGRATION OF MULTI-PLATFORM HIGH-DIMENSIONAL OMIC DATA

INTEGRATION OF MULTI-PLATFORM HIGH-DIMENSIONAL OMIC DATA Texas Medical Center Library DigitalCommons@TMC UT GSBS Dissertations and Theses (Open Access) Graduate School of Biomedical Sciences 5-2016 INTEGRATION OF MULTI-PLATFORM HIGH-DIMENSIONAL OMIC DATA Xuebei

More information

The power of single cells: Building a tumor immune atlas Dana Pe er Department of Biological Science Department of Systems Biology Columbia

The power of single cells: Building a tumor immune atlas Dana Pe er Department of Biological Science Department of Systems Biology Columbia The power of single cells: Building a tumor immune atlas Dana Pe er Department of Biological Science Department of Systems Biology Columbia University The Precision Medicine Initiative Most personalized

More information

Decades of cancer research have demonstrated that. Computational approaches for the identification of cancer genes and pathways

Decades of cancer research have demonstrated that. Computational approaches for the identification of cancer genes and pathways Computational approaches for the identification of cancer genes and pathways Christos M. Dimitrakopoulos 1,2 and Niko Beerenwinkel 1,2 * High-throughput DNA sequencing techniques enable large-scale measurement

More information

Table of content. -Supplementary methods. -Figure S1. -Figure S2. -Figure S3. -Table legend

Table of content. -Supplementary methods. -Figure S1. -Figure S2. -Figure S3. -Table legend Table of content -Supplementary methods -Figure S1 -Figure S2 -Figure S3 -Table legend Supplementary methods Yeast two-hybrid bait basal transactivation test Because bait constructs sometimes self-transactivate

More information

Data Analysis Using Regression and Multilevel/Hierarchical Models

Data Analysis Using Regression and Multilevel/Hierarchical Models Data Analysis Using Regression and Multilevel/Hierarchical Models ANDREW GELMAN Columbia University JENNIFER HILL Columbia University CAMBRIDGE UNIVERSITY PRESS Contents List of examples V a 9 e xv " Preface

More information

Numerical Integration of Bivariate Gaussian Distribution

Numerical Integration of Bivariate Gaussian Distribution Numerical Integration of Bivariate Gaussian Distribution S. H. Derakhshan and C. V. Deutsch The bivariate normal distribution arises in many geostatistical applications as most geostatistical techniques

More information

Nature Genetics: doi: /ng Supplementary Figure 1

Nature Genetics: doi: /ng Supplementary Figure 1 Supplementary Figure 1 Expression deviation of the genes mapped to gene-wise recurrent mutations in the TCGA breast cancer cohort (top) and the TCGA lung cancer cohort (bottom). For each gene (each pair

More information

Classification and Predication of Breast Cancer Risk Factors Using Id3

Classification and Predication of Breast Cancer Risk Factors Using Id3 The International Journal Of Engineering And Science (IJES) Volume 5 Issue 11 Pages PP 29-33 2016 ISSN (e): 2319 1813 ISSN (p): 2319 1805 Classification and Predication of Breast Cancer Risk Factors Using

More information

Multiplexed Cancer Pathway Analysis

Multiplexed Cancer Pathway Analysis NanoString Technologies, Inc. Multiplexed Cancer Pathway Analysis for Gene Expression Lucas Dennis, Patrick Danaher, Rich Boykin, Joseph Beechem NanoString Technologies, Inc., Seattle WA 98109 v1.0 MARCH

More information

Neurons and neural networks II. Hopfield network
