Classification. CSCI1950-Z Computational Methods for Biology, Lecture 18. Ben Raphael, April 8, 2009. http://cs.brown.edu/courses/csci1950-z/

Classification
Binary classification: given a set of examples $(x_i, y_i)$, where $y_i = \pm 1$, drawn from an unknown distribution $D$.
Design a function $f: \mathbb{R}^n \to \{-1, +1\}$ that optimally assigns additional samples $x_i$ to one of the two classes.
Supervised learning: the pairs $(x_i, y_i)$ are the training data.
$x_i(j)$: feature. $\mathbb{R}^n$: feature space.

Dimensionality Reduction
Genomic data (e.g. gene expression) are often high-dimensional (n > 5000), but relatively few samples are available.
Reduce the dimensionality of the data (project to a lower-dimensional subspace) to improve the performance of the classifier by:
- Removing features that do not contribute to the classification and may introduce noise.
- Reducing opportunities for overfitting.
- Improving time/memory efficiency in algorithms for learning and classification.

Feature Construction
n features → l features, by a linear or nonlinear transformation.
Common method: principal components analysis. [Whiteboard]
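The slide leaves the PCA details to the whiteboard. As a minimal sketch of how n features can be mapped to l features, the following computes principal components with a singular value decomposition. The function name `pca_project` and the toy data are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def pca_project(X, l):
    """Project m samples with n features onto the top-l principal components.

    X has shape (m, n). Returns (Z, explained): Z has shape (m, l), and
    explained[j] is the fraction of total variance captured by component j.
    """
    Xc = X - X.mean(axis=0)                      # center each feature
    # Thin SVD: rows of Vt are principal directions, ordered by singular value.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2                                 # proportional to component variances
    explained = var[:l] / var.sum()
    Z = Xc @ Vt[:l].T                            # coordinates in the l-dimensional subspace
    return Z, explained

rng = np.random.default_rng(0)
# Hypothetical toy data: 20 samples, 500 features driven by 2 latent factors plus noise.
latent = rng.normal(size=(20, 2))
X = latent @ rng.normal(size=(2, 500)) + 0.1 * rng.normal(size=(20, 500))
Z, explained = pca_project(X, l=2)
print(Z.shape, explained.sum())
```

Using the SVD of the centered data avoids forming the n × n covariance matrix, which matters when n is in the thousands and m is small.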

PCA and Clustering
Yeast gene expression data (477 genes) clustered into 7 clusters. The first two principal components contain 89% of the variation in the data. Yeung and Ruzzo (Bioinformatics 2001).

PCA and Clustering
"Exon and junction microarrays detect widespread mouse strain- and sex-bias expression differences" (Su et al., BMC Genomics 2008).

Feature Selection
Selecting l << n features that are informative for classification.
Gene expression: subset of genes.

Feature Selection
Informative features: use a measure of association between $x^i$ and $y^i$.
- Correlation: $r_{x^i y^i} = \frac{\sum_{k=1}^{m} (x^i_k - \bar{x}^i)(y^i_k - \bar{y}^i)}{(m-1)\, s_{x^i} s_{y^i}}$
- Chi-square (contingency table)
- Fisher criterion: $F(x^i) = \frac{\left(\mu(x^i_+) - \mu(x^i_-)\right)^2}{s^2(x^i_+) + s^2(x^i_-)}$, where $x^i_+$ are the elements in the + class.
- t-test statistic
- Mutual information
- TNoM score (previous lecture)
[Whiteboard]
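Two of the association measures listed above, the correlation and the Fisher criterion, can be sketched as per-feature scores. This is an illustration under assumptions: the function names, the ±1 label encoding and the toy data are hypothetical, and features are stored as the columns of a samples × features matrix.

```python
import numpy as np

def correlation_scores(X, y):
    """Pearson correlation between each feature (column of X) and the labels y."""
    m = X.shape[0]
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    s_x = X.std(axis=0, ddof=1)          # sample standard deviation per feature
    s_y = y.std(ddof=1)
    return (Xc * yc[:, None]).sum(axis=0) / ((m - 1) * s_x * s_y)

def fisher_scores(X, y):
    """Squared difference of class means over the sum of class variances."""
    pos, neg = X[y == 1], X[y == -1]
    num = (pos.mean(axis=0) - neg.mean(axis=0)) ** 2
    den = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    return num / den

rng = np.random.default_rng(1)
y = np.array([1] * 10 + [-1] * 10)
X = rng.normal(size=(20, 50))            # 20 samples, 50 features of noise
X[:, 0] += 3 * y                         # make feature 0 informative
r = correlation_scores(X, y)
F = fisher_scores(X, y)
print(np.argmax(np.abs(r)), np.argmax(F))
```

Selecting l << n features then amounts to keeping the l features with the largest score.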

Feature Selection Results
[Figure: results for the colon and leukemia datasets.]

Feature Selection Results
Top-scoring genes (TNoM < 14) in the colon dataset.

Assessing Performance
WRONG: Feature Selection (e.g. TNoM) → [Build Classifier → Test], with cross-validation applied only to the bracketed steps.

Assessing Performance
[Feature Selection (e.g. TNoM) → Build Classifier → Test], with cross-validation applied to the whole pipeline.
Must assess the performance of both steps together!
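To illustrate why both steps must be assessed together, here is a small sketch on pure-noise data, where no feature is truly informative. It compares leave-one-out cross-validation with feature selection done once on all samples against selection repeated inside every fold. The correlation-based selection and the nearest-centroid classifier are stand-ins chosen for brevity (the lecture uses TNoM), and all names are hypothetical.

```python
import numpy as np

def top_features(X, y, l):
    """Indices of the l features with the largest |Pearson correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum()))
    return np.argsort(-np.abs(r))[:l]

def nearest_centroid_predict(X_train, y_train, x):
    """Assign x to the class (+1 or -1) whose mean training vector is closer."""
    mu_pos = X_train[y_train == 1].mean(axis=0)
    mu_neg = X_train[y_train == -1].mean(axis=0)
    return 1 if np.linalg.norm(x - mu_pos) <= np.linalg.norm(x - mu_neg) else -1

def loocv_accuracy(X, y, l, select_inside):
    """Leave-one-out accuracy; select_inside controls where selection happens."""
    m = X.shape[0]
    # Selecting on all samples lets the held-out label influence the features.
    fixed = None if select_inside else top_features(X, y, l)
    correct = 0
    for k in range(m):
        train = np.arange(m) != k
        feats = top_features(X[train], y[train], l) if select_inside else fixed
        pred = nearest_centroid_predict(X[train][:, feats], y[train], X[k, feats])
        correct += int(pred == y[k])
    return correct / m

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2000))          # pure noise: labels are unrelated to X
y = np.array([1] * 20 + [-1] * 20)
biased = loocv_accuracy(X, y, l=10, select_inside=False)
honest = loocv_accuracy(X, y, l=10, select_inside=True)
print(f"selection outside CV: {biased:.2f}, selection inside CV: {honest:.2f}")
```

Because the labels carry no signal here, any accuracy well above chance in the first estimate comes only from the held-out samples having taken part in feature selection.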

Gene Selection Results

Predictors of Breast Cancer Prognosis
70-gene signature to predict breast cancer patients with metastasis within 5 years (van de Vijver et al., NEJM 2002; van 't Veer et al., Nature 2002).
Now an FDA-approved test: MammaPrint.

Predictors of Breast Cancer Prognosis
Step 1: Clustering
n = … genes, 98 tumors.
>2-fold change (and p < 0.01) in >4 tumors → n = 5000 differentially expressed genes.
Hierarchical clustering: genes and samples.

Predictors of Breast Cancer Prognosis
Step 2: Classification
n = 5000 differentially expressed genes in 78 (sporadic, lymph-node-negative) tumors.
Compute the correlation coefficient $\rho(x_i, y_i)$ between each gene and prognosis.
Choose the 231 genes with $|\rho(x_i, y_i)| > 0.3$.
Rank genes by $|\rho(x_i, y_i)|$.

Predictors of Breast Cancer Prognosis
Step 3: Build a classifier
Leave out one sample x.
Let R = top 5 genes in the list of 231.
Compute the correlation coefficients $\rho(\mu(x_{R+}), x_R)$ and $\rho(\mu(x_{R-}), x_R)$, where $\mu(x_{R+})$ is the mean vector of the genes in R over the + class.
Assign x to the best class.
Add 5 genes to R until performance does not improve.
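Step 3 can be sketched as follows, assuming a ranked gene list is already available. The function names, the tie-breaking rule, the stopping rule ("stop at the first size that fails to improve") and the toy data are assumptions made for illustration; this is not the published implementation. Note that the ranking here uses all samples, so the leave-one-out accuracy it reports is optimistic.

```python
import numpy as np

def classify_by_centroid_correlation(X_train, y_train, x):
    """Assign x to the class whose mean profile correlates best with x."""
    mu_pos = X_train[y_train == 1].mean(axis=0)
    mu_neg = X_train[y_train == -1].mean(axis=0)
    rho_pos = np.corrcoef(mu_pos, x)[0, 1]
    rho_neg = np.corrcoef(mu_neg, x)[0, 1]
    return 1 if rho_pos >= rho_neg else -1

def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of the classifier restricted to the given genes."""
    m = X.shape[0]
    hits = 0
    for k in range(m):
        train = np.arange(m) != k
        pred = classify_by_centroid_correlation(
            X[train][:, genes], y[train], X[k, genes])
        hits += int(pred == y[k])
    return hits / m

def grow_gene_set(X, y, ranked_genes, step=5):
    """Start from the top `step` genes and add `step` more while accuracy improves."""
    best_size, best_acc = step, loo_accuracy(X, y, ranked_genes[:step])
    size = 2 * step
    while size <= len(ranked_genes):
        acc = loo_accuracy(X, y, ranked_genes[:size])
        if acc <= best_acc:              # no improvement: stop growing R
            break
        best_size, best_acc = size, acc
        size += step
    return ranked_genes[:best_size], best_acc

rng = np.random.default_rng(0)
y = np.array([1] * 15 + [-1] * 15)
X = rng.normal(size=(30, 60))            # hypothetical toy data: 30 samples, 60 genes
X[:, :10] += 1.5 * y[:, None]            # genes 0-9: higher in the + class
X[:, 10:20] -= 1.5 * y[:, None]          # genes 10-19: lower in the + class
rho = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
ranked = np.argsort(-np.abs(rho))        # rank genes by |correlation| with the label
genes, acc = grow_gene_set(X, y, ranked)
print(len(genes), acc)
```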

Predictors of Breast Cancer Prognosis
70-gene classifier: 65/78 (83%) of patients predicted correctly.
5 poor and 8 good incorrectly assigned.
Changing the threshold gave 3 poor and 12 good incorrectly assigned.

Discussion
- Cross-validation done after feature selection! Also fixed this problem.
- Resulting 70-gene signature is not unique (Ein-Dor et al. 2005: see notes).
- Drawing biological conclusions from the output of a black-box prediction algorithm is not wise. Correlation vs. causality.

Results: Class Discovery with TNoM (Ben-Dor, Friedman, Yakhini, 2001)
Find the optimal labeling L. Solution: use heuristic search.
Find multiple (suboptimal) labelings. Solution: peeling, i.e. remove previously used genes from the set.

Results: Class Discovery with TNoM (Ben-Dor, Friedman, Yakhini, 2001)
Leukemia (Golub et al. 1999): 72 expression profiles; 25 AML, 47 ALL; … genes.
Lymphoma (Alizadeh et al.): 96 expression profiles; 46 diffuse large B-cell lymphoma (DLBCL), 50 from 8 different tissues.
Lymphoma DLBCL: subset of 46 of the above.

TNoM Results (Ben-Dor, Friedman, Yakhini, 2001)
[Figure: % survival vs. years, for 40 patients and for 24 patients with low clinical risk.]


More information

Classification with microarray data

Classification with microarray data Classification with microarray data Aron Charles Eklund eklund@cbs.dtu.dk DNA Microarray Analysis - #27612 January 8, 2010 The rest of today Now: What is classification, and why do we do it? How to develop

More information

Selection of Patient Samples and Genes for Outcome Prediction

Selection of Patient Samples and Genes for Outcome Prediction Selection of Patient Samples and Genes for Outcome Prediction Huiqing Liu Jinyan Li Limsoon Wong Institute for Infocomm Research 21 Heng Mui Keng Terrace Singapore 119613 huiqing, jinyan, limsoon @i2r.a-star.edu.sg

More information

Prognostic and predictive biomarkers. Marc Buyse International Drug Development Institute (IDDI) Louvain-la-Neuve, Belgium

Prognostic and predictive biomarkers. Marc Buyse International Drug Development Institute (IDDI) Louvain-la-Neuve, Belgium Prognostic and predictive biomarkers Marc Buyse International Drug Development Institute (IDDI) Louvain-la-Neuve, Belgium marc.buyse@iddi.com 1 Prognostic biomarkers (example: gene signature) 2 PROGNOSTIC

More information

MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks

MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1 Lecture 27: Systems Biology and Bayesian Networks Systems Biology and Regulatory Networks o Definitions o Network motifs o Examples

More information

Disclosures 10/10/17. Low Risk, High Success: Prolotherapy Regenera:ve Medicine for Osteoarthri:s. Nothing to disclose.

Disclosures 10/10/17. Low Risk, High Success: Prolotherapy Regenera:ve Medicine for Osteoarthri:s. Nothing to disclose. Low Risk, High Success: Prolotherapy Regenera:ve Medicine for Osteoarthri:s DONNA ALDERMAN, D.O. HEMWALL CENTER FOR ORTHOPEDIC REGENERATIVE MEDICINE WWW.PROLOTHERAPY.COM 28 th Annual Mee1ng, San Diego,

More information

Development and Applica0on of Real- Time Clinical Predic0ve Models

Development and Applica0on of Real- Time Clinical Predic0ve Models Development and Applica0on of Real- Time Clinical Predic0ve Models Ruben Amarasingham, MD, MBA Associate Professor, UT Southwestern Medical Center AHRQ- funded R24 UT Southwestern Center for Pa?ent- Centered

More information

Data Mining in Bioinformatics Day 7: Clustering in Bioinformatics

Data Mining in Bioinformatics Day 7: Clustering in Bioinformatics Data Mining in Bioinformatics Day 7: Clustering in Bioinformatics Karsten Borgwardt February 21 to March 4, 2011 Machine Learning & Computational Biology Research Group MPIs Tübingen Karsten Borgwardt:

More information

3/31/2015. Designing Clinical Research Studies: So You Want to Be an

3/31/2015. Designing Clinical Research Studies: So You Want to Be an Designing Clinical Research Studies: So You Want to Be an Inves@gator Andrea Bonny, MD Ellen Lançon Connor, MD On behalf Of The NASPAG Research CommiPee Objec@ves Learn to design a clinical research project

More information

Valida5on of a Microsatellite Instability Assay by NGS

Valida5on of a Microsatellite Instability Assay by NGS Valida5on of a Microsatellite Instability Assay by NGS Mark R. Miglarese, Ph.D. VP R&D 1 Caris Life Sciences 230,000+ tests performed in 2016 Headquarters: Irving, Texas Laboratory: Phoenix, Arizona 66,000

More information

The 16th KJC Bioinformatics Symposium Integrative analysis identifies potential DNA methylation biomarkers for pan-cancer diagnosis and prognosis

The 16th KJC Bioinformatics Symposium Integrative analysis identifies potential DNA methylation biomarkers for pan-cancer diagnosis and prognosis The 16th KJC Bioinformatics Symposium Integrative analysis identifies potential DNA methylation biomarkers for pan-cancer diagnosis and prognosis Tieliu Shi tlshi@bio.ecnu.edu.cn The Center for bioinformatics

More information

The Importance of Iden0fying Women at Risk for BRCA1/2 Muta0ons for Referral to Cancer Gene0cs Services

The Importance of Iden0fying Women at Risk for BRCA1/2 Muta0ons for Referral to Cancer Gene0cs Services The Importance of Iden0fying Women at Risk for BRCA1/2 Muta0ons for Referral to Cancer Gene0cs Services Cecelia Bellcross, PhD, MS, CGC Emory University School of Medicine Department of Human Gene0cs Alliance

More information

FUZZY C-MEANS AND ENTROPY BASED GENE SELECTION BY PRINCIPAL COMPONENT ANALYSIS IN CANCER CLASSIFICATION

FUZZY C-MEANS AND ENTROPY BASED GENE SELECTION BY PRINCIPAL COMPONENT ANALYSIS IN CANCER CLASSIFICATION FUZZY C-MEANS AND ENTROPY BASED GENE SELECTION BY PRINCIPAL COMPONENT ANALYSIS IN CANCER CLASSIFICATION SOMAYEH ABBASI, HAMID MAHMOODIAN Department of Electrical Engineering, Najafabad branch, Islamic

More information

BIOSTATISTICAL METHODS

BIOSTATISTICAL METHODS BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH PROPENSITY SCORE Confounding Definition: A situation in which the effect or association between an exposure (a predictor or risk factor) and

More information

Bayesian Prediction Tree Models

Bayesian Prediction Tree Models Bayesian Prediction Tree Models Statistical Prediction Tree Modelling for Clinico-Genomics Clinical gene expression data - expression signatures, profiling Tree models for predictive sub-typing Combining

More information

Agents and State Spaces. CSCI 446: Ar*ficial Intelligence Keith Vertanen

Agents and State Spaces. CSCI 446: Ar*ficial Intelligence Keith Vertanen Agents and State Spaces CSCI 446: Ar*ficial Intelligence Keith Vertanen Overview Agents and environments Ra*onality Agent types Specifying the task environment Performance measure Environment Actuators

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION doi:.38/nature8975 SUPPLEMENTAL TEXT Unique association of HOTAIR with patient outcome To determine whether the expression of other HOX lincrnas in addition to HOTAIR can predict patient outcome, we measured

More information

Machine Learning to Inform Breast Cancer Post-Recovery Surveillance

Machine Learning to Inform Breast Cancer Post-Recovery Surveillance Machine Learning to Inform Breast Cancer Post-Recovery Surveillance Final Project Report CS 229 Autumn 2017 Category: Life Sciences Maxwell Allman (mallman) Lin Fan (linfan) Jamie Kang (kangjh) 1 Introduction

More information

Predictive Biomarkers

Predictive Biomarkers Uğur Sezerman Evolutionary Selection of Near Optimal Number of Features for Classification of Gene Expression Data Using Genetic Algorithms Predictive Biomarkers Biomarker: A gene, protein, or other change

More information

Propensity Score Methods for Longitudinal Data Analyses: General background, ra>onale and illustra>ons*

Propensity Score Methods for Longitudinal Data Analyses: General background, ra>onale and illustra>ons* Propensity Score Methods for Longitudinal Data Analyses: General background, ra>onale and illustra>ons* Bob Pruzek, University at Albany SUNY Summary Propensity Score Analysis (PSA) was introduced by Rosenbaum

More information

REPRODUCIBILITY AND RELIABILITY OF REPEATED SEMEN ANALYSES IN MALE PARTNERS OF SUBFERTILE COUPLES

REPRODUCIBILITY AND RELIABILITY OF REPEATED SEMEN ANALYSES IN MALE PARTNERS OF SUBFERTILE COUPLES REPRODUCIBILITY AND RELIABILITY OF REPEATED SEMEN ANALYSES IN MALE PARTNERS OF SUBFERTILE COUPLES Esther Leushuis Jan Willem van der Steeg Pieternel Steures Sjoerd Repping Patrick M.M. Bossuyt Marinus

More information

Identifying Engineering, Clinical and Patient's Metrics for Evaluating and Quantifying Performance of Brain- Machine Interface Systems

Identifying Engineering, Clinical and Patient's Metrics for Evaluating and Quantifying Performance of Brain- Machine Interface Systems Identifying Engineering, Clinical and Patient's Metrics for Evaluating and Quantifying Performance of Brain- Machine Interface Systems Jose Pepe L. Contreras-Vidal, Ph.D. Department of Electrical & Computer

More information

MicroRNA expression profiling and functional analysis in prostate cancer. Marco Folini s.c. Ricerca Traslazionale DOSL

MicroRNA expression profiling and functional analysis in prostate cancer. Marco Folini s.c. Ricerca Traslazionale DOSL MicroRNA expression profiling and functional analysis in prostate cancer Marco Folini s.c. Ricerca Traslazionale DOSL What are micrornas? For almost three decades, the alteration of protein-coding genes

More information