Informatics in Computational Medicine

1 Informatics in Computational Medicine. Inderjit S. Dhillon, Director, Center for Big Data Analytics, UT Austin (Computer Science, Mathematics & ICES). ICES Computational Medicine Day, May 13, 2014.

2 Introduction. Computational Medicine: a quantitative approach to understanding, detecting and treating diseases. Need to understand and control the biological processes involved in human diseases via systemic measurements and computational analyses. Calzolari D, Bruschi S, Coquin L, Schofield J, Feala JD, et al. (2008) Search Algorithms as a Framework for the Optimization of Drug Combinations. PLoS Comput Biol 4(12): e

3 Medical Genetics. Analysis begins at the level of genes: the study of human genetics and its application to medical care. Eventual goals: gene therapy for cure; personalized medicine. Winslow, Raimond L., et al. "Computational medicine: translating models to clinical care." Science Translational Medicine (2012).

4 Gene Networks. Multiple genes collectively influence the likelihood of developing many common and complex diseases. It is important to understand how genes interact with each other (co-expression gene networks). Xiao, X. et al., Multi-tissue analysis of co-expression networks by higher-order generalized singular value decomposition identifies functionally coherent transcriptional modules. PLoS Genetics 10 (1), e

5 Predicting gene-disease links. [Figure: candidate genes AVPR2, AQP1, AQP2, AVP and MYBL2, with unknown links ("?") to diabetes insipidus.] Goal: discover human gene-disease associations. Biologists prefer a short list of potentially relevant genes for further studies.

6 Gene Network. [Figure: network over the genes AVPR2, AQP6, AQP1, AQP5, AQP2, GK, MIP, AVP and MYBL2.] Edges are functional interactions between genes (e.g. HumanNet).

7 Gene-Phenotype Networks. [Figure: genes AVPR2, AQP1, AQP2 and MYBL2 linked to the phenotypes "abnormal kidney physiology", "decreased urine osmolality" and "response to salt stress".] Orthologous phenotypes in other model species can shed light on gene functions and, in turn, on disease-causing genes.

8 The Prediction Problem. [Figure: unknown gene-disease links marked "?".]

9 Other data sources: Genomic data. [Figure: expression matrix indexed by genes ("probes") and conditions (e.g. diseased vs normal tissue).] Abundant expression data are available (from sources such as BioGPS). Co-expression reveals gene function modules ("eigengenes"). Langfelder, P., & Horvath, S. (2007). Eigengene networks for studying the relationships between co-expression modules. BMC Systems Biology, 1(1), 54.
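One way to turn an expression matrix into low-dimensional gene features is PCA, which later slides name next to the microarray data. The sketch below is illustrative only: the function name `pca_gene_features`, the synthetic matrix and the choice of k are assumptions, and it does not reproduce the eigengene construction of Langfelder and Horvath.

```python
import numpy as np

def pca_gene_features(expression, k):
    """Return k PCA scores per gene (rows = genes, columns = conditions)."""
    # Center each condition (column) so PCA captures variation across genes.
    centered = expression - expression.mean(axis=0, keepdims=True)
    # Thin SVD: centered = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    # Scores of each gene on the top-k principal directions.
    return U[:, :k] * s[:k]

# Synthetic stand-in for real expression data: 6 genes x 4 conditions.
rng = np.random.default_rng(0)
expression = rng.normal(size=(6, 4))
features = pca_gene_features(expression, k=2)
print(features.shape)
```

Each row of `features` could then serve as a gene-side feature vector for a downstream predictor.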

10 Other data sources: Text data. Descriptions of diagnosis, clinical features and management in text articles provide information on diseases. Represent diseases by document term frequencies. Text data on diseases (e.g. OMIM web pages). [Figure: term-by-disease table with the example terms "endometrial", "anti-estrogen" and "heterozygotes" and the columns Disease 1 and Disease 2.]
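Representing diseases by term frequencies can be sketched with a small tf-idf computation (the tf-idf weighting is the one named on the later Inductive Matrix Completion slides). The two "documents" below are made-up strings that reuse the example terms from the slide, not real OMIM text, and the whitespace tokenizer and the function name `tfidf` are assumptions.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return (vocabulary, tf-idf rows), one row per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for doc in tokenized for t in set(doc))
    rows = []
    for doc in tokenized:
        counts = Counter(doc)
        # Term frequency times inverse document frequency.
        rows.append([(counts[t] / len(doc)) * math.log(n_docs / df[t])
                     for t in vocab])
    return vocab, rows

# Hypothetical disease descriptions (illustrative only).
docs = [
    "endometrial tumor anti-estrogen therapy",
    "heterozygotes tumor",
]
vocab, rows = tfidf(docs)
for term, w1, w2 in zip(vocab, rows[0], rows[1]):
    print(f"{term:15s} {w1:.3f} {w2:.3f}")
```

A term that appears in every document (here "tumor") gets weight zero, so the vectors emphasize terms that distinguish one disease from another.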

11 Other data sources: Co-morbidity. Similarity network between diseases. [Figure: example network linking Prune Belly Syndrome, Neurofibromatosis and Achondroplasia, with an edge weight of 0.31.] Similarities can be computed from data available on diseases (such as text, symptoms, drug responses, etc.).
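A disease similarity network can be derived from any per-disease feature vectors. The sketch below uses cosine similarity, which is an assumption (the slide does not say which similarity measure was used), and the small feature matrix is invented for illustration.

```python
import numpy as np

def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between rows (one row per disease)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    # Guard against all-zero rows to avoid division by zero.
    unit = features / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T

# Hypothetical feature vectors for three diseases (e.g. term weights).
features = np.array([[1.0, 0.0],
                     [1.0, 1.0],
                     [0.0, 2.0]])
S = cosine_similarity_matrix(features)
print(np.round(S, 2))
```

Thresholding `S` or keeping each disease's strongest neighbours would give a sparse similarity network of the kind pictured on the slide.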

12 Problem Formulation. [Figure: genes-by-diseases association matrix with unknown entries marked "?".] We want to predict the missing associations, denoted by "?".

13 The Netflix Problem

14 Inductive Matrix Completion. [Figure: genes-by-diseases matrix with unknown entries "?"; gene-side features from orthologous phenotypes, microarray data (PCA) and the gene network (AQP6, AQP1, AQP5, AQP2, GK, MIP).]

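15 Inductive Matrix Completion. [Figure: genes-by-diseases matrix with unknown entries "?"; gene-side features from orthologous phenotypes, microarray data (PCA) and the gene network (AQP6, AQP1, AQP5, AQP2, GK, MIP); disease-side features from OMIM pages (tf-idf) and disease similarities (NF2, EGBRS, ACH, with an example weight of 0.12).]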

16 Inductive Matrix Completion. [Figure: same diagram as on slide 15: gene-side features (orthologous phenotypes, microarray PCA, gene network) and disease-side features (OMIM pages tf-idf, disease similarities) around the genes-by-diseases matrix.]

17 Inductive Matrix Completion. [Figure: same diagram as on slide 15: gene-side features (orthologous phenotypes, microarray PCA, gene network) and disease-side features (OMIM pages tf-idf, disease similarities) around the genes-by-diseases matrix.] Natarajan N, Dhillon IS. Inductive Matrix Completion for Predicting Gene-Disease Associations. To appear in Bioinformatics.
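The idea pictured on these slides, scoring a gene-disease pair from the features of both sides, is commonly written as a bilinear model, score(i, j) = x_i^T W H^T y_j, with low-rank factors W and H. The sketch below is a minimal illustration under strong assumptions: it fits the model by unregularized least squares followed by rank truncation on a fully observed synthetic matrix. This is a stand-in, not the solver from the cited paper, and real association data are binary and only partially observed. All names (`imc_fit`, `imc_scores`) are hypothetical.

```python
import numpy as np

def imc_fit(P, X, Y, rank):
    """Fit P ~ X @ W @ H.T @ Y.T by least squares, then truncate to `rank`."""
    # Least-squares solution for the full bilinear map Z = W @ H.T.
    Z = np.linalg.pinv(X) @ P @ np.linalg.pinv(Y).T
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    root = np.sqrt(s[:rank])
    W = U[:, :rank] * root      # gene-side factor, shape (f_g, rank)
    H = Vt[:rank].T * root      # disease-side factor, shape (f_d, rank)
    return W, H

def imc_scores(X, W, H, Y):
    """Predicted association scores for every (gene, disease) pair."""
    return X @ W @ H.T @ Y.T

# Synthetic data: 8 genes with 3 features, 5 diseases with 4 features.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))
Y = rng.standard_normal((5, 4))
W0 = rng.standard_normal((3, 2))
H0 = rng.standard_normal((4, 2))
P = X @ W0 @ H0.T @ Y.T          # noiseless rank-2 "associations"

W, H = imc_fit(P, X, Y, rank=2)

# Inductive step: score a gene that was never in P, using only its features.
x_new = rng.standard_normal((1, 3))
new_scores = imc_scores(x_new, W, H, Y)
print(new_scores.shape)
```

Because the model acts on features rather than on row and column indices, it can score genes or diseases with no known associations, which is what separates the inductive setting from plain matrix completion.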

18 Results on OMIM data. [Plot: P(hidden gene among genes looked at) versus number of genes looked at, for Inductive Matrix Completion, CATAPULT, Biased Matrix Completion, ProDiGe [Mordelet et al. 2011], Katz, LEML [Yu et al. 2014] and Matrix Completion (baseline).] References: 1. Online Mendelian Inheritance in Man (OMIM). 2. F. Mordelet and J.-P. Vert. ProDiGe: PRioritization Of Disease Genes with multitask machine learning from positive and unlabeled examples. BMC Bioinformatics, 2011. 3. Yu, Hsiang-Fu, Prateek Jain, and Inderjit S. Dhillon. Large-scale Multi-label Learning with Missing Labels. ICML, 2014.

19 Results on OMIM data. [Two plots, "New genes" and "New diseases": P(hidden gene among genes looked at) versus number of genes looked at, for Inductive Matrix Completion, CATAPULT, Biased Matrix Completion, ProDiGe [Mordelet et al. 2011], Katz and LEML [Yu et al. 2014].]
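The quantity on the vertical axis, P(hidden gene among genes looked at), can be estimated as the fraction of held-out gene-disease pairs whose gene ranks among the top r candidates for its disease. The sketch below assumes a dense score matrix with genes as rows and diseases as columns. The function name `recall_at_r` and the toy scores are hypothetical, and a real evaluation would typically also exclude genes already known for the disease from the ranking.

```python
import numpy as np

def recall_at_r(scores, hidden_pairs, r):
    """Fraction of hidden (gene, disease) pairs ranked in the top r genes."""
    hits = 0
    for gene, disease in hidden_pairs:
        # Indices of the r highest-scoring genes for this disease.
        top = np.argsort(-scores[:, disease])[:r]
        hits += int(gene in top)
    return hits / len(hidden_pairs)

# Toy scores: 3 genes x 2 diseases.
scores = np.array([[0.9, 0.1],
                   [0.2, 0.8],
                   [0.5, 0.3]])
hidden = [(0, 0), (2, 1)]   # held-out associations
print(recall_at_r(scores, hidden, r=1))
print(recall_at_r(scores, hidden, r=2))
```

Sweeping r over a range of values traces out a curve of the kind shown in the plots.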

20 Positive-Unlabeled Learning. The prediction problem gives rise to a novel machine learning problem called PU learning: learning in the absence of negative examples. Methods such as biased SVM and biased matrix completion have been shown to perform well empirically, and can be analyzed theoretically using random noise models. For biased matrix completion, we can show the following. Theorem: Let $\rho$ be the noise rate. With probability at least $1 - \delta$, the solution $X$ to the biased matrix completion is close to the true $n \times n$ matrix $Y$:

$$\frac{1}{n^2} \sum_{i,j} (X_{ij} - Y_{ij})^2 = O\!\left(\frac{1}{(1-\rho)\,n}\right)$$
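The "biased" part of biased matrix completion can be illustrated with a weighted squared loss that penalizes errors on observed positives differently from errors on unlabeled entries. The form below, with a weight `alpha` on positives and `1 - alpha` on unlabeled entries, is one common way to write such a loss and is an assumption here. The slide does not give the objective, and the regularizer that would normally accompany it is omitted.

```python
import numpy as np

def biased_squared_loss(X, A, alpha):
    """Weighted squared loss for PU data.

    A holds 1 for observed positives and 0 for unlabeled entries;
    alpha weights the positives and (1 - alpha) the unlabeled entries.
    """
    pos = A == 1
    loss_pos = np.sum((X[pos] - 1.0) ** 2)   # positives should score near 1
    loss_unl = np.sum(X[~pos] ** 2)          # unlabeled entries pulled toward 0
    return alpha * loss_pos + (1.0 - alpha) * loss_unl

A = np.array([[1, 0],
              [0, 1]])
all_zero = np.zeros((2, 2))
all_one = np.ones((2, 2))
# With alpha = 0.9, missing a known positive costs more than
# scoring an unlabeled entry highly.
print(biased_squared_loss(all_zero, A, alpha=0.9))
print(biased_squared_loss(all_one, A, alpha=0.9))
```

Choosing alpha above 0.5 reflects the PU setting: an unlabeled entry may be a true but unobserved association, so scoring it highly is penalized less than missing a known positive.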

21 Conclusions. Informatics for Computational Medicine is crucial. Need novel statistical techniques: analysis of extremely high-dimensional data from high-throughput studies; learning from diverse information sources; interpretable computational models. Can obtain better gene-disease associations than the state of the art.


More information

ANALYTISCHE STRATEGIE Tissue Imaging. Bernd Bodenmiller Institute of Molecular Life Sciences University of Zurich

ANALYTISCHE STRATEGIE Tissue Imaging. Bernd Bodenmiller Institute of Molecular Life Sciences University of Zurich ANALYTISCHE STRATEGIE Tissue Imaging Bernd Bodenmiller Institute of Molecular Life Sciences University of Zurich Quantitative Breast single cancer cell analysis Switzerland Brain Breast Lung Colon-rectum

More information

Biomarker adaptive designs in clinical trials

Biomarker adaptive designs in clinical trials Review Article Biomarker adaptive designs in clinical trials James J. Chen 1, Tzu-Pin Lu 1,2, Dung-Tsa Chen 3, Sue-Jane Wang 4 1 Division of Bioinformatics and Biostatistics, National Center for Toxicological

More information

SNPrints: Defining SNP signatures for prediction of onset in complex diseases

SNPrints: Defining SNP signatures for prediction of onset in complex diseases SNPrints: Defining SNP signatures for prediction of onset in complex diseases Linda Liu, Biomedical Informatics, Stanford University Daniel Newburger, Biomedical Informatics, Stanford University Grace

More information

Natural Selection & People. Descendents of colonizers. Natural selection. Jeanne Sept 9/8/04. P200 Lecture 1

Natural Selection & People. Descendents of colonizers. Natural selection. Jeanne Sept 9/8/04. P200 Lecture 1 Natural Selection & People Human impact on evolutionary process Evolutionary process impact on people Ethical questions Descendents of colonizers Natural selection ONE mechanism for evolutionary change

More information

Machine Learning for Predicting Delayed Onset Trauma Following Ischemic Stroke

Machine Learning for Predicting Delayed Onset Trauma Following Ischemic Stroke Machine Learning for Predicting Delayed Onset Trauma Following Ischemic Stroke Anthony Ma 1, Gus Liu 1 Department of Computer Science, Stanford University, Stanford, CA 94305 Stroke is currently the third

More information

SNP Array NOTE: THIS IS A SAMPLE REPORT AND MAY NOT REFLECT ACTUAL PATIENT DATA. FORMAT AND/OR CONTENT MAY BE UPDATED PERIODICALLY.

SNP Array NOTE: THIS IS A SAMPLE REPORT AND MAY NOT REFLECT ACTUAL PATIENT DATA. FORMAT AND/OR CONTENT MAY BE UPDATED PERIODICALLY. SAMPLE REPORT SNP Array NOTE: THIS IS A SAMPLE REPORT AND MAY NOT REFLECT ACTUAL PATIENT DATA. FORMAT AND/OR CONTENT MAY BE UPDATED PERIODICALLY. RESULTS SNP Array Copy Number Variations Result: LOSS,

More information

A Network Partition Algorithm for Mining Gene Functional Modules of Colon Cancer from DNA Microarray Data

A Network Partition Algorithm for Mining Gene Functional Modules of Colon Cancer from DNA Microarray Data Method A Network Partition Algorithm for Mining Gene Functional Modules of Colon Cancer from DNA Microarray Data Xiao-Gang Ruan, Jin-Lian Wang*, and Jian-Geng Li Institute of Artificial Intelligence and

More information

Integrative Methods for Gene Data Analysis and Knowledge Discovery on the Case Study of KEDRI s Brain Gene Ontology. Yuepeng Wang

Integrative Methods for Gene Data Analysis and Knowledge Discovery on the Case Study of KEDRI s Brain Gene Ontology. Yuepeng Wang Integrative Methods for Gene Data Analysis and Knowledge Discovery on the Case Study of KEDRI s Brain Gene Ontology Yuepeng Wang A thesis submitted to Auckland University of Technology in partial fulfilment

More information

Computational Analysis of Genome-Wide DNA Copy Number Changes

Computational Analysis of Genome-Wide DNA Copy Number Changes Computational Analysis of Genome-Wide DNA Copy Number Changes Lei Song Thesis submitted to the faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements

More information

Probability and Punnett Squares

Probability and Punnett Squares Performance Task Probability and Punnett Squares Essential Knowledge 3.A.3 Challenge Area 3.14 Building Block A The chromosomal basis of inheritance provides an understanding of the pattern of passage

More information

Size Matters: the Structural Effect of Social Context

Size Matters: the Structural Effect of Social Context Size Matters: the Structural Effect of Social Context Siwei Cheng Yu Xie University of Michigan Abstract For more than five decades since the work of Simmel (1955), many social science researchers have

More information

Package netclass. July 2, 2014

Package netclass. July 2, 2014 Package netclass Jul 2, 2014 Version 1.2.1 Date 2013-12-03 Title netclass: An R Package for Network-Based Biomarker Discover Author Yupeng Cun Maintainer netclass is an R package for network-based feature

More information

Lithium-induced Tubular Dysfunction. Jun Ki Park 11/30/10

Lithium-induced Tubular Dysfunction. Jun Ki Park 11/30/10 Lithium-induced Tubular Dysfunction Jun Ki Park 11/30/10 Use of Lithium Mid 19 th century: treatment of gout Late 19 th century: used for psychiatric disorders Early 20 th century: sodium substitute to

More information

Rare diseases: People, policy and practice. Dr Caroline Graham Office of Population Health Genomics

Rare diseases: People, policy and practice. Dr Caroline Graham Office of Population Health Genomics Rare diseases: People, policy and practice Dr Caroline Graham Office of Population Health Genomics Background Rare Diseases: People Policy Practice Rare Diseases: People Policy Practice Rare Diseases Individually

More information

Score Tests of Normality in Bivariate Probit Models

Score Tests of Normality in Bivariate Probit Models Score Tests of Normality in Bivariate Probit Models Anthony Murphy Nuffield College, Oxford OX1 1NF, UK Abstract: A relatively simple and convenient score test of normality in the bivariate probit model

More information

Multi-Scale Entropy: A Framework to Quantify Health Status of Human and Machine

Multi-Scale Entropy: A Framework to Quantify Health Status of Human and Machine 2011-06-28 Multi-Scale Entropy: A Framework to Quantify Health Status of Human and Machine C.-K. Peng, PhD Co-Director, Margret and H.A. Rey Institute for Nonlinear Dynamics in Medicine BIDMC / Harvard

More information

Bayesian graphical models for combining multiple data sources, with applications in environmental epidemiology

Bayesian graphical models for combining multiple data sources, with applications in environmental epidemiology Bayesian graphical models for combining multiple data sources, with applications in environmental epidemiology Sylvia Richardson 1 sylvia.richardson@imperial.co.uk Joint work with: Alexina Mason 1, Lawrence

More information

Simple Decision Rules for Classifying Human Cancers from Gene Expression Profiles

Simple Decision Rules for Classifying Human Cancers from Gene Expression Profiles Simple Decision Rules for Classifying Human Cancers from Gene Expression Profiles Aik Choon TAN Post-Doc Research Fellow actan@jhu.edu Prof. Raimond L. Winslow rwinslow@jhu.edu, Director, ICM & CCBM, Prof.

More information

Deep Learning Analytics for Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations

Deep Learning Analytics for Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations Deep Learning Analytics for Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations Andy Nguyen, M.D., M.S. Medical Director, Hematopathology, Hematology and Coagulation Laboratory,

More information

NMF-Density: NMF-Based Breast Density Classifier

NMF-Density: NMF-Based Breast Density Classifier NMF-Density: NMF-Based Breast Density Classifier Lahouari Ghouti and Abdullah H. Owaidh King Fahd University of Petroleum and Minerals - Department of Information and Computer Science. KFUPM Box 1128.

More information

Fighting Cancer with Data: Perspectives on MD Anderson Moon Shot Program

Fighting Cancer with Data: Perspectives on MD Anderson Moon Shot Program Fighting Cancer with Data: Perspectives on MD Anderson Moon Shot Program Joxel Garcia, MD, MBA Executive Director, Cancer Prevention and Control Platform Member, Moon Shots Program Leadership Team The

More information

Copy Number Variations and Association Mapping Advanced Topics in Computa8onal Genomics

Copy Number Variations and Association Mapping Advanced Topics in Computa8onal Genomics Copy Number Variations and Association Mapping 02-715 Advanced Topics in Computa8onal Genomics SNP and CNV Genotyping SNP genotyping assumes two copy numbers at each locus (i.e., no CNVs) CNV genotyping

More information

From Sentiment to Emotion Analysis in Social Networks

From Sentiment to Emotion Analysis in Social Networks From Sentiment to Emotion Analysis in Social Networks Jie Tang Department of Computer Science and Technology Tsinghua University, China 1 From Info. Space to Social Space Info. Space! Revolutionary changes!

More information

Efficient AUC Optimization for Information Ranking Applications

Efficient AUC Optimization for Information Ranking Applications Efficient AUC Optimization for Information Ranking Applications Sean J. Welleck IBM, USA swelleck@us.ibm.com Abstract. Adequate evaluation of an information retrieval system to estimate future performance

More information