Data and text mining applied to the computational study of protein interaction networks

1 Data and text mining applied to the computational study of protein interaction networks Miguel Andrade Faculty of Biology, Johannes Gutenberg University Institute of Molecular Biology Mainz, Germany

2 PubMed

5 MedlineRanker (Jean-Fred Fontaine). Ranks MEDLINE according to a topic. Noun weights: linear naive Bayesian classifier, with a split-Laplace smoothing scheme to counteract class skew. Fontaine et al. (2009) Nucleic Acids Research

6 MedlineRanker Jean-Fred Fontaine Ranks MEDLINE according to a topic Fontaine et al. (2009) Nucleic Acids Research

7 MedlineRanker Jean-Fred Fontaine Abstract scores: sum of the weights of its nouns. P-values: proportion of abstracts with a higher score within 10,000 recent abstracts.
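The scoring described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the MedlineRanker implementation: the function names are hypothetical, the weights are simple log-odds with plain Laplace smoothing (a stand-in for the split-Laplace scheme, whose details are not given here), and counting each noun once per abstract is an assumption.

```python
import math
from collections import Counter


def noun_weights(relevant_docs, background_docs, alpha=1.0):
    """Log-odds weight per noun from document frequencies.

    Assumption: plain Laplace smoothing with pseudo-count `alpha`,
    used only as a stand-in for the split-Laplace scheme.
    Each document is a list of nouns.
    """
    rel = Counter(n for doc in relevant_docs for n in set(doc))
    bg = Counter(n for doc in background_docs for n in set(doc))
    weights = {}
    for noun in set(rel) | set(bg):
        p_rel = (rel[noun] + alpha) / (len(relevant_docs) + 2 * alpha)
        p_bg = (bg[noun] + alpha) / (len(background_docs) + 2 * alpha)
        weights[noun] = math.log(p_rel / p_bg)
    return weights


def abstract_score(nouns, weights):
    """Score of an abstract: sum of the weights of its distinct nouns."""
    return sum(weights.get(n, 0.0) for n in set(nouns))


def empirical_p_value(score, reference_scores):
    """Proportion of reference abstracts with a strictly higher score."""
    higher = sum(1 for s in reference_scores if s > score)
    return higher / len(reference_scores)
```

In use, `reference_scores` would hold the scores of the 10,000 recent abstracts mentioned on the slide, so a low p-value means that few recent abstracts score higher than the query abstract.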

8 Génie Ranks a set of genes from a whole genome according to a topic Human Fontaine et al. (2011) Nucleic Acids Research

9 Génie P-values for genes: one-sided Fisher's exact test comparing the number of selected abstracts to a set of 10,000 randomly selected abstracts (P<0.01 by default). FDR: Benjamini and Hochberg
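The two statistical steps named on this slide can be illustrated with the standard library alone. The sketch below is not Génie's code: the function names are hypothetical, and the layout of the 2x2 table (selected versus non-selected abstracts for a gene, against the same counts in the random set) is an assumption about how the comparison is set up.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability of observing `a` or more in the top-left
    cell when all margins are held fixed (hypergeometric tail).
    Assumed layout: a, b = selected / other abstracts of the gene;
    c, d = selected / other abstracts in the random set.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(
        comb(col1, x) * comb(total - col1, row1 - x)
        for x in range(a, upper + 1)
    )
    return tail / comb(total, row1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then be reported when its adjusted p-value falls below the chosen FDR threshold.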

10 Alkemio Ranks chemicals according to a topic Fontaine et al. (2014) Nucleic Acids Research

11 Alkemio

12 PESCADOR Adriano Barbosa Extract interactions and filter by concepts Barbosa-Silva et al. (2010) BMC Bioinformatics Barbosa-Silva et al. (2011) BMC Bioinformatics

13 PESCADOR

14 PESCADOR co-occurrence types
Type 1: Term + [Biointeraction] + Term
Type 2: [Biointeraction] + Term + Term + [Biointeraction]
Type 3: Term + Term
Type 4: co-occurrence in abstract

18 HIPPIE (Martin Schaefer). Schaefer et al. (2012) PLoS One; Schaefer et al. (2013) PLoS Comp Biol

19 HIPPIE: Human Integrated Protein-Protein Interaction reference
Automated retrieval of experimental human PPIs and associated meta data from databases.
Extraction of: publications; experimental techniques; orthologous interacting protein pairs.
Expert knowledge on experimental reliability.
Scoring function: Σ(#studies; #techniques; #species)
HIPPIE (>90,000 human, experimental PPIs)

20 Scoring function: Σ(#studies; #orthologs; #techniques)
$\mathrm{Score} = w_s s_s + w_o s_o + w_t s_t$, with $w_s + w_o + w_t = 1$, $s_i(0) = 0$ and $s_i(\infty) = 1$.
Optimization: IntAct dataset (28,073 interactions); 109 studies with >10 interactions and more than two PPIs found in multiple studies. Grid search in the range [0, 3] for the $a_i$ and in the range [0, 1] for the $w_i$ (step width of 0.1 for both).
Result: $a_s = 2.3$, $a_o = 1.6$, $a_t = 0.2$, $w_s = 0.6$, $w_o = 0.1$, $w_t = 0.3$; $\max(f) = 1.023$.
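As a rough illustration of the weighted sum on this slide, the sketch below combines three saturating sub-scores using the weights and shape parameters reported above. The slide does not give the functional form of the sub-scores, so `saturating` is a hypothetical choice that only satisfies the stated constraint of being 0 at zero and approaching 1 for large counts; the resulting numbers therefore do not reproduce HIPPIE scores.

```python
import math

# Parameters reported on the slide after the grid search.
SHAPE = {"studies": 2.3, "orthologs": 1.6, "techniques": 0.2}
WEIGHT = {"studies": 0.6, "orthologs": 0.1, "techniques": 0.3}


def saturating(x, a):
    """Hypothetical sub-score: 0 at x = 0, approaching 1 as x grows.

    The true functional form is not stated on the slide; this
    exponential form is an assumption made for illustration.
    """
    return 1.0 - math.exp(-a * x)


def confidence_score(n_studies, n_orthologs, n_techniques):
    """Weighted sum of the three sub-scores; lies in [0, 1)."""
    counts = {
        "studies": n_studies,
        "orthologs": n_orthologs,
        "techniques": n_techniques,
    }
    return sum(WEIGHT[k] * saturating(counts[k], SHAPE[k]) for k in WEIGHT)
```

Because the weights sum to 1 and each sub-score is bounded by 1, the combined score stays between 0 and 1 and grows with each kind of supporting evidence.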

21 HIPPIE web frontend

28 Function of Martin Schaefer Human Dog Mouse Opossum Chicken Frog Zebrafish Trout Fugu Stickleback Lancelet Capitella Limpet Nematostella Trichoplax Ciona intestinalis Ciona savignyi D. melanogaster D. mojavensis D. sechellia D. erecta D. yakuba D. grimshawi D. pseudoobscura D. persimilis D. ananassae D. willistoni D. virilis in Huntingtin Schaefer et al (2012) Nucleic Acids Res.

29 partners human TFs long non

30 human yeast partners TFs long non TFs long non

31 86 human proteins N-terminal 13 proteins with near polyp 109 regions 54 overlap/near coiled coil coiled coil C-terminal polyp 12 C-term

32 86 human proteins N-terminal 13 proteins with near polyp 40 human /coiled-coil proteins (no polyp) coiled coil 36 N-term C-terminal polyp 12 C-term

33 86 human proteins interacting proteins 49 interactions with another protein (p-value = ) coiled coil coiled coil polyp polyp P-value: simulation, comparison to random set with similar or higher degree

34 86 human proteins Non-interacting proteins coiled coil coiled coil polyp Enrichment (p-value < 2.2e-16) P-value: Fisher's exact test

35 protein unbound N-terminal coiled coil disordered polyp C-terminal

36 protein unbound N-terminal protein protein X coiled coil disordered polyp coiled coil polyp C-terminal

37 protein unbound N-terminal protein bound protein X coiled coil disordered polyp coiled coil polyp C-terminal

38 ATXN1Q82 NT is toxic ATXN1Q82 NT aggregates Spyros Petrakis Petrakis et al. (2012) PLoS Genetics

39 interactors that change ATXN1Q82 NT toxicity

41 Normal protein CC disordered CC partner

43 Normal protein Toxic protein CC disordered CC partner alpha-helix

44 Normal protein Toxic protein CC disordered beta-aggregates CC partner alpha-helix

45 Normal protein Toxic protein CC disordered beta-aggregates CC partner alpha-helix beta-aggregates

47 Normal protein Toxic protein CC disordered beta-aggregates CC partner alpha-helix increased beta-aggregates non-cc partner

53 Erich Wanker, Matthias Futschik, David Fournier
HTT network: 509 proteins, 1319 interactions
Brain network: 88 proteins, 113 interactions
Caudate nucleus network: 66 proteins, 84 interactions
HD dysreg: 14 proteins, 13 interactions
Stroedicke et al. (2015) Genome Research

55 CRMP1 MED15

56 Computational Biology and Data Mining group: Enrique Muro, Jean-Fred Fontaine, Katerina Taškova, Sweta Talyan, Marie Gebhardt (MDC), Jonas Ibn-Salem, Desirée Kaufmann, Pablo Mier, Gregorio Alanis

57 Thank you!

1 Supplementary Figures

1 Supplementary Figures Supplementary Figures D. simulans (dsim) D. sechellia (dsec) D. melanogaster (dmel) S. cerevisiae (scer) S. paradoxus (spar) S. mikatae (smik) S. bayanus (sbay) S. castellii (scas) C. glabrata (cgla) K.

More information

Supplementary figure legends

Supplementary figure legends Supplementary figure legends SUPPLEMENTRY FIGURE S1. Lentiviral construct. Schematic representation of the PCR fragment encompassing the genomic locus of mir-33a that was introduced in the lentiviral construct.

More information

OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies

OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies 2017 Contents Datasets... 2 Protein-protein interaction dataset... 2 Set of known PPIs... 3 Domain-domain interactions...

More information

Supplemental Materials. for. Conservation of an RNA Regulatory Map between Drosophila and. Mammals

Supplemental Materials. for. Conservation of an RNA Regulatory Map between Drosophila and. Mammals Supplemental Materials for Conservation of an RNA Regulatory Map between Drosophila and Mammals Angela N. Brooks, Li Yang, Michael O. Duff, Kasper Daniel Hansen, Jung W. Park, Sandrine Dudoit, Steven E.

More information

Table of content. -Supplementary methods. -Figure S1. -Figure S2. -Figure S3. -Table legend

Table of content. -Supplementary methods. -Figure S1. -Figure S2. -Figure S3. -Table legend Table of content -Supplementary methods -Figure S1 -Figure S2 -Figure S3 -Table legend Supplementary methods Yeast two-hybrid bait basal transactivation test Because bait constructs sometimes self-transactivate

More information

A quick review. The clustering problem: Hierarchical clustering algorithm: Many possible distance metrics K-mean clustering algorithm:

A quick review. The clustering problem: Hierarchical clustering algorithm: Many possible distance metrics K-mean clustering algorithm: The clustering problem: partition genes into distinct sets with high homogeneity and high separation Hierarchical clustering algorithm: 1. Assign each object to a separate cluster. 2. Regroup the pair

More information

Facts from text: Automated gene annotation with ontologies and text-mining

Facts from text: Automated gene annotation with ontologies and text-mining 1. Workshop des GI-Arbeitskreises Ontologien in Biomedizin und Lebenswissenschaften (OBML) Facts from text: Automated gene annotation with ontologies and text-mining Conrad Plake Schroeder Group (Bioinformatics),

More information

Expanded View Figures

Expanded View Figures Solip Park & Ben Lehner Epistasis is cancer type specific Molecular Systems Biology Expanded View Figures A B G C D E F H Figure EV1. Epistatic interactions detected in a pan-cancer analysis and saturation

More information

Computational Identification and Prediction of Tissue-Specific Alternative Splicing in H. Sapiens. Eric Van Nostrand CS229 Final Project

Computational Identification and Prediction of Tissue-Specific Alternative Splicing in H. Sapiens. Eric Van Nostrand CS229 Final Project Computational Identification and Prediction of Tissue-Specific Alternative Splicing in H. Sapiens. Eric Van Nostrand CS229 Final Project Introduction RNA splicing is a critical step in eukaryotic gene

More information

Gene Ontology and Functional Enrichment. Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein

Gene Ontology and Functional Enrichment. Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Gene Ontology and Functional Enrichment Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein The parsimony principle: A quick review Find the tree that requires the fewest

More information

Data Mining in Bioinformatics Day 4: Text Mining

Data Mining in Bioinformatics Day 4: Text Mining Data Mining in Bioinformatics Day 4: Text Mining Karsten Borgwardt February 25 to March 10 Bioinformatics Group MPIs Tübingen Karsten Borgwardt: Data Mining in Bioinformatics, Page 1 What is text mining?

More information

Evaluating Classifiers for Disease Gene Discovery

Evaluating Classifiers for Disease Gene Discovery Evaluating Classifiers for Disease Gene Discovery Kino Coursey Lon Turnbull khc0021@unt.edu lt0013@unt.edu Abstract Identification of genes involved in human hereditary disease is an important bioinfomatics

More information

Huntington s Disease and its therapeutic target genes: A global functional profile based on the HD Research Crossroads database

Huntington s Disease and its therapeutic target genes: A global functional profile based on the HD Research Crossroads database Supplementary Analyses and Figures Huntington s Disease and its therapeutic target genes: A global functional profile based on the HD Research Crossroads database Ravi Kiran Reddy Kalathur, Miguel A. Hernández-Prieto

More information

DSAP : Deep-sequencing Small RNA Analysis Pipeline 長庚生資中心

DSAP : Deep-sequencing Small RNA Analysis Pipeline 長庚生資中心 DSAP : Deep-sequencing Small RNA Analysis Pipeline 黃柏榕 長庚生資中心 Prominent members of the RNA family Nature 451, 414-416(24 January 2008) Formation and function of small RNAs Nature 451, 414-416(24 January

More information

Journal: Nature Methods

Journal: Nature Methods Journal: Nature Methods Article Title: Network-based stratification of tumor mutations Corresponding Author: Trey Ideker Supplementary Item Supplementary Figure 1 Supplementary Figure 2 Supplementary Figure

More information

38 Int'l Conf. Bioinformatics and Computational Biology BIOCOMP'16

38 Int'l Conf. Bioinformatics and Computational Biology BIOCOMP'16 38 Int'l Conf. Bioinformatics and Computational Biology BIOCOMP'16 PGAR: ASD Candidate Gene Prioritization System Using Expression Patterns Steven Cogill and Liangjiang Wang Department of Genetics and

More information

Integration of high-throughput biological data

Integration of high-throughput biological data Integration of high-throughput biological data Jean Yang and Vivek Jayaswal School of Mathematics and Statistics University of Sydney Meeting the Challenges of High Dimension: Statistical Methodology,

More information

SUPPLEMENTAL INFORMATION

SUPPLEMENTAL INFORMATION SUPPLEMENTAL INFORMATION GO term analysis of differentially methylated SUMIs. GO term analysis of the 458 SUMIs with the largest differential methylation between human and chimp shows that they are more

More information

Nature Neuroscience: doi: /nn Supplementary Figure 1

Nature Neuroscience: doi: /nn Supplementary Figure 1 Supplementary Figure 1 Illustration of the working of network-based SVM to confidently predict a new (and now confirmed) ASD gene. Gene CTNND2 s brain network neighborhood that enabled its prediction by

More information

EXPression ANalyzer and DisplayER

EXPression ANalyzer and DisplayER EXPression ANalyzer and DisplayER Tom Hait Aviv Steiner Igor Ulitsky Chaim Linhart Amos Tanay Seagull Shavit Rani Elkon Adi Maron-Katz Dorit Sagir Eyal David Roded Sharan Israel Steinfeld Yossi Shiloh

More information

Génétique des DCP. Deciphering the molecular bases of ciliopathies. Estelle Escudier, INSERM U 681 Serge Amselem, INSERM U 654

Génétique des DCP. Deciphering the molecular bases of ciliopathies. Estelle Escudier, INSERM U 681 Serge Amselem, INSERM U 654 Génétique des DCP Estelle Escudier, INSERM U 681 Serge Amselem, INSERM U 654 Deciphering the molecular bases of ciliopathies Linkage analyses Candidate gene approaches Chromosome abnormalities Comparative

More information

Supplementary Materials for

Supplementary Materials for www.sciencetranslationalmedicine.org/cgi/content/full/7/283/283ra54/dc1 Supplementary Materials for Clonal status of actionable driver events and the timing of mutational processes in cancer evolution

More information

Automated Social Network Epidemic Data Collector

Automated Social Network Epidemic Data Collector Automated Social Network Epidemic Data Collector Luis F Lopes, João M Zamite, Bruno C Tavares, Francisco M Couto, Fabrício Silva, and Mário J Silva LaSIGE, Universidade de Lisboa epiwork@di.fc.ul.pt Abstract.

More information

Statistical analysis of RIM data (retroviral insertional mutagenesis) Bioinformatics and Statistics The Netherlands Cancer Institute Amsterdam

Statistical analysis of RIM data (retroviral insertional mutagenesis) Bioinformatics and Statistics The Netherlands Cancer Institute Amsterdam Statistical analysis of RIM data (retroviral insertional mutagenesis) Lodewyk Wessels Bioinformatics and Statistics The Netherlands Cancer Institute Amsterdam Viral integration Viral integration Viral

More information

Supplement to SCnorm: robust normalization of single-cell RNA-seq data

Supplement to SCnorm: robust normalization of single-cell RNA-seq data Supplement to SCnorm: robust normalization of single-cell RNA-seq data Supplementary Note 1: SCnorm does not require spike-ins, since we find that the performance of spike-ins in scrna-seq is often compromised,

More information

Sub-Topic Classification of HIV related Opportunistic Infections. Miguel Anderson and Joseph Fonseca

Sub-Topic Classification of HIV related Opportunistic Infections. Miguel Anderson and Joseph Fonseca Sub-Topic Classification of HIV related Opportunistic Infections Miguel Anderson and Joseph Fonseca Introduction Image collected from the CDC https://www.cdc.gov/hiv/basics/statistics.html Background Info

More information

A Comparison of Collaborative Filtering Methods for Medication Reconciliation

A Comparison of Collaborative Filtering Methods for Medication Reconciliation A Comparison of Collaborative Filtering Methods for Medication Reconciliation Huanian Zheng, Rema Padman, Daniel B. Neill The H. John Heinz III College, Carnegie Mellon University, Pittsburgh, PA, 15213,

More information

Analysis of gene expression in blood before diagnosis of ovarian cancer

Analysis of gene expression in blood before diagnosis of ovarian cancer Analysis of gene expression in blood before diagnosis of ovarian cancer Different statistical methods Note no. Authors SAMBA/10/16 Marit Holden and Lars Holden Date March 2016 Norsk Regnesentral Norsk

More information

cis-regulatory enrichment analysis in human, mouse and fly

cis-regulatory enrichment analysis in human, mouse and fly cis-regulatory enrichment analysis in human, mouse and fly Zeynep Kalender Atak, PhD Laboratory of Computational Biology VIB-KU Leuven Center for Brain & Disease Research Laboratory of Computational Biology

More information

Nature Methods: doi: /nmeth.3115

Nature Methods: doi: /nmeth.3115 Supplementary Figure 1 Analysis of DNA methylation in a cancer cohort based on Infinium 450K data. RnBeads was used to rediscover a clinically distinct subgroup of glioblastoma patients characterized by

More information

Learning Objectives 9/9/2013. Hypothesis Testing. Conflicts of Interest. Descriptive statistics: Numerical methods Measures of Central Tendency

Learning Objectives 9/9/2013. Hypothesis Testing. Conflicts of Interest. Descriptive statistics: Numerical methods Measures of Central Tendency Conflicts of Interest I have no conflict of interest to disclose Biostatistics Kevin M. Sowinski, Pharm.D., FCCP Last-Chance Ambulatory Care Webinar Thursday, September 5, 2013 Learning Objectives For

More information

9/4/2013. Decision Errors. Hypothesis Testing. Conflicts of Interest. Descriptive statistics: Numerical methods Measures of Central Tendency

9/4/2013. Decision Errors. Hypothesis Testing. Conflicts of Interest. Descriptive statistics: Numerical methods Measures of Central Tendency Conflicts of Interest I have no conflict of interest to disclose Biostatistics Kevin M. Sowinski, Pharm.D., FCCP Pharmacotherapy Webinar Review Course Tuesday, September 3, 2013 Descriptive statistics:

More information

Gene and Pathway Analysis of Metabolic Traits in Dairy Cows

Gene and Pathway Analysis of Metabolic Traits in Dairy Cows Gene and Pathway Analysis of Metabolic Traits in Dairy Cows Ngoc-Thuy Ha Animal Breeding and Genetics Group Department of Animal Sciences Georg-August-University Goettingen, Germany 1 1 Motivation Background

More information

A framework for the study of diseases and adverse drug reactions

A framework for the study of diseases and adverse drug reactions A framework for the study of diseases and adverse drug reactions Laura I. Furlong IBI group Research Programme on Biomedical Informatics (GRIB) Hospital del Mar Research Institute (IMIM) Information on

More information

FINAL ANNOTATION REPORT: Drosophila virilis Fosmid 11 (48P14) Robert Carrasquillo Bio 4342

FINAL ANNOTATION REPORT: Drosophila virilis Fosmid 11 (48P14) Robert Carrasquillo Bio 4342 FINAL ANNOTATION REPORT: Drosophila virilis Fosmid 11 (48P14) Robert Carrasquillo Bio 4342 2006 TABLE OF CONTENTS I. Overview... 3 II. Genes... 4 III. Clustal Analysis... 15 IV. Repeat Analysis... 17 V.

More information

A two-step drug repositioning method based on a protein-protein interaction network of genes shared by two diseases and the similarity of drugs

A two-step drug repositioning method based on a protein-protein interaction network of genes shared by two diseases and the similarity of drugs www.bioinformation.net Hypothesis Volume 9(2) A two-step drug repositioning method based on a protein-protein interaction network of genes shared by two diseases and the similarity of drugs Yutaka Fukuoka*,

More information

Unsupervised Identification of Isotope-Labeled Peptides

Unsupervised Identification of Isotope-Labeled Peptides Unsupervised Identification of Isotope-Labeled Peptides Joshua E Goldford 13 and Igor GL Libourel 124 1 Biotechnology institute, University of Minnesota, Saint Paul, MN 55108 2 Department of Plant Biology,

More information

DISCOVERING IMPLICIT ASSOCIATIONS BETWEEN GENES AND HEREDITARY DISEASES

DISCOVERING IMPLICIT ASSOCIATIONS BETWEEN GENES AND HEREDITARY DISEASES DISCOVERING IMPLICIT ASSOCIATIONS BETWEEN GENES AND HEREDITARY DISEASES KAZUHIRO SEKI Graduate School of Science and Technology, Kobe University 1-1 Rokkodai, Nada, Kobe 657-8501, Japan E-mail: seki@cs.kobe-u.ac.jp

More information

Supplementary Figure 1 IL-27 IL

Supplementary Figure 1 IL-27 IL Tim-3 Supplementary Figure 1 Tc0 49.5 0.6 Tc1 63.5 0.84 Un 49.8 0.16 35.5 0.16 10 4 61.2 5.53 10 3 64.5 5.66 10 2 10 1 10 0 31 2.22 10 0 10 1 10 2 10 3 10 4 IL-10 28.2 1.69 IL-27 Supplementary Figure 1.

More information

Department of Chemistry, Université de Montréal, C.P. 6128, Succursale centre-ville, Montréal, Québec, H3C 3J7, Canada.

Department of Chemistry, Université de Montréal, C.P. 6128, Succursale centre-ville, Montréal, Québec, H3C 3J7, Canada. Phosphoproteome dynamics of Saccharomyces cerevisiae under heat shock and cold stress Evgeny Kanshin 1,5, Peter Kubiniok 1,2,5, Yogitha Thattikota 1,3, Damien D Amours 1,3 and Pierre Thibault 1,2,4 * 1

More information

CS 4365: Artificial Intelligence Recap. Vibhav Gogate

CS 4365: Artificial Intelligence Recap. Vibhav Gogate CS 4365: Artificial Intelligence Recap Vibhav Gogate Exam Topics Search BFS, DFS, UCS, A* (tree and graph) Completeness and Optimality Heuristics: admissibility and consistency CSPs Constraint graphs,

More information

User Guide. Association analysis. Input

User Guide. Association analysis. Input User Guide TFEA.ChIP is a tool to estimate transcription factor enrichment in a set of differentially expressed genes using data from ChIP-Seq experiments performed in different tissues and conditions.

More information

Micro-RNA web tools. Introduction. UBio Training Courses. mirnas, target prediction, biology. Gonzalo

Micro-RNA web tools. Introduction. UBio Training Courses. mirnas, target prediction, biology. Gonzalo Micro-RNA web tools UBio Training Courses Gonzalo Gómez//ggomez@cnio.es Introduction mirnas, target prediction, biology Experimental data Network Filtering Pathway interpretation mirs-pathways network

More information

Supplementary Methods

Supplementary Methods Supplementary Methods Analysis of time course gene expression data. The time course data of the expression level of a representative gene is shown in the below figure. The trajectory of longitudinal expression

More information

The FunCluster Package

The FunCluster Package The FunCluster Package October 23, 2007 Version 1.07 Date 2007-10-23 Title Functional Profiling of Microarray Expression Data Author Corneliu Henegar Maintainer Corneliu Henegar

More information

Single SNP/Gene Analysis. Typical Results of GWAS Analysis (Single SNP Approach) Typical Results of GWAS Analysis (Single SNP Approach)

Single SNP/Gene Analysis. Typical Results of GWAS Analysis (Single SNP Approach) Typical Results of GWAS Analysis (Single SNP Approach) High-Throughput Sequencing Course Gene-Set Analysis Biostatistics and Bioinformatics Summer 28 Section Introduction What is Gene Set Analysis? Many names for gene set analysis: Pathway analysis Gene set

More information

IMPaLA tutorial.

IMPaLA tutorial. IMPaLA tutorial http://impala.molgen.mpg.de/ 1. Introduction IMPaLA is a web tool, developed for integrated pathway analysis of metabolomics data alongside gene expression or protein abundance data. It

More information

Mature microrna identification via the use of a Naive Bayes classifier

Mature microrna identification via the use of a Naive Bayes classifier Mature microrna identification via the use of a Naive Bayes classifier Master Thesis Gkirtzou Katerina Computer Science Department University of Crete 13/03/2009 Gkirtzou K. (CSD UOC) Mature microrna identification

More information

Nature Immunology: doi: /ni Supplementary Figure 1. Transcriptional program of the TE and MP CD8 + T cell subsets.

Nature Immunology: doi: /ni Supplementary Figure 1. Transcriptional program of the TE and MP CD8 + T cell subsets. Supplementary Figure 1 Transcriptional program of the TE and MP CD8 + T cell subsets. (a) Comparison of gene expression of TE and MP CD8 + T cell subsets by microarray. Genes that are 1.5-fold upregulated

More information

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq Philipp Bucher Wednesday January 21, 2009 SIB graduate school course EPFL, Lausanne ChIP-seq against histone variants: Biological

More information

Early Learning vs Early Variability 1.5 r = p = Early Learning r = p = e 005. Early Learning 0.

Early Learning vs Early Variability 1.5 r = p = Early Learning r = p = e 005. Early Learning 0. The temporal structure of motor variability is dynamically regulated and predicts individual differences in motor learning ability Howard Wu *, Yohsuke Miyamoto *, Luis Nicolas Gonzales-Castro, Bence P.

More information

Semi-Automatic Construction of Thyroid Cancer Intervention Corpus from Biomedical Abstracts

Semi-Automatic Construction of Thyroid Cancer Intervention Corpus from Biomedical Abstracts jsci2016 Semi-Automatic Construction of Thyroid Cancer Intervention Corpus from Biomedical Wutthipong Kongburan, Praisan Padungweang, Worarat Krathu, Jonathan H. Chan School of Information Technology King

More information

Analysis of paired mirna-mrna microarray expression data using a stepwise multiple linear regression model

Analysis of paired mirna-mrna microarray expression data using a stepwise multiple linear regression model Analysis of paired mirna-mrna microarray expression data using a stepwise multiple linear regression model Yiqian Zhou 1, Rehman Qureshi 2, and Ahmet Sacan 3 1 Pure Storage, 650 Castro Street, Suite #260,

More information

Watson-Crick Model of B-DNA

Watson-Crick Model of B-DNA Reading: Ch8; 285-290 Ch24; 963-978 Problems: Ch8 (text); 9 Ch8 (study-guide: facts); 3 Ch24 (text); 5,7,9,10,14,16 Ch24 (study-guide: applying); 1 Ch24 (study-guide: facts); 1,2,4 NEXT Reading: Ch1; 29-34

More information

SUPPLEMENTARY APPENDIX

SUPPLEMENTARY APPENDIX SUPPLEMENTARY APPENDIX 1) Supplemental Figure 1. Histopathologic Characteristics of the Tumors in the Discovery Cohort 2) Supplemental Figure 2. Incorporation of Normal Epidermal Melanocytic Signature

More information

Nature Structural & Molecular Biology: doi: /nsmb Supplementary Figure 1

Nature Structural & Molecular Biology: doi: /nsmb Supplementary Figure 1 Supplementary Figure 1 Frequency of alternative-cassette-exon engagement with the ribosome is consistent across data from multiple human cell types and from mouse stem cells. Box plots showing AS frequency

More information

COMPARATIVE BEHAVIOURAL ANALYSIS OF MATING BETWEEN YELLOW AND WILD TYPE DROSOPHILA OF MELANOGASTER SPECIES GROUP

COMPARATIVE BEHAVIOURAL ANALYSIS OF MATING BETWEEN YELLOW AND WILD TYPE DROSOPHILA OF MELANOGASTER SPECIES GROUP Journal of Scientific Research Vol. 58, 2014 : 45-50 Banaras Hindu University, Varanasi ISSN : 0447-9483 COMPARATIVE BEHAVIOURAL ANALYSIS OF MATING BETWEEN YELLOW AND WILD TYPE DROSOPHILA OF MELANOGASTER

More information

A Practical Guide to Integrative Genomics by RNA-seq and ChIP-seq Analysis

A Practical Guide to Integrative Genomics by RNA-seq and ChIP-seq Analysis A Practical Guide to Integrative Genomics by RNA-seq and ChIP-seq Analysis Jian Xu, Ph.D. Children s Research Institute, UTSW Introduction Outline Overview of genomic and next-gen sequencing technologies

More information

Human breast milk mirna, maternal probiotic supplementation and atopic dermatitis in offsrping

Human breast milk mirna, maternal probiotic supplementation and atopic dermatitis in offsrping Human breast milk mirna, maternal probiotic supplementation and atopic dermatitis in offsrping Melanie Rae Simpson PhD candidate Department of Public Health and General Practice Norwegian University of

More information

Direct memory access using two cues: Finding the intersection of sets in a connectionist model

Direct memory access using two cues: Finding the intersection of sets in a connectionist model Direct memory access using two cues: Finding the intersection of sets in a connectionist model Janet Wiles, Michael S. Humphreys, John D. Bain and Simon Dennis Departments of Psychology and Computer Science

More information

Linear Regression in SAS

Linear Regression in SAS 1 Suppose we wish to examine factors that predict patient s hemoglobin levels. Simulated data for six patients is used throughout this tutorial. data hgb_data; input id age race $ bmi hgb; cards; 21 25

More information

High Specificity - a Necessity for Automated Detection of Lead Reversals in the 12-lead ECG

High Specificity - a Necessity for Automated Detection of Lead Reversals in the 12-lead ECG LU TP 99-24 High Specificity - a Necessity for Automated Detection of Lead Reversals in the 12-lead ECG Mattias Ohlsson, PhD 1, Bo Hedén, MD, PhD 2, Lars Edenbrandt, MD, PhD 2 Departments of 1 Theoretical

More information

Relaxation of Selective Constraints Causes Independent Selenoprotein Extinction in Insect Genomes

Relaxation of Selective Constraints Causes Independent Selenoprotein Extinction in Insect Genomes Relaxation of Selective Constraints Causes Independent Selenoprotein Extinction in Insect Genomes Charles E. Chapple 1, Roderic Guigó 2 * 1 Center for Genomic Regulation, Universitat Pompeu Fabra and Institut

More information

Influenza Virus HA Subtype Numbering Conversion Tool and the Identification of Candidate Cross-Reactive Immune Epitopes
