Gene Expression: Details (Eukaryotes) Pre-mRNA Secondary Structure Prediction Aids Splice Site Recognition


Transcription

Pre-mRNA Secondary Structure Prediction Aids Splice Site Recognition
Donald J. Patterson, Ken Yasuhara, Walter L. Ruzzo
Pacific Symposium on Biocomputing, January 3-7, 2002
University of Washington Computational Molecular Biology Group

Gene Expression: Details (Eukaryotes)
[Figure: DNA -> pre-mRNA -> mRNA -> protein, showing the cell, the nucleus, and a gene on the DNA (chromosome).]

Architecture of a Gene
Pre-mRNAs transcribed from most genes contain introns, which must be spliced out to form useful mRNAs.
[Figure: a gene with exons and introns (a, b, c); the pre-mRNA is spliced into an mRNA, which encodes a protein.]

Characteristics of human genes (Nature, 2/2001, Table 21)
Internal exon: median 122 bp, mean 145 bp. Sample: RefSeq alignments to draft genome sequence, with confirmed intron boundaries (43,317 exons).
Exon number: median 7, mean 8.8. Sample: RefSeq alignments to finished sequence (3,501 genes).
Introns: median 1,023 bp, mean 3,365 bp. Sample: RefSeq alignments to finished sequence (27,238 introns).
3' UTR: median 400 bp, mean 770 bp. Sample: confirmed by mRNA or EST on chromosome 22 (689).
5' UTR: median 240 bp, mean 300 bp. Sample: confirmed by mRNA or EST on chromosome 22 (463).
Coding sequence (CDS): median 1,100 bp (367 aa), mean 1,340 bp (447 aa). Sample: selected RefSeq entries (1,804)*.
Genomic extent: median 14 kb, mean 27 kb. Sample: selected RefSeq entries (1,804)*.
* The 1,804 selected RefSeq entries were those with full-length unambiguous alignment to finished sequence.

Relevance of Splice Prediction
Reference shown: "Mechanical Devices of the Spliceosome: Motors, Clocks, Springs, and Things", Jonathan P. Staley and Christine Guthrie, Cell, Vol. 92, February 6, 1998.
Splice site prediction is critical to eukaryotic gene prediction.
The average human gene has 8.8 exons.
Genes with over 175 exons are known.
Current primary sequence models do not display the same discriminatory power that cells exhibit in vivo.
A small per-site error rate compounds.

Hypothesis
Secondary structure contains information useful for predicting splice site location. This information is in addition to primary sequence information.
Specific instances of secondary structure variation affecting the splicing process.
[Figure: possible acceptor splice sites in pre-mRNA sequences are passed to primary sequence prediction and secondary structure prediction, which together yield predictions.]
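The remark that a small per-site error rate compounds can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes errors at different splice sites are independent and uses a hypothetical per-site accuracy of 99%, neither of which comes from the slides. The function name is also made up for this example.

```python
def prob_all_sites_correct(num_exons: float, per_site_accuracy: float) -> float:
    """Probability that every splice site of one gene is predicted correctly.

    Assumes independent errors across sites (a simplification for illustration).
    """
    # A gene with n exons has n - 1 introns, and each intron has a donor
    # and an acceptor site, giving 2 * (n - 1) splice sites.
    num_sites = 2 * (num_exons - 1)
    return per_site_accuracy ** num_sites


# 8.8 is the mean exon count quoted above; 175 is the large-gene case.
for exons in (8.8, 175):
    p = prob_all_sites_correct(exons, 0.99)  # 0.99 is a hypothetical accuracy
    print(f"{exons} exons: P(all splice sites correct) = {p:.3f}")
```

Under these assumptions, even a 1% per-site error leaves a noticeable chance of at least one mistake in an average gene, and the chance of a fully correct call drops sharply for genes with many exons.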

Data Set
Drawn from 462 unrelated, annotated, multi-exon human genes with standard splicing (Reese 97).
1,980 acceptor splice sites (3' end of intron).
1,980 non-sites, selected randomly, aligned to an AG consensus, and located within 100 bases of an annotated acceptor splice site.

What's in the Primary Sequence?
[Figure: an intron flanked by exons, drawn 5' to 3'.]
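The non-site selection rule above (an AG dinucleotide, within 100 bases of an annotated acceptor site, chosen at random) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the convention that a site is indexed by the position of the A in AG, and the fixed random seed are all assumptions.

```python
import random


def candidate_non_sites(seq, true_sites, max_dist=100):
    """Positions of AG dinucleotides near, but not at, annotated acceptor sites.

    A position is the index of the 'A' in 'AG' (a convention assumed here).
    """
    true_set = set(true_sites)
    candidates = []
    for i in range(len(seq) - 1):
        if seq[i:i + 2] != "AG" or i in true_set:
            continue
        # Keep the decoy only if some annotated site lies within max_dist bases.
        if any(abs(i - s) <= max_dist for s in true_set):
            candidates.append(i)
    return candidates


def sample_non_sites(seq, true_sites, k, max_dist=100, seed=0):
    """Randomly pick up to k decoy positions from the candidate pool."""
    rng = random.Random(seed)
    pool = candidate_non_sites(seq, true_sites, max_dist)
    return sorted(rng.sample(pool, min(k, len(pool))))
```

Sampling as many decoys as true sites gives a balanced set, as in the 1,980 versus 1,980 split described above.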

What's in the Primary Sequence?
[Figure: per-position frequencies of A, C, G, and T across the intron, the acceptor splice site, and the exon.]
Weight Matrix Model (0th-order Markov model)

Sequence-based Metric
1st-order Weight Array Matrix (WAM) / Markov model: $P_i(N_i \in \{A,C,G,U\} \mid N_{i-1} \in \{A,C,G,U\})$
Training: generate two conditional probability tables for positions (-21, +3), one from positive examples and one from negative examples.
Testing: for each sequence $x$, calculate its log likelihood ratio:
\[ \log_{10}\left( \frac{\mathrm{WAM}^{+}(x)}{\mathrm{WAM}^{-}(x)} \right) \]
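The training and testing steps of the first-order model can be sketched in a few lines. This is a minimal sketch under stated assumptions: the input sequences are already aligned and cut to the same window, the first position is scored with an unconditional frequency, and add-one pseudocounts are used. None of these details, nor the function names, are given in the slides.

```python
import math

ALPHABET = "ACGU"


def train_wam(seqs, pseudocount=1.0):
    """Estimate P_i(N_i | N_{i-1}) per position from aligned, equal-length sequences."""
    length = len(seqs[0])
    # first[a]: count of base a at position 0 (no previous base to condition on).
    first = {a: pseudocount for a in ALPHABET}
    # trans[i][prev][cur]: count of cur at position i given prev at position i - 1.
    trans = [{p: {c: pseudocount for c in ALPHABET} for p in ALPHABET}
             for _ in range(length)]
    for s in seqs:
        first[s[0]] += 1
        for i in range(1, length):
            trans[i][s[i - 1]][s[i]] += 1
    total = sum(first.values())
    first_p = {a: n / total for a, n in first.items()}
    trans_p = [{p: {c: n / sum(row.values()) for c, n in row.items()}
                for p, row in pos.items()} for pos in trans]
    return first_p, trans_p


def log10_likelihood(model, x):
    """log10 probability of sequence x under one trained model."""
    first_p, trans_p = model
    ll = math.log10(first_p[x[0]])
    for i in range(1, len(x)):
        ll += math.log10(trans_p[i][x[i - 1]][x[i]])
    return ll


def wam_score(pos_model, neg_model, x):
    """log10( WAM+(x) / WAM-(x) ): positive values favor a true acceptor site."""
    return log10_likelihood(pos_model, x) - log10_likelihood(neg_model, x)
```

Training one model on true sites and one on non-sites, then taking the difference of log likelihoods, gives the ratio in the formula above as a single feature per candidate site.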

Secondary Structure Metrics
1. Optimal Folding Energy
2. Max Helix score
3. Neighbor Pairing Correlation Model

1. Optimal Folding Energy
Example sequence: ...CUGCUUUCUCCCCUCUCAGGGACUUACAGUUUGAGAUGC...
[Figure: folds of the example sequence labeled with free energy in kcal/mole; one label reads -2.0 kcal/mole.]

2. Max Helix
What is the highest probability that a helix will form nearby? Calculate $\mathrm{HStart}_x$ and $\mathrm{HEnd}_x$, then take
\[ \mathrm{MaxHelix}_i = \max_{x \in (i-5,\, i+5)} \left( \mathrm{HStart}_x,\ \mathrm{HEnd}_x \right) \]
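The Max Helix formula is a windowed maximum, which the following sketch spells out. It assumes that per-position probabilities of a helix starting or ending (here the hypothetical lists `h_start` and `h_end`) have already been computed by some folding tool, and it reads the interval (i - 5, i + 5) as open at both ends; the slides do not say how either is done.

```python
def max_helix(h_start, h_end, i, radius=5):
    """MaxHelix_i: largest helix start/end probability for x in (i - radius, i + radius).

    h_start[x] and h_end[x] are assumed to be precomputed probabilities that a
    helix starts or ends at position x. The interval is treated as open, so
    i - radius and i + radius themselves are excluded (an assumption).
    """
    lo = max(0, i - radius + 1)
    hi = min(len(h_start), i + radius)  # range() stops before hi
    return max(max(h_start[x], h_end[x]) for x in range(lo, hi))
```

The result is one number per candidate site, which can sit next to the sequence score in the feature vector passed to a classifier.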

3. Neighbor Pairing Correlation Model (NCM)
Change the pre-mRNA alphabet from nucleotides to structural symbols: unpaired base, paired base, and paired and stacked base.
2nd-order Markov model: $P_i(N_i \mid N_{i-1} \wedge N_{i-2})$, where each of $N_i$, $N_{i-1}$, $N_{i-2}$ is one of the three structural symbols.
Training: generate two conditional probability tables for positions (-50, +3), one from positive examples and one from negative examples.
Testing: for each sequence $x$, calculate its log likelihood ratio:
\[ \log_{10}\left( \frac{\mathrm{NCM}^{+}(x)}{\mathrm{NCM}^{-}(x)} \right) \]
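The alphabet change and the second-order model can be sketched together. Several things here are assumptions made for illustration: the predicted structure is given as a balanced dot-bracket string, the symbols U, P, and S stand for unpaired, paired, and paired-and-stacked, a base counts as stacked when the pair directly inside or outside its own pair also exists, the first two positions are left unscored, and add-one pseudocounts are used. The slides do not specify any of these.

```python
import math

SYMBOLS = "UPS"  # unpaired, paired, paired and stacked (labels assumed here)


def pair_table(db):
    """Partner index (or None) for each position of a balanced dot-bracket string."""
    partner = [None] * len(db)
    stack = []
    for i, ch in enumerate(db):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def structural_symbols(db):
    """Rewrite a dot-bracket structure in the three-symbol structural alphabet."""
    partner = pair_table(db)
    n = len(db)
    out = []
    for i, j in enumerate(partner):
        if j is None:
            out.append("U")
            continue
        # Stacked: the neighboring pair (i + 1, j - 1) or (i - 1, j + 1) also exists.
        a = i + 1 < n and j - 1 >= 0 and i + 1 != j and partner[i + 1] == j - 1
        b = i - 1 >= 0 and j + 1 < n and i - 1 != j and partner[i - 1] == j + 1
        out.append("S" if a or b else "P")
    return "".join(out)


def train_second_order(seqs, pseudocount=1.0):
    """Estimate P_i(N_i | N_{i-1}, N_{i-2}) for positions i >= 2."""
    tables = []
    for i in range(2, len(seqs[0])):
        counts = {(a, b): {c: pseudocount for c in SYMBOLS}
                  for a in SYMBOLS for b in SYMBOLS}
        for s in seqs:
            counts[(s[i - 2], s[i - 1])][s[i]] += 1
        tables.append({ctx: {c: k / sum(row.values()) for c, k in row.items()}
                       for ctx, row in counts.items()})
    return tables


def log10_likelihood(tables, x):
    """log10 probability of symbol string x, skipping its first two positions."""
    return sum(math.log10(tables[i - 2][(x[i - 2], x[i - 1])][x[i]])
               for i in range(2, len(x)))


def ncm_score(pos_tables, neg_tables, x):
    """log10( NCM+(x) / NCM-(x) ) for one structural-symbol string."""
    return log10_likelihood(pos_tables, x) - log10_likelihood(neg_tables, x)
```

For example, `structural_symbols("((...))")` gives `"SSUUUSS"`, while an isolated pair such as `"(...)"` gives `"PUUUP"`.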

Classifiers
Decision trees: Quinlan's C4.5.
Support vector machines: Noble's svm 1.1, radial basis kernel, degree 2.
Both take a vector of statistics and produce a yes/no binary classifier.

Results (Decision Trees)
[Table: columns are Features, Mean Accuracy (%), % Error Reduction, and p (Wilcoxon p-value under 10-fold cross-validation). Rows are WAM (baseline); WAM,FE; WAM,FE,NCM; WAM,FE,MH; WAM,FE,NCM,MH. The cell values are not preserved, apart from a single entry, 5.5.]
WAM = Weight Array Matrix (primary sequence method)
FE = Optimal Free Energy
MH = Max Helix
NCM = Neighbor Pairing Correlation Matrix

LLR of Base Pairing
Acceptor splice sites are 25% more likely to pair at position -2.
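The "% Error Reduction" column can be read as the relative drop in error rate compared with the WAM baseline. The slides do not define the column, so the formula below is an assumed, commonly used definition, and the accuracies in the usage line are hypothetical numbers chosen only to show the arithmetic.

```python
def error_reduction_pct(baseline_acc_pct: float, new_acc_pct: float) -> float:
    """Relative reduction in error rate, in percent, versus a baseline.

    Assumed definition: 100 * (baseline error - new error) / baseline error.
    """
    baseline_err = 100.0 - baseline_acc_pct
    new_err = 100.0 - new_acc_pct
    return 100.0 * (baseline_err - new_err) / baseline_err


# Hypothetical accuracies, not values from the results table.
print(error_reduction_pct(90.0, 91.0))
```

Under this definition, a one-point gain in accuracy matters more the higher the baseline accuracy already is, which is why the table reports error reduction alongside mean accuracy.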

Results

LLR of Helix Initiation
Acceptor splice sites are 35% more likely to initiate a helix at positions -2 and -1.

LLR of Helix Continuation
Acceptor splice sites are 45% more likely to continue a helix through the splice site.
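The "more likely" figures can be related to a per-position log likelihood ratio (LLR). The sketch below assumes that "25% more likely" means a probability ratio of 1.25 between sites and non-sites, and that each example is a list of 0/1 flags (for instance, "base is paired") per aligned position. Both are readings made for illustration, and the function name is hypothetical.

```python
import math


def positional_llr(site_flags, non_site_flags):
    """log10 of P(event | site) / P(event | non-site) at each aligned position.

    Each argument is a list of equal-length 0/1 rows, one row per example.
    Assumes the event occurs at least once in both sets at every position;
    otherwise a pseudocount would be needed to avoid log(0) or division by zero.
    """
    llrs = []
    for i in range(len(site_flags[0])):
        p_site = sum(row[i] for row in site_flags) / len(site_flags)
        p_non = sum(row[i] for row in non_site_flags) / len(non_site_flags)
        llrs.append(math.log10(p_site / p_non))
    return llrs
```

Under this reading, ratios of 1.25, 1.35, and 1.45 correspond to LLR values of `math.log10(1.25)`, `math.log10(1.35)`, and `math.log10(1.45)`, all small positive numbers.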

Helix Formed at Splice Site
[Table: Pr(No Helix), Pr(Helix), Pr(Folds Left), and Pr(Folds Right) for Acceptor and Non-Acceptor sites.]

Conclusions
Secondary structure statistics correlate with splice site location.
Our models (Max Helix, NCM) can represent some of the relevant secondary structure.
These models capture correlations that current primary sequence models don't capture.

Future Work
Other organisms: Oryza sativa (rice) in progress.
Donor splice sites.
Other features?
More structure models: Stochastic Context Free Grammars?

Acknowledgements
Don Patterson, Ken Yasuhara, Jeff Stoner, Kevin Chu

More Info
UW CSE Computational Biology Group


Pre-mRNA Secondary Structure Prediction Aids Splice Site Prediction. Donald J. Patterson, Ken Yasuhara, ... Pacific Symposium on Biocomputing 2002, Altman et al., eds., Jan. 2002, World Scientific, pp. 223-234.

More information

Gene Finding in Eukaryotes

Gene Finding in Eukaryotes Gene Finding in Eukaryotes Jan-Jaap Wesselink jjwesselink@cnio.es Computational and Structural Biology Group, Centro Nacional de Investigaciones Oncológicas Madrid, April 2008 Jan-Jaap Wesselink jjwesselink@cnio.es

More information

Computational Identification and Prediction of Tissue-Specific Alternative Splicing in H. Sapiens. Eric Van Nostrand CS229 Final Project

Computational Identification and Prediction of Tissue-Specific Alternative Splicing in H. Sapiens. Eric Van Nostrand CS229 Final Project Computational Identification and Prediction of Tissue-Specific Alternative Splicing in H. Sapiens. Eric Van Nostrand CS229 Final Project Introduction RNA splicing is a critical step in eukaryotic gene

More information

Genetics. Instructor: Dr. Jihad Abdallah Transcription of DNA

Genetics. Instructor: Dr. Jihad Abdallah Transcription of DNA Genetics Instructor: Dr. Jihad Abdallah Transcription of DNA 1 3.4 A 2 Expression of Genetic information DNA Double stranded In the nucleus Transcription mrna Single stranded Translation In the cytoplasm

More information

Variant Classification. Author: Mike Thiesen, Golden Helix, Inc.

Variant Classification. Author: Mike Thiesen, Golden Helix, Inc. Variant Classification Author: Mike Thiesen, Golden Helix, Inc. Overview Sequencing pipelines are able to identify rare variants not found in catalogs such as dbsnp. As a result, variants in these datasets

More information

MODULE 3: TRANSCRIPTION PART II

MODULE 3: TRANSCRIPTION PART II MODULE 3: TRANSCRIPTION PART II Lesson Plan: Title S. CATHERINE SILVER KEY, CHIYEDZA SMALL Transcription Part II: What happens to the initial (premrna) transcript made by RNA pol II? Objectives Explain

More information

MODULE 4: SPLICING. Removal of introns from messenger RNA by splicing

MODULE 4: SPLICING. Removal of introns from messenger RNA by splicing Last update: 05/10/2017 MODULE 4: SPLICING Lesson Plan: Title MEG LAAKSO Removal of introns from messenger RNA by splicing Objectives Identify splice donor and acceptor sites that are best supported by

More information

Sebastian Jaenicke. trnascan-se. Improved detection of trna genes in genomic sequences

Sebastian Jaenicke. trnascan-se. Improved detection of trna genes in genomic sequences Sebastian Jaenicke trnascan-se Improved detection of trna genes in genomic sequences trnascan-se Improved detection of trna genes in genomic sequences 1/15 Overview 1. trnas 2. Existing approaches 3. trnascan-se

More information

SUPPLEMENTARY FIGURES: Supplementary Figure 1

SUPPLEMENTARY FIGURES: Supplementary Figure 1 SUPPLEMENTARY FIGURES: Supplementary Figure 1 Supplementary Figure 1. Glioblastoma 5hmC quantified by paired BS and oxbs treated DNA hybridized to Infinium DNA methylation arrays. Workflow depicts analytic

More information

Studying Alternative Splicing

Studying Alternative Splicing Studying Alternative Splicing Meelis Kull PhD student in the University of Tartu supervisor: Jaak Vilo CS Theory Days Rõuge 27 Overview Alternative splicing Its biological function Studying splicing Technology

More information

SpliceDB: database of canonical and non-canonical mammalian splice sites

SpliceDB: database of canonical and non-canonical mammalian splice sites 2001 Oxford University Press Nucleic Acids Research, 2001, Vol. 29, No. 1 255 259 SpliceDB: database of canonical and non-canonical mammalian splice sites M.Burset,I.A.Seledtsov 1 and V. V. Solovyev* The

More information

Annotation of Drosophila mojavensis fosmid 8 Priya Srikanth Bio 434W

Annotation of Drosophila mojavensis fosmid 8 Priya Srikanth Bio 434W Annotation of Drosophila mojavensis fosmid 8 Priya Srikanth Bio 434W 5.1.2007 Overview High-quality finished sequence is much more useful for research once it is annotated. Annotation is a fundamental

More information

genomics for systems biology / ISB2020 RNA sequencing (RNA-seq)

genomics for systems biology / ISB2020 RNA sequencing (RNA-seq) RNA sequencing (RNA-seq) Module Outline MO 13-Mar-2017 RNA sequencing: Introduction 1 WE 15-Mar-2017 RNA sequencing: Introduction 2 MO 20-Mar-2017 Paper: PMID 25954002: Human genomics. The human transcriptome

More information

Germline mutations in shelterin complex genes are associated with familial chronic lymphocytic leukemia

Germline mutations in shelterin complex genes are associated with familial chronic lymphocytic leukemia UEMEN NOMON ermline mutations in shelterin complex genes are associated with familial chronic lymphocytic leukemia Helen E. peedy 1, Ben Kinnersley 1, Daniel Chubb 1, eter Broderick 1, hilip J. aw 1, Kevin

More information

RECAP (1)! In eukaryotes, large primary transcripts are processed to smaller, mature mrnas.! What was first evidence for this precursorproduct

RECAP (1)! In eukaryotes, large primary transcripts are processed to smaller, mature mrnas.! What was first evidence for this precursorproduct RECAP (1) In eukaryotes, large primary transcripts are processed to smaller, mature mrnas. What was first evidence for this precursorproduct relationship? DNA Observation: Nuclear RNA pool consists of

More information

Pre-mRNA has introns The splicing complex recognizes semiconserved sequences

Pre-mRNA has introns The splicing complex recognizes semiconserved sequences Adding a 5 cap Lecture 4 mrna splicing and protein synthesis Another day in the life of a gene. Pre-mRNA has introns The splicing complex recognizes semiconserved sequences Introns are removed by a process

More information

Gene finding. kuobin/

Gene finding.  kuobin/ Gene finding KUO-BIN LI, PH.D. http://www.bii.a-star.edu.sg/ kuobin/ Bioinformatics Institute 30 Medical Drive, Level 1, IMCB Building Singapore 117609 Republic of Singapore Gene finding (LSM5191) p.1

More information

White Paper Estimating Complex Phenotype Prevalence Using Predictive Models

White Paper Estimating Complex Phenotype Prevalence Using Predictive Models White Paper 23-12 Estimating Complex Phenotype Prevalence Using Predictive Models Authors: Nicholas A. Furlotte Aaron Kleinman Robin Smith David Hinds Created: September 25 th, 2015 September 25th, 2015

More information

Alternative Splicing and Genomic Stability

Alternative Splicing and Genomic Stability Alternative Splicing and Genomic Stability Kevin Cahill cahill@unm.edu http://dna.phys.unm.edu/ Abstract In a cell that uses alternative splicing, the total length of all the exons is far less than in

More information

Alternative RNA processing: Two examples of complex eukaryotic transcription units and the effect of mutations on expression of the encoded proteins.

Alternative RNA processing: Two examples of complex eukaryotic transcription units and the effect of mutations on expression of the encoded proteins. Alternative RNA processing: Two examples of complex eukaryotic transcription units and the effect of mutations on expression of the encoded proteins. The RNA transcribed from a complex transcription unit

More information

Annotation of Chimp Chunk 2-10 Jerome M Molleston 5/4/2009

Annotation of Chimp Chunk 2-10 Jerome M Molleston 5/4/2009 Annotation of Chimp Chunk 2-10 Jerome M Molleston 5/4/2009 1 Abstract A stretch of chimpanzee DNA was annotated using tools including BLAST, BLAT, and Genscan. Analysis of Genscan predicted genes revealed

More information

Prediction of Alternative Splice Sites in Human Genes

Prediction of Alternative Splice Sites in Human Genes San Jose State University SJSU ScholarWorks Master's Projects Master's Theses and Graduate Research 2007 Prediction of Alternative Splice Sites in Human Genes Douglas Simmons San Jose State University

More information

Keywords Gene prediction, artificial neural network, donor splice site, acceptor splice site, Markov chain, fuzzy logic

Keywords Gene prediction, artificial neural network, donor splice site, acceptor splice site, Markov chain, fuzzy logic Volume 3, Issue 7, July 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Neural Network

More information

RNA Processing in Eukaryotes *

RNA Processing in Eukaryotes * OpenStax-CNX module: m44532 1 RNA Processing in Eukaryotes * OpenStax This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 By the end of this section, you

More information

Bioinformatics. Sequence Analysis: Part III. Pattern Searching and Gene Finding. Fran Lewitter, Ph.D. Head, Biocomputing Whitehead Institute

Bioinformatics. Sequence Analysis: Part III. Pattern Searching and Gene Finding. Fran Lewitter, Ph.D. Head, Biocomputing Whitehead Institute Bioinformatics Sequence Analysis: Part III. Pattern Searching and Gene Finding Fran Lewitter, Ph.D. Head, Biocomputing Whitehead Institute Course Syllabus Jan 7 Jan 14 Jan 21 Jan 28 Feb 4 Feb 11 Feb 18

More information

Polyomaviridae. Spring

Polyomaviridae. Spring Polyomaviridae Spring 2002 331 Antibody Prevalence for BK & JC Viruses Spring 2002 332 Polyoma Viruses General characteristics Papovaviridae: PA - papilloma; PO - polyoma; VA - vacuolating agent a. 45nm

More information

Mechanisms of alternative splicing regulation

Mechanisms of alternative splicing regulation Mechanisms of alternative splicing regulation The number of mechanisms that are known to be involved in splicing regulation approximates the number of splicing decisions that have been analyzed in detail.

More information

Figure 1: Final annotation map of Contig 9

Figure 1: Final annotation map of Contig 9 Introduction With rapid advances in sequencing technology, particularly with the development of second and third generation sequencing, genomes for organisms from all kingdoms and many phyla have been

More information

Applying One-vs-One and One-vs-All Classifiers in k-nearest Neighbour Method and Support Vector Machines to an Otoneurological Multi-Class Problem

Applying One-vs-One and One-vs-All Classifiers in k-nearest Neighbour Method and Support Vector Machines to an Otoneurological Multi-Class Problem Oral Presentation at MIE 2011 30th August 2011 Oslo Applying One-vs-One and One-vs-All Classifiers in k-nearest Neighbour Method and Support Vector Machines to an Otoneurological Multi-Class Problem Kirsi

More information

Lecture 8 Understanding Transcription RNA-seq analysis. Foundations of Computational Systems Biology David K. Gifford

Lecture 8 Understanding Transcription RNA-seq analysis. Foundations of Computational Systems Biology David K. Gifford Lecture 8 Understanding Transcription RNA-seq analysis Foundations of Computational Systems Biology David K. Gifford 1 Lecture 8 RNA-seq Analysis RNA-seq principles How can we characterize mrna isoform

More information

Contents. Just Classifier? Rules. Rules: example. Classification Rule Generation for Bioinformatics. Rule Extraction from a trained network

Contents. Just Classifier? Rules. Rules: example. Classification Rule Generation for Bioinformatics. Rule Extraction from a trained network Contents Classification Rule Generation for Bioinformatics Hyeoncheol Kim Rule Extraction from Neural Networks Algorithm Ex] Promoter Domain Hybrid Model of Knowledge and Learning Knowledge refinement

More information

Molecular Biology (BIOL 4320) Exam #2 April 22, 2002

Molecular Biology (BIOL 4320) Exam #2 April 22, 2002 Molecular Biology (BIOL 4320) Exam #2 April 22, 2002 Name SS# This exam is worth a total of 100 points. The number of points each question is worth is shown in parentheses after the question number. Good

More information

1. Identify and characterize interesting phenomena! 2. Characterization should stimulate some questions/models! 3. Combine biochemistry and genetics

1. Identify and characterize interesting phenomena! 2. Characterization should stimulate some questions/models! 3. Combine biochemistry and genetics 1. Identify and characterize interesting phenomena! 2. Characterization should stimulate some questions/models! 3. Combine biochemistry and genetics to gain mechanistic insight! 4. Return to step 2, as

More information

Supplemental Figure S1. Expression of Cirbp mrna in mouse tissues and NIH3T3 cells.

Supplemental Figure S1. Expression of Cirbp mrna in mouse tissues and NIH3T3 cells. SUPPLEMENTAL FIGURE AND TABLE LEGENDS Supplemental Figure S1. Expression of Cirbp mrna in mouse tissues and NIH3T3 cells. A) Cirbp mrna expression levels in various mouse tissues collected around the clock

More information

Bioinformatics Laboratory Exercise

Bioinformatics Laboratory Exercise Bioinformatics Laboratory Exercise Biology is in the midst of the genomics revolution, the application of robotic technology to generate huge amounts of molecular biology data. Genomics has led to an explosion

More information

TRANSCRIPTION. DNA à mrna

TRANSCRIPTION. DNA à mrna TRANSCRIPTION DNA à mrna Central Dogma Animation DNA: The Secret of Life (from PBS) http://www.youtube.com/watch? v=41_ne5ms2ls&list=pl2b2bd56e908da696&index=3 Transcription http://highered.mcgraw-hill.com/sites/0072507470/student_view0/

More information

Circular RNAs (circrnas) act a stable mirna sponges

Circular RNAs (circrnas) act a stable mirna sponges Circular RNAs (circrnas) act a stable mirna sponges cernas compete for mirnas Ancestal mrna (+3 UTR) Pseudogene RNA (+3 UTR homolgy region) The model holds true for all RNAs that share a mirna binding

More information

Ambient temperature regulated flowering time

Ambient temperature regulated flowering time Ambient temperature regulated flowering time Applications of RNAseq RNA- seq course: The power of RNA-seq June 7 th, 2013; Richard Immink Overview Introduction: Biological research question/hypothesis

More information

Recognition of HIV-1 subtypes and antiretroviral drug resistance using weightless neural networks

Recognition of HIV-1 subtypes and antiretroviral drug resistance using weightless neural networks Recognition of HIV-1 subtypes and antiretroviral drug resistance using weightless neural networks Caio R. Souza 1, Flavio F. Nobre 1, Priscila V.M. Lima 2, Robson M. Silva 2, Rodrigo M. Brindeiro 3, Felipe

More information

Finding subtle mutations with the Shannon human mrna splicing pipeline

Finding subtle mutations with the Shannon human mrna splicing pipeline Finding subtle mutations with the Shannon human mrna splicing pipeline Presentation at the CLC bio Medical Genomics Workshop American Society of Human Genetics Annual Meeting November 9, 2012 Peter K Rogan

More information

DNA codes for RNA, which guides protein synthesis.

DNA codes for RNA, which guides protein synthesis. Section 3: DNA codes for RNA, which guides protein synthesis. K What I Know W What I Want to Find Out L What I Learned Vocabulary Review synthesis New RNA messenger RNA ribosomal RNA transfer RNA transcription

More information

Supplementary Figure 1. CFTR protein structure and domain architecture.

Supplementary Figure 1. CFTR protein structure and domain architecture. A Plasma Membrane NH ₂ COOH Supplementary Figure. CFT protein structure and domain architecture. (A) Open state CFT homology model, ribbon representation from Serohijos et al. 8 PNAS 5:356. CFT domains

More information

Expert-guided Visual Exploration (EVE) for patient stratification. Hamid Bolouri, Lue-Ping Zhao, Eric C. Holland

Expert-guided Visual Exploration (EVE) for patient stratification. Hamid Bolouri, Lue-Ping Zhao, Eric C. Holland Expert-guided Visual Exploration (EVE) for patient stratification Hamid Bolouri, Lue-Ping Zhao, Eric C. Holland Oncoscape.sttrcancer.org Paul Lisa Ken Jenny Desert Eric The challenge Given - patient clinical

More information

An Introduction to Genetics. 9.1 An Introduction to Genetics. An Introduction to Genetics. An Introduction to Genetics. DNA Deoxyribonucleic acid

An Introduction to Genetics. 9.1 An Introduction to Genetics. An Introduction to Genetics. An Introduction to Genetics. DNA Deoxyribonucleic acid An Introduction to Genetics 9.1 An Introduction to Genetics DNA Deoxyribonucleic acid Information blueprint for life Reproduction, development, and everyday functioning of living things Only 2% coding

More information

Hands-On Ten The BRCA1 Gene and Protein

Hands-On Ten The BRCA1 Gene and Protein Hands-On Ten The BRCA1 Gene and Protein Objective: To review transcription, translation, reading frames, mutations, and reading files from GenBank, and to review some of the bioinformatics tools, such

More information

Supervised Learner for the Prediction of Hi-C Interaction Counts and Determination of Influential Features. Tyler Yue Lab

Supervised Learner for the Prediction of Hi-C Interaction Counts and Determination of Influential Features. Tyler Yue Lab Supervised Learner for the Prediction of Hi-C Interaction Counts and Determination of Influential Features Tyler Derr @ Yue Lab tsd5037@psu.edu Background Hi-C is a chromosome conformation capture (3C)

More information

CH 9: The Cell Cycle Overview. Cellular Organization of the Genetic Material. Distribution of Chromosomes During Eukaryotic Cell Division

CH 9: The Cell Cycle Overview. Cellular Organization of the Genetic Material. Distribution of Chromosomes During Eukaryotic Cell Division CH 9: The Cell Cycle Overview The ability of organisms to produce more of their own kind best distinguishes living things from nonliving matter The continuity of life is based on the reproduction of cells,

More information

Molecular Cell Biology - Problem Drill 10: Gene Expression in Eukaryotes

Molecular Cell Biology - Problem Drill 10: Gene Expression in Eukaryotes Molecular Cell Biology - Problem Drill 10: Gene Expression in Eukaryotes Question No. 1 of 10 1. Which of the following statements about gene expression control in eukaryotes is correct? Question #1 (A)

More information

Evaluating Classifiers for Disease Gene Discovery

Evaluating Classifiers for Disease Gene Discovery Evaluating Classifiers for Disease Gene Discovery Kino Coursey Lon Turnbull khc0021@unt.edu lt0013@unt.edu Abstract Identification of genes involved in human hereditary disease is an important bioinfomatics

More information

Figure mouse globin mrna PRECURSOR RNA hybridized to cloned gene (genomic). mouse globin MATURE mrna hybridized to cloned gene (genomic).

Figure mouse globin mrna PRECURSOR RNA hybridized to cloned gene (genomic). mouse globin MATURE mrna hybridized to cloned gene (genomic). Splicing Figure 14.3 mouse globin mrna PRECURSOR RNA hybridized to cloned gene (genomic). mouse globin MATURE mrna hybridized to cloned gene (genomic). mrna Splicing rrna and trna are also sometimes spliced;

More information

Processing of RNA II Biochemistry 302. February 13, 2006

Processing of RNA II Biochemistry 302. February 13, 2006 Processing of RNA II Biochemistry 302 February 13, 2006 Precursor mrna: introns and exons Intron: Transcribed RNA sequence removed from precursor RNA during the process of maturation (for class II genes:

More information

Supplementary Figure S1. Gene expression analysis of epidermal marker genes and TP63.

Supplementary Figure S1. Gene expression analysis of epidermal marker genes and TP63. Supplementary Figure Legends Supplementary Figure S1. Gene expression analysis of epidermal marker genes and TP63. A. Screenshot of the UCSC genome browser from normalized RNAPII and RNA-seq ChIP-seq data

More information

Introduction retroposon

Introduction retroposon 17.1 - Introduction A retrovirus is an RNA virus able to convert its sequence into DNA by reverse transcription A retroposon (retrotransposon) is a transposon that mobilizes via an RNA form; the DNA element

More information

Protein Synthesis

Protein Synthesis Protein Synthesis 10.6-10.16 Objectives - To explain the central dogma - To understand the steps of transcription and translation in order to explain how our genes create proteins necessary for survival.

More information

Regulation of Gene Expression in Eukaryotes

Regulation of Gene Expression in Eukaryotes Ch. 19 Regulation of Gene Expression in Eukaryotes BIOL 222 Differential Gene Expression in Eukaryotes Signal Cells in a multicellular eukaryotic organism genetically identical differential gene expression

More information

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq Philipp Bucher Wednesday January 21, 2009 SIB graduate school course EPFL, Lausanne ChIP-seq against histone variants: Biological

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 Exam policy: This exam allows two one-page, two-sided cheat sheets (i.e. 4 sides); No other materials. Time: 2 hours. Be sure to write

More information

Last time we talked about the few steps in viral replication cycle and the un-coating stage:

Last time we talked about the few steps in viral replication cycle and the un-coating stage: Zeina Al-Momani Last time we talked about the few steps in viral replication cycle and the un-coating stage: Un-coating: is a general term for the events which occur after penetration, we talked about

More information

(A) Cell membrane (B) Ribosome (C) DNA (D) Nucleus (E) Plasmids. A. Incorrect! Both prokaryotic and eukaryotic cells have cell membranes.

(A) Cell membrane (B) Ribosome (C) DNA (D) Nucleus (E) Plasmids. A. Incorrect! Both prokaryotic and eukaryotic cells have cell membranes. High School Biology - Problem Drill 03: The Cell No. 1 of 10 1. Which of the following is NOT found in prokaryotic cells? #01 (A) Cell membrane (B) Ribosome (C) DNA (D) Nucleus (E) Plasmids Both prokaryotic

More information

A Machine Learning Model for Discovery of Protein Isoforms as Biomarkers

A Machine Learning Model for Discovery of Protein Isoforms as Biomarkers University of Windsor Scholarship at UWindsor Electronic Theses and Dissertations 2016 A Machine Learning Model for Discovery of Protein Isoforms as Biomarkers Manal Alshehri University of Windsor Follow

More information

Comparing Multifunctionality and Association Information when Classifying Oncogenes and Tumor Suppressor Genes

Comparing Multifunctionality and Association Information when Classifying Oncogenes and Tumor Suppressor Genes 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

More information

Processing of RNA II Biochemistry 302. February 18, 2004 Bob Kelm

Processing of RNA II Biochemistry 302. February 18, 2004 Bob Kelm Processing of RNA II Biochemistry 302 February 18, 2004 Bob Kelm What s an intron? Transcribed sequence removed during the process of mrna maturation Discovered by P. Sharp and R. Roberts in late 1970s

More information

Supporting Information
