Gene finding. kuobin/


1 Gene finding KUO-BIN LI, PH.D. kuobin/ Bioinformatics Institute 30 Medical Drive, Level 1, IMCB Building Singapore Republic of Singapore Gene finding (LSM5191) p.1

2 Gene structure donor: 5'-AG|GUAAGU-3'; acceptor: 5'-PyPyPyPyPyPyNCAG|-3' (| marks the exon/intron boundary; Py = pyrimidine, N = any base)
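The two consensus sequences on this slide can be turned into a naive pattern search. The sketch below is only an illustration: it works on DNA (T in place of U), demands an exact match to the consensus, and the function names `donor_sites` and `acceptor_sites` are made up for this example. Real splice sites deviate from the consensus, so practical gene finders score sites with probabilistic models rather than exact patterns.

```python
import re

# Exact consensus patterns, written for DNA (T instead of U).
# Lookaheads are used so that overlapping matches are all reported.
DONOR = re.compile(r"(?=AGGTAAGT)")              # exon AG | GTAAGT intron
ACCEPTOR = re.compile(r"(?=[CT]{6}[ACGT]CAG)")   # intron Py x6, N, CAG | exon

def donor_sites(dna):
    """Return 0-based positions of the first intron base (the G of GT)."""
    return [m.start() + 2 for m in DONOR.finditer(dna.upper())]

def acceptor_sites(dna):
    """Return 0-based positions of the first exon base after ...NCAG."""
    return [m.start() + 10 for m in ACCEPTOR.finditer(dna.upper())]

if __name__ == "__main__":
    toy = "CCAGGTAAGTCCTTTTTTCTCAGGG"  # hypothetical toy sequence
    print(donor_sites(toy), acceptor_sites(toy))
```

Pairing each reported donor position with a downstream acceptor position gives candidate introns, which is roughly the information a gene-structure model has to weigh.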

3 Coding regions An ORF (Open Reading Frame) usually starts with ATG and ends with TAA, TAG or TGA. Thus, the simplest gene prediction algorithm is ORF scanning: search for ORFs that begin with an ATG and end with a stop codon, in three frames in each of the two directions. ORF scanning works well for bacterial genomes: short intergenic sequences, no overlapping genes, etc. Eukaryotic genes contain introns, which complicate ORF scanning.
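ORF scanning as described on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real gene finder: the names `find_orfs` and `orfs_in_frame` and the `min_codons` threshold are invented here, only ATG is treated as a start codon, and positions on the "-" strand are reported relative to the reverse-complemented sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def orfs_in_frame(seq, frame, min_codons):
    """Yield (start, end) for ATG...stop ORFs in one frame (end exclusive)."""
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i                      # first ATG opens the ORF
        elif start is not None and codon in STOP_CODONS:
            if (i + 3 - start) // 3 >= min_codons:
                yield start, i + 3         # stop codon is included
            start = None

def find_orfs(seq, min_codons=2):
    """Scan three frames on each strand.

    Returns tuples (strand, frame, start, end, orf_sequence); coordinates
    on the '-' strand refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame, min_codons):
                found.append((strand, frame, start, end, s[start:end]))
    return found

if __name__ == "__main__":
    print(find_orfs("CCATGAAATGACC"))
```

In practice `min_codons` would be set much higher (often around 100 codons for bacterial genomes) so that short chance ORFs are filtered out.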

4 Overview
Ab initio method: e.g., HMM based
Similarity to existing ESTs (Expressed Sequence Tags)
Similarity to existing genes or proteins

Gene prediction method: limitation
Ab initio method or HMM based: poor sensitivity and specificity, leading to whole genes or exons being missed or wrongly predicted
Similarity to existing ESTs: contaminating ESTs (from unspliced mRNA, genomic DNA and nongenic transcription)
Similarity to existing genes or proteins: unable to distinguish pseudogenes (non-protein coding); novel genes undetected

5 HMM for gene finding HMM for coding regions: can model coding regions of any length. HMM for unspliced genes: the first three states match a start codon, the last three match a stop codon.

6 HMM for gene finding HMM for unspliced genes. An x denotes a state for non-coding DNA, a c a state for coding DNA.
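To make the x/c state labelling concrete, here is a heavily simplified sketch: a two-state HMM (x = non-coding, c = coding) decoded with the Viterbi algorithm. It does not reproduce the model on the slide, which has dedicated start-codon and stop-codon states; all probabilities below are invented for illustration (coding DNA is assumed to be somewhat GC-rich), and the names `START`, `TRANS`, `EMIT` and `viterbi` are hypothetical.

```python
import math

STATES = ("x", "c")  # x = non-coding, c = coding

# Illustrative parameters only; a real model is trained on annotated genes.
START = {"x": 0.9, "c": 0.1}
TRANS = {"x": {"x": 0.9, "c": 0.1},
         "c": {"x": 0.1, "c": 0.9}}
EMIT = {"x": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
        "c": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2}}

def viterbi(seq):
    """Return the most probable state path for seq as a string of x/c."""
    seq = seq.upper()
    # Log-probabilities avoid numerical underflow on long sequences.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[t][s] = best predecessor of state s at position t + 1
    for base in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = prev
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][base]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))

if __name__ == "__main__":
    print(viterbi("ATATATATGCGCGCGCGCGCGCATATATAT"))
```

Extending the state set with three start-codon states, codon-position states and three stop-codon states would move this sketch toward the unspliced-gene model the slide describes.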

7 GENSCAN GENSCAN predicts complete exon/intron structures of genes in genomic DNA. J. Mol. Biol. 268, 78-94.

8 GENSCAN [Figure: GENSCAN predicted genes in sequence HUMRASH, drawn on a kb scale. Key: initial exon, internal exon, terminal exon, single-exon gene; optimal exon, suboptimal exon.]

9 GENSCAN
GENSCAN 1.0 Date run: 29-May-101 Time: 13:42:29
Sequence HUMRASH : 6453 bp : 68.19% C+G : Isochore 4 ( C+G%)
Parameter matrix: HumanIso.smat
Predicted genes/exons:
Gn.Ex Type S.Begin...End.Len Fr Ph I/Ac Do/T CodRg P... Tscr
Init Intr Intr Term
Predicted peptide sequence(s):
>HUMRASH GENSCAN_predicted_peptide_1 189_aa
MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAG
QEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDL
AARTVESRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQHKLRKLNPPDESGPG
CMSCKCVLS
Predicted coding sequence(s):
>HUMRASH GENSCAN_predicted_CDS_1 570_bp
atgacggaatataagctggtggtggtgggcgccggcggtgtgggcaagagtgcgctgacc
atccagctgatccagaaccattttgtggacgaatacgaccccactatagaggattcctac
cggaagcaggtggtcattgatggggagacgtgcctgttggacatcctggataccgccggc
caggaggagtacagcgccatgcgggaccagtacatgcgcaccggggagggcttcctgtgt
gtgtttgccatcaacaacaccaagtcttttgaggacatccaccagtacagggagcagatc
aaacgggtgaaggactcggatgacgtgcccatggtgctggtggggaacaagtgtgacctg

10 GENSCAN
gctgcacgcactgtggaatctcggcaggctcaggacctcgcccgaagctacggcatcccc
tacatcgagacctcggccaagacccggcagggagtggaggatgccttctacacgttggtg
cgtgagatccggcagcacaagctgcggaagctgaaccctcctgatgagagtggccccggc
tgcatgagctgcaagtgtgtgctctcctga
Explanation
Gn.Ex : gene number, exon number (for reference)
Type : Init = Initial exon
       Intr = Internal exon
       Term = Terminal exon
       Sngl = Single-exon gene
       Prom = Promoter
       PlyA = poly-A signal
S : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End : end point of exon or signal (numbered on input strand)
Len : length of exon or signal (bp)
Fr : reading frame (a codon ending at x is in frame f = x mod 3)
Ph : net phase of exon (length mod 3)
I/Ac : initiation signal or acceptor splice site score (x 10)
Do/T : donor splice site or termination signal score (x 10)
CodRg : coding region score (x 10)
P : probability of exon (sum over all parses containing exon)
Tscr : exon score (depends on length, I/Ac, Do/T and CodRg scores)
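The Fr and Ph definitions above are simple modular arithmetic, and the predicted peptide is just the standard-code translation of the predicted CDS. The sketch below shows both; the helper names `reading_frame`, `exon_phase` and `translate` are made up for this example, and the codon table is the standard genetic code written in the usual T, C, A, G order.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def reading_frame(codon_end):
    """Fr: a codon ending at position x is in frame x mod 3."""
    return codon_end % 3

def exon_phase(length):
    """Ph: net phase of an exon is its length mod 3."""
    return length % 3

def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

if __name__ == "__main__":
    first_line = "atgacggaatataagctggtggtggtgggcgccggcggtgtgggcaagagtgcgctgacc"
    print(translate(first_line))   # first 20 residues of the peptide
    print(exon_phase(570))         # a complete 570 bp CDS has net phase 0
```

Translating the first 60 bases of the predicted CDS gives MTEYKLVVVGAGGVGKSALT, the start of the 189-residue peptide that GENSCAN reports for HUMRASH.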

11 MAGPIE MAGPIE is a eukaryotic genome analysis pipeline. Genome Res. 10.

12 Gene discovery This guide can be found in
Does a particular sequence of DNA code for proteins, and what may their function be?
Is there a protein in organism A homologous to protein X of organism B?

13 Overview
1. Obtain a sequence of interest: from PubMed, GenBank, or other sites
2. Identify ORFs and translate into protein: use gene prediction tools
3. Find similar sequences in the databases: use BLAST or other tools
4. Do a global alignment of your sequence vs similar sequences: get a better insight about your target sequence
5. Look for gene families: do multiple sequence alignment
6. Look for the presence of specific patterns in your protein: e.g., consensus regions

14 Overview
7. Find similar sequences in other species
8. Determine the putative structure of your protein: predict the secondary structure
9. Obtain information about function of related proteins: literature search at PubMed
10. Input your sequence into an alert server: get an alert if a sequence similar to yours has been input in a database

Gene Finding in Eukaryotes

Gene Finding in Eukaryotes Gene Finding in Eukaryotes Jan-Jaap Wesselink jjwesselink@cnio.es Computational and Structural Biology Group, Centro Nacional de Investigaciones Oncológicas Madrid, April 2008 Jan-Jaap Wesselink jjwesselink@cnio.es

More information

Annotation of Chimp Chunk 2-10 Jerome M Molleston 5/4/2009

Annotation of Chimp Chunk 2-10 Jerome M Molleston 5/4/2009 Annotation of Chimp Chunk 2-10 Jerome M Molleston 5/4/2009 1 Abstract A stretch of chimpanzee DNA was annotated using tools including BLAST, BLAT, and Genscan. Analysis of Genscan predicted genes revealed

More information

Annotation of Drosophila mojavensis fosmid 8 Priya Srikanth Bio 434W

Annotation of Drosophila mojavensis fosmid 8 Priya Srikanth Bio 434W Annotation of Drosophila mojavensis fosmid 8 Priya Srikanth Bio 434W 5.1.2007 Overview High-quality finished sequence is much more useful for research once it is annotated. Annotation is a fundamental

More information

Bioinformatics. Sequence Analysis: Part III. Pattern Searching and Gene Finding. Fran Lewitter, Ph.D. Head, Biocomputing Whitehead Institute

Bioinformatics. Sequence Analysis: Part III. Pattern Searching and Gene Finding. Fran Lewitter, Ph.D. Head, Biocomputing Whitehead Institute Bioinformatics Sequence Analysis: Part III. Pattern Searching and Gene Finding Fran Lewitter, Ph.D. Head, Biocomputing Whitehead Institute Course Syllabus Jan 7 Jan 14 Jan 21 Jan 28 Feb 4 Feb 11 Feb 18

More information

Bacterial Gene Finding CMSC 423

Bacterial Gene Finding CMSC 423 Bacterial Gene Finding CMSC 423 Finding Signals in DNA We just have a long string of A, C, G, Ts. How can we find the signals encoded in it? Suppose you encountered a language you didn t know. How would

More information

Studying Alternative Splicing

Studying Alternative Splicing Studying Alternative Splicing Meelis Kull PhD student in the University of Tartu supervisor: Jaak Vilo CS Theory Days Rõuge 27 Overview Alternative splicing Its biological function Studying splicing Technology

More information

Sebastian Jaenicke. trnascan-se. Improved detection of trna genes in genomic sequences

Sebastian Jaenicke. trnascan-se. Improved detection of trna genes in genomic sequences Sebastian Jaenicke trnascan-se Improved detection of trna genes in genomic sequences trnascan-se Improved detection of trna genes in genomic sequences 1/15 Overview 1. trnas 2. Existing approaches 3. trnascan-se

More information

SpliceDB: database of canonical and non-canonical mammalian splice sites

SpliceDB: database of canonical and non-canonical mammalian splice sites 2001 Oxford University Press Nucleic Acids Research, 2001, Vol. 29, No. 1 255 259 SpliceDB: database of canonical and non-canonical mammalian splice sites M.Burset,I.A.Seledtsov 1 and V. V. Solovyev* The

More information

Transcriptional control in Eukaryotes: (chapter 13 pp276) Chromatin structure affects gene expression. Chromatin Array of nuc

Transcriptional control in Eukaryotes: (chapter 13 pp276) Chromatin structure affects gene expression. Chromatin Array of nuc Transcriptional control in Eukaryotes: (chapter 13 pp276) Chromatin structure affects gene expression Chromatin Array of nuc 1 Transcriptional control in Eukaryotes: Chromatin undergoes structural changes

More information

Figure 1: Final annotation map of Contig 9

Figure 1: Final annotation map of Contig 9 Introduction With rapid advances in sequencing technology, particularly with the development of second and third generation sequencing, genomes for organisms from all kingdoms and many phyla have been

More information

Bioinformatics Laboratory Exercise

Bioinformatics Laboratory Exercise Bioinformatics Laboratory Exercise Biology is in the midst of the genomics revolution, the application of robotic technology to generate huge amounts of molecular biology data. Genomics has led to an explosion

More information

Variant Classification. Author: Mike Thiesen, Golden Helix, Inc.

Variant Classification. Author: Mike Thiesen, Golden Helix, Inc. Variant Classification Author: Mike Thiesen, Golden Helix, Inc. Overview Sequencing pipelines are able to identify rare variants not found in catalogs such as dbsnp. As a result, variants in these datasets

More information

Hands-On Ten The BRCA1 Gene and Protein

Hands-On Ten The BRCA1 Gene and Protein Hands-On Ten The BRCA1 Gene and Protein Objective: To review transcription, translation, reading frames, mutations, and reading files from GenBank, and to review some of the bioinformatics tools, such

More information

FINAL ANNOTATION REPORT: Drosophila virilis Fosmid 11 (48P14) Robert Carrasquillo Bio 4342

FINAL ANNOTATION REPORT: Drosophila virilis Fosmid 11 (48P14) Robert Carrasquillo Bio 4342 FINAL ANNOTATION REPORT: Drosophila virilis Fosmid 11 (48P14) Robert Carrasquillo Bio 4342 2006 TABLE OF CONTENTS I. Overview... 3 II. Genes... 4 III. Clustal Analysis... 15 IV. Repeat Analysis... 17 V.

More information

Bio 111 Study Guide Chapter 17 From Gene to Protein

Bio 111 Study Guide Chapter 17 From Gene to Protein Bio 111 Study Guide Chapter 17 From Gene to Protein BEFORE CLASS: Reading: Read the introduction on p. 333, skip the beginning of Concept 17.1 from p. 334 to the bottom of the first column on p. 336, and

More information

MODULE 3: TRANSCRIPTION PART II

MODULE 3: TRANSCRIPTION PART II MODULE 3: TRANSCRIPTION PART II Lesson Plan: Title S. CATHERINE SILVER KEY, CHIYEDZA SMALL Transcription Part II: What happens to the initial (premrna) transcript made by RNA pol II? Objectives Explain

More information

High-throughput transcriptome sequencing

High-throughput transcriptome sequencing High-throughput transcriptome sequencing Erik Kristiansson (erik.kristiansson@zool.gu.se) Department of Zoology Department of Neuroscience and Physiology University of Gothenburg, Sweden Outline Genome

More information

Pre-mRNA has introns The splicing complex recognizes semiconserved sequences

Pre-mRNA has introns The splicing complex recognizes semiconserved sequences Adding a 5 cap Lecture 4 mrna splicing and protein synthesis Another day in the life of a gene. Pre-mRNA has introns The splicing complex recognizes semiconserved sequences Introns are removed by a process

More information

Protein Synthesis

Protein Synthesis Protein Synthesis 10.6-10.16 Objectives - To explain the central dogma - To understand the steps of transcription and translation in order to explain how our genes create proteins necessary for survival.

More information

Ambient temperature regulated flowering time

Ambient temperature regulated flowering time Ambient temperature regulated flowering time Applications of RNAseq RNA- seq course: The power of RNA-seq June 7 th, 2013; Richard Immink Overview Introduction: Biological research question/hypothesis

More information

For all of the following, you will have to use this website to determine the answers:

For all of the following, you will have to use this website to determine the answers: For all of the following, you will have to use this website to determine the answers: http://blast.ncbi.nlm.nih.gov/blast.cgi We are going to be using the programs under this heading: Answer the following

More information

Pre-mRNA Secondary Structure Prediction Aids Splice Site Recognition

Pre-mRNA Secondary Structure Prediction Aids Splice Site Recognition Pre-mRNA Secondary Structure Prediction Aids Splice Site Recognition Donald J. Patterson, Ken Yasuhara, Walter L. Ruzzo January 3-7, 2002 Pacific Symposium on Biocomputing University of Washington Computational

More information

GENOME-WIDE DETECTION OF ALTERNATIVE SPLICING IN EXPRESSED SEQUENCES USING PARTIAL ORDER MULTIPLE SEQUENCE ALIGNMENT GRAPHS

GENOME-WIDE DETECTION OF ALTERNATIVE SPLICING IN EXPRESSED SEQUENCES USING PARTIAL ORDER MULTIPLE SEQUENCE ALIGNMENT GRAPHS GENOME-WIDE DETECTION OF ALTERNATIVE SPLICING IN EXPRESSED SEQUENCES USING PARTIAL ORDER MULTIPLE SEQUENCE ALIGNMENT GRAPHS C. GRASSO, B. MODREK, Y. XING, C. LEE Department of Chemistry and Biochemistry,

More information

Acceptor splice site prediction

Acceptor splice site prediction Rochester Institute of Technology RIT Scholar Works Theses Thesis/Dissertation Collections 5-1-2007 Acceptor splice site prediction Eric Foster Follow this and additional works at: http://scholarworks.rit.edu/theses

More information

DNA codes for RNA, which guides protein synthesis.

DNA codes for RNA, which guides protein synthesis. Section 3: DNA codes for RNA, which guides protein synthesis. K What I Know W What I Want to Find Out L What I Learned Vocabulary Review synthesis New RNA messenger RNA ribosomal RNA transfer RNA transcription

More information

Computational Biology I LSM5191

Computational Biology I LSM5191 Computational Biology I LSM5191 Aylwin Ng, D.Phil Lecture Notes: Transcriptome: Molecular Biology of Gene Expression II TRANSLATION RIBOSOMES: protein synthesizing machines Translation takes place on defined

More information

Biochemistry 2000 Sample Question Transcription, Translation and Lipids. (1) Give brief definitions or unique descriptions of the following terms:

Biochemistry 2000 Sample Question Transcription, Translation and Lipids. (1) Give brief definitions or unique descriptions of the following terms: (1) Give brief definitions or unique descriptions of the following terms: (a) exon (b) holoenzyme (c) anticodon (d) trans fatty acid (e) poly A tail (f) open complex (g) Fluid Mosaic Model (h) embedded

More information

Central Dogma. Central Dogma. Translation (mrna -> protein)

Central Dogma. Central Dogma. Translation (mrna -> protein) Central Dogma Central Dogma Translation (mrna -> protein) mrna code for amino acids 1. Codons as Triplet code 2. Redundancy 3. Open reading frames 4. Start and stop codons 5. Mistakes in translation 6.

More information

Introduction. Introduction

Introduction. Introduction Introduction We are leveraging genome sequencing data from The Cancer Genome Atlas (TCGA) to more accurately define mutated and stable genes and dysregulated metabolic pathways in solid tumors. These efforts

More information

Alternative RNA processing: Two examples of complex eukaryotic transcription units and the effect of mutations on expression of the encoded proteins.

Alternative RNA processing: Two examples of complex eukaryotic transcription units and the effect of mutations on expression of the encoded proteins. Alternative RNA processing: Two examples of complex eukaryotic transcription units and the effect of mutations on expression of the encoded proteins. The RNA transcribed from a complex transcription unit

More information

Molecular Cell Biology - Problem Drill 10: Gene Expression in Eukaryotes

Molecular Cell Biology - Problem Drill 10: Gene Expression in Eukaryotes Molecular Cell Biology - Problem Drill 10: Gene Expression in Eukaryotes Question No. 1 of 10 1. Which of the following statements about gene expression control in eukaryotes is correct? Question #1 (A)

More information

genomics for systems biology / ISB2020 RNA sequencing (RNA-seq)

genomics for systems biology / ISB2020 RNA sequencing (RNA-seq) RNA sequencing (RNA-seq) Module Outline MO 13-Mar-2017 RNA sequencing: Introduction 1 WE 15-Mar-2017 RNA sequencing: Introduction 2 MO 20-Mar-2017 Paper: PMID 25954002: Human genomics. The human transcriptome

More information

Finding subtle mutations with the Shannon human mrna splicing pipeline

Finding subtle mutations with the Shannon human mrna splicing pipeline Finding subtle mutations with the Shannon human mrna splicing pipeline Presentation at the CLC bio Medical Genomics Workshop American Society of Human Genetics Annual Meeting November 9, 2012 Peter K Rogan

More information

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq Philipp Bucher Wednesday January 21, 2009 SIB graduate school course EPFL, Lausanne ChIP-seq against histone variants: Biological

More information

Prediction of Alternative Splice Sites in Human Genes

Prediction of Alternative Splice Sites in Human Genes San Jose State University SJSU ScholarWorks Master's Projects Master's Theses and Graduate Research 2007 Prediction of Alternative Splice Sites in Human Genes Douglas Simmons San Jose State University

More information

Figure mouse globin mrna PRECURSOR RNA hybridized to cloned gene (genomic). mouse globin MATURE mrna hybridized to cloned gene (genomic).

Figure mouse globin mrna PRECURSOR RNA hybridized to cloned gene (genomic). mouse globin MATURE mrna hybridized to cloned gene (genomic). Splicing Figure 14.3 mouse globin mrna PRECURSOR RNA hybridized to cloned gene (genomic). mouse globin MATURE mrna hybridized to cloned gene (genomic). mrna Splicing rrna and trna are also sometimes spliced;

More information

Genetics. Instructor: Dr. Jihad Abdallah Transcription of DNA

Genetics. Instructor: Dr. Jihad Abdallah Transcription of DNA Genetics Instructor: Dr. Jihad Abdallah Transcription of DNA 1 3.4 A 2 Expression of Genetic information DNA Double stranded In the nucleus Transcription mrna Single stranded Translation In the cytoplasm

More information

Gene Expression: Details (Eukaryotes) Pre-mRNA Secondary Structure Prediction Aids Splice Site Recognition

Gene Expression: Details (Eukaryotes) Pre-mRNA Secondary Structure Prediction Aids Splice Site Recognition re-mrna rediction Aids plice ite Recognition Gene Expression: Details (Eukaryotes) DNA pre-mrna mrna rotein nucleus gene rotein Donald J. atterson, Ken Yasuhara, Walter L. Ruzzo January 3-7, 2002 DNA (chromosome)

More information

Identification of mirnas in Eucalyptus globulus Plant by Computational Methods

Identification of mirnas in Eucalyptus globulus Plant by Computational Methods International Journal of Pharmaceutical Science Invention ISSN (Online): 2319 6718, ISSN (Print): 2319 670X Volume 2 Issue 5 May 2013 PP.70-74 Identification of mirnas in Eucalyptus globulus Plant by Computational

More information

Accessing and Using ENCODE Data Dr. Peggy J. Farnham

Accessing and Using ENCODE Data Dr. Peggy J. Farnham 1 William M Keck Professor of Biochemistry Keck School of Medicine University of Southern California How many human genes are encoded in our 3x10 9 bp? C. elegans (worm) 959 cells and 1x10 8 bp 20,000

More information

Life Sciences 1A Midterm Exam 2. November 13, 2006

Life Sciences 1A Midterm Exam 2. November 13, 2006 Name: TF: Section Time Life Sciences 1A Midterm Exam 2 November 13, 2006 Please write legibly in the space provided below each question. You may not use calculators on this exam. We prefer that you use

More information

EST alignments suggest that [secret number]% of Arabidopsis thaliana genes are alternatively spliced

EST alignments suggest that [secret number]% of Arabidopsis thaliana genes are alternatively spliced EST alignments suggest that [secret number]% of Arabidopsis thaliana genes are alternatively spliced Dan Morris Stanford University Robotics Lab Computer Science Department Stanford, CA 94305-9010 dmorris@cs.stanford.edu

More information

Circular RNAs (circrnas) act a stable mirna sponges

Circular RNAs (circrnas) act a stable mirna sponges Circular RNAs (circrnas) act a stable mirna sponges cernas compete for mirnas Ancestal mrna (+3 UTR) Pseudogene RNA (+3 UTR homolgy region) The model holds true for all RNAs that share a mirna binding

More information

Introduction retroposon

Introduction retroposon 17.1 - Introduction A retrovirus is an RNA virus able to convert its sequence into DNA by reverse transcription A retroposon (retrotransposon) is a transposon that mobilizes via an RNA form; the DNA element

More information

Supplemental Data. Integrating omics and alternative splicing i reveals insights i into grape response to high temperature

Supplemental Data. Integrating omics and alternative splicing i reveals insights i into grape response to high temperature Supplemental Data Integrating omics and alternative splicing i reveals insights i into grape response to high temperature Jianfu Jiang 1, Xinna Liu 1, Guotian Liu, Chonghuih Liu*, Shaohuah Li*, and Lijun

More information

Supplementary Document

Supplementary Document Supplementary Document 1. Supplementary Table legends 2. Supplementary Figure legends 3. Supplementary Tables 4. Supplementary Figures 5. Supplementary References 1. Supplementary Table legends Suppl.

More information

Transcription of the German Cockroach Densovirus BgDNV Genome: Alternative Processing of Viral RNAs

Transcription of the German Cockroach Densovirus BgDNV Genome: Alternative Processing of Viral RNAs ISSN 1607-6729, Doklady Biochemistry and Biophysics, 2008, Vol. 421, pp. 176 180. Pleiades Publishing, Ltd., 2008. Original Russian Text T.V. Kapelinskaya, E.U. Martynova, A.L. Korolev, C. Schal, D.V.

More information

Table of content. -Supplementary methods. -Figure S1. -Figure S2. -Figure S3. -Table legend

Table of content. -Supplementary methods. -Figure S1. -Figure S2. -Figure S3. -Table legend Table of content -Supplementary methods -Figure S1 -Figure S2 -Figure S3 -Table legend Supplementary methods Yeast two-hybrid bait basal transactivation test Because bait constructs sometimes self-transactivate

More information

SUPPLEMENTARY FIGURES: Supplementary Figure 1

SUPPLEMENTARY FIGURES: Supplementary Figure 1 SUPPLEMENTARY FIGURES: Supplementary Figure 1 Supplementary Figure 1. Glioblastoma 5hmC quantified by paired BS and oxbs treated DNA hybridized to Infinium DNA methylation arrays. Workflow depicts analytic

More information

Long non coding RNA in the pea aphid; iden3fica3on and compara3ve expression in sexual and asexual embryos

Long non coding RNA in the pea aphid; iden3fica3on and compara3ve expression in sexual and asexual embryos Long non coding RNA in the pea aphid; iden3fica3on and compara3ve expression in sexual and asexual embryos Fabrice Legeai, Thomas Derrien, Valen3n Wucher, Audrey David, Gael Le Trionnaire and Denis Tagu

More information

GENOME-WIDE COMPUTATIONAL ANALYSIS OF SMALL NUCLEAR RNA GENES OF ORYZA SATIVA (INDICA AND JAPONICA)

GENOME-WIDE COMPUTATIONAL ANALYSIS OF SMALL NUCLEAR RNA GENES OF ORYZA SATIVA (INDICA AND JAPONICA) GENOME-WIDE COMPUTATIONAL ANALYSIS OF SMALL NUCLEAR RNA GENES OF ORYZA SATIVA (INDICA AND JAPONICA) M.SHASHIKANTH, A.SNEHALATHARANI, SK. MUBARAK AND K.ULAGANATHAN Center for Plant Molecular Biology, Osmania

More information

TRANSCRIPTION. DNA à mrna

TRANSCRIPTION. DNA à mrna TRANSCRIPTION DNA à mrna Central Dogma Animation DNA: The Secret of Life (from PBS) http://www.youtube.com/watch? v=41_ne5ms2ls&list=pl2b2bd56e908da696&index=3 Transcription http://highered.mcgraw-hill.com/sites/0072507470/student_view0/

More information

Mechanisms of alternative splicing regulation

Mechanisms of alternative splicing regulation Mechanisms of alternative splicing regulation The number of mechanisms that are known to be involved in splicing regulation approximates the number of splicing decisions that have been analyzed in detail.

More information

1. Identify and characterize interesting phenomena! 2. Characterization should stimulate some questions/models! 3. Combine biochemistry and genetics

1. Identify and characterize interesting phenomena! 2. Characterization should stimulate some questions/models! 3. Combine biochemistry and genetics 1. Identify and characterize interesting phenomena! 2. Characterization should stimulate some questions/models! 3. Combine biochemistry and genetics to gain mechanistic insight! 4. Return to step 2, as

More information

Nature Structural & Molecular Biology: doi: /nsmb Supplementary Figure 1

Nature Structural & Molecular Biology: doi: /nsmb Supplementary Figure 1 Supplementary Figure 1 U1 inhibition causes a shift of RNA-seq reads from exons to introns. (a) Evidence for the high purity of 4-shU-labeled RNAs used for RNA-seq. HeLa cells transfected with control

More information

Beta Thalassemia Case Study Introduction to Bioinformatics

Beta Thalassemia Case Study Introduction to Bioinformatics Beta Thalassemia Case Study Sami Khuri Department of Computer Science San José State University San José, California, USA sami.khuri@sjsu.edu www.cs.sjsu.edu/faculty/khuri Outline v Hemoglobin v Alpha

More information

Discovery of a Novel Murine Type C Retrovirus by Data Mining

Discovery of a Novel Murine Type C Retrovirus by Data Mining JOURNAL OF VIROLOGY, Mar. 2001, p. 3053 3057 Vol. 75, No. 6 0022-538X/01/$04.00 0 DOI: 10.1128/JVI.75.6.3053 3057.2001 Copyright 2001, American Society for Microbiology. All Rights Reserved. Discovery

More information

Regulation of Gene Expression in Eukaryotes

Regulation of Gene Expression in Eukaryotes Ch. 19 Regulation of Gene Expression in Eukaryotes BIOL 222 Differential Gene Expression in Eukaryotes Signal Cells in a multicellular eukaryotic organism genetically identical differential gene expression

More information

Breast cancer. Risk factors you cannot change include: Treatment Plan Selection. Inferring Transcriptional Module from Breast Cancer Profile Data

Breast cancer. Risk factors you cannot change include: Treatment Plan Selection. Inferring Transcriptional Module from Breast Cancer Profile Data Breast cancer Inferring Transcriptional Module from Breast Cancer Profile Data Breast Cancer and Targeted Therapy Microarray Profile Data Inferring Transcriptional Module Methods CSC 177 Data Warehousing

More information

Molecular Biology (BIOL 4320) Exam #2 April 22, 2002

Molecular Biology (BIOL 4320) Exam #2 April 22, 2002 Molecular Biology (BIOL 4320) Exam #2 April 22, 2002 Name SS# This exam is worth a total of 100 points. The number of points each question is worth is shown in parentheses after the question number. Good

More information

ChIP-seq data analysis

ChIP-seq data analysis ChIP-seq data analysis Harri Lähdesmäki Department of Computer Science Aalto University November 24, 2017 Contents Background ChIP-seq protocol ChIP-seq data analysis Transcriptional regulation Transcriptional

More information

BWA alignment to reference transcriptome and genome. Convert transcriptome mappings back to genome space

BWA alignment to reference transcriptome and genome. Convert transcriptome mappings back to genome space Whole genome sequencing Whole exome sequencing BWA alignment to reference transcriptome and genome Convert transcriptome mappings back to genome space genomes Filter on MQ, distance, Cigar string Annotate

More information

Alternative splicing. Biosciences 741: Genomics Fall, 2013 Week 6

Alternative splicing. Biosciences 741: Genomics Fall, 2013 Week 6 Alternative splicing Biosciences 741: Genomics Fall, 2013 Week 6 Function(s) of RNA splicing Splicing of introns must be completed before nuclear RNAs can be exported to the cytoplasm. This led to early

More information

SpliceInfo: an information repository for the modes of mrna alternative splicing in human genome

SpliceInfo: an information repository for the modes of mrna alternative splicing in human genome SpliceInfo: an information repository for the modes of mrna alternative splicing in human genome Hsien-Da Huang 1, Jorng-Tzong Horng 2, 3, *, Feng-Mao Lin 2, Yu-Chung Chang 4, and Chen-Chia Huang 2 1 Department

More information

MCB Chapter 11. Topic E. Splicing mechanism Nuclear Transport Alternative control modes. Reading :

MCB Chapter 11. Topic E. Splicing mechanism Nuclear Transport Alternative control modes. Reading : MCB Chapter 11 Topic E Splicing mechanism Nuclear Transport Alternative control modes Reading : 419-449 Topic E Michal Linial 14 Jan 2004 1 Self-splicing group I introns were the first examples of catalytic

More information

Evidence of a Pathway of Reduction in Bacteria: Reduced Quantities of Restriction Sites Impact trna Activity in a Trial Set

Evidence of a Pathway of Reduction in Bacteria: Reduced Quantities of Restriction Sites Impact trna Activity in a Trial Set Evidence of a of Reduction in Bacteria: Reduced Quantities of Restriction Sites Impact trna Activity in a Trial Set Oliver Bonham-Carter, Lotfollah Najjar, Dhundy Bastola School of Interdisciplinary Informatics

More information

RNA Processing in Eukaryotes *
