Rice in vivo RNA structurome reveals RNA secondary structure conservation and divergence in plants


Hongjing Deng 1,2,4,5, Jitender Cheema 3, Hang Zhang 2, Hugh Woolfenden 2, Matthew Norris 2, Zhenshan Liu 2, Qi Liu 2, Xiaofei Yang 2, Minglei Yang 2, Xian Deng 1, Xiaofeng Cao 1,5, Yiliang Ding 2,5

1 State Key Laboratory of Plant Genomics and National Center for Plant Gene Research, CAS Center for Excellence in Molecular Plant Sciences, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
2 Department of Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, United Kingdom
3 Department of Computational and Systems Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, United Kingdom
4 College of Life Sciences, University of Chinese Academy of Sciences, 100049, Beijing, China
5 CAS-JIC Centre of Excellence for Plant and Microbial Science (CEPAMS), John Innes Centre, Norwich NR4 7UH, United Kingdom

Corresponding author: X.C. (xfcao@genetics.ac.cn) or Y.D. (yiliang.ding@jic.ac.uk).

Running title: RNA secondary structurome in rice

[Supplemental Figure 1 image: gel of an 18S rRNA fragment after 15 min DMS treatment, lanes 1 to 9 (DMS concentration series followed by sequencing lanes), with nucleotide positions labelled alongside the bands.]

Supplemental Figure 1 Preliminary experiment for optimizing the concentration of DMS treatment. The fragment from 1 251 nt to 1 33 nt of 18S rRNA was analyzed. Lanes 1-5: 0, 0.75, 1.5, 2.5 and 4% (v/v) DMS concentration, respectively, for 15 min at 28 °C; lanes 6 to 9: sequencing lanes for the four nucleotides. The red and blue stars mark the clearly modified A and C, respectively, across the concentration course.

[Supplemental Figure 2 image, panels A (18S rRNA) and B (25S rRNA): gels without and with DMS (lanes 1 to 6), each beside a plot of relative DMS reactivity (Structure-seq) and gel reactivity against nucleotide position, annotated with PCC and P values.]

Supplemental Figure 2 High agreement between Structure-seq and conventional gel-based RNA secondary structure probing. (A) Left: gel-based probing. Right: comparison of Structure-seq (purple bars) and gel-based probing (orange line) from 1 363 nt to 1 3 nt of 18S rRNA. (B) Left: gel-based probing. Right: comparison of Structure-seq (purple bars) and gel-based probing (orange line) from 156 nt to 253 nt of 25S rRNA. Lanes 1 and 2, without and with 2.5% DMS treatment; lanes 3 to 6, sequencing lanes for the four nucleotides. The positions of each clearly visible A and C are marked by red and blue stars, respectively. PCC and P values are shown in the corresponding region.

[Supplemental Figure 3 image, panels A to D: scatter plots of sensitivity and F1 score (y axes) against PPV (x axis), each annotated with its PCC.]

Supplemental Figure 3 High correlation of PPV with sensitivity and F1 score. (A)-(B) Correlation of PPV with sensitivity (A) and F1 score (B) in rice. Both P values are less than 2.2E-16. (C)-(D) Correlation of PPV with sensitivity (C) and F1 score (D) in Arabidopsis. Both P values are less than 2.2E-16.
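The three metrics compared in this figure can be illustrated with a small sketch. It assumes the standard definitions (PPV = TP / (TP + FP), sensitivity = TP / (TP + FN), F1 = harmonic mean of the two) applied to sets of base pairs; the function name `pair_metrics`, the choice of which structure counts as "predicted" and which as "reference", and the toy base pairs are all hypothetical and are not taken from the paper.

```python
def pair_metrics(predicted, reference):
    """Return (ppv, sensitivity, f1) for two collections of base pairs (i, j).

    Assumed standard definitions, not the paper's confirmed procedure:
      PPV         = shared pairs / pairs in `predicted`
      sensitivity = shared pairs / pairs in `reference`
      F1          = harmonic mean of PPV and sensitivity
    """
    # Normalise each pair so (i, j) and (j, i) count as the same pair.
    pred = {tuple(sorted(p)) for p in predicted}
    ref = {tuple(sorted(p)) for p in reference}
    shared = len(pred & ref)
    ppv = shared / len(pred) if pred else 0.0
    sens = shared / len(ref) if ref else 0.0
    f1 = 2 * ppv * sens / (ppv + sens) if (ppv + sens) > 0 else 0.0
    return ppv, sens, f1


# Toy example with made-up base pairs (hypothetical data).
structure_a = [(1, 20), (2, 19), (3, 18), (5, 15)]
structure_b = [(1, 20), (2, 19), (4, 17), (5, 15), (6, 14)]
ppv, sens, f1 = pair_metrics(structure_a, structure_b)
print(f"PPV={ppv:.2f} sensitivity={sens:.2f} F1={f1:.2f}")
```

Because all three quantities share the same numerator (the shared base pairs), they tend to rise and fall together, which is consistent with the high correlations reported in the legend.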

[Supplemental Figure 4 image, panels A to G: average DMS reactivity against position across the 5' UTR, CDS and 3' UTR; DMS reactivity around the 5' and 3' splice sites for spliced and unspliced events; DFT magnitude against period (nucleotide) for high and low groups in the 5' UTR, CDS and 3' UTR; Gini index for transcripts with no, single and multiple m6A peaks; DMS reactivity against position around the motif for m6A and control sites.]

Supplemental Figure 4 Global pattern of mRNA structurome in rice. (A) Average DMS reactivity in selected regions of mRNAs with more than one stop per nucleotide. 23 67 mRNAs filtered by length were used for this analysis. A one-sided Wilcoxon rank-sum test (alternative hypothesis "less") was used to calculate P values between 5' UTR and CDS, and between CDS and 3' UTR. P values are shown in the corresponding region. (B) Average DMS reactivity around the alternative splice sites, comparing 88 72 spliced and 5 862 unspliced events at the 5' splice site, and 88 72 spliced and 6 35 unspliced events at the 3' splice site. (C)-(E) Discrete Fourier transform (DFT) of average DMS reactivity of selected regions in all the mRNAs (C), mRNAs with high and low mRNA GC content (D), and mRNAs with high and low translation efficiency (E). (F) Effect of the number of m6A peaks on Gini index in 3' UTR. (Yellow) 7 38 transcripts without m6A peaks; (orange) 2 6 transcripts with a single m6A peak; (red) 2 75 transcripts with multiple m6A peaks. P values were calculated by Wilcoxon rank-sum test. (G) DMS reactivity pattern around the conserved motif (MM, M = A or C) identified in m6A-seq (Li et al., 2014). 221 motifs with m6A and 19 motifs without m6A were analyzed. P values were calculated by Kolmogorov-Smirnov (KS) test.
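As an illustration of the DFT panels, the sketch below computes the DFT magnitude of a reactivity profile and reports it by period in nucleotides. It is only a minimal sketch: the function name `dft_magnitude_by_period`, the mean-centering step and the synthetic profile are assumptions made for the example, and the legend does not state how the profiles were normalised or windowed.

```python
import cmath
import math


def dft_magnitude_by_period(profile):
    """Map period (in nucleotides) to DFT magnitude for a 1-D profile.

    The profile is mean-centred first (an assumption of this sketch) so that
    the constant offset does not dominate the spectrum.
    """
    n = len(profile)
    mean = sum(profile) / n
    centred = [x - mean for x in profile]
    spectrum = {}
    # Frequencies k = 1 .. n/2 correspond to periods n/k nucleotides.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        spectrum[n / k] = abs(coeff)
    return spectrum


# Synthetic profile with a built-in 3-nt repeat (hypothetical data).
profile = [0.3, 0.1, 0.1] * 20
spectrum = dft_magnitude_by_period(profile)
peak_period = max(spectrum, key=spectrum.get)
print(f"strongest period: {peak_period:g} nt")
```

A peak at a period of 3 nt in such a spectrum is the usual signature of codon-level periodicity, which is why the legend reports the transform separately for the 5' UTR, CDS and 3' UTR.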

[Supplemental Figure 5 image, panels A to C: GC content and DMS reactivity against position across the 5' UTR, CDS and 3' UTR for high and low GC content transcripts; DFT magnitude against period (nucleotide) for high and low groups in the 5' UTR, CDS and 3' UTR.]

Supplemental Figure 5 Effect of GC content on the global pattern of mRNA structurome in Arabidopsis. (A)-(B) Average GC content (A) and average DMS reactivity (B) of transcripts with the highest 15% and lowest 15% mRNA GC content and with more than one stop per nucleotide (11 27 in total) in Arabidopsis. P values between the two groups were calculated by KS test. (C) DFT of average DMS reactivity of selected regions in mRNAs with high and low mRNA GC content.

[Supplemental Figure 6 image, panels A to C: gels for O. sativa and A. thaliana without and with DMS (lanes 1 to 6); plots of gel reactivity and relative DMS reactivity against nucleotide position for both species, with the sequence alignment and sequence identity; phylogenetic structures coloured by Structure-seq DMS reactivity.]

Supplemental Figure 6 Comparison of 25S rRNA fragment in rice and Arabidopsis. The fragments are from 695 nt to 8 nt in rice and from 73 nt to 85 nt in Arabidopsis. (A) Gel-based probing in rice and Arabidopsis. Lanes 1 and 2, without and with 2.5% DMS treatment; lanes 3 to 6, sequencing lanes for the four nucleotides. The positions of each clearly visible A and C are coloured red and blue, respectively. (B) Comparison of gel reactivity (top) and relative DMS reactivity (bottom) in rice and Arabidopsis. In the sequence alignment, mismatches are highlighted on a green background, best matches are highlighted on a purple background, and the region with highly conserved in vivo RNA secondary structure is marked in white. Pairwise alignments were performed in EMBOSS Needle. PCC and P values are shown in the corresponding region. (C) Phylogenetic structure of the fragments coloured by DMS reactivity. The orange line represents the highly conserved structure probing region identified in gel-based probing and Structure-seq analysis. The sites probed by gel-based probing (Supplemental Figure 6A and 6B top) and by Structure-seq (Supplemental Figure 6B bottom) show very similar DMS modification patterns between rice and Arabidopsis. The pattern of DMS reactivity is consistent with the phylogeny-derived structures (Supplemental Figure 6C).

[Supplemental Figure 7 image, panels A and B: reactivity against aligned nucleotide position for one rice and one Arabidopsis transcript per panel, with the sequence identity of each pair and arc diagrams of the in vivo structures.]

Supplemental Figure 7 Two individual examples of the two kinds of relationship between sequence identity and structure similarity. (A) High sequence identity and high structure similarity (group-1). (B) Low sequence identity and low structure similarity (group-4). The results in rice are coloured red and those in Arabidopsis blue. The reactivity is compared based on the sequence alignment of the selected regions. Best-matched regions are shown on a light purple background, and mismatches on a green or a yellow background depending on the nucleotides involved. Arc diagrams are shown to compare the in vivo RNA secondary structure in the selected regions.

Supplemental Table 1 Correlation of stop counts in different replicates of Structure-seq libraries

                                                  Pearson correlation coefficient (PCC)
Comparison                                        18S rRNA a    25S rRNA a
(-) DMS replicate 1 vs. (-) DMS replicate 2       .96           .89
(+) DMS replicate 1 vs. (+) DMS replicate 2       .98           .98
(-) DMS replicate 1 vs. (+) DMS replicate 1       .6            .9
(-) DMS replicate 2 vs. (+) DMS replicate 2       .5            .32

a: All the P values of PCC are less than 2.2E-16.
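The replicate comparison in this table rests on the Pearson correlation coefficient of per-nucleotide stop counts. The following is a minimal sketch of that calculation; the function name `pearson` and the stop-count vectors are hypothetical, and any filtering or transformation applied to the counts before correlation is not described in this section.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up reverse-transcription stop counts at six positions (hypothetical data).
replicate_1 = [12, 0, 5, 30, 7, 1]
replicate_2 = [10, 1, 6, 28, 9, 0]
r = pearson(replicate_1, replicate_2)
print(f"PCC between replicates: {r:.3f}")
```

High coefficients between replicates of the same treatment, and lower ones between (-) DMS and (+) DMS libraries, are the pattern the table is meant to show.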

Supplemental Table 2 GO enrichment of the highest and lowest 1% PPV in rice

ID           Gene set name                                        P value     -log10(P value)   Fold enrichment
Genes with the highest 1% PPV
GO:0051171   Regulation of nitrogen compound metabolic process    1.68E-19    18.77             1.82
GO:0060255   Regulation of macromolecule metabolic process        3.62E-19    18.44             1.77
GO:0010468   Regulation of gene expression                        2.00E-18    17.70             1.77
GO:0006355   Regulation of transcription, DNA-dependent           1.51E-09    8.82              1.75
GO:0010876   Lipid localization                                   3.77E-06    5.42              3.6
Genes with the lowest 1% PPV
GO:0016070   RNA metabolic process                                7.28E-11    10.14             1.65
GO:0045454   Cell redox homeostasis                               4.5E-07     6.35              3.37
GO:0006811   Ion transport                                        1.04E-06    5.98              1.91
GO:0007165   Signal transduction                                  9.66E-05    4.02              1.86
GO:0015979   Photosynthesis                                       6.13E-03    2.21              1.88
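The -log10(P value) column is a direct transform of the P value column, so each row can be checked by hand. A minimal sketch of that check, using the first row of the table (P = 1.68E-19); the helper name `neg_log10` is hypothetical.

```python
import math


def neg_log10(p_value):
    """Return -log10(p) for a P value in the interval (0, 1]."""
    if not 0 < p_value <= 1:
        raise ValueError("P value must lie in (0, 1]")
    return -math.log10(p_value)


# First row of the table: regulation of nitrogen compound metabolic process.
value = round(neg_log10(1.68e-19), 2)
print(value)
```

The rounded result agrees with the 18.77 listed for that row.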

Supplemental Table 3 Correlation of m6A modification and Gini index in rice

Region                             mRNA a       5' UTR b     CDS c        3' UTR d
m6A enrichment vs. Gini index
  PCC                              .22          -.38         -.3          -.27
  P value                          .8196        .527         .877         .188
m6A peak density vs. Gini index
  PCC                              -.1          .5           -.5          -.15
  P value                          < 2.2E-16    7.823E-7     6.991E-8     < 2.2E-16

a: 1 722 transcripts were analyzed; b: 282 transcripts were analyzed; c: 2 712 transcripts were analyzed; d: 7 25 transcripts were analyzed.
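The Gini index used here summarises how unevenly reactivity is distributed along a transcript region: 0 when every nucleotide has the same reactivity, approaching 1 when a few nucleotides carry almost all of it. The sketch below uses the common sorted-rank formula; the function name `gini` and the toy reactivities are hypothetical, and this section does not say which nucleotides or windows entered the paper's calculation.

```python
def gini(values):
    """Gini index of non-negative values via the sorted-rank formula.

    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with x sorted
    ascending and i running from 1 to n.
    """
    xs = sorted(values)
    if any(x < 0 for x in xs):
        raise ValueError("values must be non-negative")
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Toy DMS reactivities for two regions (hypothetical data).
even_region = [0.2, 0.2, 0.2, 0.2]     # uniform reactivity
uneven_region = [0.0, 0.0, 0.0, 0.8]   # one highly reactive site
print(gini(even_region), gini(uneven_region))
```

In structure-probing studies a higher Gini index is generally read as a more structured region, because paired and unpaired positions then differ strongly in reactivity.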

Supplemental Table 4 Average GC content a in our analyzed populations

Region                      mRNA           5' UTR          CDS             3' UTR
Rice (n = 11,87)            .5 ± .55       .587 ± .95      .538 ± .91      .1 ± .2
A. thaliana (n = 7,2)       .2 ± .2        .386 ± .58      .59 ± .31       .33 ± .38

a: Mean and SD are shown.

Supplemental Table 5 Details of enriched GO terms listed in Figure 5B to 5E

ID           Gene set name                                              P value     -log10(P value)   Fold enrichment
High sequence identity, high structure similarity
GO:0031497   Chromatin assembly                                         6.25E-32    31.20             38.9
GO:0007264   Small GTPase mediated signal transduction                  6.55E-19    18.18             17.35
GO:0009201   Ribonucleoside triphosphate biosynthetic process           3.76E-08    7.42              1.37
GO:0006100   Tricarboxylic acid cycle intermediate metabolic process    1.16E-07    6.94              .9
GO:0006413   Translational initiation                                   1.75E-06    5.76              1.7
GO:0006511   Ubiquitin-dependent protein catabolic process              1.20E-04    3.92              5.31
GO:0006352   DNA-dependent transcription, initiation                    3.3E-03     2.48              8.51
High sequence identity, low structure similarity
GO:0016192   Vesicle-mediated transport                                 1.54E-08    7.81              8.5
GO:0044283   Small molecule biosynthetic process                        1.98E-07    6.70              6.9
GO:0044271   Cellular nitrogen compound biosynthetic process            6.53E-07    6.19              6.56
GO:0016051   Carbohydrate biosynthetic process                          2.41E-06    5.62              7.87
GO:0009250   Glucan biosynthetic process                                3.55E-05    4.45              11.53
GO:0006094   Gluconeogenesis                                            7.77E-04    3.11              9.18
GO:0015995   Chlorophyll biosynthetic process                           1.81E-03    2.74              29.51
Low sequence identity, high structure similarity
GO:0006869   Lipid transport                                            9.81E-09    8.01              9.1
GO:0051171   Regulation of nitrogen compound metabolic process          2.06E-06    5.69              1.97
GO:0010556   Regulation of macromolecule biosynthetic process           3.44E-06    5.46              1.92
GO:0080090   Regulation of primary metabolic process                    6.7E-06     5.17              1.86
GO:0006351   Transcription, DNA-dependent                               8.09E-05    4.09              2.15
GO:0051252   Regulation of RNA metabolic process                        1.43E-04    3.84              2.13
GO:0045454   Cell redox homeostasis                                     5.89E-04    3.23              .87
GO:0030418   Nicotianamine biosynthetic process                         7.99E-04    3.10              57.89
Low sequence identity, low structure similarity
GO:0005992   Trehalose biosynthetic process                             1.8E-04     3.74              12.6
GO:0006793   Phosphorus metabolic process                               2.46E-04    3.61              1.58
GO:0043687   Post-translational protein modification                    3.7E-04     3.43              1.56
GO:0042221   Response to chemical stimulus                              4.88E-04    3.31              2.97
GO:0006468   Protein phosphorylation                                    1.05E-03    2.98              1.5
GO:0006979   Response to oxidative stress                               7.6E-03     2.12              2.86
GO:0042545   Cell wall modification                                     8.03E-03    2.10              6.15
