Rice in vivo RNA structurome reveals RNA secondary structure conservation and divergence in plants

Hongjing Deng 1,2,4,5, Jitender Cheema 3, Hang Zhang 2, Hugh Woolfenden 2, Matthew Norris 2, Zhenshan Liu 2, Qi Liu 2, Xiaofei Yang 2, Minglei Yang 2, Xian Deng 1, Xiaofeng Cao 1,5, Yiliang Ding 2,5

1 State Key Laboratory of Plant Genomics and National Center for Plant Gene Research, CAS Center for Excellence in Molecular Plant Sciences, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
2 Department of Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, United Kingdom
3 Department of Computational and Systems Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, United Kingdom
4 College of Life Sciences, University of Chinese Academy of Sciences, 100049, Beijing, China
5 CAS-JIC Centre of Excellence for Plant and Microbial Science (CEPAMS), John Innes Centre, Norwich NR4 7UH, United Kingdom

Corresponding author: X.C. (xfcao@genetics.ac.cn) or Y.D. (yiliang.ding@jic.ac.uk).

Running title: RNA secondary structurome in rice
[Gel image for Supplemental Figure 1: DMS concentration series with a 15 min treatment, lanes 1 to 9, nucleotide positions labelled along the gel.]
Supplemental Figure 1. Preliminary experiment for optimizing the concentration of DMS treatment. The fragment from 1 251 nt to 1 33 nt of 18S rRNA was analyzed. Lanes 1 to 5: 0, 0.75, 1.5, 2.5 and …% (v/v) DMS, respectively, for 15 min at 28 °C; lanes 6 to 9: sequencing lanes. The red and blue stars mark the clearly modified A and C, respectively, across the concentration course.
[Panels for Supplemental Figure 2: gel images (lanes 1 to 6, without and with DMS) for 18S and 25S rRNA, and plots of relative DMS reactivity and gel reactivity against nucleotide position.]
Supplemental Figure 2. High agreement between Structure-seq and conventional gel-based RNA secondary structure probing. (A) Left: gel-based probing. Right: comparison of Structure-seq (purple bars) and gel-based probing (orange line) from 1 363 nt to 1 3 nt of 18S rRNA. (B) Left: gel-based probing. Right: comparison of Structure-seq (purple bars) and gel-based probing (orange line) from 156 nt to 253 nt of 25S rRNA. Lanes 1 and 2: without and with 2.5% DMS treatment; lanes 3 to 6: sequencing lanes. The positions of each well-visible A and C are marked by red and blue stars, respectively. PCC and P values are shown in the corresponding region.
[Panels for Supplemental Figure 3: scatter plots of sensitivity and F1 score against PPV, each annotated with its PCC.]
Supplemental Figure 3. High correlation of PPV with sensitivity and F1 score. (A)-(B) Correlation of PPV with sensitivity (A) and F1 score (B) in rice. Both P values are less than 2.2E-16. (C)-(D) Correlation of PPV with sensitivity (C) and F1 score (D) in Arabidopsis. Both P values are less than 2.2E-16.
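The three metrics compared in this figure can be illustrated with a minimal sketch. It assumes the usual definitions over two sets of base pairs (PPV = shared pairs / predicted pairs, sensitivity = shared pairs / reference pairs, F1 = their harmonic mean); the function name `pair_metrics` and the toy base pairs are hypothetical, and this section does not state which two structures were compared.

```python
def pair_metrics(predicted, reference):
    """Return (PPV, sensitivity, F1) for two collections of base pairs.

    Each base pair is an (i, j) tuple of nucleotide positions; the order of
    i and j does not matter. Definitions are the standard ones and are an
    assumption, not taken from the source.
    """
    pred = {tuple(sorted(p)) for p in predicted}
    ref = {tuple(sorted(p)) for p in reference}
    true_pos = len(pred & ref)  # pairs present in both structures
    ppv = true_pos / len(pred) if pred else 0.0
    sensitivity = true_pos / len(ref) if ref else 0.0
    if ppv + sensitivity == 0:
        return ppv, sensitivity, 0.0
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return ppv, sensitivity, f1


# Toy example: 3 of 4 predicted pairs are found among 5 reference pairs.
predicted = [(1, 20), (2, 19), (3, 18), (5, 15)]
reference = [(1, 20), (2, 19), (3, 18), (4, 17), (6, 14)]
ppv, sensitivity, f1 = pair_metrics(predicted, reference)
print(f"PPV={ppv:.2f} sensitivity={sensitivity:.2f} F1={f1:.2f}")
```

Because F1 is built from PPV and sensitivity, a strong correlation between PPV and F1 is expected whenever PPV and sensitivity themselves move together.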
[Panels for Supplemental Figure 4: DMS reactivity against position across 5' UTR, CDS and 3' UTR, and around the 5' and 3' splice sites for spliced and unspliced events; DFT magnitude against period (nucleotide) for 5' UTR, CDS and 3' UTR in high and low groups; Gini index for transcripts with no, single or multiple m6A peaks; DMS reactivity against position around the motif for m6A and control sites.]
Supplemental Figure 4. Global pattern of mRNA structurome in rice. (A) Average DMS reactivity in selected regions of mRNAs with more than one stop per nucleotide. 23 67 mRNAs filtered by length were used for this analysis. P values between 5' UTR and CDS, and between CDS and 3' UTR, were calculated with a one-sided Wilcoxon rank-sum test (alternative hypothesis "less") and are shown in the corresponding region. (B) Average DMS reactivity around the alternative splice sites, for 88 72 spliced and 5 862 unspliced events at the 5' splice site and 88 72 spliced and 6 35 unspliced events at the 3' splice site. (C)-(E) Discrete Fourier transform (DFT) of average DMS reactivity of selected regions in all the mRNAs (C), mRNAs with high and low mRNA GC content (D), and mRNAs with high and low translation efficiency (E). (F) Effect of the number of m6A peaks on Gini index in 3' UTR. (Yellow) 7 38 transcripts without m6A peaks; (orange) 2 6 transcripts with a single m6A peak; (red) 2 75 transcripts with multiple m6A peaks. P values were calculated by Wilcoxon rank-sum test. (G) DMS reactivity pattern around the conserved motif (…MM, M = A or C) identified in m6A-seq (Li et al., 2014). 221 motifs with m6A and 19 motifs without m6A were analyzed. P values were calculated by Kolmogorov-Smirnov (KS) test.
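The DFT panels report magnitude against period in nucleotides. A minimal sketch of that kind of analysis follows, assuming a plain discrete Fourier transform of a mean-centred average-reactivity profile; the function name `dft_magnitude_by_period` and the toy profile are hypothetical, and any windowing or normalisation used for the figure is not described in this section.

```python
import cmath
import math


def dft_magnitude_by_period(signal):
    """Map each period (in nucleotides) to the DFT magnitude of `signal`.

    The mean is subtracted first so that the constant component does not
    dominate. Frequency index k corresponds to a period of len(signal) / k.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum[n / k] = abs(coeff)
    return spectrum


# Toy 60-nt profile in which every third nucleotide is more reactive.
profile = [0.3, 0.1, 0.1] * 20
spectrum = dft_magnitude_by_period(profile)
best_period = max(spectrum, key=spectrum.get)
print(best_period)
```

In this toy profile the only non-negligible magnitude sits at a period of 3 nucleotides, which is the kind of signal a codon-level periodicity would produce.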
[Panels for Supplemental Figure 5: GC content and DMS reactivity against position across 5' UTR, CDS and 3' UTR for transcripts with high and low GC content; DFT magnitude against period (nucleotide) for 5' UTR, CDS and 3' UTR in the high and low groups.]
Supplemental Figure 5. Effect of GC content on the global pattern of mRNA structurome in Arabidopsis. (A)-(B) Average GC content (A) and average DMS reactivity (B) of transcripts with the highest 15% and lowest 15% mRNA GC content with more than one stop per nucleotide (11 27 in total) in Arabidopsis. P values between the two groups were calculated by KS test. (C) DFT of average DMS reactivity of selected regions in mRNAs with high and low mRNA GC content.
[Panels for Supplemental Figure 6: gel images for O. sativa and A. thaliana (lanes 1 to 6, without and with DMS); gel reactivity and relative DMS reactivity against nucleotide position, shown with the sequence alignment and its sequence identity; structures of both fragments coloured by Structure-seq DMS reactivity.]
Supplemental Figure 6. Comparison of a 25S rRNA fragment in rice and Arabidopsis. The fragments are from 695 nt to 8 nt in rice and from 73 nt to 85 nt in Arabidopsis. (A) Gel-based probing in rice and Arabidopsis. Lanes 1 and 2: without and with 2.5% DMS treatment; lanes 3 to 6: sequencing lanes. The positions of each well-visible A and C are coloured red and blue, respectively. (B) Comparison of gel reactivity (top) and relative DMS reactivity (bottom) in rice and Arabidopsis. In the sequence alignment, mismatches are highlighted in a green background, best matches are highlighted in a purple background, and the region with highly conserved in vivo RNA secondary structure is marked in white. Pairwise alignments were performed in EMBOSS Needle. PCC and P values are shown in the corresponding region. (C) Phylogenetic structure of the fragments coloured by DMS reactivity. The orange line represents the highly conserved structure probing region identified in gel-based probing and Structure-seq analysis. The sites probed by gel-based probing (Supplemental Figure 6A and 6B top) and by Structure-seq (Supplemental Figure 6B bottom) show very similar DMS modification patterns between rice and Arabidopsis. The pattern of DMS reactivity is consistent with the phylogeny-derived structures (Supplemental Figure 6C).
[Panels for Supplemental Figure 7: reactivity profiles along the sequence alignment, with the sequence identity of each pair, and arc diagrams of the in vivo structures for paired rice (LOC_Os…) and Arabidopsis (AT…) transcripts.]
Supplemental Figure 7. Two individual examples of the relationship between sequence identity and structure similarity. (A) High sequence identity and high structure similarity (group-1). (B) Low sequence identity and low structure similarity (group-4). The results in rice are coloured red and those in Arabidopsis blue. The reactivity is compared based on the sequence alignment of selected regions. Best matched regions are shown on a light purple background, and mismatches on a green or a yellow background depending on the nucleotides involved. Arc diagrams are shown to compare the in vivo RNA secondary structure in the selected regions.
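The sequence identity values attached to each alignment can be illustrated with a small sketch. It assumes identity is the number of identical aligned columns divided by the alignment length, gaps included, which is how pairwise aligners commonly report it; the function name `alignment_identity` and the toy sequences are hypothetical.

```python
def alignment_identity(aligned_a, aligned_b, gap="-"):
    """Fraction of alignment columns where both sequences carry the same base.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    A column containing a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return matches / len(aligned_a)


# Toy alignment: 7 identical columns out of 9.
identity = alignment_identity("ACGU-ACGU", "ACGUAAC-U")
print(round(identity, 2))
```

Reactivity profiles can then be compared column by column over the same alignment, skipping columns that contain a gap.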
Supplemental Table 1. Correlation of stop counts in different replicates of Structure-seq libraries

Pearson correlation coefficient (PCC)
Comparison                                     18S rRNA a   25S rRNA a
(-) DMS replicate 1 vs. (-) DMS replicate 2    .96          .89
(+) DMS replicate 1 vs. (+) DMS replicate 2    .98          .98
(-) DMS replicate 1 vs. (+) DMS replicate 1    .6           .9
(-) DMS replicate 2 vs. (+) DMS replicate 2    .5           .32

a: All the P values of PCC are less than 2.2E-16.
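The comparison in this table can be sketched as a Pearson correlation between per-nucleotide stop counts of two libraries. This is a minimal illustration, not the pipeline used for the table; the function name `pearson` and the stop counts are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical reverse-transcription stop counts at the same positions
# in two replicate libraries.
replicate_1 = [12, 40, 3, 25, 8, 60]
replicate_2 = [10, 44, 5, 22, 9, 57]
print(round(pearson(replicate_1, replicate_2), 3))
```

Replicates of the same treatment are expected to correlate strongly, while a (-) DMS library compared with a (+) DMS library should correlate less, since DMS adds stops at modified positions.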
Supplemental Table 2. GO enrichment of the highest and lowest 1% PPV in rice

ID          Gene set name                                      P value    -log10(P value)  Fold enrichment
Genes with the highest 1% PPV
GO:0051171  Regulation of nitrogen compound metabolic process  1.68E-19   18.77            1.82
GO:0060255  Regulation of macromolecule metabolic process      3.62E-19   18.              1.77
GO:0010468  Regulation of gene expression                      2.E-18     17.7             1.77
GO:0006355  Regulation of transcription, DNA-dependent         1.51E-9    8.82             1.75
GO:0010876  Lipid localization                                 3.77E-6    5.2              3.6
Genes with the lowest 1% PPV
GO:0016070  RNA metabolic process                              7.28E-11   1.1              1.65
GO:0045454  Cell redox homeostasis                             .5e-7      6.35             3.37
GO:0006811  Ion transport                                      1.E-6      5.98             1.91
GO:0007165  Signal transduction                                9.66E-5    .2               1.86
GO:0015979  Photosynthesis                                     6.13E-3    2.21             1.88
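The -log10(P value) column is a direct transform of the P value column. A minimal sketch of the conversion follows; the helper name `neg_log10` is hypothetical, and the example input is the first P value listed in this table.

```python
import math


def neg_log10(p_value):
    """Return -log10 of a P value in the interval (0, 1]."""
    if not 0 < p_value <= 1:
        raise ValueError("P value must be in (0, 1]")
    return -math.log10(p_value)


# First row of the table: P value 1.68E-19.
score = round(neg_log10(1.68e-19), 2)
print(score)
```

Larger values indicate stronger enrichment, which makes the column easier to scan than raw P values spanning many orders of magnitude.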
Supplemental Table 3. Correlation of m6A modification and Gini index in rice

Region                                mRNA a      5' UTR b    CDS c       3' UTR d
m6A enrichment vs. Gini index
  PCC                                 .22         -.38        -.3         -.27
  P value                             .8196       .527        .877        .188
m6A peak density vs. Gini index
  PCC                                 -.1         .5          -.5         -.15
  P value                             < 2.2E-16   7.823E-7    6.991E-8    < 2.2E-16

a: 1 722 transcripts were analyzed; b: 282 transcripts were analyzed; c: 2 712 transcripts were analyzed; d: 7 25 transcripts were analyzed.
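The Gini index measures how unevenly reactivity is distributed along a region. The sketch below uses the standard definition over sorted non-negative values; this section does not say how the index was computed, so the formula and the function name `gini` are assumptions.

```python
def gini(values):
    """Gini index of non-negative values: 0 means perfectly even.

    Uses the sorted-rank formula
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    with ranks i starting at 1. The maximum for n values is (n - 1) / n.
    """
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


# Even reactivity gives 0; reactivity concentrated at one site gives (n - 1) / n.
even = gini([1.0, 1.0, 1.0, 1.0])
concentrated = gini([0.0, 0.0, 0.0, 1.0])
print(even, concentrated)
```

A low index corresponds to reactivity spread evenly across nucleotides, and a high index to reactivity concentrated at a few sites.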
Supplemental Table 4. Average GC content a in our analyzed populations

Region                    GC (mRNA)     GC (5' UTR)    GC (CDS)       GC (3' UTR)
Rice (n = 11,87)          .5 ±.55       .587 ±.95      .538 ±.91      .1 ±.2
A. thaliana (n = 7,2)     .2 ±.2        .386 ±.58      .59 ±.31       .33 ±.38

a: Mean and SD are shown.
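A mean and SD of GC content over a set of transcripts can be computed as in the sketch below. The function name `gc_content` and the toy sequences are hypothetical, and the use of the sample standard deviation is an assumption, since this section does not say which SD estimator was used.

```python
import statistics


def gc_content(sequence):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy transcript regions; real input would be the 5' UTR, CDS or 3' UTR
# sequence of each transcript in the analyzed population.
regions = ["GGCC", "AUAU", "GCAU"]
contents = [gc_content(s) for s in regions]
mean_gc = statistics.mean(contents)
sd_gc = statistics.stdev(contents)  # sample SD (assumption)
print(f"{mean_gc:.3f} ± {sd_gc:.3f}")
```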
Supplemental Table 5. Details of enriched GO terms listed in Figure 5B to 5E

ID          Gene set name                                            P value    -log10(P value)  Fold enrichment
High sequence identity, high structure similarity
GO:0031497  Chromatin assembly                                       6.25E-32   31.2             38.9
GO:0007264  Small GTPase mediated signal transduction                6.55E-19   18.18            17.35
GO:0009201  Ribonucleoside triphosphate biosynthetic process         3.76E-8    7.2              1.37
GO:0006100  Tricarboxylic acid cycle intermediate metabolic process  1.16E-7    6.9              .9
GO:0006413  Translational initiation                                 1.75E-6    5.76             1.7
GO:0006511  Ubiquitin-dependent protein catabolic process            1.2E-      3.92             5.31
GO:0006352  DNA-dependent transcription, initiation                  3.3E-3     2.8              8.51
High sequence identity, low structure similarity
GO:0016192  Vesicle-mediated transport                               1.5E-8     7.81             8.5
GO:0044283  Small molecule biosynthetic process                      1.98E-7    6.7              6.9
GO:0044271  Cellular nitrogen compound biosynthetic process          6.53E-7    6.19             6.56
GO:0016051  Carbohydrate biosynthetic process                        2.1E-6     5.62             7.87
GO:0009250  Glucan biosynthetic process                              3.55E-5    .5               11.53
GO:0006094  Gluconeogenesis                                          7.77E-     3.11             9.18
GO:0015995  Chlorophyll biosynthetic process                         1.81E-3    2.7              29.51
Low sequence identity, high structure similarity
GO:0006869  Lipid transport                                          9.81E-9    8.1              9.1
GO:0051171  Regulation of nitrogen compound metabolic process        2.6E-6     5.69             1.97
GO:0010556  Regulation of macromolecule biosynthetic process         3.E-6      5.6              1.92
GO:0080090  Regulation of primary metabolic process                  6.7E-6     5.17             1.86
GO:0006351  Transcription, DNA-dependent                             8.9E-5     .9               2.15
GO:0051252  Regulation of RNA metabolic process                      1.3E-      3.8              2.13
GO:0045454  Cell redox homeostasis                                   5.89E-     3.23             .87
GO:0030418  Nicotianamine biosynthetic process                       7.99E-     3.1              57.89
Low sequence identity, low structure similarity
GO:0005992  Trehalose biosynthetic process                           1.8E-      3.7              12.6
GO:0006793  Phosphorus metabolic process                             2.6E-      3.61             1.58
GO:0043687  Post-translational protein modification                  3.7E-      3.3              1.56
GO:0042221  Response to chemical stimulus                            .88e-      3.31             2.97
GO:0006468  Protein phosphorylation                                  1.5E-3     2.98             1.5
GO:0006979  Response to oxidative stress                             7.6E-3     2.12             2.86
GO:0042545  Cell wall modification                                   8.3E-3     2.1              6.15