Nucleosome positioning as a determinant of exon recognition

Hagen Tilgner 1,3, Christoforos Nikolaou 1,3, Sonja Althammer 1, Michael Sammeth 1, Miguel Beato 1, Juan Valcárcel 1,2 & Roderic Guigó 1

2009 Nature America, Inc. All rights reserved.

Chromatin structure influences transcription, but its role in subsequent RNA processing is unclear. Here we present analyses of high-throughput data that imply a relationship between nucleosome positioning and exon definition. First, we have found stable nucleosome occupancy within human and Caenorhabditis elegans exons that is stronger in exons with weak splice sites. Conversely, we have found that pseudoexons (intronic sequences that are not included in mRNAs but are flanked by strong splice sites) show nucleosome depletion. Second, the ratio between nucleosome occupancy within and upstream from the exons correlates with exon-inclusion levels. Third, nucleosomes are positioned central to exons rather than proximal to splice sites. These exonic nucleosomal patterns are also observed in non-expressed genes, suggesting that nucleosome marking of exons exists in the absence of transcription. Our analysis provides a framework that contributes to the understanding of splicing on the basis of chromatin architecture.

Eukaryotic gene expression relies on the function of a battery of complex molecular machineries that execute the genetic program, from transcription and processing of primary RNAs in the nucleus to mRNA translation in the cytoplasm. These processes are highly coordinated, and their functional interplay opens a wealth of opportunities for gene regulation 1,2. For example, there is abundant evidence for the functional coupling between transcription and pre-mRNA splicing 3,4 and for a role of this coupling in alternative splicing regulation 5. Direct interactions of RNA processing factors with the largest subunit of RNA polymerase II (RNAPII) through its C-terminal domain (CTD) provide a mechanism for efficient co-transcriptional delivery of basal and regulatory splicing factors on nascent transcripts.
In addition, some promoter-associated transcription factors and co-regulators also show splicing activities and/or recruit components of the splicing machinery. Conversely, packaging of nascent transcripts with RNA-binding proteins precludes extended RNA-DNA hybrids, facilitates transcription elongation and prevents genomic instability,16. In turn, transcription elongation factors, including splicing regulators that actively promote transcription elongation 1, can affect alternative splicing decisions by modulating the timing at which competing splice sites become available in nascent transcripts 1 21.

Eukaryotic DNA is wrapped around nucleosomes, the packaging units of chromatin, and this architecture is a key determinant of all aspects of DNA metabolism 22. Nucleosomes contain two sets of four histone molecules. Chromatin remodeling is frequently associated with a combinatorial code of post-translational modifications of the flexible N-terminal histone tails, which can regulate chromatin compaction and the accessibility of factors responsible for DNA replication, recombination, repair and transcription 23.

The tight coupling between transcription and RNA processing opens the intriguing possibility that chromatin architecture and dynamics have a role in subsequent steps of the gene expression pathway 24. Indeed, the Brahma subunit of the chromatin-remodeling complex SWI/SNF has been shown to interact with splicing factors, influence the accumulation of RNAPII on alternative exons and regulate alternative splicing, probably through local changes in transcription elongation. A general link between nucleosomes and the gene exon-intron architecture has been proposed by Trifonov and colleagues 25 2. These authors observed that the distance between consecutive 5′ or 3′ splice sites shows a periodicity reminiscent of the unit length of DNA wrapping around nucleosomes 25, suggesting that nucleosomes are somehow phased with the sequences that direct intron removal. On the basis of DNA sequence patterns that are characteristic of stably positioned nucleosomes, they predicted that splice sites are frequently located near the nucleosome dyad axis, a preference that they relate to the need to protect splice sites from mutation 26,2.

Genome-wide analyses of nucleosome occupancy in worms and humans have been recently published 2,2. Using these data, we set out to investigate the relationship between gene architecture and stable nucleosome positioning on the genome sequences of these species. Thus, here and in the accompanying paper by Schwartz et al. 30, we report that stably positioned nucleosomes are more frequent in exons than in the surrounding introns, a trend that is more pronounced in exons flanked by weak splice sites and that contrasts with the decreased occupancy observed in pseudoexons. These observations are strongly suggestive of a role for chromatin organization in RNA processing. Our results also offer an explanation for the exonic enrichment of particular histone modifications recently reported 31 and suggest that the higher GC content in exons could partially result from selection to maintain sequences facilitating positioning of nucleosomes.

1 Center for Genomic Regulation, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain. 2 Institució Catalana de Recerca i Estudis Avançats, Barcelona, Catalonia, Spain. 3 These authors contributed equally to this work. Correspondence should be addressed to R.G. (roderic.guigo@crg.cat).
Received May; accepted 21 July; published online 16 August 2009; doi:10.1038/nsmb.1658
Nature Structural & Molecular Biology, advance online publication

[Figure 1 plot residue. Panel titles: "Nucleosome occupancy in internal exons"; "Nucleosome occupancy in the SNTB2 locus" (chr16, SNTB2 RefSeq genes); "SymCurv profile in internal exons". Series: all exons, all pseudoexons, weak exons, strong exons, strong pseudoexons. x axis: nucleotides from exon boundary. y axis (SymCurv panel): mean SymCurv score.]

Figure 1 Observed and predicted nucleosome occupancy. (a) Nucleosome-occupancy profile across human internal constitutive exons in resting CD4+ T cells. We have computed the number of extended nucleosome reads overlapping each nucleotide. Upstream and downstream of an idealized internal exon, we plot the average number of nucleosome reads per nucleotide position, with negative positions relative to the acceptor (acc) site and positive positions relative to the donor (don) site. Within the exon, reads have been mapped to 50 identically spaced intervals, irrespective of the length of the exon (see Online Methods). Strong exons are exons with combined donor and acceptor score among the highest 5%; weak exons are the exons with combined score among the lowest 5%; pseudoexons are intronic sequences bounded by splice sites; strong pseudoexons are exons with combined score higher than the 0% percentile of real exons. (b) Nucleosome-occupancy profile in the human SNTB2 locus in chromosome 16. All internal exons are clearly marked by nucleosome peaks from 2. High nucleosome peaks mark the first and fourth internal exons and the terminal exon. Notably, the first and the fourth internal exons are the exons with the weakest combined splice site scores (first: 16.; second: 20.60; third: 1.4; fourth: .26; fifth: 20.). (c) Computationally predicted nucleosome-occupancy profile across human acceptor sites. The SymCurv score at each nucleotide has been averaged over all exons and pseudoexons in a way similar to that used for the nucleosome reads (Supplementary Methods).

RESULTS
Nucleosome enrichment in human internal exons
We have analyzed data produced by Schones et al. 2 on genome-wide mapping of nucleosome density in resting and activated human CD4+ T cells. These data were generated by direct Solexa high-throughput sequencing of DNA purified from nucleosomes obtained by micrococcal nuclease (MNase) digestion of chromatin preparations. We extended the short sequence reads including neighboring sequences to the expected nucleosome length ( base pairs (bp)). The number of reads mapping to a given region (nucleotide) can be assumed to be a measure of nucleosome occupancy in that region or nucleotide. When comparing the profile of nucleosome occupancy in resting CD4+ T cells with the positions of internal exons of protein-coding genes (see Online Methods) that were classified as constitutive by the AStalavista system 32 (Supplementary Methods), we observed strong nucleosome occupancy within the exons of human genes (Fig. 1a,b). In contrast, pseudoexons (that is, nonrepetitive intronic regions flanked by strong splice sites but showing no evidence of inclusion in mRNA, as judged by the available ESTs and cDNAs; Supplementary Methods) show weak nucleosome depletion (Fig. 1a). We have computed the nucleosome-occupancy ratio as the ratio between the average nucleosome occupancy per nucleotide within the exon and the average nucleosome occupancy per nucleotide in the -bp regions upstream and downstream of the exon. The median of the logarithm of this ratio is positive for exons (0.23) and negative for pseudoexons ( 0.), a difference that is highly significant (P < 2.2 × 10^-16, according to both the one- and two-sided Mann-Whitney U and the Kolmogorov-Smirnov tests). Notably, the intensity of nucleosome occupancy is inversely related to exon splice site strength. Indeed, we have computed the scores of the splice sites and ranked the exons according to the sum of the acceptor and donor scores (Supplementary Methods).
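
The occupancy ratio described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the base-2 logarithm, the pseudocount and the toy coverage vectors are all hypothetical, and the extended read length and flank width are left as parameters rather than fixed to any value.

```python
import math


def coverage(read_starts, read_len, region_len):
    """Count, for each nucleotide 0..region_len-1, the extended reads covering it.

    Each read is treated as the half-open interval [start, start + read_len),
    clipped to the region. A difference array keeps this linear in the input.
    """
    diff = [0] * (region_len + 1)
    for start in read_starts:
        lo = max(start, 0)
        hi = min(start + read_len, region_len)
        if lo < hi:
            diff[lo] += 1
            diff[hi] -= 1
    cov, running = [], 0
    for delta in diff[:region_len]:
        running += delta
        cov.append(running)
    return cov


def log_occupancy_ratio(cov, exon_start, exon_end, flank, pseudocount=1.0):
    """log2 of mean occupancy inside the exon over mean occupancy in both flanks.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a flank carries no reads.
    """
    exon = cov[exon_start:exon_end]
    flanks = cov[max(exon_start - flank, 0):exon_start] + cov[exon_end:exon_end + flank]
    if not exon or not flanks:
        raise ValueError("exon and flanking regions must be non-empty")
    exon_mean = sum(exon) / len(exon)
    flank_mean = sum(flanks) / len(flanks)
    return math.log2((exon_mean + pseudocount) / (flank_mean + pseudocount))


# Toy profiles: reads piled up inside the "exon" versus piled up in its flanks.
toy_exon = [0] * 10 + [2] * 10 + [0] * 10
toy_pseudoexon = [2] * 10 + [0] * 10 + [2] * 10
ratios = [log_occupancy_ratio(c, 10, 20, flank=10) for c in (toy_exon, toy_pseudoexon)]
print(ratios)  # positive for the exon-like profile, negative for the pseudoexon-like one
```

With one such value per exon and per pseudoexon, the two distributions can then be compared by their medians and by a rank-based test such as the Mann-Whitney U test named in the text.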
We have considered the lowest-scoring 5% of exons as weak exons and the highest-scoring 5% of exons as strong exons, with these terms referring only to the strength of the exons' splice sites and not to their inclusion level. We have found that nucleosome occupancy is stronger in weak than in strong exons (Fig. 1a). The median of the logarithm of the nucleosome occupancy ratio is 0.3 for weak exons and 0. for strong exons (a statistically significant difference; P < 2.2 × 10^-16). In exons with strong splice sites, in contrast, an extended region of nucleosome occupancy occurs upstream of the 3′ splice site. This region is also observed in pseudoexons and is more accentuated in pseudoexons flanked by strong splicing signals. As a result, pseudoexons with strong splice sites, in which splicing does not occur despite the strength of the sites, show a pattern of nucleosome occupancy that is the mirror image of that observed on exons with weak sites, in which splicing occurs despite the weakness of the sites (Fig. 1a). A similar pattern is observed in activated cells, albeit less sharp (Supplementary Fig. 1). These observations strongly suggest a relationship between nucleosome occupancy and exon recognition during pre-mRNA splicing. Specifically, nucleosome occupancy within the exon may promote exon inclusion (an effect that is particularly relevant in exons with weak splice sites), whereas nucleosome depletion within the exon and stable nucleosome occupancy upstream of the acceptor site, as observed in pseudoexons, could have a repressing effect. As an example, Figure 1b shows the relationship between peaks of nucleosome occupancy and internal or 3′ exons in the human syntrophin β2 (SNTB2) gene. Further supporting this relationship is the pattern of nucleosome occupancy computed theoretically using the SymCurv algorithm (http://genome.crg.cat/software/#symcurv), which predicts nucleosomal sequences on the basis of the symmetry of the DNA curvature (Supplementary Methods). The SymCurv computational predictions on human exons closely reproduced the pattern of exon nucleosome occupancy that we observed experimentally in CD4+ T cells, including the differential behavior between weak and strong exons (Fig. 1c).

Exonic nucleosome enrichment is not transcription-dependent
The SymCurv nucleosome-occupancy profile is based exclusively on the structural properties of the DNA sequence, suggesting that nucleosomes mark exons in chromatin in the absence of transcription. Indeed, after analyzing expression data from resting CD4+ T cells 2 (see Online Methods), we found that non-expressed genes show an exonic nucleosome-occupancy pattern that is similar to, albeit less sharp than, that observed in expressed genes (Fig. 2a,b). Notably, although expressed genes show reduced frequencies of stably positioned nucleosomes overall (Fig. 2a,b), this reduction is less prominent within weak exons and upstream of pseudoexons, consistent with the hypothesis that nucleosome positioning has a particularly relevant role in regulating the splicing of these elements.

[Figure 2 plot residue, panel a: "Nucleosome occupancy in internal exons: nontranscribed genes". Series: all exons, exons with weak acceptors, exons with strong acceptors, pseudoexons. Panel c, y axis: fraction of pseudoexons with 1 read or more.]
[Figure 2 plot residue, panel b: "Nucleosome occupancy in internal exons: transcribed genes" (series: all exons, pseudoexons). Panel c: "Pseudoexon inclusion: nucleosomes vs RNAseq" (x axis: exon to intron nucleosome log ratio; y axis: fraction of pseudoexons with 1 read or more). Panel d: "Pseudoexon inclusion: SymCurv versus RNAseq" (x axis: exon to intron SymCurv log ratio).]

Figure 2 Nucleosome occupancy and expression of genes and exons. (a) Nucleosome-occupancy profile across internal acceptor sites from genes that are not expressed in resting CD4+ T cells. Gene expression has been determined using the Affymetrix platform 2. We plot the average number of nucleosome reads per position in all exons considered together (black), only in exons with strong (red) and weak (blue) acceptor sites, and in intronic pseudoexons. (b) Nucleosome-occupancy profile across internal acceptor sites from genes expressed in resting CD4+ T cells, shown as in a. (c,d) Nucleosome-occupancy ratio versus inclusion of pseudoexons. Large values of the nucleosome-occupancy ratio are typical of bona fide exons. Nucleosome occupancy has been measured using both the nucleosome sequence reads obtained experimentally by Schones et al. 2 (c) and our SymCurv theoretical predictions (d). In each case, we have binned the set of pseudoexons into six classes depending on this ratio (Supplementary Methods). Within each class, we have computed the proportion of pseudoexons that have evidence of inclusion (at least one read from one tissue mapping entirely within the pseudoexon) according to a published RNAseq data set 33.

Nucleosome occupancy predicts inclusion of exons
As our definition of pseudoexons is relative to the available transcript evidence, we cannot rule out that some of the pseudoexons in our data set may actually be included in some cell type or condition that has not yet been surveyed. In this regard, as a corollary of our observations, we hypothesized that pseudoexons with a nucleosome-occupancy pattern similar to that of bona fide exons (that is, nucleosome enrichment within the exons and depletion in the flanking regions) may actually be included as real (alternative) exons in some cell type. For each pseudoexon, we have thus computed the log ratio of nucleosome occupancy within the pseudoexon over the flanking intronic regions (Supplementary Methods), using both SymCurv scores and sequence reads. We have binned the set of pseudoexons in six classes depending on these ratios. Within each bin, we assessed inclusion rate using RNAseq data recently obtained using the Solexa platform in nine human tissues 33. Consistent with our hypothesis, the larger the nucleosome-occupancy ratio, the larger the proportion of pseudoexons with evidence of inclusion (that is, with at least one read from one tissue mapping entirely within it; Fig. 2c,d).

Nucleosome positioning contributes to global exon definition
We binned the set of exons with the weakest % of acceptor sites according to their length. Short exons (less than 0 bp long) show a weak nucleosome-occupancy peak downstream of the acceptor sites. The peak grows with exon length and moves toward the center of the exon, a pattern that closely reproduces the partitioning of exons in length classes (Fig. 3a). A less sharp pattern is observed in exons with the strongest % of acceptor sites (Fig. 3b). These observations are more compatible with nucleosome positioning defining the exon globally rather than specifically affecting the acceptor or the donor site. However, our analysis also suggests that nucleosome occupancy may have an additional role in the definition of acceptor 3′ sites. Indeed, the peak nucleosome pattern is obvious downstream of weak acceptor sites of terminal exons (Fig. 3c). In contrast, the peak nucleosome pattern is weak upstream of the donor sites of initial exons, and no differences can be observed here between weak and strong donor sites (Fig. 3d).

[Figure 3 plot residue. Panel titles: "Nucleosome occupancy in internal exons: weak acceptors" and "Nucleosome occupancy in internal exons: strong acceptors" (one series per exon length class, lengths in nt); "Nucleosome occupancy in terminal exons" (all exons, weak acceptors, strong acceptors); "Nucleosome occupancy in initial exons" (all exons, weak donors, strong donors). x axis: position relative to the acceptor (acc) or donor (don) site.]

Figure 3 Nucleosome occupancy in internal exons of different lengths, initial exons and terminal exons. (a,b) Nucleosome-occupancy profiles across internal acceptors for different exon length classes. We plot the average number of nucleosome reads per position. Positions are aligned at the acceptor (acc) site. Different profiles are plotted (using different colors) for different exon length classes. Nucleosome occupancy is shown for weak acceptors (a) and strong acceptors (b). (c) Nucleosome-occupancy profile in terminal exons. Positions have been aligned at the acceptor site. (d) Nucleosome-occupancy profile in initial exons. Positions have been aligned at the donor (don) site.

Exon marking by histone modifications and nucleosomes
Trimethylation of Lys36 in histone 3 (H3K36me3) has been recently described as a marker of exons in expressed genes from C. elegans 31. We have analyzed ChIP-Seq data on this modification recently obtained in CD4+ T cells 34 (see Online Methods). Consistent with the previous observation 31, we also observed a peak of H3K36me3 within human internal exons from expressed genes (Fig. 4a). The peak showed the differences between weak and strong exons, albeit less sharp, that we had also observed for nucleosomes. We therefore performed a crude and qualitative normalization of the H3K36me3 profile against the nucleosome-occupancy profile. At each nucleotide position, we simply divided the number of H3K36me3 reads by the number of nucleosome reads overlapping the position (see Online Methods). After processing the data in this way, the H3K36me3 peak within exons essentially vanished (Fig. 4b). This indicates that the H3K36me3 peak mostly reflects underlying nucleosome occupancy. For another epigenetic mark (H4K20me1), however, nucleosome normalization uncovers a potential anticorrelation with exon positions that is not apparent from the raw data (Fig. 4c,d).

[Figure 4 plot residue. Panel titles: "H3K36me3 in internal exons"; "H3K36me3 in internal exons: normalized by nucleosomes"; "H4K20me1 in internal exons"; "H4K20me1 in internal exons: normalized by nucleosomes". Series: all exons, pseudoexons. y axes: H3K36me3, H4K20me1, H3K36me3 per nucleosome, H4K20me1 per nucleosome.]

Figure 4 Profile of histone modifications in expressed genes in resting CD4+ T cells. (a) H3K36me3. Note that the plateau downstream from the exonic peak of H3K36me3 is higher than the plateau upstream. This is in agreement with previous work 34 that showed that the levels of H3K36me3 increase 3′ to 5′ along the transcript. Our results indicate that the increase is not entirely linear but that it occurs, at least partially, in a stepwise fashion with the exons. Acc, acceptor; don, donor. (b) H3K36me3 normalized by nucleosome occupancy. (c) H4K20me1 without normalization. (d) H4K20me1 normalized by nucleosome occupancy. The plots of the raw data (a,c) were generated in a similar way to that for nucleosomes (Fig. 1 and Online Methods), but relying on ChIP-Seq data for histone modifications 34. Normalized plots (b,d) were obtained after dividing the values corresponding to histone modifications by those corresponding to nucleosome occupancy (see Online Methods for details).

Nucleosome enrichment in C. elegans exons
We have mapped high-throughput sequence data on nucleosome positioning obtained using the SOLiD platform in C. elegans 2 across internal constitutive exons (see Online Methods). The mapping reveals a well-defined peak of nucleosome occupancy within internal exons from C. elegans genes (Supplementary Fig. 2). Such a peak is not observed in the DNA used as control in these experiments, demonstrating that GC sequencing bias is not confounding our observations.

GC content and nucleosome occupancy in human exons
Human exons tend to be GC-rich when compared to the surrounding intronic regions. Because nucleosome sequences have also been postulated to prefer GC-rich regions 35,36, it is a possibility that nucleosome positioning within exons could be mediated by increased GC content. We therefore computed the profile of GC content in human internal exons (Supplementary Fig. 3 and Supplementary Methods). The profile (Supplementary Fig. 3) is indeed markedly similar to that of nucleosome occupancy, including the higher GC content in weak than in strong exons and the reduced GC content within pseudoexons. GC content by itself, however, cannot fully explain the pattern of nucleosome occupancy that we have observed in human exons. First, the region upstream of the donor sites of initial exons is more GC-rich than the region downstream from acceptors of terminal exons (63% on average versus 51%), but nucleosome occupancy is higher in the latter (. extended read counts per nucleotide on average versus .0; Fig. 3). Second, we have computed the correlation coefficient between the raw nucleosome occupancy and the GC content in exons and pseudoexons. Although the correlation is positive and significant in both cases, it is much higher for pseudoexons (0.422) than for exons (0.12), suggesting that factors other than GC content (for instance, factors involved in splicing) have a stronger influence on nucleosome occupancy in exons than in pseudoexons. Finally, we have selected subsets of exons and pseudoexons that are almost identical in terms of their log ratio of GC content between the exon (pseudoexon) and the flanking intronic regions (Supplementary Methods). Even in these subsets, exons have a significantly higher nucleosome occupancy log ratio than pseudoexons do (0.04 versus 0.061, P < 2.2 × 10^-16; Supplementary Fig. 4). An equivalent conclusion can be reached when comparing weak and strong exons (Supplementary Fig. 5).

Nucleosome enrichment in noncoding exons
To further investigate the relationship between coding function, GC content and nucleosome occupancy, we investigated exons from noncoding transcripts. We considered the noncoding genes from the GENCODE annotation 3 (see Online Methods). Noncoding exons also have higher GC content than the surrounding intronic areas (Supplementary Fig. 6). Notably, they also show strong nucleosome occupancy (Supplementary Fig. 6).

DISCUSSION
Taken together, we believe that our analyses suggest a role for chromatin structure in splicing.
More specifically, the interplay between nucleosome positioning within exons and upstream from the acceptor sites seems to contribute to exon recognition: nucleosome positioning within the exon coupled with nucleosome depletion upstream from the acceptor site would promote inclusion of exons with weak splice sites, whereas nucleosome depletion within the exon coupled with stable nucleosome occupancy upstream of the acceptor site would have a repressing effect. Although the evidence for the relationship is convincing, the molecular mechanisms by which nucleosome positioning influences splice site recognition remain to be elucidated. A possible mechanism would be mediated by changes in transcription elongation rates caused by the presence of a positioned nucleosome near the splice sites. Such changes are indeed known to influence splice site selection 5,20,21. It is conceivable that the presence of a stably positioned nucleosome reduces the elongation rate of the polymerase complex, and this, in turn, provides a window of opportunity for RNAPII CTD-associated splicing factors to interact with splice sites. Alternatively, nucleosomes could contribute to specifically recruit some splicing factors during transcription, or to co-transcriptionally enhance the molecular interactions by which splicing factors bound at the flanking splice sites stabilize each other. This phenomenon, known as exon definition, has been linked to the optimal length of internal exons, which may provide an optimal distance to accommodate direct or indirect interactions between factors involved in early recognition of the flanking 3′ and 5′ splice sites 3. In this regard, it is notable that the average length of human internal exons (1 bp, for all exons, not only those in our size-selected data set) is similar to that of the nucleosome sequences (approximately bp) and that this similarity is greater and more constrained for exons with weak splice sites (mean 3 bp, s.d. 1 bp), where nucleosomes are positioned more stably, than for exons with strong splice sites (mean 164 bp, s.d. 3 bp).
Thus, nucleosome positioning in internal exons can contribute to the proper positioning of molecular interactions across the exon that characterize the process of exon definition.

The fact that the splicing process can occur in vitro on exogenously added RNA molecules clearly demonstrates that nucleosome positioning is not a prerequisite for splicing to occur. Splicing is, however, substantially more efficient when coupled to transcription 3, and nucleosome positioning may further increase this efficacy. Our analyses indicate that positioning of nucleosomes within exons is not dependent on transcription. Nucleosome organization along the genome would therefore partially reflect the underlying exonic structure of genes and, thus, constitute a code for splicing present in the DNA but not in the sequence of the primary transcript.

Our analyses also suggest that the enrichment of certain histone modifications, notably H3K36me3, in exons 31 is, at least partially, the reflection of the enrichment of stably positioned nucleosomes within exons of active genes. Indeed, in human CD4+ T cells, when normalized with respect to nucleosome occupancy, the H3K36me3 peak within exons essentially disappears. Notably, however, nucleosome enrichment upstream of the acceptor sites that we observe in strong exons does not seem to correlate with depletion of H3K36me3. Also, even after normalizing for nucleosome occupancy, some histone modifications (H4K20me1, for instance) show a characteristic exonic pattern. These observations argue that, as suggested 31, histone modifications may indeed have a role in splicing. In our opinion, however, this role can be fully understood only when the underlying pattern of nucleosome occupancy is taken into account.

We have found that the pattern of GC content on human exons is markedly similar to the pattern of nucleosome occupancy but that GC content alone cannot fully explain the pattern of nucleosome occupancy observed within exons. One is tempted to speculate that the elevated GC content of exons may partially result from the need to accommodate nucleosome sequences, which have been postulated to also prefer GC-rich regions 35,36.
Indeed, the increased GC content in exons has often been attributed to the codon biases resulting from protein-coding functionality. Among other hypotheses, GC-rich codons would be preferred because of the higher relative abundance of the cognate tRNAs (see, for example, refs. 40,41), which would lead to increased efficiency of translation. However, we have also found elevated GC content within exons from noncoding RNAs (Supplementary Fig. 6), where codon bias cannot be invoked. Notably, the fact that we have also found nucleosome enrichment within noncoding exons (Supplementary Fig. 6) would support the hypothesis that GC content in noncoding exons (as well as in coding exons) could at least partially be the consequence of selection to favor splicing-related nucleosome occupancy. Further supporting this hypothesis is the striking observation of reduced GC content within pseudoexons (Supplementary Fig. 3). Low GC content would explain nucleosome depletion in pseudoexons (Fig. 1), which in turn would have a repressive effect on their inclusion in mRNA sequences. Although other selective pressures that are not related to translation efficiency could explain GC enrichment within noncoding exons, such as DNA stability 42, RNA structure 43 or RNA processing 44,45, such selective pressures cannot easily explain GC reduction within pseudoexons. Several interesting examples of connections between chromatin structure and RNA processing have been reported 6,20,21, but our findings provide a general concept for how the architecture of genome packaging can influence pre-mRNA splicing. Despite great progress, the determinants of splice site identification are not totally

understood, and it is not possible to predict from the analysis of the primary RNA sequence alone the resulting pattern of splicing products. Our results indicate that some of these determinants may actually reside outside the primary transcript, in the chromatin structure itself, and represent instructions for splicing encoded in the DNA sequence but not in the sequence of the primary transcript. Taking this concept into account should provide a new framework to understand features of splice site recognition, exon definition and alternative splicing on the basis of chromatin architecture.

Methods
Methods and any associated references are available in the online version of the paper at http://www.nature.com/nsmb/.

Note: Supplementary information is available on the Nature Structural & Molecular Biology website.

Acknowledgments
We thank D.E. Schones for help with the data and its interpretation and members of the Guigó laboratory, especially D. Gonzalez, for help with data analysis. This work was supported by the Spanish Ministry of Science with fellowships to M.S. and S.A., and with grant number BIO2006-0330 to R.G.

AUTHOR CONTRIBUTION
H.T., C.N., S.A. and M.S. performed computational analysis; R.G., J.V. and M.B. designed the analysis and wrote the paper; all authors discussed the data.

Published online at http://www.nature.com/nsmb/.
Reprints and permissions information is available online at http://npg.nature.com/reprintsandpermissions/.
© 2009 Nature America, Inc. All rights reserved.

1. Maniatis, T. & Reed, R. An extensive network of coupling among gene expression machines. Nature 416, 4 506 (2002).
2. Moore, M.J. & Proudfoot, N.J. Pre-mRNA processing reaches back to transcription and ahead to translation. Cell 6, 6 00 (200).
3. Bentley, D.L. Rules of engagement: co-transcriptional recruitment of pre-mRNA processing factors. Curr. Opin. Cell Biol. 1, 251 256 (2005).
4. Pandit, S., Wang, D. & Fu, X.D. Functional integration of transcriptional and RNA processing machineries. Curr. Opin. Cell Biol. 20, 260 265 (200).
5. Kornblihtt, A.R. Coupling transcription and alternative splicing. Adv. Exp. Med. Biol. 623, 1 (200).
6. Kadener, S. et al. Antagonistic effects of T-Ag and VP16 reveal a role for RNA Pol II elongation on alternative splicing. EMBO J. 20, 55 56 (2001).
7. Batsché, E., Yaniv, M. & Muchardt, C. The human SWI/SNF subunit Brm is a regulator of alternative splicing. Nat. Struct. Mol. Biol., 22 2 (2006).
8. Sims, R.J., III et al. Recognition of trimethylated histone H3 lysine 4 facilitates the recruitment of transcription postinitiation factors and pre-mRNA splicing. Mol. Cell 2, 665 66 (200).
9. Schor, I.E., Rascovan, N., Pelisch, F., Allo, M. & Kornblihtt, A.R. Neuronal cell depolarization induces intragenic chromatin modifications affecting NCAM alternative splicing. Proc. Natl. Acad. Sci. USA 6, 4325 4330 (200).
10. Das, R. et al. SR proteins function in coupling RNAP II transcription to pre-mRNA splicing. Mol. Cell 26, 6 1 (200).
11. Phatnani, H.P. & Greenleaf, A.L. Phosphorylation and functions of the RNA polymerase II CTD. Genes Dev. 20, 222 236 (2006).
12. Nogues, G., Kadener, S., Cramer, P., Bentley, D. & Kornblihtt, A.R. Transcriptional activators differ in their abilities to control alternative splicing. J. Biol. Chem. 2, 430 434 (2002).
13. Auboeuf, D., Honig, A., Berget, S.M. & O'Malley, B.W. Coordinate regulation of transcription and splicing by steroid receptor coregulators. Science 2, 416 41 (2002).
14. Monsalve, M. et al. Direct coupling of transcription and mRNA processing through the thermogenic coactivator PGC-1. Mol. Cell 6, 30 316 (2000).
15. Li, X. & Manley, J.L. Cotranscriptional processes and their influence on genome stability. Genes Dev. 20, 13 14 (2006).
16. Luna, R., Gaillard, H., Gonzalez-Aguilera, C. & Aguilera, A. Biogenesis of mRNPs: integrating different processes in the eukaryotic nucleus. Chromosoma, 31 331 (200).
17. Lin, S., Coutinho-Mansfield, G., Wang, D., Pandit, S. & Fu, X.D. The splicing factor SC35 has an active role in transcriptional elongation. Nat. Struct. Mol. Biol., 1 26 (200).
18. de la Mata, M. et al. A slow RNA polymerase II affects alternative splicing in vivo. Mol. Cell, 525 532 (2003).
19. Howe, K.J., Kane, C.M. & Ares, M. Jr. Perturbation of transcription elongation influences the fidelity of internal exon inclusion in Saccharomyces cerevisiae. RNA, 3 06 (2003).
20. Muñoz, M.J. et al. DNA damage regulates alternative splicing through inhibition of RNA polymerase II elongation. Cell, 0 20 (200).
21. Allo, M. et al. Control of alternative splicing through siRNA-mediated transcriptional gene silencing. Nat. Struct. Mol. Biol. 16, 1 24 (200).
22. Fraser, P. & Bickmore, W. Nuclear organization of the genome and the potential for gene regulation. Nature 44, 4 41 (200).
23. Kouzarides, T. Chromatin modifications and their function. Cell, 63 05 (200).
24. Allemand, E., Batsche, E. & Muchardt, C. Splicing, transcription, and chromatin: a ménage à trois. Curr. Opin. Genet. Dev. 1, 5 1 (200).
25. Beckmann, J.S. & Trifonov, E.N. Splice junctions follow a 205-base ladder. Proc. Natl. Acad. Sci. USA, 230 233 (11).
26. Denisov, D.A., Shpigelman, E.S. & Trifonov, E.N. Protective nucleosome centering at splice sites as suggested by sequence-directed mapping of the nucleosomes. Gene 205, 5 (1).
27. Kogan, S. & Trifonov, E.N. Gene splice sites correlate with nucleosome positions. Gene 352, 5 62 (2005).
28. Schones, D.E. et al. Dynamic regulation of nucleosome positioning in the human genome. Cell 2, (200).
29. Valouev, A. et al. A high-resolution, nucleosome position map of C. elegans reveals a lack of universal sequence-dictated positioning. Genome Res. 1, 51 63 (200).
30. Schwartz, S., Meshorer, E. & Ast, G. Chromatin organization marks exon-intron architecture. Nat. Struct. Mol. Biol. advance online publication, doi:.3/nsmb.165 (16 August 200).
31. Kolasinska-Zwierz, P. et al. Differential chromatin marking of introns and expressed exons by H3K36me3. Nat. Genet. 41, 36 31 (200).
32. Sammeth, M., Foissac, S. & Guigo, R. A general definition and nomenclature for alternative splicing events. PLOS Comput. Biol. 4, e00 (200).
33. Wang, E.T. et al. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 40 46 (200).
34. Barski, A. et al. High-resolution profiling of histone methylations in the human genome. Cell, 23 3 (200).
35. Kharchenko, P.V., Woo, C.J., Tolstorukov, M.Y., Kingston, R.E. & Park, P.J. Nucleosome positioning in human HOX gene clusters. Genome Res. 1, 54 61 (200).
36. Peckham, H.E. et al. Nucleosome positioning signals in genomic DNA. Genome Res. 1, 0 (200).
37. Harrow, J. et al. GENCODE: producing a reference annotation for ENCODE. Genome Biol. (Suppl 1), S4 (2006).
38. Berget, S.M. Exon recognition in vertebrate splicing. J. Biol. Chem. 20, 24 24 ().
39. Das, R. et al. Functional coupling of RNAP II transcription to spliceosome assembly. Genes Dev. 20, 00 0 (2006).
40. Ikemura, T. Correlation between the abundance of yeast transfer RNAs and the occurrence of the respective codons in protein genes. Differences in synonymous codon choice patterns of yeast and Escherichia coli with reference to the abundance of isoaccepting transfer RNAs. J. Mol. Biol., 53 5 (12).
41. Kotlar, D. & Lavner, Y. The action of selection on codon bias in the human genome is related to frequency, complexity, and chronology of amino acids. BMC Genomics, 6 (2006).
42. Jabbari, K., Clay, O. & Bernardi, G. GC3 heterogeneity and body temperature in vertebrates. Gene 31, 161 163 (2003).
43. Katz, L. & Burge, C.B. Widespread selection for local RNA secondary structure in coding regions of bacterial genes. Genome Res., 2042 2051 (2003).
44. Duret, L. Detecting genomic features under weak selective pressure: the example of codon usage in animals and plants. Bioinformatics 1 (Suppl 2), S1 (2002).
45. Willie, E. & Majewski, J. Evidence for codon bias selection at the pre-mRNA level in eukaryotes. Trends Genet. 20, 534 53 (2004).

ONLINE METHODS

Human exons and nucleosome data. We downloaded the human RefSeq transcripts 46 and GenBank mRNAs 47 (both relative to the hg18 version of the human genome) from the UCSC table browser (http://genome.ucsc.edu/cgi-bin/hgTables) 48 on August 200. From a set of 25,1 RefSeq transcripts aligning to the human chromosomes 1 to 22 and X (including only one genomic location if the transcript mapped to multiple locations), we chose a set of 6,450 nonredundant internal exons that (i) had an appropriate length for exon definition (between 50 nt and 250 nt, a little more conservative than the 50–300 nt mentioned before 38), (ii) were classified as constitutive using the Astalavista framework 32 (Supplementary Methods) with RefSeq and mRNA exons (an internal exon was classified as constitutive if and only if all annotated RefSeq transcripts and GenBank mRNAs whose transcript sequences overlapped the complete exon show the exon as part of their annotated gene structure), (iii) did not have any adjacent intron of U12 type (using geneid 49,50) or of less than 0 nt and (iv) had AG acceptors and GT donors. Consequently, we used initial and terminal exons in the analysis if all annotated RefSeq transcripts and GenBank mRNAs whose transcript sequences overlapped the splice site (donor for an initial exon; acceptor for a terminal exon) had this splice site as part of their annotated gene structure. Furthermore, we used no length criterion for initial and terminal exons.

Exons from noncoding RNA. We extracted noncoding RNAs from the GENCODE annotation (the reference annotation being built within the framework of the ENCODE project 37). We considered only noncoding RNAs that were not more than 1 kb away from the boundaries of the closest annotated protein-coding loci. This resulted in 3,01 transcripts (corresponding to 2,25 loci), from which 1,53 had at least one internal exon. We extracted internal exons from these transcripts as for coding exons. We retained exons only from transcripts that were classified as processed_transcript or noncoding. This resulted in a set of 1,403 internal exons from noncoding transcripts.
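The selection of internal exons described under "Human exons and nucleosome data" can be illustrated with a minimal Python sketch. The `Exon` record, its field names and the two functions are hypothetical and serve only as illustration; they are not the pipeline used for the analysis. The sketch covers criteria (i), (ii) and (iv) and assumes that constitutive status has already been determined upstream. Criterion (iii), on intron type and minimum intron length, is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    """Hypothetical record for one annotated internal exon."""
    chrom: str
    start: int           # 0-based, inclusive
    end: int             # 0-based, exclusive
    acceptor: str        # last two intronic bases upstream of the exon
    donor: str           # first two intronic bases downstream of the exon
    constitutive: bool   # True if every overlapping transcript includes it


def passes_selection(exon, min_len=50, max_len=250):
    """Apply criteria (i), (ii) and (iv); criterion (iii) is not modelled."""
    length = exon.end - exon.start
    return (
        min_len <= length <= max_len
        and exon.constitutive
        and exon.acceptor.upper() == "AG"
        and exon.donor.upper() == "GT"
    )


def select_nonredundant(exons):
    """Keep each passing exon once, keyed by its genomic coordinates."""
    seen = set()
    kept = []
    for exon in exons:
        key = (exon.chrom, exon.start, exon.end)
        if key not in seen and passes_selection(exon):
            seen.add(key)
            kept.append(exon)
    return kept
```

Keying on genomic coordinates is one simple way to obtain a nonredundant set when the same exon is shared by several transcripts; other definitions of redundancy are possible.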
Nucleosome and histone modification data. To address nucleosome occupancy, we extended nucleosomal and histone modification reads that mapped uniquely to the genome 28,34 to the length of a full nucleosome ( bp). The number of extended reads ("nucleosome occupancy") overlapping each genomic position was calculated at single-nucleotide resolution. To calculate the aggregate values for (pseudo)exon representation, we represented each category (exons, pseudoexons, weak and strong exons) by a series of 750 values. We calculated 350 intronic values for the upstream intron, each as the average nucleosome occupancy at that position over all (pseudo)exons in a category. We calculated 350 values for the downstream intron analogously. To represent an idealized (pseudo)exon with a fixed size despite the varying length of (pseudo)exons, we calculated 50 values as follows. The first value is the average nucleosome occupancy over all (pseudo)exons for a given category in a -bp window centered at nucleotide 1 of the (pseudo)exon. To the second value, every exon e with l(e) nucleotides contributes the average nucleosome occupancy in a -bp window centered around nucleotide 1 + 1*l(e)/50, rounded to the next integer. These contributions are averaged to produce one single value. In the same way, for i = 3,…,50, windows of bp each, centered around nucleotide 1 + (i − 1)*l(e)/50 (rounded to the next integer), are used. We used the -bp window approach to guarantee that all bases in all exons (up to 250 bp) would be taken into account. Averaging within windows made sure that no artificial overcounts were produced when projecting longer exons (up to 250 bp) to the idealized length of 50 bp.

For histone modification normalization against nucleosomes, we treated both nucleosome and histone-modification data as described in the previous section, so that for nucleosomes and for each kind of histone modification 750 values (representing the upstream intron, the exon and the downstream intron) were obtained. Normalization for, for example, H3K36me3 (H3K27me1, H4K20me1 and so on) was performed by dividing the 750 values for H3K36me3 by the corresponding values for nucleosomes.
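The occupancy, aggregation and normalization steps above can be sketched in plain Python. This is an illustrative reading of the description, not the code used for the analysis: all function names are hypothetical, reads are assumed to be given as a 0-based 5′ coordinate plus strand, "rounded to the next integer" is interpreted as a ceiling, and the nucleosome length and window width are left as explicit parameters. The defaults of 350 flanking and 50 exonic values follow the text.

```python
def occupancy(reads, chrom_len, nucleosome_len):
    """Per-base count of reads after extension to nucleosome length.

    reads: iterable of (five_prime, strand), 0-based (assumed input format).
    """
    diff = [0] * (chrom_len + 1)
    for five_prime, strand in reads:
        if strand == "+":
            lo, hi = five_prime, five_prime + nucleosome_len
        else:
            lo, hi = five_prime - nucleosome_len + 1, five_prime + 1
        lo, hi = max(lo, 0), min(hi, chrom_len)
        if lo < hi:
            diff[lo] += 1
            diff[hi] -= 1
    occ, running = [], 0
    for delta in diff[:chrom_len]:
        running += delta
        occ.append(running)
    return occ


def window_mean(occ, center, width):
    """Mean occupancy in a window of odd width centered at `center`."""
    half = width // 2
    lo, hi = max(center - half, 0), min(center + half + 1, len(occ))
    return sum(occ[lo:hi]) / (hi - lo)


def exon_profile(occ, exons, n_points, window):
    """Project exons given as (start, length) onto n_points values."""
    profile = []
    for i in range(n_points):
        vals = []
        for start, length in exons:
            offset = -(-(i * length) // n_points)  # ceil(i * length / n_points)
            vals.append(window_mean(occ, start + offset, window))
        profile.append(sum(vals) / len(vals))
    return profile


def flank_profile(occ, exons, n_flank, upstream):
    """Position-wise mean occupancy in the flanking intron."""
    profile = []
    for k in range(n_flank):
        vals = []
        for start, length in exons:
            pos = start - n_flank + k if upstream else start + length + k
            if 0 <= pos < len(occ):
                vals.append(occ[pos])
        profile.append(sum(vals) / len(vals) if vals else float("nan"))
    return profile


def aggregate_profile(occ, exons, window, n_flank=350, n_points=50):
    """Upstream intron + idealized exon + downstream intron (750 by default)."""
    return (flank_profile(occ, exons, n_flank, True)
            + exon_profile(occ, exons, n_points, window)
            + flank_profile(occ, exons, n_flank, False))


def normalize(mod_profile, nuc_profile):
    """Divide a histone-mark profile by the nucleosome profile, value by value."""
    return [m / n if n else float("nan") for m, n in zip(mod_profile, nuc_profile)]
```

For simplicity the sketch handles a single chromosome and exons on the forward strand; exons on the reverse strand would need their coordinates mirrored before aggregation. Normalizing the aggregate profiles, rather than each exon individually, avoids division by zero at positions where single exons have no nucleosome reads.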
We chose deliberately to perform this at the average level of all (pseudo)exons and not for each (pseudo)exon separately to avoid biases due to excessive pseudocounts.

Transcript sets of transcribed and nontranscribed RefSeq transcripts. A list of nontranscribed ("absent" in their terminology) and transcribed ("present" in their terminology) UCSC transcript sets derived from microarray analysis in resting T cells was provided by D. Schones 28. Using the correspondence tables provided by the UCSC genome browser 48, we labeled a RefSeq gene transcribed (or nontranscribed, respectively) if and only if all corresponding UCSC transcripts were in the list of transcribed (or nontranscribed) UCSC transcripts. When comparing exons of transcribed versus those of nontranscribed genes, we defined the strength of exons and pseudoexons using only the strength of the acceptor. To obtain larger exon sets, % (instead of the previously used 5%) were chosen for the definition of weak and strong.

Human internal exons for length analysis. To investigate length constraints of exons with weak and strong splice sites without the artificially imposed minimal and maximal exon length, we chose a set of ,25 internal RefSeq exons bounded by AG acceptors and GT donors, regardless of length, splicing type (constitutive or alternative) or the type of their surrounding introns (U2 or U12). For cases where an exon can be used by more than one transcript, the exon was counted only once. Again we scored all splice sites and determined weak and strong exons according to the sum of their acceptor and donor scores, as described before. Means and s.d. of the length distributions of weak and strong exons were calculated.

Caenorhabditis elegans exons and nucleosome data. We downloaded C. elegans RefSeq transcripts and mRNAs (both relative to the ce4 version of the C. elegans genome) from the UCSC table browser on 24 November 200. We chose exons from these transcripts in a similar way as the human exons were chosen, with the following differences. (i) Exons had to be surrounded by introns of minimal length of 0 nt, and all introns were assumed to be of U2 type (because C. elegans is not known to contain any U12 introns 51,52).
(ii) No pseudoexons were defined, as most C. elegans introns are too short to harbor pseudoexons. C. elegans nucleosomal and control reads 29 were downloaded from the UCSC table browser. No extension was necessary as they were already of nucleosome size. We obtained nucleosome occupancies and control occupancies as well as aggregate plots as described for the human nucleosome data for both nucleosomes and control.

46. Pruitt, K.D., Tatusova, T. & Maglott, D.R. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 35, D61 D65 (200).
47. Benson, D.A., Karsch-Mizrachi, I., Lipman, D.J., Ostell, J. & Wheeler, D.L. GenBank. Nucleic Acids Res. 36, D25 D30 (200).
48. Kent, W.J. et al. The human genome browser at UCSC. Genome Res., 6 06 (2002).
49. Blanco, E., Parra, G. & Guigo, R. Using geneid to identify genes. Curr. Protoc. Bioinformatics, Chapter 4: Unit 4.3 (200).
50. Parra, G., Blanco, E. & Guigo, R. GeneID in Drosophila. Genome Res., 5 5 (2000).
51. Sheth, N. et al. Comprehensive splice-site analysis using comparative genomics. Nucleic Acids Res. 34, 355 36 (2006).
52. Alioto, T.S. U12DB: a database of orthologous U12-type spliceosomal introns. Nucleic Acids Res. 35, D0 D5 (200).

doi:.3/.3/nsmb.165 nature structural & molecular biology