Glycoproteomic characterization of recombinant mouse α-dystroglycan


Glycobiology vol. 22 no. 5 pp. 662–675, 2012
doi:10.1093/glycob/cws002
Advance Access publication on January 11, 2012

Rebecca Harrison², Paul G Hitchen², Maria Panico², Howard R Morris², David Mekhaiel³, Richard J Pleass³, Anne Dell², Jane E Hewitt³, and Stuart M Haslam¹,²

² Division of Molecular Biosciences, Faculty of Natural Sciences, Imperial College London, London SW7 2AZ, UK and ³ Centre for Genetics and Genomics, School of Biology, Queen's Medical Centre, University of Nottingham, Nottingham NG7 2UH, UK

¹ To whom correspondence should be addressed: Tel: +44-20-7594-5222; Fax: +44-20-7225-0458; e-mail: s.haslam@imperial.ac.uk

Received on August 1, 2011; revised on January 4, 2012; accepted on January 4, 2012

© The Author 2012. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

α-Dystroglycan (DG) is a key component of the dystrophin glycoprotein complex. Aberrant glycosylation of the protein has been linked to various forms of congenital muscular dystrophy. Unusually, α-DG has previously been demonstrated to be modified with both O-N-acetylgalactosamine and O-mannose initiated glycans. In the present study, Fc-tagged recombinant mouse α-DG was expressed and purified from human embryonic kidney 293T cells. α-DG glycopeptides were characterized by glycoproteomic strategies using both nano-liquid chromatography matrix-assisted laser desorption ionization and electrospray tandem mass spectrometry. A total of 14 different peptide sequences and 38 glycopeptides were identified which displayed heterogeneous O-glycosylation. These data provide new insights into the complex domain-specific O-glycosylation of α-DG.

Keywords: α-dystroglycan / glycoproteomics / mass spectrometry / O-GalNAc / O-mannose

Introduction

Muscular dystrophies, a group of diseases that cause severe muscle weakness and loss of skeletal muscle mass, are caused by mutations in genes that encode proteins of the dystrophin glycoprotein complex (DGC; Blake et al. 2002; Muntoni et al. 2002, 2004). The DGC is a multimeric, transmembrane protein complex that was first isolated from the sarcolemma of skeletal muscle. The central protein of the sarcolemma DGC is dystroglycan (DG), which binds to dystrophin and proteins within the basal lamina, establishing a critical link between the cytoplasm and the extracellular matrix (ECM; Ervasti and Campbell 1991, 1993; Ibraghimov-Beskrovnaya et al. 1992). The DGC is not confined to skeletal muscle; DG is also expressed in the heart and smooth muscle, as well as many non-muscle tissues including the brain, peripheral nerve, epithelia, retina and kidney (Durbeej et al. 1995; Durbeej, Henry, Ferletta, et al. 1998). The protein composition of the DGC is tissue-dependent (Durbeej and Campbell 1999). The DGC in non-muscle tissue has been found to have roles in synaptogenesis, epithelial morphogenesis and early mouse development (Durbeej, Henry and Campbell 1998).

DG is encoded by a single gene: dystrophin-associated glycoprotein 1. The propeptide is cleaved by an unidentified protease into two subunits: α-DG and β-DG. α-DG resides on the outer surface of the cell membrane and binds tightly but non-covalently to the transmembrane subunit, β-DG (Ervasti and Campbell 1991; Ibraghimov-Beskrovnaya et al. 1992). β-DG, in turn, connects intracellularly to dystrophin, which binds to the actin cytoskeleton (Ervasti and Campbell 1993). α-DG completes the link from the cytoskeleton to the ECM by binding to several extracellular ligands, including laminin, agrin, perlecan and neurexin (Gee et al. 1994; Sugiyama et al. 1994; Talts et al. 1999; Sugita et al. 2001). α-DG is made up of two globular domains separated by a central mucin-like domain (Talts et al. 1999). The N-terminal domain of α-DG appears to be commonly processed by furin-like activity, although evidence suggests that it remains associated with the rest of the DG protein in vivo (Barresi and Campbell 2006).

Each of the ECM ligands possesses laminin-G-like domains that mediate their high-affinity, calcium-dependent binding to α-DG (Andac et al. 1999; Wizemann et al. 2003). α-DG has extensive and heterogeneous glycosylation that is required for laminin binding (Ervasti and Campbell 1993). Despite a predicted molecular weight of 72 kDa, the mature protein varies in apparent mass from 120 to 156 kDa, depending on tissue source: 156 kDa in the rabbit skeletal muscle (Ervasti and Campbell 1993), 120 kDa in the rabbit and embryonic chick brain (Ervasti and Campbell 1993) and 140 kDa in the rabbit cardiac muscle (Ervasti et al. 1997). The extent of skeletal muscle α-DG glycosylation has also been found to increase during human development (Brown et al. 2004).

Although it is evident that the glycan content of α-DG is functionally critical, the nature of α-DG glycosylation is still largely unknown. α-DG contains three potential N-linked glycosylation sites. Enzymatic removal of N-linked glycans alters the molecular weight of α-DG by 4 kDa (Ervasti and Campbell 1993). Furthermore, this treatment does not have any impact on its activity as an ECM receptor, indicating that the N-linked glycans are not required for ligand binding and therefore that the carbohydrate content of α-DG that mediates binding is O-linked (Ervasti and Campbell 1993). The central domain of α-DG contains a large number of potential O-glycosylation sites (∼50 Ser/Thr residues in a region of around 170 amino acids) and has been designated the mucin-like domain. α-DG has been found to carry a mixture of O-mannose and mucin-type O-GalNAc (N-acetylgalactosamine)-initiated structures (Chiba et al. 1997; Sasaki et al. 1998; Smalheiser et al. 1998).

In humans, recessive mutations in at least six genes result in the hypoglycosylation of α-DG, leading to inherited muscular dystrophies (Hewitt 2009). Only the heterocomplex formed by the association of protein O-mannosyl-transferase 1 and 2 (POMT1 and POMT2) and protein O-linked-mannose β-1,2-N-acetylglucosaminyltransferase 1 (POMGnT1) have been experimentally demonstrated to have glycosyltransferase activity, whereas fukutin, fukutin-related protein and LARGE-encoded proteins have homology to glycosyltransferases. In skeletal muscle, these genes are necessary for laminin-binding activity by α-DG, which is associated with immunoreactivity for two antibodies, IIH6 and VIA4 (Hewitt 2009).

Two recent studies have reported the site-specific O-glycosylation of α-DG from human and rabbit skeletal muscle (Nilsson et al. 2010; Stalnaker et al. 2010). The human study characterized 25 glycopeptides which corresponded to five tryptic peptides. Four of the peptides were heterogeneously glycosylated with both O-mannose and mucin-type O-GalNAc-initiated structures, whereas the fifth peptide, which contained the O-glycosylation sites at T367 and T369, was only modified by O-mannose glycans.
The rabbit study characterized 91 glycopeptides and identified 9 sites with O-mannose structures and 14 sites with O-GalNAc-initiated structures. A third recent investigation expressed rabbit α-DG construct fragments in human embryonic kidney 293T (HEK293T) cells and found that one Thr residue in the mucin-like domain carries an unusual phosphorylated O-mannosyl glycan (Yoshida-Moriguchi et al. 2010). This O-mannosyl phosphorylation was shown to be a component of the post-translational modification of α-DG that is required for binding to laminin. The LARGE protein was shown to act downstream of this phosphorylation event.

In this paper, we describe the design of an Fc-tagged mouse α-DG construct, its expression and purification from HEK293T cells and its glycoproteomic O-glycan site mapping. Characterization of mouse α-DG glycosylation is of particular interest because the mouse is used as a model organism for the study of dystroglycanopathies.

Results

Expression and purification of recombinant mouse α-DG in HEK293T cells

To obtain recombinant mouse α-DG, we used the mIgG2a-Fc2 vector system. We tested for secretion of the recombinant protein in both Chinese hamster ovary and HEK293 cells, as both systems had been used previously to generate recombinant, glycosylated DG (e.g. Kunz et al. 2001; Patnaik and Stanley 2005). The highest yields were generated from HEK293T cells. Therefore, for mass spectrometry analysis, we purified the α-DGFc fusion protein from the conditioned media of stably transfected HEK293T cells. Cells were grown in media containing ultra-low IgG serum to reduce the co-purification of immunoglobulins. The purified fusion protein, as visualized by silver staining, migrated as a heterogeneous smear of apparent mass 110–150 kDa that was positive with the Fc antibody (Figure 1). However, only the fraction of the fusion protein migrating at an apparent mass of 130 kDa was positive for the IIH6 antibody.
Thus, not all of the secreted fusion protein contained this LARGE-dependent epitope. This is consistent with previous studies (Kunz et al. 2001; Patnaik and Stanley 2005) and might be a consequence of the high expression levels of the fusion protein.

Fig. 1. Analysis of purified α-DGFc fusion protein. Aliquots of protein G-purified protein were separated by 6% sodium dodecyl sulfate polyacrylamide gel electrophoresis and visualized by silver staining or by immunoblotting with either anti-Fc antibody or IIH6 (to detect fully glycosylated α-DG). M, Spectra molecular weight marker (Fermentas). The purified Fc fusion protein can be seen by silver staining to migrate as a heterogeneous smear of approximate apparent mass of 110–160 kDa. All of this material appeared positive with the Fc antibody. However, only a fraction of the fusion protein that migrated at an apparent mass of 130–170 kDa was positive for the IIH6 antibody.

Proteomic characterization of recombinant mouse α-DG by matrix-assisted laser desorption ionization mass spectrometry

An aliquot of purified recombinant mouse α-DG was reduced and carboxymethylated, digested with trypsin and subjected to offline nano-liquid chromatography (LC) and matrix-assisted laser desorption ionization (MALDI)-time of flight (TOF)/TOF tandem mass spectrometry (MS/MS). Major peaks from the MALDI-TOF mass spectrometry (MS) profiles were selected for collision-induced dissociation (CID) by MS/MS. Fragment ions detected in the MS/MS spectra were used to search the Swiss-Prot database using the Mascot search engine for peptide sequences consistent with the fragment ions observed. The Mascot search automatically identified 10 peptides from mouse α-DG. Detailed manual assessment of the data revealed additional peptides which had not been automatically assigned due to non-specific cleavages, or peptides that contained a decomposed derivative of S-carboxymethylated methionine, dethiomethyl methionine, a modification that has been previously reported (Jones et al. 1994). An additional peptide, ⁵⁷⁴GGLSAVDAFEIHVHK⁵⁸⁹, was observed in subsequent online nano-LC electrospray (ES)-QSTAR MS experiments (Table I). Six peptides were also observed from the Fc-tag. A summary of the total mapped sequence of recombinant mouse α-DG is presented in Figure 2.

Table I. Summary of α-DG peptides identified

Identification process | Peptide sequence | Residues | Peak observed (m/z)
Peptides automatically identified by Mascot | GGEPNQRPELK | 481–491 | 1224.6
Peptides automatically identified by Mascot | VDAWVGTYFEVK | 497–508 | 1413.7
Peptides automatically identified by Mascot | IPSDTFYDNEDTTTDK | 509–524 | 1861.9
Peptides automatically identified by Mascot | IPSDTFYDNEDTTTDKLK | 509–526 | 2103.1
Peptides automatically identified by Mascot | LREQQLVGEK | 531–540 | 1199.7
Peptides automatically identified by Mascot | EQQLVGEK | 533–540 | 930.5
Peptides automatically identified by Mascot | HEYFMHATDK | 564–573 | 1278.6
Peptides automatically identified by Mascot | GGLSAVDAFEIHVHK | 574–589 | 790.9 (2+)
Peptides automatically identified by Mascot | LAGDPAPVVNDIHK | 603–618 | 1445.8
Peptides automatically identified by Mascot | IALVK | 618–622 | 543.4
Peptides automatically identified by Mascot | LAFAFGDR | 624–631 | 896.5
Manual analysis of the data | SNSQLMYGLPDSSHVGK | 547–563 | 1771.9 (1819.8)
Manual analysis of the data | SQLMYGLPDSSHVGK | 549–563 | 1570.9 (1618.7)
Manual analysis of the data | HEYFMHATDK | 564–573 | 1230.6 (1278.6)

A total of 14 α-DG peptides were detected. Eleven α-DG peptides were identified automatically using Mascot. Three peptides containing dethiomethyl methionine (highlighted in bold) were identified by manual analysis of the data. Expected m/z values of the peptides containing these modified methionine residues are shown in parentheses. Peaks observed are [M + H]⁺, unless annotated with an alternative charge.

Characterization of O-mannosylated glycopeptides by MALDI-MS

Detailed assessment of the offline nano-LC-MALDI-TOF/TOF MS/MS data revealed a number of peaks separated by increments corresponding to monosaccharide masses, allowing identification as potential α-DG glycopeptides (Tables II–VII).
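The [M + H]⁺ values listed for the tryptic peptides in Table I can be checked from monoisotopic residue masses. The following is a minimal sketch, not the authors' procedure: the function name `mh_plus` is hypothetical, and the dethiomethyl methionine shift is assumed here to be the loss of CH3SH (48.003 Da), which matches the roughly 48 Da gap between the observed and expected values in the table.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056          # H2O added for the free N- and C-termini
PROTON = 1.00728          # mass of the charge-carrying proton
DETHIOMETHYL = -48.00337  # assumed shift: loss of CH3SH from methionine


def mh_plus(sequence, dethiomethyl_met=0):
    """Return the monoisotopic m/z of the singly protonated peptide.

    dethiomethyl_met is the number of methionines assumed to be
    converted to dethiomethyl methionine (hypothetical parameter).
    """
    neutral = sum(RESIDUE[aa] for aa in sequence) + WATER
    neutral += dethiomethyl_met * DETHIOMETHYL
    return neutral + PROTON


for seq in ("GGEPNQRPELK", "LAFAFGDR", "HEYFMHATDK"):
    print(seq, round(mh_plus(seq), 1))
print("HEYFMHATDK, dethiomethyl Met", round(mh_plus("HEYFMHATDK", 1), 1))
```

Under these assumptions the calculated values agree with the Table I entries for these peptides to within 0.1 m/z. Cysteine-containing peptides would need an extra carboxymethyl term, which this sketch does not include.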
Among these candidate peaks, for example, m/z 1603.9 is 162 Da (the mass of a Hex residue) greater than m/z 1441.9, and further related signals at m/z 1807.0 and m/z 1969.1 correspond to m/z 1603.9 plus N-acetylhexosamine (HexNAc) and HexHexNAc, respectively. MALDI-TOF/TOF MS/MS analyses of these peaks confirmed the presence of glycosylation, via the loss of carbohydrate masses from the molecular ion peak, and revealed the peptide backbone sequence ³⁵¹DPVPGKPTVTIR³⁶² (with an expected Mr of 1278.7; Table II). This peptide contains two potential O-glycosylation sites. The annotated MALDI-TOF/TOF spectra of m/z 1441.9, 1603.9 and 1969.1 are shown in Figure 3. The glycan attachment sites were determined by analysis of the peptide fragment ions. The MS/MS spectra suggest that both Thr-358 and Thr-360 can be glycosylated. This is shown by the presence of y₄ ions at m/z 488 and 650 in the TOF/TOF spectrum of m/z 1441.9 and the absence of m/z 488 in the TOF/TOF spectrum of m/z 1969.1. Further confirmation is provided by the presence of the y₃ ions at m/z 551 and 916 in the fragmentation spectrum of m/z 1969.1. The glycopeptides at m/z 1441.9 and 1603.9 carry a single hexose residue on the Thr residues. There is no evidence for a Hex-Hex sequence upon the fragmentation of m/z 1603.9. The pattern of monosaccharide losses from the molecular ion and subsequent y-ions in the fragment spectrum of m/z 1969.1 suggests that the glycans attached to the peptide backbone have the compositions Hex and Hex₂HexNAc, which have been assigned as possible O-mannose glycans Man-Thr and Galβ1-4GlcNAcβ1-2Man-Thr.

Fig. 2. Mapped sequences of the recombinant mouse α-DG construct. The construct was expressed in HEK cells and then purified with protein G. Peptides highlighted in light grey correspond to those sequenced using offline nano-LC MALDI-TOF/TOF MS/MS and online nano-LC ES-QTOF MS/MS. Glycosylation sites of O-mannosylation are indicated with bold text and sites detected to be modified by GalNAc are indicated with bold text and an asterisk (*). The peptide highlighted in dark grey corresponds to the peptide containing the novel O-mannose structure (site of attachment is boxed in black). The region of the construct corresponding to the IL2 signal sequence is shown underlined with a dotted line, the furin cleavage site in bold text with a solid underline and the mouse IgG2a Fc domain in light grey text.

Table II. Glycans attached to Thr-358 and Thr-360 within the glycopeptide ³⁵¹DPVPGKPTVTIR³⁶²

Peak observed (m/z) | Mr | Peptide sequence | Glycopeptide minus peptide (Mr) | Glycan composition
1441.9 | 1440.9 | ³⁵¹DPVPGKPTVTIR³⁶² | 162.2 | Hex
1603.9 | 1602.9 | ³⁵¹DPVPGKPTVTIR³⁶² | 324.2 | Hex₂
1807.0 | 1806.0 | ³⁵¹DPVPGKPTVTIR³⁶² | 527.3 | Hex₂HexNAc
1969.1 | 1968.1 | ³⁵¹DPVPGKPTVTIR³⁶² | 689.4 | Hex₃HexNAc

Four glycoforms of the peptide, ³⁵¹DPVPGKPTVTIR³⁶² (Mr = 1278.7), were identified using offline MALDI-TOF/TOF MS/MS. Potential O-glycosylation sites are underlined and peaks observed are [M + H]⁺. The Mr values correspond to non-protonated species.

A second glycopeptide, which maps to ³²⁹IVPTPTSPAIAPPTETMAPPVR³⁵⁰ with a dethiomethyl methionine (Mr 2194.2), was identified (Table III). Peaks at m/z 3371.8, 3736.0 and 4101.2 have a mass equivalent to the peptide backbone plus glycan compositions of Hex₆HexNAc, Hex₇HexNAc₂ and Hex₈HexNAc₃, respectively. The MALDI-TOF/TOF MS/MS spectrum of the ion at m/z 3736.0 can be seen in Figure 4. The peptide sequence contains five potential O-glycosylation sites, and the glycan compositions suggest that all five sites are glycosylated with at least a single hexose residue and that several bear the extended trisaccharide structure (Hex-HexNAc-Hex). The size of the glycopeptides made it difficult to specify which Thr and Ser residues carried the disaccharide structure. Figure 4 shows the major structure of the molecular ion at m/z 3736.0. However, there are fragment ion peaks present that indicate heterogeneity.

Characterization of O-GalNAc glycopeptides by MALDI-MS

The peptide ⁴⁴⁶TPRPVPR⁴⁵² has a theoretical unglycosylated Mr of 821.5 Da. Related peaks at m/z 1187.6 and 1228.7 carry the glycan compositions HexHexNAc and HexNAc₂, respectively (Table IV). The HexHexNAc moiety on the m/z 1187.6 molecular ion was assigned as a core 1 mucin-type O-glycan (Galβ1,3GalNAc), based on knowledge of the biosynthetic pathway and the loss of Hex followed by HexNAc in the MS/MS spectrum (Figure 5). The MALDI-TOF/TOF MS/MS spectrum for the molecular ion at m/z 1228.7 (data not shown) suggests the addition of a HexNAc₂ moiety onto the N-terminal Thr residue. The HexNAc-HexNAc structure was putatively assigned as a core 3 mucin-type O-glycan (GlcNAcβ1,3GalNAc). It is important to note that the same composition could correspond to core 5, core 6 or core 7 structures; however, these are rare in mouse and human tissue (Wopereis et al. 2006).

Characterization of Wisteria floribunda agglutinin lectin chromatography glycopeptides by MALDI-MS

Yoshida-Moriguchi et al. (2010) have demonstrated that a phosphorylated O-linked mannose structure is necessary for α-DG binding to laminin, and used Wisteria floribunda agglutinin (WFA) to enrich glycopeptides carrying this glycan. WFA is a plant lectin that binds with high affinity to glycans containing terminal β-linked GalNAc residues (Torres et al. 1988). Peptides/glycopeptides from recombinant mouse α-DG were treated with 48% aqueous HF, conditions known to be specific for the cleavage of phosphodiester linkages. HF-treated peptides/glycopeptides were subsequently enriched using WFA lectin chromatography, and the resulting bound and wash fractions were analyzed.
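The glycan compositions in Tables II and IV follow from the "glycopeptide minus peptide" mass difference. As a rough illustration of that step, the sketch below searches for Hex/HexNAc/NeuAc combinations that match a given difference. The function name, the 0.5 Da tolerance and the upper limits on residue counts are assumptions made for this example, and monoisotopic residue masses are used although the tabulated values are rounded.

```python
from itertools import product

# Monoisotopic residue masses (Da) of the monosaccharide building blocks.
RESIDUE_MASS = {"Hex": 162.0528, "HexNAc": 203.0794, "NeuAc": 291.0954}
NAMES = ("Hex", "HexNAc", "NeuAc")


def compositions(delta, tol=0.5, max_units=(10, 6, 3)):
    """List (Hex, HexNAc, NeuAc) counts whose total mass is within tol of delta.

    delta     -- glycopeptide Mr minus peptide Mr, in Da
    tol       -- absolute mass tolerance in Da (assumed value)
    max_units -- upper bound on each residue count (assumed values)
    """
    hits = []
    for counts in product(*(range(n + 1) for n in max_units)):
        mass = sum(c * RESIDUE_MASS[name] for c, name in zip(counts, NAMES))
        if abs(mass - delta) <= tol:
            hits.append(counts)
    return hits


# Mass differences taken from Tables II and IV.
for delta in (162.2, 689.4, 365.1, 406.2):
    print(delta, compositions(delta))
```

For these four differences the search returns a single candidate each (Hex, Hex₃HexNAc, HexHexNAc and HexNAc₂). A composition alone does not distinguish isomers or attachment sites, which is why the assignments in the text also rely on MS/MS fragment ions, and larger glycans may need a wider tolerance.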
After WFA enrichment, offline nano-LC and MALDI-TOF MS analysis of the bound fraction revealed three peaks related by m/z 365 (HexHexNAc) increments. These are m/z 2555.2, 2920.3 and 3285.4. Evidence from MALDI-TOF/TOF MS/MS spectra confirmed that these peaks correspond to the glycopeptides ³⁶⁵GAIIQTPTLGPIQPTR³⁸⁰ bearing the glycan HexNAc₂Hex, which is fully consistent with the novel trisaccharide GalNAcβ1,3GlcNAcβ1,4-mannitol (Yoshida-Moriguchi et al. 2010; Table V). Figure 6 shows the MALDI-TOF/TOF MS/MS spectrum of the glycopeptide at m/z 2555.2. The absence of HexNAc₂Hex on the y₃, y₄, y₆, y₇, y₈ and y₁₀ ions suggests that Thr-370 of the construct is modified with the novel trisaccharide.

Table III. Glycans attached to Thr-332, Thr-334, Ser-335, Thr-342 and Thr-344 within the glycopeptide ³²⁹IVPTPTSPAIAPPTETMAPPVR³⁵⁰

Peak observed (m/z) | Mr | Peptide sequence | Glycopeptide minus peptide (Mr) | Glycan composition
3371.8 | 3370.8 | ³²⁹IVPTPTSPAIAPPTETMAPPVR³⁵⁰ | 1176.6 | Hex₆HexNAc
3736.0 | 3735.0 | ³²⁹IVPTPTSPAIAPPTETMAPPVR³⁵⁰ | 1540.8 | Hex₇HexNAc₂
3784.9 | 3783.9 | ³²⁹IVPTPTSPAIAPPTETMAPPVR³⁵⁰ | 1589.7 | Hex₇HexNAc₂
4101.2 | 4100.2 | ³²⁹IVPTPTSPAIAPPTETMAPPVR³⁵⁰ | 1906.0 | Hex₈HexNAc₃

Four glycoforms of the peptides, ³²⁹IVPTPTSPAIAPPTETMAPPVR³⁵⁰ (Mr = 2242.2) and ³²⁹IVPTPTSPAIAPPTETMAPPVR³⁵⁰ (Mr = 2194.2), were identified using offline MALDI-TOF/TOF MS/MS. Potential O-glycosylation sites are underlined and bold methionine residues indicate dethiomethyl methionine residues. The Mr values correspond to non-protonated species.

Characterization of WFA lectin chromatography glycopeptides by online nano-LC ES-QSTAR MS

An aliquot of bound glycopeptides was also analyzed by online nano-LC ES-QSTAR MS and MS/MS. Three glycopeptides were found in the ES-MS spectra. MS data between the ion retention times 40.0 and 41.6 min were summed and examined in detail. Doubly and triply charged ions separated by mass intervals consistent with monosaccharide masses were observed (Supplementary data, Figure S1 and Table S1). Subtracting the mass of the assigned glycan compositions from the calculated molecular weight of the ions gives a peptide backbone mass of 1661.9 Da. This mass can be assigned to the novel trisaccharide-bearing tryptic peptide seen in the MALDI-TOF/TOF MS/MS experiments, ³⁶⁵GAIIQTPTLGPIQPTR³⁸⁰. Further evidence for this peptide backbone sequence was gained from CID-ES-MS/MS data. The ions at m/z 825.5 and 1071.3 were among those selected for data-dependent CID-ES-MS/MS, and the resulting spectra can be seen in Figure 7. The fragmentation of these molecular ions provides definite evidence for glycosylation through the presence of major low-mass, singly charged signals corresponding to glycan fragments. Characteristic fragment ions are observed at m/z 204 (HexNAc⁺), m/z 366 (HexHexNAc⁺) and m/z 274 (NeuAc⁺). ES-MS/MS can also provide additional information about the peptide backbone sequence. The series of b-ions (b₃, b₄, b₅ and b₆) in both spectra provides evidence for the N-terminal sequence, GAIIQT, whereas the y-ions suggest the C-terminal sequence PTLGPIQPTR. y-ions such as those at m/z 1241.55 and 1403.84 (top panel) are evidence for the peptide plus carbohydrate compositions comprising a single hexose and two hexose residues, respectively (Table V).

Table IV. Glycans attached to Thr-446 within the glycopeptide ⁴⁴⁶TPRPVPR⁴⁵²

Peak observed (m/z) | Mr | Peptide sequence | Glycopeptide minus peptide (Mr) | Glycan composition
1187.6 | 1186.6 | ⁴⁴⁶TPRPVPR⁴⁵² | 365.1 | HexHexNAc
1228.7 | 1227.7 | ⁴⁴⁶TPRPVPR⁴⁵² | 406.2 | HexNAc₂

Two glycoforms of the peptide, ⁴⁴⁶TPRPVPR⁴⁵² (Mr = 821.5), were identified using offline MALDI-TOF/TOF MS/MS. Potential O-glycosylation sites are underlined. The Mr values correspond to non-protonated species.

A second glycopeptide was identified in the online nano-LC ES-MS and MS/MS analysis of the bound α-DGFc. The glycopeptide eluted between 38.4 and 40.0 min.
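The site assignments in this study rest on whether a given y-ion appears with or without an attached glycan mass. A minimal sketch of that arithmetic is given below. The helper `y_ion` is hypothetical, and reading the m/z 1241.55 and 1403.84 signals as y₁₀ ions of GAIIQTPTLGPIQPTR is an assumption of this example rather than a statement from the text.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728
HEX, HEXNAC = 162.0528, 203.0794  # monoisotopic sugar residue masses


def y_ion(sequence, n, glycan=0.0):
    """m/z of the singly charged y_n ion (the last n residues),
    optionally carrying an attached glycan of the given mass."""
    return sum(RESIDUE[aa] for aa in sequence[-n:]) + WATER + PROTON + glycan


pep = "DPVPGKPTVTIR"
print("y4        ", round(y_ion(pep, 4)))                      # bare VTIR
print("y4 + Hex  ", round(y_ion(pep, 4, HEX)))                 # Hex on Thr-360
print("y3 + Hex  ", round(y_ion(pep, 3, HEX)))
print("y3 + Hex2HexNAc", round(y_ion(pep, 3, 2 * HEX + HEXNAC)))

# Assumed assignment: y10 of the WFA-bound peptide with one or two hexoses.
wfa = "GAIIQTPTLGPIQPTR"
print(round(y_ion(wfa, 10, HEX), 2), round(y_ion(wfa, 10, 2 * HEX), 2))
```

With these masses the DPVPGKPTVTIR values round to 488, 650, 551 and 916, the y₄ and y₃ signals cited for Thr-358/Thr-360. The y₁₀ values come out within about 0.15 m/z of the 1241.55 and 1403.84 signals, which is consistent with, but does not prove, the assumed assignment.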
Two triply charged signals were observed (m/z 958.8 3+ and 1055.9 3+) consistent with the peptide backbone 509 IPSDTFYDNEDTTTDKLK 526 carrying HexHexNAc3 and NeuAcHexHexNAc3. This peptide contains five potential O-glycosylation sites. The ES-MS/MS data generated do not allow the assignment of the glycan components to specific sites, but the peak at m/z 407 indicates the presence of the structure HexNAcHexNAc, which most likely corresponds to a core 3 O-glycan (GlcNAcβ1,3GalNAcα1-Ser/Thr).

Table V. Glycans attached to Thr-370, Thr-372 and Thr-379 within the glycopeptide 365 GAIIQTPTLGPIQPTR 380

Peak observed (m/z)             Mr       Glycopeptide minus peptide (Mr)   Glycan composition
912.5 2+                        1823.0   161.1                             Hex
993.6 2+                        1985.2   323.2                             Hex2
1074.6 2+                       2147.2   485.3                             Hex3
1176.7 2+                       2351.4   689.5                             Hex3HexNAc
825.5 3+; 1278.3 2+; 2555.2     2554.2   892.3                             Hex3HexNAc2
1359.3 2+                       2716.6   1054.7                            Hex4HexNAc2
974.3 3+; 1460.8 2+; 2920.3     2919.3   1257.4                            Hex4HexNAc3
1504.8 2+                       3007.6   1346.0                            NeuAcHex4HexNAc2
1095.7 3+; 1541.9 2+            3081.8   1419.9                            Hex5HexNAc3
1071.3 3+; 1606.4 2+            3210.8   1548.9                            NeuAcHex4HexNAc3
1095.7 3+; 1643.5 2+; 3285.4    3284.4   1622.5                            Hex5HexNAc4
1192.7 3+                       3575.1   1913.6                            NeuAcHex5HexNAc4
1289.7 3+                       3866.1   2204.2                            NeuAc2Hex5HexNAc4

Peptide sequence in all rows: 365 GAIIQTPTLGPIQPTR 380. Thirteen glycoforms of the peptide, 365 GAIIQTPTLGPIQPTR 380 (Mr = 1661.9), were identified in the bound fraction after WFA purification using both offline MALDI-TOF/TOF MS/MS and online ES-QSTAR MS/MS. Potential O-glycosylation sites are underlined and peaks observed are [M + H]+, unless annotated with an alternative charge. The Mr values correspond to non-protonated species.

Table VI. Glycan compositions attached to Ser-511 and Thr-513, Thr-520, Thr-521 and Thr-522 within the peptide 509 IPSDTFYDNEDTTTDKLK 526

Peak observed (m/z)   Mr       Glycopeptide minus peptide (Mr)   Glycan composition
958.8 3+              2873.4   771.5                             HexHexNAc3
1055.9 3+             3164.7   1062.8                            NeuAcHexHexNAc3

Peptide sequence in all rows: 509 IPSDTFYDNEDTTTDKLK 526. Two glycoforms of the peptide, 509 IPSDTFYDNEDTTTDKLK 526 (Mr = 2101.9), were identified in the bound fraction after WFA purification using online ES-QSTAR MS/MS. Potential O-glycosylation sites are underlined and peaks observed are [M + H]+, unless annotated with an alternative charge. The Mr values correspond to non-protonated species.

Table VII. Glycan compositions attached to Ser-466 and Thr-464 and Thr-469 within the peptide, 462 LETASPPTR 470

Purification fraction   Peak (m/z)             Mr       Glycopeptide minus peptide (Mr)   Glycan composition
Bound fraction          770.4 2+               1538.8   568.3                             HexHexNAc2
                        790.9 2+               1579.8   609.3                             HexNAc3
                        871.9 2+               1741.8   771.3                             HexHexNAc3
                        953.0 2+               1904.0   933.5                             Hex2HexNAc3
                        973.5 2+               1945.0   974.5                             HexHexNAc4
                        703.4 3+; 1054.5 2+    2107.0   1136.5                            Hex2HexNAc4
                        800.4 3+               2398.2   1427.7                            NeuAcHex2HexNAc4
Wash fraction           587.8 2+; 1174.7       1173.7   203.2                             HexNAc *
                        668.8 2+; 1336.7       1335.7   365.2                             HexHexNAc *
                        689.4 2+               1376.8   406.3                             HexNAc2 *
                        770.4 2+               1538.8   568.3                             HexHexNAc2
                        790.9 2+               1579.8   609.3                             HexNAc3
                        851.5 2+               1701.0   730.5                             Hex2HexNAc2 *
                        871.9 2+               1741.8   771.3                             HexHexNAc3
                        915.5 2+               1829.0   858.5                             NeuAcHexHexNAc2 *
                        953.0 2+               1904.0   933.5                             Hex2HexNAc3
                        1034.0 2+              2066.0   1095.5                            Hex3HexNAc3 *

Potential peptide sequence in all rows: 462 LETASPPTR 470. Thirteen glycoforms of the peptide, 462 LETASPPTR 470 (Mr = 970.5), were identified in both the bound and wash fractions after WFA purification using online ES-QSTAR MS/MS. Potential O-glycosylation sites are underlined and peaks observed are [M + H]+, unless annotated with an alternative charge. The Mr values correspond to non-protonated species. Rows marked with an asterisk (shaded in the original table) indicate glycoforms that were detected in the wash fraction only.
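The mass bookkeeping behind Tables V-VII (neutral mass from m/z and charge, glycan mass by subtracting the peptide Mr, then a composition that fits) can be sketched in a few lines of Python. This is an illustrative sketch, not the software used in the study: the function names, the 0.5 Da tolerance and the search limit of six residues per monosaccharide class are assumptions, and the residue masses are standard monoisotopic values rather than numbers quoted in the text.

```python
from itertools import product

PROTON = 1.00728  # mass of a proton in Da

# Standard monoisotopic residue masses in Da (assumed, not taken from the paper).
RESIDUE = {"Hex": 162.0528, "HexNAc": 203.0794, "NeuAc": 291.0954}


def neutral_mass(mz, z):
    """Non-protonated mass (Mr) of an [M + zH]z+ ion observed at m/z."""
    return z * (mz - PROTON)


def glycan_mass(mz, z, peptide_mr):
    """'Glycopeptide minus peptide' mass, as tabulated in Tables IV-VII."""
    return neutral_mass(mz, z) - peptide_mr


def candidate_compositions(delta, tol=0.5, max_count=6):
    """Return (Hex, HexNAc, NeuAc) counts whose summed mass is within tol of delta.

    tol and max_count are hypothetical search settings chosen for illustration.
    """
    hits = []
    for n_hex, n_hexnac, n_neuac in product(range(max_count + 1), repeat=3):
        if n_hex + n_hexnac + n_neuac == 0:
            continue  # an empty glycan is not a composition
        mass = (n_hex * RESIDUE["Hex"]
                + n_hexnac * RESIDUE["HexNAc"]
                + n_neuac * RESIDUE["NeuAc"])
        if abs(mass - delta) <= tol:
            hits.append((n_hex, n_hexnac, n_neuac))
    return hits


if __name__ == "__main__":
    peptide_mr = 2101.9  # 509 IPSDTFYDNEDTTTDKLK 526, from Table VI
    for mz in (958.8, 1055.9):  # the two triply charged signals in Table VI
        delta = glycan_mass(mz, 3, peptide_mr)
        print(mz, round(neutral_mass(mz, 3), 1), round(delta, 1),
              candidate_compositions(delta))
```

Applied to the two triply charged ions in Table VI, the sketch gives the tabulated Mr and glycan masses and returns a single candidate composition for each ion. For larger glycans or tighter mass accuracy, the tolerance and search limits would need to be adjusted, and more than one composition may fit.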
Binding to the WFA lectin column could be explained by the presence of the Tn antigen (GalNAcα1-Ser/Thr), unusual core structures or non-specific charged sialic acid interactions (Table VI). Examination of the α-DG WFA bound fraction ES-MS data eluting between 27.1 and 28.5 min revealed the presence of a third glycopeptide. The molecular and fragment ion values were consistent with the peptide backbone sequence of 462 LETASPPTR 470. This peptide contains three potential O-glycosylation sites. The glycan compositions, HexHexNAc2 to NeuAcHex2HexNAc4, were assigned to mucin-type O-glycans with a large amount of GalNAcα1-Ser/Thr (Tn antigen), which could be the reason that the glycopeptides bound to the WFA column (Table VII). An aliquot of the WFA lectin chromatography wash fraction was also subjected to online nano-LC ES-MS and MS/MS. The 462 LETASPPTR 470 glycopeptide eluted between 28.9 and 30.7 min. In the wash fraction, the glycan compositions, HexNAc to Hex3HexNAc3, contained more hexose (Table VII). This increase in hexose may have decreased the amount of terminal GalNAc and would therefore explain why these glycopeptides did not bind to the lectin and were found in the wash fraction. A peak eluting in the wash fraction was observed at m/z 790.9 +. This molecular ion mapped to a previously undetected α-DG peptide, 588 GGLSAVDAFEIHVK 603 (Table I).

Discussion

A total of 14 different peptide sequences and 38 glycopeptides from mouse α-DG were identified, resulting in coverage of 49% of the sequence C-terminal to the furin cleavage site (Figure 2). Peptides originating from the N-terminal region of the construct were not detected; this is presumed to be due to processing at the furin cleavage site and is consistent with the studies of Stalnaker et al. on endogenous protein purified from rabbit skeletal muscle.
The 38 glycopeptides identified consisted of six different peptide backbones containing 19 potential glycosylation sites, of which 11 were confirmed to be glycosylated; three of these glycopeptides were found to carry mucin-type, O-GalNAc initiated glycans and three glycopeptides contained nine Thr residues and one Ser residue carrying O-mannose initiated glycans. These glycopeptides displayed heterogeneity associated with both their glycan core structures and their glycan site occupancies. Only one site carrying the novel trisaccharide, previously shown to contain a phosphomannose residue (Yoshida-Moriguchi et al. 2010), was observed. Between Ile-329 and Arg-380 (52 amino acid residues), 10 of the 11 Thr/Ser residues were found to carry O-mannose. The 11th residue, Thr-363, was not detected in our analyses. This was likely due to the short peptide backbone created by the tryptic digest, 363 TR 364, and the resulting low molecular weight and high hydrophilicity of the glycopeptide. No information about the peptide and associated glycosylation sites between Val-381 and Arg-445 of α-DG was acquired. These 65 amino acid residues make up a region containing 23 potential O-glycosylation sites (Ser/Thr residues) and 11 Pro residues. One reason for not detecting the glycopeptides in this region could be restricted protease digestion, caused by the densely packed glycosylation preventing trypsin from accessing the peptide backbone substrate. The resulting high molecular weight of the large peptide region plus the potential heterogeneity of glycosylation would lead to molecular ion amounts below the level of detection. A recent mass spectrometric analysis of the extracellular domain of Drosophila DG revealed an O-mannosylation-rich, trypsin-resistant region that was not amenable to MS analysis and was thought to correspond to the central mucin-like domain (Nakamura et al. 2010). We achieved a much higher degree of sequence coverage after the mucin-like domain, between the residues Thr-446 and Arg-631 (73.6%), and within this region, nine mucin-type O-glycosylation sites were detected.

A combination of complementary mass spectrometry instruments was utilized in this study. The ES-QSTAR MS/MS (CID-ES-MS/MS) spectrum is dominated by low-mass carbohydrate fragment ions providing definite evidence for glycosylation. The y- and b-ions bear few or no glycan structures, allowing confident assignment of the peptide backbone. The carbohydrate fragment ions, however, can often suppress the signals from y- and b-ions. The MALDI-TOF/TOF MS/MS spectrum, on the other hand, is dominated by losses of the glycan components from the molecular ion. The order in which glycan components are lost provides information on the glycan sequence. In addition, y-ions that do not lose the glycan components allow the assignment of site-specific glycosylation. A second difference between MALDI and ES ionization highlighted in this study is the sensitivity to minor and sialylated components seen in ES-MS. No sialylated components were identified in the MALDI data due to loss of the sialic acid moiety during in-source decay and post-source decay. In contrast, during ES ionization, the sialic acid remains associated with the glycan, allowing sialylated glycopeptides to be detected.

Fig. 3. MALDI-TOF/TOF MS/MS spectra of the molecular ions at m/z 1441.9 (A), m/z 1603.9 (B) and m/z 1969.1 (C) from α-DGFc1. Peptide fragmentation provides strong evidence for the sequence 351 DPVPGKPTVTIR 362. Intact y-ions are labeled in purple, intact b-ions are labeled in green and y- and b-ions with losses of monosaccharides, ammonia or water are labeled in black. Immonium ions are labeled in pink. Internal fragment ions are labeled in orange. The pink arrows show loss of the indicated glycan substituent. Ions diagnostic for glycosylation patterns are highlighted in bold.

Fig. 4. MALDI-TOF/TOF MS/MS spectrum of the glycopeptide at m/z 3736.0. Glycopeptide fragmentation provides strong evidence for the peptide sequence 329 IVPTPTSPAIAPPTETMAPPVR 350. y-ions are labeled in purple; no b-ions were observed. Immonium ions are labeled in pink. Internal fragment ions are labeled in orange. The pink arrows show loss of the indicated glycan substituent. The major structure for the molecular ion at m/z 3736.0 is shown in the inset; however, the presence of the y16 ion at m/z 2437 provides evidence for the trisaccharide at the C-terminal Thr residues or the Ser residue. Methionine in bold blue indicates a dethiomethyl methionine.

Fig. 5. MALDI-TOF/TOF MS/MS spectrum of the glycopeptide at m/z 1187.6. Peptide fragmentation provides evidence for the sequence 446 TPRPVPR 452. Intact y-ions are labeled in purple, intact b-ions are labeled in green. Ions with additional water and losses of monosaccharides are labeled in black. Immonium ions are labeled in pink. Internal fragment ions are labeled in orange. The loss of Hex followed by HexNAc from the molecular ion (m/z 1025 and 822) suggests a mucin-type core 1 structure.

Fig. 6. MALDI-TOF/TOF MS/MS spectrum of the molecular ion at m/z 2555.2. Peptide fragmentation provides evidence for the sequence 365 GAIIQTPTLGPIQPTR 380. A major structure for the molecular ion at m/z 2555.2 is shown in the inset. y- and b-ions produced from fragmentation of this structure are labeled in purple and green, respectively. The pink arrows show the loss of the indicated glycan substituent. Immonium ions are labeled in pink.

In their analysis of rabbit α-DG, Stalnaker et al. (2010) succeeded in characterizing 91 glycopeptides containing 21 O-glycosylation sites, 9 of which were O-mannosylated and 14 were O-GalNAcylated. In our study, we found one glycopeptide and two peptides that were not identified by Stalnaker et al. These were 329 IVPTPTSPAIAPPTETMAPPVR 350, 547 SNSQLMYGLPDSSHVGK 563 and 588 GGLSAVDAFEIHVK 603.
Detection of the large glycopeptide between Ile-329 and Arg-350 demonstrates the advantage of employing both MALDI and ES-MS instrumentation. This glycopeptide was only detected by MALDI-TOF MS. Detection of Ser-547 to Lys-563 was difficult due to unusual non-specific cleavage of the peptide backbone, and the peptide was assigned only after intense manual analysis of the data. The absence of the rabbit peptide equivalent to 588 GGLSAVDAFEIHVK 603 (GGLSAVDAFEIHVHK) in the data set of Stalnaker et al. led them to hypothesize that a reason for this may be the presence of an additional phosphomannose moiety. However, in our recombinant mouse α-DG, this peptide was detected in an unglycosylated form [an observation that was also made during the analysis of human α-DG (Nilsson et al. 2010)], suggesting that the presence of an additional phosphomannose at this position is unlikely. In the recombinant mouse α-DG, the peptide DPVPGKPTVTIR was found to carry O-mannose glycans. In contrast, in purified α-DG from rabbit skeletal muscle, these sites have been reported to bear mucin-type O-glycans (Stalnaker et al. 2010). In order for the authors to see this rabbit glycopeptide, glycosidases were used to reduce the heterogeneity of the glycopeptides. Interestingly, in another recent study, the only O-mannosylated sites identified in human α-DG were in the peptide DPVPGKPTVTIR (Nilsson et al. 2010). Analysis of α-DG purified from human skeletal muscle tissue also revealed that O-glycosylation at the C terminus (after Thr-446) is mucin-type O-glycosylation, which is in agreement with the data reported here. Stalnaker et al. do not agree with this observation; in rabbit skeletal muscle, O-mannose glycans were reported on the C-terminal peptides, LETASPPTR and TTTSGVPR. The discrepancies between human and rabbit skeletal muscle extracts suggest that there is species-specific variation in O-mannosylation patterns.
Similarities between the glycosylation of recombinant mouse α-DG expressed in HEK cells and human α-DG demonstrate the validity of the system as a model for studying α-DG glycosylation. Nilsson et al. used ES-MS/MS and, due to fragmentation limitations associated with this instrumentation, did not obtain information on glycan sequences or specific attachment sites. As a result, they speculated that an O-mannosylated Thr may carry a novel glycan structure with one or two Hex bound to the innermost O-mannose, leaving mucin-type O-glycosylation on the other Thr residues within the peptide (DPVPGKPTVTIR; Nilsson et al. 2010). In this report, using MALDI-TOF/TOF MS/MS, we were able to rule out potential novel structures and confirm that this peptide does carry the biosynthetically predicted O-mannosyl lactosamine structures.

Fig. 7. CID-ES-MS/MS spectra of m/z 825.5 3+ (A) and m/z 1071.3 3+ (B) acquired during the WFA-enriched mouse α-DG online LC ES-MS/MS experiment. The peptide sequence assignments of 365 GAIIQTPTLGPIQPTR 380 are provided in the schematic. y-ions, b-ions and internal fragment ions are labeled in orange, immonium ions are labeled in pink and the peaks corresponding to carbohydrate fragments are annotated.

Nilsson et al. (2010) report limited structural variability and suggest a reason for this is that the O-mannosylation is specific and strictly regulated. In contrast, the data obtained during this study revealed a higher degree of heterogeneity in the O-mannosylated structures, e.g. the 13 forms of the glycopeptide, 365 GAIIQTPTLGPIQPTR 380, that were seen in the online LC ES-MS/MS experiment. Stalnaker et al. also report a similar level of heterogeneity of O-mannose structures on native rabbit α-DG. Nevertheless, compared with the potential heterogeneity displayed by mammalian mucin-type O-glycosylation [exemplified in a recent study by Ismail et al. (2011)], the array of reported O-mannose structures is relatively limited.

Analyses of rabbit skeletal muscle α-DG (Stalnaker et al. 2010) and Drosophila DG (Nakamura et al. 2010) revealed glycosylation sites that could be modified by either O-mannose or O-GalNAc. This finding led to the suggestion that interplay exists between the two biosynthetic pathways. Stalnaker et al. observed both O-mannose and O-GalNAc glycans on Ser and Thr residues in the peptide LETASPPTR; in the analysis described here, this glycopeptide was identified only in the online nano-LC ES-MS and MS/MS experiment and therefore glycan compositions, not structures, could be assigned. The observation that a fraction of the glycopeptides bound to the WFA column and a fraction was collected in the wash may indicate the presence of O-mannose initiated glycans in the wash fraction. Recent glycomic profiling of O-glycans from the stomach of mice deficient in core 2 β1,6-N-acetylglucosaminyltransferases (C2GnT triple KO) revealed that when the core 2 mucin-type O-glycosylation pathway is disrupted, stomach cells synthesize elongated O-mannose glycans containing up to three Fuc residues and three LacNAc repeats. The authors suggested that cells may amplify the addition of complex antennae to O-linked mannose in order to replace the diminished extended core 2 O-glycans (Ismail et al. 2011). This provides additional evidence for interplay between O-mannosylation and O-GalNAcylation.

O-Mannosylation and O-GalNAcylation are two pathways that are theoretically competing for the same sites (Ser/Thr residues). However, enzymes for O-mannose attachment are localized to the ER and precede the O-GalNAc machinery that is localized in the cis-Golgi. Site-specificity of O-GalNAcylation appears to be a consequence of the large family of enzymes that catalyze the attachment of α-GalNAc to Ser/Thr residues [UDP-GalNAc:polypeptide α-N-acetylgalactosaminyltransferases (pp-GalNAcTs)]. There are up to 20 different known isoforms of pp-GalNAcT. Many of these enzymes have clear specificities for the sites of attachment and are differentially expressed over tissue and time (Jensen et al. 2009; Perrine et al. 2009), permitting a complex and strict regulation.
O-Mannosylation, in contrast, is initiated by the action of two enzymes (POMT1 and POMT2) that form a single complex (Akasaka-Manya et al. 2006). Although an alternative splice form of POMT2 was found in the testis, it did not appear to be involved in O-mannosylation (Lommel et al. 2008). The importance of the O-mannose modification on α-DG and its restricted expression thus imply a different and much stricter regulation of O-mannosylation compared with O-GalNAcylation. Investigations into the mechanisms behind this strict regulation are needed. In the study of human α-DG by Nilsson et al., the peptide LETASPPTR is reported to carry only mucin-type O-GalNAc initiated glycans. The authors suggest that O-mannosylation is a strictly regulated glycosylation that occurs at specific sites within α-DG to direct the synthesis of the large, unknown, phosphomannose-linked moiety (Nilsson et al. 2010). The distribution of the different O-glycosylation types on human α-DG and the localization of O-mannose glycans observed in recombinant mouse α-DG make this an attractive hypothesis. This would explain the finding of Combs and Ervasti (2005), who showed that when the O-mannose glycans were removed by glycosidases the binding of laminin was not affected. Removal of the small O-mannose glycans after the synthesis of the large, functional moiety would not interfere with its binding to laminin. This hypothesis also provides an explanation for the involvement of the enzyme POMGnT1 in the correct functioning of α-DG. The demonstration that addition of a large, unknown moiety via a phosphodiester linkage onto the mannose residue within GalNAcβ1,3GlcNAcβ1,4Man was necessary for correct α-DG functioning (Yoshida-Moriguchi et al. 2010) does not explain why mutations in POMGnT1 (an enzyme that catalyzes the addition of GlcNAc in a β1,2-linkage to O-linked mannose) also cause dystroglycanopathies in humans (Yoshida et al. 2001; Godfrey et al. 2007) and mice (Liu et al.
2006; Miyagoe-Suzuki et al. 2009). If a dense population of lactosamine-containing O-mannose glycans (NeuAcα2,3/Galβ1,4GlcNAcβ1,2Man) plays a role in the synthesis of the large, unknown, functional moiety, it is feasible that mutations in POMGnT1 would prevent proper α-DG function. It is worth noting that both in this paper and in that of Yoshida-Moriguchi et al., a glycan that must have been biosynthesized by the action of POMGnT1 is observed two amino acids C-terminal of the novel trisaccharide. A recent in vitro study by Manya et al. (2007) used a series of synthetic peptides derived from the mucin-like domain of α-DG and recombinant POMT1/2 enzymes to demonstrate that mammalian O-mannosylation prefers the amino acid sequence IXPT(P/X)TXPXXXXPT(T/X)XX. This proposed consensus sequence agrees with the glycopeptide 329 IVPTPTSPAIAPPTETMAPPVR 350 (IXPT(P/X)TXPXXXXPT(T/X)XX), which contains 5 of our 10 defined O-mannosylation sites. Three further O-mannosylated sites, in the peptide 365 GAIIQTPTLGPIQPTR 380, are in a similar sequence that deviates from the proposed consensus sequence only twice: IXQT(P/X)TXGPXXPT(T/X)XX. Nevertheless, the remaining two O-mannosylation sites, present in the peptide 351 DPVPGKPTVTIR 362, did not agree with the consensus sequence. Additional contradicting evidence was provided by a BLAST search undertaken by Manya et al. (2007) that revealed α-DG as the only protein containing this consensus sequence. In their glycoproteomic analysis of rabbit α-DG, Stalnaker et al. (2010) also concluded that O-mannosylation of particular residues is not regulated solely by a local consensus sequence. The large amounts of O-mannose initiated glycans seen in the mouse brain (Parry et al. 2007), and the recent glycomics investigation revealing large amounts of O-mannosylation in the gastrointestinal tract of a mouse model with the disrupted core 2 O-glycosylation pathway (Ismail et al.
2011) suggest that O-mannosylation is a lot more prevalent than originally thought. This has additionally been shown in a recent paper by Stalnaker et al. (2011), who utilized glycomic methodologies to characterize O-glycans released from brain proteins of three different knockout mouse models associated with O-mannosylation. A subsequent study by Breloy et al. (2008) used recombinant fragments of human α-DG expressed in epithelial cell lines and argued that O-mannosylation was a doubly controlled process involving an upstream peptide region and residues flanking the modified Ser/Thr residue. The upstream peptide region (equivalent to Glu-369 to Ile-409 in the construct studied here) was reported to be necessary and sufficient to induce O-mannosylation. This peptide was not upstream of any of the 10 O-mannosylation sites identified. The authors also suggest that flanking basic amino acid residues increase O-mannosylation of Ser and Thr residues. However, 9 of the 10 O-mannosylated Ser/Thr residues reported here do not have flanking basic residues, and the alignment of the 10 O-mannosylation sites provided no obvious local consensus sequence for attachment.

Yoshida-Moriguchi et al. (2010) showed that treating purified α-DG from mouse skeletal muscle with cold hydrofluoric acid (which cleaves phosphodiester linkages) resulted in a reduction in mass (from 150 to 70 kDa) and loss of the IIH6 epitope and laminin-binding activity. MS and NMR analysis of a peptide fragment from the mucin-like domain of rabbit α-DG produced in a HEK293 expression system identified the phosphorylated O-mannosyl trisaccharide with the structure GalNAcβ1,3GlcNAcβ1,4(C6-phosphate)Manα-Ser/Thr. This novel O-linked mannose structure represents the first vertebrate non-glycosylphosphatidylinositol-anchored glycoprotein modified by a phosphodiester linkage (Yoshida-Moriguchi et al. 2010). After treatment with hydrofluoric acid and purification by WFA lectin chromatography, a glycopeptide carrying the novel trisaccharide at position Thr-379 was observed (374 GAIIQTPTLGPIQPTR 389). Data presented in this study demonstrated a greater heterogeneity of glycan structures on Thr-370, Thr-372 and Thr-379, with larger sialylated glycan structures. Our data are fully consistent with the presence of a phosphorylated O-mannosyl trisaccharide with the structure GalNAcβ1,3GlcNAcβ1,4(C6-phosphate)Man. However, it should be noted that the phosphorylated O-mannosyl trisaccharide was not directly observed. This is most likely due to the smaller amount of starting material utilized in comparison with Yoshida-Moriguchi et al. This is the first report of the phosphorylated O-mannose moiety on analysis of a full-length α-DG, as Yoshida-Moriguchi et al.
used a fragment corresponding to the mucin-like domain of the protein, and the recent glycoproteomic analyses of α-DG purified from rabbit skeletal muscle and human skeletal muscle did not observe this peptide and corresponding trisaccharide. We note that both mass spectrometric analyses of this novel glycan have utilized recombinant α-DG expressed in HEK cell expression systems, and therefore, it is possible that this modification reflects the use of this cell system and might be different from what happens in muscle tissue.

Materials and methods

Expression and purification of recombinant mouse α-DG in HEK293T cells (α-DGFc)

The cDNA fragment corresponding to amino acid residues 29-651 of mouse α-DG was amplified by polymerase chain reaction from IMAGE clone 3,496,914 and cloned into pFUSE-mIgG2A-Fc2 (Invivogen, Paisley, UK). HEK293T adherent cells were grown in Dulbecco's modified Eagle's medium (DMEM) containing 10% (v/v) fetal calf serum, 25 mM glucose, 2 mM glutamine, 100 U/mL of penicillin and 100 µg/mL of streptomycin and transfected with the recombinant plasmid. Stable cell lines expressing secreted α-DGFc fusion protein were selected by the addition of 400 µg/mL of zeocin and maintained by growing in 100 µg/mL of zeocin. For recombinant protein production, the α-DGFc-expressing HEK293T cells were transferred to DMEM containing 10% (v/v) ultra-low IgG fetal bovine serum (Gibco-BRL, Paisley, UK). The conditioned medium was incubated with protein G-agarose (Calbiochem, Nottingham, UK) for 4 h at room temperature or overnight at 4 °C. The beads were then washed extensively with 1.5 M glycine, 3 M NaCl, pH 9.0, and the bound α-DGFc protein was eluted with 0.2 M glycine-HCl, pH 2.8, and neutralized by addition of 1 M Tris-HCl, pH 9.0.
Protein recovery and purity were assessed by silver staining of PAGE gels and western blotting using antibodies against the Fc domain (horseradish peroxidase conjugated anti-mouse Fc; Pierce, Loughborough, UK) or the IIH6 epitope (Upstate, Abingdon, UK).

Reduction and carboxymethylation of recombinant mouse α-DG

The sample was dissolved in degassed Tris (0.6 M, pH 8.5), incubated with 1.5 mg of dithiothreitol at 37 °C for 1 h and carboxymethylated with 20 mg of iodoacetic acid at room temperature for 1 h in the dark. The reaction was terminated by transferring the protein into a dialysis cassette and dialyzing against regular changes of Ambic buffer (50 mM, pH 7.5) at 4 °C for 48 h.

Tryptic digestion of recombinant mouse α-DG

About 20 µg of lyophilized sequencing grade modified trypsin was reconstituted with 20 µL of acidic trypsin buffer (1 µg/µL), and a 200-µL working solution (100 ng/µL) was created by the addition of Ambic buffer (50 mM, pH 8.4). Trypsin was added to the sample at a protease:protein ratio of 1:100 to 1:20 (w/w) and incubated at 37 °C for 14-16 h. A drop of 5% (v/v) acetic acid terminated the reaction and the products were purified by reverse-phase chromatography on a Classic C18 cartridge using the propan-1-ol/5% (v/v) acetic acid system.

Hydrofluoric acid hydrolysis

Glycans and glycopeptides were lyophilized in Lo-bind Eppendorfs and resuspended in 50 µL of a 48% (v/v) HF (Aldrich, Gillingham, UK) solution. The tube was sealed and the reaction was carried out at 4 °C for 20 h, before being dried under a gentle stream of nitrogen.

WFA lectin chromatography

Lyophilized glycopeptides were resuspended in 0.5 mL of degassed WFA buffer (10 mM phosphate-buffered saline, pH 7.4). After the 2-mL lectin column was pre-equilibrated with 30 mL of degassed WFA buffer, the sample was loaded and washed with 5 column volumes of WFA buffer. The bound glycopeptides were then eluted with 3-4 volumes of elution buffer (0.1 M lactose in WFA buffer).
The wash and eluted bound fractions were collected in 15-mL Falcon tubes. The amount of material coming off the lectin column was monitored using a UV detector to follow the chromatography at 214 or 280 nm. The column was regenerated with NaCl (1.4 M) between samples.
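The trypsin quantities in the tryptic digestion step can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, it assumes that all 20 µg of reconstituted trypsin ends up in the 200-µL working solution, and the 50 µg protein amount in the example is an arbitrary value, not one reported in the methods.

```python
def concentration_ng_per_ul(mass_ug, volume_ul):
    """Concentration in ng/µL of mass_ug micrograms dissolved in volume_ul microlitres."""
    return mass_ug * 1000.0 / volume_ul


def trypsin_volume_ul(protein_ug, protein_per_protease, working_ng_per_ul):
    """Volume of working solution (µL) that gives a 1:protein_per_protease (w/w) ratio."""
    protease_ng = protein_ug / protein_per_protease * 1000.0
    return protease_ng / working_ng_per_ul


if __name__ == "__main__":
    stock = concentration_ng_per_ul(20, 20)     # 20 µg reconstituted in 20 µL
    working = concentration_ng_per_ul(20, 200)  # diluted to a 200-µL working solution
    print(stock, working)
    # Hypothetical 50 µg protein sample at the two ends of the stated ratio range.
    for ratio in (100, 20):
        print(ratio, trypsin_volume_ul(50, ratio, working))
```

Under these assumptions the reconstituted stock is 1 µg/µL and the working solution is 100 ng/µL, so a 50 µg sample would need between 5 and 25 µL of working solution to span the 1:100 to 1:20 range.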