Protein Bioinformatics (260.655)
Lecture 9: Quantitative Proteomics
Tuesday, April 27, 2010

Robert N. Cole, Ph.D.
Mass Spectrometry and Proteomics Facility
Johns Hopkins School of Medicine
371 Broadway Research Bldg
733 N. Broadway St.
Baltimore, MD 21205
Ph: (410) 614-6968
email: rcole@jhmi.edu

Topics
  - Protein identification by mass spectrometry
  - Relative quantification of proteins
  - Applications

How a Mass Spectrometer Measures Mass
  Three stages: Ionization, Ion Sorting, Detection
  - Inlet: solid, liquid, or vapor
  - Ion Source: form ions
  - Mass Analyzer: sort by mass-to-charge ratio (m/z)
  - Ion Detector: counts
  - Spectrum: counts versus m/z
Identifying Proteins by Mass Spectrometry: Three Methods
  Bottom Up: proteins identified from a few peptides
  - Know cleavage site, peptide mass and sequence
  - Do not know the protein's actual size and modifications
  - Problems: sequence homology, seeing all peptides
  Top Down: proteins identified from the intact protein
  - Know the protein's size and whether it is modified
  - Sequence usually from ends of protein
  - Problems: size limitations, internal sequences
  Middle Down: proteins identified from large sections of protein

Identifying Proteins from Peptide Sequence
(Many proteins in solution, gel bands or spots)
  1. HPLC
  2. Nanospray Ionization
  3. Full MS Scan (Survey Scan)
  4. Parent Mass Selection
  5. Parent Mass Fragmentation (collision induced)
  6. Fragment Mass Scan (MS/MS or Tandem MS Scan)
  [Figure: survey and MS/MS spectra, intensity versus m/z]

Tandem MS: Sequencing Peptides by Fragmentation
  Peptide:        +H3N-CH(R1)-CO-NH-CH(R2)-CO-NH-CH(R3)-COOH
  Cleavage at the first peptide bond gives:
  b-ion (b1):     +H3N-CH(R1)-CO
  y-ion (y2):     H+ NH-CH(R2)-CO-NH-CH(R3)-COOH
Survey (Full MS or MS1) Scan of Peptides
  [Figure: intensity versus mass (m/z); labeled peaks include m/z 584.792, the ion fragmented in the next scan]

MS/MS (MS2) Scan of Peptide Fragments showing Amino Acid Sequence
  Fragmentation of 584.79 Ion
  [Figure: intensity versus mass (m/z), with the y-ion series annotated]

  Residue:  T     G     P     N     L     H     G     L     F     G     R
  b-ions:   102   159   256   370   483   620   677   790   937   994   1150
  y-ions:   1168  1067  1010  913   799   686   549   492   379   232   175

  Labeled peaks: y1 175.1, y2 232.1, y3 379.1, y4 492.2, y5 549.2, y6 686.3, y7 799.4, y8 913.4, y9 1010.4, y10 1067.5
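The b- and y-ion ladders for the peptide TGPNLHGLFGR can be reproduced from residue masses. The following is a minimal Python sketch, not the facility's software: the monoisotopic residue masses, the function names, and the choice of singly charged fragments are assumptions made for illustration.

```python
# Monoisotopic residue masses (Da); only the residues in the example peptide.
RESIDUE_MASS = {
    "G": 57.02146, "P": 97.05276, "T": 101.04768, "L": 113.08406,
    "N": 114.04293, "H": 137.05891, "F": 147.06841, "R": 156.10111,
}
WATER = 18.01056   # added to y-ions and to the intact peptide
PROTON = 1.00728   # charge carrier


def b_ions(seq):
    """Singly charged b-ions: N-terminal residue sums plus one proton."""
    total, ions = 0.0, []
    for aa in seq:
        total += RESIDUE_MASS[aa]
        ions.append(total + PROTON)
    return ions


def y_ions(seq):
    """Singly charged y-ions: C-terminal residue sums plus water plus one proton."""
    total, ions = 0.0, []
    for aa in reversed(seq):
        total += RESIDUE_MASS[aa]
        ions.append(total + WATER + PROTON)
    return ions


def precursor_mz(seq, charge):
    """m/z of the intact peptide carrying `charge` protons."""
    neutral = sum(RESIDUE_MASS[aa] for aa in seq) + WATER
    return (neutral + charge * PROTON) / charge


peptide = "TGPNLHGLFGR"
print([round(m, 1) for m in b_ions(peptide)])
print([round(m, 1) for m in y_ions(peptide)])
print(round(precursor_mz(peptide, 2), 2))
```

The doubly charged precursor computed this way falls within about 0.03 m/z of the 584.79 ion selected in the survey scan, and y1 matches the 175.1 peak expected for a C-terminal arginine.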
Tandem MS (MS/MS) Uses Peptide Mass and Sequence Tag
  - Need only one peptide mass with >3 amino acid masses in sequence (at least two preferred)
  - High sample complexity tolerated
  - Protein modifications identified and mapped to an amino acid
  - High mass accuracy nice, but not required

Search Engines for Protein Identification from MS Data
  Summary of Programs
    Proteome Software    www.proteomesoftware.com/
    ExPASy               expasy.proteome.org.au
  Free Programs
    ProteinProspector    prospector.ucsf.edu
    XProteo              xproteo.com:2698
    Prowl                prowl.rockefeller.edu
    Mascot               www.matrixscience.com (Free < 3 ions)
  Open Source Programs
    OMSSA                pubchem.ncbi.nlm.nih.gov/omssa/
    X! Tandem            www.thegpm.org/
  Commercial Programs
    Mascot               www.matrixscience.com
    Sequest              fields.scripps.edu/sequest/
    Spectrum Mill        www.chem.agilent.com/
    Proteolynx           www.waters.com/watersdivision/

Quantitative Proteomics
Quantifying Individual Proteins in Complex Mixtures (Functional Proteomics)
  Quantifying a proteome reveals:
  - Changes in protein levels (biomarkers, pathways, binding partners)
  - Changes in protein modifications (structure/function)
  - Changes in subcellular localization (trafficking)
  - Kinetics (protein turnover, modification dynamics)
Quantitative Proteomics Methods

  Approach                Methods (Gel or MS based)
  Non-Labeling            Gel matching, Densitometry, Spectral Counting, Peak Intensity,
                          Multiple Reaction Monitoring (MRM)
  Labeling
    Chemical              Difference Gel Electrophoresis (DIGE)
                          Isobaric Tags for Relative and Absolute Quantitation (iTRAQ)
                          Isotope-Coded Affinity Tags (ICAT)
    Enzymatic             18O-Labeling
    Metabolic             Radiolabeling
                          Stable Isotope Labeling of Amino Acids in Cell Culture (SILAC)
    Spiking               Absolute Quantification (AQUA)
                          Multiple Reaction Monitoring (MRM)
                          Quantification Concatamers (QconCAT)

Best Approach?
  There is NO one best approach. All approaches are:
  - Complementary
      different separation techniques
      different sets of proteins identified
  - Technically challenging
      sample preparation
      data acquisition
      data analysis
  - Require fractionation to dig deeper into the proteome
      limited by dynamic range

Best Quantitative Proteomic Experiments
  - Defined question (i.e. hypothesis driven)
  - Defined system
  - Independently measurable phenotype
  - Sample preparation
      Reproducible
      Scalable
      Compatible buffer system
Typical Quantitative Proteomic Experiment
  1. Sample Preparation (cell, tissue, fluid) (controls, experimentals)
  2. Fractionation? (enrich for target proteins)
  3. Labeling? (multiplex samples)
  4. Resolution on a common platform (proteins or peptides) (gel or column)
  5. Protein Quantification (protein or peptide level)
  6. Protein Identification (e.g. Tandem MS)
  7. Bioinformatics (statistics, ontology, pathways)
  8. Validation (e.g. western blot, MRM, etc.)

Sample Preparation (Most Important Step!)
  - Reproducibility
  - Protein Amount and Complexity
  - Buffer Composition Compatibility

Sample Preparation Must Be Reproducible and Standardized!
  [Figure: biological replicates 1, 2, 3 with the initial protocol versus the standardized protocol]
  Reduce sources of variability:
  - Technical: good < bad hands; few < many steps
  - Biological: cells < tissue < body fluids; yeast < nematode < human
Protein Amount and Complexity
  Detect Low Abundance Proteins by Fractionating
  [Figure: 4-12% SDS-PAGE, SimplyBlue stain; lanes: MW standards (kDa), BSA, top 6 proteins removed, total serum]
  [Figure: 1D versus 2D gel separation, pI 3.0 to 11.0]

Quantitative Proteomics Methods
  Non-Labeling Methods
  - No additions to analysis
  - Separate analysis of each sample
  - Biological variability
  - Technical variability (sample prep and MS analysis)
Non-Labeling: Densitometry, 1D Gel
  [Figure: 1D gel, Sample 1 and Sample 2, with numbered gel slices]
  Proteins named in gel slices 18, 21 and 22:
    Polymeric immunoglobulin receptor
    Transferrin
    Vanin 1
    α1B-glycoprotein
    Complement component 5
    hGC-1 (human G-CSF-stimulated clone-1)
    IgG Fc-binding protein
    Mac-2-binding protein
    α-1-antichymotrypsin
    Albumin
  - Comparing protein bands on the same gel
  - Quantify by densitometry
  - Often more than one protein per band
  Kristiansen et al. Molec Cell Proteomics 3:715-728, 2004.

No Labeling: Densitometry, 2D Gels
  [Figure: 2D gels (pI versus MW) of Bacterial Strain #1 and Bacterial Strain #2]
  - Compare spots from different gels
  - Quantify by relative spot volume
  - Gel reproducibility and spot matching critical
  - Gel warping for matching spots can warp out modifications
  - Often more than one protein per spot

No Labeling MS Methods: Spectral Counting
  - Each sample is a separate LCMS/MS experiment
  - Quantify each protein from:
      number of peptides from protein
      number of spectra for each peptide
  - Compares multiple LCMS/MS experiments
  - Reproducibility of LCMS/MS system critical
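Spectral counting can be illustrated with a short Python sketch. The protein names, peptide labels, and the pseudocount are hypothetical choices for illustration, not values or parameters from the lecture.

```python
from collections import Counter


def spectral_counts(psms):
    """Count identified MS/MS spectra per protein.

    `psms` is a list of (protein, peptide) pairs, one pair per identified spectrum.
    """
    return Counter(protein for protein, _peptide in psms)


def count_ratio(counts_a, counts_b, protein, pseudocount=1.0):
    """Ratio of spectral counts between two runs.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a protein is missing from one run.
    """
    return (counts_a[protein] + pseudocount) / (counts_b[protein] + pseudocount)


# Hypothetical identifications from two separate LCMS/MS runs.
run_a = [("ALBU", "pep1"), ("ALBU", "pep2"), ("ALBU", "pep1"), ("TRFE", "pep3")]
run_b = [("ALBU", "pep1"), ("TRFE", "pep3"), ("TRFE", "pep4"), ("TRFE", "pep3")]

counts_a = spectral_counts(run_a)
counts_b = spectral_counts(run_b)
for protein in sorted(set(counts_a) | set(counts_b)):
    print(protein, count_ratio(counts_a, counts_b, protein))
```

Because each sample is a separate run, the ratios are only as good as the run-to-run reproducibility of the LCMS/MS system; in practice counts are often also normalized to the total number of spectra per run.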
No Labeling MS Methods: Peak Intensity
  - Each sample is a separate LCMS/MS experiment
  - Quantify each protein from:
      intensity of the 3 most intense peptides
      averaged from 3 technical replicates
  - Compares multiple LCMS/MS experiments
  - Reproducibility of LCMS/MS system critical
  [Figure: 3D peak intensity map (m/z versus time) with annotated spots]

Quantitative Proteomics Methods
  Non-Labeling Methods
  - No additions to analysis
  - Separate analysis of each sample
  - Biological and technical variability
  Labeling Methods
  - Typically adds steps to analysis
  - Simultaneous analysis of many samples (multiplexing)
  - Biological variability remains; reduces technical variability
  - Cuts instrument time (data collection) by 50-75%

Difference Gel Electrophoresis (DIGE)
  1. Control protein extract: label with fluor 1
     Disease protein extract: label with fluor 2
  2. Mix labeled proteins from both samples
  3. Separate proteins on one 2-D gel
  4. Image gel at excitation wavelength 1 and excitation wavelength 2
  View protein differences in a single gel!
  Unlu et al. Electrophoresis 18:2071-2077, 1997
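For the label-free peak-intensity method (three most intense peptides, averaged over three technical replicates), a minimal Python sketch follows. The slide does not say whether averaging happens before or after picking the top three; this sketch assumes peptide intensities are averaged across replicates first. The peptide names and intensities are hypothetical.

```python
from statistics import mean


def top3_abundance(replicates):
    """Protein abundance from the three most intense peptides.

    `replicates` is a list of dicts (one per technical replicate) mapping
    peptide -> peak intensity for a single protein. A peptide missing from
    a replicate is treated as intensity 0 (an assumption of this sketch).
    """
    peptides = set().union(*replicates)
    averaged = {p: mean(r.get(p, 0.0) for r in replicates) for p in peptides}
    top_three = sorted(averaged.values(), reverse=True)[:3]
    return mean(top_three)


# Hypothetical peptide intensities for one protein in three technical replicates.
replicates = [
    {"pepA": 10, "pepB": 20, "pepC": 30, "pepD": 1},
    {"pepA": 12, "pepB": 22, "pepC": 32, "pepD": 1},
    {"pepA": 14, "pepB": 24, "pepC": 34, "pepD": 1},
]
print(top3_abundance(replicates))
```

Comparing this value for the same protein across samples gives a relative abundance; the weakest peptide (pepD here) does not influence the result.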
Novel Phosphorylation of Myofilament Protein Correlates with Myocardial Stunning
  Yuan et al. Proteomics 6:4176-4186, 2006

Three Fluorescent Cy Dyes
  - Cy2, Cy3, Cy5
  - Label the ε-amino group of lysine
  - Matched for charge and MW
  Third Cy dye used as an internal standard
  - All possible protein spots overlaid on every gel
  - Simplifies gel-to-gel matching
  - Each spot has its own internal standard spot for normalizing across gels
  - Reduces experimental variation
  - Accounts for differences in sample load

DIGE Analysis
  1. Pooled samples labeled with Cy2 (on all gels, to compare one gel to another)
     Control protein extract: label with Cy3
     Disease protein extract: label with Cy5
  2. Mix samples
  3. Separate proteins on one 2-D gel
  4. Image at excitation wavelength 1 and excitation wavelength 2
  - Quantify intact proteins
  - Cells, tissue, fluids
  - One additional step
  - Reduced technical variability
Relative MS Quantitative Methods

Stable Isotope Labeling by Amino Acids in Cell Culture (SILAC)
  1. Control cells grown with 12C-Arg; treated cells grown with 13C-Arg
  2. Mix cells
  3. Isolate proteins and digest with trypsin
  4. Peptide ID and quantify by tandem MS (quantify mass pairs)
  - Metabolic labeling
  - Only for cell culture
  - No added steps
  - No technical variability
  - Standard deviations 5%
  Ong et al. J Proteome Res 2:173-181, 2003

18O-labeling Protocol
  1. Control protein extract: trypsin
     Experimental protein extract: trypsin + 18O-water
  2. Mix labeled peptides
  3. Peptide ID and quantify by tandem MS
  - Enzymatic labeling of the C-terminus (adds 4 Da)
  - Cells, tissue, fluids
  - No added steps
  - No MS technical variability
  - Standard deviations 1-2%
  - Loss of label to water
  Yao et al. Anal. Chem 73:2836-2842, 2001
  Yao et al. J Proteome Res 2:147-152, 2003
iTRAQ Tags (Isobaric Tag for Relative and Absolute Quantitation)
  Isobaric tag (total mass = 305): Reporter (charged) + Balance (neutral) + Peptide Reactive Group (PRG, binds to amines)
  Eight possible tags:

  Reporter mass:  113  114  115  116  117  118  119  121
  Balance mass:   192  191  190  189  188  187  186  184

  Ross et al. Mol. Cell. Proteomics 3:1154-1169, 2004
  Pierce et al. Mol. Cell Proteomics 7:853-863, 2007

8-plex iTRAQ Workflow
  1. Samples: S1 S2 S3 S4 S5 S6 S7 S8
  2. Digestion: S1 S2 S3 S4 S5 S6 S7 S8
  3. Label peptides: S1 113, S2 114, S3 115, S4 116, S5 117, S6 118, S7 119, S8 121
  4. Mix labeled peptides
  5. Fractionate peptide mixture on SCX column
  6. Analyze SCX fractions by LC-MS/MS
  - Chemical labeling
  - Cells, tissue, fluids
  - Adds steps
  - No MS technical variability
  - Skip 120: Phe (F) immonium ion
  - Eight different tags: compare eight samples in one experiment

Quantifying Proteins using iTRAQ
  [Figure: MS/MS spectrum (intensity versus m/z) of an iTRAQ-labeled peptide with b- and y-ion series annotated]
  Eight reporter ions quantify the same peptide from eight samples in one spectrum
  [Figure: reporter-ion region, m/z 113 to 121 (120 skipped); peaks at 113.11, 114.11, 115.11, 116.11, 117.12, 118.12, 119.12, 121.12, labeled S1 to S4 in each half]
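The isobaric design and the reporter-ion readout can be checked with a few lines of Python. This is a sketch under stated assumptions: the nominal reporter and balance masses come from the tag table, while the intensities and the choice of 113 as the reference channel are hypothetical.

```python
# Nominal masses from the 8-plex iTRAQ tag table (120 is skipped).
REPORTER = [113, 114, 115, 116, 117, 118, 119, 121]
BALANCE = [192, 191, 190, 189, 188, 187, 186, 184]


def is_isobaric(reporters, balances):
    """True if every reporter + balance pair sums to the same mass."""
    return len({r + b for r, b in zip(reporters, balances)}) == 1


def reporter_ratios(intensities, reference=113):
    """Relative abundance of each channel versus the reference channel.

    `intensities` maps reporter mass -> reporter-ion intensity read from
    one MS/MS spectrum of one peptide.
    """
    ref = intensities[reference]
    return {label: value / ref for label, value in intensities.items()}


# Hypothetical reporter intensities for one peptide spectrum.
spectrum = {113: 200.0, 114: 210.0, 115: 400.0, 116: 100.0,
            117: 190.0, 118: 205.0, 119: 395.0, 121: 110.0}

print(is_isobaric(REPORTER, BALANCE), REPORTER[0] + BALANCE[0])
print(reporter_ratios(spectrum))
```

Because every tag has the same total mass, the same peptide from all eight samples co-elutes and is selected as one precursor; the samples separate only at the reporter ions after fragmentation.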
  [Figure: MS/MS spectrum (m/z, amu) with inset of the reporter-ion region showing peaks at m/z 114.11 and 115.11]

Changes in Protein Levels (biomarkers and pathways)

Phospholipid Growth Factor (S1P) on Pulmonary Endothelial Permeability
  [Figure: normalized resistance versus time (min) after S1P addition]
  - Resistance = Permeability
  - Lipid rafts?
  - Proteins involved?
  Guo Y et al (2007) Mol Cell Proteomics 6:689-696

iTRAQ Revealed Proteins More Abundant in S1P Stimulated Lipid Rafts
  MRP Peptide Fragmentation
  [Figure: MS/MS spectrum (intensity, counts) of MRP_Human peptide GDVTAEEAAGASPAK (3+) with b- and y-ions annotated; reporter ion at m/z 115.11]
  Western Blots Confirm iTRAQ
  [Figure: western blots, Control versus S1P, Sample 1 and Sample 2: A. Anti-MARCKS, B. Anti-p-MARCKS, C. Anti-MRP, D. Anti-Caveolin-1]
  MRP is up-regulated using both methods
  Guo Y et al (2007) Mol Cell Proteomics 6:689-696
MRP siRNA Attenuates S1P Stimulated Endothelial Barrier Enhancement
  [Figure: normalized resistance versus time (min); traces: S1P; S1P + scrambled MRP siRNA; S1P + MRP siRNA; MRP siRNA or scrambled MRP siRNA]
  Guo Y et al (2007) Mol Cell Proteomics 6:689-696

Can more than 8 samples be analyzed using 8-plex iTRAQ?
  Yes!
  Experimental Design
  - Pool of all samples: standard in all iTRAQ experiments
  - Repeat labeling of at least 1 sample in all iTRAQ experiments
  - Completely randomize labeling
  Expected Result
    (iTRAQ 1 Sample 1) / (iTRAQ 1 Pool) = (iTRAQ 2 Sample 1) / (iTRAQ 2 Pool)
    (iTRAQ 1 Sample 2) / (iTRAQ 1 Pool) = (iTRAQ 2 Sample 2) / (iTRAQ 2 Pool)
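The pooled-standard design can be sketched in Python: each experiment's channels are divided by that experiment's pool channel, so the ratios become comparable across experiments. The channel names and intensities below are hypothetical.

```python
def ratios_to_pool(experiment, pool_channel="Pool"):
    """Express every sample in one iTRAQ experiment relative to the pooled standard.

    `experiment` maps a channel name -> reporter intensity for one protein.
    """
    pool = experiment[pool_channel]
    return {name: value / pool
            for name, value in experiment.items() if name != pool_channel}


# Hypothetical intensities for one protein in two separate 8-plex experiments.
# "Sample 1" is labeled in both experiments as the repeated sample.
itraq_1 = {"Pool": 200.0, "Sample 1": 300.0, "Sample 2": 100.0}
itraq_2 = {"Pool": 100.0, "Sample 1": 150.0, "Sample 9": 80.0}

r1 = ratios_to_pool(itraq_1)
r2 = ratios_to_pool(itraq_2)
print(r1["Sample 1"], r2["Sample 1"])  # expected to agree if the design works
print(r1["Sample 2"], r2["Sample 9"])  # now on a common scale
```

The repeated sample acts as a check: if its ratio to the pool differs markedly between experiments, the two experiments should not be combined without further investigation.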
Ratios for All Proteins in Sample 1 or Sample 2 Relative to Pool are the Same in iTRAQ 1 and iTRAQ 2
  [Figure: scatter plot of iTRAQ #2 ratios versus iTRAQ #1 ratios]
  Sample 2 (B47):    y = .981x + .196,   R2 = .8611
  Sample 1 (A1249):  y = .8426x + .1586, R2 = .9571

Micronutrient Deficiencies and Health of the Undernourished: Child and Maternal Health Problems
  Nutritional deficiencies
  - Infant/child: poor growth, impaired development, disability, infection, chronic disease, childhood death
  - Mother: obstetric morbidity, infection/sepsis, anemia, death
  Micronutrient status: vitamin A, zinc, iron, iodine, folate, others
  Serum protein profiles?
  - 500 samples
  - 75 iTRAQ experiments
  Gates Foundation Grant, PI: Keith West, JHSPH
  Photo: K West

Changes in Protein Modifications (structure/function)
Using iTRAQ to Quantify Site Specific Auto-Phosphorylation of the EGF Receptor Kinase
  (Tyr 869 and Tyr 998 are marked on the sequence)

    1 MRPSGTAGAALLALLAALCPASRALEEKKVCQGTSNKLTQLGTFEDHFLS
   51 LQRMFNNCEVVLGNLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVERIP
  101 LENLQIIRGNMYYENSYALAVLSNYDANKTGLKELPMRNLQEILHGAVRF
  151 SNNPALCNVESIQWRDIVSSDFLSNMSMDFQNHLGSCQKCDPSCPNGSCW
  201 GAGEENCQKLTKIICAQQCSGRCRGKSPSDCCHNQCAAGCTGPRESDCLV
  251 CRKFRDEATCKDTCPPLMLYNPTTYQMDVNPEGKYSFGATCVKKCPRNYV
  301 VTDHGSCVRACGADSYEMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDSLS
  351 INATNIKHFKNCTSISGDLHILPVAFRGDSFTHTPPLDPQELDILKTVKE
  401 ITGFLLIQAWPENRTDLHAFENLEIIRGRTKQHGQFSLAVVSLNITSLGL
  451 RSLKEISDGDVIISGNKNLCYANTINWKKLFGTSGQKTKIISNRGENSCK
  501 ATGQVCHALCSPEGCWGPEPRDCVSCRNVSRGRECVDKCNLLEGEPREFV
  551 ENSECIQCHPECLPQAMNITCTGRGPDNCIQCAHYIDGPHCVKTCPAGVM
  601 GENNTLVWKYADAGHVCHLCHPNCTYGCTGPGLEGCPTNGPKIPSIATGM
  651 VGALLLLLVVALGIGLFMRRRHIVRKRTLRRLLQERELVEPLTPSGEAPN
  701 QALLRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEKVKIPVAIKELREA
  751 TSPKANKEILDEAYVMASVDNPHVCRLLGICLTSTVQLITQLMPFGCLLD
  801 YVREHKDNIGSQYLLNWCVQIAKGMNYLEDRRLVHRDLAARNVLVKTPQH
  851 VKITDFGLAKLLGAEEKEYHAEGGKVPIKWMALESILHRIYTHQSDVWSY
  901 GVTVWELMTFGSKPYDGIPASEISSILEKGERLPQPPICTIDVYMIMVKC
  951 WMIDADSRPKFRELIIEFSKMARDPQRYLVIQGDERMHLPSPTDSNFYR

  Qiu et al. 2009 Biochemistry 48:6624-6632

Experimental Design
  1. Expressed EGFR kinase
  2. Incubated +/- ATP and +/- kinase inhibitors
  3. Isolated EGFR kinase
  4. Resolved EGFR kinase by SDS-PAGE
  5. In-gel digest, iTRAQ label, SCX, LCMS/MS

  Condition:    -ATP   +ATP   +EGFR inhibitor   +SRC inhibitor
  iTRAQ label:  113    115    117               121

  [Figure: SDS-PAGE of EGFR kinase with molecular weight markers in kDa]
  Qiu et al. 2009 Biochemistry 48:6624-6632

EGFR Top Hit
  - >35 peptide IDs with at least 95% confidence
  - Same amount of EGFR in all gel bands
  (Ratios relative to the No ATP sample labeled with 113; all proteins are Homo sapiens (Human))

                                                                         +ATP     +EGFR kinase  +SRC kinase
                                                                                  inhibitor     inhibitor
  Rank  Score  %Cov  Name                                                115:113  117:113       121:113
  1     7      65    Epidermal growth factor receptor precursor          1.1      1.15          1.32
  2     9      49    Keratin, type II cytoskeletal 1                     .76      1.            1.2
  3     8      43    Exportin-2                                          .98      .89           1.31
  4     5      36    Exportin-4                                          .98      .98           1.25
  5     4      39    Exportin-1                                          .94      1.            1.29
  6     2      46    Exportin-T                                          1.7      1.19          1.56
  7     2      34    Keratin, type I cytoskeletal 1                      .7       .75           1.21
  8     2      71    Uncharacterized protein C14orf139                   1.15     .87           1.12

  Qiu et al. 2009 Biochemistry 48:6624-6632
Phosphorylation Site Ratios
  - Phosphotyrosines: increase with ATP or with an inhibitor to another kinase, but NOT with an inhibitor to EGFR kinase
  - Phosphothreonines: no change
  - Phosphoserines: no change

                                                                +ATP     +EGFR kinase  +Src kinase
                                                                alone    inhibitor     inhibitor
  Confidence  Sequence             Phosphate Modification       115:113  117:113       121:113
  99          ELVEPLTPSGEAPNQALLR  Phospho(T)@7                 1.14     1.1           1.52
  99          LLGAEEKEYHAEGGK      Phospho(Y)@9                 16.98    1.71          14.62
  99          LLGAEEKEYHAEGGKVPIK  Phospho(Y)@9                 5.18     1.27          3.99
  99          LLGAEEKEYHAEGGKVPIK  Phospho(Y)@9                 6.33     1.27          4.89
  99          MHLPSPTDSNFYR        Phospho(Y)@12                2.71     1.63          2.88
  99          MHLPSPTDSNFYR        Phospho(Y)@12                2.24     1.15          2.28
  99          MHLPSPTDSNFYR        Phospho(Y)@12                2.       .78           1.75
  99          MHLPSPTDSNFYR        Phospho(S)@5                 .77      .8            .98
  7           MHLPSPTDSNFYR        Phospho(S)@5                 1.1      1.21          1.74

  Qiu et al. 2009 Biochemistry 48:6624-6632

Kinetics (protein turnover, modification dynamics)

Which peptide is a better substrate? (Wade Gibson)
  Experimental Design:
  - Two time courses (one for each peptide)
  - Two iTRAQ experiments (one for each time course)
  - Substrate and products labeled with the same iTRAQ reagent at each time point
  - Different iTRAQ label for different time points

  Time (hr):    0    0.5  1    2    3    4.5  6    24
  iTRAQ label:  113  114  115  116  117  118  119  121

  - Mix all iTRAQ labeled substrates and proteins from all time points
  - Run one MS analysis (LCMS/MS) for each time course
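Because each time point carries its own reporter, one MS/MS spectrum of the substrate gives the whole time course. The Python sketch below converts reporter intensities into the fraction of substrate remaining; the label-to-time mapping follows the design table, while the intensities are hypothetical.

```python
# iTRAQ label -> time point in hours, from the experimental design.
TIME_BY_LABEL = {113: 0, 114: 0.5, 115: 1, 116: 2, 117: 3, 118: 4.5, 119: 6, 121: 24}


def fraction_remaining(reporter_intensity, start_label=113):
    """Return (time, fraction of the time-zero signal) pairs sorted by time.

    `reporter_intensity` maps iTRAQ label -> reporter intensity for the
    substrate peptide in a single MS/MS spectrum.
    """
    start = reporter_intensity[start_label]
    return sorted((TIME_BY_LABEL[label], value / start)
                  for label, value in reporter_intensity.items())


# Hypothetical reporter intensities for one substrate peptide.
substrate = {113: 1000.0, 114: 900.0, 115: 800.0, 116: 600.0,
             117: 500.0, 118: 350.0, 119: 250.0, 121: 50.0}

for hours, fraction in fraction_remaining(substrate):
    print(hours, fraction)
```

Running the same calculation on the second peptide's spectrum and comparing how quickly the fractions fall is one way to judge which peptide is the better substrate.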
  [Figure: reporter-ion intensity versus iTRAQ label (113 to 121) and time (0 to 24 hr); panels for Peptide 1 (Substrate 1, Product 1a, Product 1b) and Peptide 2 (Substrate 2, Product 2a, Product 2b)]

Software for Identifying and Quantifying Proteins
  Label Free
    Sieve                 www.thermo.com
    MSQuant               msquant.alwaysdata.net
  Labeling
    ProteinPilot          www.absciex.com
    Mascot                www.matrixscience.com
    Scaffold Q+           www.proteomesoftware.com
    Proteome Discoverer   www.thermo.com