HIGH RESOLUTION MASS SPECTROMETRY (HRMS) IN DISCOVERY PROTEOMICS
A clinical proteomics perspective

Michael L. Merchant, PhD
School of Medicine, University of Louisville
Louisville, KY

Learning Objectives
After this presentation, you should be able to:
1. Provide a working description of the relationship between resolution and mass accuracy as it relates to protein identification.
2. Provide an overview of the LCMS-based proteomic workflow.
3. Describe factors mitigating the discovery process in clinical proteomics.
4. Provide an example of the application of high resolution mass spectrometry (HRMS) to biomarker discovery and protein post-translational modification.

Overview of topics to be discussed
  Introduction
  Overview of the LCMS workflow
    Sample handling and preparation
    Data dependent acquisition (DDA) workflow versus the data independent acquisition (DIA) workflow
    Informatics
  Current applications
    Biomarker discovery
    Quantitative phosphoproteomics
    Top down mass spectrometry
  Conclusions

Acknowledgements
  University of Louisville: Jon Klein, MD, PhD; Ken McLeish, MD; Michael Brier, PhD; Jian Cai, PhD; James Hribar, PhD; Danny Wilkey, BA; Ming Li, BS
  The Ohio State Univ.: Brad Rovin, MD
  University of Pennsylvania: Harv Feldman, MD; Peter Yang, PhD
  University of Washington: Jonathan Himmelfarb, MD

Disclosures
  NIH (NIDDK): R01 DK091584, U01 DK085673
  Owner/Partner: Pharos Medicine, LLC.
Introduction: terms and definitions
  Resolution
    Low (1000)
    Medium (10,000 to 20,000)
    High (50,000)
  Mass Accuracy
    Low (1.0 to 0.1 Da)
    Medium (0.1 to 0.01 Da)
    High (0.01 to 0.001 Da)
  Tandem Mass Spectrometry
    MS2
  Precision Proteomics
    Application of high resolution methods at the MS1 and MS2 (or MSn) level of analysis
  Proteospecific
    Amino acid sequence that is specific to the species being studied
  Data dependent analysis
    MS2 fragmentation targeting based on MS1 information
  Data independent analysis
    MS2 fragmentation data collected systematically and independent of MS1 information
  Qualitative analysis
    Index of proteins present in the sample
  Quantitative analysis
    Associating relative or absolute values of abundance with the qualitative analysis
  Mann M, and Kelleher NL. PNAS 2008;105:18132-18138
  Egertson JD et al. Nature Methods 2013 10(8) 744-2528

Proteomic workflow: Traditional Bottom-up Approach
  [Workflow figure: Protein -> (protease) -> Peptides -> Mass Spectrometer -> In silico peptide assignment -> Protein assignment]
  Bottlenecks
    Bench work (Variability)
      Amount of isolated protein
      Sample complexity
      Protein separation methods
      Protease digestion
      Peptide separation methods
    Mass Spectrometer
      Sensitivity, mass accuracy, and scan speed of mass spectrometer
    Informatics (Statistics)
      Prior knowledge of PTM
      Protein database
      False positive assignments

Impact of mass spectrometer in contemporary proteomics
  [Figure: Precision Proteomics. Mann M, and Kelleher NL. PNAS 2008;105:18132-18138. Copyright 2008 by National Academy of Sciences]
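As a rough illustration of the resolution tiers listed in the terms and definitions, the sketch below converts a resolving power into the peak width it implies at a given m/z, and a ppm tolerance into a matching window. It assumes the common definition R = m/Δm with Δm taken as the full width at half maximum; the slides do not state which definition they use, and the function names are illustrative only.

```python
"""Peak width and tolerance window implied by resolving power and ppm accuracy."""

# Resolution values as listed on the definitions slide.
# "medium" uses the lower end of the listed 10,000 to 20,000 range.
TIERS = {"low": 1_000, "medium": 10_000, "high": 50_000}


def fwhm_at(mz: float, resolving_power: float) -> float:
    """Peak width (FWHM, in m/z units), assuming R = m / delta_m."""
    if resolving_power <= 0:
        raise ValueError("resolving power must be positive")
    return mz / resolving_power


def ppm_window(mz: float, ppm: float) -> tuple[float, float]:
    """Lower and upper m/z bounds of a +/- ppm tolerance around mz."""
    half_width = mz * ppm / 1e6
    return mz - half_width, mz + half_width


if __name__ == "__main__":
    mz = 1000.0
    for name, r in TIERS.items():
        print(f"{name:>6}: R={r:>6}  FWHM at m/z {mz:.0f} = {fwhm_at(mz, r):.3f}")
    low, high = ppm_window(mz, 5.0)
    print(f"+/-5 ppm window at m/z {mz:.0f}: {low:.4f} to {high:.4f}")
```

Under this definition, a peak at m/z 1000 is about 1 m/z unit wide at R = 1,000 and about 0.02 wide at R = 50,000, which is why closely spaced peptide features only become distinguishable on the high resolution tier.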
Factors mitigating discovery progress
  Defining the question that proteomics will answer
  Research question
  Sample handling
    Isolation of protein-containing structure
    Protein extraction
    Protein separation
    Protein digestion
  Mass spectrometer
    Mass spectrometry analysis
    Quantification
  Informatics
    Identification
    Bioinformatics/Interpretation

The research question
  Expression and quantitative proteomics
    Define and quantify the protein components
  Functional proteomics
    Define the interactions among proteins, yielding knowledge about Protein Interaction Networks (PIN)
    Define the mechanisms by which proteins communicate with each other, yielding knowledge about Protein Signaling Networks (PSN)
  [Figure labels: PIN, PSN, Expression]

The research design
  Minimizing unnecessary variance and systematic errors
  Power the study correctly
  Correcting for multiple hypothesis testing
  Replicates: a) Biological, b) Technical

Detergents: a) SDS or NP-40, b) ProteaseMAX
Chaotropes: a) Urea or b) Guanidine HCl
Buffers: a) ammonium bicarbonates/acetates or b) TRIS, c) Trizma, d) HEPES, e) MES
Inhibitors: a) Protease, b) Phosphatase, c) Deacetylase, or d) Bacteriostat
Proteases: a) Trypsin, b) Arg-C, c) Lys-C, or d) Asp-N

Addressing complexity before mass measurement
  Protein Expression System
  Lysis
  Subcellular Fractionation
  Electrophoresis
    Denaturing
    Native
    1-DE
    2-DE
  Chromatography
    Reversed Phase
    Ion Exchange
    HILIC
    Affinity: Antibody, Lectin, Metal Chelate
    SEC
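The research design points above call for correcting for multiple hypothesis testing but do not name a method. One widely used option is the Benjamini-Hochberg step-up procedure for controlling the false discovery rate; the sketch below is an assumption about what such a correction could look like, not the presenter's stated approach, and the function name is illustrative.

```python
"""Benjamini-Hochberg step-up procedure (one possible multiple-testing correction)."""


def benjamini_hochberg(pvalues: list[float], alpha: float = 0.05) -> list[bool]:
    """Return a reject/keep flag per p-value, in the original input order."""
    m = len(pvalues)
    if m == 0:
        return []
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= alpha * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


if __name__ == "__main__":
    # Hypothetical per-protein p-values, for illustration only.
    p = [0.041, 0.001, 0.205, 0.008, 0.039, 0.06, 0.074, 0.042, 0.212, 0.216]
    print(benjamini_hochberg(p, alpha=0.05))
```

In a discovery experiment that tests hundreds of proteins at once, a per-test threshold of 0.05 would pass many false positives; a procedure of this kind ties the threshold to the number of tests performed.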
Overview of LCMS workflow
  Acquisition of parent peptide mass-to-charge (m/z) and fragment m/z values
  Algorithmic matching of observed to predicted fragment masses
  [Figure: sample -> nanoLC column -> nanospray source -> LTQ Orbitrap ELITE; MS1* and MS2 spectra (intensity vs. m/z) with b1 to b3, y1 to y3, c1 to c3, and z1 to z3 fragment ions; curated database]

Advantages of HRMS in proteomic discovery research
  Qualitative/quantitative methods using HRMS data dependent analysis (DDA)
  MS1 based quantification
    Feature analysis
      Differential LCMS (dLCMS)
    Enzymatic labeling
      Oxygen-18 (18O)
    Chemical labeling
      Reductive methylation
      ICAT
      Chemical synthesis with stable isotope labeled amino acids
    Metabolic labeling
      SILAC (13C, 15N labeled amino acids)
      13C, 15N labeled amino acids growth media
  MS2 based quantification
    Label free
      Spectral counting
    Isobaric chemical tags
      Reporter ion quantification
      iTRAQ
      TMT
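The "algorithmic matching of observed to predicted fragment masses" step in the LCMS workflow can be sketched as follows. This is a minimal illustration, not the scoring used by any particular search engine: it predicts singly charged b and y ions only (the c and z ions shown in the figure are not covered), ignores modifications, and uses standard monoisotopic residue masses. The function names and the observed peak list in the usage section are hypothetical.

```python
"""Predict singly charged b and y fragment ions and match them to observed peaks."""

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276   # mass of a proton


def by_ions(peptide: str) -> dict[str, float]:
    """Return m/z values of singly charged b and y ions keyed by label (b1, y1, ...)."""
    masses = [RESIDUE[aa] for aa in peptide]
    ions: dict[str, float] = {}
    running = 0.0
    for i, m in enumerate(masses[:-1], start=1):   # N-terminal fragments
        running += m
        ions[f"b{i}"] = running + PROTON
    running = WATER
    for i, m in enumerate(reversed(masses[1:]), start=1):   # C-terminal fragments
        running += m
        ions[f"y{i}"] = running + PROTON
    return ions


def match_peaks(observed: list[float], predicted: dict[str, float],
                tol_da: float) -> list[tuple[str, float, float]]:
    """Return (label, predicted m/z, observed m/z) for each pair within tol_da."""
    hits = []
    for label, mz in predicted.items():
        for peak in observed:
            if abs(peak - mz) <= tol_da:
                hits.append((label, mz, peak))
    return hits


if __name__ == "__main__":
    predicted = by_ions("PGGLLLGDVAPNFEANTTVGR")
    observed = [175.119, 500.0]   # hypothetical peak list
    for label, pred, obs in match_peaks(observed, predicted, tol_da=0.01):
        print(f"{label}: predicted {pred:.4f}, observed {obs:.4f}")
```

A tighter fragment tolerance, which high resolution MS2 data permits, reduces the number of chance matches for each predicted ion.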
Caveats to increased sensitivity
  Urine proteomic study: acute kidney injury following cardiac surgery
  Elution of identical peptide in two patient samples
  [Figure: MS1 spectra with features at 636.6425 m/z (z=+3) and 636.6413 m/z (z=+3)]
  Single MS spectra can contain >100 peptide features
  High complexity
    1) analysis of only 16 of 100 peptides is common
    2) repeat analysis may miss 5 of the 16
    3) many will have closely isobaric species

Increased sensitivity can contribute to co-isolation of closely isobaric peptides
  [Figure: features at 636.6425 m/z (z=+3) and 636.6413 m/z (z=+3)]
  Sequence: VFNNIGADLLTGSESENK, Charge: +3, Monoisotopic m/z: 636.64252 Da, MH+: 1907.91301 Da, RT: 49.54 min
  Identified with: Mascot (v1.27); IonScore: 48, Exp Value: 1.5E-003
  Ions matched by search engine: 11/102
  Fragment match tolerance used for search: 1.2 Da
  Fragments used for search: c; y; z; z+2
  SCY1-like protein 2 [SCYL2_HUMAN]
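The numbers on this page can be checked with a little arithmetic. The sketch below converts the reported triply charged m/z to its singly protonated mass (MH+) and estimates how close the two co-eluting features at 636.6425 and 636.6413 m/z are. The resolving power estimate uses the simple m/Δm criterion, which is an assumption; whether two peaks are actually separated also depends on peak shape and relative abundance. Function names are illustrative.

```python
"""Relate m/z, charge, and MH+, and estimate the spacing of two nearby features."""

PROTON = 1.007276  # mass of a proton in Da


def mz_to_mh(mz: float, charge: int) -> float:
    """Singly protonated mass (MH+) from an observed m/z and charge state."""
    return mz * charge - (charge - 1) * PROTON


def ppm_between(mz_a: float, mz_b: float) -> float:
    """Absolute spacing of two m/z values in ppm, relative to their mean."""
    mean = (mz_a + mz_b) / 2
    return abs(mz_a - mz_b) / mean * 1e6


def resolving_power_to_separate(mz_a: float, mz_b: float) -> float:
    """Resolving power m / delta_m at which the peak width equals the spacing."""
    mean = (mz_a + mz_b) / 2
    return mean / abs(mz_a - mz_b)


if __name__ == "__main__":
    print(f"MH+ = {mz_to_mh(636.64252, 3):.5f} Da")
    print(f"spacing = {ppm_between(636.6425, 636.6413):.2f} ppm")
    print(f"R needed ~ {resolving_power_to_separate(636.6425, 636.6413):,.0f}")
```

The two features differ by only 1.2 mmu, so they fall inside any practical precursor isolation window and would be fragmented together, which is the co-isolation caveat the slide describes.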
HRMS in clinical proteomic studies
  Label free identification and quantification of biomarkers of human disease (urine proteomics)
    Addressing over-abundant proteins
  Post-translational modifications
    Cell signaling and the phosphoproteome
  Direct analysis of whole proteins via top down MS
    Histones and epigenetics

URINE PROTEOMICS AND LUPUS NEPHRITIS
Efforts to support diagnosis without renal biopsies

Renal disease, urine protein, and depletion
  [Figure: Patient No. 10, 15, 34, 49, 71, 81, 105]
  ~80% of urinary MS/MS data is from abundant serum proteins
All Proteins; No Decoy IDs; Decoy IDs

  Analysis          # Proteins   # Identified Spectra   # Spectra   % Identified
  LTQ               266          773                    94157       0.94
  Orbitrap Elite    620          35543                  145551      24

  Scaffold IDs: 99% Protein Probability, 95% Peptide Probability with at least two high confidence peptides

TIC for 0.15 M Salt Step of SCX Fractionation
  [Figure: TIC traces for LTQ and LTQ Orbitrap; intensity scales E+7 and E+9]

Scan #19133
  [Figure: MS scan #19133, 1049.5465 m/z; zoom from 990 to 1070 m/z]
MS2 Spectrum for 1049.5465 m/z
  [Figure: MS2 spectrum; peaks marked Observed and Matched]
  Peroxiredoxin-6 [PRDX6_HUMAN]
  Monoisotopic m/z: 1049.54653 Da (3.54 mmu / 3.37 ppm)
  Mascot IonScore: 68
  Sequence: PGGLLLGDVAPNFEANTTVGR, Charge: +2

Phosphoproteome Studies
  Identification of low abundant (rare) protein post-translational modifications to gain information on molecular signaling events

Studying protein phosphorylation events using affinity enrichment: needle in a haystack
  Engholm-Keller, K and Larsen, MR. J. Proteomics 75 (2011) 317-328
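The precursor mass error reported for the Peroxiredoxin-6 peptide is given in two units, 3.54 mmu and 3.37 ppm. The short sketch below shows how the two relate at the observed m/z of 1049.54653. Only magnitudes are used, since the sign of the error is not recoverable from the slide text; the function names are illustrative.

```python
"""Convert a mass error between milli-mass units (mmu) and parts per million (ppm)."""


def mmu_to_ppm(error_mmu: float, mz: float) -> float:
    """Mass error in ppm for an error given in mmu at a particular m/z."""
    return error_mmu * 1e-3 / mz * 1e6


def ppm_to_mmu(error_ppm: float, mz: float) -> float:
    """Mass error in mmu for an error given in ppm at a particular m/z."""
    return error_ppm * 1e-6 * mz * 1e3


if __name__ == "__main__":
    mz = 1049.54653
    print(f"3.54 mmu at m/z {mz} = {mmu_to_ppm(3.54, mz):.2f} ppm")
    print(f"3.37 ppm at m/z {mz} = {ppm_to_mmu(3.37, mz):.2f} mmu")
```

Because ppm is relative to m/z, the same 3.54 mmu error would be a larger ppm error for a lighter ion, which is why search tolerances for high resolution data are usually quoted in ppm.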
Reductive methylation: an inexpensive and effective quantification strategy
  Wilson-Grady, JT, Haas, W, and Gygi, SP. Methods 61 (2013) 277-286

Integrated workflow for pS/pT/pY detection and quantification
  Wilson-Grady, JT, Haas, W, and Gygi, SP. Methods 61 (2013) 277-286

Informatics analysis of phosphoproteome data
  Wilson-Grady, JT, Haas, W, and Gygi, SP. Methods 61 (2013) 277-286
HRMS AND TOP DOWN PROTEOMICS
Analysis of protein post-translational modifications from the whole protein level.

FTMS selected ion monitoring (932 to 942 m/z) for CAII, z=+31
  936.721  CAII - H2O
  937.302  CAII
  937.979  CAII + Na+
  938.655  CAII + 2Na+
  [Figure: MS2 Fragmentation Spectrum]

Analysis of whole protein modification
  [Workflow figure labels: Isolated cells; Extract histones; Isolate/enrich histones (HPLC, Abs 280 nm vs. time in min; 12% SDS-PAGE); Histones; Modified histones; Arg-C digestion; Modified and unmodified histone peptide fragments; Top Down; Bottom Up; LCMS Analysis]
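In the selected ion monitoring window for CAII, peak spacings in m/z can be turned back into mass differences on the intact protein by multiplying by the charge state. The sketch below does this for the 936.721 and 937.302 peaks, assuming both carry the z = +31 charge given on the slide and that 936.721 is the peak labeled as the water-loss species. The function name is illustrative.

```python
"""Convert an m/z spacing between two peaks of the same charge state into a mass shift."""

WATER = 18.010565  # monoisotopic mass of H2O in Da


def mass_shift(mz_peak: float, mz_reference: float, charge: int) -> float:
    """Mass difference (Da) implied by two m/z values sharing one charge state."""
    return (mz_peak - mz_reference) * charge


if __name__ == "__main__":
    shift = mass_shift(936.721, 937.302, charge=31)
    print(f"shift = {shift:.3f} Da (water loss would be {-WATER:.3f} Da)")
```

At z = +31 a modification of about 18 Da moves the peak by only about 0.58 m/z, so resolving such species on an intact protein depends on the narrow peak widths of a high resolution instrument.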
[Figure labels: mrp HPLC; PTM states; Charge states]

Conclusions
  The sensitivity and analytic power of current HRMS platforms are pushing the boundaries of discovery work.
  Given the increased sensitivity of state-of-the-art mass spectrometers, sample (mis-)handling can significantly impact the quality of the data regardless of the proteomic end goal.
  With regard to clinical proteomics experiments, maximizing reproducible sample handling is vital to the success of all clinical proteomic projects.
  LCMS-based proteomics methods are important tools that can be used to aid in the study of human health and disease.
  Efficient utilization of proteomic resources is often best achieved through collaboration.

Self Assessment Questions
1. Which of the following is not a source of variability in proteomic discovery research?
   A) sample complexity
   B) peptide separation methods
   C) sensitivity, mass accuracy, and scan speed of the mass spectrometer
   D) amount of isolated protein
   E) none of the above
2. Which of the following is not a statistical bottleneck in proteomic discovery research?
   A) prior knowledge of PTM
   B) protein database
   C) false positive assignments
   D) peptide separation methods
   E) none of the above
3. Advantages of high resolution mass spectrometers over older low resolution mass spectrometers include:
   A) detection of lower abundant peptides
   B) greater coverage of peptide fragmentation data
   C) better detection of protein post-translational modifications
   D) characterization of whole proteins and post-translational patterns
   E) all of the above