Workshop on Analysis and prediction of contacts in proteins

1.09.09

Eran Eyal (1), Vladimir Potapov (2), Ronen Levy (3), Vladimir Sobolev (3) and Marvin Edelman (3)

(1) Sheba Medical Center, Ramat Gan, Israel; Departments of (2) Biochemistry and (3) Plant Sciences, Weizmann Institute of Science, Rehovot, Israel

I. Analysis of ligand-protein contacts using LPC and PDBsum
II. Analysis of inter-atomic contacts in proteins and protein-protein interfaces using CSU and CMA
III. Metal binding site prediction using CHED and SeqCHED
IV. Predicting side chain conformations using SCCOMP and analysis of equilibrium dynamics (for advanced users)

I. Analysis of ligand-protein contacts using LPC and PDBsum

During this session we will work with the following software for analysis of interactions in PDB structures:

LPC: Ligand-Protein Contacts (Sobolev et al., Bioinformatics, 1999, 15, 327-332)
PDBsum: a database of the known 3D structures of proteins and nucleic acids (Laskowski et al., Trends Biochem. Sci., 1997, 22, 488)

1. Go to http://ligin.weizmann.ac.il/space and click Servers --> LPC/CSU. On the first line choose LPC; on the second line choose JMOL; on the third line type 2fxe in the field "PDB ID code" (HIV-1 protease Crm complexed with the antiretroviral drug Atazanavir); on the fifth line click "run". You should get a list of all the ligands in the structure you chose.

2. Choose picture size "Large" and ligand No 3 (DR7, i.e. Atazanavir, a substituted canedioic acid dimethyl ester), then click RUN. The program gives several options to consider:

- Using the option CONTACTS GROUPED BY: Ligand atoms, find (in the table and in the picture) the most solvent-accessible ligand atoms in the protein-ligand complex ("Compl" column) and in the free ligand ("Uncompl" column). Note that the first atom, named CAA in the PDB file, has a large solvent-accessible surface in the free ligand but is largely buried in the complex. Atom CAO, on the other hand, has almost the same solvent accessibility in the free and the complexed ligand.

- Using the option CONTACTS GROUPED BY: Residues, find which residues of the binding pocket have the largest contacts with the ligand. Which of the four mutations M46I, V82F, I84V and L90M, and in which chain, directly influence ligand binding?

- Using the option CONTACTS SORTED BY: Contact types, find the largest distance for a putative hydrogen bond (denoted Hb) in the file. What could the distance of 5.3 Å between the OBJ atom of the ligand and the N atom of Gly48 in chain B signify in this case? Note that the OBJ atom also forms a strong H-bond to a water molecule (HOH 115B).

- Using the option CONTACTS SORTED BY: Residues, find which types of contacts Asp25, Ile47 and Val82 form. Note that the definition and number of the different contact types are given under the link "Contacts grouped by contact type".

- Using the option DERIVED DATA: Complementarity, find the probable results of atom-type substitution for atoms (OAI or CAA) in the ligand. In the resulting table you can see how ligand complementarity changes upon atom replacement. Red and green colours indicate a "considerable" decrease and increase in normalized complementarity, respectively.

3. Open http://www.ebi.ac.uk/thornton-srv/databases/pdbsum/

4. Enter PDB ID 2fxe and click Find.

5. On the left side, where it says DR7, click the link to "Ligands".

6. The resulting page lets you view the ligand molecule in detail and shows the contact distances.
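To get a feel for what a ligand-protein contact list contains, the following minimal sketch lists protein atoms lying within a distance cutoff of each ligand atom in a locally saved PDB file. It is not the LPC method: LPC also reports contact surfaces, contact types and complementarity, which this sketch does not attempt to reproduce. The function names and the 4.0 Å cutoff are assumptions made for illustration only.

```python
import math

def parse_atoms(pdb_text):
    """Parse ATOM/HETATM records from PDB-format text (fixed-column layout)."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms.append({
                "hetero": line.startswith("HETATM"),
                "name": line[12:16].strip(),      # atom name, e.g. CAA
                "resname": line[17:20].strip(),   # residue or ligand name, e.g. DR7
                "chain": line[21],
                "resseq": int(line[22:26]),
                "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            })
    return atoms

def ligand_contacts(atoms, ligand_resname, cutoff=4.0):
    """Return (ligand atom, residue label, protein atom, distance) tuples.

    The cutoff (in angstroms) is a hypothetical value chosen for this sketch.
    Results are sorted by increasing distance.
    """
    ligand = [a for a in atoms if a["hetero"] and a["resname"] == ligand_resname]
    protein = [a for a in atoms if not a["hetero"]]
    contacts = []
    for la in ligand:
        for pa in protein:
            d = math.dist(la["xyz"], pa["xyz"])
            if d <= cutoff:
                label = f'{pa["resname"]}{pa["resseq"]}{pa["chain"]}'
                contacts.append((la["name"], label, pa["name"], round(d, 2)))
    return sorted(contacts, key=lambda c: c[3])
```

With a structure saved to disk (the file name here is hypothetical), `ligand_contacts(parse_atoms(open("2fxe.pdb").read()), "DR7")` would give a list that can be compared by eye with the LPC table. Water molecules are HETATM records and are therefore excluded from the protein side of the search.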

II. Analysis of inter-atomic contacts in proteins and protein-protein interfaces using CSU and CMA

The goal of this exercise is to get acquainted with tools for analysis of inter-atomic contacts within proteins and in protein-protein interfaces. You will work with two servers: Contacts of Structural Units (CSU) and Contact Map Analysis (CMA). The CSU server (Sobolev et al., Bioinformatics, 1999, 15, 327-332) identifies all atom-atom contacts of any single residue in a particular PDB structure. The CMA server (Sobolev et al., NAR, 2005, 33, W39-W43) identifies all residue-residue contacts in a particular PDB structure, represents them graphically (as a contact map) and provides detailed information on atom contacts between any two chosen residues.

Server URLs:
CSU (with visualization): http://ligin.weizmann.ac.il/lpccsu
CSU (text version): http://bip.weizmann.ac.il/oca-bin/lpccsu/
CMA: http://ligin.weizmann.ac.il/cma

Contacts of Structural Units

1. Load the PDB file of HIV protease from http://bip.weizmann.ac.il/oca-bin/ocamain by typing 2FXD in the PDB ID field. Press Search and, on the resulting page, click the link "Save to disk".

2. Navigate to the CSU server (with visualization), choose the CSU option, load the structure that you saved locally and press Run. (NOTE: if your structure is deposited in the PDB, you can enter its PDB ID and run the analysis directly. In this exercise you load a locally saved file to get acquainted with that option.)

3. The server analyzes the content of the PDB file and displays a page where you choose the Chain ID, Residue No and Insertion code of the residue of interest. In this exercise you will analyze the contacts of the Phe residue at position 82 of chain A. Enter the necessary information (leave the Insertion code field blank) and start the analysis by pressing Submit.

4. The resulting page has three areas: Navigation menu, Visualization area and Contacts area. The Navigation menu is used to choose how atom-atom contacts are listed. (NOTE: You can set the size of the visualization area on the previous page by choosing the Large or Small option.)

5. Phe82 is a mutation that may affect the binding of the inhibitor. Does this mutated residue make contact with the drug (ligand DR7)? Choose the link CONTACTS GROUPED BY: Residue atoms and examine the contacts of the atoms in the aromatic ring of Phe82. (How many atoms are there in Phe82? How many contacts does each atom make? Is a particular atom buried or exposed in the protein structure?)

6. Choose the link CONTACTS GROUPED BY: Residues and examine which residues make contacts with Phe82. (How many residues are in contact with Phe82? What is the minimal distance between two contacting residues? What is the contact surface area between two residues? How many atom-atom contacts are there between two residues?)

7. Choose the link CONTACTS GROUPED BY: Contact types and examine the different types of contacts (hydrogen bonds, hydrophobic contacts, etc.) formed by Phe82. (How many hydrogen bonds are formed with Phe82? How many aromatic-aromatic contacts, etc.?)

8. The menu CONTACTS SORTED BY gives detailed information about all atom-atom contacts between Phe82 and neighboring residues. There are three different options for sorting the list of atoms (Residue atoms / Residues / Contact types). Try them.

9. You can choose a different residue or PDB file for analysis by clicking on the appropriate link in the GO TO menu. A short description of the CSU approach is given under the Help menu.

10. Analyze the H-bonds formed by HOH 115B in the PDB entry 2FXE from the first exercise.

11. A text version of the CSU server is also available (see URL above). It provides essentially the same information, but in a printer-friendly form.

Contact Map Analysis

1. Navigate to the CMA server. Choose the locally saved PDB file in the appropriate field of the web form. Enter A in the field Chain #1 and B in the field Chain #2, and run the analysis by pressing Submit.

2. On the resulting page (Contact Map screen), all contacts in the interface between chain A and chain B of the submitted structure are represented as a contact map.

3. Look at the left side of the contact map. You will see all residues in chain A that make contacts with residues in chain B. At the top of the contact map you will see all residues in chain B. (NOTE: you can find detailed information on all atom-atom and residue-residue contacts at the bottom of the page. Explore the links to "Full list of atom-atom contacts" and to "List of displayed residue-residue contacts".)

4. Each residue-residue contact in the interface between chains A and B is represented as a blue square. When you click on a blue square you get detailed information on all atom-atom contacts between the two selected residues. You can display any atom-atom contact in the visualization area by ticking the appropriate check-box.

5. Which contacts does Phe82 (chain A) form across the interface?

6. Links to a more detailed CSU analysis of the two residues are provided at the bottom of the screen.

7. The CMA server can analyze contacts not only between two chains but also within a chain. To do this, enter the same chain identifier in the fields Chain #1 and Chain #2: enter 2FXD in the PDB ID field, enter A in both the Chain #1 and Chain #2 fields and press the Submit button. You will get all contacts within chain A of HIV protease.

8. Note that you can change the size of the contact map by choosing a different Scale parameter on the main page of the CMA server. You can also display only those contacts whose contact area is above a chosen threshold.

9. You can compare the contact maps of the wild type (2FXE) and the mutant (2FXD) structures of HIV protease by opening them in two browser windows.
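The idea behind a residue-residue contact map can be sketched in a few lines. The example below is a simplification and not the CMA algorithm: it marks two residues as being in contact when any pair of their atoms lies within a distance cutoff, whereas CMA works with contact areas. The function name, the input tuple layout and the 4.5 Å cutoff are assumptions made for this illustration.

```python
import math
from collections import defaultdict

def residue_contact_map(atoms, chain1, chain2, cutoff=4.5):
    """Map (residue in chain1, residue in chain2) -> minimum inter-atomic distance.

    `atoms` is an iterable of (chain, resname, resseq, (x, y, z)) tuples.
    Only residue pairs whose closest atoms are within `cutoff` angstroms
    (a hypothetical threshold) are kept. Passing the same chain twice gives
    the symmetric intra-chain map.
    """
    by_res = defaultdict(list)
    for chain, resname, resseq, xyz in atoms:
        by_res[(chain, resname, resseq)].append(xyz)

    res1 = [r for r in by_res if r[0] == chain1]
    res2 = [r for r in by_res if r[0] == chain2]

    contacts = {}
    for r1 in res1:
        for r2 in res2:
            if r1 == r2:
                continue  # a residue is not in contact with itself
            dmin = min(math.dist(a, b) for a in by_res[r1] for b in by_res[r2])
            if dmin <= cutoff:
                contacts[(r1, r2)] = round(dmin, 2)
    return contacts
```

Each key of the returned dictionary corresponds to one filled square of the map. Running the function on two structures and comparing the key sets is a rough analogue of comparing two contact maps side by side in the browser.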

III. Metal binding site prediction using CHED and SeqCHED

Proteins often bind metal ions at the catalytic site or at regions important for structure. To predict these regions in a protein, two alternative procedures can be applied, depending on the input data available.

Structure-based Prediction (CHED) (Babor et al., Proteins, 2008, 70, 208-217)

1. Access the CHED website at http://ligin-temp.weizmann.ac.il/ched/. The approach uses a PDB protein structure as input. It should therefore be applied only when the structure of the protein has been solved.

2. Check whether the crystal structure of the human p53 core domain is predicted to bind a metal: scroll down to the box where you are asked to insert a PDB ID code and type "1UOL". In the box on the right add the chain ID A. Click Run.

3. The procedure offers two types of prediction with different degrees of confidence: Mild filtration and Stringent filtration. The mild filter eliminates some of the false predictions while keeping almost all correct predictions. In contrast, the stringent filter eliminates most false predictions at the expense of losing some true predictions. Click on the link "Stringent filtration". How many residues were predicted? Now click on the link "Mild filtration". How many residues were predicted? Among the residues predicted by the mild filter, which are more likely to bind metal?

4. In the Java applet you can view the protein structure. The predicted residues are colored green. For a better view of the predicted metal site, right-click on the applet, select zoom and increase it to 200%.

Sequence-based Prediction (SeqCHED) (Levy et al., Proteins, 2009, 76, 365-374)

1. Access the SeqCHED website at http://ligin-temp.weizmann.ac.il/seqched/. The approach uses a protein sequence as input. It can therefore be applied when the structure of the protein has not been solved.

2. Scroll down to the first part of the instructions, where it says "1) Click on button to select option". Here you choose whether to give a protein sequence as input for the analysis or to give a UniProt/Swiss-Prot sequence ID. Mark the option "Insert UniProt sequence ID" and type "Q4A7M9" in the associated box. The relevant sequence of mannose-6-phosphate isomerase will be downloaded automatically when you start the procedure.

3. The results will be sent to your e-mail address. Therefore, type an e-mail address where it says "2) Results will be sent to your email address". Click Submit.

4. The application will check your target sequence for homology to PDB templates. Open the mail sent to you and follow the associated link.

5. Scroll down to the table. How many templates did you get? The procedure will model in 3D the side chains of your sequence on the backbone of the template you select. Which template is likely to give a better prediction (compare the sequence identity, the structural resolution and the presence of metal in the template)?

6. Click on the template you prefer, where it says "prediction using ".

7. How many metal binding sites are predicted by the stringent filter? How many are predicted by the mild filter? Which of them is more likely to be a true one?
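As a rough illustration of what a structure-based search for metal sites can look like, the sketch below collects groups of three residues whose potential donor atoms are all close to one another. It assumes, as the server name suggests, that the candidate residue types are Cys, His, Glu and Asp. The donor-atom table, the 6.0 Å cutoff and the function name are assumptions made for this example; the filtration steps of the CHED server are not reproduced.

```python
import math
from itertools import combinations

# Side-chain atoms treated as potential metal donors (an assumption of this sketch).
DONOR_ATOMS = {
    "CYS": {"SG"},
    "HIS": {"ND1", "NE2"},
    "GLU": {"OE1", "OE2"},
    "ASP": {"OD1", "OD2"},
}

def candidate_triads(atoms, cutoff=6.0):
    """Return residue triads whose donor atoms are pairwise within `cutoff`.

    `atoms` is an iterable of (resname, resseq, atom_name, (x, y, z)) tuples
    from a single chain. The cutoff (angstroms) is a hypothetical value.
    """
    donors = {}
    for resname, resseq, atom_name, xyz in atoms:
        if atom_name in DONOR_ATOMS.get(resname, ()):
            donors.setdefault((resname, resseq), []).append(xyz)

    def close(r1, r2):
        # Two residues are "close" if any pair of their donor atoms is within cutoff.
        return any(math.dist(a, b) <= cutoff for a in donors[r1] for b in donors[r2])

    triads = []
    for trio in combinations(sorted(donors, key=lambda r: r[1]), 3):
        if all(close(a, b) for a, b in combinations(trio, 2)):
            triads.append(trio)
    return triads
```

A purely geometric search like this one would be expected to return many false positives, which is why the server offers the mild and stringent filters described above.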

IV. Predicting side chain conformations using SCCOMP and analysis of equilibrium dynamics (for advanced users)

Modeling of side chain conformations

1. Placing side chains on a fixed backbone is an important final step in building a complete structural model of a protein. It is also needed to model point mutations and to fill in structural information missing from experiments.

2. In this exercise we will get to know a tool for structural modeling of side chains, get a feel for the quality of its predictions and use it to model mutations.

3. During the exercise we will work with HIV-1 protease, a major target for HIV drugs. This protein cleaves the immature viral polypeptides and allows the formation of the mature proteins needed to complete the viral life cycle. HIV-1 protease is a homodimer of two identical subunits of 99 amino acids each. Most of the FDA-approved drugs against HIV are designed to block the enzymatic activity of this protein. We will work with X-ray crystallographic structures of HIV-1 protease solved with the inhibitor.

4. Download, from one of the PDB browsers such as OCA (http://bip.weizmann.ac.il/oca-bin/ocamain), two structures of HIV protease with the inhibitor Atazanavir. The PDB codes are 2fxd and 2aqu.

5. Open the SCCOMP server for side chain modeling at ignmtest.ccbb.pitt.edu/sccomp.html. This is a web interface to the program SCCOMP (Eyal et al., J. Comput. Chem., 2004, 25:712-724), which performs online modeling of side chain conformations. The scoring function of this program is based on contact surface areas.

6. We will first get a general impression of the quality of the prediction by placing all side chains on the backbone and comparing them with the real positions of the side chains (known in this case).

7. Enter the PDB code 2fxd under "Enter PDB id". In the pull-down titled "Output PDB file" choose "Input coordinates and model". You will get back a file that includes both the prediction and the original side chains deposited in the file. Enter an internet-based e-mail account (such as gmail or hotmail) and submit. The calculation takes less than a minute, after which you will receive an e-mail with the resulting model attached.

8. Open the e-mail and save the model. Open the file in a molecular graphics program such as Rasmol and color it by model.

9. Which residues are predicted more accurately, buried or exposed? Which residues are predicted better, those with aromatic rings or those with linear side chains?

10. Select residue 82. Note that the phenylalanine at this position is a well-known drug-resistance mutation that evolved rapidly against several drugs. Does the program successfully model the side chain of this phenylalanine? Which type of interaction is formed between the phenylalanine and the inhibitor?

11. Do you think that this mutation will be effective as a drug-resistance mutation against Atazanavir?

12. We will now mutate the protein at this position in silico back to the wild-type residue, which is valine in this case. Go back to the SCCOMP web site and enter "2fxd" again. Now go to the "To model a residue" form and enter the residue VAL at position 82 of chains A and B. Under the "Output PDB file" pull-down choose "Model only", so that only the coordinates of the model are returned. Open the model that you get. Is there still an interaction between the valine and Atazanavir? You can use LPC to check the interactions between position 82 and the inhibitor.

Note: SCCOMP can be downloaded and run locally on unix/linux platforms. This is useful when we want to run the program automatically on many structures or mutations.

Equilibrium dynamics

It is now clear that in many cases understanding the motion of molecular machines is crucial to elucidating their function. In this exercise we will get to know a simple tool for analysis of the basic cooperative motions of a molecule at equilibrium, the Anisotropic Network Model (ANM). This tool performs a simple type of normal mode analysis. During the exercise we will work with HIV-1 protease, a major target for HIV drugs. This protein cleaves the immature viral polypeptides and allows the formation of the mature proteins needed to complete the viral life cycle.

1. Open the ANM server (Eyal et al., Bioinformatics, 2006, 22:2619-2627) at http://ignmtest.ccbb.pitt.edu/anm. This web interface takes PDB coordinates as input and calculates the dynamics of the system in that state.

2. In the "Enter the PDB id of your protein" form, enter the PDB code 2fxd. Leave the "*" symbol for the chain ID so that the calculation includes both subunits of the dimer. Leave the other parameters unchanged as well. Submit.

3. The calculation takes several seconds, after which you will see the motion of the molecule, displayed as a network of interacting residues. Red colors indicate more mobile regions and blue colors indicate static regions.

4. Select the "chain connectivity" button to see the molecule as a backbone trace colored by subunit. The direction of the motion is easier to follow if you check the "vectors" box.

5. The region between residues 46 and 54 is known as the "flap" region of HIV protease. This is a highly mobile region that controls the access of soluble molecules to the active site. Use the "Labels" pull-down to label some of the residues in the flap region.

6. The basic motion of the molecule is decomposed into normal modes. By default, you are watching the mode with the slowest and most cooperative motion. You can watch other modes (the 20 slowest modes) by changing the currently visible mode in the "Modes" pull-down.

7. Determine in which of the 10 slowest modes the flap region is most mobile. These modes are assumed to be functionally relevant for access to the binding site.

8. The simplest way to evaluate the prediction of the ANM model is to compare the magnitude of the fluctuations of each residue with that found experimentally by X-ray crystallography (the B-factor profile). Open the "B-factors/mode fluctuations" link at the bottom of the page. Compare the theoretical fluctuations predicted by the model (summed over all normal modes) with the experimental B-factors. The correlation coefficient between the two curves is 0.66.

9. The residues involved in the catalytic reaction are located at positions 25-27. Residues crucial for the binding of many inhibitors are also located around residue 50. Check in the graphs: are these regions relatively mobile or static? Can you think why?

10. You can try the ANM server on your own favorite PDB structure.
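The comparison in step 8 comes down to a correlation coefficient between two per-residue profiles. The sketch below shows one way such a number can be computed, together with the standard relation between an isotropic B-factor and a mean-square fluctuation, B = (8π²/3)<Δr²>. The function names are assumptions made for this illustration, and the sketch does not reproduce how the ANM server obtains or scales its theoretical fluctuations.

```python
import math

def bfactor_from_msf(msf):
    """Convert a mean-square fluctuation (in Å^2) to an isotropic B-factor (Å^2)."""
    return 8.0 * math.pi ** 2 / 3.0 * msf

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long profiles."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (sx * sy)
```

Because the Pearson coefficient is unchanged by a positive linear rescaling, predicted fluctuations can be correlated with B-factors directly, without first converting one into the units of the other.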