Disulphide Connectivity Prediction in Proteins Based on Secondary Structures and Cysteine Separation
Raju Balakrishnan
India Software Labs, IBM Global Services India Pvt Ltd., Embassy Golf Links, Koramangala Ring Road, Off Indiranagar, Bangalore, Karnataka, India

Abstract

Disulphide bonds are important in determining the final 3D conformation of a protein. Knowing the disulphide connectivity helps to find the final protein conformation, because it limits the conformational search space. Fariselli and Casadio [1] approached the prediction of disulphide connectivity by equating it to a maximum weight graph matching problem and assigning edge weights based on the residues in the nearest neighbourhoods of the cysteines. This paper modifies the weights by adding constraints based on secondary structure and on the separation between the cysteines in the protein chain. Prediction results show considerable improvement. The results can also provide insight into protein folding and disulphide bond formation, as they support the hypotheses on which the objective function is based.

1. Introduction

Proteins, the building blocks of life, consist of chains of amino acids. There are 20 different types of amino acids constituting protein chains. A protein can contain a single chain of amino acids or multiple amino acid chains linked together. Protein structure can be described at four levels, namely primary, secondary, tertiary, and quaternary. The sequence information of a protein, that is, the order in which amino acids constitute the chains, is its primary structure. The strands formed by these amino acids bend locally, forming sheets, coils, etc. These form the secondary structure. The secondary structures in turn fold into different shapes. This is the tertiary structure. Quaternary structure refers to the manner in which the different chains of a protein are bonded together.
Determining the tertiary structure of proteins is of prime importance in medical science. Predicting the tertiary structure from the primary structure removes the cost of the experiments needed to determine it, and enables the tertiary structure to be found for many proteins for which empirical determination is difficult or impossible. Hence protein tertiary structure prediction is an active research area.

Disulphide bonds (SS-bonds) are formed between the sulphur atoms in proteins. Cysteine is the only amino acid residue forming disulphide bonds. The major types of forces contributing to protein folding are disulphide bonds and hydrogen bonds. Compared to hydrogen bonds, disulphide bonds are far fewer in number but many times stronger. Hence it is simpler to predict the disulphide bonds. Formation of disulphide bonds adds to the stability of protein conformations. Correct prediction of disulphide bonds helps tertiary structure prediction considerably by reducing the conformational search space.

Prediction of disulphide bonds involves two sub-problems. The first is to predict the bonding state of the cysteines. The second is to predict the connectivity of the cysteines. Good solutions are available for the first sub-problem. Also, bonded and un-bonded cysteines rarely co-exist in a protein chain [3]. This paper attempts the second problem, predicting the disulphide connectivity.

2. Problem Definition

Given the amino acid sequence and secondary structure information of a protein chain with 2N bonded cysteines and no un-bonded cysteines, predict the bonding pattern of the N disulphide bonds in the protein.

The difficulty of prediction increases with the number of bonds in the protein. So the goodness, or accuracy, of the prediction needs to be evaluated against the number of cysteines in the protein chain.
For example, in a protein with only two cysteines and one disulphide bridge, the accuracy of prediction will always be 100%, as only one way of bonding is possible. For a chain with four cysteines there are three possible pairings, so the accuracy of a random predictor is about 33%. Likewise, as the number of disulphide bonds in a protein increases, the probability of a correct prediction by the random predictor decreases. The data set used consists of proteins with 2 to 6 disulphide bonds (footnote 1). Nearly 82% of proteins with disulphide bonds have only 1 to 4 disulphide bonds [1], so the percentage of proteins having more than 6 SS-bonds is small.

Footnote 1: The data set does not include proteins with one disulphide bond, as they can have only one connectivity pattern and do not need prediction.

3. Prior work

Several methods have been tried to predict the disulphide connectivity of proteins [1], [2], [5]. The method suggested in this paper is an improvement on the objective function used in [1]. Paper [2] uses neural network based methods. The method in this paper gives better results than [1] for all classes of proteins (footnote 2). Even though the neural network based methods described in [2] give better accuracy for some classes of proteins than the method proposed here, the method in [2] uses a different data set and does not use an explicit objective function. The objective functions proposed here can be combined with tools like neural networks and used for more accurate predictions. Numerous resources are available on protein folding in general [7], [8], [9], [11], [13], [14]. Databases like PDB [12] hold a collection of 3D protein structures, which can be visualized using tools like Rasmol [15].

Footnote 2: Protein classification is based on the number of disulphide bonds, e.g. proteins having 1, 2, 3, etc. disulphide bonds.

4. System and Method

4.1 Protein Data Set

The data set contains protein chains satisfying the following constraints.

1. Single chain. Justification: To keep the program complexity low. The method is scalable to multiple chains also.
2. All cysteines bonded. Justification: Predicting the bonding state of a cysteine (whether the cysteine is bonded or not) is an independent problem, and methods for accurate prediction are available. These methods can be combined with the method proposed in this paper.
3. Disulphide bonds, secondary structure information and sequence information are annotated without ambiguity. Justification: The primary and secondary structure data are used as inputs to the prediction program, and the disulphide bond annotation is used for calculating the accuracy of prediction. Inaccuracy in any of this information may result in a wrong prediction or a wrong accuracy evaluation.

The data set was downloaded from the SwissProt database [6]. Protein data with disulphide bonds (proteins annotated with the string DISULFID in SwissProt) were downloaded, and those with multiple chains or un-bonded cysteines were removed. Proteins with ambiguous disulphide bonds (bonds annotated as POTENTIAL, PROBABLE, BY SIMILARITY, or REDOX ACTIVE) or with ambiguous residue names such as X were also removed from the data set.

4.2 Measures of Accuracy

Two measures of accuracy are used for evaluating the predictions. The first is the fraction of bonds predicted correctly (Qc), and the second is the fraction of proteins for which all bonds are predicted correctly (Qp).

$$Q_c = \frac{\text{number of bonds predicted correctly}}{\text{total number of bonds}}$$

$$Q_p = \frac{\text{number of proteins with all bonds predicted correctly}}{\text{total number of proteins}}$$

The probability that a random prediction gives the correct result is taken as the accuracy of the random predictor. Any prediction accuracy needs to be evaluated against that of a random predictor. Methods giving accuracies less than or equal to that of a random predictor are not valid. For a random predictor the prediction accuracies are given below [1].

Qc of the random predictor:

$$Q_c(Rp) = \frac{1}{2B - 1} \qquad (1)$$

Qp of the random predictor:

$$Q_p(Rp) = \frac{1}{(2B - 1)!!} = \frac{1}{\prod_{i=1}^{B} (2i - 1)} \qquad (2)$$

where B is the number of disulphide bonds in the protein.
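The two accuracy measures and the random-predictor baselines of equations (1) and (2) can be sketched in a few lines of Python. This is only an illustrative sketch, not the original Java implementation: the function names and the representation of a bond as a pair of cysteine indexes are assumptions made here.

```python
from math import prod


def random_qc(num_bonds: int) -> float:
    """Equation (1): random-predictor Qc for a protein with B bonds."""
    return 1.0 / (2 * num_bonds - 1)


def random_qp(num_bonds: int) -> float:
    """Equation (2): random-predictor Qp, one over (2B - 1)!! patterns."""
    patterns = prod(2 * i - 1 for i in range(1, num_bonds + 1))
    return 1.0 / patterns


def accuracy(predicted, actual):
    """Return (Qc, Qp) for a data set.

    predicted, actual: one entry per protein, each a list of bonds,
    each bond a pair of cysteine indexes (hypothetical representation).
    """
    correct_bonds = total_bonds = correct_proteins = 0
    for pred, true in zip(predicted, actual):
        pred_set = {frozenset(bond) for bond in pred}
        true_set = {frozenset(bond) for bond in true}
        correct_bonds += len(pred_set & true_set)
        total_bonds += len(true_set)
        correct_proteins += int(pred_set == true_set)
    return correct_bonds / total_bonds, correct_proteins / len(actual)
```

For two bonds (four cysteines) both baselines evaluate to 1/3, and for three bonds Qp drops to 1/15, which is why accuracies are reported separately for each class of proteins.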
4.3 Approach

The disulphide connectivity in proteins is modeled as an undirected graph with cysteines as nodes and bonds between the cysteines as edges. The weights of the graph edges are assigned using the objective functions described in the next section. H. Gabow's N-cubed maximum weighted matching algorithm for undirected graphs [4] is used to find the set of pairs for which the sum of the edge weights is maximum.

For example, suppose we have a protein with 6 bonded cysteines, say C1 to C6. We assign weights to these cysteine pairs as shown in Table 1. This is a symmetric matrix. After assigning weights, the 3 pairs of cysteines that give the maximum sum of weights need to be found. This is an optimization problem, and Gabow's maximum weighted matching algorithm [4] is used to solve it. The output of the algorithm is the predicted connectivity of the disulphide graph. This output is compared against the actual connectivity described in the SwissProt protein annotations to calculate the accuracy. The optimization method is the same as that proposed by Fariselli and Casadio [1]. The objective function that assigns the weights to the graph edges is modified.

[Table 1: symmetric weight matrix with rows and columns C1 to C6; the numeric entries are not preserved in this transcription.]

Table 1: The weights are assigned using the objective functions described in the next section. This weight matrix is given as the input to the maximum weight matching optimization.

4.4 Objective Function

The objective function used to assign the graph weights is a combination of the following four functions.

A. Monte Carlo derived contact potential [1]
B. Weights decreasing exponentially with increasing distance between the cysteines in the chain
C. Penalty for bonds between cysteines which are less than two residues apart in the chain (footnote 3)
D. Penalty for bonds which cannot be formed without bending of alpha helices or beta sheets

Footnote 3: Here each amino acid residue in the protein is indexed in ascending order starting from 0, i.e. 0, 1, 2, 3, etc. Each index in a protein chain maps to a single amino acid residue.

Each of these functions is described in detail below.

A. Monte Carlo derived contact potential

The odds ratio contact potential modified by Monte Carlo simulated annealing was found by Fariselli and Casadio [1] to give the best predictions. We assume that each cysteine, along with four neighbouring residues (two on each side in the chain), is in contact with the other cysteine and its neighbours. For example, if the cysteine at 15 and the cysteine at 21 are forming an SS-bond, we assume that all the residues at 13, 14, 15, 16 and 17 are in contact with all the residues at 19, 20, 21, 22 and 23. We then add the contact potentials for all these residue pairs as shown in (3).

$$W_{mc} = \sum_{i} \sum_{j} W(R_i, R_j) \qquad (3)$$

Here R_i and R_j are the residues in the two neighbourhoods. The values of W(R_i, R_j) are listed in Table 5 of [1]. These are the only criteria used by Fariselli and Casadio [1].

B. Weights decreasing exponentially with increasing distance between cysteines (W_d)

A numerical factor, decreasing exponentially with the number of residues between the cysteines in the chain, is used as a weight. The following function is used to calculate these weights.

$$W_d = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-d^2 / (2\sigma^2)} \qquad (4)$$

$$\sigma = 0.75, \qquad d = \frac{|I_1 - I_2|}{100}$$

where I_1 and I_2 are the indexes of the two cysteines in the chain, so that d is the number of residues between the two cysteines divided by 100 (footnote 4).

Footnote 4: Numerical values are determined empirically.

This term in the objective function is based on the intuition that the disulphide bonding state in proteins may not reach the global minimum of energy, but may get trapped in local energy minima.
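Terms A and B can be illustrated with a short Python sketch. It is written under stated assumptions rather than taken from the original program: `potential` is a hypothetical callable standing in for the contact potential table of [1], the neighbourhood windows are simply truncated at the chain ends (the text does not say how ends are handled), and the constants 0.75 and 100 are used as they appear in the text.

```python
import math

SIGMA = 0.75   # sigma as given in the text
SCALE = 100.0  # the cysteine separation is divided by 100


def contact_weight(sequence, i1, i2, potential, flank=2):
    """Equation (3): sum the potential over both residue neighbourhoods.

    potential(a, b) is a hypothetical lookup of the contact potential
    for residue types a and b.
    """
    window1 = range(max(0, i1 - flank), min(len(sequence), i1 + flank + 1))
    window2 = range(max(0, i2 - flank), min(len(sequence), i2 + flank + 1))
    return sum(potential(sequence[a], sequence[b])
               for a in window1 for b in window2)


def distance_weight(i1, i2, sigma=SIGMA, scale=SCALE):
    """Equation (4): Gaussian weight that decays with chain separation."""
    d = abs(i1 - i2) / scale
    return math.exp(-d * d / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))
```

With the example from the text (cysteines at 15 and 21), `contact_weight` sums over 5 x 5 = 25 residue pairs, and `distance_weight` is larger for that pair than for any pair further apart in the chain.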
C. Penalty for bonds between cysteines less than two residues apart

$$W = \begin{cases} 0, & |I_1 - I_2| < 2 \\ W, & |I_1 - I_2| \ge 2 \end{cases} \qquad (5)$$

Here I_1 and I_2 refer to the indexes of the two cysteines in the chain. If the magnitude of the difference between the indexes of two cysteines is less than two, the weight for that particular bond is set to 0, i.e. a penalty is applied. Steric hindrance and excessive strain in the bonds may prevent the formation of such a bond.

D. Penalty for bonds between cysteines which necessitate bending of alpha helices or beta sheets (penalty based on secondary structure constraints)

It is observed that secondary structure units like alpha helices and beta sheets do not bend. Mostly the bending happens in turns. So protein chains are modeled as rigid alpha and beta structures connected by flexible turns. In this model, the criterion for two points in the chain to come together is that the longest single rigid segment between the points must be shorter than the sum of the lengths of all the other segments between them. Figure 1 illustrates the protein model for this method. The criterion can be formulated (footnote 5) as below.

$$d_2 = L_{max} - \sum_{i \ne max} L_i$$

$$W = \begin{cases} W, & d_2 \le 0 \\ 0, & d_2 > 0 \end{cases} \qquad (6)$$

where d_2 is the minimum distance between I_1 and I_2 possible without bending alpha helices or beta sheets; L_i is the length of a rigid or flexible segment between I_1 and I_2, measured between the positions at which the secondary structure unit starts and ends (if I_1 or I_2 lies on a particular segment, the length of the part of the segment between I_1 and I_2 is taken instead of the whole length of the segment); and L_max is the maximum of L_i between I_1 and I_2.

Footnote 5: Adjusting the alpha helix and beta sheet lengths based on the inter-residue distance may give better results.

Figure 1: The protein's rigid segments are represented by thick lines and flexible segments by thin lines. If the length of the longest rigid segment (L_max) is greater than the sum of the lengths of all the other segments (the sum of L_i for i other than max), the points I_1 and I_2 cannot come into contact (bonding) without bending L_max.

Overall function to assign weights to graph edges:

$$W = k_1 W_{mc} + k_2 W_d \qquad (7)$$

The empirically determined value 7 for k_2/k_1 is found to give the best accuracies.

The weighing factors are obtained from empirical results. They are kept high so that integer weights can be given as input to the program implementing the EG algorithm [4], which accepts only integer values. Small weights would result in higher rounding errors when converting to integer values. The adjacency list representation of the graph, with edge weights assigned as described above, is given as input to Edmonds-Gabow graph matching [4].
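The penalties of equations (5) and (6), the combined weight of equation (7), and the matching step can be sketched together in Python. Several things here are assumptions made for illustration: segments are given as hypothetical `(start, end, is_rigid)` tuples with inclusive ends, segment length is counted in residues, and the matching is done by exhaustive enumeration instead of the Edmonds-Gabow algorithm used in the paper. Enumeration returns the same optimum and is feasible for up to 6 bonds (10395 pairings), but it does not scale the way the N-cubed algorithm does.

```python
def d2(segments, i1, i2):
    """Longest rigid stretch between i1 and i2 minus all other stretches."""
    lo, hi = sorted((i1, i2))
    total = longest_rigid = 0
    for start, end, rigid in segments:
        # Only the part of a segment lying between the two cysteines counts.
        overlap = min(end, hi) - max(start, lo) + 1
        if overlap <= 0:
            continue
        total += overlap
        if rigid:
            longest_rigid = max(longest_rigid, overlap)
    return longest_rigid - (total - longest_rigid)


def edge_weight(w_mc, w_d, i1, i2, segments, k1=1.0, k2=7.0):
    """Equations (5) to (7); the ratio k2/k1 = 7 is taken from the text."""
    if abs(i1 - i2) < 2:             # equation (5): cysteines too close
        return 0.0
    if d2(segments, i1, i2) > 0:     # equation (6): would need bending
        return 0.0
    return k1 * w_mc + k2 * w_d      # equation (7)


def best_pairing(nodes, weight):
    """Exhaustive maximum-weight perfect matching over a list of nodes.

    weight(a, b) returns the edge weight; returns (total, list of pairs).
    """
    if not nodes:
        return 0.0, []
    first, rest = nodes[0], nodes[1:]
    best_total, best_pairs = float("-inf"), []
    for k, partner in enumerate(rest):
        sub_total, sub_pairs = best_pairing(rest[:k] + rest[k + 1:], weight)
        total = weight(first, partner) + sub_total
        if total > best_total:
            best_total, best_pairs = total, [(first, partner)] + sub_pairs
    return best_total, best_pairs
```

A caller would compute `edge_weight` for every cysteine pair and pass a lookup over those values as `weight`. Because the enumeration works directly on floating-point weights, the scaling to integers described above is not needed in this sketch.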
Inputs: Sequence Info; Secondary Structure Info
(1) Assign simulated annealing contact potential
(2) Modify weights based on distance between the cysteines
(3) Penalty for cysteines less than 2 residues apart
(4) Penalty based on secondary structure constraints
(5) Maximum weighted graph matching
Output: Predicted connectivity

Figure 2: Prediction flow. The sequence information and secondary structure information are parsed from SwissProt database protein files. This information is passed to modules 1 and 4 respectively. Modules 1 to 4 assign weights to the graph edges. This graph representation is passed as input to the maximum weighted matching module. This module predicts the optimum match representing the disulphide connectivity of the protein.

The entire implementation is a stand-alone Java program, except for the Edmonds-Gabow algorithm, which is a C module.

5. Results

Proteins are classified based on the number of disulphide bonds. Disulphide bond prediction accuracies are calculated for each class separately. The results, and a comparison with the results obtained by applying the prediction method of Fariselli and Casadio [1], are listed in Table 2. The abbreviations used in Table 2 are as follows.

EG: Simulated annealing contact potential applied in [1].
S: Constraints based on secondary structure.
D: Constraints and weights based on the distance between the cysteines.

6. Conclusion and future work

The methods suggested in this paper show a considerable increase in the accuracy of prediction. Analyzing the objective function may help in giving better insight into the protein folding mechanism. The secondary structure based constraints in equation (6) increase accuracy for proteins having 3 and 4 disulphide bonds. Incorporating more domain knowledge, such as adjusting the inter-residue distance in alpha and beta structures, and tools like neural networks may give better accuracies. Cysteines closer together tend to form bonds, which may be an indication that the disulphide bonding state of proteins gets trapped in local energy minima rather than attaining the global energy minimum.
[Table 2 columns: No. of SS-bonds, No. of chains, Qp(EG,S,D), Qp(EG,D), Qp(EG), Qc(EG,S,D), Qc(EG,D), Qc(EG); the numeric entries are not preserved in this transcription.]

Table 2: Qp and Qc are the measures of accuracy described in section 4.2. Qp(EG) and Qc(EG) are from the method used in prior work [1]. Qc(EG,S,D) and Qp(EG,S,D) use the method proposed in this paper, i.e. the EG potential (EG), secondary structure based constraints (S), and distance based weights and penalties for cysteines less than two residues apart (D). Qp(EG,D) and Qc(EG,D) are the accuracies of prediction without the secondary structure based constraints (S). Methods not using secondary structure information have the advantage that overall accuracy does not depend on the accuracy with which secondary structures are determined.

References

[1] Piero Fariselli and Rita Casadio, Prediction of disulphide connectivity in proteins, Bioinformatics, vol. 17, No. 10, 2001.
[2] Alessandro Vullo, Paolo Frasconi, Disulphide connectivity prediction using recursive neural networks and evolutionary information, Bioinformatics, vol. 20, No. 5.
[3] Paolo Frasconi, Andrea Passerini, A two stage SVM architecture for predicting disulphide bonding state of cysteine, Proc. of the IEEE Workshop on Neural Networks for Signal Processing, 2002.
[4] Gabow H. N., An efficient implementation of Edmonds' algorithm for maximum weighted matching in graphs, Technical Report CU-CS, Department of Computer Science, Colorado University, 1975.
[5] Fariselli P., Martelli P. L., Casadio R., A neural network based method for predicting disulphide connectivity in proteins, in Damiani et al. (KES 2002), 2002.
[6] Swiss Prot Database of protein sequences.
[7] Lecture notes, Biochemistry, Carnegie Mellon University, 4/Lec08/lec08.pdf
[8] Lecture Notes on Protein Folding, Indiana University, lecture_notes_27.pdf
[9] Christian Anfinsen, Studies on the Principles that Govern the Folding of Protein Chains, Nobel Lecture, Chemistry 1971-1980.
[10] Arnold Neumaier, Molecular Modeling of Proteins and Mathematical Prediction of Protein Structure, SIAM Rev. 39, 1997.
[11] Rob Russell, A Guide to Structure Prediction, EMBL.
[12] Protein Data Bank.
[13] Andriy Kryshtafovych, Torgeir R. Hvidsten et al., Fold Recognition Using Sequence Fingerprints of Protein Local Substructures, IEEE Computer Society Bioinformatics Conference (CSB'03).
[14] Mohammed J. Zaki, Shan Jin et al., Mining Residue Contacts in Proteins Using Local Structure Predictions, Proceedings of Bio-Informatics and Biomedical Engineering.
[15] Rasmol molecular visualization tool.
More informationSupplementary Materials for
advances.sciencemag.org/cgi/content/full/4/3/eaaq0762/dc1 Supplementary Materials for Structures of monomeric and oligomeric forms of the Toxoplasma gondii perforin-like protein 1 Tao Ni, Sophie I. Williams,
More informationProtein Structure and Function
Protein Structure and Function Protein Structure Classification of Proteins Based on Components Simple proteins - Proteins containing only polypeptides Conjugated proteins - Proteins containing nonpolypeptide
More informationComparative Study of K-means, Gaussian Mixture Model, Fuzzy C-means algorithms for Brain Tumor Segmentation
Comparative Study of K-means, Gaussian Mixture Model, Fuzzy C-means algorithms for Brain Tumor Segmentation U. Baid 1, S. Talbar 2 and S. Talbar 1 1 Department of E&TC Engineering, Shri Guru Gobind Singhji
More informationTWO HANDED SIGN LANGUAGE RECOGNITION SYSTEM USING IMAGE PROCESSING
134 TWO HANDED SIGN LANGUAGE RECOGNITION SYSTEM USING IMAGE PROCESSING H.F.S.M.Fonseka 1, J.T.Jonathan 2, P.Sabeshan 3 and M.B.Dissanayaka 4 1 Department of Electrical And Electronic Engineering, Faculty
More informationContents. Just Classifier? Rules. Rules: example. Classification Rule Generation for Bioinformatics. Rule Extraction from a trained network
Contents Classification Rule Generation for Bioinformatics Hyeoncheol Kim Rule Extraction from Neural Networks Algorithm Ex] Promoter Domain Hybrid Model of Knowledge and Learning Knowledge refinement
More informationThe building blocks of life.
The building blocks of life. All the functions of the cell are based on chemical reactions. the building blocks of organisms BIOMOLECULE MONOMER POLYMER carbohydrate monosaccharide polysaccharide lipid
More informationUNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014
UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 Exam policy: This exam allows two one-page, two-sided cheat sheets (i.e. 4 sides); No other materials. Time: 2 hours. Be sure to write
More informationTerm Definition Example Amino Acids
Name 1. What are some of the functions that proteins have in a living organism. 2. Define the following and list two amino acids that fit each description. Term Definition Example Amino Acids Hydrophobic
More informationEstimation of Area under the ROC Curve Using Exponential and Weibull Distributions
XI Biennial Conference of the International Biometric Society (Indian Region) on Computational Statistics and Bio-Sciences, March 8-9, 22 43 Estimation of Area under the ROC Curve Using Exponential and
More informationProteins. Amino acids, structure and function. The Nobel Prize in Chemistry 2012 Robert J. Lefkowitz Brian K. Kobilka
Proteins Amino acids, structure and function The Nobel Prize in Chemistry 2012 Robert J. Lefkowitz Brian K. Kobilka O O HO N N HN OH Ser65-Tyr66-Gly67 The Nobel prize in chemistry 2008 Osamu Shimomura,
More informationLearning Convolutional Neural Networks for Graphs
GA-65449 Learning Convolutional Neural Networks for Graphs Mathias Niepert Mohamed Ahmed Konstantin Kutzkov NEC Laboratories Europe Representation Learning for Graphs Telecom Safety Transportation Industry
More informationA prediction model for type 2 diabetes using adaptive neuro-fuzzy interface system.
Biomedical Research 208; Special Issue: S69-S74 ISSN 0970-938X www.biomedres.info A prediction model for type 2 diabetes using adaptive neuro-fuzzy interface system. S Alby *, BL Shivakumar 2 Research
More informationBiological Molecules B Lipids, Proteins and Enzymes. Triglycerides. Glycerol
Glycerol www.biologymicro.wordpress.com Biological Molecules B Lipids, Proteins and Enzymes Lipids - Lipids are fats/oils and are present in all cells- they have different properties for different functions
More informationDetails of Organic Chem! Date. Carbon & The Molecular Diversity of Life & The Structure & Function of Macromolecules
Details of Organic Chem! Date Carbon & The Molecular Diversity of Life & The Structure & Function of Macromolecules Functional Groups, I Attachments that replace one or more of the hydrogens bonded to
More informationLecture 15. Membrane Proteins I
Lecture 15 Membrane Proteins I Introduction What are membrane proteins and where do they exist? Proteins consist of three main classes which are classified as globular, fibrous and membrane proteins. A
More informationMammogram Analysis: Tumor Classification
Mammogram Analysis: Tumor Classification Term Project Report Geethapriya Raghavan geeragh@mail.utexas.edu EE 381K - Multidimensional Digital Signal Processing Spring 2005 Abstract Breast cancer is the
More informationMacromolecules. 3. There are several levels of protein structure, the most complex of which is A) primary B) secondary C) tertiary D) quaternary
Macromolecules 1. If you remove all of the functional groups from an organic molecule so that it has only carbon and hydrogen atoms, the molecule become a molecule. A) carbohydrate B) carbonyl C) carboxyl
More informationBridging task for 2016 entry. AS/A Level Biology. Why do I need to complete a bridging task?
Bridging task for 2016 entry AS/A Level Biology Why do I need to complete a bridging task? The task serves two purposes. Firstly, it allows you to carry out a little bit of preparation before starting
More informationCS612 - Algorithms in Bioinformatics
Spring 2016 Protein Structure February 7, 2016 Introduction to Protein Structure A protein is a linear chain of organic molecular building blocks called amino acids. Introduction to Protein Structure Amine
More informationHuman Activities: Handling Uncertainties Using Fuzzy Time Intervals
The 19th International Conference on Pattern Recognition (ICPR), Tampa, FL, 2009 Human Activities: Handling Uncertainties Using Fuzzy Time Intervals M. S. Ryoo 1,2 and J. K. Aggarwal 1 1 Computer & Vision
More informationPosterREPRINT AN AUTOMATED METHOD TO SELF-CALIBRATE AND REJECT NOISE FROM MALDI PEPTIDE MASS FINGERPRINT SPECTRA
Overview AN AUTOMATED METHOD TO SELF-CALIBRATE AND REJECT NOISE FROM MALDI PEPTIDE MASS FINGERPRINT SPECTRA Jeffery M Brown, Neil Swainston, Dominic O. Gostick, Keith Richardson, Richard Denny, Steven
More informationTHE UNIVERSITY OF MANITOBA. DATE: Oct. 22, 2002 Midterm EXAMINATION. PAPER NO.: PAGE NO.: 1of 6 DEPARTMENT & COURSE NO.: 2.277/60.
PAPER NO.: PAGE NO.: 1of 6 GENERAL INSTRUCTIONS You must mark the answer sheet with pencil (not pen). Put your name and enter your student number on the answer sheet. The examination consists of multiple
More informationAutomatic Detection of Heart Disease Using Discreet Wavelet Transform and Artificial Neural Network
e-issn: 2349-9745 p-issn: 2393-8161 Scientific Journal Impact Factor (SJIF): 1.711 International Journal of Modern Trends in Engineering and Research www.ijmter.com Automatic Detection of Heart Disease
More informationMRI Image Processing Operations for Brain Tumor Detection
MRI Image Processing Operations for Brain Tumor Detection Prof. M.M. Bulhe 1, Shubhashini Pathak 2, Karan Parekh 3, Abhishek Jha 4 1Assistant Professor, Dept. of Electronics and Telecommunications Engineering,
More informationAssigning B cell Maturity in Pediatric Leukemia Gabi Fragiadakis 1, Jamie Irvine 2 1 Microbiology and Immunology, 2 Computer Science
Assigning B cell Maturity in Pediatric Leukemia Gabi Fragiadakis 1, Jamie Irvine 2 1 Microbiology and Immunology, 2 Computer Science Abstract One method for analyzing pediatric B cell leukemia is to categorize
More informationMacromolecules. Note: If you have not taken Chemistry 11 (or if you ve forgotten some of it), read the Chemistry Review Notes on your own.
Macromolecules Note: If you have not taken Chemistry 11 (or if you ve forgotten some of it), read the Chemistry Review Notes on your own. Macromolecules are giant molecules made up of thousands or hundreds
More informationMost life processes are a series of chemical reactions influenced by environmental and genetic factors.
Biochemistry II Most life processes are a series of chemical reactions influenced by environmental and genetic factors. Metabolism the sum of all biochemical processes 2 Metabolic Processes Anabolism-
More informationThe three important structural features of proteins:
The three important structural features of proteins: a. Primary (1 o ) The amino acid sequence (coded by genes) b. Secondary (2 o ) The interaction of amino acids that are close together or far apart in
More informationاالمتحان النهائي لعام 1122
االمتحان النهائي لعام 1122 Amino Acids : 1- which of the following amino acid is unlikely to be found in an alpha-helix due to its cyclic structure : -phenylalanine -tryptophan -proline -lysine 2- : assuming
More informationERA: Architectures for Inference
ERA: Architectures for Inference Dan Hammerstrom Electrical And Computer Engineering 7/28/09 1 Intelligent Computing In spite of the transistor bounty of Moore s law, there is a large class of problems
More informationRevision Sheet Final Exam Term
Revision Sheet Final Exam Term-1 2018-2019 Name: Subject: Chemistry Grade: 12 A, B, C Required Materials: Chapter: 22 Section: 1,2,3,4 (Textbook pg. 669-697) Chapter: 23 Section: 1,2 (Textbook pg. 707-715)
More informationFIRM. Full Iterative Relaxation Matrix program
FIRM Full Iterative Relaxation Matrix program FIRM is a flexible program for calculating NOEs and back-calculated distance constraints using the full relaxation matrix approach. FIRM is an interactive
More information2) {p p is an irrational number that is also rational} 2) 3) {a a is a natural number greater than 6} 3)
Exam Name SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Write the set by listing its elements. 1) {a a is an even integer greater than 4} 1) 2) {p p
More informationبسم هللا الرحمن الرحيم
بسم هللا الرحمن الرحيم Q1: the overall folding of a single protein subunit is called : -tertiary structure -primary structure -secondary structure -quaternary structure -all of the above Q2 : disulfide
More informationBCB 444/544 Fall 07 Dobbs 1
BCB 444/544 Lecture 19 A bit of: Protein Structure - Basics Protein Structure Visualization, & Comparison #19_Oct5 Required Reading (before lecture) Mon Oct 1 - Lecture 17 Protein Motifs & Domain Prediction
More informationHonors Biology Chapter 3: Macromolecules PPT Notes
Honors Biology Chapter 3: Macromolecules PPT Notes 3.1 I can explain why carbon is unparalleled in its ability to form large, diverse molecules. Diverse molecules found in cells are composed of carbon
More informationPolypeptide and protein structure
Polypeptide and protein structure **structure of amino acids is very important and you must identify them Slide 1) single amino acid >more than one > polypeptide There is levels of structures of protein
More informationClassification of ECG Data for Predictive Analysis to Assist in Medical Decisions.
48 IJCSNS International Journal of Computer Science and Network Security, VOL.15 No.10, October 2015 Classification of ECG Data for Predictive Analysis to Assist in Medical Decisions. A. R. Chitupe S.
More informationECG Beat Recognition using Principal Components Analysis and Artificial Neural Network
International Journal of Electronics Engineering, 3 (1), 2011, pp. 55 58 ECG Beat Recognition using Principal Components Analysis and Artificial Neural Network Amitabh Sharma 1, and Tanushree Sharma 2
More informationPrediction of temperature factors from protein sequence
www.bioinformation.net Hypothesis Volume 9(3) Prediction of temperature factors from protein sequence Shrihari Sonavane 1 *, Ashok A Jaybhaye 2 & Ajaykumar G Jadhav 1 1Department of Microbiology, Institute
More informationCh5: Macromolecules. Proteins
Ch5: Macromolecules Proteins Essential Knowledge 4.A.1 The subcomponents of biological molecules and their sequence determine the properties of that molecule A. Structure and function of polymers are derived
More informationMITOCW MIT7_01SCF11_track13_300k.mp4
MITOCW MIT7_01SCF11_track13_300k.mp4 HAZEL SIVE: All right. Let's move on to the second topic of our discussion today, which we will start today and then continue on Friday. And this is a discussion of
More informationBiology 5A Fall 2010 Macromolecules Chapter 5
Learning Outcomes: Macromolecules List and describe the four major classes of molecules Describe the formation of a glycosidic linkage and distinguish between monosaccharides, disaccharides, and polysaccharides
More informationSome Thoughts on Calibrating LaModel
Some Thoughts on Calibrating LaModel Dr. Keith A. Heasley Professor Department of Mining Engineering West Virginia University Introduction Recent mine collapses and pillar failures have highlighted hli
More informationHOMEWORK II and Swiss-PDB Viewer Tutorial DUE 9/26/03 62 points total. The ph at which a peptide has no net charge is its isoelectric point.
BIOCHEMISTRY I HOMEWORK II and Swiss-PDB Viewer Tutorial DUE 9/26/03 62 points total 1). 8 points total T or F (2 points each; if false, briefly state why it is false) The ph at which a peptide has no
More informationA Comparison of Collaborative Filtering Methods for Medication Reconciliation
A Comparison of Collaborative Filtering Methods for Medication Reconciliation Huanian Zheng, Rema Padman, Daniel B. Neill The H. John Heinz III College, Carnegie Mellon University, Pittsburgh, PA, 15213,
More information