Protein Structure
recommended books
Proteins: definition. From Greek proteios (superior, of the first rank); term coined by J. J. Berzelius, 1838. Functions: structural, enzymes, muscle, transport, immune system. Linear polymer of amino acid residues. Connection: peptide bond (polypeptide chain). 3D structure: 1950s, myoglobin and hemoglobin (Kendrew and Perutz).
4 levels of protein structure
Primary: the linear sequence of amino acids
Secondary: the local organization of parts of a polypeptide chain (α-helix, β-sheet)
Tertiary: the overall, three-dimensional arrangement of the polypeptide chain (domain: folding unit)
Quaternary: the assembly of two or more polypeptides into a multisubunit complex
Part I protein main chain
Amino acids: the building blocks. In general: 2 functional groups (α-amino acids). R: side chain; 20 different amino acids are genetically encoded.
the building blocks: α-carboxyl group, pKa 2.0; α-amino group, pKa 9.5
pKa and pI: The fractional protonation at a given pH and the pI can be calculated with the Henderson-Hasselbalch equation. http://www.rose-hulman.edu/%7ebrandt/chem423a/lecturenotes/amino_acid_properties.pdf
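The Henderson-Hasselbalch relation can be sketched in a few lines. This is a minimal illustration that uses the two main-chain pKa values quoted on the slides (2.0 and 9.5) and assumes an amino acid without an ionisable side chain; the function names are hypothetical.

```python
PKA_CARBOXYL = 2.0  # alpha-carboxyl group (value from the slides)
PKA_AMINO = 9.5     # alpha-amino group (value from the slides)


def fraction_protonated(pH: float, pKa: float) -> float:
    """Fraction of a group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def net_charge(pH: float) -> float:
    """Net charge of an amino acid with no ionisable side chain (assumption)."""
    positive = fraction_protonated(pH, PKA_AMINO)               # -NH3+ carries +1
    negative = 1.0 - fraction_protonated(pH, PKA_CARBOXYL)      # -COO- carries -1
    return positive - negative


def isoelectric_point(pKa_low: float, pKa_high: float) -> float:
    """pI as the mean of the two pKa values that flank the neutral species."""
    return (pKa_low + pKa_high) / 2.0


if __name__ == "__main__":
    pI = isoelectric_point(PKA_CARBOXYL, PKA_AMINO)
    print(f"pI = {pI:.2f}, net charge at pI = {net_charge(pI):.3g}")
```

For residues with ionisable side chains the pI is still the mean of the two pKa values around the neutral form, but which two values those are depends on the side chain.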
Absolute configuration (the building blocks): Amino acids contain asymmetric carbons. α-L-amino acids: S configuration.
the peptide bond: Condensation reaction, ΔG = +10 kJ/mol; metastable in water. Activation by esterification with the CCA end of tRNAs.
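To see what a ΔG of +10 kJ/mol means for the condensation equilibrium, the value can be converted into an equilibrium constant with K = exp(-ΔG/RT). The sketch below assumes that the slide value is a standard free energy and that the temperature is 298.15 K; neither is stated on the slide.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)


def equilibrium_constant(delta_g_kj_per_mol: float, temperature_k: float = 298.15) -> float:
    """Equilibrium constant from a standard free energy change, K = exp(-dG/RT)."""
    return math.exp(-delta_g_kj_per_mol / (R_KJ * temperature_k))


# Value from the slide; the temperature is an assumption.
K_CONDENSATION = equilibrium_constant(10.0)

if __name__ == "__main__":
    print(f"K = {K_CONDENSATION:.4f}")
```

K comes out well below 1, so hydrolysis is favoured at equilibrium. The peptide bond persists in water only because hydrolysis is slow, which is what "metastable" refers to.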
the peptide bond Ramachandran et al., Biochim. Biophys. Acta 339 (1974), 298
the peptide bond: planarity. The peptide bond has partial double-bond character (40%). The peptide bond has a constant dipole µ (3.5 Debye).
cis and trans peptides: Configuration of the peptide bond. Peptide bonds are mainly in trans. Conformation defined by angle ω (trans: 180°, cis: 0°). Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
cis and trans peptides: Cis bonds cause steric hindrance. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
cis and trans peptides Only proline residues are significantly in cis. after Stryer, Biochemistry, Copyright 2002 by W.H. Freeman & Co
the peptide bond: Each peptide bond formation requires about 4 ATP and 4 GTP molecules for translation and quality control. Ramachandran plot: the relative orientation of adjacent peptide planes is set by rotation around the Φ and Ψ angles.
Φ and ψ angles after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
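Φ, Ψ and ω are all dihedral (torsion) angles defined by four consecutive main-chain atoms. A minimal sketch of the calculation from Cartesian coordinates follows; the helper names are hypothetical and points are plain (x, y, z) tuples.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3) -> float:
    """Signed dihedral angle in degrees about the p1-p2 bond.

    With main-chain atoms of residue i:
      phi   = dihedral(C(i-1), N(i),  CA(i),  C(i))
      psi   = dihedral(N(i),   CA(i), C(i),   N(i+1))
      omega = dihedral(CA(i),  C(i),  N(i+1), CA(i+1))
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Idealised test geometry: a trans arrangement of four points.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

Using atan2 rather than acos keeps the sign of the angle, which is needed to tell right-handed from left-handed conformations in the Ramachandran plot.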
Ramachandran plot (axes: Φ, Ψ) Hoeffken et al., J. Mol. Biol. 204 (1987), 629
Ramachandran plot (figure): labelled regions: β-sheet, right-handed α-helix; legend: most favoured regions, additionally allowed regions, generously allowed regions, disallowed regions
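α-helix: 3.6 amino acid residues per turn; pitch 5.4 Å; helix dipole with δ+ at the N-terminal end and δ- at the C-terminal end
α-helix: The α-helix is a 3.6₁₃-helix (3.6 residues per turn, 13 atoms in the hydrogen-bonded ring). Helices like to form bundles, coiled coils or transmembrane domains (TMDs).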
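The two helix parameters on the slides (3.6 residues per turn, 5.4 Å per turn) fix the rise and the rotation per residue. The short sketch below derives both and estimates the length of a helix; the function name and the 20-residue example are illustrative assumptions.

```python
RESIDUES_PER_TURN = 3.6  # from the slides
PITCH_ANGSTROM = 5.4     # from the slides

RISE_PER_RESIDUE = PITCH_ANGSTROM / RESIDUES_PER_TURN  # Angstrom per residue
ROTATION_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN       # degrees per residue


def helix_length(n_residues: int) -> float:
    """Approximate axial length (Angstrom) of an ideal alpha-helix."""
    return n_residues * RISE_PER_RESIDUE


if __name__ == "__main__":
    print(f"rise per residue: {RISE_PER_RESIDUE:.2f} A")
    print(f"rotation per residue: {ROTATION_PER_RESIDUE:.0f} degrees")
    print(f"20-residue helix: {helix_length(20):.1f} A")
```

A helix of about 20 residues is roughly 30 Å long, which is on the order of the thickness usually assumed for the hydrophobic part of a membrane. That thickness is not taken from the slides.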
the helical wheel (figure: heptad positions 1-7 viewed along the helix axis; sequence numbering 1234567 1234567 1)
the helical wheel: hydrophobic helix: protein core (membrane); amphipathic helix: protein surface (membrane surface); hydrophilic helix: exposed. Branden & Tooze, Protein Structure, 1999 Garland Publ.
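A helical wheel places successive residues 100° apart (360° divided by 3.6 residues per turn), so one can test numerically whether the hydrophobic residues cluster on one face. The sketch below is a crude illustration only: the set of "hydrophobic" residues and the binary +1/-1 weighting are assumptions made for this example, not a published hydrophobicity scale, and both sequences are made up.

```python
import math

# Assumption for illustration: residues treated as hydrophobic.
HYDROPHOBIC = set("AVLIFMW")
DEGREES_PER_RESIDUE = 100  # 360 degrees / 3.6 residues per turn


def wheel_angles(sequence: str):
    """Return (residue, angle in degrees) for each position on the wheel."""
    return [(res, (i * DEGREES_PER_RESIDUE) % 360)
            for i, res in enumerate(sequence.upper())]


def hydrophobic_moment(sequence: str) -> float:
    """Length of the mean weight vector: near 0 = uniform, larger = amphipathic."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    x = y = 0.0
    for res, angle in wheel_angles(sequence):
        weight = 1.0 if res in HYDROPHOBIC else -1.0
        x += weight * math.cos(math.radians(angle))
        y += weight * math.sin(math.radians(angle))
    return math.hypot(x, y) / len(sequence)


if __name__ == "__main__":
    for seq in ("LLLLLLLLLLLLLLLLLL", "LKKLLKKLLKLLKKLLKK"):
        print(seq, round(hydrophobic_moment(seq), 2))
```

A uniformly hydrophobic helix gives a moment close to zero, while a sequence with leucines on one face and lysines on the other gives a clearly larger value.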
β-sheet: Extended zigzag of the peptide chain. Location of side chains alternates. β-sheets form by the alignment of β-strands, parallel or antiparallel. β-sheets are twisted and like to form β-barrels. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
parallel β-sheet
antiparallel β-sheet
Ramachandran plot (figure): labelled regions: poly-proline helix, antiparallel β-sheet, parallel β-sheet, π-helix, 3-10 helix, left-handed α-helix, right-handed α-helix; legend: most favoured regions, additionally allowed regions, generously allowed regions, disallowed regions
Ramachandran plot (figure): turn types β I, β II, β III, β V, β VIa, β VIb, β VIII, γ and inverse γ. Rose et al., Adv. Prot. Chem. 37 (1985), 1; Chou & Fasman, J. Mol. Biol. 115 (1977), 135; Wilmot & Thornton, J. Mol. Biol. 203 (1988), 221; Richardson, Adv. Prot. Chem. 34 (1981), 167
Which secondary structure is formed? Propensities: secondary structures can be predicted.
amino acid side chains
Amino acid classification: amino acid classes
1. Aliphatic
2. Proline (imino acid)
3. Hydroxyl (polar)
4. Acidic
5. Amide (polar)
6. Basic
7. Histidine
8. Aromatic
9. Sulfur containing
1. Aliphatic: Glycine (Gly, G). R: -H. Glycine is not asymmetric. High conformational flexibility. All aliphatic residues are chemically inert; structural role (hydrophobic core). molecular mass: 57 dalton; frequency: 7.2 %. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
1. Aliphatic: Alanine (Ala, A). R: -CH3. molecular mass: 71 dalton; frequency: 8.3 %. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
1. Aliphatic: Valine (Val, V). R: -CH(CH3)2. molecular mass: 99 dalton; frequency: 6.6 %. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
1. Aliphatic: Leucine (Leu, L). R: -CH2-CH(CH3)2. molecular mass: 113 dalton; frequency: 9.0 %. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
1. Aliphatic: Isoleucine (Ile, I). Note: additional asymmetric β-carbon. molecular mass: 113 dalton; frequency: 5.2 %. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
2. Proline (imino acid): Proline (Pro, P). R: -CH2-CH2-CH2-. Side chain is bonded to the main-chain nitrogen. Rigid; breaks secondary structures. molecular mass: 97 dalton; frequency: 5.1 %. Figure: black = protein main chain. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
3. Hydroxyl (polar): Serine (Ser, S). R: -CH2-OH. Hydroxyl group is polar but chemically inert. molecular mass: 87 dalton; frequency: 6.9 %. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
3. Hydroxyl (polar): Threonine (Thr, T). R: -CH(CH3)-OH. Note: additional asymmetric β-carbon. molecular mass: 101 dalton; frequency: 5.8 %. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
4. Acidic: Aspartic acid (aspartate; Asp, D). R: -CH2-COOH. Used for ionic and polar interactions (active sites, metal chelating). molecular mass: 115 dalton; frequency: 5.3 %; pKa: 4.0. Figure: shaded = partial double bond. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
4. Acidic: Glutamic acid (glutamate; Glu, E). R: -CH2-CH2-COOH. Used for ionic and polar interactions. molecular mass: 129 dalton; frequency: 6.2 %; pKa: 4.4. Figure: shaded = partial double bond. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
5. Amide (polar): Asparagine (Asn, N). R: -CH2-CONH2. Amide form of aspartic acid. Relatively inert, but the amide is labile. molecular mass: 114 dalton; frequency: 4.4 %. Figure: black = double bond. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
5. Amide (polar): Glutamine (Gln, Q). R: -CH2-CH2-CONH2. Amide form of glutamic acid. molecular mass: 128 dalton; frequency: 4 %. Figure: black = double bond. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
5. Amide (polar): Glutamine (Gln, Q). Glutamine spontaneously cyclises at the N-terminus to pyrrolidone carboxylic acid. Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
6. Basic: Lysine (Lys, K). R: -(CH2)4-NH2. Non-ionised Lys is a potent nucleophile! molecular mass: 128 dalton; frequency: 5.7 %; pKa: 10.4-11.1. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
6. Basic: Lysine (Lys, K). Arylation with 2,4,6-trinitrobenzene sulfonate; absorbance at 367 nm. Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
6. Basic: Lysine (Lys, K). Acetylation by anhydrides (reversible). Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
Lysine (Lys, K): reversible Schiff base with aldehydes. Example: PLP (Vit. B6). Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
6. Basic: Arginine (Arg, R). R: -(CH2)3-NH-C(NH2)2. Used for ionic and polar interactions (e.g. with nucleotides). Guanidinium group is planar. molecular mass: 156 dalton; frequency: 5.7 %; pKa: 12. Figure: shaded = partial double bond. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
7. Histidine: Histidine (His, H). Contains a very reactive imidazole ring. Nucleophilic and acid-base reactions (active sites). molecular mass: 137 dalton; frequency: 2.2 %; pKa: 6-7. Figure: black = double bond. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
Histidine (His, H), pKa: 6-7: an aromatic heterocycle with one pyrrole N (δ1) and one pyridine N (ε2). The two nitrogens contribute 3 valence electrons to the aromatic ring. In the neutral imidazole ring, six π-electrons are delocalised over the five p-orbitals of the five ring atoms. This makes for three bonding molecular orbitals. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co and Kyte, Structure in Protein Chemistry, 1995 by Garland Publ. Inc.
Histidine (His, H) (figure; labels: +, -) after Kyte, Structure in Protein Chemistry, 1995 by Garland Publ. Inc.
8. Aromatic: Phenylalanine (Phe, F). Hydrophobic and inert (protein core). molecular mass: 147 dalton; frequency: 3.9 %. Figure: shaded = partial double bond. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
8. Aromatic: Tyrosine (Tyr, Y). Less hydrophobic due to hydroxyl group. Used for ligand interactions. Absorbs at 280 nm. molecular mass: 163 dalton; frequency: 3.2 %; pKa: 11. Figure: shaded = partial double bond. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
8. Aromatic: Tryptophan (Trp, W). Indole ring important for absorption and fluorescence. molecular mass: 186 dalton; frequency: 1.3 %. Figure: shaded = partial double bond; black = double bond. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
Phe Trp Tyr Wetlaufer, Adv. Protein Chem. 17 (1962), 303
Phe Trp Tyr Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
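Because Trp and Tyr dominate protein absorbance near 280 nm, a protein concentration can be estimated from A280 with the Beer-Lambert law, A = ε · c · l. The sketch below is illustrative only: the per-residue coefficients (about 5500 M⁻¹ cm⁻¹ for Trp and 1490 M⁻¹ cm⁻¹ for Tyr) are widely used approximations that do not come from the slides, and the small contributions of Phe and disulfides are ignored.

```python
# Assumed approximate molar extinction coefficients at 280 nm (M^-1 cm^-1);
# these values are not taken from the slides.
EPSILON_280 = {"W": 5500.0, "Y": 1490.0}


def molar_extinction_280(sequence: str) -> float:
    """Estimated molar extinction coefficient at 280 nm from Trp and Tyr counts."""
    return sum(EPSILON_280.get(res, 0.0) for res in sequence.upper())


def concentration_molar(a280: float, sequence: str, path_cm: float = 1.0) -> float:
    """Protein concentration (mol/L) from A280 via Beer-Lambert, A = eps * c * l."""
    eps = molar_extinction_280(sequence)
    if eps == 0.0:
        raise ValueError("sequence contains no Trp or Tyr; A280 is not usable")
    return a280 / (eps * path_cm)


if __name__ == "__main__":
    seq = "MKWYAGY"  # made-up sequence for illustration
    print(molar_extinction_280(seq), concentration_molar(0.5, seq))
```

Proteins without Trp or Tyr absorb very little at 280 nm, which is why the function refuses to return a concentration in that case.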
9. Sulfur containing: Cysteine (Cys, C). R: -CH2-SH. Thiol group is the most reactive side chain (active sites). Thiolate is a potent nucleophile! Sensitive to oxidation! molecular mass: 103 dalton; frequency: 1.7 %; pKa: 9.0-9.5. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
9. Sulfur containing: Cysteine (Cys, C). Cysteine oxidation: disulfide formation. Cellular system to prevent oxidation (figure). Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
9. Sulfur containing: Cysteine (Cys, C). Thiol-disulfide exchange. Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
9. Sulfur containing: Cysteine (Cys, C). Cleland's reagent: DTT. Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
9. Sulfur containing: Cysteine (Cys, C). Ellman assay: DTNB (di-thio-bis-nitrobenzoic acid). after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
9. Sulfur containing: Cysteine (Cys, C). Alkylation at basic pH; pKa: 8.2-9.5. Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
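Alkylation works best at basic pH because the reactive species is the thiolate, and its share of the total thiol rises steeply as the pH approaches the pKa. The sketch below evaluates that share for the pKa range given on the slide (8.2-9.5); the pH values in the demo are arbitrary examples.

```python
PKA_RANGE = (8.2, 9.5)  # cysteine thiol pKa range from the slide


def fraction_thiolate(pH: float, pKa: float) -> float:
    """Fraction of thiol groups present as thiolate (deprotonated) at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


def thiolate_range(pH: float):
    """Thiolate fraction at this pH for the high and low end of the pKa range."""
    low_pka, high_pka = PKA_RANGE
    return fraction_thiolate(pH, high_pka), fraction_thiolate(pH, low_pka)


if __name__ == "__main__":
    for pH in (7.0, 8.5, 10.0):
        lo, hi = thiolate_range(pH)
        print(f"pH {pH}: {lo:.3f} to {hi:.3f} thiolate")
```

At neutral pH only a few percent of the thiol is deprotonated even for the lowest pKa in the range, whereas one pH unit above the pKa about 90 % is present as thiolate.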
9. Sulfur containing: Cysteine (Cys, C). p-mercuribenzoic acid. Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
9. Sulfur containing: Methionine (Met, M). R: -CH2-CH2-S-CH3. Long, flexible and hydrophobic. Less sensitive to oxidation. molecular mass: 131 dalton; frequency: 2.4 %. after Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co; amino acid frequency McCaldon & Argos, Proteins 4 (1988), 99
9. Sulfur containing: Methionine (Met, M). Methionine oxidation by air. Creighton, Proteins, Copyright 1993 by W.H. Freeman & Co
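The residue masses listed on the side-chain slides can be combined into an approximate molecular mass for any sequence: sum the residue masses and add one water (18 dalton) for the free N- and C-termini. The sketch below uses the rounded slide values, so results are approximate; the function name and the example sequence are made up.

```python
# Residue masses in dalton, as listed on the slides (rounded values).
RESIDUE_MASS = {
    "G": 57, "A": 71, "V": 99, "L": 113, "I": 113, "P": 97,
    "S": 87, "T": 101, "D": 115, "E": 129, "N": 114, "Q": 128,
    "K": 128, "R": 156, "H": 137, "F": 147, "Y": 163, "W": 186,
    "C": 103, "M": 131,
}
WATER_MASS = 18  # H on the N-terminus plus OH on the C-terminus


def protein_mass(sequence: str) -> int:
    """Approximate mass (dalton) of an unmodified polypeptide chain."""
    total = WATER_MASS
    for res in sequence.upper():
        if res not in RESIDUE_MASS:
            raise ValueError(f"unknown residue: {res!r}")
        total += RESIDUE_MASS[res]
    return total


if __name__ == "__main__":
    print(protein_mass("MGSKW"))  # made-up pentapeptide
```

Modified residues, such as the phosphorylated or glycosylated forms, are not covered by this table and would need their own entries.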
A Venn diagram showing the relationship of the 20 naturally occurring amino acids Taylor, W. R. The Classification of Amino Acid Conservation. J. Theor. Biol. 119 (1986), 205-218; Livingstone & Barton, Meth. Enzymol. 266 (1996), 497-512.
Essential amino acids: Ile (I), Leu (L), Val (V), Trp (W), Phe (F), Met (M), Lys (K), Thr (T)
Modified amino acids
phosphorylation: O-phosphonoserine, O-phosphonothreonine, O-phosphonotyrosine, O-phosphonoglutamate, N-phosphonolysine, N-phosphonohistidine, N-phosphonoarginine, S-phosphonocysteine. Jack Kyte, Structure in Protein Chemistry, 1995 by Garland Publ. Inc.; Uy, R. & Wood, F., Science 198 (1977), 890-896.
glycosylation: O-(polymannosyl)serine, O-(polymannosyl)threonine, O-[oligo(α1,2)galactosyl]serine, O-[3-O-(β-glucosyl)-α-fucosyl]threonine, O-[2-O-(α-glucosyl)-β-galactosyl]-5-hydroxylysine, O-(β-xylosyl)serine, O-[4-O-(β-galactosyl)-β-xylosyl]serine, S-digalactosylcysteine, S-triglucosylcysteine, O-(glucosylarabinosyl)hydroxyproline, O-(N-acetylglucosaminyl)serine. Jack Kyte, Structure in Protein Chemistry, 1995 by Garland Publ. Inc.; Uy, R. & Wood, F., Science 198 (1977), 890-896.