Topics

The topics:
- basic concepts of molecular biology
- elements of Python
- overview of the field
- biological databases and database searching
- sequence alignments
- phylogenetic trees
- microarray data analysis
- next-generation sequencing
- protein structure prediction

Protein Synthesis

Proteins
(Figure credit: the National Health Museum)

Proteins perform a vast array of biological functions, including:
- Transport: hemoglobin (delivers O2 from the lungs)
- Mechanical support: collagen
- Storage: ferritin (stores iron)
- Regulation: repressor proteins (gene expression)
- Antibodies: immunoglobulin
- Catalysis: SOD (superoxide dismutase)
- Misfolding: mad cow disease, Alzheimer's disease

Amino acid composition

Basic amino acid structure: the side chain, R, varies for each of the 20 amino acids.
[Figure: an amino group (N) and a carboxyl group (C, O, O) attached to the central α-carbon (Cα), which carries the side chain R]
The Peptide Bond

- Formed by dehydration synthesis
- Repeating backbone: N-Cα-C-N-Cα-C

Side chain properties

- Carbon does not easily form hydrogen bonds with water: hydrophobic
- O and N are generally more likely than C to hydrogen-bond with water: hydrophilic
- The amino acids form three general groups:
  - Hydrophobic
  - Charged (positive/basic and negative/acidic)
  - Polar

The Hydrophobic Amino Acids

- Proline severely limits allowable conformations!

The Charged Amino Acids

The Polar Amino Acids

More Polar Amino Acids

And then there's
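The three side-chain groups can be turned into a small lookup for tallying the character of a sequence. The sketch below is illustrative only: the slides name the groups but the text does not say which residue belongs where, so the membership table is an assumption based on one common textbook grouping (sources disagree on borderline residues such as G, C, Y, H and W). The names `SIDE_CHAIN_GROUPS`, `classify_residue` and `group_composition` are hypothetical.

```python
from collections import Counter

# ASSUMPTION: one common textbook grouping of the 20 amino acids
# (one-letter codes). The slides do not spell out the membership,
# and other sources place G, C, Y, H and W differently.
SIDE_CHAIN_GROUPS = {
    "hydrophobic": set("AVLIMFWP"),
    "charged": set("DEKRH"),
    "polar": set("STNQYCG"),
}


def classify_residue(residue):
    """Return the assumed side-chain group for a one-letter residue code."""
    code = residue.upper()
    for group, members in SIDE_CHAIN_GROUPS.items():
        if code in members:
            return group
    raise ValueError(f"unknown residue: {residue!r}")


def group_composition(sequence):
    """Count how many residues of a sequence fall into each group."""
    return Counter(classify_residue(r) for r in sequence)


if __name__ == "__main__":
    # Example input: the primary-structure string used in the lecture.
    for group, count in group_composition("AGVGTVPMTAYGNDIQYYGQVT").items():
        print(group, count)
```

Changing the grouping only requires editing the table, which makes it easy to compare how different conventions shift the counts.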
Peptidyl polymers

- A few amino acids in a chain are called a polypeptide.
- A protein is usually composed of 50 to 00+ amino acids.

Primary & Secondary Structure

- Primary structure = the linear sequence of amino acids comprising a protein: AGVGTVPMTAYGNDIQYYGQVT
- Secondary structure: regular patterns of hydrogen bonding result in two patterns that emerge in nearly every known protein structure, the α-helix and the β-sheet
- The location and direction of these periodic, repeating structures is known as the secondary structure of the protein

Levels of Protein Structure

- Secondary structure elements combine to form tertiary structure
- Quaternary structure occurs in multi-enzyme complexes
- Many proteins are active only as homodimers, homotetramers, etc.

Dihedral angles

- The alpha helix: φ ≈ ψ ≈ -60°
- The beta strand (& sheet): φ ≈ -135°, ψ ≈ +135°
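A dihedral angle such as φ or ψ is the signed angle between two planes defined by four consecutive backbone atoms. The sketch below computes it from four 3-D points with the standard projection-and-atan2 approach, then labels a (φ, ψ) pair by whichever idealized reference pair it is closest to. The reference values are the approximate ones on the slide. The nearest-reference rule and the names `dihedral_degrees`, `REFERENCE_ANGLES` and `nearest_reference` are assumptions made for illustration, not a real secondary-structure assignment method.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to b1.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Idealized (phi, psi) pairs in degrees, taken from the slide.
REFERENCE_ANGLES = {
    "alpha helix": (-60.0, -60.0),
    "beta strand": (-135.0, 135.0),
}


def _angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def nearest_reference(phi, psi):
    """ASSUMPTION: label a (phi, psi) pair by the closest reference pair."""
    return min(
        REFERENCE_ANGLES,
        key=lambda name: math.hypot(_angle_diff(phi, REFERENCE_ANGLES[name][0]),
                                    _angle_diff(psi, REFERENCE_ANGLES[name][1])),
    )


if __name__ == "__main__":
    print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
    print(nearest_reference(-120.0, 130.0))
```

In practice the four points would be atomic coordinates read from a structure file, for example C(i-1), N(i), Cα(i), C(i) for φ and N(i), Cα(i), C(i), N(i+1) for ψ.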
Determining Protein Structure

- There are >>20,000 distinct proteins in the human proteome.
- Two methods for revealing the positions of atoms in 3-D:
  - X-ray crystallography
    - X-ray diffraction pattern + mathematical construction
    - A good protein crystal and good diffraction resolution are needed
  - Nuclear magnetic resonance (NMR)
    - Small proteins only (< 250 residues)
    - Inter-proton distances + geometric constraints

Disulfide Bonds

- Two cysteines in close proximity will form a covalent bond
- Called a disulfide bond, disulfide bridge, or dicysteine bond
- Significantly stabilizes tertiary structure

Bovine Ribonuclease

- Christian Anfinsen, 1957
- "Principles that govern the folding of protein chains", Christian Anfinsen, Science, 1973
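When a chain contains several cysteines, the number of distinct ways to pair all of them into disulfide bonds grows quickly: with n cysteines, the first one can pick any of n-1 partners, and the remaining n-2 are paired the same way, giving f(n) = (n-1) f(n-2). A minimal sketch of that recurrence follows. It assumes the base case f(2) = 1 (two cysteines, one possible bond), which the slides do not state explicitly, and the function name `disulfide_pairings` is hypothetical.

```python
def disulfide_pairings(n_cysteines):
    """Number of ways to pair all n cysteines into n/2 disulfide bonds.

    Implements f(n) = (n - 1) * f(n - 2) iteratively.
    ASSUMPTION: base case f(2) = 1 (and f(0) = 1 for the empty product).
    """
    if n_cysteines < 0 or n_cysteines % 2 != 0:
        raise ValueError("need a non-negative, even number of cysteines")
    count = 1
    for n in range(2, n_cysteines + 1, 2):
        count *= n - 1
    return count


if __name__ == "__main__":
    for n in (4, 6, 8, 10, 12):
        print(n, n // 2, disulfide_pairings(n))
```

The loop avoids recursion so that large inputs do not run into Python's recursion limit.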
Disulfide Bonds

Number of possible disulfide pairings, f(n) = (n-1) f(n-2):

# of cysteines      4    6    8     10    12
# of S-S bonds      2    3    4     5     6
# of combinations   3    15   105   945   10395

Levinthal's paradox

- How do proteins find the right conformation out of the simply endless number of potential three-dimensional forms that they could randomly fold into?
- Consider a 100-residue protein. If each residue can take only 3 positions, there are 3^100 ≈ 5 × 10^47 possible conformations.
- If it takes 10^-13 s to convert from one structure to another, an exhaustive search would take 1.6 × 10^27 years!

What determines fold?

- Anfinsen's experiments in 1957 demonstrated that proteins can fold spontaneously into their native conformations under physiological conditions.
- This implies that primary structure does indeed determine folding, or 3-D structure.
- Some exceptions exist:
  - Chaperone proteins assist folding
  - Abnormally folded prion proteins can catalyze misfolding of normal prion proteins, which then aggregate
- Current Opinion in Structural Biology, 200, 1, 70-75

Other factors

Physical properties of a protein that influence its stability and therefore determine its fold:
- Rigidity of the backbone
- Amino acid interaction with water
  - Hydropathy index for side chains
- Interactions among amino acids
  - Electrostatic interactions
  - Hydrogen and disulfide bonds
- Volume constraints

CASP changed the landscape

- Critical Assessment of Structure Prediction competition
- Held in even-numbered years since 1994
- Solved but unpublished structures are posted in May; predictions are due in September
- Various categories
  - Relation to existing structures: ab initio, homology, fold, etc.
  - Partial vs. fully automated approaches
- Produces lots of information about which aspects of the problem are hard, and ends arguments about test sets
- Results show steady improvement and the value of integrative approaches
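The Levinthal estimate above is plain arithmetic and can be checked directly. The sketch below uses the figures from the slide (100 residues, 3 positions per residue, 10^-13 s per conformational change). The function name, the parameter names and the 365.25-day year are assumptions made for illustration.

```python
# ASSUMPTION: a year is taken as 365.25 days for the unit conversion.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def levinthal_estimate(n_residues=100, states_per_residue=3,
                       seconds_per_step=1e-13):
    """Return (number of conformations, exhaustive search time in years)."""
    conformations = states_per_residue ** n_residues  # exact integer
    years = conformations * seconds_per_step / SECONDS_PER_YEAR
    return conformations, years


if __name__ == "__main__":
    conformations, years = levinthal_estimate()
    print(f"conformations: {conformations:.2e}")
    print(f"search time:   {years:.2e} years")
```

Varying `states_per_residue` or `n_residues` shows how sharply the search space grows, which is the point of the paradox: real proteins cannot be folding by exhaustive search.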