In situ proteolysis for protein crystallization and structure determination

Aiping Dong, Xiaohui Xu, Aled M Edwards, Midwest Center for Structural Genomics Members & Structural Genomics Consortium

Supplementary figures and text:
Supplementary Figure 1. Representations of the protein structures.
Supplementary Table 1. Composition of the 96-condition crystal screen used for in situ proteolysis.
Supplementary Table 2. Mass spectrometry characterization of protein crystals that were used for structure determination.
Supplementary Methods
Supplementary Results

Supplementary Figure 1a. Ribbon representation of SCO6256. Dotted lines represent the parts of the protein digested with protease. Grey molecules represent other molecules in the crystal lattice.

Supplementary Figure 1b. Ribbon representation of SCO4942. Dotted lines represent the parts of the protein digested with protease. Grey molecules represent other molecules in the crystal lattice.

Supplementary Figure 1c. Ribbon representation of NE2398. Dotted lines represent the parts of the protein digested with protease. Blue molecules represent other molecules in the crystal lattice.

Supplementary Figure 1d. Ribbon representation of Atu0899. Dotted lines represent the parts of the protein digested with protease. Grey molecules represent other molecules in the crystal lattice.

Supplementary Figure 1e. Ribbon representation of Atu2452. Dotted lines represent the parts of the protein digested with protease. Grey molecules represent other molecules in the crystal lattice.

Supplementary Figure 1f. Ribbon representation of Atu0870. Dotted lines represent the parts of the protein digested with protease. Grey molecules represent other molecules in the crystal lattice.

Supplementary Figure 1g. Ribbon representation of HP0029. Dotted lines represent the parts of the protein digested with protease. Grey molecules represent other molecules in the crystal lattice.

Supplementary Figure 1h. Ribbon representation of Atu0299. Dotted lines represent the parts of the protein digested with protease. Grey molecules represent other molecules in the crystal lattice.

Supplementary Figure 1i. Ribbon representation of Atu0434. Dotted lines represent the parts of the protein digested with protease. Grey molecules represent other molecules in the crystal lattice.

Supplementary Figure 1j. Ribbon representation of the AIRS domain of human GART. Dotted lines represent the parts of the protein digested with protease. Grey molecules represent other molecules in the crystal lattice.

Supplementary Table 1. Composition of the 96-condition crystal screen used for in situ proteolysis

Bmain 96w | Precipitant 1 | Concentration | Precipitant 2 | Concentration | Buffer (0.1 M) | pH
Bm-N1 | Ammonium Sulphate | 1.5 M | | | Bis-Tris | 6.5
Bm-N2 | Ammonium Sulphate | 1.5 M | | | Hepes Na | 7.5
Bm-N3 | Ammonium Sulphate | 1.5 M | | | Tris | 8.5
Bm-N4 | Ammonium Sulphate | 2.5 M | | | Na Acetate | 4.6
Bm-N5 | Ammonium Sulphate | 2.5 M | | | Bis-Tris Propane | 7
Bm-N6 | Ammonium Sulphate | 2.5 M | | | Tris | 8.5
Bm-N7 | Ammonium Sulfate | 2.0 M | K/Na Tartrate | 0.2 M | Sodium Citrate | 5.6
Bm-N8 | Ammonium Sulfate | 2 M | PEG 400 | 2% | Na Hepes | 7.5
Bm-N9 | PEG 3350 | 25% w/v | Ammonium Sulphate | 0.1 M | Bis-Tris | 5.5
Bm-N10 | PEG 3350 | 25% w/v | Ammonium Sulphate | 0.2 M | Bis-Tris | 6.5
Bm-N11 | PEG 3350 | 25% w/v | Ammonium Sulphate | 0.2 M | HEPES | 7.5
Bm-N12 | PEG 3350 | 25% w/v | Ammonium Sulphate | 0.2 M | Tris | 8.5
Bm-N13 | Succinic Acid | 0.8 M | | | | 7
Bm-N14 | Ammonium Phosphate | 2.0 M | | | Tris HCl | 8.5
Bm-N15 | PEG 3350 | 25% w/v | Ammonium Acetate | 0.2 M | Bis-Tris | 5.5
Bm-N16 | PEG 3350 | 25% w/v | Ammonium Acetate | 0.2 M | Bis-Tris | 6.5
Bm-N17 | PEG 3350 | 25% w/v | Ammonium Acetate | 0.2 M | HEPES | 7.5
Bm-N18 | PEG 8000 | 9% | Calcium Acetate | 0.2 M | Na Cacodylate | 6.5
Bm-N19 | PEG 3350 | 20% | Ammonium Chloride | 0.2 M | |
Bm-N20 | PEG 3350 | 20% | Ammonium dihydrogen Phosphate | 0.2 M | |
Bm-N21 | PEG 3350 | 20% | Ammonium Formate | 0.2 M | |
Bm-N22 | PEG 3350 | 20% | Ammonium Iodide | 0.2 M | |
Bm-N23 | Sodium-Potassium Phosphate | 1.0 M | | | | 8.2
Bm-N24 | PEG 3350 | 20% | di-ammonium Tartrate | 0.2 M | |
Bm-N25 | PEG 3350 | 20% w/v | Sodium Formate | 0.2 M | |
Bm-N26 | PEG 3350 | 25% | | | Na Citrate | 5.6
Bm-N27 | PEG 4000 | 30% | Ammonium Acetate | 0.2 M | Na Acetate | 4.6
Bm-N28 | tri-ammonium Citrate pH 7.0 | 1.0 M | | | Bis-Tris Propane | 7
Bm-N29 | PEG 3350 | 20% | Calcium Chloride | 0.2 M | |
Bm-N30 | PEG 3350 | 20% | K Sulfate | 0.2 M | |
Bm-N31 | PEG 3350 | 20% | K Thiocyanate | 0.2 M | |
Bm-N32 | PEG 3350 | 25% w/v | Li Sulphate | 0.2 M | Bis-Tris | 6.5
Bm-N33 | PEG 3350 | 25% w/v | Li Sulphate | 0.2 M | HEPES | 7.5
Bm-N34 | PEG 3350 | 20% | Mg Acetate | 0.2 M | |
Bm-N35 | PEG 3350 | 20% | Na dihydrogen Phosphate | 0.2 M | |
Bm-N36 | PEG 3350 | 20% | di-ammonium hydrogen Citrate | 0.2 M | |
Bm-N37 | Na Chloride | 3.2 M | | | Na Acetate | 4.6
Bm-N38 | Na Chloride | 3.2 M | | | Bis-Tris Propane | 7
Bm-N39 | Na Chloride | 3.2 M | | | Tris | 8.5
Bm-N40 | Na Citrate | 1.4 M | | | Na Hepes | 7.5
Bm-N41 | Na Formate | 2.0 M | | | Na Acetate | 4.6
Bm-N42 | Ammonium Sulfate | 1.6 M | Na Chloride | 0.1 M | Hepes | 7.5
Bm-N43 | Ammonium Sulfate | 1.5 M | Glycerol | 12% | Tris | 8.5
Bm-N44 | Na Formate | 3.5 M | | | Na Acetate | 4.6
Bm-N45 | Na Formate | 3.5 M | | | Bis-Tris Propane | 7
Bm-N46 | Na Formate | 3.5 M | | | Tris | 8.5
Bm-N47 | Na/K Phosphate (0.5 M K3PO4, 0.5 M Na3PO4) | 1 M | | | |
Bm-N48 | PEG 3350 | 20% | di-Na Tartrate | 0.2 M | |
Bm-N49 | PEG 3350 | 20% | K Acetate | 0.2 M | |
Bm-N50 | PEG 3350 | 20% | K Chloride | 0.2 M | |
Bm-N51 | PEG 3350 | 20% | K dihydrogen Phosphate | 0.2 M | | 6.9
Bm-N52 | PEG 3350 | 20% | K Fluoride | 0.2 M | |
Bm-N53 | PEG 3350 | 20% | K/Na Tartrate | 0.2 M | |
Bm-N54 | PEG 3350 | 25% w/v | Mg Chloride | 0.2 M | HEPES | 7.5
Bm-N55 | PEG 3350 | 25% w/v | Mg Chloride | 0.2 M | Tris | 8.5
Bm-N56 | PEG 3350 | 20% | Mg Formate | 0.2 M | |
Bm-N57 | PEG 5K MME | 25% | Ca Chloride | 0.2 M | Bis-Tris | 6
Bm-N58 | PEG 3350 | 20% | Na Acetate | 0.2 M | |
Bm-N59 | PEG 3350 | 20% | tri-lithium Citrate | 0.2 M | |
Bm-N60 | PEG 3350 | 20% | Na Iodide | 0.2 M | |
Bm-N61 | tri-sodium Citrate dihydrate | 0.7 M | | | Tris | 8.5
Bm-N62 | PEG 2K MME | 30% w/v | K Thiocyanate | 0.1 M | |
Bm-N63 | PEG 2K MME | 20% | Trimethylamine N-oxide | 0.2 M | Tris | 8.5
Bm-N64 | PEG 4K | 30% | Ammonium Acetate | 0.2 M | Na Citrate | 5.6
Bm-N65 | PEG 4K | 15% | Ammonium Acetate | 0.2 M | Na Citrate | 5.6
Bm-N66 | PEG 4K | 25% | Ammonium Sulfate | 0.2 M | Na Acetate | 4.6
Bm-N67 | PEG 4K | 30% | Lithium Sulfate | 0.2 M | Tris HCl | 8.5
Bm-N68 | PEG 4K | 30% | Magnesium Chloride | 0.2 M | Tris HCl | 8.5
Bm-N69 | PEG 8K | 30% | Ammonium Sulfate | 0.2 M | Na Cacodylate | 6.5
Bm-N70 | PEG 8K | 18% | Calcium Acetate | 0.2 M | Na Cacodylate | 6.5
Bm-N71 | PEG 8K | 20% | Mg Acetate | 0.2 M | Na Cacodylate | 6.5
Bm-N72 | PEG 3350 | 15% w/v | Succinic Acid pH 7.0 | 0.1 M | |
Bm-N73 | PEG 4K | 30% | Na Acetate | 0.2 M | Tris HCl | 8.5
Bm-N74 | PEG 3350 | 25% w/v | Na Chloride | 0.2 M | Na Citrate | 5.6
Bm-N75 | PEG 8K | 30% | Na Acetate | 0.2 M | Bis-Tris | 6.5
Bm-N76 | PEG 3350 | 25% w/v | Na Chloride | 0.2 M | Bis-Tris | 6.5
Bm-N77 | PEG 3350 | 25% w/v | Na Chloride | 0.2 M | HEPES | 7.5
Bm-N78 | PEG 3350 | 25% w/v | | | Citric Acid | 3- Jan
Bm-N79 | PEG 3350 | 25% w/v | | | Na Acetate | 4.5
Bm-N80 | PEG 2K MME | 28% w/v | | | Bis-Tris | 6.5
Bm-N81 | PEG 8K | 8% | | | Tris HCl | 8.5
Bm-N82 | PEG 10000 | 20% | | | Hepes | 7.5
Bm-N83 | PEG 3350 | 25% w/v | Mg Chloride | 0.2 M | Bis-Tris | 5.5
Bm-N84 | PEG 3350 | 25% w/v | Mg Chloride | 0.2 M | Bis-Tris | 6.5
Bm-N85 | PEG 5K MME | 25% | Ammonium Sulphate | 0.2 M | HEPES | 7.5
Bm-N86 | PEG 5K MME | 30% | Ammonium Sulphate | 0.2 M | MES | 6.5
Bm-N87 | PEG 5K MME | 20% w/v | | | Bis-Tris | 6.5
Bm-N88 | Pentaerythritol Ethoxylate (15/4 EO/OH) | 30% v/v | Ammonium Sulphate | 0.05 M | Bis-Tris, 0.05 M | 6.5
Bm-N89 | Jeffamine ED-2001 pH 7.0 | 30% v/v | | | HEPES | 7
Bm-N90 | PEG 20K | 10% | | | Bis-Tris | 5.5
Bm-N91 | PEG 10K | 20% | | | MES | 6- Jan
Bm-N92 | PEG 400 | 28% | Calcium Chloride | 0.2 M | Na Hepes | 7.5
Bm-N93 | Iso-Propanol | 5% | Ammonium Sulphate | 2.0 M | |
Bm-N94 | PEG 6000 | 10% | MPD | 5% | Hepes | 7.5
Bm-N95 | iso-propanol | 10% | PEG 4K | 20% | Na Hepes | 7.5
Bm-N96 | iso-propanol | 20% | PEG 4K | 20% | Na Citrate | 5.6

Supplementary Table 2. Mass spectrometry characterization of protein crystals that were used for structure determination.

Protein | Dmin (Å) | Total residues including affinity tag | Protease | Mass of crystal (Da) | ΔDa (%) | Fragment crystallized
SCO6256 | 2.5 | 1-266 | Chymotrypsin | 16594.81 | 0.0034 | 115-266
SCO4942 | 2.8 | 1-248 | Chymotrypsin | 22003.7 | 0.0016 | 44-244
Atu0434 | 2.3 | 1-391 | Trypsin | 39427.53 | 0.0514 | 25-383
NE2398 | 1.9 | 1-170 | Chymotrypsin | 15147.88 | -0.002 | 19-153
Atu0899 | 1.8 | 1-332 | Chymotrypsin | N/A | N/A | N/A
Atu2452 | 2.56 | 1-269 | Chymotrypsin | 22767 | 0.0033 | 61-269
Atu0870 | 1.95 | 1-278 | Chymotrypsin | 28330.19 | -0.0005 | 26-278
HP0029 | 1.47 | 1-242 | Chymotrypsin | 25208.53 | -0.0006 | 19-242
Atu0299 | 1.8 | 1-219 | Chymotrypsin | 22212.93 | 0.002 | 19-219
Hs GART, AIRS domain | 2.1 | 1-1025 | Chymotrypsin | 34968 | 0.017 | 489-816
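The residue ranges in Supplementary Table 2 can be turned into counts of residues removed from each terminus, which makes it easier to compare the table with the numbers quoted in the Supplementary Results. The following is a minimal sketch: the ranges are copied from the table, while the helper names are illustrative and not part of the original work.

```python
# Rows copied from Supplementary Table 2:
# (protein, tagged full-length range, fragment found in the crystal).
# Atu0899 is left out because no fragment could be assigned to it.
TABLE_2 = [
    ("SCO6256", "1-266", "115-266"),
    ("SCO4942", "1-248", "44-244"),
    ("Atu0434", "1-391", "25-383"),
    ("NE2398", "1-170", "19-153"),
    ("Atu2452", "1-269", "61-269"),
    ("Atu0870", "1-278", "26-278"),
    ("HP0029", "1-242", "19-242"),
    ("Atu0299", "1-219", "19-219"),
    ("Hs GART, AIRS domain", "1-1025", "489-816"),
]


def parse_range(text):
    """Convert 'start-end' into a pair of integers."""
    start, end = text.split("-")
    return int(start), int(end)


def residues_removed(full, fragment):
    """Return (N-terminal, C-terminal) residue counts lost to proteolysis."""
    full_start, full_end = parse_range(full)
    frag_start, frag_end = parse_range(fragment)
    return frag_start - full_start, full_end - frag_end


trimmed = {name: residues_removed(full, frag) for name, full, frag in TABLE_2}

for name, (n_term, c_term) in trimmed.items():
    print(f"{name}: {n_term} N-terminal and {c_term} C-terminal residues removed")
```

The counts include the residues of the affinity tag, because the table numbers the proteins from the first residue of the tag.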

Supplementary Methods

Protein purification
Proteins were purified as described in Zhang et al. [1].

In situ proteolysis
α-Chymotrypsin (Sigma C3142), dissolved at 2 mg/ml in a solution containing 1 mM HCl and 2 mM CaCl2, was added to the purified protein on ice immediately prior to crystallization trials at a ratio of 1 µg chymotrypsin per 100 µg of histidine-tagged protein. The protein was dissolved at 10-20 mg/ml in 10 mM Hepes, pH 7.5 and 500 mM NaCl. Crystallization was performed in sitting drops at room temperature, adding 0.5 µl of the protease/protein mixture to 0.5 µl of the precipitant. Crystallization trials were set up immediately, without assessing the efficacy of the proteolysis, without stopping the proteolysis reaction, and without purification of any proteolyzed fragments.

Trypsin (Sigma T8003), dissolved at 1.5 mg/ml in 1 mM HCl and 2 mM CaCl2, was added to the purified protein on ice immediately prior to crystallization trials at a ratio of 1 µg trypsin per 100-1000 µg of histidine-tagged protein. The protein was dissolved at 10-20 mg/ml in 10 mM Hepes, pH 7.5 and 500 mM NaCl. Crystallization was performed in sitting drops at room temperature, adding 1 µl of the protease/protein mixture to 1 µl of the precipitant. Crystallization trials were set up immediately, without assessing the efficacy of the proteolysis, without stopping the proteolysis reaction, and without purification of any proteolyzed fragments.

Sample preparation and mass spectrometry
Crystals were harvested from the crystallization drop in a loop and washed twice in 5 µl drops of water or 3 M NaCl. The washed crystals were then transferred to a 5 µl drop of 2% formic acid, which was then transferred to 145 µl of 2% formic acid. The ESI MS experiments were carried out on an Agilent 1100 LC/MSD TOF equipped with an electrospray source operating in positive ionization mode. Proteins were resolved from salt and buffer components using reverse-phase HPLC on an Agilent Poroshell 300SB-C3 column (1.0 x 75 mm, 5 µm).

X-ray data collection and structure determination
Diffraction data were collected at 100 K at the 19ID and 23ID beamlines of the Structural Biology Center at the Advanced Photon Source, Argonne National Laboratory, and at CHESS, Cornell University. Single-wavelength SAD data sets were collected from Se-Met-labeled protein crystals at the Se absorption peak wavelength. Most diffraction data were integrated and scaled with HKL2000; Atu0870 and Atu0899 were processed with HKL3000 [2]. Se atom sites were found with the SOLVE program [3]. Density modification, NCS averaging and initial model building were carried out with RESOLVE [4]. Model building continued using the ARP/wARP program [5], and the final models were built manually using the program COOT [6]. All models were refined with the REFMAC5 program [7] of the CCP4 program suite [8], and the final models were analyzed and validated with PROCHECK [9].
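The weight ratios given for in situ proteolysis translate directly into a volume of protease stock to add to a protein sample. The sketch below shows the arithmetic. The stock concentrations (2 mg/ml chymotrypsin, 1.5 mg/ml trypsin) and the ratios come from the text above; the 100 µl sample volume, the choice of 10 mg/ml from the stated 10-20 mg/ml range, and the function name are assumptions made only for illustration.

```python
def protease_stock_volume_ul(protein_volume_ul, protein_conc_mg_per_ml,
                             protein_per_protease_w_w, stock_conc_mg_per_ml):
    """Volume of protease stock (µl) needed for a given protein:protease ratio.

    protein_per_protease_w_w is the mass of protein per unit mass of protease,
    e.g. 100 for 1 µg protease per 100 µg protein.
    """
    # 1 mg/ml is the same as 1 µg/µl, so volume (µl) x conc (mg/ml) gives µg.
    protein_mass_ug = protein_volume_ul * protein_conc_mg_per_ml
    protease_mass_ug = protein_mass_ug / protein_per_protease_w_w
    return protease_mass_ug / stock_conc_mg_per_ml


# Hypothetical sample: 100 µl of protein at 10 mg/ml.
chymotrypsin_ul = protease_stock_volume_ul(100.0, 10.0, 100.0, 2.0)
trypsin_ul = protease_stock_volume_ul(100.0, 10.0, 1000.0, 1.5)

print(f"Chymotrypsin stock to add: {chymotrypsin_ul:.2f} µl")
print(f"Trypsin stock to add (1:1000 w/w): {trypsin_ul:.2f} µl")
```

Because the stock volumes are small relative to the sample, the protein concentration is only slightly diluted by the addition.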

Supplementary Results

All proteins were appended with a hexahistidine tag and a recognition site for the TEV protease (MGSSHHHHHHSSGRENLYFQG [1] or MGSSHHHHHHSSGRENLYFQGH). Serendipitously, the tag also contains chymotrypsin cleavage sites (Y and F). The lengths of proteins referred to in the text include the 21- or 22-residue histidine tag; the normal N-termini of the recombinant proteins therefore begin at residue 22 or 23.

In situ proteolysis of proteins that had never before been crystallized

SCO6256, a putative transcriptional regulator (gi 21224577) from Streptomyces coelicolor, could be purified readily but failed to crystallize. When histidine-tagged, Se-Met-labelled SCO6256 was screened against a 96-condition screen (Supplementary Table 1) in the presence of chymotrypsin, crystals formed in two conditions (0.1 M Na Cacodylate pH 6.5, 0.2 M Mg Acetate and 20% PEG 8K; and 0.1 M Tris HCl pH 8.5, 0.2 M Na Acetate and 30% PEG 4K) with different crystal forms. The crystals that appeared in the second condition were optimized for data collection, data were collected at APS beamline 19ID, and the structure was determined using the anomalous diffraction from Se at its absorption peak. The structure revealed that the protein crystallized as a dimer; each monomer comprised a 6-stranded anti-parallel beta sheet connected by 4 small helices, with a long C-terminal loop (Supplementary Figure 1a). No electron density could be observed for the 123 N-terminal residues, but the C-terminus appeared intact. The N-terminus was directed toward the dimer interface. The mass of the fragment that crystallized was determined to be 16594.81 Da using mass spectrometry; this corresponded to a region comprising residues 115-266 (Supplementary Table 2).

SCO4942, annotated as a putative regulatory protein (gi 21223315), could not be crystallized by conventional screening, but was also crystallized by in situ proteolysis. The full-length protein and the histidine tag comprised 248 residues. Crystals of the Se-Met-labeled protein were refined and the structure was solved using the anomalous signal from Se (Supplementary Figure 1b). The N-terminal 51 and C-terminal 10 residues could not be traced in the all-helical structure, and mass spectrometry of the crystals revealed that the crystallized fragment comprised residues 44-244 (Supplementary Table 2).

In situ proteolysis to improve crystal quality

NE2398, a CBS domain (gi 30250323), had formed crystals in 0.1 M Bis-Tris pH 5.5, 0.2 M MgCl2 and 25% PEG 3350, and in 0.1 M MES pH 6.5, 20% PEG 10K. The crystals, although of a reasonable size, did not diffract despite extensive screening and optimization. The histidine-tagged version of NE2398 (168 residues) was re-screened in the presence of chymotrypsin (1:100 w/w). Within 5-6 days, needle-like crystals (200 x 50 x 30 µm) grew in 25% PEG 3350, 0.1 M Na Acetate pH 4.5. The crystals diffracted to about 1.8 Å resolution. The Se-Met-labeled protein did not crystallize, and S-based phasing strategies were attempted without success. The structure was eventually solved from the anomalous signal from Br, after soaking the crystal in ~2.5 M NaBr for 10 minutes and cryo-protecting it with Paratone-N oil. The structure revealed that NE2398 has two domains, each with two anti-parallel beta strands and helices (Supplementary Figure 1c). NAD, which presumably co-purified with the protein, was found bound between the two domains. Electron density could not be observed for the 22 N-terminal residues and the 21 C-terminal residues. The mass of the fragment that crystallized was 15147.88 Da, which corresponded to a region comprising residues 19-153. This suggests that chymotrypsin digested after the tyrosine in the N-terminal tag and deleted 17 residues from the C-terminus, cutting after a phenylalanine (Supplementary Table 2). Although it was difficult to ascribe a mechanism by which proteolysis facilitated crystallization in most instances, in the case of NE2398, examination of the crystal packing shows that this crystal form allows no room for an N-terminal extension.
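The statement that the affinity tag contains chymotrypsin cleavage sites can be checked by locating the Y and F residues in the tag sequence quoted above. This is a minimal sketch: the tag sequences are taken from the text, the function and variable names are illustrative, and the rule "cleave after Y or F" follows the text's description rather than a full specificity model (chymotrypsin can also cut after W and, less efficiently, L).

```python
TAG_21 = "MGSSHHHHHHSSGRENLYFQG"  # 21-residue tag quoted in the text
TAG_22 = TAG_21 + "H"             # 22-residue variant quoted in the text

# Residues after which cleavage is assumed to occur, as named in the text.
CHYMOTRYPSIN_P1 = {"Y", "F"}


def cleavage_sites(sequence, p1_residues=CHYMOTRYPSIN_P1):
    """Return 1-based positions of residues after which cleavage may occur."""
    return [pos for pos, aa in enumerate(sequence, start=1) if aa in p1_residues]


sites = cleavage_sites(TAG_21)
# Cutting after position p leaves a fragment that starts at residue p + 1.
new_n_termini = [pos + 1 for pos in sites]

# First residue of the native protein in tagged numbering.
native_start_21 = len(TAG_21) + 1
native_start_22 = len(TAG_22) + 1

print("Cleavage after tag residues:", sites)
print("Possible fragment start residues:", new_n_termini)
print("Native N-terminus at residue:", native_start_21, "or", native_start_22)
```

A cut after the tyrosine of the tag gives a fragment starting at residue 19, which matches the fragment start reported in Supplementary Table 2 for NE2398, HP0029 and Atu0299.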
Atu0899 (gi 17934807), dihydrodipicolinate synthase, crystallized in many conditions in its histidine-tagged form, but always formed small clusters of needles that did not diffract. Without the histidine tag, the crystals diffracted to 5 Å. After in situ proteolysis of the histidine-tagged form (332 residues) with chymotrypsin, crystals that diffracted strongly grew in 0.1 M Tris pH 8.5, 0.2 M Na acetate, 30% PEG 4K. A 1.8 Å Se-Met SAD data set was collected and the structure determined. The Atu0899 structure has one compact domain comprising an 8-stranded parallel beta-barrel surrounded by 11 helices (Supplementary Figure 1d). No electron density could be observed for the 29 N-terminal residues, which include the affinity tag and 8 residues of the Atu0899 protein. Because it proved difficult to reproduce the well-diffracting crystals, we could not be certain that the crystals undergoing mass spectrometry corresponded to those that diffracted well. We were therefore unable to confidently determine the mass of the crystallized fragment.

The histidine-tagged version of Atu2452 (269 residues), annotated as an uncharacterized protein (gi 17936334), formed small clusters of needle-like crystals in 0.2 M Mg Acetate, 20% PEG 3350; the crystals were too small to test for their diffraction properties. The chymotrypsin-treated Atu2452 formed diamond-shaped crystals in 0.1 M Na citrate pH 5.6, 0.2 M K/Na Tartrate and 2 M ammonium sulphate. The crystals diffracted to ~2.5 Å at Beamline 19ID (Advanced Photon Source); SAD data were collected at the Se absorption peak and the structure was solved. Electron density was not seen for the 63 N-terminal residues, and mass spectroscopic analysis of the crystals confirmed that 60 amino acids, which include the affinity tag, had been removed (Supplementary Table 2). The Atu2452 structure adopts a typical Rossmann fold (Supplementary Figure 1e).

Se-Met-labeled Atu0870, trans-aconitate 2-methyltransferase, was screened for crystallization in the presence and absence of the hexahistidine tag, and also in the presence and absence of its ligand (S-(5′-adenosyl)-L-methionine chloride). Crystals of the hexahistidine-tagged protein were not single crystals and were not tested for their diffraction properties. Crystals of the protein with the hexahistidine tag removed, co-crystallized with 5 mM ligand, formed in 0.1 M Na Acetate pH 4.5 and 25% PEG 3350, and diffracted to 2.5 Å at the 19ID beamline at the Advanced Photon Source. Diffraction data were collected at the Se absorption peak, but the data quality was too poor to solve the structure. The histidine-tagged version of Se-Met-labeled Atu0870 (278 residues) was re-screened in the presence of chymotrypsin and ligand, and crystals formed in the same condition (0.1 M Na Acetate pH 4.5 and 25% PEG 3350). SAD data at the Se peak were collected to 1.95 Å at Beamline 19ID (APS) and the structure was solved. Mass spectroscopic analysis of the crystals revealed that a fragment comprising residues 26-278 crystallized (Supplementary Table 2). This translates to the removal of four more N-terminal residues than had been removed by treatment with TEV. The Atu0870 structure is composed of a seven-stranded β sheet sandwiched by 5 helices, three on one side and two on the other, together with a smaller domain composed of a four-helix bundle (Supplementary Figure 1f). Interestingly, the N-terminal residues of the fragment that crystallized were involved in dimerization, extending into a groove in another monomer. The co-factor SAH was found in the structure.

HP0029 (gi 15644662) formed thin needles in several conditions as a histidine-tagged protein. Attempts to remove the histidine tag using TEV failed. Despite extensive refinement of the crystals of the histidine-tagged protein, the diffraction quality remained poor and the resolution never extended beyond 3.2 Å. A Se-Met-labelled version of HP0029 was mixed with chymotrypsin and crystallized in 0.1 M Bis-Tris pH 5.5, 0.1 M ammonium sulphate, 25% PEG 3350. Only one crystal diffracted, but the structure was solved from this crystal and refined to 1.47 Å (Supplementary Figure 1g). The structure revealed that HP0029 contains a 7-stranded parallel beta sheet flanked by alpha helices, 4 on one side and 5 on the other. The N-terminus had been deleted. There was clear electron density from residue 21 to the C-terminus. Interestingly, the N-terminal tag was removed by chymotrypsin digestion, but could not be removed by TEV treatment, despite extensive digestion.
Examination of the crystal packing revealed that the N-terminus points into a region of the crystal that would not be able to accommodate any extra residues; HP0029 may represent an example in which the protease digestion allowed packing of this crystal form.

Atu0299 (gi 17934215) also crystallized as needles that could not be optimized. The Se-Met version of the 219-residue histidine-tagged protein was mixed with chymotrypsin, and crystals formed in 0.1 M sodium acetate pH 4.6, 3.5 M sodium formate. The structure was solved using SAD and refined to 1.8 Å, and revealed a beta barrel surrounded by 13 helices (Supplementary Figure 1h). Mass spectrometry revealed that the crystals comprised a fragment containing residues 19-219 (Supplementary Table 2). Examination of the crystal packing showed that an extended N-terminus would have disrupted the packing of this crystal form.

In situ trypsinolysis of a protein that had formed poor crystals both in the presence and absence of chymotrypsin

Atu0434 crystallized in eight different conditions, but even after extensive optimization the diffraction resolution could not be improved beyond 3.5 Å. Crystals were also obtained in many conditions after chymotrypsinolysis, but again the crystals were too small to be useful for data collection. The Se-Met version of the 391-residue histidine-tagged protein (~10 mg/ml) was then treated with 1:600 v/v trypsin (Sigma T8003), which had been dissolved at 1.5 mg/ml in 1 mM HCl and 2 mM CaCl2, and crystals formed in many conditions. The crystals grown from 0.2 M CaCl2, 20% PEG 3350 diffracted to 2.3 Å; Se-Met SAD data were collected and the structure was solved. Mass spectrometry revealed that 24 and 8 residues were removed from the N- and C-termini, respectively (Supplementary Table 2). The structure (Supplementary Figure 1i) is almost entirely beta-sheet; each monomer has 22 beta strands and 4 small helices.

In situ proteolysis of a domain of a human protein that had never before been crystallized

The human glycinamide ribonucleotide transformylase (GART) comprises three domains. The structures of the N- and C-terminal domains had been determined previously, but the middle domain (aminoimidazole ribonucleotide synthetase; AIRS) was insoluble, despite the use of multiple expression constructs. The full-length protein (residues 1-1003) was crystallized in the presence of chymotrypsin, and crystals were obtained in 0.1 M Bis-Tris pH 5.2, 27% PEG 3350 and 0.2 M ammonium sulfate. The crystals were transferred to a cryo solution containing 0.1 M Bis-Tris pH 5.2, 27% PEG 3350, 0.2 M ammonium sulfate, 0.2 M NaCl and 20% glycerol, and flash frozen in liquid nitrogen. Remarkably, mass spectrometry of the crystals revealed that the N-terminal and C-terminal domains of GART were removed by chymotrypsin, leaving only the AIRS domain in the crystal. Data collection was carried out at BESSY BL14-1 and the data were processed with XDS in space group P2₁2₁2 (a = 80.67, b = 80.99, c = 98.33 Å). The structure was solved by molecular replacement with PHASER, using the E. coli aminoimidazole ribonucleotide synthetase (PDB code: 1CLI) as a search model. Model building and refinement were performed in COOT and REFMAC5. Data in the interval 15-2.1 Å resolution were used, and at the end of the refinement the values were R = 19.4% and Rfree = 24%, using one TLS group per molecule. The final model was validated using PROCHECK and MOLPROBITY, and residues 475-792 were traced in the electron density (Supplementary Figure 1j). The coordinates for the crystal structure were deposited in the Protein Data Bank, accession code 2V9Y.
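The unit cell reported for the GART AIRS domain crystal allows a quick packing sanity check. The sketch below computes the orthorhombic cell volume from the quoted cell edges and a Matthews coefficient (cell volume per dalton of protein) using the fragment mass from Supplementary Table 2. The four symmetry operators follow from space group P2₁2₁2; the number of molecules per asymmetric unit is not stated in the text, so the values tried here are assumptions, and the function names are illustrative.

```python
def orthorhombic_cell_volume(a, b, c):
    """Unit cell volume in cubic angstroms for a cell with all angles 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume, molecular_mass_da,
                         symmetry_operators, molecules_per_asu):
    """Cell volume per dalton of protein (cubic angstroms per Da)."""
    return cell_volume / (symmetry_operators * molecules_per_asu * molecular_mass_da)


# Cell edges quoted for the GART AIRS domain crystal.
volume = orthorhombic_cell_volume(80.67, 80.99, 98.33)

FRAGMENT_MASS_DA = 34968   # mass of the crystallized fragment, Supplementary Table 2
SYMMETRY_OPERATORS = 4     # general positions in space group P2(1)2(1)2

# The copy number per asymmetric unit is an assumption; try one and two.
vm_by_copies = {
    n: matthews_coefficient(volume, FRAGMENT_MASS_DA, SYMMETRY_OPERATORS, n)
    for n in (1, 2)
}

print(f"Cell volume: {volume:.0f} cubic angstroms")
for n, vm in vm_by_copies.items():
    print(f"{n} molecule(s) per asymmetric unit: Vm = {vm:.2f}")
```

Comparing the resulting values with the range usually seen for protein crystals indicates which copy number is plausible; the deposited entry 2V9Y remains the authoritative source for the actual content of the asymmetric unit.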

References
1. Zhang, R.G. et al. Structure of Thermotoga maritima stationary phase survival protein SurE: a novel acid phosphatase. Structure 9, 1095-1106 (2001).
2. Minor, W., Cymborowski, M., Otwinowski, Z. & Chruszcz, M. HKL-3000: the integration of data reduction and structure solution, from diffraction images to an initial model in minutes. Acta crystallographica 62, 859-866 (2006).
3. Terwilliger, T.C. & Berendzen, J. Automated MAD and MIR structure solution. Acta crystallographica 55, 849-861 (1999).
4. Terwilliger, T.C. Automated main-chain model building by template matching and iterative fragment extension. Acta crystallographica 59, 38-44 (2003).
5. Perrakis, A., Harkiolaki, M., Wilson, K.S. & Lamzin, V.S. ARP/wARP and molecular replacement. Acta crystallographica 57, 1445-1450 (2001).
6. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta crystallographica 60, 2126-2132 (2004).
7. Pannu, N.S., Murshudov, G.N., Dodson, E.J. & Read, R.J. Incorporation of prior phase information strengthens maximum-likelihood structure refinement. Acta crystallographica 54, 1285-1294 (1998).
8. The CCP4 suite: programs for protein crystallography. Acta crystallographica 50, 760-763 (1994).
9. Laskowski, R.A., Rullmann, J.A., MacArthur, M.W., Kaptein, R. & Thornton, J.M. AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. Journal of biomolecular NMR 8, 477-486 (1996).