New York Structural Genomics Consortium




















We next compared our list of selected targets to TCDB proteins. This was done for the first three levels of the TCDB classification. When considering these numbers, one has to take into account that TCDB also contains beta-barrel integral membrane proteins, which we currently do not consider as targets for NYCOMPS, and that not all membrane proteins are transporters.

On the other hand, we also see that achieving comprehensive structural coverage of IMP databases such as TCDB will require considerably scaling up the number of selected seeds and targets.

One important aspect of any SG effort is novel leverage. PSI has overall performed extremely well by this criterion [29]. In the context of NYCOMPS target selection, we estimated the novel leverage that would result if we obtained structures for all, or a fraction of, our families.

Note that here we apply a slightly more restrictive criterion for novel leverage than its original definition [50]: we consider not novel leverage with respect to the time targets were selected, but novel leverage as of February.

UniProtKB is an ever-growing database of annotated protein sequences (mostly open reading frames).

This means that the raw numbers we just reported may not be very meaningful. In other words, while it is true that we can model a large number of proteins if we solve the structure of at least some of the NYCOMPS targets, it is also true that this may simply reflect the sheer size of UniProtKB and the fact that, as more data become available, proteins cluster into increasingly large homologous families.

If we do this (Fig.).

Figure caption (fragment): On the y-axis we report the ratio between the respective leverage values. Notations are as in panel a.

Finally, if we consider only human sequences within UniProtKB-TMH, we find that novel structural information could be obtained for 10– proteins (Fig.). This time, the ratio to current leverage is markedly smaller (Fig.). This is not very surprising given that all of our targets come from prokaryotic organisms.

The New York Consortium on Membrane Protein Structure (NYCOMPS) targets alpha-helical bundle integral membrane proteins, adopting a strategy that seeks to optimize success while maintaining the commitment to novelty, target relevance and leverage.

In this paper, we have shown that the selected targets cover a wide range of protein lengths, TM topologies and functions. We have also demonstrated that the experimental determination of representative structures for these targets would allow transfer of structural information to a large number of known, but structurally uncharacterized, proteins.

In the near future, we plan to expand the list of valid targets by introducing new genomes and by targeting eukaryotic proteins, non-co-cistronic complexes and beta-barrel integral membrane proteins.

Last but not least, thanks to all those who deposit their experimental data in public databases, and to those who maintain these databases.

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License, which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Journal of Structural and Functional Genomics (J Struct Funct Genomics). Published online Oct.

… John F. Hunt, Lawrence Shapiro, Wayne A. Hendrickson, and Burkhard Rost

Received Apr 22; Accepted Sep.

Keywords: Membrane proteins, Target selection, Structural genomics, Structure determination.

SG adds large sampling of diversity as a new dimension

Structural genomics (SG) tries to increase the odds of experimentally obtaining high-resolution structures by using a pan-genomic approach that adds homology as a new dimension to the structure determination problem.

Signal peptide predictions

We ran SignalP [35] on all bacterial sequences left in our list (note: SignalP is not available for archaeal sequences [35]) and excluded all sequences predicted to have two TMHs for which the first predicted TMH started before a predicted cleavage site.

Exclusion of individual subunits of hetero-oligomeric complexes

EcoCyc [43] is an annotated database for E. coli.
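The signal-peptide exclusion rule described above (two predicted TMHs, with the first one starting before a predicted cleavage site) can be illustrated with a minimal sketch. The `Prediction` container and its field names are hypothetical and do not correspond to the SignalP output format or to the authors' actual scripts; they only make the rule explicit.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Prediction:
    """Hypothetical per-sequence record (not a real SignalP or TMH-predictor format)."""
    seq_id: str
    tmh_segments: List[Tuple[int, int]]   # predicted TMHs as (start, end) positions
    cleavage_site: Optional[int] = None   # predicted cleavage position, None if absent


def is_excluded(p: Prediction) -> bool:
    """True if the sequence has exactly two predicted TMHs and the first
    predicted TMH starts before a predicted signal-peptide cleavage site."""
    if p.cleavage_site is None or len(p.tmh_segments) != 2:
        return False
    first_start = min(start for start, _ in p.tmh_segments)
    return first_start < p.cleavage_site


def filter_targets(preds: List[Prediction]) -> List[Prediction]:
    """Keep only the sequences that are not excluded by the rule above."""
    return [p for p in preds if not is_excluded(p)]


examples = [
    Prediction("a", [(5, 25), (60, 80)], cleavage_site=28),               # excluded
    Prediction("b", [(40, 60), (90, 110)], cleavage_site=28),             # kept: TMH after site
    Prediction("c", [(5, 25), (60, 80), (100, 120)], cleavage_site=28),   # kept: three TMHs
    Prediction("d", [(5, 25), (60, 80)]),                                 # kept: no cleavage site
]
kept_ids = [p.seq_id for p in filter_targets(examples)]
```

The intuition behind the rule is that a "first TMH" lying upstream of a cleavage site is likely the signal peptide itself, which would leave only one genuine TMH in the mature protein.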

Cloning families (Fig.)

Central seed selection

In central seed selection, our main concern was to pick membrane proteins that were more likely to readily provide high-resolution structures and that differed substantially in sequence within the TM region from proteins for which structures had already been determined experimentally.

Nominated seed selection

Nominated seeds (handpicked by participating groups) are special in many respects.

Seed expansion

The seed expansion procedure is the same for centrally selected and nominated targets. It is based on reciprocal sequence similarity in the predicted TM region between seed and NYCOMPS98 proteins (Fig.).
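As a rough illustration of what "reciprocal" similarity means in seed expansion, the sketch below keeps a candidate only if the similarity criterion holds in both directions (seed to candidate and candidate to seed). The `similarity` callable, the score scale and the `threshold` value are placeholders introduced here for illustration; the text above does not specify the actual measure or cutoff.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def expand_seed(
    seed: str,
    candidates: Iterable[str],
    similarity: Callable[[str, str], float],  # hypothetical directional TM-region score
    threshold: float,                         # hypothetical cutoff, not from the paper
) -> List[str]:
    """Return candidates whose similarity to the seed passes the threshold
    in both directions (query=seed and query=candidate)."""
    family = []
    for cand in candidates:
        forward = similarity(seed, cand)    # seed used as query
        backward = similarity(cand, seed)   # candidate used as query
        if forward >= threshold and backward >= threshold:
            family.append(cand)
    return family


# Toy directional scores standing in for a real alignment-based measure.
toy_scores: Dict[Tuple[str, str], float] = {
    ("seed", "p1"): 0.8, ("p1", "seed"): 0.7,   # reciprocal: kept
    ("seed", "p2"): 0.9, ("p2", "seed"): 0.2,   # one-directional only: dropped
}


def toy_similarity(query: str, target: str) -> float:
    return toy_scores.get((query, target), 0.0)


family = expand_seed("seed", ["p1", "p2", "p3"], toy_similarity, threshold=0.5)
```

Requiring both directions guards against cases where a short seed matches a fragment of a much longer protein whose membrane core is otherwise different.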

Filtering of targets

After a seed is expanded into a family of proteins predicted to have similar membrane cores, all family members are subjected to additional filters.

Novelty: exclusion of PDB homologs

The first filtering step is meant to ensure that target proteins provide novel coverage of the protein universe [50].

Exclusion of isolated subunits of hetero-oligomeric complexes

Occasionally, protein subunits that natively are parts of larger hetero-oligomeric complexes are structurally stable even when expressed in isolation.

Notations used

Reagent genomes: list of entirely sequenced organisms from which PSI clones its targets.

References

1. Update on the protein structure initiative.

Structural genomics programs at the US national institute of general medical sciences. Nat Struct Biol. Membrane protein prediction methods. J Intern Med. Global analysis of predicted proteomes: functional adaptation of physical properties. Liu J, Rost B. Comparing function and structure between entire proteomes. Protein Sci. Predicting transmembrane beta-barrels in proteomes. Nucleic Acids Res. Bigelow H, Rost B. PROFtmb: a web server for predicting bacterial transmembrane beta barrel proteins.

Wimley WC. The versatile beta-barrel membrane protein. Curr Opin Struct Biol. Online tools for predicting integral membrane proteins. Methods Mol Biol. Wza the translocon for E. How many drug targets are there? Nat Rev Drug Discov. Overcoming the challenges of membrane protein crystallography. Wang G. NMR of membrane-associated peptides and proteins. Curr Protein Pept Sci.

Wiener MC. A pedestrian guide to membrane protein crystallization. The protein data bank. Acta Crystallogr D Biol Crystallogr. White SH. The progress of membrane protein structure determination.

Consequences of membrane protein overexpression in Escherichia coli. Mol Cell Proteomics. Membrane proteins: from sequence to structure. Annu Rev Biophys Biomol Struct. Eshaghi S. High-throughput expression and detergent screening of integral membrane proteins. A high-throughput method for membrane protein solubility screening: the ultracentrifugation dispersity sedimentation assay.

Automatic target selection for structural genomics on eukaryotes. Montelione GT, Anderson S. Structural genomics: keystone for a human proteome project. Rost B.

We discuss how our target list may impact structural coverage of the membrane protein space.

Filter 2 criteria are not enforced on nominated targets, leaving the decision of whether or not to pursue a given family of proteins to the nominating group.

Our experimental pipeline produces large amounts of informative data at different stages, including results from the cloning, expression, crystallization, and structure determination stages.

We continue to refine our analysis and plan to use cloning and expression data to develop methods that will improve the success rate at every stage of the pipeline.

The vectors are available from the PSI materials repository. These vectors have been modified to introduce sites for ligation-independent cloning [2] into both C- and N-terminal tag-encoding versions. Accessibility of the tag and expression levels of the fusion protein can vary with the location of the tag in an often unpredictable way, which warrants testing both orientations.

In both versions of this expression vector, proteins are produced with a FLAG tag for immuno-detection or purification, a deca-histidine tag for metal affinity chromatography, and a tobacco etch virus (TEV) protease [33] recognition site for tag cleavage.

The deca-histidine tag enables tight binding to metal-containing resins, allowing the use of high imidazole levels in wash buffers and resulting in higher purity than with the more commonly used hexa-histidine version. We chose the TEV protease because its activity has proven to be relatively insensitive to detergent [30].

PCR amplification primers are designed using an automated procedure to amplify the full-length coding sequence. Vectors are transformed into a DH10B phage-resistant strain of E. coli. After overnight growth, single colonies are manually picked and grown for automated plasmid purification. These plasmids are sequenced to confirm the integrity and identity of each insert.
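The automated primer design procedure is not described in detail here, so the following is only a simplified sketch of the general idea of amplifying a full-length coding sequence: the forward primer copies the start of the sequence and the reverse primer is the reverse complement of its end. The fixed annealing length and the optional `fwd_tail` and `rev_tail` arguments (where vector-specific ligation-independent cloning overhangs would go) are assumptions for illustration; a real procedure would also consider melting temperature and secondary structure.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(cds: str, length: int = 20, fwd_tail: str = "", rev_tail: str = ""):
    """Return (forward, reverse) primers flanking the full-length coding sequence.

    `length` is a fixed annealing length (an illustrative simplification);
    the tails are hypothetical placeholders for vector-specific overhangs.
    """
    if len(cds) < length:
        raise ValueError("coding sequence is shorter than the primer length")
    forward = fwd_tail + cds[:length]
    reverse = rev_tail + reverse_complement(cds[-length:])
    return forward, reverse


# Made-up 36-nt coding sequence used only to exercise the function.
cds = "ATGGCTAGCAAAGGTGAAGAACTGTTTACCGGTTAA"
fwd, rev = design_primers(cds, length=12)
```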

The expression constructs are transformed into BL21(DE3) pLysS phage-resistant cells and grown in 96-well deep-well blocks overnight. Overnight growths are diluted fold into fresh media and grown in shakers at rpm with a 2 mm orbit.

This enables fermentation-like growth conditions to be achieved in deep well block format. Cultures are induced with 0. Cell densities at harvest are typically above 10 OD units at nm OD To test for expression and purification, induced cells are lysed by sonication in deep well blocks, using a robotic system.

This custom apparatus is designed to ensure complete cell lysis without excessive sample heating. Membrane fractions may optionally be isolated by robotically transferring the lysate to 96 ultracentrifuge tubes and pelleting with a high-speed spin. Alternatively, and less frequently, purification results can be evaluated by western blot using anti-FLAG antibodies. Cells were grown in 0. Well-expressed proteins can clearly be identified without western blot or GFP labeling methods.

Clones producing membrane proteins of approximately the correct molecular weight are re-grown at a larger scale prior to detergent stability analysis. Also, SDS-resistant multimers are frequently observed, as is the case for samples in lanes 1, 2, 14, 15, 17 and Samples yielding a band of approximately correct molecular weight and minimal proteolytic breakdown are re-arrayed into new plates.

These targets are transferred to a mid-scale expression and purification platform to produce sufficient protein for detergent selection and stability analysis. Cell pellets are harvested and lysed robotically. The columns are washed and the proteins eluted as described above. Using this protocol, 96 proteins can be purified every 2 days by a single laboratory worker.

DDM is the only detergent employed for solubilization and purification. DDM is considered a relatively mild detergent; however, successful crystallization often requires the use of shorter-chain detergents. To assess the tolerance of target proteins to shorter-chain detergents, the purified proteins are subjected to an ad hoc stability assay. Samples are split into aliquots and incubated with a large excess of a second, short-chain detergent for 2 h at room temperature.

Subsequently, the samples are clarified by centrifugation and loaded on a size exclusion chromatography column equilibrated in DDM (Fig.). Proteins that show a single, symmetrical elution profile after treatment with one or more short-chain detergents are prioritized for scale-up and crystallization experiments.
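The "single, symmetrical elution profile" criterion is typically judged by eye, but it can be approximated numerically. The sketch below is one possible way to do so, not the authors' procedure: it counts local maxima above a relative height floor and compares the half-height widths on either side of the peak. The function names and both thresholds (`min_rel_height`, `max_asymmetry`) are arbitrary assumptions.

```python
from typing import List, Sequence, Tuple


def local_maxima(trace: Sequence[float], min_rel_height: float = 0.1) -> List[int]:
    """Indices of local maxima at least min_rel_height times the tallest point."""
    floor = min_rel_height * max(trace)
    return [
        i for i in range(1, len(trace) - 1)
        if trace[i] > trace[i - 1] and trace[i] >= trace[i + 1] and trace[i] >= floor
    ]


def half_height_widths(trace: Sequence[float], peak: int) -> Tuple[int, int]:
    """Number of points from the peak to where the signal first drops to
    half the peak height or below, on the left and on the right."""
    half = trace[peak] / 2.0
    left = peak
    while left > 0 and trace[left] > half:
        left -= 1
    right = peak
    while right < len(trace) - 1 and trace[right] > half:
        right += 1
    return peak - left, right - peak


def is_single_symmetric_peak(
    trace: Sequence[float], min_rel_height: float = 0.1, max_asymmetry: float = 1.5
) -> bool:
    """True if the trace has exactly one peak whose two half-widths differ
    by no more than a factor of max_asymmetry."""
    peaks = local_maxima(trace, min_rel_height)
    if len(peaks) != 1:
        return False
    left, right = half_height_widths(trace, peaks[0])
    if left == 0 or right == 0:
        return False
    return max(left, right) / min(left, right) <= max_asymmetry


# Made-up absorbance traces (arbitrary units, evenly spaced in elution volume).
clean = [0, 1, 4, 9, 4, 1, 0]                      # one symmetric peak
aggregated = [0, 5, 1, 0, 1, 4, 9, 4, 1, 0]        # extra early (void-like) peak
tailing = [0, 1, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]     # one peak with a long tail
results = [is_single_symmetric_peak(t) for t in (clean, aggregated, tailing)]
```

Real chromatograms are noisy, so in practice the trace would need smoothing and baseline correction before a test like this is meaningful.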

Targets suitable for NMR experiments are also screened by size exclusion chromatography, but using a panel of detergents tailored to this method.

UV absorbance monitored elution profiles from a size exclusion column for two membrane proteins post detergent stability testing. After a time period, the reactions are clarified to remove large aggregates and the samples are subjected to size exclusion chromatography in a mobile phase containing DDM.

For production of proteins at a scale suitable for structural studies, expression-verified, detergent-screened clones are distributed from the Center to the participating research groups. This arrangement recognizes that optimization of protein production, quality, and subsequent steps, including crystallization and NMR sample evaluation, requires an individualized, often time-consuming approach.

Expression and purification are based on a standard set of protocols, which can be modified as needed. Typically, proteins are expressed on a scale of more than 1 liter, depending on expression levels. Expression is then allowed to continue for 18 h. After solubilization, the solution is clarified by ultracentrifugation, and the fusion protein is purified by metal affinity chromatography, with a washing step with 50-60 mM imidazole, and elution of the fusion protein with mM imidazole.

The affinity tag can then optionally be proteolytically removed with TEV protease, in which case a second pass of the dialyzed sample over metal affinity resin removes the protease, uncleaved fusions, and most contaminants. The protein is concentrated using a Centricon with a YM membrane, and applied to a gel filtration column for further purification. The gel filtration step also serves the purpose of detergent exchange when desired.

Crystallization screening is carried out robotically in well plates. Robotic crystallization allows for rapid parallel screening of multiple parameters, including different substrates, additives, or detergents. NYCOMPS invested in a Mosquito crystallization robot, whose positive-displacement mode of action is well suited to working with detergent-solubilized membrane proteins. Once leads are discovered using commercial sparse-matrix screens, optimization of crystallization conditions is carried out with the vapor diffusion technique in well plates.

After the protein itself, the detergent is one of the most important parameters determining the success of crystallization. Therefore, we carry out crystallization experiments in as many of the short-chain detergents validated by the stability assay as possible. NYCOMPS has also designed and built an economical lipidic cubic phase dispensing robot that is available for setting up crystal trials in meso.

We have also been using the excellent crystallization service provided by the Center for High-Throughput Structural Biology (CHTSB), which conducts crystallization trials on a 1, experiment scale, under oil in batch mode [25].

Figure caption: X-ray diffraction pattern of a membrane protein crystal. The highest resolution spots are visible to 2. The resolution of the edge of the screen is indicated.

NMR is used on a specifically selected set of small (under 20 kDa) target proteins, as well as on slightly larger proteins that behave favorably through the stage of detergent screening but fail to crystallize.

Additional optimizations, if needed and warranted, include pH ranging from 5. The uniformly ²H, ¹³C, ¹⁵N-labeled samples are prepared by expression using a deuterated carbon source and D₂O, and backbone resonances are assigned using TROSY-based triple-resonance methods [38]. Long-range side-chain distance constraints are partially recovered by the reintroduction of protons in side-chain methyl and aromatic groups [23], which can be supplemented by other constraint types.

Signals from slowly exchanging amides can sometimes be recovered by an extended incubation in a harsher detergent during the first purification steps.

If complete back exchange is not possible due to protein instability or hyper-stability, separate samples are prepared to selectively examine the slowly exchanging and more rapidly exchanging regions of the protein [7].

Our major focus has been on using structures of membrane proteins to obtain mechanistic insights.

Since, to date, most solved membrane protein structures have been of bacterial proteins, it has been necessary to use homology modeling to obtain structures of, and infer function for, human proteins.

To determine the validity of modeling methods, we established a database, HOMEP, of homologous pairs of structures of integral membrane proteins [16].

HOMEP is a particularly useful resource for testing structure prediction methods for membrane proteins, since one member of a pair can be used as a template for predicting the structure of the other member, and vice versa. We used HOMEP to compare various sequence alignment approaches for membrane proteins and observed that high-level profile-based sequence alignment methods offer significant improvements over existing methods that have been applied to membrane proteins [16].
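The way a set of homologous structure pairs supports benchmarking in both directions can be sketched as follows. This is an illustrative outline only: the pair identifiers, the `build_and_score` callable and the toy scores are hypothetical stand-ins, and nothing here reflects the actual HOMEP data format or the evaluation measure used in [16].

```python
from typing import Callable, Dict, Iterable, List, Tuple


def template_target_tasks(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Turn each homologous pair (a, b) into two modeling tasks:
    a as template for b, and b as template for a."""
    tasks = []
    for a, b in pairs:
        tasks.append((a, b))   # (template, target)
        tasks.append((b, a))
    return tasks


def benchmark(
    pairs: Iterable[Tuple[str, str]],
    build_and_score: Callable[[str, str], float],  # hypothetical: model target from template, score vs. known structure
) -> float:
    """Average the score of a modeling method over both directions of every pair."""
    scores = [build_and_score(template, target)
              for template, target in template_target_tasks(pairs)]
    return sum(scores) / len(scores)


# Toy stand-in for a real modeling and evaluation step.
toy_scores: Dict[Tuple[str, str], float] = {("protA", "protB"): 0.5, ("protB", "protA"): 1.0}
mean_score = benchmark([("protA", "protB")], lambda t, g: toy_scores[(t, g)])
```

Because both members of each pair have known structures, every prediction can be checked against an experimental answer, which is what makes such a set usable for comparing alignment methods.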


