Comparative Protein Structure Modeling Using MODELLER

Source: Internet, 2013-12-31

Abstract

Functional characterization of a protein sequence is a common goal in biology, and is usually facilitated by having an accurate three‐dimensional (3‐D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3‐D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3‐D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target‐template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software are also described. Curr. Protoc. Protein Sci. 50:2.9.1‐2.9.31. © 2007 by John Wiley & Sons, Inc.

Keywords: Modeller; protein structure; comparative modeling; structure prediction; protein fold

GO TO THE FULL PROTOCOL:

PDF or HTML at Wiley Online Library

Basic Protocol 1: Modeling Lactate Dehydrogenase from Trichomonas vaginalis (TvLDH) Based on a Single Template Using Modeller
Support Protocol 1: Obtaining and Installing Modeller
Commentary
Literature Cited
Figures
Tables

Materials

Figures

Figure 2.9.1 Steps in comparative protein structure modeling. See text for details.

Figure 2.9.2 File TvLDH.ali. Sequence file in PIR format.
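For orientation, an entry in the PIR format read by MODELLER starts with a `>P1;code` header line, followed by a colon-separated description line and then the residues, terminated by `*`. The sketch below parses such an entry in plain Python. It assumes well-formed input; `parse_pir` is a hypothetical helper (not part of MODELLER), and the entry code `demo` and its residue string are made-up placeholders, not the TvLDH sequence.

```python
def parse_pir(text):
    """Return a list of {'code', 'kind', 'sequence'} dicts from PIR text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    entries = []
    i = 0
    while i < len(lines):
        if not lines[i].startswith(">P1;"):
            i += 1  # ignore comments or stray lines between entries
            continue
        if i + 1 >= len(lines):
            raise ValueError("entry header without a description line")
        code = lines[i][len(">P1;"):]
        # First field of the description line, e.g. 'sequence' or 'structureX'.
        kind = lines[i + 1].split(":")[0]
        i += 2
        residues = []
        while i < len(lines):
            chunk = lines[i]
            i += 1
            if chunk.endswith("*"):  # '*' terminates the sequence
                residues.append(chunk[:-1])
                break
            residues.append(chunk)
        entries.append({"code": code, "kind": kind, "sequence": "".join(residues)})
    return entries


# Placeholder entry (hypothetical code and residues, for illustration only).
EXAMPLE = """\
>P1;demo
sequence:demo:::::::0.00: 0.00
ACDEFGHIKL
MNPQRSTVWY*
"""

if __name__ == "__main__":
    for entry in parse_pir(EXAMPLE):
        print(entry["code"], entry["kind"], len(entry["sequence"]))
```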

Figure 2.9.3 File build_profile.py. Input script file that searches for templates against a database of nonredundant PDB sequences.

Figure 2.9.4 An excerpt from the file build_profile.prf. The aligned sequences have been removed for convenience.

Figure 2.9.5 Script file compare.py.

Figure 2.9.6 Excerpts from the log file compare.log.

Figure 2.9.7 The script file align2d.py, used to align the target sequence against the template structure.

Figure 2.9.8 The alignment between sequences TvLDH and 1bdmA, in the MODELLER PAP format. File TvLDH‐1bdmA.pap.

Figure 2.9.9 Script file, model‐single.py, that generates five models.

Figure 2.9.10 File evaluate_model.py, used to generate a pseudo‐energy profile for the model.

Figure 2.9.11 A comparison of the pseudo‐energy profiles of the model (red) and the template (green) structures.

Figure 2.9.12 Typical errors in comparative modeling. (A) Errors in side chain packing. The Trp 109 residue in the crystal structure of mouse cellular retinoic acid binding protein I (red) is compared with its model (green). (B) Distortions and shifts in correctly aligned regions. A region in the crystal structure of mouse cellular retinoic acid binding protein I (red) is compared with its model (green) and with the template fatty acid binding protein (blue). (C) Errors in regions without a template. The Cα trace of the 112–117 loop is shown for the X‐ray structure of human eosinophil neurotoxin (red), its model (green), and the template ribonuclease A structure (residues 111–117; blue). (D) Errors due to misalignments. The N‐terminal region in the crystal structure of human eosinophil neurotoxin (red) is compared with its model (green). The corresponding region of the alignment with the template ribonuclease A is shown. The red lines show correct equivalences, that is, residues whose Cα atoms are within 5 Å of each other in the optimal least‐squares superposition of the two X‐ray structures. The “a” characters in the bottom line indicate helical residues and “b” characters, the residues in sheets. (E) Errors due to an incorrect template. The X‐ray structure of α‐trichosanthin (red) is compared with its model (green) that was calculated using indole‐3‐glycerophosphate synthase as the template.
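The 5 Å criterion for correct equivalences in panel D can be illustrated with a short sketch. It assumes that the two Cα coordinate lists are already paired residue by residue and already superposed (the least‐squares superposition itself is not performed here); `count_equivalences` and the coordinates are hypothetical and serve only as an illustration.

```python
import math


def count_equivalences(ca_a, ca_b, cutoff=5.0):
    """Count paired C-alpha atoms that lie within `cutoff` angstroms.

    Both inputs are equal-length lists of (x, y, z) tuples, assumed to be
    expressed in the same frame (i.e. already superposed).
    """
    if len(ca_a) != len(ca_b):
        raise ValueError("coordinate lists must have the same length")
    return sum(1 for p, q in zip(ca_a, ca_b) if math.dist(p, q) <= cutoff)


# Made-up coordinates for four paired residues (angstroms).
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
structure_b = [(0.5, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 6.0), (11.4, 3.0, 3.0)]

if __name__ == "__main__":
    n_ok = count_equivalences(structure_a, structure_b)
    print(f"{n_ok} of {len(structure_a)} pairs within 5 angstroms")
```

In this toy input the third pair is 6 Å apart and is therefore not counted as equivalent.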

Figure 2.9.13 Accuracy and application of protein structure models. The vertical axis indicates the different ranges of applicability of comparative protein structure modeling, the corresponding accuracy of protein structure models, and their sample applications. (A) The docosahexaenoic fatty acid ligand (violet) was docked into a high‐accuracy comparative model of brain lipid‐binding protein (right), modeled based on its 62% sequence identity to the crystallographic structure of adipocyte lipid‐binding protein (PDB code 1adl). A number of fatty acids were ranked for their affinity to brain lipid‐binding protein consistently with site‐directed mutagenesis and affinity chromatography experiments (Xu et al.), even though the ligand specificity profile of this protein is different from that of the template structure. Typical overall accuracy of a comparative model in this range of sequence similarity is indicated by a comparison of a model for adipocyte fatty acid binding protein with its actual structure (left). (B) A putative proteoglycan binding patch was identified on a medium‐accuracy comparative model of mouse mast cell protease 7 (right), modeled based on its 39% sequence identity to the crystallographic structure of bovine pancreatic trypsin (2ptn) that does not bind proteoglycans. The prediction was confirmed by site‐directed mutagenesis and heparin‐affinity chromatography experiments (Matsumoto et al., 1995). Typical accuracy of a comparative model in this range of sequence similarity is indicated by a comparison of a trypsin model with the actual structure. (C) A molecular model of the whole yeast ribosome (right) was calculated by fitting atomic rRNA and protein models into the electron density of the 80S ribosomal particle, obtained by electron microscopy at 15 Å resolution (Spahn et al.). Most of the models for 40 out of the 75 ribosomal proteins were based on template structures that were approximately 30% identical in sequence. Typical accuracy of a comparative model in this range of sequence similarity is indicated by a comparison of a model for a domain in L2 protein from B. stearothermophilus with the actual structure (1rl2).
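The sequence identity percentages quoted in this caption are computed from a target‐template alignment. The sketch below shows one common way to do this, counting identical residues over the aligned (non‐gap) columns. Other definitions use a different denominator (for example, the length of the shorter sequence), so the convention chosen here is an assumption; `percent_identity` and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy alignment (made-up residues): 8 aligned columns, 7 of them identical.
target = "ACDEFG-HIK"
template = "ACDQFGAH-K"

if __name__ == "__main__":
    print(f"{percent_identity(target, template):.1f}% identity")
```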

Literature Cited

	Abagyan, R. and Totrov, M. 1994. Biased probability Monte Carlo conformational searches and electrostatic calculations for peptides and proteins. J. Mol. Biol. 235:983‐1002.
	Alexandrov, N.N., Nussinov, R., and Zimmer, R.M. 1996. Fast protein fold recognition via sequence to structure alignment and contact capacity potentials. Pac. Symp. Biocomput. 1996:53‐72.
	Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI‐BLAST: A new generation of protein database search programs. Nucl. Acids Res. 25:3389‐3402.
	Andreeva, A., Howorth, D., Brenner, S.E., Hubbard, T.J., Chothia, C., and Murzin, A.G. 2004. SCOP database in 2004: Refinements integrate structure and sequence family data. Nucl. Acids Res. 32:D226‐D229.
	Aszodi, A. and Taylor, W.R. 1994. Secondary structure formation in model polypeptide chains. Protein Eng. 7:633‐644.
	Bairoch, A., Apweiler, R., Wu, C.H., Barker, W.C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., Martin, M.J., Natale, D.A., O'Donovan, C., Redaschi, N., and Yeh, L.S. 2005. The Universal Protein Resource (UniProt). Nucl. Acids Res. 33:D154‐D159.
	Baker, D. and Sali, A. 2001. Protein structure prediction and structural genomics. Science 294:93‐96.
	Barton, G.J. and Sternberg, M.J. 1987. A strategy for the rapid multiple alignment of protein sequences: Confidence levels from tertiary structure comparisons. J. Mol. Biol. 198:327‐337.
	Bateman, A., Coin, L., Durbin, R., Finn, R.D., Hollich, V., Griffiths‐Jones, S., Khanna, A., Marshall, M., Moxon, S., Sonnhammer, E.L., Studholme, D.J., Yeats, C., and Eddy, S.R. 2004. The Pfam protein families database. Nucl. Acids Res. 32:D138‐D141.
	Bates, P.A., Kelley, L.A., MacCallum, R.M., and Sternberg, M.J. 2001. Enhancement of protein modeling by human intervention in applying the automatic programs 3D‐JIGSAW and 3D‐PSSM. Proteins 5:39‐46.
	Benson, D.A., Karsch‐Mizrachi, I., Lipman, D.J., Ostell, J., and Wheeler, D.L. 2005. GenBank. Nucl. Acids Res. 33:D34‐D38.
	Blundell, T.L., Sibanda, B.L., Sternberg, M.J., and Thornton, J.M. 1987. Knowledge‐based prediction of protein structures and the design of novel molecules. Nature 326:347‐352.
	Boeckmann, B., Bairoch, A., Apweiler, R., Blatter, M.C., Estreicher, A., Gasteiger, E., Martin, M.J., Michoud, K., O'Donovan, C., Phan, I., Pilbout, S., and Schneider, M. 2003. The SWISS‐PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucl. Acids Res. 31:365‐370.
	Boissel, J.P., Lee, W.R., Presnell, S.R., Cohen, F.E., and Bunn, H.F. 1993. Erythropoietin structure‐function relationships: Mutant proteins that test a model of tertiary structure. J. Biol. Chem. 268:15983‐15993.
	Bowie, J.U., Luthy, R., and Eisenberg, D. 1991. A method to identify protein sequences that fold into a known three‐dimensional structure. Science 253:164‐170.
	Braun, W. and Go, N. 1985. Calculation of protein conformations by proton‐proton distance constraints: A new efficient algorithm. J. Mol. Biol. 186:611‐626.
	Brenner, S.E., Chothia, C., and Hubbard, T.J. 1998. Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships. Proc. Natl. Acad. Sci. U.S.A. 95:6073‐6078.
	Browne, W.J., North, A.C., Phillips, D.C., Brew, K., Vanaman, T.C., and Hill, R.L. 1969. A possible three‐dimensional structure of bovine alpha‐lactalbumin based on that of hen's egg‐white lysozyme. J. Mol. Biol. 42:65‐86.
	Bruccoleri, R.E. and Karplus, M. 1987. Prediction of the folding of short polypeptide segments by uniform conformational sampling. Biopolymers 26:137‐168.
	Bruccoleri, R.E. and Karplus, M. 1990. Conformational sampling using high‐temperature molecular dynamics. Biopolymers 29:1847‐1862.
	Bujnicki, J.M., Elofsson, A., Fischer, D., and Rychlewski, L. 2001. LiveBench‐1: Continuous benchmarking of protein structure prediction servers. Protein Sci. 10:352‐361.
	Bystroff, C. and Baker, D. 1998. Prediction of local structure in proteins using a library of sequence‐structure motifs. J. Mol. Biol. 281:565‐577.
	Canutescu, A.A., Shelenkov, A.A., and Dunbrack, R.L. Jr. 2003. A graph‐theory algorithm for rapid protein side‐chain prediction. Protein Sci. 12:2001‐2014.
	Chinea, G., Padron, G., Hooft, R.W., Sander, C., and Vriend, G. 1995. The use of position‐specific rotamers in model building by homology. Proteins 23:415‐421.
	Chothia, C. and Lesk, A.M. 1987. Canonical structures for the hypervariable regions of immunoglobulins. J. Mol. Biol. 196:901‐917.
	Chothia, C., Lesk, A.M., Tramontano, A., Levitt, M., Smith‐Gill, S.J., Air, G., Sheriff, S., Padlan, E.A., Davies, D., Tulip, W.R., Colman, P.M., Spinelli, S., Alzari, P.M., and Poljak, J. 1989. Conformations of immunoglobulin hypervariable regions. Nature 342:877‐883.
	Claessens, M., Van Cutsem, E., Lasters, I., and Wodak, S. 1989. Modelling the polypeptide backbone with ‘spare parts' from known protein structures. Protein Eng. 2:335‐345.
	Claude, J.B., Suhre, K., Notredame, C., Claverie, J.M., and Abergel, C. 2004. CaspR: A web server for automated molecular replacement using homology modelling. Nucl. Acids Res. 32:W606‐W609.
	Clore, G.M., Brunger, A.T., Karplus, M., and Gronenborn, A.M. 1986. Application of molecular dynamics with interproton distance restraints to three‐dimensional protein structure determination: A model study of crambin. J. Mol. Biol. 191:523‐551.
	Cohen, F.E., Gregoret, L., Presnell, S.R., and Kuntz, I.D. 1989. Protein structure predictions: New theoretical approaches. Prog. Clin. Biol. Res. 289:75‐85.
	Collura, V., Higo, J., and Garnier, J. 1993. Modeling of protein loops by simulated annealing. Protein Sci. 2:1502‐1510.
	Colovos, C. and Yeates, T.O. 1993. Verification of protein structures: Patterns of nonbonded atomic interactions. Protein Sci. 2:1511‐1519.
	Corpet, F. 1988. Multiple sequence alignment with hierarchical clustering. Nucl. Acids Res. 16:10881‐10890.
	Deane, C.M. and Blundell, T.L. 2001. CODA: A combined algorithm for predicting the structurally variable regions of protein models. Protein Sci. 10:599‐612.
	de Bakker, P.I., DePristo, M.A., Burke, D.F., and Blundell, T.L. 2003. Ab initio construction of polypeptide fragments: Accuracy of loop decoy discrimination by an all‐atom statistical potential and the AMBER force field with the Generalized Born solvation model. Proteins 51:21‐40.
	DePristo, M.A., de Bakker, P.I., Lovell, S.C., and Blundell, T.L. 2003. Ab initio construction of polypeptide fragments: Efficient generation of accurate, representative ensembles. Proteins 51:41‐55.
	Deshpande, N., Addess, K.J., Bluhm, W.F., Merino‐Ott, J.C., Townsend‐Merino, W., Zhang, Q., Knezevich, C., Xie, L., Chen, L., Feng, Z., Green, R.K., Flippen‐Anderson, J.L., Westbrook, J., Berman, H.M., and Bourne, P.E. 2005. The RCSB Protein Data Bank: A redesigned query system and relational database based on the mmCIF schema. Nucl. Acids Res. 33:D233‐D237.
	Dietmann, S., Park, J., Notredame, C., Heger, A., Lappe, M., and Holm, L. 2001. A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3. Nucl. Acids Res. 29:55‐57.
	Eddy, S.R. 1998. Profile hidden Markov models. Bioinformatics 14:755‐763.
	Edgar, R.C. 2004. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucl. Acids Res. 32:1792‐1797.
	Edgar, R.C. and Sjolander, K. 2004. A comparison of scoring functions for protein sequence profile alignment. Bioinformatics 20:1301‐1308.
	Enyedy, I.J., Ling, Y., Nacro, K., Tomita, Y., Wu, X., Cao, Y., Guo, R., Li, B., Zhu, X., Huang, Y., Long, Y.Q., Roller, P.P., Yang, D., and Wang, S. 2001. Discovery of small‐molecule inhibitors of Bcl‐2 through structure‐based computer screening. J. Med. Chem. 44:4313‐4324.
	Eswar, N., John, B., Mirkovic, N., Fiser, A., Ilyin, V.A., Pieper, U., Stuart, A.C., Marti‐Renom, M.A., Madhusudhan, M.S., Yerkovich, B., and Sali, A. 2003. Tools for comparative protein structure modeling and analysis. Nucl. Acids Res. 31:3375‐3380.
	Eyrich, V.A., Marti‐Renom, M.A., Przybylski, D., Madhusudhan, M.S., Fiser, A., Pazos, F., Valencia, A., Sali, A., and Rost, B. 2001. EVA: Continuous automatic evaluation of protein structure prediction servers. Bioinformatics 17:1242‐1243.
	Felsenstein, J. 1989. PHYLIP—Phylogeny Inference Package (Version 3.2). Cladistics 5:164‐166.
	Felts, A.K., Gallicchio, E., Wallqvist, A., and Levy, R.M. 2002. Distinguishing native conformations of proteins from decoys with an effective free energy estimator based on the OPLS all‐atom force field and the surface generalized born solvent model. Proteins 48:404‐422.
	Fernandez‐Fuentes, N., Oliva, B., and Fiser, A. 2006. A supersecondary structure library and search algorithm for modeling loops in protein structures. Nucl. Acids Res. 34:2085‐2097.
	Fidelis, K., Stern, P.S., Bacon, D., and Moult, J. 1994. Comparison of systematic search and database methods for constructing segments of protein structure. Protein Eng. 7:953‐960.
	Fine, R.M., Wang, H., Shenkin, P.S., Yarmush, D.L., and Levinthal, C. 1986. Predicting antibody hypervariable loop conformations. II: Minimization and molecular dynamics studies of MCPC603 from many randomly generated loop conformations. Proteins 1:342‐362.
	Fischer, D. 2006. Servers for protein structure prediction. Curr. Opin. Struct. Biol. 16:178‐182.
	Fischer, D., Elofsson, A., Rychlewski, L., Pazos, F., Valencia, A., Rost, B., Ortiz, A.R., and Dunbrack, R.L. Jr. 2001. CAFASP2: The second critical assessment of fully automated structure prediction methods. Proteins 5:171‐183.
	Fiser, A. 2004. Protein structure modeling in the proteomics era. Expert Rev. Proteomics 1:97‐110.
	Fiser, A. and Sali, A. 2003a. Modeller: Generation and refinement of homology‐based protein structure models. Methods Enzymol. 374:461‐491.
	Fiser, A. and Sali, A. 2003b. ModLoop: Automated modeling of loops in protein structures. Bioinformatics 19:2500‐2501.
	Fiser, A., Do, R.K., and Sali, A. 2000. Modeling of loops in protein structures. Protein Sci. 9:1753‐1773.
	Fiser, A., Feig, M., Brooks, C.L. 3rd, and Sali, A. 2002. Evolution and physics in comparative protein structure modeling. Acc. Chem. Res. 35:413‐421.
	Gao, H., Sengupta, J., Valle, M., Korostelev, A., Eswar, N., Stagg, S.M., Van Roey, P., Agrawal, R.K., Harvey, S.C., Sali, A., Chapman, M.S., and Frank, J. 2003. Study of the structural dynamics of the E. coli 70S ribosome using real‐space refinement. Cell 113:789‐801.
	Godzik, A. 2003. Fold recognition methods. Methods Biochem. Anal. 44:525‐546.
	Gough, J., Karplus, K., Hughey, R., and Chothia, C. 2001. Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure. J. Mol. Biol. 313:903‐919.
	Greer, J. 1981. Comparative model‐building of the mammalian serine proteases. J. Mol. Biol. 153:1027‐1042.
	Gribskov, M., McLachlan, A.D., and Eisenberg, D. 1987. Profile analysis: Detection of distantly related proteins. Proc. Natl. Acad. Sci. U.S.A. 84:4355‐4358.
	Havel, T.F. and Snow, M.E. 1991. A new method for building protein conformations from sequence alignments with homologues of known structure. J. Mol. Biol. 217:1‐7.
	Henikoff, J.G. and Henikoff, S. 1996. Using substitution probabilities to improve position‐specific scoring matrices. Comput. Appl. Biosci. 12:135‐143.
	Henikoff, J.G., Pietrokovski, S., McCallum, C.M., and Henikoff, S. 2000. Blocks‐based methods for detecting protein homology. Electrophoresis 21:1700‐1706.
	Henikoff, S. and Henikoff, J.G. 1994. Position‐based sequence weights. J. Mol. Biol. 243:574‐578.
	Higo, J., Collura, V., and Garnier, J. 1992. Development of an extended simulated annealing method: Application to the modeling of complementary determining regions of immunoglobulins. Biopolymers 32:33‐43.
	Holm, L. and Sander, C. 1991. Database algorithm for generating protein backbone and side‐chain co‐ordinates from a C alpha trace application to model building and detection of co‐ordinate errors. J. Mol. Biol. 218:183‐194.
	Hooft, R.W., Vriend, G., Sander, C., and Abola, E.E. 1996. Errors in protein structures. Nature 381:272.
	Howell, P.L., Almo, S.C., Parsons, M.R., Hajdu, J., and Petsko, G.A. 1992. Structure determination of turkey egg‐white lysozyme using Laue diffraction data. Acta Crystallogr. B 48:200‐207.
	Jacobson, M.P., Pincus, D.L., Rapp, C.S., Day, T.J., Honig, B., Shaw, D.E., and Friesner, R.A. 2004. A hierarchical approach to all‐atom protein loop prediction. Proteins 55:351‐367.
	Jaroszewski, L., Rychlewski, L., Li, Z., Li, W., and Godzik, A. 2005. FFAS03: A server for profile–profile sequence alignments. Nucl. Acids Res. 33:W284‐W288.
	John, B. and Sali, A. 2003. Comparative protein structure modeling by iterative alignment, model building and model assessment. Nucl. Acids Res. 31:3982‐3992.
	Jones, D.T. 1999. GenTHREADER: An efficient and reliable protein fold recognition method for genomic sequences. J. Mol. Biol. 287:797‐815.
	Jones, D.T. 2001. Evaluating the potential of using fold‐recognition models for molecular replacement. Acta Crystallogr. D Biol. Crystallogr. 57:1428‐1434.
	Jones, D.T., Taylor, W.R., and Thornton, J.M. 1992. A new approach to protein fold recognition. Nature 358:86‐89.
	Jones, T.A. and Thirup, S. 1986. Using known substructures in protein model building and crystallography. EMBO J. 5:819‐822.
	Kabsch, W. and Sander, C. 1984. On the use of sequence homologies to predict protein structure: Identical pentapeptides can have completely different conformations. Proc. Natl. Acad. Sci. U.S.A. 81:1075‐1078.
	Kahsay, R.Y., Wang, G., Dongre, N., Gao, G., and Dunbrack, R.L. Jr. 2002. CASA: A server for the critical assessment of protein sequence alignment accuracy. Bioinformatics 18:496‐497.
	Karchin, R., Cline, M., Mandel‐Gutfreund, Y., and Karplus, K. 2003. Hidden Markov models that use predicted local structure for fold recognition: Alphabets of backbone geometry. Proteins 51:504‐514.
	Karchin, R., Diekhans, M., Kelly, L., Thomas, D.J., Pieper, U., Eswar, N., Haussler, D., and Sali, A. 2005. LS‐SNP: Large‐scale annotation of coding non‐synonymous SNPs based on multiple information sources. Bioinformatics 21:2814‐2820.
	Karplus, K., Barrett, C., and Hughey, R. 1998. Hidden Markov models for detecting remote protein homologies. Bioinformatics 14:846‐856.
	Karplus, K., Karchin, R., Draper, J., Casper, J., Mandel‐Gutfreund, Y., Diekhans, M., and Hughey, R. 2003. Combining local‐structure, fold‐recognition, and new fold methods for protein structure prediction. Proteins 53:491‐496.
	Kelley, L.A., MacCallum, R.M., and Sternberg, M.J. 2000. Enhanced genome annotation using structural profiles in the program 3D‐PSSM. J. Mol. Biol. 299:499‐520.
	Koehl, P. and Delarue, M. 1995. A self consistent mean field approach to simultaneous gap closure and side‐chain positioning in homology modelling. Nat. Struct. Biol. 2:163‐170.
	Koh, I.‐Y.Y., Eyrich, V.A., Marti‐Renom, M.A., Przybylski, D., Madhusudhan, M.S., Narayanan, E., Grana, O., Pazos, F., Valencia, A., Sali, A., and Rost, B. 2003. EVA: Evaluation of protein structure prediction servers. Nucl. Acids Res. 31:3311‐3315.
	Krogh, A., Brown, M., Mian, I.S., Sjolander, K., and Haussler, D. 1994. Hidden Markov models in computational biology. Applications to protein modeling. J. Mol. Biol. 235:1501‐1531.
	Laskowski, R.A., MacArthur, M.W., Moss, D.S., and Thornton, J.M. 1993. PROCHECK: A program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26:283‐291.
	Laskowski, R.A., Rullmann, J.A., MacArthur, M.W., Kaptein, R., and Thornton, J.M. 1996. AQUA and PROCHECK‐NMR: Programs for checking the quality of protein structures solved by NMR. J. Biomol. NMR 8:477‐486.
	Laskowski, R.A., MacArthur, M.W., and Thornton, J.M. 1998. Validation of protein models derived from experiment. Curr. Opin. Struct. Biol. 8:631‐639.
	Lessel, U. and Schomburg, D. 1994. Similarities between protein 3‐D structures. Protein Eng. 7:1175‐1187.
	Levitt, M. 1992. Accurate modeling of protein conformation by automatic segment matching. J. Mol. Biol. 226:507‐533.
	Li, R., Chen, X., Gong, B., Selzer, P.M., Li, Z., Davidson, E., Kurzban, G., Miller, R.E., Nuzum, E.O., McKerrow, J.H., Fletterick, R.J., Gillmor, S.A., Craik, C.S., Kuntz, I.D., Cohen, F.E., and Kenyon, G.L. 1996. Structure‐based design of parasitic protease inhibitors. Bioorg. Med. Chem. 4:1421‐1427.
	Lin, J., Qian, J., Greenbaum, D., Bertone, P., Das, R., Echols, N., Senes, A., Stenger, B., and Gerstein, M. 2002. GeneCensus: Genome comparisons in terms of metabolic pathway activity and protein family sharing. Nucl. Acids Res. 30:4574‐4582.
	Lindahl, E. and Elofsson, A. 2000. Identification of related proteins on family, superfamily and fold level. J. Mol. Biol. 295:613‐625.
	Luthy, R., Bowie, J.U., and Eisenberg, D. 1992. Assessment of protein models with three‐dimensional profiles. Nature 356:83‐85.
	MacKerell, A.D. Jr., Bashford, D., Bellott, M., Dunbrack, R.L. Jr., Evanseck, J.D., Field, M.J., Fischer, S., Gao, J., Guo, H., Ha, S., Joseph‐McCarthy, D., Kuchnir, L., Kuczera, K., Lau, F.T.K., Mattos, C., Michnick, S., Ngo, T., Nguyen, D.T., Prodhom, B., Reiher, W.E. III, Roux, B., Schlenkrich, M., Smith, J.C., Stote, R., Straub, J., Watanabe, M., Wiórkiewicz‐Kuczera, J., Yin, D., and Karplus, M. 1998. All‐atom empirical potential for molecular modeling and dynamics studies of proteins. J. Phys. Chem. B 102:3586‐3616.
	Madhusudhan, M.S., Marti‐Renom, M.A., Sanchez, R., and Sali, A. 2006. Variable gap penalty for protein sequence‐structure alignment. Protein Eng. Des. Sel. 19:129‐133.
	Mallick, P., Weiss, R., and Eisenberg, D. 2002. The directional atomic solvation energy: An atom‐based potential for the assignment of protein sequences to known folds. Proc. Natl. Acad. Sci. U.S.A. 99:16041‐16046.
	Marti‐Renom, M.A., Stuart, A.C., Fiser, A., Sanchez, R., Melo, F., and Sali, A. 2000. Comparative protein structure modeling of genes and genomes. Annu. Rev. Biophys. Biomol. Struct. 29:291‐325.
	Marti‐Renom, M.A., Ilyin, V.A., and Sali, A. 2001. DBAli: A database of protein structure alignments. Bioinformatics 17:746‐747.
	Marti‐Renom, M.A., Madhusudhan, M.S., Fiser, A., Rost, B., and Sali, A. 2002. Reliability of assessment of protein structure prediction methods. Structure (Camb) 10:435‐440.
	Marti‐Renom, M.A., Madhusudhan, M.S., and Sali, A. 2004. Alignment of protein sequences by their profiles. Protein Sci. 13:1071‐1087.
	Matsumoto, R., Sali, A., Ghildyal, N., Karplus, M., and Stevens, R.L. 1995. Packaging of proteases and proteoglycans in the granules of mast cells and other hematopoietic cells. A cluster of histidines on mouse mast cell protease 7 regulates its binding to heparin serglycin proteoglycans. J. Biol. Chem. 270:19524‐19531.
	McGuffin, L.J. and Jones, D.T. 2003. Improvement of the GenTHREADER method for genomic fold recognition. Bioinformatics 19:874‐881.
	McGuffin, L.J., Bryson, K., and Jones, D.T. 2000. The PSIPRED protein structure prediction server. Bioinformatics 16:404‐405.
	Melo, F. and Feytmans, E. 1998. Assessing protein structures with a non‐local atomic interaction energy. J. Mol. Biol. 277:1141‐1152.
	Melo, F., Sanchez, R., and Sali, A. 2002. Statistical potentials for fold assessment. Protein Sci. 11:430‐448.
	Mezei, M. 1998. Chameleon sequences in the PDB. Protein Eng. 11:411‐414.
	Mirkovic, N., Marti‐Renom, M.A., Sali, A., and Monteiro, A.N.A. 2004. Structure‐based assessment of missense mutations in human BRCA1: Implications for breast and ovarian cancer predisposition. Cancer Res. 64:3790‐3797.
	Misura, K.M. and Baker, D. 2005. Progress and challenges in high‐resolution refinement of protein structure models. Proteins 59:15‐29.
	Misura, K.M., Chivian, D., Rohl, C.A., Kim, D.E., and Baker, D. 2006. Physically realistic homology models built with ROSETTA can be more accurate than their templates. Proc. Natl. Acad. Sci. U.S.A. 103:5361‐5366.
	Miwa, J.M., Ibanez‐Tallon, I., Crabtree, G.W., Sanchez, R., Sali, A., Role, L.W., and Heintz, N. 1999. lynx1, an endogenous toxin‐like modulator of nicotinic acetylcholine receptors in the mammalian CNS. Neuron 23:105‐114.
	Modi, S., Paine, M.J., Sutcliffe, M.J., Lian, L.Y., Primrose, W.U., Wolf, C.R., and Roberts, G.C. 1996. A model for human cytochrome P450 2D6 based on homology modeling and NMR studies of substrate binding. Biochemistry 35:4540‐4550.
	Moult, J. 2005. A decade of CASP: Progress, bottlenecks and prognosis in protein structure prediction. Curr. Opin. Struct. Biol. 15:285‐289.
	Moult, J. and James, M.N. 1986. An algorithm for determining the conformation of polypeptide segments in proteins by systematic search. Proteins 1:146‐163.
	Moult, J., Fidelis, K., Zemla, A., and Hubbard, T. 2003. Critical assessment of methods of protein structure prediction (CASP)‐round V. Proteins 53:334‐339.
	Moult, J., Fidelis, K., Rost, B., Hubbard, T., and Tramontano, A. 2005. Critical assessment of methods of protein structure prediction (CASP)–round 6. Proteins 61:3‐7.
	Nagarajaram, H.A., Reddy, B.V., and Blundell, T.L. 1999. Analysis and prediction of inter‐strand packing distances between beta‐sheets of globular proteins. Protein Eng. 12:1055‐1062.
	Needleman, S.B. and Wunsch, C.D. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48:443‐453.
	Notredame, C., Higgins, D.G., and Heringa, J. 2000. T‐Coffee: A novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302:205‐217.
	Ohlson, T., Wallner, B., and Elofsson, A. 2004. Profile‐profile methods provide improved fold‐recognition: A study of different profile‐profile alignment methods. Proteins 57:188‐197.
	Oldfield, T.J. 1992. SQUID: A program for the analysis and display of data from crystallography and molecular dynamics. J. Mol. Graph. 10:247‐252.
	Oliva, B., Bates, P.A., Querol, E., Aviles, F.X., and Sternberg, M.J. 1997. An automated classification of the structure of protein loops. J. Mol. Biol. 266:814‐830.
	Panchenko, A.R. 2003. Finding weak similarities between proteins by sequence profile comparison. Nucl. Acids Res. 31:683‐689.
	Park, J., Karplus, K., Barrett, C., Hughey, R., Haussler, D., Hubbard, T., and Chothia, C. 1998. Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J. Mol. Biol. 284:1201‐1210.
	Pawlowski, K., Bierzynski, A., and Godzik, A. 1996. Structural diversity in a family of homologous proteins. J. Mol. Biol. 258:349‐366.
	Pearl, F., Todd, A., Sillitoe, I., Dibley, M., Redfern, O., Lewis, T., Bennett, C., Marsden, R., Grant, A., Lee, D., Akpor, A., Maibaum, M., Harrison, A., Dallman, T., Reeves, G., Diboun, I., Addou, S., Lise, S., Johnston, C., Sillero, A., Thornton, J., and Orengo, C. 2005. The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis. Nucl. Acids Res. 33:D247‐D251.
	Pearson, W.R. 1994. Using the FASTA program to search protein and DNA sequence databases. Methods Mol. Biol. 24:307‐331.
	Pearson, W.R. 2000. Flexible sequence similarity searching with the FASTA3 program package. Methods Mol. Biol. 132:185‐219.
	Petrey, D. and Honig, B. 2005. Protein structure prediction: Inroads to biology. Mol. Cell. 20:811‐819.
	Petrey, D., Xiang, Z., Tang, C.L., Xie, L., Gimpelev, M., Mitros, T., Soto, C.S., Goldsmith‐Fischman, S., Kernytsky, A., Schlessinger, A., Koh, I.Y., Alexov, E., and Honig, B. 2003. Using multiple structure alignments, fast model building, and energetic analysis in fold recognition and homology modeling. Proteins 53:430‐435.
	Pieper, U., Eswar, N., Braberg, H., Madhusudhan, M.S., Davis, F.P., Stuart, A.C., Mirkovic, N., Rossi, A., Marti‐Renom, M.A., Fiser, A., Webb, B., Greenblatt, D., Huang, C.C., Ferrin, T.E., and Sali, A. 2004. MODBASE, a database of annotated comparative protein structure models, and associated resources. Nucl. Acids Res. 32:D217‐D222.
	Pieper, U., Eswar, N., Davis, F.P., Braberg, H., Madhusudhan, M.S., Rossi, A., Marti‐Renom, M., Karchin, R., Webb, B.M., Eramian, D., Shen, M.Y., Kelly, L., Melo, F., and Sali, A. 2006. MODBASE: A database of annotated comparative protein structure models and associated resources. Nucl. Acids Res. 34:D291‐D295.
	Pietrokovski, S. 1996. Searching databases of conserved sequence regions by aligning protein multiple‐alignments. Nucl. Acids Res. 24:3836‐3845.
	Pontius, J., Richelle, J., and Wodak, S.J. 1996. Deviations from standard atomic volumes as a quality measure for protein crystal structures. J. Mol. Biol. 264:121‐136.
	Que, X., Brinen, L.S., Perkins, P., Herdman, S., Hirata, K., Torian, B.E., Rubin, H., McKerrow, J.H., and Reed, S.L. 2002. Cysteine proteinases from distinct cellular compartments are recruited to phagocytic vesicles by Entamoeba histolytica. Mol. Biochem. Parasitol. 119:23‐32.
	Ring, C.S., Kneller, D.G., Langridge, R., and Cohen, F.E. 1992. Taxonomy and conformational analysis of loops in proteins. J. Mol. Biol. 224:685‐699.
	Ring, C.S., Sun, E., McKerrow, J.H., Lee, G.K., Rosenthal, P.J., Kuntz, I.D., and Cohen, F.E. 1993. Structure‐based inhibitor design by using protein models for the development of antiparasitic agents. Proc. Natl. Acad. Sci. U.S.A. 90:3583‐3587.
	Rost, B. 1999. Twilight zone of protein sequence alignments. Protein Eng. 12:85‐94.
	Rost, B. and Liu, J. 2003. The PredictProtein server. Nucl. Acids Res. 31:3300‐3304.
	Rufino, S.D., Donate, L.E., Canard, L.H., and Blundell, T.L. 1997. Predicting the conformational class of short and medium size loops connecting regular secondary structures: Application to comparative modelling. J. Mol. Biol. 267:352‐367.
	Rychlewski, L. and Fischer, D. 2005. LiveBench‐8: The large‐scale, continuous assessment of automated protein structure prediction. Protein Sci. 14:240‐245.
	Rychlewski, L., Zhang, B., and Godzik, A. 1998. Fold and function predictions for Mycoplasma genitalium proteins. Fold Des. 3:229‐238.
	Sadreyev, R. and Grishin, N. 2003. COMPASS: A tool for comparison of multiple protein alignments with assessment of statistical significance. J. Mol. Biol. 326:317‐336.
	Sali, A. and Blundell, T.L. 1993. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 234:779‐815.
	Sali, A. and Overington, J.P. 1994. Derivation of rules for comparative protein modeling from a database of protein structure alignments. Protein Sci. 3:1582‐1596.
	Samudrala, R. and Moult, J. 1998. A graph‐theoretic algorithm for comparative modeling of protein structure. J. Mol. Biol. 279:287‐302.
	Sanchez, R. and Sali, A. 1997a. Advances in comparative protein‐structure modelling. Curr. Opin. Struct. Biol. 7:206‐214.
	Sanchez, R. and Sali, A. 1997b. Evaluation of comparative protein structure modeling by MODELLER‐3. Proteins Suppl. 1:50‐58.
	Sanchez, R. and Sali, A. 1998. Large‐scale protein structure modeling of the Saccharomyces cerevisiae genome. Proc. Natl. Acad. Sci. U.S.A. 95:13597‐13602.
	Saqi, M.A., Russell, R.B., and Sternberg, M.J. 1998. Misleading local sequence alignments: Implications for comparative protein modelling. Protein Eng. 11:627‐630.
	Sauder, J.M., Arthur, J.W., and Dunbrack, R.L. Jr. 2000. Large‐scale comparison of protein sequence alignment algorithms with structure alignments. Proteins 40:6‐22.
	Schwarzenbacher, R., Godzik, A., Grzechnik, S.K., and Jaroszewski, L. 2004. The importance of alignment accuracy for molecular replacement. Acta Crystallogr. D Biol. Crystallogr. 60:1229‐1236.
	Schwede, T., Kopp, J., Guex, N., and Peitsch, M.C. 2003. SWISS‐MODEL: An automated protein homology‐modeling server. Nucl. Acids Res. 31:3381‐3385.
	Selzer, P.M., Chen, X., Chan, V.J., Cheng, M., Kenyon, G.L., Kuntz, I.D., Sakanari, J.A., Cohen, F.E., and McKerrow, J.H. 1997. Leishmania major: Molecular modeling of cysteine proteases and prediction of new nonpeptide inhibitors. Exp. Parasitol. 87:212‐221.
	Sheng, Y., Sali, A., Herzog, H., Lahnstein, J., and Krilis, S.A. 1996. Site‐directed mutagenesis of recombinant human beta 2‐glycoprotein I identifies a cluster of lysine residues that are critical for phospholipid binding and anti‐cardiolipin antibody activity. J. Immunol. 157:3744‐3751.
	Shenkin, P.S., Yarmush, D.L., Fine, R.M., Wang, H.J., and Levinthal, C. 1987. Predicting antibody hypervariable loop conformation. I. Ensembles of random conformations for ringlike structures. Biopolymers 26:2053‐2085.
	Shi, J., Blundell, T.L., and Mizuguchi, K. 2001. FUGUE: Sequence‐structure homology recognition using environment‐specific substitution tables and structure‐dependent gap penalties. J. Mol. Biol. 310:243‐257.
	Sibanda, B.L., Blundell, T.L., and Thornton, J.M. 1989. Conformation of beta‐hairpins in protein structures. A systematic classification with applications to modelling by homology, electron density fitting and protein engineering. J. Mol. Biol. 206:759‐777.
	Sippl, M.J. 1990. Calculation of conformational ensembles from potentials of mean force. An approach to the knowledge‐based prediction of local structures in globular proteins. J. Mol. Biol. 213:859‐883.
	Sippl, M.J. 1993. Recognition of errors in three‐dimensional structures of proteins. Proteins 17:355‐362.
	Sippl, M.J. 1995. Knowledge‐based potentials for proteins. Curr. Opin. Struct. Biol. 5:229‐235.
	Skolnick, J. and Kihara, D. 2001. Defrosting the frozen approximation: PROSPECTOR–a new approach to threading. Proteins 42:319‐331.
	Smith, T.F. and Waterman, M.S. 1981. Identification of common molecular subsequences. J. Mol. Biol. 147:195‐197.
	Spahn, C.M., Beckmann, R., Eswar, N., Penczek, P.A., Sali, A., Blobel, G., and Frank, J. 2001. Structure of the 80S ribosome from Saccharomyces cerevisiae–tRNA‐ribosome and subunit‐subunit interactions. Cell 107:373‐386.
	Srinivasan, N. and Blundell, T.L. 1993. An evaluation of the performance of an automated procedure for comparative modelling of protein tertiary structure. Protein Eng. 6:501‐512.
	Sutcliffe, M.J., Haneef, I., Carney, D., and Blundell, T.L. 1987a. Knowledge based modelling of homologous proteins, Part I: Three‐dimensional frameworks derived from the simultaneous superposition of multiple structures. Protein Eng. 1:377‐384.
	Sutcliffe, M.J., Hayes, F.R., and Blundell, T.L. 1987b. Knowledge based modelling of homologous proteins, Part II: Rules for the conformations of substituted sidechains. Protein Eng. 1:385‐392.
	Sutcliffe, M.J., Dobson, C.M., and Oswald, R.E. 1992. Solution structure of neuronal bungarotoxin determined by two‐dimensional NMR spectroscopy: Calculation of tertiary structure using systematic homologous model building, dynamical simulated annealing, and restrained molecular dynamics. Biochemistry 31:2962‐2970.
	Taylor, W.R., Flores, T.P., and Orengo, C.A. 1994. Multiple protein structure alignment. Protein Sci. 3:1858‐1870.
	Thompson, J.D., Higgins, D.G., and Gibson, T.J. 1994. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position‐specific gap penalties and weight matrix choice. Nucl. Acids Res. 22:4673‐4680.
	Thompson, J.D., Plewniak, F., and Poch, O. 1999. BAliBASE: A benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics 15:87‐88.
	Topham, C.M., McLeod, A., Eisenmenger, F., Overington, J.P., Johnson, M.S., and Blundell, T.L. 1993. Fragment ranking in modelling of protein structure. Conformationally constrained environmental amino acid substitution tables. J. Mol. Biol. 229:194‐220.
	Topham, C.M., Srinivasan, N., Thorpe, C.J., Overington, J.P., and Kalsheker, N.A. 1994. Comparative modelling of major house dust mite allergen Der p I: Structure validation using an extended environmental amino acid propensity table. Protein Eng. 7:869‐894.
	Unger, R., Harel, D., Wherland, S., and Sussman, J.L. 1989. A 3D building blocks approach to analyzing and predicting structure of proteins. Proteins 5:355‐373.
	Vakser, I.A. 1995. Protein docking for low‐resolution structures. Protein Eng. 8:371‐377.
	van Gelder, C.W., Leusen, F.J., Leunissen, J.A., and Noordik, J.H. 1994. A molecular dynamics approach for the generation of complete protein structures from limited coordinate data. Proteins 18:174‐185.
	van Vlijmen, H.W. and Karplus, M. 1997. PDB‐based protein loop prediction: Parameters for selection and methods for optimization. J. Mol. Biol. 267:975‐1001.
	Vernal, J., Fiser, A., Sali, A., Muller, M., Cazzulo, J.J., and Nowicki, C. 2002. Probing the specificity of a trypanosomal aromatic alpha‐hydroxy acid dehydrogenase by site‐directed mutagenesis. Biochem. Biophys. Res. Commun. 293:633‐639.
	von Ohsen, N., Sommer, I., and Zimmer, R. 2003. Profile‐profile alignment: A powerful tool for protein structure prediction. Pac. Symp. Biocomput. 2003:252‐263.
	Vriend, G. 1990. WHAT IF: A molecular modeling and drug design program. J. Mol. Graph. 8:52‐56, 29.
	Wang, G. and Dunbrack, R.L. Jr. 2004. Scoring profile‐to‐profile sequence alignments. Protein Sci. 13:1612‐1626.
	Wolf, E., Vassilev, A., Makino, Y., Sali, A., Nakatani, Y., and Burley, S.K. 1998. Crystal structure of a GCN5‐related N‐acetyltransferase: Serratia marcescens aminoglycoside 3‐N‐acetyltransferase. Cell 94:439‐449.
	Worley, K.C., Culpepper, P., Wiese, B.A., and Smith, R.F. 1998. BEAUTY‐X: Enhanced BLAST searches for DNA queries. Bioinformatics 14:890‐891.
	Wu, G., Fiser, A., ter Kuile, B., Sali, A., and Muller, M. 1999. Convergent evolution of Trichomonas vaginalis lactate dehydrogenase from malate dehydrogenase. Proc. Natl. Acad. Sci. U.S.A. 96:6285‐6290.
	Xiang, Z., Soto, C.S., and Honig, B. 2002. Evaluating conformational free energies: The colony energy and its application to the problem of loop prediction. Proc. Natl. Acad. Sci. U.S.A. 99:7432‐7437.
	Xu, J., Li, M., Kim, D., and Xu, Y. 2003. RAPTOR: Optimal protein threading by linear programming. J. Bioinform. Comput. Biol. 1:95‐117.
	Xu, L.Z., Sanchez, R., Sali, A., and Heintz, N. 1996. Ligand specificity of brain lipid‐binding protein. J. Biol. Chem. 271:24711‐24719.
	Ye, Y., Jaroszewski, L., Li, W., and Godzik, A. 2003. A segment alignment approach to protein comparison. Bioinformatics 19:742‐749.
	Yona, G. and Levitt, M. 2002. Within the twilight zone: A sensitive profile‐profile comparison tool based on information theory. J. Mol. Biol. 315:1257‐1275.
	Zheng, Q., Rosenfeld, R., Vajda, S., and DeLisi, C. 1993. Determining protein loop conformation using scaling‐relaxation techniques. Protein Sci. 2:1242‐1248.
	Zhou, H. and Zhou, Y. 2002. Distance‐scaled, finite ideal‐gas reference state improves structure‐derived potentials of mean force for structure selection and stability prediction. Protein Sci. 11:2714‐2726.
	Zhou, H. and Zhou, Y. 2004. Single‐body residue‐level knowledge‐based energy score combined with sequence‐profile and secondary structure information for fold recognition. Proteins 55:1005‐1013.
	Zhou, H. and Zhou, Y. 2005. Fold recognition by combining sequence profiles derived from evolution and from depth‐dependent structural alignment of fragments. Proteins 58:321‐328.
Internet Resources
	http://www.salilab.org/modeller
	Eswar, N., Madhusudhan, M.S., Marti‐Renom, M.A., and Sali, A. 2005. MODELLER, A Protein Structure Modeling Program, Release 9v.2.
