Homology Modeling

The following document gives some in-depth information about homology modeling. A modeling tutorial using DS Modeling (Accelrys) can be found here.

STEP 4: Model evaluation

Errors in comparative modelling arise mainly from the lack of suitable templates and from decreasing sequence identity between the target and the templates. These errors fall into five categories:
Finally, the recent work of Lazaridis and Karplus (1999) shows that including solvation (environmental) terms in the classical molecular mechanics energy calculation improves the detection of wrongly modelled regions. Consequently, the criticism levelled at potentials of mean force cannot be applied to this approach, which performed as well as statistical functions in discriminating correct from misfolded models. Experimental evaluation of the model requires site-directed mutagenesis or other additional information, which is not commonly available. One way to avoid the experiment is to use the knowledge obtained from a broad multiple alignment of related sequences, introducing the following conditions:
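Independently of the specific conditions, one kind of knowledge a multiple alignment can supply is which positions are conserved across the family, since these are natural places to check a model first. The following is a minimal sketch of that idea only, not the procedure this document refers to: the function names, the entropy threshold and the toy alignment are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conserved_positions(alignment, max_entropy=0.5):
    """Return 0-based column indices whose entropy is at most max_entropy.

    The threshold is a hypothetical choice; a real analysis would tune it.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        i
        for i in range(length)
        if column_entropy([seq[i] for seq in alignment]) <= max_entropy
    ]


# Toy alignment invented for illustration (not real sequences).
toy_alignment = [
    "MKLVDG-A",
    "MRLVDGSA",
    "MKIVDGTA",
    "MQLVEG-A",
]

print(conserved_positions(toy_alignment))
```

With this toy input, only the fully conserved columns pass the default threshold. In a real case the alignment would come from an alignment program, and the conserved positions would be compared with the environment each residue occupies in the model.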
REFERENCES

N. Alexandrov and R. Luethy. (1998). Alignment algorithm for homology modeling and threading. Protein Sci. 7, 254-258.
B. Al-Lazikani, A. Lesk and C. Chothia. (1997). Standard conformations for the canonical structures of immunoglobulins. J. Mol. Biol. 273, 927-948.
P. Aloy, J. Mas, M. Martí-Renom, E. Querol, F. Avilés and B. Oliva. (2000). Refinement of modelled structures by knowledge based energy profiles and secondary structure prediction: Application to the Human Procarboxypeptidase A2. J. Comput.-Aided Mol. Des. 14, 83-92.
S. Altschul, T. Madden, A. Schaffer, J. Zhang, Z. Zhang, W. Miller and D. Lipman. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.
T. Attwood. (2000). The Babel of Bioinformatics. Science 290, 471-473.
A. Bairoch and R. Apweiler. (1997). The SWISS-PROT protein sequence data bank and its supplement TrEMBL. Nucleic Acids Res. 25, 31-36.
G. Barton and M. Sternberg. (1987). A strategy for the rapid multiple alignment of protein sequences; confidence levels from tertiary structure comparisons. J. Mol. Biol. 198, 327-337.
A. Bateman, E. Birney, R. Durbin, S. Eddy, K. Howe and E. Sonnhammer. (2000). The Pfam protein family database. Nucleic Acids Res. 28, 263-266.
P. Bates and M. Sternberg. (1998). From Sequence to Structure. Protein Structure Prediction: A practical approach (M. Sternberg, Ed.), Oxford Univ. Press, Oxford, UK.
P. A. Bates and M. Sternberg. (1999). Model building by comparison at CASP3: Using expert knowledge and computer automation. Proteins: Struct. Funct. Genet. Suppl. 3, 47-54.
D. Bowie, J. U. Luthy and D. Eisenberg. (1991). A method to identify protein sequences that fold into a known 3D structure. Science 253, 164-170.
B. Brooks, R. Bruccoleri, B. Olafson, D. States, S. Swaminathan and M. Karplus. (1983). CHARMM: a program for macromolecular energy minimization and dynamics calculations. J. Comp. Chem. 4, 187-217.
R. Bruccoleri and M. Karplus. (1987). Prediction of the folding of short polypeptide segments by uniform conformational sampling. Biopolymers 26, 137-138.
A. Brünger. (1992). X-PLOR: A system for X-ray crystallography and NMR. Yale University Press, New Haven.
V. Collura, J. Higo and J. Garnier. (1993). Modeling of protein loops by simulated annealing. Protein Sci. 2, 1502-1510.
R. Copley and P. Bork. (2000). Homology among (βα)8 barrels: implications for the evolution of metabolic pathways. J. Mol. Biol. 303, 627-640.
C. Chothia, A. Lesk, A. Tramontano, M. Levitt, S. Smith-Gill, G. Air, S. Sheriff, E. Padlan, D. Davies, W. Tulip, P. Colman, S. Spinelli, P. Alzari and R. Poljak. (1989). Conformations of Immunoglobulin Hypervariable Regions. Nature 342, 877-883.
S. Chung and S. Subbiah. (1996). A structural explanation for the twilight zone of protein sequence homology. Structure 4, 1123-1127.
C. Deane, Q. Kaas and T. Blundell. (2001). SCORE: predicting the core of protein models. Bioinformatics 17, 541-550.
R. Dima, J. Banavar and A. Maritan. (2000). Scoring functions in protein folding and design. Protein Sci. 9, 812-819.
F. S. Domingues, W. A. Koppensteiner, M. Jaritz, A. Prlic, C. Weichenberger, M. Wiederstein, H. Floeckner, P. Lackner and M. Sippl. (1999). Sustained performance of knowledge-based potentials in fold recognition. Proteins: Struct. Funct. Genet. Suppl. 3, 112-120.
L. Donate, S. Rufino, L. Canard and T. Blundell. (1996). Conformational analysis and clustering of short and medium size loops connecting regular secondary structures. A database for modelling and prediction. Protein Sci. 5, 2600-2616.
M. Dudeck, K. Ramnarayan and J. Ponder. (1998). Protein structure prediction using a combination of sequence homology and global energy minimization: II. Energy functions. J. Comp. Chem. 19, 548-573.
S. Eddy. (1998). Profile hidden Markov models. Bioinformatics 14, 755-763.
K. Fidelis, P. Stern, D. Bacon and J. Moult. (1994). Comparison of systematic search and database methods for constructing segments of protein structure. Protein Eng. 7, 953-960.
D. Fischer and D. Eisenberg. (1996). Protein fold recognition using sequence-derived predictions. Protein Sci. 5, 947-955.
A. Fiser, R. Do and A. Sali. (2000). Modeling of loops in protein structures. Protein Sci. 9, 1753-1773.
I. Friedberg, T. Kaplan and H. Margalit. (2000). Evaluation of PSI-BLAST alignment accuracy in comparison to structural alignments. Protein Sci. 9, 2278-2284.
D. W. Gatchell, S. Dennis and S. Vajda. (2000). Discrimination of Near-native Protein Structures from Misfolded Models by Empirical Free Energy Functions. Proteins: Struct. Funct. Genet. 41, 518-534.
C. Geourjon, C. Combet, C. Blanchet and G. Deleague. (2001). Identification of related proteins with weak sequence identity using secondary structure information. Protein Sci. 10, 788-797.
O. Gotoh. (1996). Significant improvement in accuracy of multiple sequence alignments by iterative refinements assessed by reference to structural alignments. J. Mol. Biol. 264, 823-838.
J. Greer. (1990). Comparative modeling methods: application to the family of the mammalian serine proteases. Proteins: Struct. Funct. Genet. 7, 317-334.
W. van Gunsteren, S. Billeter, A. Eising, P. Hünenberger, P. Früger, A. Mark, W. Scott and I. Tironi. (1996). Biomolecular Simulation: The GROMOS96 Manual and User Guide. Verlag der Fachvereine, Zürich.
R. Hooft, G. Vriend and C. Sander. (1996). Verification of protein structures: side-chain planarity. J. Appl. Crystallogr. 29, 714-716.
X. Huang and W. Miller. (1991). A time-efficient linear-space local similarity algorithm. Adv. Appl. Math. 12, 337-357.
J. Irving, J. Whisstock and A. Lesk. (2001). Protein structural alignments and functional genomics. Proteins: Struct. Funct. Genet. 42, 378-382.
L. Jaroszewski, L. Rychlewski and A. Godzik. (2000). Improving the quality of twilight-zone alignments. Protein Sci. 9, 1487-1496.
A. Jennings, C. Edge and M. Sternberg. (2001). An approach to improving multiple alignments of protein sequences using predicted secondary structure. Protein Eng. 14, 227-231.
D. Jones. (1999). GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences. J. Mol. Biol. 287, 797-815.
T. A. Jones and S. Thirup. (1986). Using known substructures in protein model building and crystallography. EMBO J. 5, 819-822.
K. Karplus, C. Barrett, M. Cline, M. Diekhans, L. Grate and R. Hughey. (1999). Predicting protein structure using only sequence information. Proteins: Struct. Funct. Genet. Suppl. 3, 121-125.
L. A. Kelley, R. M. MacCallum and M. Sternberg. (2000). Enhanced genome annotation using structural profiles in the program 3D-PSSM. J. Mol. Biol. 299, 499-520.
A. Kidera. (1995). Enhanced conformational sampling in Monte Carlo simulations of proteins: Applications to a constrained peptide. Proc. Natl. Acad. Sci. USA 92, 9886-9889.
P. Koehl and M. Delarue. (1995). A self-consistent mean field approach to simultaneous gap closure and side-chain positioning in protein homology modeling. Nat. Struct. Biol. 2, 163-170.
P. Koehl and M. Delarue. (1996). Mean-field minimization methods for biological macromolecules. Curr. Opin. Struct. Biol. 6, 222-226.
R. Laskowski, M. MacArthur and J. Thornton. (1998). Validation of Protein models derived from experiment. Curr. Opin. Struct. Biol. 5, 631-639.
T. Lazaridis and M. Karplus. (1999). Discrimination of the native from misfolded protein models with an energy function including implicit solvation. J. Mol. Biol. 288, 477-487.
J. U. Luthy, D. Bowie and D. Eisenberg. (1992). Assessment of protein models with three dimensional profiles. Nature 356, 83-85.
A. Martin, J. Cheetham and A. Rees. (1989). Modeling antibody hypervariable loops: a combined algorithm. Proc. Natl. Acad. Sci. USA 86, 9268-9272.
A. Martin and J. Thornton. (1996). Structural Families in Loops of Homologous Proteins: Automatic Classification, Modelling and Application to Antibodies. J. Mol. Biol. 263, 800-815.
M. Martí-Renom, J. Mas, P. Aloy, E. Querol, F. Aviles and B. Oliva. (1998). Statistical Analysis of the loop-geometry on a non-redundant database of proteins. J. Mol. Mod. 4, 347-354.
M. A. Martí-Renom, A. Stuart, A. Fisher, R. Sánchez, F. Melo and A. Sali. (2000). Comparative protein structure modeling of genes and genomes. Annu. Rev. Biophys. Biomol. Struct. 29, 291-325.
C. Mattos, G. Petsko and M. Karplus. (1994). Analysis of two residue turns in proteins. J. Mol. Biol. 238, 733-747.
M. McGregor, S. Islam and M. Sternberg. (1987). Analysis of the relationship between side-chain conformation and secondary structure in globular proteins. J. Mol. Biol. 198, 295-310.
F. Melo and E. Feytmans. (1997). Novel knowledge-based mean force potential at atomic level. J. Mol. Biol. 267, 207-222.
F. Melo and E. Feytmans. (1998). Assessing protein structures with a non local atomic interaction energy. J. Mol. Biol. 277, 1141-1152.
V. Morea, A. Tramontano, M. Rustici, C. Chothia and A. Lesk. (1998). Conformations of the third hypervariable region in the VH domain of immunoglobulins. J. Mol. Biol. 275, 265-294.
B. Morgenstern. (1999). Dialign2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics 15, 211-218.
J. Moult and M. James. (1986). An algorithm for determining the conformation of polypeptide segments in proteins by systematic search. Proteins: Struct. Funct. Genet. 1, 156-163.
N. Nakajima, J. Higo and A. Kidera. (2000). Free energy landscapes of peptides by enhanced conformational sampling. J. Mol. Biol. 296, 197-216.
C. Notredame, D. Higgins and J. Heringa. (2000). T-Coffee: A novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302, 205-217.
T. Oldfield. (1992). Squid: a program for the analysis and display of data from crystallography and molecular dynamics. J. Mol. Graph. 10, 247-252.
B. Oliva, P. Bates, E. Querol, F. Avilés and M. Sternberg. (1997). An automatic Classification of the structure of protein loops. J. Mol. Biol. 266, 814-830.
B. Oliva, P. Bates, E. Querol, F. Avilés and M. Sternberg. (1998). Automated Classification of Antibody Complementarity Determining Region 3 of the Heavy Chain (H3) Loops into Canonical Forms and Its Application to Protein Structure Prediction. J. Mol. Biol. 279, 1193-1210.
O. Olmea, B. Rost and A. Valencia. (1999). Effective use of sequence correlation and conservation in fold recognition. J. Mol. Biol. 293, 1221-1239.
A. Panchenko, A. Marchler-Bauer and S. H. Bryant. (2000). Combination of threading potentials and sequence profiles improves fold recognition. J. Mol. Biol. 296, 1319-1331.
K. Pawlowski, A. Bierzynski and A. Godzik. (1996). Structural diversity in a family of homologous proteins. J. Mol. Biol. 258, 349-366.
W. Pearson. (1996). Effective protein sequence comparison. Methods Enzymol. 266, 227-258.
W. Pearson and D. Lipman. (1988). Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85, 2444-2448.
R. Petrella, T. Lazaridis and M. Karplus. (1998). Protein sidechain conformer prediction: a test of the energy function. Folding and Design 3, 353-377.
C. Rapp and R. Friesner. (1999). Prediction of loop geometries using a generalized Born model of solvation effect. Proteins: Struct. Funct. Genet. 35, 173-183.
C. Ring and F. Cohen. (1994). Conformational sampling of loop structures using genetic algorithm. Isr. J. Chem. 34, 245-252.
D. Rosenbach and R. Rosenfeld. (1995). Simultaneous modeling of multiple loops in proteins. Protein Sci. 4, 496-505.
B. Rost. (1999). Twilight zone of proteins sequence alignments. Protein Eng. 12, 85-94.
S. Rufino, L. Donate, L. Canard and T. Blundell. (1997). Predicting the Conformational Class of Short and Medium Size Loops Connecting Regular Secondary Structures: Application to Comparative Modelling. J. Mol. Biol. 267, 352-367.
R. Russell, M. Saqi, R. Sayle, P. Bates and M. Sternberg. (1997). Recognition of analogous and homologous protein folds: analysis of sequence and structure conservation. J. Mol. Biol. 269, 423-439.
R. Russell, P. Sasieni and M. Sternberg. (1998). Supersites within superfolds. Binding site similarity in the absence of homology. J. Mol. Biol. 282, 903-918.
L. Rychlewski, L. Jaroszewski, L. Weizhong and A. Godzik. (2000). Comparison of sequence profiles. Structural prediction with no structure information. Protein Sci. 8, 232-241.
G. Salem, E. Hutchinson, C. Orengo and J. Thornton. (1999). Correlation of observed Fold frequency with the occurrence of local structural motifs. J. Mol. Biol. 287, 969-981.
A. Sali and T. Blundell. (1993). Comparative protein modeling by satisfaction of spatial restraints. J. Mol. Biol. 234, 779-815.
R. Sánchez, U. Pieper, F. Melo, N. Eswar, M. Martí-Renom, M. Madhusudhan, N. Mirkovic and A. Sali. (2000). Protein Structure Modeling for Structural Genomics. Nature Struct. Biol. Suppl. November, 986-990.
R. Sánchez and A. Sali. (1997). Advances in comparative protein structure modeling. Curr. Opin. Struct. Biol. 7, 206-214.
R. Sánchez and A. Sali. (1997). Evaluation of comparative protein structure modeling by MODELLER-3. Proteins: Struct. Funct. Genet. Suppl. 1, 50-58.
M. Saqi, R. Russell and M. Sternberg. (1999). Misleading local sequence alignment: implications for comparative modelling. Protein Eng. 11, 627-630.
J. Sauder, J. Arthur and R. Dunbrack. (2000). Large-scale comparison of protein sequence alignment algorithms with structure alignments. Proteins: Struct. Funct. Genet. 40, 6-22.
P. Shenkin, D. Yarmush, R. Fine, H. Wang and C. Levinthal. (1987). Predicting antibody hypervariable loop conformation: I. Ensembles of random conformation for ring-like structures. Biopolymers 26, 2053-2085.
H. Shirai, A. Kidera and H. Nakamura. (1999). H3-rules: identification of CDR-H3 structures in antibodies. FEBS Letters 455, 188-197.
M. Sippl. (1993). Recognition of errors in three-dimensional structures of proteins. Proteins: Struct. Funct. Genet. 17, 355-362.
K. Smith and B. Honig. (1994). Evaluation of the conformational free energies of loops in proteins. Proteins: Struct. Funct. Genet. 18, 119-132.
T. Smith and M. Waterman. (1981). Identification of common molecular subsequences. J. Mol. Biol. 147, 195-197.
M. Sternberg, P. Bates, L. Kelley and R. MacCallum. (1999). Progress in proteins structure prediction: assessment of CASP3. Curr. Opin. Struct. Biol. 9, 368-373.
M. Sutcliffe, F. Hayes and T. Blundell. (1987). Knowledge-based modeling of homologous proteins, part II: rules for the conformations of substituted side-chains. Protein Eng. 1, 385-392.
M. Sutcliffe, F. Hayes, D. Carney and T. Blundell. (1987). Knowledge-based modeling of homologous proteins, part I. Three dimensional frameworks derived from the simultaneous superposition of multiple structure. Protein Eng. 1, 377-384.
W. Taylor. (1988). A flexible method to align large numbers of biological sequences. J. Mol. Evol. 28, 161-169.
S. Teichmann, C. Chothia, G. Church and J. Park. (2000). Fast assignments of protein structures to sequences using the intermediate sequence library. Bioinformatics 16, 117-124.
J. Thompson, D. Higgins and T. Gibson. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673-4680.
J. Thompson, F. Plewianiak and O. Poch. (1999). BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics 15, 87-88.
J. Thompson, F. Plewianiak and O. Poch. (1999). A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res. 27, 2682-2690.
J. Thompson, F. Plewianiak, J. Thierry and O. Poch. (2000). DbClustal: rapid and reliable global multiple alignments of protein sequence detected by database searches. Nucleic Acids Res. 28, 2919-2926.
C. Topham, N. Srinivasan, C. Thorpe, J. Overington and N. Kalsheker. (1994). Comparative modeling of major house dust mite allergen der p I: structure validation using an extended environmental amino acid propensity table. Protein Eng. 7, 869-894.
A. Torda. (1997). Perspectives in protein fold recognition. Curr. Opin. Struct. Biol. 7, 200-205.
A. Tramontano, C. Chothia and A. Lesk. (1989). Structural determinants of the conformations of medium sized loops in proteins. Proteins: Struct. Funct. Genet. 6, 382-394.
S. Vajda and C. DeLisi. (1990). Determining minimum energy conformations of polypeptides by dynamic programming. Biopolymers 29, 1755-1772.
M. Vasquez. (1996). Modeling side-chain conformation. Curr. Opin. Struct. Biol. 6, 217-221.
H. W. van Vlijmen and M. Karplus. (1997). PDB-based protein loop prediction: parameters for selection and methods for optimization. J. Mol. Biol. 267, 975-1001.
J. Wojcik, J. Mornon and J. Chomilier. (1999). New efficient statistical sequence-dependent structure prediction of short to medium-sized protein loops based on an exhaustive loop classification. J. Mol. Biol. 289, 1469-1490.