

 

 

 

Homology Modeling

The following document gives some in-depth information about homology modeling. A modeling tutorial using DS Modeling (Accelrys) can be found here.

STEP 1: Fold assignment

To start the modeling process, we have to identify the template and define an alignment (residue-by-residue equivalences between the target and the template sequences). In homology modeling the stretches to be built are chosen according to their sequence alignment; consequently, this is the most crucial step in the modeling process. Any errors at this stage are usually impossible to correct later. The sequences of the fold with the greatest similarity to the target sequence are taken as parents or templates. Currently, around 40% of all protein sequences can have at least one domain modeled on a related known protein structure. In particular, some proteins can have very low sequence identity and yet all share the same fold and a closely related function. The current theory of evolution holds that such structures, having diverged from a common ancestor, often retain some functional and sequence similarity. In addition, divergent evolution has recently been reported, on the basis of the evolution of a biochemical pathway, for some proteins with a common (βα)8 barrel fold for which sequence similarity was not detected.
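The two quantities this paragraph relies on, the residue-by-residue equivalences and the sequence identity, can both be read directly from a gapped pairwise alignment. The following is a minimal sketch in Python, assuming the alignment is given as two equal-length strings with "-" marking gaps; the function names and the example sequences are hypothetical and only serve as illustration.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percentage of identical residues over the columns where both sequences have a residue."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # gapped columns define no equivalence
        aligned += 1
        if a.upper() == b.upper():
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


def equivalences(aligned_target, aligned_template, gap="-"):
    """Return 1-based (target_residue, template_residue) pairs for every aligned column."""
    pairs = []
    i = j = 0  # running residue counters in target and template
    for a, b in zip(aligned_target, aligned_template):
        if a != gap:
            i += 1
        if b != gap:
            j += 1
        if a != gap and b != gap:
            pairs.append((i, j))
    return pairs


# Made-up example alignment (not real protein data)
target = "MKT-AYIAK"
template = "MRTQAY-AK"
print(f"{percent_identity(target, template):.1f}% identity")
print(equivalences(target, template))
```

Note that identity is computed here over aligned (non-gap) columns only; other conventions, such as dividing by the length of the shorter sequence, are also in use and give different numbers for the same alignment.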

Originally, searches for sequences homologous to the target were done with local alignment programs such as FASTA, SSEARCH or BLAST, which are able to find identities shared between pairs of related sequences. With the high rate at which new sequences become available from genomic initiatives, the importance of sensitive methods for recognizing distant homologies has increased. Such methods are the main source of annotation; hence, in the last decade very sensitive approaches have been developed to recognize folds. They have succeeded to different degrees in identifying relationships between remote homologues. These methods include:

    1) Threading approaches, which evaluate the compatibility between the target sequence and a given structural template.

    2) Advanced sequence comparison procedures that take into account multiple sequence alignments with a position-specific scoring system, provided either by a coherent theory for profile methods using machine-learning probabilistic models (Hidden Markov Models), by position-specific iterative BLAST (PSI-BLAST), or by searching in sequence space using intermediate sequences (ISS). These methods were shown to give better results than simple threading.

    3) Finally, new approaches incorporating sequence profiles and knowledge-based threading potentials, which have improved the recognition of remote homologues.

Moreover, any additional information about the structure can improve on recognition by sequence alone. For example, secondary structure prediction can help to validate the alignment and the identification of related proteins with divergent sequences, and it increases the number of potential templates. Recent studies comparing and evaluating searching/aligning methods showed that, for an E-value set to 10, the percentage of true positives (similar 3D structure) ranged from 64.7% (SSEARCH) to 96.1% (BLAST), whereas the percentage of false positives ranged from 35.3% to 3.9%. On the other hand, the well-known position-specific alignment method PSI-BLAST succeeded in finding remote structural homologues in 21% of 246 searches. In general, PSI-BLAST correctly aligns 40% of the residues when the sequence identity is larger than 15%. Consequently, PSI-BLAST is acknowledged as one of the most powerful tools for detecting remote evolutionary relationships from sequence considerations only. The reasons for the success of profile methods are the following:
    1) They use multiple alignment information, and hence a larger amount of information than single sequences. The procedure is based on the hypothesis that sequences related by a common ancestor have to preserve the residues that are important for the function, for the fold, or for both. Therefore, these specific residues have to be shared by all sequences at the same position in the multiple alignment of the related members. Consequently, position-specific residues are given a high weight in the alignment scoring, which in turn is obtained by means of a matrix of weights derived from the sequences found with the highest probability of being related to the query.

    2) They exploit the transitivity of homology, like the intermediate sequence search, in which a query sequence is aligned to a database (e.g. SWISS-PROT). Then, all aligned sequences with highly significant similarity (E-values < 0.001) are used as new seeds, and this is iterated until no new sequences are found. This procedure implies a larger search than that obtained by a single sequence search.
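The iteration described in point 2 can be sketched as a simple loop over seeds. This is only an illustration of the idea, not the code of PSI-BLAST or of any ISS program: the `search` callable, the `TOY_HITS` table and the sequence names A, B, C, D and X are hypothetical stand-ins for a real database search that returns hits with their E-values.

```python
from typing import Callable, Iterable, Set, Tuple

Hit = Tuple[str, float]  # (sequence identifier, E-value)


def iterative_search(query: str,
                     search: Callable[[str], Iterable[Hit]],
                     e_cutoff: float = 0.001,
                     max_rounds: int = 20) -> Set[str]:
    """Collect all sequences reachable from the query through significant hits."""
    found = {query}
    seeds = [query]
    rounds = 0
    # Stop when a round yields no new sequences (or at the safety limit)
    while seeds and rounds < max_rounds:
        new_seeds = []
        for seed in seeds:
            for hit_id, e_value in search(seed):
                if e_value < e_cutoff and hit_id not in found:
                    found.add(hit_id)
                    new_seeds.append(hit_id)  # significant hits seed the next round
        seeds = new_seeds
        rounds += 1
    return found


# Hypothetical precomputed search results standing in for a real database search
TOY_HITS = {
    "A": [("B", 1e-10), ("X", 5.0)],
    "B": [("A", 1e-10), ("C", 1e-5)],
    "C": [("B", 1e-5), ("D", 0.5)],
    "D": [("C", 0.5)],
    "X": [],
}


def toy_search(seq_id: str) -> Iterable[Hit]:
    return TOY_HITS.get(seq_id, [])


print(sorted(iterative_search("A", toy_search)))
```

In this toy data a single search with A finds only B at a significant E-value, whereas the iterated search also reaches C through the intermediate sequence B. D and X are never accepted because their E-values are above the cutoff.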

The difference between profile methods and ISS is that, in profile methods, the sequences with high similarity are aligned and the resulting profile is used in the next search. The distribution of local alignment scores of random sequences is used to determine the significance of the alignment, which is the crucial step in finding the next related sequences. Going further, Rychlewski et al. developed a new procedure based on profile-profile searches (FFAS) that, according to the authors, gave better results than PSI-BLAST, being more sensitive and accurate owing to the use of the Smith-Waterman dynamic programming routine to obtain the optimal alignment.
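For readers unfamiliar with the Smith-Waterman routine mentioned above, the following is a minimal sketch of it for two plain sequences. It is not the FFAS implementation, which compares profiles rather than sequences. The scoring values (match +2, mismatch -1, linear gap penalty -2) are assumptions chosen for readability; real programs use substitution matrices and usually affine gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned part of a, aligned part of b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # dynamic programming matrix
    best, best_pos = 0, (0, 0)

    # Fill the matrix; scores are never allowed to drop below zero
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,    # gap in b
                          H[i][j - 1] + gap)    # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Made-up sequences sharing the segment ACGT
print(smith_waterman("GGGACGTCCC", "TTACGTAA"))
```

Because every cell is floored at zero, the alignment can start and end anywhere in either sequence, which is what makes the method local rather than global.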



     

 


This site is maintained by Arzhang Fallahi
Last Updated: August 2, 2004