Difference between revisions of "Homology Modelling GLA"
Revision as of 16:37, 15 June 2011
by Benjamin Drexler and Fabian Grandke
Introduction
In this task, we performed homology modelling of the protein α-galactosidase A with the programs MODELLER, SWISS-MODEL, iTasser and 3D-JIGSAW. Homology modelling relies on two assumptions. First, the structure of a protein is determined by its amino acid sequence. Second, the structure of a protein is more conserved than its amino acid sequence. Usually, one performs homology modelling for a protein whose structure is not known. In this case, several PDB structures of α-galactosidase A are available, so we can evaluate the models produced by the programs afterwards.
Calculation of Models
Available Homologous Structures
The following table lists the ten best hits of the HHpred search of Task 1. We used 3HG3 (97% identity), 1KTB (53%) and 3CC1 (26%) as templates for the modelling process. This selection covers a wide range of sequence identities, so we can evaluate how sequence identity influences the quality of the models.
PDB-ID | Name | Probability | E-value | P-value | Identity | Template |
---|---|---|---|---|---|---|
> 60% sequence identity | ||||||
3hg3_A | Alpha-galactosidase A | 1.0 | 0 | 0 | 97% | x |
> 40% sequence identity | ||||||
1ktb_A | Alpha-N-acetylgalactosaminidase | 1.0 | 0 | 0 | 53% | x |
< 40% sequence identity | ||||||
1uas_A | Alpha-galactosidase | 1.0 | 0 | 0 | 39% | |
3lrk_A | Alpha-galactosidase 1 | 1.0 | 0 | 0 | 32% | |
3a5v_A | Alpha-galactosidase | 1.0 | 0 | 0 | 35% | |
1szn_A | Alpha-galactosidase | 1.0 | 0 | 0 | 34% | |
3a21_A | Putative secreted alpha-galactosidase | 1.0 | 0 | 0 | 34% | |
3cc1_A | BH1870 protein | 1.0 | 0 | 0 | 26% | x |
3a24_A | Alpha-galactosidase | 1.0 | 0 | 0 | 14% | |
1zy9_A | Alpha-galactosidase | 1.0 | 2.2E-37 | 8.8E-42 | 14% | |
MODELLER
We used MODELLER as described in the tutorial Using Modeller for TASK 4.
Pairwise Alignments
In this section, we used a pairwise alignment between each template (3HG3, 1KTB or 3CC1) and the target as the input for MODELLER.
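MODELLER reads alignments in PIR format, where each entry consists of a `>P1;code` line, a header line with ten colon-separated fields, and the aligned sequence terminated by `*`. The following is a minimal sketch of how such an input file could be written for a template/target pair. The function name `pir_entry`, the codes `tmplA` and `target`, the residue numbers and the toy sequences are all hypothetical placeholders and not the data used in this task; the exact field conventions should be checked against the MODELLER manual.

```python
def pir_entry(code, aligned_seq, kind="sequence",
              pdb_code="", first_res="", chain="", last_res=""):
    """Return one PIR-style alignment entry as a string (illustrative helper)."""
    # The header line has ten colon-separated fields; unused ones stay empty.
    fields = [kind, pdb_code or code, first_res, chain, last_res, chain,
              "", "", "", ""]
    seq = aligned_seq + "*"  # '*' terminates the sequence
    # Break the sequence into lines of at most 60 characters.
    lines = [seq[i:i + 60] for i in range(0, len(seq), 60)]
    return ">P1;{}\n{}\n{}\n".format(code, ":".join(fields), "\n".join(lines))


# Toy example with made-up sequences and residue numbers (not the real GLA data).
template = pir_entry("tmplA", "ACDEFGHIK-LMN", kind="structureX",
                     pdb_code="tmpl", first_res="1", chain="A", last_res="12")
target = pir_entry("target", "ACDE-GHIKQLMN")
print(template + target)
```

Both entries must have the same aligned length, with `-` marking gaps, so that every column of the template corresponds to one column of the target.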
Evaluation
Template | TMS (apo, 1R46) | RMSD (apo, 1R46) | RMSD catalytic site (apo, 1R46) | TMS (complexed, 1R47) | RMSD (complexed, 1R47) | RMSD catalytic site (complexed, 1R47) |
---|---|---|---|---|---|---|
3HG3 | 0.141 | 0.498 | 0.326 | 0.1413 | 0.512 | 0.366 |
1KTB | 0.140 | 0.901 | 0.439 | 0.1413 | 0.888 | 0.437 |
3CC1 | 0.1397 | 2.864 | 3.436 | 0.1397 | 2.853 | 3.405 |
Multiple Sequence Alignments
PDB-ID | Unsupervised | Supervised | Identity | Comment |
---|---|---|---|---|
3LX9_A | | | 99% | |
3GXP_A | | | 99% | |
3H53_A | | | 99% | |
3HG3_A | | | 97% | |
3IGU_A | | | 54% | |
1KTB_A | | | 53% | |
1UAS_A | | | 39% | |
3LRK_A | | | 34% | Removed because of low sequence identity; it caused large gaps in the alignment. |
3CC1_A | | | 28% | Removed because of low sequence identity; it caused large gaps in the alignment. |
iTasser
Figure 1 shows that iTasser takes an amino acid sequence as input and tries to retrieve template proteins from the PDB. In the next step, fragments from the templates are reassembled into a complete model. In the last step, the model is reassembled again, taking energy calculations into account. Additionally, a biological function prediction is performed, but that was not of interest for this task.<ref name=itasser1>http://zhanglab.ccmb.med.umich.edu/I-TASSER/about.html</ref>
We used the iTasser server in two different ways:
- Standard parameters: the protein sequence is given as input and the program searches the PDB for templates. The proteins found are used to create a template for predicting the structure.
- PDB-ID as input: a template PDB-ID is given as input together with the amino acid sequence. The program takes all available information into account and uses it to calculate the structure.
Because the iTasser server has very limited capacity and accepts only one job submission at a time, the results of the second approach are not yet available. The standalone version is not an option, because it is about 10 GB in size and does not work properly.
SWISS-MODEL
We used the SWISS-MODEL server with two different options:
- Automated Mode: A template sequence is given as input. As no further information is given, the model is created directly from the amino acid sequence. This mode should only be used if the sequence identity between target and template is greater than 50%.
- Aligned Mode: A pairwise alignment of the template and target sequences is given as input. We created our alignments with the online version of ClustalW2 from EBI.
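The 50% rule of thumb for the Automated Mode can be checked directly on a pairwise alignment. The sketch below computes the percent identity of two aligned sequences and suggests a mode accordingly. It assumes that identity is counted over the columns in which neither sequence has a gap, which is only one of several common definitions; the function names and the toy sequences are illustrative and not part of SWISS-MODEL or ClustalW2.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns in which both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def suggest_mode(identity, threshold=50.0):
    """Apply the rule of thumb: automated mode only above the threshold."""
    return "automated" if identity > threshold else "aligned"


# Toy alignment (made-up sequences, not the GLA alignment).
ident = percent_identity("ACDEFG-HIK", "ACDQFGWHLK")
print(round(ident, 1), suggest_mode(ident))
```

With this definition, the identity of a template such as 3cc1_A (26%) would fall below the threshold, so the Aligned Mode would be suggested.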
The following sequences were selected:
Template | Identity | Z-score (Automated Mode) | Z-score (Aligned Mode) |
---|---|---|---|
3hg3_A | 97% | 0 | -0.415 |
1ktb_A | 53% | 0 | -2.261 |
3cc1_A | 26% | 0 | Error¹ (model: NA) |
¹The sequences are too different to create a useful model from the alignment. In the automated mode, the template itself was used as the model, which is not useful because the sequences have only 26% identity. The aligned mode aborted with the following message:
Aborting: too many unfruitful attempts to rebuild a loop.
This is likely to indicate a misalignment in this region
Use Swiss-PdbViewer to adjust your alignment in this
region and resubmit an optimise mode modelling request.
Evaluation of Models
MODELLER
Numeric Evaluation
Template | TMS (apo, 1R46) | RMSD (apo, 1R46) | RMSD catalytic site (apo, 1R46) | TMS (complexed, 1R47) | RMSD (complexed, 1R47) | RMSD catalytic site (complexed, 1R47) |
---|---|---|---|---|---|---|
3HG3 | 0.141 | 0.498 | 0.326 | 0.1413 | 0.512 | 0.366 |
1KTB | 0.140 | 0.901 | 0.439 | 0.1413 | 0.888 | 0.437 |
3CC1 | 0.1397 | 2.864 | 3.436 | 0.1397 | 2.853 | 3.405 |
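To make the two measures in the table concrete, the following sketch computes an RMSD and a TM-score from per-residue distances between a model and a reference structure. It assumes that TMS denotes the TM-score, that the two structures have already been superposed, and that the distances between aligned Cα atoms are given as a plain list; the function names are illustrative. The TM-score uses the standard length-dependent scale d0 = 1.24 (L − 15)^(1/3) − 1.8, ranges from 0 to 1, and equals 1 for a perfect match over the full target length L.

```python
import math


def rmsd(distances):
    """Root-mean-square deviation over the aligned residue pairs."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_score(distances, target_length):
    """TM-score for pre-superposed structures, normalised by target length."""
    # This sketch restricts itself to longer chains, where d0 is positive.
    if target_length <= 21:
        raise ValueError("sketch only covers target_length > 21")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    # Residues of the target without an aligned partner contribute zero.
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Toy input: 100 aligned residue pairs, all 0.5 apart (made-up values).
toy = [0.5] * 100
print(round(rmsd(toy), 3), round(tm_score(toy, 100), 3))
```

Unlike the RMSD, the TM-score down-weights large local deviations and is normalised by the target length, so it is less sensitive to a few badly placed loops.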
Comparison to Experimental Structure
iTasser
Numeric Evaluation
Comparison to Experimental Structure
SWISS-MODEL
Template 3HG3
Numeric Evaluation
Mode | TMS (apo, 1R46) | RMSD (apo, 1R46) | RMSD catalytic site (apo, 1R46) | TMS (complexed, 1R47) | RMSD (complexed, 1R47) | RMSD catalytic site (complexed, 1R47) |
---|---|---|---|---|---|---|
Aligned | 0.1411 | 0.485 | 0.279 | 0.1412 | 0.489 | 0.290 |
Automated | 0.1411 | 0.485 | 0.277 | 0.1412 | 0.489 | 0.291 |
Comparison to Experimental Structure
Template 1KTB
Numeric Evaluation
Mode | TMS (apo, 1R46) | RMSD (apo, 1R46) | RMSD catalytic site (apo, 1R46) | TMS (complexed, 1R47) | RMSD (complexed, 1R47) | RMSD catalytic site (complexed, 1R47) |
---|---|---|---|---|---|---|
Aligned | 0.1598 | 0.943 | 5.073 | 0.1606 | 0.932 | 6.409 |
Automated | 0.1361 | 0.981 | 0.417 | 0.1368 | 0.974 | 0.404 |
Comparison to Experimental Structure
Template 3CC1
Numeric Evaluation
Mode | TMS (apo, 1R46) | RMSD (apo, 1R46) | RMSD catalytic site (apo, 1R46) | TMS (complexed, 1R47) | RMSD (complexed, 1R47) | RMSD catalytic site (complexed, 1R47) |
---|---|---|---|---|---|---|
Aligned | 0.1302 | 3.279 | 7.107 | 0.1300 | 3.802 | 7.357 |
Automated | N/A | N/A | N/A | N/A | N/A | N/A |
Comparison to Experimental Structure
References
<references />