Difference between revisions of "Canavan Disease: Task 05 - Homology Modelling"

From Bioinformatikpedia
Line 1: Line 1:
  +
'''Homology Modelling''' is also a very important step: Since not always a structure to the protein of interest is known, models can help understanding the protein. Even SNPs in the sequence can make a difference in those models. Let's investigate!
why we do this...
 
 
== [[Canavan_Disease:_Task_05_-_Journal|LabJournal]] ==
 
== [[Canavan_Disease:_Task_05_-_Journal|LabJournal]] ==
   
Line 40: Line 40:
 
|align="center"|
 
|align="center"|
 
<figure id="2O53_md">
 
<figure id="2O53_md">
[[Image:2O53 md.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2O53''') and the generated model in pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue. </caption>]]
+
[[Image:2O53 md.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2O53''') and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue. </caption>]]
 
</figure>
 
</figure>
 
|align="center"|
 
|align="center"|
 
<figure id="2GU2_md">
 
<figure id="2GU2_md">
[[Image:2GU2 md.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2GU2''') and the generated model in pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
+
[[Image:2GU2 md.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2GU2''') and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
 
</figure>
 
</figure>
 
|-
 
|-
 
|}
 
|}
   
Taking a look ate the model generated for 2O4H with the aid of 2QJ8 and 1WY4 as template which have both a sequence similarity below 20%, the results are sill very good. There are visible differences between the target and the models like larger loop regions or secondary structure elements with conformations that are slightly miss predicted. However if the aligned target and models are compared to their original template there is a big difference detectable (compare '''<xr id="2QJ8_md">Figure</xr>''' and '''<xr id="1YW4_md">Figure</xr>''').
+
Taking a look at the model generated for 2O4H with the aid of 2QJ8 and 1WY4 as template which have both a sequence similarity below 20%, the results are sill very good. There are visible differences between the target and the models like larger loop regions or secondary structure elements with conformations that are slightly miss predicted. However if the aligned target and models are compared to their original template there is a big difference detectable (compare '''<xr id="2QJ8_md">Figure</xr>''' and '''<xr id="1YW4_md">Figure</xr>''').
   
 
{| border="0" cellpadding="5" cellspacing="0" align="center"
 
{| border="0" cellpadding="5" cellspacing="0" align="center"
Line 55: Line 55:
 
|align="center"|
 
|align="center"|
 
<figure id="2QJ8_md">
 
<figure id="2QJ8_md">
[[Image:2QJ8 md.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2QJ8''') and the generated model in pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
+
[[Image:2QJ8 md.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2QJ8''') and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
 
</figure>
 
</figure>
 
|align="center"|
 
|align="center"|
 
<figure id="1YW4_md">
 
<figure id="1YW4_md">
[[Image:1YW4 md.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''1YW4''') and the generated model in pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
+
[[Image:1YW4 md.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''1YW4''') and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
 
</figure>
 
</figure>
 
|-
 
|-
Line 71: Line 71:
 
|align="center"|
 
|align="center"|
 
<figure id="2O53_sm">
 
<figure id="2O53_sm">
[[Image:2O53 sm.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2O53''') and the generated model in pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
+
[[Image:2O53 sm.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2O53''') and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
 
</figure>
 
</figure>
 
|align="center"|
 
|align="center"|
 
<figure id="2GU2_sm">
 
<figure id="2GU2_sm">
[[Image:2GU2 sm.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2GU2''') and the generated model in pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
+
[[Image:2GU2 sm.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2GU2''') and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
 
</figure>
 
</figure>
 
|-
 
|-
Line 84: Line 84:
   
 
<figure id="2QJ8_sm">
 
<figure id="2QJ8_sm">
[[Image:2QJ8 sm.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2QJ8''') and the generated model in pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
+
[[Image:2QJ8 sm.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2QJ8''') and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
 
</figure>
 
</figure>
   
 
=== iTasser ===
 
=== iTasser ===
iTasser continues to show the trend that is already observable if looking at the results of Modeller and Swissmodel. Taking a look at the models produced out of the high sequence similarity templates, it is visible that iTasser creates accurate models for the target sequence (see '''<xr id="2O53_it">Figure</xr>''' and '''<xr id="2GU2_it">Figure</xr>'''). However if the visualization in pymol is compared to the result Modleler and Swissmodel created iTasser seems to be a little less precise. The spacial orientation of the secondary structure elements seem a bit off compared to the models created by the two other methods.
+
iTasser continues to show the trend that is already observable if looking at the results of Modeller and Swissmodel. Taking a look at the models produced out of the high sequence similarity templates, it is visible that iTasser creates accurate models for the target sequence (see '''<xr id="2O53_it">Figure</xr>''' and '''<xr id="2GU2_it">Figure</xr>'''). However if the visualization in Pymol is compared to the result Modleler and Swissmodel created iTasser seems to be a little less precise. The spacial orientation of the secondary structure elements seem a bit off compared to the models created by the two other methods.
   
 
{| border="0" cellpadding="5" cellspacing="0" align="center"
 
{| border="0" cellpadding="5" cellspacing="0" align="center"
Line 94: Line 94:
 
|align="center"|
 
|align="center"|
 
<figure id="2O53_it">
 
<figure id="2O53_it">
[[Image:2O53 it.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2O53''') and the generated model in pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
+
[[Image:2O53 it.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2O53''') and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
 
</figure>
 
</figure>
 
|align="center"|
 
|align="center"|
 
<figure id="2GU2_it">
 
<figure id="2GU2_it">
[[Image:2GU2 it.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2GU2''') and the generated model in pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
+
[[Image:2GU2 it.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2GU2''') and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
 
</figure>
 
</figure>
 
|-
 
|-
Line 109: Line 109:
 
|align="center"|
 
|align="center"|
 
<figure id="2QJ8_it">
 
<figure id="2QJ8_it">
[[Image:2QJ8 it.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2QJ8''') and the generated model in pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
+
[[Image:2QJ8 it.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''2QJ8''') and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
 
</figure>
 
</figure>
 
|align="center"|
 
|align="center"|
 
<figure id="1YW4_it">
 
<figure id="1YW4_it">
[[Image:1YW4 it.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''1YW4''') and the generated model in pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
+
[[Image:1YW4 it.png|centre|thumb|500px|'''<caption>'''Representation of the target ('''2O4H'''), the template ('''1YW4''') and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.</caption>]]
 
</figure>
 
</figure>
 
|-
 
|-
Line 122: Line 122:
 
== Model evaluation ==
 
== Model evaluation ==
   
Comparing the models and methods that created the models, the GDT score and RMSD was calculated for each computed model. As displayed in <xr id="model">Table</xr> the GDT scores are a better measure than the RMSD to grade the models. What can be observed is that the models created from a template with high sequence similarity to the target have a high GDT score. With decreasing sequence similarity the GDT score rapidly decreases. This can not be observed using the RMSD scores. The RMSD scores are overall good with an exception to iTasser, although the models do observably differ in quality. The ranking in performance of the modeling algorithms that had been concluded from the visual examination by using pymol can be confirmed taking the GDT scores into consideration. Moddeler and Swissmodel generate very good models especially if the template has a high sequence similarity to the target. iTasser disappoints in terms of both model quality and amount of computing time needed to create the model.
+
Comparing the models and methods that created the models, the GDT score and RMSD was calculated for each computed model. As displayed in <xr id="model">Table</xr> the GDT scores are a better measure than the RMSD to grade the models. What can be observed is that the models created from a template with high sequence similarity to the target have a high GDT score. With decreasing sequence similarity the GDT score rapidly decreases. This can not be observed using the RMSD scores. The RMSD scores are overall good with an exception to iTasser, although the models do observably differ in quality. The ranking in performance of the modeling algorithms that had been concluded from the visual examination by using Pymol can be confirmed taking the GDT scores into consideration. Modeller and Swissmodel generate very good models especially if the template has a high sequence similarity to the target. iTasser disappoints in terms of both model quality and amount of computing time needed to create the model.
   
 
<figtable id="model">
 
<figtable id="model">
Line 188: Line 188:
 
* Link to Task 09: [[Canavan_Disease:_Task_09_-_Structure-based_Mutation_Analysis|Structure-based Mutation Analysis]]
 
* Link to Task 09: [[Canavan_Disease:_Task_09_-_Structure-based_Mutation_Analysis|Structure-based Mutation Analysis]]
 
* Link to Task 10: [[Canavan_Disease:_Task_10_-_Normal_Mode_Analysis|Normal Mode Analysis]]
 
* Link to Task 10: [[Canavan_Disease:_Task_10_-_Normal_Mode_Analysis|Normal Mode Analysis]]
* Link to Task 11: [[Canavan_Disease:_Task_11_-_Molecular_Dynamics_Simulation|Molecular Dynamics Simulation]]
 

Revision as of 10:47, 29 August 2013

Homology modelling is an important step: since a structure of the protein of interest is not always known, models can help in understanding the protein. Even SNPs in the sequence can make a difference in those models. Let's investigate!

LabJournal

Dataset

The models are calculated with three different modellers: Modeller, SwissModel and iTasser. To compare the modellers, two template structures were chosen per sequence similarity class:

<figtable id="dataset">

{| border="1" cellpadding="5" cellspacing="0" align="center"
|+ Dataset composition
! PDB-id !! Description !! Criterion
|-
| 2O4H || ASPA from human with bound N-phosphonomethyl-L-aspartate || reference structure
|-
| 2O53 || Crystal structure of apo-aspartoacylase from human brain || sequence identity 100%
|-
| 2GU2 || ASPA from rat || sequence identity 84%
|-
| 2QJ8 || ASPA family protein from ''Mesorhizobium loti'' || sequence identity 16%
|-
| 1YW4 || Succinylglutamate desuccinylase from ''Chromobacterium violaceum'' || sequence identity 14%
|}
Overview of the dataset composition for Task 05, containing a brief description of the chosen structures and their sequence identity to the reference ACY2 protein.

</figtable>
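The sequence identity values in the table come from pairwise alignments. The following is a minimal sketch of how such a percentage can be computed from two already aligned sequences. The function name and the toy sequences are made up for illustration, and the convention used here (identical residues divided by the number of columns without gaps) is only one of several in use; the source does not state which one was applied.

```python
def sequence_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy alignment (hypothetical sequences, not taken from the dataset)
target = "MKTAYIAK-QR"
template = "MKSAYLAKGQR"
print(sequence_identity(target, template))  # 8 of 10 compared columns match -> 80.0
```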

Model creation

Each modelling algorithm was used to produce models for 2O4H based on four different template proteins. These models are examined in the following sections, with the exception of the Swissmodel model based on 1YW4, which Swissmodel was not able to produce.

Modeller

Modeller produced extremely accurate models for the target protein given templates with a high sequence similarity. Both 2O53 and 2GU2 are already highly similar in structure to 2O4H when compared visually. A structural alignment of each template structure to 2O4H results in an RMSD below 1 Å. The generated models are therefore expected to be very accurate, and this is exactly what can be observed (see <xr id="2O53_md">Figure</xr> and <xr id="2GU2_md">Figure</xr>).
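The RMSD mentioned above is the root mean square deviation between matched atom positions. As a minimal sketch, assuming the two structures are already superposed and that the C-alpha coordinates are given as equally long lists of (x, y, z) tuples, it can be computed as follows. The function name and the coordinates are hypothetical, and the superposition step itself (for example the alignment done in Pymol) is not shown.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD in Angstrom between two equally long lists of (x, y, z) positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(coords_a, coords_b):
        squared_sum += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every model atom is displaced by 0.5 Angstrom along y
target_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_ca = [(0.0, 0.5, 0.0), (3.8, 0.5, 0.0), (7.6, 0.5, 0.0)]
print(ca_rmsd(target_ca, model_ca))  # 0.5
```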

</figure>

</figure>

<figure id="2O53_md">

Representation of the target (2O4H), the template (2O53) and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.

<figure id="2GU2_md">

Representation of the target (2O4H), the template (2GU2) and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.

Looking at the models generated for 2O4H with 2QJ8 and 1YW4 as templates, both of which have a sequence similarity below 20%, the results are still very good. There are visible differences between the target and the models, such as larger loop regions or secondary structure elements whose conformations are slightly mispredicted. However, when the aligned target and models are compared to their original templates, a large difference is detectable (compare <xr id="2QJ8_md">Figure</xr> and <xr id="1YW4_md">Figure</xr>).

</figure>

</figure>

<figure id="2QJ8_md">

Representation of the target (2O4H), the template (2QJ8) and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.

<figure id="1YW4_md">

Representation of the target (2O4H), the template (1YW4) and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.

Swissmodel

Examining the models that Swissmodel created from the high sequence similarity templates in Pymol, together with the templates and the target, reveals that Swissmodel creates very accurate models as well. One visible difference from the models created by Modeller is that Swissmodel seems to build the model only for the length of the target, whereas in the Modeller models the N- and C-termini of the polypeptide extend well beyond the length of the actual target (see <xr id="2O53_sm">Figure</xr> and <xr id="2GU2_sm">Figure</xr>).

</figure>

</figure>

<figure id="2O53_sm">

Representation of the target (2O4H), the template (2O53) and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.

<figure id="2GU2_sm">

Representation of the target (2O4H), the template (2GU2) and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.

For the templates with a sequence similarity of less than 30% to 2O4H, it has to be noted that Swissmodel was not able to build a model with 1YW4 as template. The modelling process with 2QJ8 as template was successful, however. A closer look at the model created from 2QJ8 shows that Swissmodel, at least for this specific example, does not perform as well as Modeller when using a template with low sequence similarity. The overall positioning of the atoms is correct, but the prediction of secondary structure elements is much worse. Some residues do not even have a predicted spatial position (see <xr id="2QJ8_sm">Figure</xr>).

<figure id="2QJ8_sm">

Representation of the target (2O4H), the template (2QJ8) and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.

</figure>

iTasser

iTasser continues the trend that is already observable in the results of Modeller and Swissmodel. The models produced from the high sequence similarity templates show that iTasser creates accurate models for the target sequence (see <xr id="2O53_it">Figure</xr> and <xr id="2GU2_it">Figure</xr>). However, when the visualization in Pymol is compared to the results of Modeller and Swissmodel, iTasser seems to be a little less precise. The spatial orientation of the secondary structure elements seems a bit off compared to the models created by the two other methods.

</figure>

</figure>

<figure id="2O53_it">

Representation of the target (2O4H), the template (2O53) and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.

<figure id="2GU2_it">

Representation of the target (2O4H), the template (2GU2) and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.

Looking at the calculated models based on 2QJ8 and 1YW4, iTasser definitely produces much worse results than Modeller and Swissmodel. In terms of atom positions, each model seems to fall in between the target and the template structure, leaning more towards the template structure (compare <xr id="2QJ8_it">Figure</xr> and <xr id="1YW4_it">Figure</xr>).

</figure>

</figure>

<figure id="2QJ8_it">

Representation of the target (2O4H), the template (2QJ8) and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.

<figure id="1YW4_it">

Representation of the target (2O4H), the template (1YW4) and the generated model in Pymol. 2O4H is displayed in orange including the bound zinc atom and compound at the active site of ASPA. The template used to generate the model is displayed in green. The produced model is displayed in blue.

The conclusion from examining the structures in Pymol is therefore that Modeller's overall performance is exceptional, being beaten only by Swissmodel for templates with a high sequence similarity. iTasser, however, performs worse in every example, and in addition its computing time is much longer than that of the other methods.
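All figures in this section use the same colour scheme: target in orange, template in green, model in blue. The sketch below shows one way such an overlay could be scripted. It only writes out a list of standard Pymol commands (<code>fetch</code>, <code>load</code>, <code>align</code>, <code>color</code>, <code>show</code>) as text. The function name, the object names and the file name <code>model.pdb</code> are assumptions for illustration, and the source does not say which commands were actually used to produce the figures.

```python
def pymol_overlay_script(target_id: str, template_id: str, model_path: str) -> str:
    """Return Pymol commands (as text) that overlay target, template and model."""
    commands = [
        f"fetch {target_id}, target",       # download the target from the PDB
        f"fetch {template_id}, template",   # download the template from the PDB
        f"load {model_path}, model",        # load the locally built model
        "align template, target",           # superpose the template onto the target
        "align model, target",              # superpose the model onto the target
        "hide everything",
        "show cartoon",
        "color orange, target",
        "color green, template",
        "color blue, model",
        "show spheres, target and elem Zn",  # bound zinc atom
        "show sticks, target and organic",   # bound compound in the active site
    ]
    return "\n".join(commands) + "\n"


# Hypothetical file name; the printed text can be saved as a .pml script
print(pymol_overlay_script("2O4H", "2O53", "model.pdb"))
```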

Model evaluation

To compare the models and the methods that created them, the GDT score and the RMSD were calculated for each computed model. As displayed in <xr id="model">Table</xr>, the GDT score is a better measure than the RMSD for grading the models. Models created from a template with high sequence similarity to the target have a high GDT score, and the GDT score decreases rapidly with decreasing sequence similarity: for Modeller it falls from 100.00 (2O53) to 54.65 (2GU2) and to below 7 for 2QJ8 and 1YW4. This trend cannot be observed in the RMSD values. Except for iTasser, the RMSD values are good overall, even though the models observably differ in quality. The ranking of the modelling algorithms that was concluded from the visual examination in Pymol is confirmed by the GDT scores. Modeller and Swissmodel generate very good models, especially if the template has a high sequence similarity to the target. iTasser disappoints in terms of both model quality and the amount of computing time needed to create the model.

<figtable id="model">

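{| border="1" cellpadding="5" cellspacing="0" align="center"
|+ Comparison of modelling algorithms
!
! colspan="4" | Modeller
! colspan="3" | SwissModel
! colspan="4" | iTasser
|-
!
! 2O53 !! 2GU2 !! 2QJ8 !! 1YW4
! 2O53 !! 2GU2 !! 2QJ8
! 2O53 !! 2GU2 !! 2QJ8 !! 1YW4
|-
! GDT Score
| 100.00 || 54.65 || 6.40 || 6.65
| 100.00 || 54.65 || 6.89
| 89.95 || 46.68 || 7.81 || 7.56
|-
! Cα RMSD
| 0.19 Å || 0.34 Å || 1.19 Å || 0.97 Å
| 0.07 Å || 0.06 Å || 2.68 Å
| 1.31 Å || not computable || 7.17 Å || 10.23 Å
|}
Comparison of the calculated models, using GDT score and RMSD as quality measure.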

</figtable>
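To make the difference between the two measures concrete, the following sketch computes a simplified GDT_TS-style score: the average, over the distance cutoffs 1, 2, 4 and 8 Å, of the percentage of C-alpha atoms that lie within the cutoff of their target position. It assumes the structures are already superposed and that atoms are matched one to one. The real GDT calculation additionally searches over many superpositions, which is not shown here, and the source does not state which tool produced the values in the table. The function name and the coordinates are hypothetical.

```python
import math


def gdt_ts(model_ca, target_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (0-100) for pre-superposed, one-to-one matched C-alpha atoms."""
    if len(model_ca) != len(target_ca) or not target_ca:
        raise ValueError("need two non-empty coordinate lists of equal length")
    distances = [math.dist(m, t) for m, t in zip(model_ca, target_ca)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances) for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(fractions)


# Toy example: per-residue deviations of 0.5, 1.5, 3.0 and 20.0 Angstrom
target_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model_ca = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 20.0, 0.0)]
print(gdt_ts(model_ca, target_ca))  # (25 + 50 + 75 + 75) / 4 = 56.25
```

In this toy case a single badly placed residue pushes the RMSD above 10 Å, while the GDT-style score still credits the three residues that are close to their target positions. The two measures therefore need not rank models in the same way.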

Tasks