Difference between revisions of "Homology modelling Gaucher Disease"

From Bioinformatikpedia

Revision as of 10:04, 31 May 2012

The objective of this task was to apply homology modeling to predict the tertiary structure of glucosylceramidase from its sequence. To this end, we first selected different templates, which were then used to derive the structure of glucosylceramidase with three different homology modeling tools, namely Modeller, SWISS-MODEL, and the I-TASSER server. The resulting models were evaluated using both quality assessment scores and a comparison with the native crystal structure 1ogs. Technical details are reported in our protocol.

Template selection

We used HHsearch to search the PDB for homologous templates. <xr id="tab:templates"/> lists some of the top-ranking templates. 2nt0_A is identical to the target 1ogs_A and was therefore excluded. Although all listed hits are homologous to the target (HHsearch probability > 97%), the sequence identity of all remaining hits was below 30%. We therefore selected 2wnw_A (blue) as a close homolog, 2y24_A (green) as an intermediate homolog, and 3nco_A (yellow) as a more distant homolog. Note that the latter two templates do not cover the complete target, which makes the homology modeling process harder. <figtable id="tab:templates">

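{|
! Hit no. !! Template !! Identity (%) !! Query HMM range !! Probability (%)
|-
| colspan=5 | > 80% sequence identity
|-
| 1 || 2nt0_A || 100.0 || 1-497 || 100.0
|-
| colspan=5 | 40% - 80% sequence identity
|-
| colspan=5 | < 30% sequence identity
|-
| 2 || '''2wnw_A''' || 28.0 || 36-496 || 100.0
|-
| 3 || 3clw_A || 14.0 || 64-495 || 100.0
|-
| 4 || '''2y24_A''' || 18.0 || 66-495 || 100.0
|-
| 5 || 3kl0_A || 19.0 || 65-495 || 100.0
|-
| 6 || 3zr5_A || 17.0 || 65-494 || 100.0
|-
| 7 || 3ik2_A || 14.0 || 65-495 || 99.2
|-
| 22 || '''3nco_A''' || 11.0 || 113-384 || 97.7
|-
| 28 || 1egz_A || 12.0 || 113-387 || 97.4
|}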

Homologs found by HHsearch. Bold: selected templates used for the following modeling. </figtable>
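The remark that some templates do not cover the complete target can be made concrete with a small calculation. The following is only a sketch: it takes the query ranges from the table above, assumes a target length of 497 residues (read off the 1-497 range of the identical hit 2nt0_A), and ignores gaps inside the aligned range. The names `QUERY_RANGES` and `coverage` are introduced here for illustration.

```python
# Query ranges (first and last aligned target residue) copied from the HHsearch table.
# Assumption: the target has 497 residues, as suggested by the 1-497 range of 2nt0_A.
TARGET_LENGTH = 497

QUERY_RANGES = {
    "2wnw_A": (36, 496),
    "2y24_A": (66, 495),
    "3nco_A": (113, 384),
}


def coverage(first, last, target_length=TARGET_LENGTH):
    """Fraction of target residues inside the aligned range (both ends inclusive).

    Gaps within the range are ignored, so this is an upper bound on the
    fraction of residues that actually have a template counterpart.
    """
    return (last - first + 1) / target_length


for name, (first, last) in QUERY_RANGES.items():
    print(f"{name}: {coverage(first, last):.1%} of the target covered")
```

Under these assumptions, 3nco_A spans only slightly more than half of the target, whereas 2wnw_A spans most of it.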

Modeller

Modeller is a popular tool for building models by satisfaction of spatial restraints, which are derived, for instance, from one or several target-template alignments. The alignments can come from any alignment or homology search tool, or they can be built by Modeller itself. We created models for our target protein by (1) using a single template and employing Modeller to compute the alignment, (2) using a single template and the alignment from HHsearch, and (3) using multiple templates. Model quality was assessed via the DOPE score and DOPE z-score reported by Modeller as well as the QMEAN6 score from the SWISS-MODEL workspace. We compared the resulting models to the crystal structure 1ogs_A via the weighted all-atom RMSD computed by SAP, as well as the TM-score, GDT_TS, and GDT_HA scores, which we computed with the program TMscore.
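To illustrate what the GDT scores measure, here is a minimal sketch of the commonly used definitions: GDT_TS averages the fraction of residues within 1, 2, 4, and 8 Å of their native position, and GDT_HA uses the stricter cutoffs 0.5, 1, 2, and 4 Å. The sketch assumes that the per-residue distances come from one fixed superposition; the TMscore program instead searches for superpositions that maximize the score, so its values can be higher. The function names and the example distances are hypothetical.

```python
def gdt(distances, cutoffs, target_length=None):
    """Mean fraction of target residues whose distance is within each cutoff.

    distances: per-residue distances (in Angstrom) between model and native
               structure after superposition, one value per aligned residue.
    target_length: normalisation length; defaults to the number of distances.
    """
    n = target_length if target_length is not None else len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return sum(fractions) / len(cutoffs)


def gdt_ts(distances, target_length=None):
    """GDT 'total score' with cutoffs of 1, 2, 4 and 8 Angstrom."""
    return gdt(distances, (1.0, 2.0, 4.0, 8.0), target_length)


def gdt_ha(distances, target_length=None):
    """GDT 'high accuracy' with cutoffs of 0.5, 1, 2 and 4 Angstrom."""
    return gdt(distances, (0.5, 1.0, 2.0, 4.0), target_length)


# Hypothetical distances for six residues, for illustration only.
example = [0.4, 0.9, 1.5, 3.0, 7.0, 12.0]
print(gdt_ts(example), gdt_ha(example))
```

Because GDT_HA uses tighter cutoffs, it can never exceed GDT_TS for the same set of distances, which matches the ordering of the two columns in the evaluation tables.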

Single-template modeling using Modeller alignments

<xr id="tab:modeller-single-models"/> shows the resulting models. 2wnw_A produced the best-looking model: all major secondary structure elements coincided with the native structure 1ogs_A (red). Only the target range 1-31, which was not covered by the template (cf. <xr id="tab:templates"/>), resulted in some deviating loop regions (at the top right corner). Although 2y24_A is less conserved than 2wnw_A, its model came close to the native structure. 3nco_A shares the same TIM beta/alpha-barrel domain (the tube at the center) as 1ogs_A but lacks the glycosyl hydrolase domain (the sheets at the right side), so that the model was less well structured in this region.

<figtable id="tab:modeller-single-models">

2wnw_A
2y24_A
3nco_A
Dope score per residue.

Models built by Modeller using single templates and alignments computed by Modeller itself. Red: 1ogs_A. </figtable>

<xr id="tab:modeller-single-eval"/> shows the evaluation results. Both the quality assessment scores and the comparison to the native structure 1ogs_A were in line with the observations described above. The model derived from 2wnw_A had a lower energy (i.e. was more stable) than the models from 2y24_A and 3nco_A, and it matched 1ogs_A better. Except for the DOPE z-score, which should have been lower for 2y24_A, the assessment scores correlated well with the structure comparison scores. The per-residue DOPE profiles (cf. <xr id="tab:modeller-single-models"/>) of all three models were correlated, and the scores were lowest for 2wnw_A.

<figtable id="tab:modeller-single-eval">

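{|
! Template !! DOPE !! DOPE z-score !! QMEAN6 !! RMSD !! TM-score !! GDT_TS !! GDT_HA
|-
| 2wnw_A || -55925 || -0.471 || 0.689 || 1.006 || 0.824 || 0.661 || 0.479
|-
| 2y24_A || -47194 || 0.777 || 0.376 || 1.222 || 0.550 || 0.294 || 0.223
|-
| 3nco_A || -44033 || 1.224 || 0.139 || 2.158 || 0.252 || 0.093 || 0.043
|}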

Evaluation of models built by Modeller using single templates and alignments computed by Modeller itself. </figtable>

Single-template modeling using HHsearch alignments

Modeller computes alignments by aligning the target sequence to the known structure of the template. Hence, no predicted features of the target sequence are used. In contrast, HHsearch computes an HMM-to-HMM alignment in which the target HMM comprises more features than the sequence alone: it also contains the secondary structure predicted by PSIPRED and information about the conservation of all residues derived from a sequence profile. HHsearch alignments are therefore thought to be more accurate than those produced by Modeller, which should lead to better models. We thus tried to improve the models by using HHsearch alignments.

<figtable id="tab:modeller-hhsearch-alis">

2wnw_A
2y24_A
3nco_A

Comparison of Modeller and HHsearch alignments. </figtable>

<xr id="tab:modeller-hhsearch-alis"/> depicts HHsearch alignments compared to Modeller alignments. The alignments produced by HHsearch were more compact, i.e. their gaps were concentrated at the beginning and at the end of the sequences, whereas the gaps were spread more widely in the alignments computed by Modeller. Comparing the resulting models of <xr id="tab:modeller-hhsearch-models"/> with those of <xr id="tab:modeller-single-models"/>, we found that the core regions matched the native structure better when HHsearch alignments were used. However, Modeller could not find the correct topology for the ends of the target sequence which were not covered by the template. These regions simply became unfolded threads stretching out into space. Surprisingly, the DOPE score per residue was relatively low for the unfolded regions.

<figtable id="tab:modeller-hhsearch-models">

2wnw_A
2y24_A
3nco_A
Dope score per residue.

Models built by Modeller using HHsearch alignments covering the complete target. Red: 1ogs_A. </figtable>

The DOPE scores of the models of <xr id="tab:modeller-hhsearch-models"/> were slightly higher than those of the models of <xr id="tab:modeller-single-models"/>, but the models scored better according to QMEAN6. The RMSD increased due to the unfolded ends of the target sequence. However, the TM-score and the GDT scores improved since the actual core of the target became closer to the native structure.

<figtable id="tab:modeller-hhsearch-eval">

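{|
! Template !! DOPE !! DOPE z-score !! QMEAN6 !! RMSD !! TM-score !! GDT_TS !! GDT_HA
|-
| 2wnw_A || -54695 || -0.295 || 0.726 || 1.079 || 0.869 || 0.732 || 0.538
|-
| 2y24_A || -47256 || 0.765 || 0.566 || 1.640 || 0.722 || 0.553 || 0.386
|-
| 3nco_A || -32577 || 2.857 || 0.272 || 9.757 || 0.398 || 0.235 || 0.131
|}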

Evaluation of models built by Modeller using HHsearch alignments covering the complete target. </figtable>

Single-template modeling using local HHsearch alignments

Since the ends of the sequences could not be modelled correctly by Modeller and might distort the model evaluation, we repeated the modeling using only the regions covered by the template and omitting the ends mentioned above. This resulted in more compact models without the long unfolded threads (cf. <xr id="tab:modeller-hhsearch-local-models"/>).
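Restricting an alignment to the region covered by the template amounts to cutting off the leading and trailing columns in which the template row consists of gaps. The following sketch shows one way to do this on two aligned strings; it is not the procedure from our protocol. The function name `trim_to_template` and the toy sequences are hypothetical, and the gap character is assumed to be `-`.

```python
def trim_to_template(target_aln, template_aln, gap="-"):
    """Cut an alignment down to the span covered by the template.

    Leading and trailing columns where the template row is a gap are removed;
    gaps inside the covered span are kept so the alignment stays consistent.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    covered = [i for i, c in enumerate(template_aln) if c != gap]
    if not covered:
        raise ValueError("template row contains only gaps")
    start, end = covered[0], covered[-1] + 1
    return target_aln[start:end], template_aln[start:end]


# Toy alignment, for illustration only.
target = "MKTAYIAKQR"
template = "---AYLA-Q-"
print(trim_to_template(target, template))
```

Only the flanking columns are dropped; internal insertions of the target remain and still have to be modelled as loops.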

<figtable id="tab:modeller-hhsearch-local-models">

2wnw_A
2y24_A
3nco_A
Dope score per residue.

Models built by Modeller using local HHsearch alignments restricted to the regions covered by the template. Red: 1ogs_A. </figtable>

Although the DOPE score became worse since the target was truncated, the QMEAN6 score increased further. The RMSD was reduced, in particular for 3nco_A, whereas the TM-score and the GDT scores remained virtually unchanged. This demonstrates that the RMSD is sensitive to large deviations in small regions, whereas the TM-score and the GDT scores better reflect the actual fit of the core region.
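The different behaviour of the RMSD and the TM-score can be reproduced with a toy calculation. The sketch below uses the standard TM-score formula with the length-dependent scale d0 = 1.24 (L - 15)^(1/3) - 1.8 and a plain RMSD over per-residue distances. It assumes a fixed superposition (the TMscore program optimizes the superposition, and SAP reports a weighted RMSD), and the distance values are invented for illustration.

```python
import math


def rmsd(distances):
    """Root-mean-square of per-residue distances (in Angstrom)."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_score(distances, target_length):
    """TM-score for one fixed superposition, normalised by the target length.

    Assumes target_length is well above 15 so that d0 is positive.
    """
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Hypothetical model of a 497-residue target: a well-fitting core
# plus 47 residues in an unfolded thread far away from the native position.
core = [1.0] * 450
tails = [40.0] * 47
full = core + tails

print("full model:   ", rmsd(full), tm_score(full, 497))
print("truncated model:", rmsd(core), tm_score(core, 497))
```

In this toy case, removing the unfolded residues lowers the RMSD by more than an order of magnitude, while the TM-score (still normalised by the full target length) changes only marginally, because distant residues contribute almost nothing to the sum.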

<figtable id="tab:modeller-hhsearch-local-eval">

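{|
! Template !! DOPE !! DOPE z-score !! QMEAN6 !! RMSD !! TM-score !! GDT_TS !! GDT_HA
|-
| 2wnw_A || -53593 || -0.744 || 0.749 || 1.003 || 0.866 || 0.731 || 0.532
|-
| 2y24_A || -46290 || -0.030 || 0.564 || 1.162 || 0.724 || 0.563 || 0.381
|-
| 3nco_A || -25047 || 0.886 || 0.451 || 1.827 || 0.402 || 0.245 || 0.138
|}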

Evaluation of models built by Modeller using local HHsearch alignments restricted to the regions covered by the template. </figtable>

Multiple-template modeling using Modeller alignments

To test whether multiple templates could improve the model quality, we prepared three groups of templates: (1) 2wnw_A, 2y24_A, and 3kl0_A as close homologs, (2) 2wnw_A and 3nco_A as a close and a more distant homolog, and (3) 3ik2_A, 3nco_A, and 1egz_A as remote homologs. For each group, a multiple alignment was created with Modeller, to which the target was added afterwards. The resulting alignment was used as input for Modeller.

<figtable id="tab:modeller-mult-models">

2wnw_A-2y24_A-3kl0_A
2wnw_A-3nco_A
3ik2_A-3nco_A-1egz_A
Dope score per residue.

Models built by Modeller using multiple templates. Red: 1ogs_A. </figtable>

<xr id="tab:modeller-mult-models"/> shows the resulting models. None of the template combinations led to a better model than using 2wnw_A alone (cf. <xr id="tab:modeller-single-models"/>). We assume that 2wnw_A is more similar to 1ogs_A than 2y24_A and 3kl0_A are. The latter two templates rather drew the model away from the true structure 1ogs_A and did not contain additional information that helped to build a better model. Surprisingly, using 2wnw_A in combination with 3nco_A resulted in a better model than using the templates with a higher sequence identity. Taking advantage of more than one template improves the model quality in particular if the templates cover different regions, e.g. different domains, of the target sequence. However, this was not the case for our target since 2wnw_A already covered the largest fraction of it (cf. <xr id="tab:templates"/>).

<figtable id="tab:modeller-mult-eval">

{|
! Templates !! DOPE !! DOPE z-score !! QMEAN6 !! RMSD !! TM-score !! GDT_TS !! GDT_HA
|-
| 2wnw_A-2y24_A-3kl0_A || -39084 || 1.930 || 0.314 || 1.987 || 0.465 || 0.277 || 0.179
|-
| 2wnw_A-3nco_A || -46962 || 0.807 || 0.556 || 1.256 || 0.778 || 0.593 || 0.409
|-
| 3ik2_A-3nco_A-1egz_A || -25391 || 0.3881 || 0.101 || 6.607 || 0.241 || 0.062 || 0.029
|}

Evaluation of models built by Modeller using multiple templates. </figtable>


SWISS-MODEL

The SWISS-MODEL web server provides different modes for carrying out the modeling: in the automated mode, the model is built from the target sequence completely automatically without user intervention, whereas in the alignment mode the user specifies the target-template alignment.

Automated mode

To test the automated mode, we instructed the SWISS-MODEL server to use the selected templates 2wnw_A, 2y24_A, and 3nco_A instead of identifying templates on its own. In this way, we could better compare its methods for aligning the target to the template and building the model with those of the Modeller package.

<figtable id="tab:swiss-auto-models">

2wnw_A
2y24_A
3nco_A
Dope score per residue.

Models built by SWISS-MODEL using the automated mode. Red: 1ogs_A. </figtable>

<xr id="tab:swiss-auto-models"/> shows the models built by the SWISS-MODEL server. Since the SWISS-MODEL server does not model leading and trailing ends of the target sequence which are not covered by the template, the results have to be compared to #Single-template modeling using local HHsearch alignments. Altogether, the model quality was quite similar (cf. <xr id="tab:swiss-auto-eval"/>, <xr id="tab:modeller-hhsearch-local-eval"/>). Hence, the SWISS-MODEL alignments were in our case approximately as accurate as the HHsearch alignments, and the fragment assembly procedure worked as reliably as Modeller's satisfaction of spatial restraints.

<figtable id="tab:swiss-auto-eval">

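{|
! Template !! DOPE !! DOPE z-score !! QMEAN6 !! RMSD !! TM-score !! GDT_TS !! GDT_HA
|-
| 2wnw_A || -53863 || -0.786 || 0.530 || 1.003 || 0.858 || 0.723 || 0.540
|-
| 2y24_A || -42674 || 0.565 || 0.430 || 1.100 || 0.712 || 0.539 || 0.382
|-
| 3nco_A || -27101 || 2.109 || 0.270 || 2.158 || 0.422 || 0.252 || 0.138
|}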

</figtable>

Alignment mode

To test the alignment mode, we used the same HHsearch alignments as for evaluating Modeller (cf. #Single-template modeling using HHsearch alignments). The SWISS-MODEL server initially failed to build a model for 2y24_A and complained about alignment position 84, which was a gap. After removing this alignment column, the server succeeded.

<figtable id="tab:swiss-ali-models">

2wnw_A
2y24_A
3nco_A
Dope score per residue.

Models built by SWISS-MODEL using the alignment mode. Red: 1ogs_A. </figtable>

Except for 2y24_A, the resulting models were only slightly worse than those of the automated mode. This reconfirmed our observation that the SWISS-MODEL alignments were as good as the HHsearch alignments. The model for 2y24_A was significantly worse, which was probably caused by the deletion of alignment column 84.

<figtable id="tab:swiss-ali-eval">

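{|
! Template !! DOPE !! DOPE z-score !! QMEAN6 !! RMSD !! TM-score !! GDT_TS !! GDT_HA
|-
| 2wnw_A || -53963 || -0.788 || 0.540 || 1.011 || 0.859 || 0.722 || 0.536
|-
| 2y24_A || -40131 || 1.055 || 0.360 || 1.222 || 0.510 || 0.238 || 0.067
|-
| 3nco_A || -30244 || 0.689 || 0.240 || 2.001 || 0.426 || 0.247 || 0.135
|}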

</figtable>

I-TASSER

To allow a direct comparison of the prediction performance of I-TASSER with the other servers, we specified the selected templates 2wnw_A, 2y24_A, and 3nco_A. The modelling took about 72 hours, compared to less than 5 minutes for Modeller and the SWISS-MODEL server.

<figtable id="tab:itasser-models">

2wnw_A, C-score: 1.47
2y24_A
3nco_A
Dope score per residue.

Models built by I-TASSER using different templates. Red: 1ogs_A. </figtable>

<figure id="fig:itasser-2wnw_A-loops">

Loop region 1-35 of 2wnw_A

</figure>

The resulting models turned out to be highly accurate (cf. <xr id="tab:itasser-models"/>, <xr id="tab:itasser-eval"/>). Although the DOPE score and the RMSD were not better than those obtained with Modeller and the SWISS-MODEL server, the TM-score and the GDT scores were. In contrast to the other methods, I-TASSER also modelled the loop regions very accurately. <xr id="fig:itasser-2wnw_A-loops"/> demonstrates the close agreement with 1ogs_A in the loop region 1-35, which was not covered by the template 2wnw_A. This indicates that the structure assembly simulations of I-TASSER worked very well in this case. The side chains also matched those of 1ogs_A better than in the models built by the other methods, which further accounts for the high TM-score and GDT scores. The DOPE score is probably somewhat higher than for Modeller because I-TASSER does not minimize the DOPE score, as Modeller does, but another energy function.

<figtable id="tab:itasser-eval">

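{|
! Template !! DOPE !! DOPE z-score !! QMEAN6 !! RMSD !! TM-score !! GDT_TS !! GDT_HA
|-
| 2wnw_A || -54684 || -0.293 || 0.73 || 1.010 || 0.943 || 0.812 || 0.607
|-
| 2y24_A || -47194 || 0.777 || 0.376 || 1.222 || 0.550 || 0.294 || 0.223
|-
| 3nco_A || -44033 || 1.224 || 0.139 || 2.158 || 0.252 || 0.093 || 0.043
|}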

Evaluation of models built by I-TASSER using different templates. </figtable>