Difference between revisions of "Fabry:Homology based structure predictions"

From Bioinformatikpedia
Revision as of 15:32, 27 May 2012




The following analyses were performed on the basis of the α-Galactosidase A sequence. Please consult the journal for the commands used to generate the results.

Dataset preparation and target comparison

Datasets

<figtable id="tab:datasetHHpred"> Dataset HHpred, E-value cutoff 1e-15

pdb ID E-value Identity in %
> 80% sequence identity
3hg3 8.6e-90 100
40% - 80% sequence identity
1ktb 4.2e-85 53
< 30% sequence identity
3cc1 5.5e-74 25
1zy9 3.1e-48 13
3a24 7.8e-40 17
2xn2 5.3e-37 15
2d73 5.7e-36 14
3mi6 1.4e-31 15
2yfo 9.1e-30 13
2f2h 2.7e-20 17
2g3m 2.2e-20 16
3nsx 6e-20 13
3lpp 2.2e-18 15
3l4y 1.9e-18 15
3top 3.6e-18 12
2xvl 3.2e-18 16
2x2h 4.9e-16 13

</figtable>

<figtable id="tab:datasetHHpred"> Additional sequences HHpred, E-value cutoff 0.002

pdb ID E-value Identity in %
3zss 0.00062 10
1j0h 0.0011 15
1ea9 0.00098 12

</figtable>

<figtable id="tab:datasetCOMA"> Dataset COMA, E-value cutoff 0.002

pdb ID E-value Identity in %
> 80% sequence identity
- - -
40% - 80% sequence identity
1ktb 1.7e-61 52
< 30% sequence identity
3lrk 1.2e-66 23
3a21 2.7e-65 26
1szn 3.7e-59 22
3cc1 5.2e-58 19
1zy9 1.7e-39 9
3mi6 4.3e-38 11
2yfn 4.4e-35 10
2d73 1.9e-32 9
3a24 5.6e-30 10
1xsi 1.9e-12 10
2g3m 2.4e-11 10
3pha 2.9e-10 6
3lpo 4.7e-09 8
2x2h 8.2e-09 8
3mo4 1.2e-08 7
2xvg 2.4e-08 8
3ton 4.3e-08 8
2xib 1e-07 7
3eyp 1.6e-06 8
3k1d 3.5e-06 9
2zwy 8.8e-06 9
3gza 1.8e-05 8
3m07 2.3e-05 7
1eh9 0.00013 6
1gvi 0.00035 8
1aqh 0.00039 5
1mwo 0.00058 7
3vmn 0.0018 9
1bf2 0.0019 6
3aml 0.0019 8

</figtable>

We performed an HHpred as well as a COMA search in order to build datasets for three sequence identity classes. Since COMA did not find any homologous structure in the class above 80% identity (its best hit, 1ktb, has 52%; see <xr id="tab:datasetCOMA"/>), we used the dataset created with the HHpred search and the script described in the journal. This dataset contains one structure with an identity above 80%, one with an identity between 40% and 80%, and 15 with an identity below 30%, of which 14 are below 20% (see <xr id="tab:datasetHHpred" />). All HHpred matches have an E-value below 1e-15; for the COMA homologues we tried a less strict threshold of 0.002.
In most cases we used the structures 3hg3, 1ktb and 3cc1 for modelling, because they are either the only representatives of their class or, in the case of 3cc1, the sequence identity did not seem too low. For the model MULTI 3 we also used the structures 3a24 and 3zss. The latter has an E-value of 0.00062. We added it to examine how a template performs whose E-value is worse than that of all our other structures but would still fulfil the restrictions of a usual BLAST search (threshold of 0.003).
Note that although the identity of 3hg3 is 100%, it is not the PDB structure annotated for the AGAL protein but the structure of the substrate-bound state of the catalytic mechanism, hence the high similarity.
1ktb is the X-ray structure of the already mentioned α-N-acetylgalactosaminidase from chicken, which might in future be used for enzyme replacement therapy in the treatment of Fabry Disease.
The last of the frequently used structures, 3cc1, is the X-ray structure of a putative α-N-acetylgalactosaminidase from Bacillus halodurans.
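
The sorting of hits into identity classes can be sketched in a few lines of Python. This is only an illustration and not the script from the journal: the function name `identity_class`, the constant `E_VALUE_CUTOFF` and the hard-coded hit list are assumptions, and the values are copied from the first rows of the HHpred table above. Identities between 30% and 40% are not covered by the three classes and are reported as unassigned.

```python
# Sketch: sort HHpred hits into the identity classes used on this page.
# identity_class() and the inline hit list are illustrative assumptions.

def identity_class(identity_percent):
    """Return the identity class label for a sequence identity in percent."""
    if identity_percent > 80:
        return "> 80%"
    if 40 <= identity_percent <= 80:
        return "40% - 80%"
    if identity_percent < 30:
        return "< 30%"
    return "unassigned"  # 30% - 40% is not covered by the three classes


# (pdb ID, E-value, identity in %), first rows of the HHpred dataset table
hits = [
    ("3hg3", 8.6e-90, 100),
    ("1ktb", 4.2e-85, 53),
    ("3cc1", 5.5e-74, 25),
    ("1zy9", 3.1e-48, 13),
    ("3a24", 7.8e-40, 17),
]

E_VALUE_CUTOFF = 1e-15  # cutoff used for the HHpred dataset

classes = {}
for pdb_id, e_value, identity in hits:
    if e_value > E_VALUE_CUTOFF:
        continue  # hit is not significant enough for this dataset
    classes.setdefault(identity_class(identity), []).append(pdb_id)

for label, members in classes.items():
    print(label, members)
```

A hit such as 3zss (E-value 0.00062) would be skipped by this cutoff, which is why it only appears in the additional table with the cutoff of 0.002.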


Target comparison

<figtable id="tab:compare"> Comparison of apo and complex structure

Superimposed structures of 1R46 (blue) and 1R47 (green) in cartoon representation. Obviously, the structures do not differ much.
Comparison of the residues involved in the binding of α-galactose in the apo structure (blue) and the complex structure (green)

</figtable>

<figure id="fig:GAL:1R47">

Residues involved in the binding of α-galactose in 1R47 source

</figure>

As an initial step of the evaluation, we compared the apo structure 1R46 with the complex structure 1R47 (with bound α-galactose). The alignment of the A chains of 1R46 and 1R47 in PyMOL (see <xr id="tab:compare"/>) gave an RMSD of 0.248 Å, and the positions and orientations of the residues involved in binding the sugar (see <xr id="fig:GAL:1R47"/>) do not differ significantly. We therefore used only the 1R46 structure for visualisation, but computed all values and statistics for both structures.
In the right figure in <xr id="tab:compare"/>, the residues Asp92A, Asp93A, Lys168A, Arg227A and Asp231A are depicted in sticks representation (thicker); they are responsible for binding the sugar in the complex structure, which is shown in magenta. There is hardly any difference between 1R46 and 1R47 in this region.
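
For orientation, the RMSD of two sets of matched atoms can be computed as sketched below. This is a minimal sketch that assumes the coordinates are already superimposed and paired one-to-one; PyMOL performs the superposition itself before reporting the value, and that step is skipped here. The function name `rmsd` and the two small coordinate lists are invented for illustration and are not taken from 1R46 or 1R47.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation of two equally long lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared Euclidean distance of one matched atom pair
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical, already superimposed coordinates in Å
apo = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
holo = [(0.0, 0.3, 0.0), (1.5, 0.0, 0.4), (3.0, 0.0, 0.0)]

print(round(rmsd(apo, holo), 3))
```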


Modeller

Calculation of models

With this tool, we created 10 models (see Journal). The first three were produced with the standard settings and workflow of Modeller. The subsequent four models were computed from multiple templates in different combinations, and for the last three models we edited the alignment files in order to test the quality of the alignment and the influence of the two types of alignment.

Default settings

Model 1

<figtable id="tab:pics_1R46_3HG3"> Model 1, visual comparison

Model 1 (red), created with Modeller with the template 3HG3, superimposed on the x-ray structure of α-Galactosidase A (green)
Model 1 (red) superimposed on the x-ray structure of α-Galactosidase A (green) and the structure of 3HG3 (yellow)

</figtable>

For the first model we used the template with the highest sequence identity. According to HHpred the identity is 100%, whereas Modeller calculates an identity of only 96% (see <xr id="tab:Modeller_scores_3hg3_2"/>). This discrepancy is probably due to the way the sequences are compared: 1R46 is completely contained in 3HG3, but 3HG3 has a longer sequence (404 residues), so only 96% of it can be congruent with 1R46 (398 residues without signal peptide). The left picture in <xr id="tab:pics_1R46_3HG3"/> shows the superimposition of the computed Model 1 and the actual target structure. The right picture additionally displays the template structure. The three structures superimpose almost perfectly, which is supported by the scores derived from Modeller (see <xr id="tab:Modeller_scores_3hg3_2"/>). The GA341 score of 1.0 indicates a "native like" model (see the basic tutorial, http://salilab.org/modeller/tutorial/basic.html), and the Compactness <ref name="Compactness"> Foldit Wiki, Compactness (October 23, 2011), http://foldit.wikia.com/wiki/Compactness; May 26, 2012</ref> and the DOPE score (see the basic tutorial) are the second highest and second lowest of all calculated models, respectively.
The only parts that cannot be modelled correctly are the two ends of the sequence, which are highlighted in blue in the pictures. We know that the first 31 residues form the signal peptide, which is cleaved off and is therefore absent from the tertiary structure of the target protein. Modeller cannot account for this, so it would be a useful extension of the modelling pipeline to add sequence-based analyses such as signal peptide prediction, similar to the predictions we made in Task 2. The poor modelling of the last part of the sequence can be attributed to the longer sequence of the 3HG3 structure: the last 6 residues protrude, and the template is 6 amino acids longer than the target.
Inspecting the problematic residues, i.e. those with a distance of more than 8 Å (see <xr id="tab:Modeller_scores_3hg3_1"/>), manually in PyMOL, we found that two of them (91 and 101) lie in loop regions, which are hard to model. Two other residues (160 and 318), however, are located in a helix and seem to fit the target perfectly.
For further evaluation of the model, please see Modeller Evaluation.
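
The selection of problematic residues by the 8 Å cutoff can be sketched as follows. This is an illustration under assumptions, not the evaluation script from the journal: the function name `problematic_residues` is invented, the four large distances are copied from <xr id="tab:Modeller_scores_3hg3_1"/>, and the two small distances are hypothetical values added only so that the filter has something to reject.

```python
CUTOFF = 8.0  # distance cutoff in Å used on this page


def problematic_residues(distances, cutoff=CUTOFF):
    """Return (position, distance) pairs above the cutoff, largest distance first."""
    flagged = [(pos, dist) for pos, dist in distances.items() if dist > cutoff]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


distances = {
    91: 8.810,    # loop region, value from the distance table
    101: 17.314,  # loop region, value from the distance table
    160: 32.647,  # helix, value from the distance table
    318: 27.449,  # helix, value from the distance table
    200: 1.2,     # hypothetical well-modelled residue
    250: 0.8,     # hypothetical well-modelled residue
}

flagged = problematic_residues(distances)
for pos, dist in flagged:
    print(pos, dist)
```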

<figtable id="tab:Modeller_scores_3hg3_1"> Modeller scores Model 3hg3, Distances

Model: 3hg3
Distances > 8.0 Å in 2d alignment (Pos Dist)
428 76.568
Distances > 8.0 Å (Pos Dist)
1 28.357
91 8.810
101 17.314
112 25.386
160 32.647
318 27.449
333 42.457

</figtable> <figtable id="tab:Modeller_scores_3hg3_2"> Modeller scores Model 3hg3

% sequID: 95.570999
Sequ length: 429
Compactness: 0.215183
Native energy (pair): -213.518650
Native energy (surface): -9.487873
Native energy (combined): -5.603112
Z score (pair): -10.125743
Z score (surface): -6.381974
Z score (combined): -11.484159
GA341 score: 1.000000
DOPE score: -52607.89844

</figtable>


Model 2

<figtable id="tab:pics_1R46_1KTB"> Model 2, visual comparison

Model 2 (red), created with Modeller with the template 1ktb, superimposed on the x-ray structure of α-Galactosidase A (green)
Model 2 (red) superimposed on the x-ray structure of α-Galactosidase A (green) and the structure of 1ktb (yellow)

</figtable>


<figtable id="tab:Modeller_scores_1ktb"> Modeller scores Model 1ktb

Model: 1ktb
Distances > 8.0 Å in 2d alignment (Pos Dist)
0 0
Distances > 8.0 Å (Pos Dist)
0 0
% sequID: 53.351002
Sequ length: 429
Compactness: 0.176840
Native energy (pair): -107.285679
Native energy (surface): -7.043755
Native energy (combined): -3.262988
Z score (pair): -8.593054
Z score (surface): -6.151719
Z score (combined): -10.076556
GA341 score: 1.000000
DOPE score: -49267.35156

</figtable>



Model 3

<figtable id="tab:pics_1R46_3CC1"> Model 3, visual comparison

Model 3 (red), created with Modeller with the template 3CC1, superimposed on the x-ray structure of α-Galactosidase A (green)
Model 3 (red) superimposed on the x-ray structure of α-Galactosidase A (green) and the structure of 3CC1 (yellow)

</figtable>

Notes on the scores, as stated on the Modeller usage mailing list: "GA341 is not sufficiently sensitive to distinguish between two 'very good' models. Any good model will give a score very close to 1 (e.g. running GA341 on most PDB structures will give a 1.0 score). You should use it only to discard 'bad' models (e.g. those with score less than 0.6). DOPE may be more sensitive in your case." <ref name="GA341"> Salilab - Modeller Usage, modeller ga341 score (February 21, 2006), http://salilab.org/archives/modeller_usage/2006/msg00060.html; May 26, 2012</ref>
A value > 0.7 generally indicates a reliable model, defined as ≥ 95% probability of correct fold. <ref name="Melo2002"> Melo F, Sánchez R, Sali A. (2002). Statistical potentials for fold assessment. Protein Sci. 2002 Feb;11(2):430-48. PMCID: PMC2373452</ref>
Z-scores: Surface statistical potentials that contribute to the GA341. (http://modbase.compbio.ucsf.edu/modeval/help.cgi?type=help&style=helplink#z-pair)
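
The advice quoted above suggests a two-step use of the scores: GA341 only to discard bad models, DOPE to rank the remaining ones (a lower DOPE score is better). A minimal sketch of that procedure follows. The function name `rank_models` is an assumption, the cutoff of 0.6 is the one from the quoted mailing-list post, and the score pairs are copied from the Modeller score tables on this page.

```python
GA341_MIN = 0.6  # models below this GA341 score are discarded (quoted advice)

# model name -> (GA341 score, DOPE score), taken from the score tables
models = {
    "Model 1": (1.000000, -52607.89844),
    "Model 2": (1.000000, -49267.35156),
    "Model 3": (0.332343, -38190.22656),
    "CHAS 2": (0.995974, -40807.12109),
}


def rank_models(scores, ga341_min=GA341_MIN):
    """Drop models with a low GA341 score, then sort the rest by DOPE (lowest first)."""
    kept = {name: s for name, s in scores.items() if s[0] >= ga341_min}
    return sorted(kept, key=lambda name: kept[name][1])


ranking = rank_models(models)
print(ranking)
```

With these values Model 3 is discarded by the GA341 filter, while Model 1, Model 2 and CHAS 2 are separated only by their DOPE scores, which illustrates why GA341 alone cannot distinguish between good models.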


<figtable id="tab:Modeller_scores_3cc1_1"> Modeller scores Model 3cc1, Distances

Model: 3cc1
Distances > 8.0 Å in 2d alignment (Pos Dist)
433 63.967
Distances > 8.0 Å (Pos Dist)
147 25.085
290 19.238
374 24.356
395 15.007
412 61.733
452 23.680
631 23.283
659 8.421
684 10.763
703 10.204
762 10.753

</figtable> <figtable id="tab:Modeller_scores_3cc1_2"> Modeller scores Model 3cc1

% sequID: 24.242001
Sequ length: 429
Compactness: 0.139850
Native energy (pair): 198.134571
Native energy (surface): 24.857669
Native energy (combined): 7.885459
Z score (pair): -3.800148
Z score (surface): -1.528096
Z score (combined): -3.572654
GA341 score: 0.332343
DOPE score: -38190.22656

</figtable>


Multiple templates

MULTI 1

<figtable id="tab:pics_1R46_multi1"> Model MULTI 1, visual comparison

Model MULTI 1 (red) (templates 3HG3 and 1KTB), superimposed on the x-ray structure of α-Galactosidase A (green)
Model MULTI 1 (red) superimposed on the x-ray structure of α-Galactosidase A (green) and the structure of 3HG3 (yellow) and 1KTB (orange)

</figtable>

MULTI 2

<figtable id="tab:pics_1R46_multi2"> Model MULTI 2, visual comparison

Model MULTI 2 (red), created with Modeller on basis of the templates 3HG3, 1KTB and 3CC1, superimposed on the x-ray structure of α-Galactosidase A (green)
Model MULTI 2 (red) superimposed on the x-ray structure of α-Galactosidase A (green) and the structure of 3HG3 (yellow), 1KTB (orange) and 3CC1 (lightorange)

</figtable>


MULTI 3

<figtable id="tab:pics_1R46_multi3"> Model MULTI 3, visual comparison

Model MULTI 3 (red), created with Modeller on basis of the templates 3CC1, 3ZSS and 3A24, superimposed on the x-ray structure of α-Galactosidase A (green)
Model MULTI 3 (red) superimposed on the x-ray structure of α-Galactosidase A (green) and the structure of 3CC1 (yellow), 3ZSS (orange) and 3A24 (lightorange)

</figtable>


MULTI 4

<figtable id="tab:pics_1R46_multi4"> Model MULTI 4, visual comparison

Model MULTI 4 (red), created with Modeller on basis of the templates 3CC1 and 3HG3, superimposed on the x-ray structure of α-Galactosidase A (green)
Model MULTI 4 (red) superimposed on the x-ray structure of α-Galactosidase A (green) and the structure of 3CC1 (yellow) and 3HG3 (orange)

</figtable>


Edited Alignment input

CHAS and CHAS 2

<figtable id="tab:pics_1R46_3HG3_CHAS"> Models CHAS and CHAS 2, visual comparison

Model CHAS (red), with active site shifted right to next D (7 and 1 positions) in 2d alignment file, superimposed on the x-ray structure of α-Galactosidase A (green)
For comparison with Model CHAS and CHAS 2, Model 1 (orange) which was basis for the edited alignments, superimposed on α-Galactosidase A (green)
Model CHAS 2 (red), with active site shifted right to next D (7 and 1 positions) in both alignment files, superimposed on the x-ray structure of α-Galactosidase A (green)

</figtable>

<figtable id="tab:Modeller_scores_CHAS_1"> Modeller scores Model CHAS, Distances

Model: CHAS
Distances > 8.0 Å (Pos Dist)
1 28.357
91 8.810
101 17.314
112 25.386
160 32.647
318 27.449
333 42.457

</figtable> <figtable id="tab:Modeller_scores_CHAS_2"> Modeller scores Model CHAS

% sequID: 95.570999
Sequ length: 429
Compactness: 0.215183
Native energy (pair): -213.518650
Native energy (surface): -9.487873
Native energy (combined): -5.603112
Z score (pair): -10.125743
Z score (surface): -6.381974
Z score (combined): -11.484159
GA341 score: 1.000000
DOPE score: -52607.89844

</figtable>

<figtable id="tab:Modeller_scores_CHAS2_1"> Modeller scores Model CHAS2

Model: CHAS2
Distances > 8.0 Å (Pos Dist)
1 28.357
91 8.810
101 17.314
112 25.386
160 32.647
318 27.449
333 42.457
538 10.921
601 10.536

</figtable> <figtable id="tab:Modeller_scores_CHAS2_2"> Modeller scores Model CHAS2

% sequID: 40.326000
Sequ length: 429
Compactness: 0.204903
Native energy (pair): 122.562212
Native energy (surface): 22.166245
Native energy (combined): 5.785874
Z score (pair): -4.522039
Z score (surface): -1.645839
Z score (combined): -4.686339
GA341 score: 0.995974
DOPE score: -40807.12109

</figtable>


CHAS 3

<figtable id="tab:pics_1R46_3HG3_CHAS3"> Model CHAS 3, visual comparison

Model CHAS 3 (red), with active site shifted right to next D (7 and 1 positions) in both alignment files and the substrate binding region (position 203-207) forced to be consecutive, superimposed on the x-ray structure of α-Galactosidase A (green)
For comparison with Model CHAS 3, Model 1 (orange), which was the basis for the edited alignments and was created with Modeller on the basis of the template 3HG3, superimposed on the x-ray structure of α-Galactosidase A (green)

</figtable>

Methods:
alignment.malign() -- align two or more sequences
alignment.align2d() -- align sequences with structures; The alignment.align2d() command is preferred for aligning a sequence with structure(s) in comparative modeling because it tends to place gaps in a better structural context

<figtable id="tab:Modeller_scores_CHAS3_1"> Modeller scores Model CHAS3, Distances

Model: CHAS3
Distances > 8.0 Å (Pos Dist)
1 28.357
91 8.810
101 17.314
112 25.386
160 32.647
318 27.449
333 42.457
538 10.921
601 10.536

</figtable> <figtable id="tab:Modeller_scores_CHAS3_2"> Modeller scores Model CHAS3

% sequID: 40.326000
Sequ length: 429
Compactness: 0.204903
Native energy (pair): 122.562212
Native energy (surface): 22.166245
Native energy (combined): 5.785874
Z score (pair): -4.522039
Z score (surface): -1.645839
Z score (combined): -4.686339
GA341 score: 0.995974
DOPE score: -40807.12109

</figtable>



Evaluation

TM-score

<figtable id="tab:TMscore_1R46"> TM-score

Columns: Model, Number of residues in common, RMSD of the common residues, TM-score, GDT-TS-score, GDT-HA-score
Model 1 390 1.115 0.9841 0.9667 0.8558
Model 2 390 2.098 0.9596 0.9071 0.7635
Model 3 390 22.707 0.4087 0.2699 0.1814
MULTI 1 390 0.575 0.9938 0.9910 0.9128
MULTI 2 390 12.625 0.7364 0.6949 0.6404
MULTI 3 390 21.196 0.2048 0.0673 0.0314
MULTI 4 390 10.798 0.7405 0.6737 0.5833
CHAS 390 1.115 0.9841 0.9667 0.8558
CHAS 2 390 15.292 0.4651 0.3622 0.3038
CHAS 3 390 15.292 0.4651 0.3622 0.3038

</figtable>

<figtable id="tab:TMscore_1R47"> TM-score

Columns: Model, Number of residues in common, RMSD of the common residues, TM-score, GDT-TS-score, GDT-HA-score
Model 1 390 1.119 0.9840 0.9654 0.8519
Model 2 390 2.093 0.9600 0.9083 0.7647
Model 3 390 22.713 0.4092 0.2731 0.1821
MULTI 1 390 0.575 0.9938 0.9897 0.9115
MULTI 2 390 12.609 0.7363 0.6942 0.6378
MULTI 3 390 21.191 0.2058 0.0679 0.0314
MULTI 4 390 10.793 0.7405 0.6744 0.5846
CHAS 390 1.119 0.9840 0.9654 0.8519
CHAS 2 390 15.290 0.4652 0.3635 0.3019
CHAS 3 390 15.290 0.4652 0.3635 0.3019

</figtable>
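
To make the three scores in the tables above easier to interpret, the following sketch computes them from per-residue distances between a model and the target. It rests on several assumptions: the structures are already superimposed (the TM-score program additionally searches for the superposition that maximises the score, which is omitted here), the distances are given per aligned residue in Å, and the function names and the example distance list are invented for illustration. The formulas are the standard definitions: TM-score sums 1 / (1 + (d/d0)^2) over the aligned residues with d0 = 1.24 (L - 15)^(1/3) - 1.8, GDT-TS averages the fractions of residues within 1, 2, 4 and 8 Å, and GDT-HA uses 0.5, 1, 2 and 4 Å.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 of the TM-score (needs length > 15)."""
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score for one fixed superposition, normalised by the target length."""
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


def gdt(distances, target_length, cutoffs):
    """Mean fraction of target residues within each of the distance cutoffs."""
    fractions = [
        sum(1 for d in distances if d <= cutoff) / target_length
        for cutoff in cutoffs
    ]
    return sum(fractions) / len(cutoffs)


def gdt_ts(distances, target_length):
    return gdt(distances, target_length, (1.0, 2.0, 4.0, 8.0))


def gdt_ha(distances, target_length):
    return gdt(distances, target_length, (0.5, 1.0, 2.0, 4.0))


# Hypothetical distances for 390 aligned residues: mostly well modelled,
# a few moderately off, two far off (for example the chain ends).
distances = [0.5] * 380 + [3.0] * 8 + [10.0] * 2
length = 390

print(round(tm_score(distances, length), 4))
print(round(gdt_ts(distances, length), 4))
print(round(gdt_ha(distances, length), 4))
```

Because every residue contributes at most 1 / L to the TM-score, a few badly placed residues lower the score only slightly, whereas they can dominate the RMSD; this is one reason why both values are listed in the tables.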


RMSD with SAP

<figtable id="tab:SAP_1R46_mod"> RMSD of Modeller models compared to 1R46

Columns: Model, Number of residues in common, Weighted RMSd, Un-weighted RMSd
Model 1 390 0.532 1.115
Model 2 390 0.571 1.574
Model 3 376 1.833 20.273
MULTI 1 390 0.396 0.575
MULTI 2 390 0.479 2.689
MULTI 3 385 11.003 17.580
MULTI 4 380 0.904 3.833
CHAS 390 0.532 1.115
CHAS 2 378 0.613 1.492
CHAS 3 378 0.613 1.492

</figtable>

<figure id="fig:RMSD_1ktb">

Depiction of the RMSD (green) of Model 2 (magenta) from Modeller and 1R46 (cyan)

</figure>

<figtable id="tab:SAP_1R47_mod"> RMSD of Modeller models compared to 1R47

Columns: Model, Number of residues in common, Weighted RMSd, Un-weighted RMSd
Model 1 391 nan nan
Model 2 391 0.717 1.569
Model 3 376 1.817 20.281
MULTI 1 390 0.396 0.575
MULTI 2 391 0.472 2.693
MULTI 3 383 9.297 17.430
MULTI 4 380 0.912 3.836
CHAS 391 nan nan
CHAS 2 378 0.618 1.498
CHAS 3 378 0.618 1.498

</figtable>


DOPE score

<figure id="fig:DOPE_Model">

Per residue DOPE score comparison of 1R46 (green) with Model 1, 2 and 3 (red, orange and pink)

</figure> <figure id="fig:DOPE_MULTI">

Per residue DOPE score comparison of 1R46 (green) with MULTI 1-4 (red, orange, pink and purple)

</figure> <figure id="fig:DOPE_CHAS">

Per residue DOPE score comparison of 1R46 (green) with CHAS 1, 2 and 3 (red, orange and pink)

</figure> <figure id="fig:DOPE_Best2">

Per residue DOPE score comparison of 1R46 (green) with the two subjectively best models Model 1 and MULTI 1 (red and orange)

</figure>


Swissmodel

Calculation of models

Evaluation

TM-score

<figtable id="tab:TMscore_1R46_sm"> TM-score Swissmodel 1R46

Columns: Model, Number of residues in common, RMSD of the common residues, TM-score, GDT-TS-score, GDT-HA-score
output_TMscore/out/1R46_Model_2.out 390 0.512 0.9950 0.9917 0.9218
output_TMscore/out/1R46_Model_3.out 390 1.551 0.9660 0.9032 0.7538

</figtable>

<figtable id="tab:TMscore_1R47_sm"> TM-score Swissmodel 1R47

Columns: Model, Number of residues in common, RMSD of the common residues, TM-score, GDT-TS-score, GDT-HA-score
output_TMscore/out/1R47_Model_2.out 390 0.515 0.9950 0.9923 0.9231
output_TMscore/out/1R47_Model_3.out 390 1.532 0.9667 0.9058 0.7545

</figtable>


RMSD with SAP


iTasser

3D-Jigsaw

References

<references/>