Difference between revisions of "Fabry:Homology based structure predictions"
Rackersederj (talk | contribs) m (→Dataset preparation and target comparison) |
Rackersederj (talk | contribs) m |
Revision as of 12:56, 27 May 2012
The following analyses were performed on the basis of the α-Galactosidase A sequence. Please consult the journal for the commands used to generate the results.
Dataset preparation and target comparison
Datasets
<figtable id="tab:datasetHHpred"> Dataset HHpred, E-value cutoff 1e-15
pdb ID | E-value | Identity in % |
---|---|---|
> 80% sequence identity | ||
3hg3 | 8.6e-90 | 100 |
40% - 80% sequence identity | ||
1ktb | 4.2e-85 | 53 |
< 30% sequence identity | ||
3cc1 | 5.5e-74 | 25 |
1zy9 | 3.1e-48 | 13 |
3a24 | 7.8e-40 | 17 |
2xn2 | 5.3e-37 | 15 |
2d73 | 5.7e-36 | 14 |
3mi6 | 1.4e-31 | 15 |
2yfo | 9.1e-30 | 13 |
2f2h | 2.7e-20 | 17 |
2g3m | 2.2e-20 | 16 |
3nsx | 6e-20 | 13 |
3lpp | 2.2e-18 | 15 |
3l4y | 1.9e-18 | 15 |
3top | 3.6e-18 | 12 |
2xvl | 3.2e-18 | 16 |
2x2h | 4.9e-16 | 13 |
</figtable>
<figtable id="tab:datasetHHpred"> Additional sequences HHpred, E-value cutoff 0.002
pdb ID | E-value | Identity in % |
---|---|---|
3zss | 0.00062 | 10 |
1j0h | 0.0011 | 15 |
1ea9 | 0.00098 | 12 |
</figtable>
<figtable id="tab:datasetCOMA"> Dataset COMA, E-value cutoff 0.002
pdb ID | E-value | Identity in % |
---|---|---|
> 80% sequence identity | ||
- | - | - |
40% - 80% sequence identity | ||
1ktb | 1.7e-61 | 52 |
< 30% sequence identity | ||
3lrk | 1.2e-66 | 23 |
3a21 | 2.7e-65 | 26 |
1szn | 3.7e-59 | 22 |
3cc1 | 5.2e-58 | 19 |
1zy9 | 1.7e-39 | 9 |
3mi6 | 4.3e-38 | 11 |
2yfn | 4.4e-35 | 10 |
2d73 | 1.9e-32 | 9 |
3a24 | 5.6e-30 | 10 |
1xsi | 1.9e-12 | 10 |
2g3m | 2.4e-11 | 10 |
3pha | 2.9e-10 | 6 |
3lpo | 4.7e-09 | 8 |
2x2h | 8.2e-09 | 8 |
3mo4 | 1.2e-08 | 7 |
2xvg | 2.4e-08 | 8 |
3ton | 4.3e-08 | 8 |
2xib | 1e-07 | 7 |
3eyp | 1.6e-06 | 8 |
3k1d | 3.5e-06 | 9 |
2zwy | 8.8e-06 | 9 |
3gza | 1.8e-05 | 8 |
3m07 | 2.3e-05 | 7 |
1eh9 | 0.00013 | 6 |
1gvi | 0.00035 | 8 |
1aqh | 0.00039 | 5 |
1mwo | 0.00058 | 7 |
3vmn | 0.0018 | 9 |
1bf2 | 0.0019 | 6 |
3aml | 0.0019 | 8 |
</figtable>
We performed an HHpred search as well as a COMA search to generate three distinct datasets. Since COMA did not find any homologous structure with a sequence identity above 80% (see <xr id="tab:datasetCOMA"/>), we used the dataset created with the HHpred search and the script described in the journal. With it we found one structure with a sequence identity above 80%, one with an identity between 40% and 80%, and 15 with an identity below 30%, of which 14 had an identity below 20% (see <xr id="tab:datasetHHpred" />). All HHpred matches had an E-value below 1e-15; for the COMA homologues we tried a less strict threshold of 0.002.
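The assignment of hits to identity classes can be sketched in a few lines of Python. This is only an illustration and not the script from the journal: the hit list is copied from the two HHpred tables above, while the names `HITS`, `identity_class` and `build_dataset` are hypothetical.

```python
from collections import defaultdict

# (pdb ID, E-value, identity in %) copied from the two HHpred tables above.
HITS = [
    ("3hg3", 8.6e-90, 100), ("1ktb", 4.2e-85, 53), ("3cc1", 5.5e-74, 25),
    ("1zy9", 3.1e-48, 13), ("3a24", 7.8e-40, 17), ("2xn2", 5.3e-37, 15),
    ("2d73", 5.7e-36, 14), ("3mi6", 1.4e-31, 15), ("2yfo", 9.1e-30, 13),
    ("2f2h", 2.7e-20, 17), ("2g3m", 2.2e-20, 16), ("3nsx", 6e-20, 13),
    ("3lpp", 2.2e-18, 15), ("3l4y", 1.9e-18, 15), ("3top", 3.6e-18, 12),
    ("2xvl", 3.2e-18, 16), ("2x2h", 4.9e-16, 13),
    ("3zss", 0.00062, 10), ("1j0h", 0.0011, 15), ("1ea9", 0.00098, 12),
]


def identity_class(identity):
    """Map a sequence identity in % to one of the three dataset classes."""
    if identity > 80:
        return "> 80%"
    if 40 <= identity <= 80:
        return "40% - 80%"
    if identity < 30:
        return "< 30%"
    return None  # 30% - 40% belongs to none of the classes used here


def build_dataset(hits, evalue_cutoff):
    """Group all hits with an E-value below the cutoff by identity class."""
    dataset = defaultdict(list)
    for pdb_id, evalue, identity in hits:
        label = identity_class(identity)
        if evalue < evalue_cutoff and label is not None:
            dataset[label].append(pdb_id)
    return dict(dataset)


if __name__ == "__main__":
    for label, ids in build_dataset(HITS, 1e-15).items():
        print(label, len(ids), ids)
```

Calling `build_dataset(HITS, 0.002)` instead applies the less strict cutoff, so the three additional HHpred sequences are included as well.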
In most cases we used the structures 3hg3, 1ktb and 3cc1 for modelling, either because they are the only representatives of their class or, in the case of 3cc1, because the sequence identity did not seem too low. For the model MULTI 3 we also used the structures 3a24 and 3zss. The latter has an E-value of 0.00062. We added this structure to examine how a template would perform whose E-value is worse than that of all our other structures, but which would still fulfil the restrictions of a usual BLAST search (threshold of 0.003).
It is important to mention that, although the identity of 3hg3 is 100%, it is not the PDB structure annotated for the AGAL protein but the structure of the substrate-bound catalytic mechanism, hence the high similarity.
1ktb is the X-ray structure of the already mentioned α-N-acetylgalactosaminidase from chicken, which might in future be used for enzyme replacement therapy in the treatment of Fabry Disease.
The last of the frequently used structures, 3cc1, is the X-ray structure of a putative α-N-acetylgalactosaminidase from Bacillus halodurans.
Target comparison
<figure id="fig:GAL:1R47">
</figure>
As an initial step of the evaluation, we compared the apo structure 1R46 and the complex structure 1R47 (with bound α-galactose). The alignment of the A chains of 1R46 and 1R47 in PyMOL (see <xr id="tab:compare"/>) gave an RMSD of 0.248 Å, and the positions and orientations of the residues involved in binding the sugar (see <xr id="fig:GAL:1R47"/>) do not differ significantly. We therefore used only the 1R46 structure for visualisation, but computed all values and statistics for both structures.
In the right figure in <xr id="tab:compare"/>, the residues Asp92A, Asp93A, Lys168A, Arg227A and Asp231A are shown in stick representation (thicker); they are responsible for binding the sugar in the complex structure, which is shown in magenta. There is clearly little difference between 1R46 and 1R47 in this region.
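For readers unfamiliar with the measure, an RMSD is the square root of the mean squared distance between paired atoms after superposition. The sketch below computes it for two coordinate lists that are assumed to be already superposed and paired atom by atom; it does not perform the superposition itself. The function name `rmsd` and the toy coordinates are hypothetical and are not data from 1R46 or 1R47.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superposed and that
    coords_a[i] and coords_b[i] belong to the same atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates only, not taken from 1R46 or 1R47.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(a, b))
```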
Modeller
Calculation of models
With this tool we created 10 models (see Journal). The first three were produced with the standard settings and workflow of Modeller. The subsequent four models were computed from multiple template files in different combinations, and for the last three models we rearranged the alignment files in order to test the quality of the alignment and the influence of the two types of alignment.
Default settings
Model 1
<figtable id="tab:Modeller_scores_3hg3_1"> Modeller scores Model 3hg3, Distances
Model | Distances > 8.0 Å in 2d alignment | Distances > 8.0 Å |
---|---|---|
3hg3 | Pos: 428 Dist 76.568 | Pos: 1 Dist 28.357; Pos: 91 Dist 8.810; Pos: 101 Dist 17.314; Pos: 112 Dist 25.386; Pos: 160 Dist 32.647; Pos: 318 Dist 27.449; Pos: 333 Dist 42.457 |
</figtable> <figtable id="tab:Modeller_scores_3hg3_2"> Modeller scores Model 3hg3
% sequID | Sequ length | Compactness | Native energy (pair) | Native energy (surface) | Native energy (combined) | Z score (pair) | Z score (surface) | Z score (combined) | GA341 score | DOPE score |
---|---|---|---|---|---|---|---|---|---|---|
95.570999 | 429 | 0.215183 | -213.518650 | -9.487873 | -5.603112 | -10.125743 | -6.381974 | -11.484159 | 1.000000 | -52607.89844 |
</figtable>
Model 2
<figtable id="tab:Modeller_scores_1ktb"> Modeller scores Model 1ktb
Model | Distances > 8.0 Å in 2d alignment | Distances > 8.0 Å | % sequID | Sequ length | Compactness | Native energy (pair) | Native energy (surface) | Native energy (combined) | Z score (pair) | Z score (surface) | Z score (combined) | GA341 score | DOPE score |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1ktb | Pos: 0 Dist 0 | Pos: 0 Dist 0 | 53.351002 | 429 | 0.176840 | -107.285679 | -7.043755 | -3.262988 | -8.593054 | -6.151719 | -10.076556 | 1.000000 | -49267.35156 |
</figtable>
Model 3
<figtable id="tab:Modeller_scores_3cc1_1"> Modeller scores Model 3cc1, Distances
Model | Distances > 8.0 Å in 2d alignment | Distances > 8.0 Å |
---|---|---|
3cc1 | Pos: 433 Dist 63.967 | Pos: 147 Dist 25.085; Pos: 290 Dist 19.238; Pos: 374 Dist 24.356; Pos: 395 Dist 15.007; Pos: 412 Dist 61.733; Pos: 452 Dist 23.680; Pos: 631 Dist 23.283; Pos: 659 Dist 8.421; Pos: 684 Dist 10.763; Pos: 703 Dist 10.204; Pos: 762 Dist 10.753 |
</figtable> <figtable id="tab:Modeller_scores_3cc1_2"> Modeller scores Model 3cc1
% sequID | Sequ length | Compactness | Native energy (pair) | Native energy (surface) | Native energy (combined) | Z score (pair) | Z score (surface) | Z score (combined) | GA341 score | DOPE score |
---|---|---|---|---|---|---|---|---|---|---|
24.242001 | 429 | 0.139850 | 198.134571 | 24.857669 | 7.885459 | -3.800148 | -1.528096 | -3.572654 | 0.332343 | -38190.22656 |
</figtable>
Multiple templates
MULTI 1
MULTI 2
MULTI 3
MULTI 4
Edited Alignment input
CHAS and CHAS 2
<figtable id="tab:Modeller_scores_CHAS_1"> Modeller scores Model CHAS, Distances
Model | Distances > 8.0 Å |
---|---|
CHAS | Pos 1 Dist 28.357; Pos 91 Dist 8.810; Pos 101 Dist 17.314; Pos 112 Dist 25.386; Pos 160 Dist 32.647; Pos 318 Dist 27.449; Pos 333 Dist 42.457 |
</figtable> <figtable id="tab:Modeller_scores_CHAS_2"> Modeller scores Model CHAS
% sequID | Sequ length | Compactness | Native energy (pair) | Native energy (surface) | Native energy (combined) | Z score (pair) | Z score (surface) | Z score (combined) | GA341 score | DOPE score |
---|---|---|---|---|---|---|---|---|---|---|
95.570999 | 429 | 0.215183 | -213.518650 | -9.487873 | -5.603112 | -10.125743 | -6.381974 | -11.484159 | 1.000000 | -52607.89844 |
</figtable>
<figtable id="tab:Modeller_scores_CHAS2_1"> Modeller scores Model CHAS2
Model | Distances > 8.0 Å |
---|---|
CHAS2 | Pos 1 Dist 28.357; Pos 91 Dist 8.810; Pos 101 Dist 17.314; Pos 112 Dist 25.386; Pos 160 Dist 32.647; Pos 318 Dist 27.449; Pos 333 Dist 42.457; Pos 538 Dist 10.921; Pos 601 Dist 10.536 |
</figtable> <figtable id="tab:Modeller_scores_CHAS2_2"> Modeller scores Model CHAS2
% sequID | Sequ length | Compactness | Native energy (pair) | Native energy (surface) | Native energy (combined) | Z score (pair) | Z score (surface) | Z score (combined) | GA341 score | DOPE score |
---|---|---|---|---|---|---|---|---|---|---|
40.326000 | 429 | 0.204903 | 122.562212 | 22.166245 | 5.785874 | -4.522039 | -1.645839 | -4.686339 | 0.995974 | -40807.12109 |
</figtable>
CHAS 3
Methods:
alignment.malign() -- align two or more sequences
alignment.align2d() -- align sequences with structures. The alignment.align2d() command is preferred for aligning a sequence with structure(s) in comparative modeling because it tends to place gaps in a better structural context.
<figtable id="tab:Modeller_scores_CHAS3_1"> Modeller scores Model CHAS3, Distances
Model | Distances > 8.0 Å |
---|---|
CHAS3 | Pos 1 Dist 28.357; Pos 91 Dist 8.810; Pos 101 Dist 17.314; Pos 112 Dist 25.386; Pos 160 Dist 32.647; Pos 318 Dist 27.449; Pos 333 Dist 42.457; Pos 538 Dist 10.921; Pos 601 Dist 10.536 |
</figtable> <figtable id="tab:Modeller_scores_CHAS3_2"> Modeller scores Model CHAS3
% sequID | Sequ length | Compactness | Native energy (pair) | Native energy (surface) | Native energy (combined) | Z score (pair) | Z score (surface) | Z score (combined) | GA341 score | DOPE score |
---|---|---|---|---|---|---|---|---|---|---|
40.326000 | 429 | 0.204903 | 122.562212 | 22.166245 | 5.785874 | -4.522039 | -1.645839 | -4.686339 | 0.995974 | -40807.12109 |
</figtable>
Evaluation
TM-score
<figtable id="tab:TMscore_1R46"> TM-score
Model | Number of residues in common | RMSD of the common residues | TM-score | GDT-TS-score | GDT-HA-score |
---|---|---|---|---|---|
Model 1 | 390 | 1.115 | 0.9841 | 0.9667 | 0.8558 |
Model 2 | 390 | 2.098 | 0.9596 | 0.9071 | 0.7635 |
Model 3 | 390 | 22.707 | 0.4087 | 0.2699 | 0.1814 |
MULTI 1 | 390 | 0.575 | 0.9938 | 0.9910 | 0.9128 |
MULTI 2 | 390 | 12.625 | 0.7364 | 0.6949 | 0.6404 |
MULTI 3 | 390 | 21.196 | 0.2048 | 0.0673 | 0.0314 |
MULTI 4 | 390 | 10.798 | 0.7405 | 0.6737 | 0.5833 |
CHAS | 390 | 1.115 | 0.9841 | 0.9667 | 0.8558 |
CHAS 2 | 390 | 15.292 | 0.4651 | 0.3622 | 0.3038 |
CHAS 3 | 390 | 15.292 | 0.4651 | 0.3622 | 0.3038 |
</figtable>
<figtable id="tab:TMscore_1R47"> TM-score
Model | Number of residues in common | RMSD of the common residues | TM-score | GDT-TS-score | GDT-HA-score |
---|---|---|---|---|---|
Model 1 | 390 | 1.119 | 0.9840 | 0.9654 | 0.8519 |
Model 2 | 390 | 2.093 | 0.9600 | 0.9083 | 0.7647 |
Model 3 | 390 | 22.713 | 0.4092 | 0.2731 | 0.1821 |
MULTI 1 | 390 | 0.575 | 0.9938 | 0.9897 | 0.9115 |
MULTI 2 | 390 | 12.609 | 0.7363 | 0.6942 | 0.6378 |
MULTI 3 | 390 | 21.191 | 0.2058 | 0.0679 | 0.0314 |
MULTI 4 | 390 | 10.793 | 0.7405 | 0.6744 | 0.5846 |
CHAS | 390 | 1.119 | 0.9840 | 0.9654 | 0.8519 |
CHAS 2 | 390 | 15.290 | 0.4652 | 0.3635 | 0.3019 |
CHAS 3 | 390 | 15.290 | 0.4652 | 0.3635 | 0.3019 |
</figtable>
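To make the columns of the two tables easier to interpret, the following sketch shows how TM-score, GDT-TS and GDT-HA are commonly defined from the per-residue Cα distances of a superposition. It is an illustration under stated assumptions and not the TM-score program itself: the program searches for the superposition that maximises each score, whereas the hypothetical functions `tm_score`, `gdt_ts` and `gdt_ha` only score one given list of distances. The lower bound on `d0` and the inclusive cutoffs are also choices made for this sketch.

```python
def d0(target_length):
    """Length-dependent distance scale of the TM-score, in Å."""
    if target_length <= 15:
        raise ValueError("this sketch needs a target of more than 15 residues")
    # The lower bound of 0.5 Å is a guard chosen for this sketch.
    return max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)


def tm_score(distances, target_length):
    """TM-score of one superposition; distances are given in Å."""
    scale = d0(target_length)
    total = sum(1.0 / (1.0 + (d / scale) ** 2) for d in distances)
    return total / target_length


def _gdt(distances, target_length, cutoffs):
    """Mean fraction of target residues within each distance cutoff."""
    fractions = [
        sum(1 for d in distances if d <= cutoff) / target_length
        for cutoff in cutoffs
    ]
    return sum(fractions) / len(cutoffs)


def gdt_ts(distances, target_length):
    """GDT-TS: cutoffs of 1, 2, 4 and 8 Å."""
    return _gdt(distances, target_length, (1.0, 2.0, 4.0, 8.0))


def gdt_ha(distances, target_length):
    """GDT-HA: stricter cutoffs of 0.5, 1, 2 and 4 Å."""
    return _gdt(distances, target_length, (0.5, 1.0, 2.0, 4.0))


if __name__ == "__main__":
    # Toy input: 390 aligned residues, all 3 Å away from their partners.
    toy = [3.0] * 390
    print(tm_score(toy, 390), gdt_ts(toy, 390), gdt_ha(toy, 390))
```

All three scores lie between 0 and 1, with higher values meaning a closer match. As a commonly quoted rule of thumb, a TM-score above 0.5 usually indicates the same fold, which helps when reading the values for Model 3 and MULTI 3 in the tables above.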
RMSD with SAP
<figtable id="tab:SAP_1R46_mod"> RMSD of Modeller models compared to 1R46
Model | Number of residues in common | Weighted RMSd | Un-weighted RMSd |
---|---|---|---|
Model 1 | 390 | 0.532 | 1.115 |
Model 2 | 390 | 0.571 | 1.574 |
Model 3 | 376 | 1.833 | 20.273 |
MULTI 1 | 390 | 0.396 | 0.575 |
MULTI 2 | 390 | 0.479 | 2.689 |
MULTI 3 | 385 | 11.003 | 17.580 |
MULTI 4 | 380 | 0.904 | 3.833 |
CHAS | 390 | 0.532 | 1.115 |
CHAS 2 | 378 | 0.613 | 1.492 |
CHAS 3 | 378 | 0.613 | 1.492 |
</figtable>
<figure id="fig:RMSD_1ktb">
</figure>
<figtable id="tab:SAP_1R47_mod"> RMSD of Modeller models compared to 1R47
Model | Number of residues in common | Weighted RMSd | Un-weighted RMSd |
---|---|---|---|
Model 1 | 391 | nan | nan |
Model 2 | 391 | 0.717 | 1.569 |
Model 3 | 376 | 1.817 | 20.281 |
MULTI 1 | 390 | 0.396 | 0.575 |
MULTI 2 | 391 | 0.472 | 2.693 |
MULTI 3 | 383 | 9.297 | 17.430 |
MULTI 4 | 380 | 0.912 | 3.836 |
CHAS | 391 | nan | nan |
CHAS 2 | 378 | 0.618 | 1.498 |
CHAS 3 | 378 | 0.618 | 1.498 |
</figtable>
DOPE score
<figure id="fig:DOPE_Model">
</figure> <figure id="fig:DOPE_MULTI">
</figure> <figure id="fig:DOPE_CHAS">
</figure> <figure id="fig:DOPE_Best2">
</figure>
Swissmodel
Calculation of models
Evaluation
TM-score
<figtable id="tab:TMscore_1R46_sm"> TM-score Swissmodel 1R46
Model | Number of residues in common | RMSD of the common residues | TM-score | GDT-TS-score | GDT-HA-score |
---|---|---|---|---|---|
output_TMscore/out/1R46_Model_2.out | 390 | 0.512 | 0.9950 | 0.9917 | 0.9218 |
output_TMscore/out/1R46_Model_3.out | 390 | 1.551 | 0.9660 | 0.9032 | 0.7538 |
</figtable>
<figtable id="tab:TMscore_1R47_sm"> TM-score Swissmodel 1R47
Model | Number of residues in common | RMSD of the common residues | TM-score | GDT-TS-score | GDT-HA-score |
---|---|---|---|---|---|
output_TMscore/out/1R47_Model_2.out | 390 | 0.515 | 0.9950 | 0.9923 | 0.9231 |
output_TMscore/out/1R47_Model_3.out | 390 | 1.532 | 0.9667 | 0.9058 | 0.7545 |
</figtable>
RMSD with SAP
iTasser
3D-Jigsaw
References
<references/>