Difference between revisions of "Homology-modelling HEXA"
Revision as of 19:45, 30 August 2011
Homology structure groups
We chose one protein from each sequence identity group, as shown in the following table. These proteins were used for almost all of the homology-based applications described below.
(The complete HHsearch output can be found [here])
Sequence identity group | PDB id | name | similarity |
> 60% | 3bc9_A | AMYB, alpha amylase | 80.8% |
> 40% | 3cui_A | EXO-beta-1,4-glucanase; | 49.5% |
< 40% | 3hn3_A | Beta-G1, beta-glucuroni | 25.1% |
< 40% | 3lut_A | Voltage-gated potassium | 20.1% |
Swissmodel
Calculation
To calculate the models with Swiss-Model we used the [Webserver]. For the template with high sequence identity we used both the automated mode and the alignment mode; for the other two templates we used only the alignment mode.
The alignments used can be found [here].
Results
3BC9:
Unfortunately, Swissmodel was not able to align our template 3BC9 with our target sequence. Even when we tried other templates with high sequence identity (1L8N, 1ZJA), Swissmodel could not align them with the target. Therefore, we do not have a prediction for a template with high sequence identity to the target.
3CUI:
The detailed prediction can be found [here]
Swiss-Model returns several scores that allow the user to estimate the quality of the predicted model. These scores are shown in the next table. The most important one is the QMEAN4 score, because it combines the four terms listed above it and allows the user to compare different results.
Global Score | ||
Scoring function term | Raw score | Z score |
C_beta interaction energy | 202.24 | -4.65 |
All-atom pairwise energy | 9942.28 | -6.16 |
Solvation energy | 67.79 | -8.08 |
Torsion angle energy | 76.36 | -7.72 |
QMEAN4 score | 0.057 | -11.76 |
Furthermore, Swiss-Model returns two structure views: the predicted structure of HEXA-HUMAN with 3CUI as template, and a view of the wrongly predicted residues. Both are displayed in the following figures:
In addition, Swissmodel creates several plots that show the quality of the model. They are shown in the following figures:
As the first picture shows, our model lies outside the distribution of the Z-scores of the reference structures, so our model is not very good. In the next picture, the QMEAN score of the model lies to the left of the Gaussian curve, which points to the same conclusion. Next, the quality of our model can be compared with X-ray structures, which normally have a Z-score of about 0. In our case all Z-scores are significantly below 0, so this picture again shows that the model is bad. The best term of our model is the C-beta interaction energy with a Z-score of -4.65, which is still far from 0. The other terms have even lower Z-scores and are therefore worse. We therefore conclude that the model is quite bad. The last picture plots the wrongly predicted residues. Most of the residues have a prediction error of more than 10, which is extremely high.
In general, our model does not have a very good quality. This should be kept in mind when analysing the prediction result, because it is nearly impossible to get a good prediction with a bad template.
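The comparison of the Z-scores with the X-ray reference value of 0 can be sketched in a few lines of Python. This is only an illustration using the Z-scores from the 3CUI table above; the function name `rank_by_distance_from_zero` is our own and not part of Swiss-Model.

```python
# QMEAN Z-scores of the 3CUI-based model, copied from the table above.
z_scores = {
    "C_beta interaction energy": -4.65,
    "All-atom pairwise energy": -6.16,
    "Solvation energy": -8.08,
    "Torsion angle energy": -7.72,
    "QMEAN4 score": -11.76,
}


def rank_by_distance_from_zero(scores):
    """Sort the terms so that the one closest to Z = 0 comes first."""
    return sorted(scores.items(), key=lambda item: abs(item[1]))


ranked = rank_by_distance_from_zero(z_scores)
for term, z in ranked:
    print(f"{term:28s} Z = {z:7.2f}")
```

The first entry of `ranked` is the term that comes closest to the X-ray reference, the last entry the one that deviates most.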
3LUT:
We wanted to model the 3D structure with a template of very low sequence identity and therefore chose 3LUT. Unfortunately, Swissmodel could not build a model with this template, because it was not able to create an alignment that could serve as the basis for the model. We therefore calculated a model with 3HN3, the template with the next-higher sequence identity. Here we present the results of the modelling with 3HN3, whereas for the other methods we used 3LUT.
3HN3:
The detailed prediction can be found [here]
Swiss-Model returns several scores that allow the user to estimate the quality of the predicted model. These scores are shown in the next table. The most important one is the QMEAN4 score, because it combines the four terms listed above it and allows the user to compare different results.
Global Score | ||
Scoring function term | Raw score | Z score |
C_beta interaction energy | 120.70 | -5.31 |
All-atom pairwise energy | 2585.98 | -5.22 |
Solvation energy | 71.87 | -9.92 |
Torsion angle energy | 80.43 | -8.44 |
QMEAN4 score | 0.010 | -12.80 |
Furthermore, Swiss-Model returns two structure views: the predicted structure of HEXA-HUMAN with 3HN3 as template, and a view of the wrongly predicted residues. Both are displayed in the following figures:
In addition, Swissmodel creates several plots that show the quality of the model. They are shown in the following figures:
RMSD and TM-Score
The next step after the use of Swissmodel is to check the quality of the predicted structure. Therefore, we calculated the RMSD and the TM-score. The RMSD (root-mean-square deviation) measures the average distance between aligned residue pairs. An RMSD near 0 is a very good result, because then the two structures deviate only slightly from each other. However, the RMSD weights the distances of all residue pairs equally. A few very distant residues can therefore raise the RMSD dramatically, although the overall topology of the two proteins is quite similar. Another problem is that the RMSD does not take the length of the two proteins into account. Long proteins therefore almost always have a worse RMSD than short ones, even if the topology of both protein pairs is equally similar.
We calculated the RMSD with PyMOL and also with TM-align. The aligned structures were always the original hexosaminidase chain A and the predicted structure. The RMSD values are displayed in the first two rows of the following table. As you can see, the two RMSD values differ considerably. This can be explained by the different ways in which the two programs calculate the RMSD.
It is therefore important to clarify how the two methods work. PyMOL first performs a sequence alignment and then superposes the structures so that the RMSD between all aligned residues is minimized. TM-align, in contrast, first superposes one structure onto the other in an optimal way, independently of the sequence, and then calculates the RMSD between the corresponding residues. The two approaches are fundamentally different and therefore lead to different results.
The TM-score additionally takes the length of the protein structures into account. A TM-score of 1 means that both structures are identical, a TM-score > 0.5 means that both structures have the same fold, whereas a TM-score < 0.2 means that both structures are totally different.
The TM-score has some problems as well. The most important one is that, if it is impossible to align any residues between the two structures, the score will be 1. So if a score of 1 is reported, look at the picture to check whether the structures are really identical or whether the TM-score failed.
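The different behaviour of the two scores can be illustrated with a small Python sketch. It assumes that the structures are already superposed and that we are given the distance (in Å) of every aligned residue pair. The TM-score term uses the commonly cited scale d0 = 1.24 (L - 15)^(1/3) - 1.8 for a target of length L; the restriction to L > 21 is a simplification of ours, and the real TM-align program additionally searches for the superposition that maximizes the score.

```python
import math


def rmsd(distances):
    """Root-mean-square deviation over all aligned residue pairs."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_score(distances, target_length):
    """TM-score for a fixed superposition, normalized by the target length."""
    if target_length <= 21:
        # Simplification for this sketch: very short targets are not handled.
        raise ValueError("this sketch only handles targets longer than 21 residues")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Hypothetical example: 100 aligned residues, all 1 Å apart ...
good = [1.0] * 100
# ... and the same alignment with a single residue that is 50 Å off.
one_outlier = [1.0] * 99 + [50.0]

for name, dist in (("good", good), ("one outlier", one_outlier)):
    print(f"{name:12s} RMSD = {rmsd(dist):5.2f}  TM-score = {tm_score(dist, 100):.3f}")
```

In this made-up example a single distant residue raises the RMSD about fivefold, while the TM-score changes only slightly, because every residue pair contributes at most 1/L to the TM-score.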
Measure | 3BC9 | 3CUI | 3HN3 |
RMSD (Pymol) | no result | 24.447 | 27.968 |
RMSD (TM-align) | no result | 5.49 | 4.30 |
TM Score (TM-align) | no result | 0.45333 | 0.40661 |
Structural alignment (Pymol) | no result | ||
Structural alignment (TM-align) | no result |
Discussion
Modeller
Calculation
We used Modeller from the command line and followed the instructions described [here].
First, we created an alignment between the target and each of our three selected template sequences. In the next step we used Modeller to model the 3D structure of the protein.
For Modeller we used the PIR alignment format. The alignments can be found here: [3BC9], [3CUI], [3LUT]
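For readers who have not seen the PIR format, the following Python sketch writes a target-template alignment in the layout Modeller expects: a `>P1;code` line, a header line with ten colon-separated fields, and the aligned sequence terminated by `*`. The sequences and the entry code `hexa` are short hypothetical placeholders, not our real alignment, and the helper `pir_entry` is our own and not part of Modeller.

```python
def pir_entry(code, aligned_sequence, is_template, chain="A"):
    """Format one entry of a PIR alignment file for Modeller."""
    if is_template:
        # Template with known structure: use the whole chain (FIRST to LAST).
        fields = ["structureX", code, "FIRST", chain, "LAST", chain, "", "", "", ""]
    else:
        # Target sequence without structure: the remaining fields stay empty.
        fields = ["sequence", code, "", "", "", "", "", "", "", ""]
    body = aligned_sequence + "*"
    # Wrap the sequence at 75 characters per line.
    wrapped = "\n".join(body[i:i + 75] for i in range(0, len(body), 75))
    return f">P1;{code}\n{':'.join(fields)}\n{wrapped}\n"


# Hypothetical toy alignment; gaps are written as '-'.
template = pir_entry("3cui", "MKV--LLAAGT", is_template=True)
target = pir_entry("hexa", "MTSSRLWFSGT", is_template=False)
alignment = template + target
print(alignment)
```

Both sequences must have the same length including gaps, because they represent the two rows of one alignment.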
Results
Modeller calculated one model for each template (3BC9, 3CUI and 3LUT). The models are shown in the following pictures:
3BC9 | 3CUI | 3LUT |
RMSD and TM-Score
As for Swissmodel, we checked the quality of the Modeller models with the RMSD (calculated by PyMOL and by TM-align) and the TM-score, always comparing the predicted structure with the original hexosaminidase chain A. Both scores and their limitations are explained in the RMSD and TM-Score section for Swissmodel.
Measure | 3BC9 | 3CUI | 3LUT |
RMSD (Pymol) | 26.271 | 23.856 | 24.153 |
RMSD (TM-align) | 5.94 | 5.46 | 5.29 |
TM Score (TM-align) | 0.43072 | 0.44048 | 0.38126 |
Structural alignment (Pymol) | |||
Structural alignment (TM-align) |
Discussion
iTasser
Calculation of the models
To calculate our models with iTasser we used the [Webserver].
For the calculation we had to define the target sequence and the template. The target sequence had to be pasted into a text field, whereas the template was specified by its PDB id. Furthermore, we excluded our own sequence and very similar sequences as templates from the iTasser library. To do so, we first defined a sequence identity cutoff of 80%. In addition, we created a further input file which contains our own structure and the similar ones, so that these structures are not used as templates. To find similar structures we used 3D-Blast. In our case we did not find any other structure with a high similarity score, so our input file contained only our own structure.
Results
iTasser delivers a wide range of results. The first ones are the predicted secondary structure and the predicted solvent accessibility. Furthermore, it provides the top 5 predicted models, the predicted function, the predicted GO terms and the predicted binding site. The predicted secondary structure elements are shown as H for alpha helix (red), S for beta sheet (blue) and C for coil (yellow). The predicted solvent accessibility ranges from 0 (buried residue) to 9 (highly exposed residue). The predicted function is given as EC numbers, which are listed together with scores such as the TM-score and the RMSD. The predicted GO terms describe the molecular function, the biological process or the cellular location. There are many different predicted GO terms for each protein.
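Counting the helices and sheets in such an H/S/C string can be sketched as follows. This is a minimal example on a hypothetical short string, not our real iTasser output; each uninterrupted run of the same letter is counted as one element.

```python
from itertools import groupby


def count_segments(ss_string):
    """Count runs of identical states in a secondary structure string."""
    counts = {}
    for state, _run in groupby(ss_string):
        counts[state] = counts.get(state, 0) + 1
    return counts


# Hypothetical example string (H = helix, S = sheet, C = coil).
example = "CCHHHHHHCCSSSSCCCHHHHCSSSC"
segments = count_segments(example)
print(segments)
```

Note that very short runs (for example a single H) are also counted as an element here; a stricter count would require a minimum run length.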
3BC9:
The following three pictures show the predicted secondary structure, the predicted solvent accessibility and the predicted binding site of 3BC9.
The predicted secondary structure contains 17 alpha-helices and 18 beta-sheets. At first sight these numbers agree with the secondary structure predicted in the last task, where the number of alpha-helices was 14-16 and the number of sheets was 15. This is a good sign and suggests that the predicted structure is probably good.
The predicted solvent accessibility is displayed as a value that ranges from 0 (buried residue) to 9 (highly exposed residue).
The predicted binding site displays the bound ligand and the residues that interact with it. The colors of the ligand correspond to the CPK colors of Jmol: grey stands for a C atom, red for an O atom and blue for an N atom. The residues are displayed in violet; the first character is the amino acid and the following number gives its position in the sequence. For 3BC9, 10 amino acids were predicted to be involved in the binding site: 1 histidine, 1 arginine, 2 glutamic acids, 3 tryptophans, 1 aspartic acid, 1 lysine and 1 tyrosine.
The following pictures show the top 5 models predicted by iTasser. These five models are the best predictions of the overall 3D structure. Each model has its own C-score, which estimates the quality of the predicted model. A high C-score signifies a model with high confidence and vice versa. The C-score normally ranges from -5 to 2. For 3BC9 the C-score ranges from about -2.1 to 0.2. It indicates that the most confident model is the first one and the least confident is model 4. Notably, models 2 and 3 have the same C-score.
The detailed prediction can be found [here]
3CUI:
The following three pictures show the predicted secondary structure, the predicted solvent accessibility and the predicted binding site of 3CUI.
The predicted secondary structure contains 17 alpha-helices and 18 beta-sheets. At first sight these numbers again agree with the secondary structure predicted in the last task, where the number of alpha-helices was 14-16 and the number of sheets was 15. Furthermore, they correspond to the numbers predicted for 3BC9. This is a good sign and suggests that the predicted structure is probably good.
The predicted solvent accessibility is displayed as a value that ranges from 0 (buried residue) to 9 (highly exposed residue).
The predicted binding site displays the bound ligand and the residues that interact with it. The colors of the ligand correspond to the CPK colors of Jmol: grey stands for a C atom, red for an O atom and blue for an N atom. The residues are displayed in violet; the first character is the amino acid and the following number gives its position in the sequence. For 3CUI, 9 amino acids were predicted to be involved in the binding site: 1 histidine, 1 arginine, 2 glutamic acids, 3 tryptophans, 1 aspartic acid and 1 alanine.
The following pictures show the top 5 models predicted by iTasser. These five models are the best predictions of the overall 3D structure. Each model has its own C-score, which estimates the quality of the predicted model. A high C-score signifies a model with high confidence and vice versa. The C-score normally ranges from -5 to 2. For 3CUI the C-score ranges from about -3.4 to 0.4. It indicates that the most confident model is the first one and the least confident is model 5.
The detailed prediction can be found [here]
3LUT:
The following three pictures show the predicted secondary structure, the predicted solvent accessibility and the predicted binding site of 3LUT.
The predicted secondary structure contains 17 alpha-helices and 18 beta-sheets. At first sight these numbers again agree with the secondary structure predicted in the last task, where the number of alpha-helices was 14-16 and the number of sheets was 15. Furthermore, they correspond to the numbers predicted for 3BC9 and 3CUI. This is a good sign and suggests that the predicted structure is probably good.
The predicted solvent accessibility is displayed as a value that ranges from 0 (buried residue) to 9 (highly exposed residue).
The predicted binding site displays the bound ligand and the residues that interact with it. The colors of the ligand correspond to the CPK colors of Jmol: grey stands for a C atom, red for an O atom and blue for an N atom. The residues are displayed in violet; the first character is the amino acid and the following number gives its position in the sequence. For 3LUT, 9 amino acids were predicted to be involved in the binding site: 1 histidine, 1 arginine, 1 glutamic acid, 3 tryptophans, 1 aspartic acid, 1 tyrosine and 1 asparagine.
The following pictures show the top 5 models predicted by iTasser. These five models are the best predictions of the overall 3D structure. Each model has its own C-score, which estimates the quality of the predicted model. A high C-score signifies a model with high confidence and vice versa. The C-score normally ranges from -5 to 2. For 3LUT the C-score ranges from about -3.6 to 0.3. It indicates that the most confident model is the first one and the least confident is model 5.
The detailed prediction can be found [here]
RMSD and TM-Score
As for Swissmodel and Modeller, we checked the quality of the iTasser models with the RMSD (calculated by PyMOL and by TM-align) and the TM-score, always comparing the predicted structure with the original hexosaminidase chain A. Both scores and their limitations are explained in the RMSD and TM-Score section for Swissmodel.
3BC9
The following table displays the RMSD and the TM-score from PyMOL and TM-align. Furthermore, it contains the structure alignments of both methods, with the predicted model in red and the original hexosaminidase chain A in green.
Looking at the RMSD from PyMOL, the best models are the first and the fourth one. The other three models have a very high RMSD. This agrees with the structure alignments of PyMOL, where these three models align very badly and many structure elements do not align at all.
In contrast, the RMSD of TM-align is in most cases much lower. Here the fourth model no longer stands out; its RMSD is close to that of models 2, 3 and 5. Only model 1 has a very good RMSD.
The TM-score agrees more with the RMSD of PyMOL: the best score is obtained for model 1, followed by model 4. The other three models have a worse TM-score.
The structure alignments of TM-align agree with the TM-score and nearly with the RMSD. In all cases the alignments seem to be only slightly better than the ones from PyMOL, and not many more structure elements agree with the original structure of hexosaminidase chain A. Model 3 stands out, because one part of it does not match the hexosaminidase chain A structure at all. A closer look at the other two structures with a bad RMSD shows that they also have only a few matches.
Measure | iTasser Model 1 | iTasser Model 2 | iTasser Model 3 | iTasser Model 4 | iTasser Model 5 |
RMSD (Pymol) | 1.118 | 15.047 | 19.744 | 6.309 | 15.694 |
RMSD (TM-align) | 1.45 | 5.88 | 5.77 | 5.25 | 5.80 |
TM Score | 0.90210 | 0.41932 | 0.39410 | 0.62190 | 0.41661 |
Structural alignment (Pymol) | |||||
Structural alignment (TM-align) |
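The model rankings discussed for this table can be reproduced with a short Python sketch. The values are copied from the 3BC9 table above; the helper `ranking` is our own.

```python
# Values for iTasser models 1-5 (template 3BC9), copied from the table above.
pymol_rmsd = [1.118, 15.047, 19.744, 6.309, 15.694]
tmalign_rmsd = [1.45, 5.88, 5.77, 5.25, 5.80]
tm_scores = [0.90210, 0.41932, 0.39410, 0.62190, 0.41661]


def ranking(values, higher_is_better=False):
    """Return the model numbers (1-based), ordered from best to worst."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    return [i + 1 for i in order]


by_pymol = ranking(pymol_rmsd)
by_tmalign = ranking(tmalign_rmsd)
by_tm_score = ranking(tm_scores, higher_is_better=True)

print("RMSD (PyMOL):   ", by_pymol)
print("RMSD (TM-align):", by_tmalign)
print("TM-score:       ", by_tm_score)
```

The ranking by TM-score is identical to the ranking by the PyMOL RMSD, whereas the TM-align RMSD orders models 2, 3 and 5 differently because their values are very close to each other.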
3CUI
The following table displays the RMSD and the TM-score from PyMOL and TM-align. Furthermore, it contains the structure alignments of both methods, with the predicted model in red and the original hexosaminidase chain A in green.
Looking at the RMSD from PyMOL, the best models are the first two. The other three models have a higher RMSD, and model 5 has the worst one. This agrees with the structure alignments of PyMOL, where the first two models align well and the remaining alignments are clearly worse. The last one has the fewest matches, which corresponds to its high RMSD.
The RMSD of TM-align, in contrast, gives a different order. Here the second model has the lowest RMSD, followed by the first model.
The fifth model is again the worst one, but its RMSD is not strikingly higher than that of models 3 and 4. This means that the last three models are almost equally good.
The TM-score agrees more with the RMSD of TM-align: the best score is obtained for model 2, followed by model 1. The other three models have a worse TM-score.
The structure alignments of TM-align agree with the TM-score and with the RMSD. In all cases the alignments seem to be only slightly better than the ones from PyMOL, and not many more structure elements agree with the original structure of hexosaminidase chain A.
Model 5 stands out, because one part of it does not match the hexosaminidase chain A structure at all. It also aligns really badly overall, which corresponds to its RMSD and TM-score. A closer look at models 3 and 4, which have a bad RMSD as well, shows that they also have only a few matches.
Measure | iTasser Model 1 | iTasser Model 2 | iTasser Model 3 | iTasser Model 4 | iTasser Model 5 |
RMSD (Pymol) | 1.883 | 1.175 | 8.615 | 4.361 | 16.829 |
RMSD (TM-align) | 2.44 | 1.64 | 4.89 | 4.49 | 5.66 |
TM Score | 0.85287 | 0.89462 | 0.59559 | 0.69673 | 0.40966 |
Structural alignment (Pymol) | |||||
Structural alignment (TM-align) |
3LUT
The following table displays the RMSD and the TM-score from PyMOL and TM-align. Furthermore, it contains the structure alignments of both methods, with the predicted model in red and the original hexosaminidase chain A in green.
Looking at the RMSD from PyMOL, the best models are the third, the second and the first one, all with a really low RMSD. The other two models have a very high RMSD. This agrees with the structure alignments of PyMOL, where the first three models align very well and many elements match those in the structure of hexosaminidase chain A. The other two structure alignments are really bad, which agrees with the RMSDs.
The RMSD of TM-align ranks the first three models in the same order. The best models are the first three, and the ones with the highest RMSD are the last two. It is striking that, in contrast to the PyMOL RMSD, the TM-align RMSD of the last two models is not extremely high, so the difference between the highest and the lowest RMSD is much smaller.
The TM-score agrees with both RMSDs: the best scores are obtained for the first three models. The other two models have a worse TM-score, and model 5 has a much worse TM-score than model 4.
The structure alignments of TM-align agree with the TM-score and the RMSD. The first three models deliver really good structure alignments, whereas models 4 and 5 align badly. Model 5 stands out, because one part of it does not match the hexosaminidase chain A structure at all. A closer look at the last two structures shows that these alignments are really bad and that there are almost no matches with hexosaminidase chain A.
Measure | iTasser Model 1 | iTasser Model 2 | iTasser Model 3 | iTasser Model 4 | iTasser Model 5 |
RMSD (Pymol) | 1.956 | 1.128 | 1.006 | 10.479 | 19.258 |
RMSD (TM-align) | 2.35 | 1.78 | 1.57 | 6.03 | 5.52 |
TM Score | 0.85521 | 0.89678 | 0.89900 | 0.51449 | 0.38080 |
Structural alignment (Pymol) | |||||
Structural alignment (TM-align) |
Discussion
3D-Jigsaw
Calculation of the models
For the 3D-Jigsaw calculation we used the [Webserver].
For this calculation we had to create a PDB file containing the models that should be considered. For each template we took the five best models. To select them we looked mainly at the TM-score, because this score has fewer disadvantages. For 3LUT we made an exception and did not take the Swissmodel result, although it had a good TM-score, because the Swissmodel calculation used 3HN3 and not 3LUT itself.
This means that for 3BC9 we took the iTasser models 1, 2, 4 and 5 and the Modeller model. For 3CUI we took the iTasser models 1-4 and the Swissmodel model. For 3LUT, in contrast, we took all five iTasser models.
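The selection by TM-score can be written down as a short Python sketch. The TM-scores are copied from the Modeller and iTasser tables for 3BC9 (Swissmodel returned no result for this template); the helper `best_models` is our own.

```python
# TM-scores of all available 3BC9-based models, copied from the tables above.
tm_scores_3bc9 = {
    "iTasser model 1": 0.90210,
    "iTasser model 2": 0.41932,
    "iTasser model 3": 0.39410,
    "iTasser model 4": 0.62190,
    "iTasser model 5": 0.41661,
    "Modeller": 0.43072,
}


def best_models(tm_scores, n=5):
    """Return the names of the n models with the highest TM-score."""
    ranked = sorted(tm_scores, key=tm_scores.get, reverse=True)
    return ranked[:n]


selected = best_models(tm_scores_3bc9)
print(selected)
```

For 3BC9 this keeps the iTasser models 1, 2, 4 and 5 and the Modeller model, and drops iTasser model 3, which has the lowest TM-score.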
Results
3BC9
3CUI
3LUT
RMSD and TM-Score
As in the previous sections, we checked the quality of the 3D-Jigsaw models with the RMSD (calculated by PyMOL and by TM-align) and the TM-score, always comparing the predicted structure with the original hexosaminidase chain A. Both scores and their limitations are explained in the RMSD and TM-Score section for Swissmodel.
3BC9
Measure | 3D-Jigsaw Model 1 | 3D-Jigsaw Model 2 | 3D-Jigsaw Model 3 | 3D-Jigsaw Model 4 | 3D-Jigsaw Model 5 |
RMSD (Pymol) | 2.773 | 1.790 | 2.847 | 2.847 | 3.252 |
RMSD (TM-align) | 1.60 | 2.35 | 2.93 | 2.93 | 1.98 |
TM Score | 0.64803 | 0.64999 | 0.63853 | 0.63851 | 0.64907 |
Structural alignment (Pymol) | |||||
Structural alignment (TM-align) |
3CUI
Measure | 3D-Jigsaw Model 1 | 3D-Jigsaw Model 2 | 3D-Jigsaw Model 3 | 3D-Jigsaw Model 4 | 3D-Jigsaw Model 5 |
RMSD (Pymol) | 3.575 | 3.628 | 1.216 | 1.209 | 1.218 |
RMSD (TM-align) | 3.48 | 3.48 | 2.73 | 2.74 | 2.67 |
TM Score | 0.79749 | 0.79749 | 0.84098 | 0.84064 | 0.84109 |
Structural alignment (Pymol) | |||||
Structural alignment (TM-align) |
3LUT
Measure | 3D-Jigsaw Model 1 | 3D-Jigsaw Model 2 | 3D-Jigsaw Model 3 | 3D-Jigsaw Model 4 | 3D-Jigsaw Model 5 |
RMSD (Pymol) | 1.813 | 1.813 | 1.173 | 1.224 | 1.170 |
RMSD (TM-align) | 2.45 | 2.45 | 2.95 | 2.90 | 2.86 |
TM Score | 0.83585 | 0.83585 | 0.83302 | 0.83034 | 0.83574 |
Structural alignment (Pymol) | |||||
Structural alignment (TM-align) |
Summary and Discussion
The first thing the tables above show is that, for models that fit the real structure poorly, the RMSD calculated by PyMOL is much higher than the RMSD calculated by TM-align, whereas for well-fitting models the two values are similar. This suggests that superposing the structures directly, as TM-align does, is more effective than combining a sequence alignment with a structure alignment. This can be seen in the RMSD values, but also in the pictures of the superposed structures.
Furthermore, Modeller and Swissmodel both predict the structure badly. Both methods always have a very high RMSD and a very low TM-score. To learn more about the prediction results, we analysed the scores for each template.
- 3BC9:
3BC9 is the template with the highest sequence identity. Therefore, the predicted structures should be very similar to our real structure. Unfortunately, Swissmodel could not return a result, because it was not able to align the target and template sequences. This is very surprising, because an alignment between two highly identical sequences should be easy. Even in the alignment mode, Swissmodel was not able to return a prediction. The prediction of Modeller is really bad, and most of the iTasser models are wrong as well. Only model 1 of iTasser is very similar to the real structure, which can also be seen in the RMSD (near 0) and the TM-score (near 1).
The best result with 3BC9 as template was the iTasser model 1 prediction.
- 3CUI:
3CUI has a sequence identity of 49.5%, which is not that high, but it should still be possible to predict a structure that is close to the real structure. As before, Swissmodel and Modeller predict structures that do not fit our real structure very well. iTasser, however, predicted two models that are very similar to our structure. Model 1 and model 4 have very low RMSD values and high TM-scores, and the pictures make it clear that target and template structure are really similar.
So again, in this case we got the best result from iTasser.
- 3LUT / 3HN3:
Swissmodel was not able to predict the structure of our target with 3LUT as template. Therefore, we used 3HN3, which at 25% has slightly more sequence identity than 3LUT (20%).
We expected this prediction to be the worst one because of the low sequence identity. Interestingly, the prediction results of Modeller and Swissmodel are not much worse than their results with 3CUI as template. Furthermore, iTasser predicted two models that fit our real structure very well and also have very low RMSD values and high TM-scores.
We want to highlight that this result is not the norm. We aligned the structures of 2GJX:A and 3LUT:A; the TM-score between these two structures is 0.50014 and the RMSD is 5.04, which is a very good result given that the sequence identity is so low. So in this case we were lucky to get such a good result. In general, predictions based on two sequences that are this distinct are much worse.
In agreement with the two results from above, iTasser again gave the best results.
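The TM-score of 0.50014 quoted above for 2GJX:A and 3LUT:A can be put into context with the TM-score formula. The following Python code is a minimal sketch and not the TM-align implementation: it assumes that the distances between aligned residue pairs after superposition are already known and that the score is normalised by the length of the target chain. The function names `d0` and `tm_score` and the chain length of 100 are hypothetical illustration choices.

```python
def d0(target_length):
    """Length-dependent distance scale of the TM-score (in Angstrom)."""
    if target_length <= 18:
        # Below this length the formula gives a non-positive scale.
        raise ValueError("formula is only meaningful for longer chains")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score from the distances of aligned residue pairs.

    `distances` holds one value per aligned pair; residues of the target
    that are not aligned contribute nothing to the sum.
    """
    scale = d0(target_length)
    total = sum(1.0 / (1.0 + (d / scale) ** 2) for d in distances)
    return total / target_length


# Hypothetical target of 100 residues.
perfect = tm_score([0.0] * 100, 100)        # every pair coincides
at_scale = tm_score([d0(100)] * 100, 100)   # every pair is exactly d0 apart
half_aligned = tm_score([0.0] * 50, 100)    # only half the target is aligned
print(perfect, at_scale, half_aligned)
```

Because every residue pair contributes at most 1 and the sum is divided by the target length, the score lies between 0 and 1 and does not grow with chain length the way RMSD does. Values above roughly 0.5 are commonly read as indicating the same fold, which is why 0.50014 is a good result at such a low sequence identity.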
In sum, iTasser is the best of the three prediction methods we used. However, iTasser needs a lot of time to predict a structure and allows each user to submit only one sequence at a time. Therefore, if there is enough time, iTasser is the best choice. If there is not that much time, Modeller and Swissmodel can be used. Both methods give approximately the same prediction results. Modeller can only be run on the command line, which means that Modeller has to be installed on the system. If the user first has to install Modeller, this will take a while, because Modeller sends the licence key by e-mail, which can take up to one day. Swissmodel is available on the Internet and can be used without any delay. So if the user only wants an approximate estimate of the protein structure and does not have much time, Swissmodel is the right choice.
Furthermore, we used Jigsaw to get a final model, which is built from five different models. In the case of 3BC9, the Jigsaw result is very good, with an RMSD of less than 1 and a TM-score close to 1. We used Jigsaw with the first four iTasser predictions and the prediction of Modeller. We could not use the Swissmodel prediction, because Swissmodel did not work. So we have one very good model, three moderate models and only one bad model. This could explain why the result of Jigsaw is that good. In the case of 3CUI the result is very bad, and it is worse than the results of iTasser. In this case we used the models of Swissmodel, Modeller and the first three iTasser models. The prediction results of Swissmodel and Modeller are very bad and the iTasser results are also not that good. Therefore, it seems that Jigsaw cannot compensate for four very bad models, even if one very good model is available. In our last case with 3LUT the result is good, with a low Pymol RMSD (1.17 to 1.81) and a high TM-score (about 0.83). Here again the models of Swissmodel and Modeller are very bad, but the three iTasser results are quite good, and therefore it seems that Jigsaw can compensate for two very bad models if three models are very good and similar. In sum, Jigsaw can improve the prediction results if the models that form the basis for the Jigsaw prediction are quite good. Otherwise, the prediction of Jigsaw is worse than the predictions of the single methods.