Sequence and structure based mutation analysis of GBA

From Bioinformatikpedia
Revision as of 19:39, 29 August 2011 by Braunt (talk | contribs)

Introduction

This section combines the results of the sequence-based and the structure-based mutation analyses, using the results of task 6 and task 7. Figure 1 highlights the mutations in the protein structure 2NT0.


Figure 1: 2NT0 with highlighted mutation positions (red) and active site residues (blue).

Sequence-based mutation analysis

The following table summarizes the results of the sequence-based mutation analysis.

| Mutation | Amino-Acid Properties | Substitution Matrices: BLOSUM62 | Substitution Matrices: PAM1 | Substitution Matrices: PAM250 | PSSM | Conservation | Secondary Structure | SNAP | SIFT | PolyPhen-2: HumDiv | PolyPhen-2: HumVar |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | non-neutral | neutral | neutral | neutral | non-neutral | non-neutral | non-neutral | neutral | neutral | non-neutral | non-neutral |
| 2 | non-neutral | neutral | neutral | neutral | non-neutral | non-neutral | neutral | non-neutral | neutral | non-neutral | non-neutral |
| 3 | neutral | neutral | neutral | neutral | non-neutral | neutral | neutral | neutral | neutral | neutral | neutral |
| 4 | non-neutral | neutral | neutral | neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral |
| 5 | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | neutral | non-neutral | non-neutral | non-neutral | non-neutral |
| 6 | neutral | neutral | neutral | neutral | neutral | non-neutral | neutral | neutral | neutral | neutral | neutral |
| 7 | neutral | neutral | neutral | neutral | neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | neutral |
| 8 | non-neutral | neutral | neutral | neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral |
| 9 | non-neutral | neutral | neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | neutral | non-neutral | non-neutral |
| 10 | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral | non-neutral |
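
To make the spread of evidence per mutation easier to see, the table can be reduced to a count of non-neutral calls per row. The following Python sketch is only an illustration: the names `COLUMNS`, `ROWS` and `non_neutral_votes` are hypothetical, the rows are transcribed by hand from the table above (1 = non-neutral, 0 = neutral), and a plain count weighs every column equally, which is not how the final calls in this analysis were made.

```python
# Columns of the sequence-based table, in table order (hypothetical short names).
COLUMNS = [
    "aa_properties", "BLOSUM62", "PAM1", "PAM250", "PSSM", "conservation",
    "secondary_structure", "SNAP", "SIFT", "PolyPhen2_HumDiv", "PolyPhen2_HumVar",
]

# One string per mutation, transcribed from the table: 1 = non-neutral, 0 = neutral.
ROWS = {
    1: "10001110011",
    2: "10001101011",
    3: "00001000000",
    4: "10001111111",
    5: "11111101111",
    6: "00000100000",
    7: "00000111110",
    8: "10001111111",
    9: "10011111011",
    10: "11111111111",
}


def non_neutral_votes(row: str) -> int:
    """Count how many of the criteria call the mutation non-neutral."""
    if len(row) != len(COLUMNS):
        raise ValueError("row length does not match the number of columns")
    return row.count("1")


if __name__ == "__main__":
    for mutation, row in ROWS.items():
        flagged = [name for name, bit in zip(COLUMNS, row) if bit == "1"]
        print(f"Mutation {mutation}: {non_neutral_votes(row)}/{len(COLUMNS)} "
              f"non-neutral ({', '.join(flagged)})")
```

For mutation 1 this gives 6 of 11 non-neutral calls, which illustrates how evenly split the evidence for that mutation was.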

Structure-based mutation analysis

The following table summarizes the results of the structure-based mutation analysis.

| Mutation | SCWRL: Energy | SCWRL: Polar Interactions | SCWRL: Clashes/Holes | Minimise: Energy | Minimise: Polar Interactions | Minimise: Clashes/Holes | Gromacs: Energy | Gromacs: Polar Interactions | Gromacs: Clashes/Holes | FoldX: Energy |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | neutral | neutral | neutral | neutral | neutral | neutral | neutral | non-neutral | neutral | neutral |
| 2 | neutral | neutral | neutral | neutral | neutral | neutral | neutral | neutral | neutral | neutral |
| 3 | non-neutral | non-neutral | neutral | non-neutral | non-neutral | neutral | non-neutral | neutral | neutral | neutral |
| 4 | non-neutral | neutral | neutral | non-neutral | neutral | neutral | non-neutral | non-neutral | non-neutral | neutral |
| 5 | neutral | neutral | non-neutral | neutral | neutral | non-neutral | neutral | neutral | non-neutral | non-neutral |
| 6 | non-neutral | non-neutral | neutral | non-neutral | non-neutral | neutral | non-neutral | non-neutral | non-neutral | neutral |
| 7 | neutral | neutral | neutral | neutral | neutral | neutral | neutral | neutral | neutral | neutral |
| 8 | non-neutral | non-neutral | neutral | non-neutral | neutral | non-neutral | non-neutral | neutral | non-neutral | neutral |
| 9 | neutral | non-neutral | neutral | neutral | neutral | non-neutral | neutral | non-neutral | non-neutral | neutral |
| 10 | neutral | non-neutral | neutral | neutral | non-neutral | neutral | neutral | neutral | non-neutral | neutral |

Predictions of sequence-based and structure-based mutation analysis

The following table shows the predictions based on the sequence-based and on the structure-based mutation analysis. It also shows whether both methods agree and whether the mutation is listed in HGMD and is therefore actually damaging.

| Mutation | Sequence-based mutation analysis | Structure-based mutation analysis | In HGMD? | Prediction |
|---|---|---|---|---|
| 1 | neutral | neutral | yes | wrong |
| 2 | non-neutral | neutral | yes | partly correct |
| 3 | neutral | neutral | no | correct |
| 4 | non-neutral | non-neutral | yes | correct |
| 5 | non-neutral | non-neutral | yes | correct |
| 6 | neutral | non-neutral | yes | partly correct |
| 7 | non-neutral | neutral | yes | partly correct |
| 8 | non-neutral | non-neutral | yes | correct |
| 9 | non-neutral | neutral | yes | partly correct |
| 10 | non-neutral | neutral | no | partly correct |
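
The last column follows mechanically from the other three if an HGMD entry is taken as the ground truth for "damaging". A minimal Python sketch of that rule, with hypothetical names (`PREDICTIONS`, `expected_call`, `verdict`) and the rows copied from the table above:

```python
# (mutation, sequence-based call, structure-based call, listed in HGMD)
PREDICTIONS = [
    (1, "neutral", "neutral", True),
    (2, "non-neutral", "neutral", True),
    (3, "neutral", "neutral", False),
    (4, "non-neutral", "non-neutral", True),
    (5, "non-neutral", "non-neutral", True),
    (6, "neutral", "non-neutral", True),
    (7, "non-neutral", "neutral", True),
    (8, "non-neutral", "non-neutral", True),
    (9, "non-neutral", "neutral", True),
    (10, "non-neutral", "neutral", False),
]


def expected_call(in_hgmd: bool) -> str:
    """Assumption: a mutation listed in HGMD is damaging (non-neutral)."""
    return "non-neutral" if in_hgmd else "neutral"


def verdict(seq_call: str, struct_call: str, in_hgmd: bool) -> str:
    """Label a row by how many of the two analyses match the HGMD status."""
    truth = expected_call(in_hgmd)
    hits = int(seq_call == truth) + int(struct_call == truth)
    return {2: "correct", 1: "partly correct", 0: "wrong"}[hits]


if __name__ == "__main__":
    for mutation, seq_call, struct_call, in_hgmd in PREDICTIONS:
        print(mutation, verdict(seq_call, struct_call, in_hgmd))
```

Under this rule "partly correct" simply means that exactly one of the two analyses matched the HGMD status.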

Discussion

Mutation 1

The first mutation is the only one we predicted completely wrong. Both the sequence-based and the structure-based analysis predicted the mutation as neutral, but since it is listed in HGMD it is damaging. In the sequence-based analysis, the amino-acid properties, the PSSM, the conservation, the secondary structure and the PolyPhen-2 prediction indicated that the mutation would be damaging, so it was not easy to decide whether to classify it as neutral or damaging. However, because the affected amino acid is located at the exterior of the protein and many other results indicated that the mutation is harmless, we predicted it as neutral. In the structure-based mutation analysis almost every result led to the conclusion that the mutation is harmless; only the structure obtained with Gromacs showed a different surface. It is interesting that the structure-based analysis revealed no significant changes, although we had expected some after inspecting the results of the sequence-based analysis. All in all, there were too few signs of a damaging mutation, and both methods failed. It would be interesting to apply further methods to see whether they reveal the reason for the damaging effect.

With both analyses: wrong prediction

Mutation 2

We predicted the second mutation partly correctly. It is listed in HGMD and is therefore damaging. The sequence-based mutation analysis led to the same conclusion, whereas the structure-based analysis indicated that the mutation is harmless. The individual steps of the sequence-based analysis produced contradictory results, and it was unclear which of them are the most important. We predicted the mutation as damaging mainly because of the change from an acidic to a neutral amino acid. In the structure-based analysis all results indicated a neutral substitution, so we classified the mutation as harmless, which was wrong. Taking both analyses together, we would classify the mutation as harmless, because the sequence-based analysis was not clear and the structure-based analysis pointed to a neutral mutation; this is not correct. The effect must act directly at the amino acid without causing structural changes. Maybe the binding differs somehow, or the loss of the acidic character is damaging.

With both analyses: wrong prediction

Mutation 3

We predicted the third mutation correctly with both the sequence-based and the structure-based mutation analysis. Although the decision was hard in the structure-based analysis, the mutation was correctly predicted as neutral.

With both analyses: correct prediction

Mutation 4

The effect of the fourth mutation was also predicted correctly. In both tasks we classified it as damaging, which is correct. The decision was easy in both the sequence-based and the structure-based mutation analysis.

With both analyses: correct prediction

Mutation 5

We also predicted the fifth mutation correctly as damaging. In the sequence-based analysis the damaging prediction was absolutely clear, and in the structure-based analysis we were also confident in the classification because of the large difference in the energy comparison.

With both analyses: correct prediction

Mutation 6

For the sixth mutation we obtained two different results. The sequence-based mutation analysis predicted it as neutral, which was wrong, whereas the structure-based mutation analysis predicted it as damaging, which is correct, as it is listed in HGMD. In the sequence-based analysis only the conservation indicated a damaging mutation, whereas all other methods led to the conclusion that the mutation is neutral. The structure-based analysis showed that the polar interactions had changed after the mutation, that the surface had changed and that the energy was too high, so we classified it as damaging. This is a good example of why all possible effects of a mutation have to be considered: although the sequence-based analysis gives hardly any hint that the mutation could be damaging, the structure-based analysis makes it very clear. Combining both methods we would classify the mutation as damaging, which is correct.

With both analyses: correct prediction

Mutation 7

We predicted the seventh mutation as non-neutral based on the sequence-based mutation analysis, which is correct, and as neutral based on the structure-based mutation analysis. The comparison of the substitution matrices etc. indicated a neutral mutation, but all prediction tools applied in the sequence-based mutation analysis classified the mutation as damaging, which is why we decided to predict it as non-neutral. Even so, it was not clear whether to classify the mutation as damaging or neutral. The results of the structure-based mutation analysis suggested that the mutation is harmless. As this is the most common mutation in Gaucher disease, it is remarkable that we had so many problems classifying it correctly. Even when combining both methods we tend to classify it as neutral. To gain better insight into the mutation's effect, we performed a molecular dynamics analysis, which may help to understand the damaging effect.

With both analyses: wrong prediction

Mutation 8

We predicted mutation eight as damaging with both the sequence-based and the structure-based mutation analysis, which is correct. It is listed in HGMD and associated with Gaucher disease 2. In both cases it was easy to classify the mutation as damaging.

With both analyses: correct prediction

Mutation 9

The ninth mutation is damaging, which we also predicted based on the sequence-based mutation analysis. In the structure-based mutation analysis, however, we classified it as neutral. In the sequence-based analysis many results indicated a damaging mutation: the amino-acid properties, the PSSM and the prediction tools gave us enough reason to classify it as such. In the structure-based analysis the decision was hard, since there were some reasons to classify it as damaging and some to classify it as harmless. It seems that we did not judge the importance of the different results correctly. Taking all results together, there are so many signs of a damaging mutation that this should be enough to classify it as non-neutral.

With both analyses: correct prediction

Mutation 10

For mutation ten we also obtained different results: based on the sequence-based analysis we predicted it as damaging, whereas the structure-based analysis suggested a harmless mutation, which is correct. In the sequence-based analysis all results led us to the seemingly clear prediction that the mutation is damaging, but that is wrong. In the structure-based analysis we found only a slightly higher energy and no other signs of a damaging mutation, so we classified it as neutral. When combining both methods we would still classify it as damaging, because the results of the sequence-based mutation analysis are so explicit, and that is again wrong. For this mutation we also performed a molecular dynamics analysis, because the results of the sequence-based and the structure-based analysis differ so strongly.

With both analyses: wrong prediction

Summary

| Analysis | Correctly predicted | Wrongly predicted |
|---|---|---|
| Sequence-based mutation analysis | 7 | 3 |
| Structure-based mutation analysis | 6 | 4 |
| Sequence- and structure-based mutation analysis | 6 | 4 |

In our case the prediction results are best if only the sequence-based mutation analysis is used. If both methods are taken into account, the number of correct predictions is no better than that of the structure-based analysis alone. This shows that the sequence-based and structure-based mutation analyses presented in this section are not sufficient to clearly identify the impact of a mutation. Further methods are needed to get an impression of what a mutation could cause.
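
The counts in the summary can be recomputed from the individual calls. The sketch below is an illustration in Python with hypothetical names (`CALLS`, `METHODS`, `tally`); the per-method calls come from the prediction table, the combined calls from the discussion of each mutation, and an HGMD entry is assumed to mean that the mutation is damaging.

```python
# mutation: (sequence-based, structure-based, combined, listed in HGMD)
# "D" = damaging / non-neutral, "N" = neutral
CALLS = {
    1: ("N", "N", "N", True),
    2: ("D", "N", "N", True),
    3: ("N", "N", "N", False),
    4: ("D", "D", "D", True),
    5: ("D", "D", "D", True),
    6: ("N", "D", "D", True),
    7: ("D", "N", "N", True),
    8: ("D", "D", "D", True),
    9: ("D", "N", "D", True),
    10: ("D", "N", "D", False),
}
METHODS = ("sequence-based", "structure-based", "combined")


def tally(calls):
    """Return {method: (correct, wrong)} with the HGMD listing as ground truth."""
    result = {}
    for idx, method in enumerate(METHODS):
        correct = 0
        for row in calls.values():
            truth = "D" if row[3] else "N"  # assumption: in HGMD means damaging
            if row[idx] == truth:
                correct += 1
        result[method] = (correct, len(calls) - correct)
    return result


if __name__ == "__main__":
    for method, (correct, wrong) in tally(CALLS).items():
        print(f"{method}: {correct} correct, {wrong} wrong")
```

With only ten mutations, a difference of one correct prediction between methods is small, so the ranking of the methods should be read with caution.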

For two of the mutations, molecular dynamics simulations were applied to learn more about their effect, but that may not be enough. A mutation can also have effects we do not really understand, and it is hard to classify a mutation as damaging or harmless without knowing exactly what it does. This may also be one reason why our results are sometimes inconsistent.

