Rs1054374

From Bioinformatikpedia
Revision as of 21:40, 31 August 2011 by Uskat (talk | contribs) (SNAP Prediction)

General Information

SNP-id | rs1054374
Codon | 293
Mutation Codon | Ser -> Ile
Mutation Triplet | AGT -> ATT
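
As a quick consistency check of the table above, the triplet change can be translated with the standard genetic code. This is a minimal sketch and not part of the original analysis; the helper name `describe_point_mutation` is hypothetical.

```python
# Standard genetic code, built from the conventional TCAG base ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def describe_point_mutation(ref_codon, alt_codon):
    """Return (reference residue, mutated residue, changed codon positions, 1-based)."""
    changed = [
        pos + 1
        for pos, (ref, alt) in enumerate(zip(ref_codon, alt_codon))
        if ref != alt
    ]
    return CODON_TABLE[ref_codon], CODON_TABLE[alt_codon], changed


ref_aa, alt_aa, changed = describe_point_mutation("AGT", "ATT")
print(ref_aa, alt_aa, changed)  # S I [2]
```

The single G -> T exchange at the second codon position is what turns Ser into Ile.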

Back to [Sequence-based mutation analysis]

Sequence-based Mutation Analysis

Physicochemical Properties

First, we compared the physicochemical properties of the original and the mutated amino acid. From these properties we inferred the possible effect of the mutation on the protein.

Ser | Ile | Consequences
polar, tiny, hydrophilic, neutral | aliphatic, hydrophobic, neutral | Ile is much bigger than Ser and also branched, because it is an aliphatic amino acid. The structures of the two amino acids are therefore very different, and Ile is too big for the position that Ser occupied. There must therefore be a large change in the 3D structure of the protein, and the protein will probably lose its function.
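
The comparison in the table can be made explicit with simple set operations. The sketch below uses the property lists from the table; the Kyte-Doolittle hydropathy values (Ser -0.8, Ile 4.5) are added here for illustration and are not part of the original analysis, and the function name `compare` is hypothetical.

```python
# Property sets as listed in the table above.
PROPERTIES = {
    "Ser": {"polar", "tiny", "hydrophilic", "neutral"},
    "Ile": {"aliphatic", "hydrophobic", "neutral"},
}
# Kyte-Doolittle hydropathy index (positive = hydrophobic); added for illustration.
HYDROPATHY = {"Ser": -0.8, "Ile": 4.5}


def compare(original, mutated):
    """Summarise which properties are shared, lost and gained by the substitution."""
    a, b = PROPERTIES[original], PROPERTIES[mutated]
    return {
        "shared": sorted(a & b),
        "lost": sorted(a - b),
        "gained": sorted(b - a),
        "hydropathy_shift": round(HYDROPATHY[mutated] - HYDROPATHY[original], 1),
    }


result = compare("Ser", "Ile")
print(result)
```

Only "neutral" is shared, which matches the conclusion that the two residues behave very differently.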




Visualization of the Mutation

In the next step, we visualized the mutation with PyMOL. We created one picture of the original amino acid (Figure 1), one of the mutated amino acid (Figure 2), and one showing both together, with the mutation colored white (Figure 3). The pictures show that the original Serine looks different from Isoleucine: Serine is very small, whereas Isoleucine is bigger and has a branched side chain. The first part of the side chain is identical in both amino acids. In this case the difference is not severe, although it could still cause some structural changes that affect the protein function. All in all, the mutation will probably cause no structural or functional changes.

Figure 1: Amino acid Serine
Figure 2: Amino acid Isoleucine
Figure 3: Visualization of the mutation



Substitution Matrices Values

Afterwards, we looked at the values of the substitution matrices PAM1, PAM250 and BLOSUM62. We examined three values in detail: the value for this amino acid substitution, the value of the most frequent substitution of the examined amino acid, and the value of the rarest substitution.

In this case, the substitution of Serine by Isoleucine has a very low value in PAM1, close to the value of the rarest substitution. In contrast, the PAM250 value for the substitution of Serine by Isoleucine is average: it is about as far from the value of the most frequent substitution as from the value of the rarest one. The difference between the two PAM matrices can be ascribed to the way they are constructed. PAM1 corresponds to a substitution rate of 1%, that is, the probability that an amino acid changes is 1% and the sequences remain 99% similar. PAM250, by contrast, means that 250 mutations have been fixed per 100 residues, which leaves a similarity of only about 20%. A possible reason for the better PAM250 value is that at such low similarity the amino acids are probably dissimilar anyway. Like PAM1, BLOSUM62 gives a very low value for this substitution, close to the value of the rarest substitution. According to PAM1 and BLOSUM62, a mutation at this position will therefore almost certainly cause structural changes, which can in turn affect the function. The PAM250 value is not conclusive, so it does not allow us to determine effects on the protein.

Matrix | Value for Ser -> Ile | Most frequent substitution | Rarest substitution
PAM1 | 2 | 38 (Thr) | 1 (Leu)
PAM250 | 5 | 9 (Ala, Gly, Pro, Thr) | 3 (Phe)
BLOSUM62 | -2 | 1 (Ala, Asn, Thr) | -3 (Trp)
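
One way to make "close to the rarest substitution" comparable across the three matrices is to express each value as a position between the rarest and the most frequent substitution. This is only an illustrative sketch using the values from the table; the measure `relative_position` is an assumption introduced here, not a method used in the original analysis.

```python
# (value for Ser -> Ile, most frequent substitution, rarest substitution), from the table above.
SCORES = {
    "PAM1": (2, 38, 1),
    "PAM250": (5, 9, 3),
    "BLOSUM62": (-2, 1, -3),
}


def relative_position(value, most_frequent, rarest):
    """0.0 = as unfavourable as the rarest substitution, 1.0 = as favourable as the most frequent."""
    return (value - rarest) / (most_frequent - rarest)


for name, (value, best, worst) in SCORES.items():
    print(f"{name}: {relative_position(value, best, worst):.2f}")
# PAM1: 0.03
# PAM250: 0.33
# BLOSUM62: 0.25
```

On this scale PAM1 places the substitution almost at the rare end, while PAM250 and BLOSUM62 place it in the lower third of the range.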




PSSM Analysis

In addition, we looked at the position-specific scoring matrix (PSSM) for our sequence. In contrast to PAM and BLOSUM, the PSSM contains a separate substitution score for each position in the sequence and is therefore more position-specific than PAM or BLOSUM. We extracted the value for the mutation under study, the value of the most frequent substitution and the value of the rarest substitution.

In this case the value for the substitution of Serine by Isoleucine at this position is very low and close to the value of the rarest substitution. This substitution is therefore very uncommon at this position, which indicates that it is likely to have harmful effects. We concluded that this mutation will probably cause changes in the protein structure as well as in its function.


PSSM
Value for Ser -> Ile | Most frequent substitution | Rarest substitution
-3 | 4 | -5




Conservation Analysis with Multiple Alignments

As a next step, we created a multiple alignment containing the HEXA sequence and 9 homologous mammalian sequences from [UniProt]. We then examined the conservation level at the positions of the different mutations. The mutation considered here is marked by the colored column in Figure 4. Many of the other mammals have a different amino acid at this position; only four of the other mammalian sequences agree and have a Serine there. This position is therefore not highly conserved, and a mutation there will probably cause no structural or functional changes in the protein.

Figure 4: Mutation in the multiple alignment
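
The conservation level described above can be expressed as the fraction of sequences that carry the reference residue. The sketch below is illustrative only: the alignment column is reconstructed from the counts given in the text (HEXA plus four homologs with Serine, five without), and the placeholder "x" stands for the differing residues, whose identities are not listed.

```python
from collections import Counter


def conservation(column, reference):
    """Fraction of sequences in an alignment column that carry the reference residue."""
    counts = Counter(column)
    return counts[reference] / len(column)


# Hypothetical column for position 293: HEXA and four homologs carry Ser ("S");
# "x" marks the five homologs with a different, unspecified residue.
column = "S" + "SSSS" + "xxxxx"
print(conservation(column, "S"))  # 0.5
```

A value of 0.5 means that only half of the aligned sequences keep Serine, which is consistent with a weakly conserved position.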




Secondary Structure Mutation Analysis

As a next step, we compared the results of the secondary structure prediction tools JPred and PsiPred to determine in which secondary structure element, and where within it, the mutation occurs. This gives an impression of how drastic the mutation can be. In this case both tools agree and predict a coil at the position of the mutation, so the mutation would not destroy or split a secondary structure element. It will probably only change the coil between two secondary structure elements, although this can sometimes also alter the following secondary structure element. Our protein does not possess any disordered regions, so this mutation will not destroy a functionally important coil region. We therefore think that a drastic change of the protein structure and its function is unlikely.

JPred:
...HHHHHHHHCCCEEEECCCCCHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCC...
PsiPred:
...HHHHHHHHCCCEEEECCCCCHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCH...
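
The two predictions can also be compared programmatically. The following sketch run-length encodes a prediction string and computes the fraction of positions at which both tools agree. It works on the excerpts shown above, so indices are relative to the excerpt and not to the full sequence; the function names are hypothetical.

```python
from itertools import groupby


def runs(states):
    """Run-length encode a secondary structure string, e.g. 'HHCCE' -> [('H', 2), ('C', 2), ('E', 1)]."""
    return [(state, len(list(group))) for state, group in groupby(states)]


def agreement(pred_a, pred_b):
    """Fraction of aligned positions at which two predictions assign the same state."""
    pairs = list(zip(pred_a, pred_b))
    return sum(a == b for a, b in pairs) / len(pairs)


# Excerpts as shown above, without the surrounding "...".
jpred = "HHHHHHHHCCCEEEECCCCCHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCC"
psipred = "HHHHHHHHCCCEEEECCCCCHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCH"

print(runs(jpred))
print(f"agreement: {agreement(jpred, psipred):.2f}")
```

Because the excerpt does not state its offset in the sequence, position 293 cannot be located from these strings alone; the sketch only shows how the elements and the agreement between the tools could be summarised.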

Comparison with the real Structure:

Afterwards, we also visualized the position of the mutation (red) in the real 3D structure from the [PDB] and compared it with the predicted secondary structure. Like the prediction above, the visualization shows whether the mutation lies within a secondary structure element or in another region.

In this case the position of the mutation in the real structure agrees almost exactly with the predicted secondary structure and lies within a coil, as can be seen in Figure 5 and Figure 6. As explained above, this means that the mutation will probably not destroy a secondary structure element and will therefore not cause a drastic structural change. It could, however, shift the two nearest secondary structure elements relative to each other, which could result in a loss of function. We think that a structural change is unlikely, because the mutation is not within a secondary structure element and will therefore not cause extreme changes to the protein.

Figure 5: Mutation at position 293
Figure 6: Mutation at position 293 - detailed view




SNAP Prediction

Next, we looked at the result of the SNAP prediction. For this prediction we took the amino acid at the position in question and checked every possible substitution. We then extracted the result for Isoleucine, which is the actual mutation in this case. SNAP predicts that the exchange of Serine for Isoleucine at this position is neutral, with a reliability index of 2 and an expected accuracy of 69%. This means that this mutation very likely causes no structural or functional changes in the protein.

Substitution | Prediction | Reliability Index | Expected Accuracy
I | Neutral | 2 | 69%

A detailed list of all possible substitutions can be found [here]




SIFT Prediction

Next, we used the SIFT prediction, which indicates whether a mutation is neutral or not. It first shows a row containing, for the mutated position, a score for each possible amino acid. Amino acids that are not tolerated at this position are colored red. In addition, it produces a table listing the amino acids that are predicted to be tolerated and not tolerated.

In this case, seven residues are tolerated: Methionine, Isoleucine, Alanine, Serine, Valine, Threonine and Leucine, as can be seen in Figure 8. The substitution to Isoleucine is therefore tolerated at this position. This means that the mutation is probably neutral and will not cause any structural or functional changes in the protein.

SIFT Matrix:
Each entry contains the score at a particular position (row) for an amino acid substitution (column). Substitutions predicted to be intolerant are highlighted in red.

Figure 7: Legend
Figure 8: SIFT Table
Threshold for intolerance is 0.05.
Amino acid color code: non-polar, uncharged polar, basic, acidic.
Capital letters indicate amino acids appearing in the alignment, lower case letters result from prediction.




Predict Not Tolerated | Position | Seq Rep | Predict Tolerated
wghydrnfqekcp | 293S | 1.00 | MlASVTI
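
The table row can be queried directly to check how SIFT classifies a given substitution at position 293. This is a minimal sketch based only on the two residue lists above; the function name `sift_call` is hypothetical, and the per-residue scores behind the 0.05 threshold are not reproduced here.

```python
# Residue lists for position 293, copied from the SIFT table above.
NOT_TOLERATED = "wghydrnfqekcp"
TOLERATED = "MlASVTI"


def sift_call(residue):
    """Return 'tolerated' or 'not tolerated' for a one-letter residue at this position."""
    residue = residue.upper()
    if residue in TOLERATED.upper():
        return "tolerated"
    if residue in NOT_TOLERATED.upper():
        return "not tolerated"
    raise ValueError(f"unknown residue: {residue}")


print(sift_call("I"))  # tolerated
print(len(TOLERATED), len(NOT_TOLERATED))  # 7 13
```

Case is ignored in the lookup because SIFT uses it only to mark whether a residue appears in the alignment or results from prediction.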






PolyPhen2 Prediction

Finally, we also considered the PolyPhen2 prediction for this mutation. This prediction indicates how strongly damaging the mutation is likely to be. It gives a result for two models: HumDiv and HumVar. HumDiv is the preferred model for evaluating rare alleles, dense mapping of regions identified by genome-wide association studies and analysis of natural selection. In contrast, HumVar is the preferred model for the diagnostics of Mendelian diseases, which requires distinguishing mutations with drastic effects from all remaining human variation, including abundant mildly deleterious alleles. We decided to look at both models, which agree in most cases.

In this case both models predict that the mutation is benign (Figure 9 and Figure 10). This means that the mutation is neutral and will probably not damage the structure or the function of the protein.

Figure 9: HumDiv prediction
Figure 10: HumVar prediction




Structure-based Mutation Analysis

Mapping onto Crystal Structure

Figure 11: Visualization of the mutation and important functional sites
Color legend:
* red: position of the mutation
* green: position of the active site
* yellow: position of glycosylation
* cyan: position of Cysteine

First, we colored the functionally important residues and the mutated residue in the crystal structure to see whether the mutation is near to or far from the functional residues. As Figure 11 shows, the mutation is located within a loop near a cysteine residue. If the mutation changes the structure, it is therefore possible that this cysteine can no longer form a disulfide bond. This could cause dramatic changes in the structure of the protein and lead to an inactive protein.

Back to [Structure-based mutation analysis]


SCWRL Prediction

Next, we analysed the mutation in more detail by looking directly at the structure and orientation of the residue.

Figure 12: Amino acid Serine
Figure 13: Amino acid Isoleucine
Figure 14: Visualization of the mutation

The pictures show that the amino acids are very different. Isoleucine (the mutated amino acid, Figure 13) is bigger than the original amino acid Serine (Figure 12). This could lead to clashes and to changes in the H-bonds within the protein. If there are clashes, the protein has to fold in another way, which could destroy the disulfide bond between the two cysteine residues and, with it, the complete structure and function of the protein.




FoldX Energy Comparison

One important point in the analysis of mutations is the energy of the protein with the original and with the mutated amino acid. Often the energy increases dramatically with the mutated amino acid. This means that the protein becomes very unstable, and it is then often possible that the protein can no longer bind its ligands. Conversely, the protein with the mutated amino acid may have a lower energy than the original protein. This means that the protein is too rigid and loses its flexibility. Then, too, the protein may no longer be able to bind its ligands.

We therefore compared the energy of our protein using different methods. Here we present the result of FoldX.

Original total energy | Total energy for the mutated protein | Strongest energy change within the mutated protein
-154.17 | -152.15 | Energy_vdwclash

The energy of the mutated structure is slightly higher than that of the original structure, but the values do not differ much. The strongest energy change within the mutated protein is in the van der Waals clash term (Energy_vdwclash), which means that atoms clash sterically. So the problem seems to lie not with the H-bonds but with steric van der Waals interactions within the protein. This is a strong hint that the mutation damages the protein. To be able to compare the energy values with values from other programs, we calculated a ratio between the two values.

Ratio of the original structure | Ratio of the mutated structure | Difference
100 | 98.79 | 1.21
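
The text does not state how the ratio is defined. A plausible reading, and an assumption here, is that the original energy is set to 100 and the mutated energy is expressed as a percentage of it; this definition reproduces the ratios reported for the Minimise and Gromacs energies in the following sections. The function name `energy_ratio` is hypothetical.

```python
def energy_ratio(original, mutated):
    """Energy of the mutated structure as a percentage of the original energy (assumed definition)."""
    return 100.0 * mutated / original


# FoldX total energies from the table above.
foldx_original = -154.17
foldx_mutated = -152.15

ratio = energy_ratio(foldx_original, foldx_mutated)
print(f"ratio: {ratio:.2f}, difference: {100.0 - ratio:.2f}")
# ratio: 98.69, difference: 1.31
```

With the two FoldX energies as listed, this definition gives about 98.69 rather than the tabulated 98.79, so one of the listed FoldX values may contain a typo; the conclusion that the energies differ only slightly is unaffected.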




Minimise Energy Comparison

Next, we used the Minimise energy tool to compare the energy values of the two structures.

Comparing Energy:

Original total energy | Total energy for the mutated protein
-9610.467157 | -6189.246312

As before, the energy of the mutated structure is higher. In this case, however, the difference between the two energy values is much larger than in our analysis with FoldX. To compare the values more rigorously, we again calculated a ratio between the energy values.

Ratio of the original structure | Ratio of the mutated structure | Difference
100 | 64.40 | 35.60

In this case the difference between the two ratios is about 35%, which is a very high value.

Comparing Structure:

This tool also outputs a [PDB] file containing the positions of the original and the mutated amino acid.

Figure 15: Amino acid Serine
Figure 16: Amino acid Isoleucine
Figure 17: Visualization of the mutation

The pictures (Figure 15, Figure 16, Figure 17) show that the structures of the two amino acids differ extremely. The orientation of the amino acids is similar to that shown with SCWRL.


Visualization of H-bonds and Clashes:

To gain more insight into the effects of the mutated amino acid on the structure, we also analysed the H-bonds and clashes of the mutated residue.

Figure 18: H-bonds of the original amino acid (colored in magenta)
Figure 19: H-bonds of the mutated amino acid (colored in red)
Figure 20: Possible clashes

The original amino acid has two H-bonds to another residue in the protein (Figure 18). These two bonds are lost in the structure with the mutated amino acid (Figure 19). There is therefore also a problem with missing H-bonds in the protein, which means that the protein with the mutated amino acid is less stable than the original structure. Figure 20 shows that there is no problem with possible clashes.



Gromacs Energy Comparison

Comparing Energy:

To analyse the energy values calculated by Gromacs, we used the AMBER99SB-ILDN force field.

Here are the values of the original structure:

Energy | Average | Err.Est. | RMSD | Tot-Drift
Bond | 1091.57 | 270 | -nan | -1622.75
Angle | 3326.81 | 62 | -nan | 404.076
Potential | -61304.1 | 960 | -nan | -6402.44

Here are the values that Gromacs calculated for the structure with the mutated amino acid:

Energy | Average | Err.Est. | RMSD | Tot-Drift
Bond | 1156.62 | 400 | 3388.76 | -2308.27
Angle | 3267.23 | 48 | 208.209 | 308.52
Potential | -48652.6 | 1200 | 5442.19 | -7274.41

In this case the values of the original and the mutated structure differ extremely, as in our earlier analysis with Minimise.
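
The two energy tables can be compared term by term. The sketch below parses whitespace-separated rows holding the same values as the tables above and prints the change in the average of each term. The parsing helper and column names are hypothetical; note that Python's `float` accepts the `-nan` entries and turns them into NaN.

```python
import math

# Rows with the values listed above: term, Average, Err.Est., RMSD, Tot-Drift.
ORIGINAL = """\
Bond 1091.57 270 -nan -1622.75
Angle 3326.81 62 -nan 404.076
Potential -61304.1 960 -nan -6402.44
"""
MUTATED = """\
Bond 1156.62 400 3388.76 -2308.27
Angle 3267.23 48 208.209 308.52
Potential -48652.6 1200 5442.19 -7274.41
"""
COLUMNS = ("average", "err_est", "rmsd", "tot_drift")


def parse_energy_rows(text):
    """Map each energy term to a dict of its four numeric columns."""
    table = {}
    for line in text.splitlines():
        name, *values = line.split()
        table[name] = dict(zip(COLUMNS, map(float, values)))
    return table


original = parse_energy_rows(ORIGINAL)
mutated = parse_energy_rows(MUTATED)

for term in original:
    delta = mutated[term]["average"] - original[term]["average"]
    print(f"{term}: {delta:+.2f}")
# Bond: +65.05
# Angle: -59.58
# Potential: +12651.50
```

The bond and angle terms change only slightly, so almost all of the difference sits in the potential energy, and the estimated errors (960 and 1200) are small compared with that shift.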

One difference between Gromacs and the other tools we used is that Gromacs also calculates the energy of the bonds and the angles. To compare the energies between the different tools, we consider only the potential energy, because it is the energy of the complete protein. We therefore calculated the ratio only for the potential energy.

Ratio of the original structure | Ratio of the mutated structure | Difference
100 | 79.36 | 20.64


Comparing Structure:

The structures produced by Gromacs can also be visualized; pictures of the original and the mutated amino acid are shown below.

Figure 21: Amino acid Serine
Figure 22: Amino acid Isoleucine
Figure 23: Visualization of the mutation

The pictures (Figure 21, Figure 22, Figure 23) are analogous to the pictures shown before. The amino acids differ extremely in size and orientation.

Visualization of H-bonds and Clashes:

To check whether these differences affect the interactions with the surrounding residues, we analysed the H-bonds and clashes between the mutated amino acid and the rest of the protein.

Figure 24: H-bonds of the original protein
Figure 25: H-bonds near the mutation
Figure 26: Possible clashes of the mutation

In contrast to the original amino acid (Figure 24), the model from Gromacs shows no H-bonds between the mutated amino acid and the rest of the protein (Figure 25). So again two H-bonds are missing in the mutated structure, which means that the mutated structure is less stable than the original structure. As before, there is no problem with clashes between the mutated amino acid and the rest of the protein, as can be seen in Figure 26.
