Molecular Dynamics Simulations HEXA

From Bioinformatikpedia
Revision as of 19:56, 30 September 2011 by Uskat (talk | contribs) (RMSF for protein and C-alpha and PyMol analysis of average and B-factor)

Run the MD simulation

A detailed description of how to run the MD analysis software and reproduce our results can be found [here].

Detailed results

The detailed results and their discussion can be found on a separate page for each run.


Back to [top].
Back to [Tay-Sachs Disease].

Comparison of the results

In this section, we compare the results of the MD analysis to see whether there are differences between the wild type structure and the structures carrying a mutation. For more information about the individual analyses, please see the section “Detailed results”.

Check the trajectory

Wildtype Mutation 436 Mutation 485
Item #frames Timesteps (ps) Item #frames Timesteps (ps) Item #frames Timesteps (ps)
Step 2001 5 Step 2001 5 Step 2001 5
Time 2001 5 Time 2001 5 Time 2001 5
Coords 2001 5 Coords 2001 5 Coords 2001 5

As the table above shows, each simulation has the same number of frames and the same timestep for every item. The MD simulation runs are therefore comparable, and we used their results for the comparison below.
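The comparability check above can be written down as a short sketch. It assumes that "Timesteps (ps)" is the spacing between stored frames and that the first frame is at t = 0; both are our reading of the table, not something the page states explicitly.

```python
def simulated_time_ps(n_frames: int, frame_spacing_ps: float) -> float:
    """Time spanned by n_frames frames written every frame_spacing_ps picoseconds."""
    if n_frames < 1:
        raise ValueError("a trajectory needs at least one frame")
    # Assumption: the first frame is at t = 0, so n frames span (n - 1) intervals.
    return (n_frames - 1) * frame_spacing_ps


# Frame counts and spacing as listed in the table above.
runs = {
    "Wildtype": (2001, 5),
    "Mutation 436": (2001, 5),
    "Mutation 485": (2001, 5),
}

lengths = {name: simulated_time_ps(n, dt) for name, (n, dt) in runs.items()}
# The runs are directly comparable if every run has the same (frames, spacing) pair.
comparable = len(set(runs.values())) == 1

print(lengths)
print("comparable:", comparable)
```

Under these assumptions each run covers 10000 ps, that is 10 ns.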

Visualize in PyMol

Next, we want to compare the pictures of the tertiary structure with PyMol.

wildtype Mutation 436 Mutation 485
Figure 1: Visualization of the MD simulation for the wildtype with PyMol
Figure 2: Visualization of the MD simulation for Mutation 436 with PyMol
Figure 3: Visualization of the MD simulation for Mutation 485 with PyMol

In general, the structures of the different simulation results are very similar (see Figure 1, Figure 2 and Figure 3). This was expected, because we mutated only one amino acid in the complete system.

Create a movie

The MD simulation makes it possible to compare movies of the protein motion. We therefore created one movie in stick view and one movie in cartoon view for each simulation result, which can be seen in Figure 4 - 9.

wildtype Mutation 436 Mutation 485
Figure 4: Movie of the motion of the wildtype in stick view.
Figure 5: Movie of the motion of mutation 436 in stick view.
Figure 6: Movie of the motion of mutation 485 in stick view.
Figure 7: Movie of the motion of the wildtype in cartoon view
Figure 8: Movie of the motion of mutation 436 in cartoon view
Figure 9: Movie of the motion of mutation 485 in cartoon view

The motion of the complete protein seems to be very similar in all three simulations (compare Figure 4 - Figure 9), so no difference is visible at this level.
We also want to have a closer look at the motion of individual residues. The MD simulation calculates the motion of each residue explicitly rather than approximating it, which makes it possible to compare the motion of the original amino acid with that of the mutated amino acid. These comparisons are shown in Figure 10 - Figure 13.

  • Mutation at position 436
wildtype Mutation 436
Figure 10: Detailed view of the motion of the original amino acid at position 436
Figure 11: Detailed view of the motion of the mutated amino acid at position 436

The two amino acids look very similar and so does their motion, as can be seen in Figure 10 and Figure 11. We therefore suggest that there is no big difference between these two amino acids or in the motion of the protein, and that this substitution may not change the function and the motion of the protein.

  • Mutation at position 485
wildtype Mutation 485
Figure 12: Detailed view of the motion of the original amino acid at position 485
Figure 13: Detailed view of the motion of the mutated amino acid at position 485

First, the two amino acids are clearly different. Second, the original amino acid is more flexible than the mutated one: it shows more motion during the simulation (compare Figure 12 and Figure 13). Because of this difference in motion, we suggest that the mutated amino acid may change the function and, locally, the structure of the protein.

Energy calculations for pressure, temperature, potential and total energy

In this section we compare the pressure, temperature, potential and total energy of the different runs.

Pressure
Wildtype Mutation 436 Mutation 485
Average (bar)
1.00711 1.0066 0.998385
Minimum (bar)
-217.3543 -219.7197 -230.0158
Maximum (bar)
231.9909 238.8288 243.7419
Figure 14: Pressure distribution of the wildtype.
Figure 15: Pressure distribution of the Mutation 436.
Figure 16: Pressure distribution of the Mutation 485.

The average pressure differs between the systems, but the differences are very small (all three averages are close to 1 bar) and should not change the structure much. We therefore think that such small differences do not explain why the two mutated proteins no longer function. The minimum and maximum values differ more: between the wildtype and Mutation 485 there is a difference of more than 10 bar at both extremes. Looking at Figure 14, Figure 15 and Figure 16, the distribution of the pressure over the simulation time is very similar, which indicates that there are no big differences between the three simulation results.
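The average, minimum and maximum values in these tables can be obtained from a time series with a few lines of code. The sketch below assumes the series was exported as a two-column text file in the XVG style (comment lines starting with `#` or `@`); the file format and the sample numbers are assumptions for illustration, not the real simulation output.

```python
from statistics import mean


def read_xvg(text: str) -> list[tuple[float, float]]:
    """Parse (time, value) pairs, skipping blank lines and '#'/'@' header lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        fields = line.split()
        rows.append((float(fields[0]), float(fields[1])))
    return rows


def summarize(values: list[float]) -> dict[str, float]:
    """Average, minimum and maximum of a series, as used in the tables."""
    return {"average": mean(values), "minimum": min(values), "maximum": max(values)}


# Made-up pressure samples (time in ps, pressure in bar), not the real data.
example = """\
# hypothetical example file
@ title "Pressure"
0   -12.5
5    30.0
10  -14.5
15    1.0
"""

pressure = [value for _, value in read_xvg(example)]
stats = summarize(pressure)
print(stats)
```

The same `summarize` helper (a hypothetical name) would apply unchanged to temperature, potential energy and total energy.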

Temperature
Wildtype Mutation 436 Mutation 485
Average (in K)
297.94 297.94 297.936
Minimum (in K)
294.82 294.63 294.99
Maximum (in K)
301.31 300.83 301.08
Figure 17: Temperature distribution of the Wildtype.
Figure 18: Temperature distribution of Mutation 436.
Figure 19: Temperature distribution of Mutation 485.

The average temperature of the three systems is nearly the same; only the temperature of Mutation 485 is marginally lower (297.936 K compared with 297.94 K). This difference is so small that we can say the three models have the same temperature. The plots of the temperature over the simulation time (Figure 17, Figure 18 and Figure 19) show nearly the same picture. Each plot contains some outliers towards higher or lower temperatures, but for almost the complete simulation the system has a temperature of about 298 K.

Potential
Wildtype Mutation 436 Mutation 485
Average (in kJ/mol)
-1.2815e+06 -1.28165e+06 -1.28176e+06
Minimum (in kJ/mol)
-1.2853e+06 -1.2852e+06 -1.28513e+06
Maximum (in kJ/mol)
-1.2778e+06 -1.2771e+06 -1.27769e+06
Figure 20: Potential energy distribution of the Wildtype.
Figure 21: Potential energy distribution of Mutation 436.
Figure 22: Potential energy distribution of the Mutation 485.

The average potential energy of the three structures is very similar, although there are small differences between the wildtype structure and the mutated structures. The wildtype has the highest potential energy, Mutation 436 is slightly lower, and the structure with the mutation at position 485 has the lowest potential energy. The values in the table above are only averages, which makes a more detailed analysis necessary. Figure 20 - 22 show the course of the potential energy over time. Comparing Figure 20 and Figure 22 in particular, we can see that the potential energy of Mutation 485 is lower than that of the wildtype during almost the complete simulation. Figure 21 shows the same result for Mutation 436, although not as clearly as Figure 22. All in all, both mutations change the potential energy slightly.

Total Energy
Wildtype Mutation 436 Mutation 485
Average (in kJ/mol)
-1.0517e+06 -1.0519e+06 -1.05203e+06
Minimum (in kJ/mol)
-1.0559e+06 -1.0557e+06 -1.0569e+06
Maximum (in kJ/mol)
-1.0472e+06 -1.0463e+06 -1.0468e+06
Figure 23: Total energy distribution of the Wildtype.
Figure 24: Total energy distribution of Mutation 436.
Figure 25: Total energy distribution of Mutation 485.

Looking at the total energy of the systems, we see a trend that is similar to, but a bit stronger than, the one already observed for the potential energy. Both mutations therefore have an effect on the energy of the protein. The change is again small, but even small changes in the energy of a protein can impair its function. The mutation at position 485 has a larger effect on the energy of the system than the mutation at position 436, but both mutations decrease the energy of the structure. It is therefore possible that the structure becomes too rigid and can no longer bind its targets as it does without the mutation.

In the plots (Figure 23 - 25) the distributions look similar, but the average level is lower in Figure 24 and Figure 25 than in the wildtype plot (Figure 23). This result agrees with the one for the potential energy, which was expected, because the potential energy is part of the total energy: a difference in the potential energy causes a change in the total energy as well. In our case, the total energy shows more clearly that there is an energy difference between the wildtype structure and the mutated structures.
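To put the size of these energy differences into numbers, the averages from the two tables can be compared directly with the wildtype. This is only a worked calculation on the tabulated averages; the helper name `shift_vs_wildtype` is ours.

```python
# Average energies in kJ/mol, copied from the tables above.
averages_kj_mol = {
    "potential": {"Wildtype": -1.2815e6, "Mutation 436": -1.28165e6, "Mutation 485": -1.28176e6},
    "total": {"Wildtype": -1.0517e6, "Mutation 436": -1.0519e6, "Mutation 485": -1.05203e6},
}


def shift_vs_wildtype(values: dict[str, float]) -> dict[str, tuple[float, float]]:
    """Return (absolute shift in kJ/mol, shift in percent of the wildtype value)."""
    ref = values["Wildtype"]
    return {
        name: (round(v - ref, 1), round(100.0 * (v - ref) / abs(ref), 3))
        for name, v in values.items()
        if name != "Wildtype"
    }


shifts = {kind: shift_vs_wildtype(values) for kind, values in averages_kj_mol.items()}
for kind, result in shifts.items():
    print(kind, result)
```

The shifts come out at -150 and -260 kJ/mol for the potential energy and -200 and -330 kJ/mol for the total energy, which is in the range of 0.01 to 0.03 percent of the wildtype value. The trend is consistent, but the relative change is small.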

Minimum distance between periodic boundary cells

Now we compare the minimum distance between periodic boundary cells. First, the distance should not be 0, because then some parts of the protein would interact with the protein itself (through its periodic image), which should not occur. These minimum distance values should therefore not be too low. Second, even small differences between the values of the different systems could have big effects on the protein structure: if some parts of the protein interact with the protein itself, they can no longer interact with their original partners, and the shape of the protein could be changed or destroyed.
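What "minimum distance between periodic cells" means can be illustrated with a small brute-force sketch. It assumes a rectangular box and checks only the 26 neighbouring images; the function name and the toy coordinates are hypothetical, and a real analysis tool would also handle other box shapes and be much faster.

```python
import itertools
import math


def min_image_distance(coords, box):
    """Smallest distance between any atom and any atom of a neighbouring periodic image.

    coords: list of (x, y, z) positions in nm
    box:    (lx, ly, lz) edge lengths of a rectangular box in nm (assumption)
    """
    best = math.inf
    for shift in itertools.product((-1, 0, 1), repeat=3):
        if shift == (0, 0, 0):
            continue  # skip the molecule itself, only images are of interest
        offset = [s * length for s, length in zip(shift, box)]
        for a in coords:
            for b in coords:
                image = [bi + oi for bi, oi in zip(b, offset)]
                best = min(best, math.dist(a, image))
    return best


# Toy "molecule" of two atoms in a 5 nm cubic box.
toy_coords = [(1.0, 1.0, 1.0), (4.0, 1.0, 1.0)]
toy_box = (5.0, 5.0, 5.0)
print(min_image_distance(toy_coords, toy_box))
```

In the toy example the two atoms are 3 nm apart inside the box, but the atom at x = 4 has an image at x = -1, only 2 nm from the atom at x = 1. This is the kind of close contact with a periodic copy that the analysis is meant to detect.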

Wildtype Mutation 436 Mutation 485
Average (in nm)
3.139 2.415 3.215
Minimum (in nm)
1.770 1.408 1.772
Maximum (in nm)
4.081 4.096 4.217
Figure 26: Plot of the minimum distance between periodic boundary cells of the wildtype.
Figure 27: Plot of the minimum distance between periodic boundary cells of mutation 436.
Figure 28: Plot of the minimum distance between periodic boundary cells of mutation 485.

At first sight, Figure 26, Figure 27 and Figure 28 show very different plots. It is important to keep in mind that the outcome of an MD simulation is not deterministic in practice, so two runs do not follow the same time course. We therefore cannot compare the time lines themselves, but we can compare the values and their distribution. In the wildtype plot, the values lie mostly between 2 and 4 nm. Most of the time the values are about 3.7 nm, and only some values are lower than 3 nm.
If we compare the wildtype plot with the one for the mutation at position 436, we can see that almost all minimum distances during the simulation are lower than 3 nm. The average distance is therefore clearly lower than for the wildtype (2.415 nm for mutation 436 compared with 3.139 nm for the wildtype). Because there is only one change in the complete sequence of the protein, we suggest that the mutated part causes this change. The mutation therefore appears to lead to different interactions within the protein and probably changes the shape of the protein.
If we compare the wildtype structure with the structure mutated at position 485, we can see that most distances are about 3 nm. At first sight the two plots seem very different, but a closer look indicates that they are more similar to each other than the wildtype plot is to the plot of mutation 436. In this case the average minimum distance increases slightly (3.215 nm compared with 3.139 nm, about 0.08 nm), a much weaker effect than the one observed for mutation 436. Nevertheless, even small changes can influence the function and the shape of the protein. The interacting parts therefore seem to be slightly farther apart than in the wildtype.
Interestingly, the two mutations have different effects on the interactions within the protein. Mutation 436 decreases the minimum distance between interacting atoms of the protein, whereas the mutation at position 485 increases it. Whether the minimum distance decreases or increases, in both cases the mutation changes the distances, and we therefore suggest that both mutations have an effect on the protein structure and function.
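Because the time lines of independent runs cannot be compared point by point, a comparison of the value distributions is the more robust approach. A minimal sketch, using made-up distance samples and a 3 nm threshold taken from the discussion above:

```python
from statistics import median


def fraction_below(values, threshold):
    """Fraction of frames whose value lies below the threshold."""
    return sum(1 for v in values if v < threshold) / len(values)


# Hypothetical minimum-distance samples in nm (not the real trajectories).
series = {
    "run_a": [3.7, 3.6, 2.9, 3.8, 3.7],
    "run_b": [2.4, 2.1, 3.1, 2.6, 1.9],
}

summary = {
    name: {"below_3nm": fraction_below(values, 3.0), "median": median(values)}
    for name, values in series.items()
}
for name, result in summary.items():
    print(name, result)
```

Order-independent quantities such as the median or the fraction of frames below a threshold stay meaningful even when the individual frames of two runs do not correspond to each other.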

RMSF for protein and C-alpha and PyMol analysis of average and B-factor

Next we check whether the mutations change the flexibility of the protein. We calculate the RMSF for the complete protein and for the C-alpha atoms only, which lets us distinguish between flexibility of the side chains and flexibility of the backbone. The program also calculates an average protein structure from all structures generated during the simulation, and additionally derives B-factor values from the simulated structures. Finally, we visualize the most interesting results with PyMol.

original & average (protein) original & B-Factors (protein) average & B-Factors (protein) original & average (c-alpha) original & B-Factors (c-alpha) average & B-Factors (c-alpha)
Wildtype
1.556 0.349 1.684 1.373 0.279 -
Mutation 436
1.525 0.348 1.671 1.324 0.277 1.334
Mutation 485
1.519 0.349 1.727 1.258 0.283 1.297

First, we compare the RMSD values between the different systems; second, we compare the RMSD values between the different structures.

Regarding the RMSD values calculated by aligning the original and the average structure, the value for the wildtype is the highest (1.556). The values for the two mutations are similar, and the lowest value belongs to the structure with the mutation at position 485 (1.519). A low RMSD value means that there is less motion during the simulation. Our wildtype structure therefore seems to move most during the simulation and seems to be the most flexible.
The RMSD values between the original structure and the B-factor structure are lower than those between the original and the average structure, and there is practically no difference between the systems (0.349, 0.348 and 0.349). The mutations therefore do not seem to change the flexibility of individual residues. The wildtype structure moves more than the mutated structures, but this seems to be independent of the flexibility of the single residues. The alignments between the average structure and the B-factor structure gave higher RMSD values than those between the average and the original structure, so the average structure differs more from the B-factor structure than from the original structure. This fact is, however, not that important.
It is more important to recognize that the wildtype structure shows more motion than the mutated structures although there is no difference in the flexibility of the individual residues. This indicates that the protein does not move more because of higher flexibility, but for other reasons, for example different energies or a higher kinetic rate. One possible explanation is that the mutation changes the energy and the bonds in the protein, and that this prevents a strong motion of the backbone.

Next we want to see whether the motion of the protein is mainly caused by the side chains or by the backbone. We therefore calculate the RMSF using only the C-alpha atoms, so the side chains are no longer considered.
The table shows that the RMSD values are lower for the C-alpha atoms, but the difference is not very large, so most of the motion is backbone motion and not a repositioning of side chains. The trend is the same as for the complete protein: the backbone of the wildtype structure shows the most motion (1.373), whereas the backbones of the mutated structures show less motion (1.324 and 1.258).

Furthermore, we obtained a plot that shows the RMS fluctuation at each position within the protein. Residues with a high RMS fluctuation have a high B-factor value and are therefore very flexible. We want to check whether the flexible residues differ between the wildtype structure and the mutated structures.
In general, there are fewer peaks in the RMS fluctuation calculated with the C-alpha atoms (which can be seen in the detailed results), but for this comparison we only look at the RMS fluctuation of the residues calculated for the complete protein.
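The link between RMS fluctuation and B-factor mentioned above can be made explicit. The sketch below computes the RMSF of one atom from its positions over the frames and converts it with the standard isotropic relation B = (8π²/3)·RMSF². It assumes the frames have already been superimposed on a reference structure; the function names and the two-frame example are illustrative only.

```python
import math


def rmsf(positions):
    """Root mean square fluctuation of one atom around its mean position.

    positions: list of (x, y, z) tuples, one per frame, already fitted
               to a common reference (assumption).
    """
    n = len(positions)
    mean_pos = [sum(p[k] for p in positions) / n for k in range(3)]
    mean_sq_dev = sum(
        sum((p[k] - mean_pos[k]) ** 2 for k in range(3)) for p in positions
    ) / n
    return math.sqrt(mean_sq_dev)


def b_factor(rmsf_value):
    """Isotropic B-factor, in the squared length unit of the RMSF value."""
    return 8.0 * math.pi ** 2 / 3.0 * rmsf_value ** 2


# Toy example: an atom that oscillates between two positions 0.2 nm apart.
toy_positions = [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0)]
toy_rmsf = rmsf(toy_positions)
print(toy_rmsf, b_factor(toy_rmsf))
```

Because the B-factor grows with the square of the RMSF, residues with high peaks in the RMSF plot stand out even more strongly when the structure is coloured by B-factor in PyMol.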

Wildtype Mutation 436 Mutation 485
Figure 29: Plot of the RMSF values over the whole protein of the wildtype.
Figure 30: Plot of the RMSF values over the whole protein of mutation 436.
Figure 31: Plot of the RMSF values over the whole protein of mutation 485.
Number of peaks (nm > 0.2)
7 2 8

At first sight the plots (Figure 29 - 31) are relatively similar. All three plots show the same distribution and have similar peaks, although the height of the peaks differs. We chose a cutoff of 0.2 nm to decide whether a residue is flexible. With this cutoff, the wildtype has 7 very flexible residues, whereas mutation 436 has only 2. Mutation 485 has 8 very flexible residues and is therefore very similar to the wildtype.
In general, the distribution of the flexibility is similar for all three structures, but there are big differences in the height of the peaks. We can therefore see that mutation 436 in particular changes the flexibility of individual residues within the protein and therefore the flexibility of the complete protein.
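Counting peaks above the 0.2 nm cutoff can be done in two ways: counting every residue above the cutoff, or counting contiguous stretches of such residues as one peak. The text does not say which variant was used, so the sketch below shows both on made-up RMSF values.

```python
def flexible_residues(rmsf_nm, cutoff=0.2):
    """1-based indices of residues whose RMSF exceeds the cutoff."""
    return [i for i, value in enumerate(rmsf_nm, start=1) if value > cutoff]


def flexible_regions(rmsf_nm, cutoff=0.2):
    """Number of runs of consecutive residues above the cutoff."""
    regions = 0
    inside = False
    for value in rmsf_nm:
        above = value > cutoff
        if above and not inside:
            regions += 1  # a new stretch of flexible residues starts here
        inside = above
    return regions


# Hypothetical per-residue RMSF values in nm.
toy_rmsf = [0.10, 0.25, 0.30, 0.10, 0.05, 0.22, 0.10]
print(flexible_residues(toy_rmsf))
print(flexible_regions(toy_rmsf))
```

In the toy data three residues exceed the cutoff, but they form only two peaks, so the two counting schemes can give different numbers for the same plot.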

Radius of gyration

The radius of gyration is the RMS distance of the parts of the protein from their common center. It gives a good insight into the shape of the protein during the simulation: a higher radius means that the distance between the protein parts and the protein center is larger, so the protein is more extended than before.
As the result of the calculation we got a plot for each structure that shows the radius of gyration together with its components. The first component in the plot corresponds to the longest axis of the molecule. We therefore know not only the radius of gyration, but also which axes contribute most to it.
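The definition above can be sketched in a few lines. The total radius of gyration is the mass-weighted RMS distance from the center of mass. For the per-axis components, the sketch assumes they are radii of gyration about the x, y and z axes (using the distance from each axis); this is our assumption about how the components were computed, and the toy coordinates are made up.

```python
import math


def radius_of_gyration(coords, masses):
    """Return (Rg, (Rg about x, Rg about y, Rg about z)).

    coords: list of (x, y, z) positions; masses: one mass per position.
    """
    total = sum(masses)
    com = [sum(m * c[k] for c, m in zip(coords, masses)) / total for k in range(3)]
    rel = [[c[k] - com[k] for k in range(3)] for c in coords]

    rg = math.sqrt(
        sum(m * (r[0] ** 2 + r[1] ** 2 + r[2] ** 2) for r, m in zip(rel, masses)) / total
    )

    about_axes = []
    for axis in range(3):
        # Distance from an axis only involves the two other coordinates.
        others = [k for k in range(3) if k != axis]
        about_axes.append(
            math.sqrt(sum(m * sum(r[k] ** 2 for k in others) for r, m in zip(rel, masses)) / total)
        )
    return rg, tuple(about_axes)


# Toy system: four unit masses in the xy plane.
toy_coords = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, -2.0, 0.0)]
toy_rg, toy_axes = radius_of_gyration(toy_coords, [1.0] * 4)
print(toy_rg, toy_axes)
```

With this definition the components are linked to the total value by Rg² = (RgX² + RgY² + RgZ²) / 2, because every squared coordinate is counted in two of the three axis terms.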

wildtype Mutation 436 Mutation 485
Average (Rg in nm)
2.407 2.408 2.416
Minimum (Rg in nm)
2.346 2.344 2.347
Maximum (Rg in nm)
2.440 2.436 2.339
Figure 32: Distribution of the radius of gyration over time of the wildtype
Figure 33: Distribution of the radius of gyration over time of Mutation 436
Figure 34: Distribution of the radius of gyration over time of Mutation 485

First of all, the radius of gyration (Figure 32, Figure 33 and Figure 34) is very similar for the different structures. All structures therefore need almost the same space, which was expected, because the structures have the same length.

Radius of gyration per axis (in nm)
Axis Statistic Wildtype Mutation 436 Mutation 485
RgX Average 2.153 2.094 2.145
RgX Minimum 2.012 1.986 1.992
RgX Maximum 2.214 2.179 2.212
RgY Average 1.609 1.853 1.630
RgY Minimum 1.423 1.581 1.444
RgY Maximum 1.807 2.102 1.927
RgZ Average 2.084 1.929 2.094
RgZ Minimum 1.945 1.618 1.809
RgZ Maximum 2.238 2.212 2.219

A closer look at the individual axes of the protein shows bigger differences. The values for the x axis are similar between the structures, but the values for the y and z axes differ considerably. In the wildtype, the z axis value is almost the same as the x axis value, whereas the y axis value is much lower. In the plot of mutation 436 the axis values vary more: in some situations the y and z values are almost the same, in some the y value is close to the x value, and in others the z value is close to the x value. This structure therefore seems to pulsate, because the radius of gyration for the y and z axes changes constantly. For mutation 485, the z axis value is similar to the x axis value most of the time and the y axis value is much lower, which is very similar to the wildtype. There are, however, some situations in which the y and z axis values are very similar, so this structure also seems to move more along the axes than the wildtype.
In general, we already know that our wildtype structure is very flexible and shows a lot of motion. Nevertheless, this system does not seem to slide along any axis, as the mutated structures do.
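As a plausibility check on the two tables, the average axis values can be recombined into a total radius of gyration. This assumes that the axis values are radii of gyration about each axis, for which Rg² = (RgX² + RgY² + RgZ²) / 2 holds; the relation is exact per frame and only approximate for time averages.

```python
import math

# Average values in nm, copied from the two tables above.
table_nm = {
    "Wildtype": {"Rg": 2.407, "axes": (2.153, 1.609, 2.084)},
    "Mutation 436": {"Rg": 2.408, "axes": (2.094, 1.853, 1.929)},
    "Mutation 485": {"Rg": 2.416, "axes": (2.145, 1.630, 2.094)},
}


def rg_from_axes(axes):
    """Total Rg implied by the three per-axis values (assumed relation)."""
    return math.sqrt(sum(a * a for a in axes) / 2.0)


for name, row in table_nm.items():
    implied = rg_from_axes(row["axes"])
    print(f"{name}: tabulated {row['Rg']:.3f} nm, implied {implied:.3f} nm")
```

For all three structures the implied value agrees with the tabulated average to within 0.01 nm, which supports reading the axis values this way: although the components differ visibly between the structures, they add up to almost the same overall size.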

Solvent accessible surface area

As a next step, we analyzed the solvent accessible surface area of the wildtype and the two mutations for each residue and for each atom of the protein. Furthermore, we looked at the solvent accessibility of the whole protein during the simulations. We therefore obtained three plots for each structure, which show the solvent accessible area with standard deviation for all residues (Figure 35-37), for all atoms (Figure 38-40) and for the whole protein (Figure 41-43). Furthermore, we calculated the minimum, the maximum and the average of these distributions, which allows a detailed comparison.

wildtype Mutation 436 Mutation 485
Average (in nm²)
0.537 0.553 0.542
Minimum (in nm²)
0.004 0.003 0.007
Maximum (in nm²)
2.058 2.005 2.014
Figure 35: Solvent accessibility of each residue in the protein with standard deviation for the wildtype
Figure 36: Solvent accessibility of each residue in the protein with standard deviation for Mutation 436
Figure 37: Solvent accessibility of each residue in the protein with standard deviation for Mutation 485

First, we analyze the solvent accessibility of each residue. The table above shows that the minimum, maximum and average values largely agree for the wildtype, mutation 436 and mutation 485. The same can be seen in the plots of the solvent accessibility of each residue. This indicates that the mutations do not change the solvent accessibility of the residues dramatically. Furthermore, the plots show that in all three cases the amplitude of the fluctuation is low, with some exceptions. Besides, the standard deviation is very low in all three plots, which indicates that there are no extreme outliers. All three curves show that most residues move only slightly during the complete simulation, with some exceptions where the residues are very flexible. All in all, this suggests that the actual movement, which is small, is not strongly influenced by the two mutations.
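How a per-residue average with standard deviation, and the table values derived from it, might be computed is sketched below. The input layout (one list of per-residue areas per frame) and the numbers are assumptions for illustration; the actual values come from the analysis program.

```python
from statistics import mean, pstdev


def per_residue_stats(frames):
    """Average and standard deviation of the accessible area for each residue.

    frames: list of frames, each a list with one area value (nm^2) per residue.
    """
    by_residue = list(zip(*frames))  # transpose: one tuple of values per residue
    return [(mean(values), pstdev(values)) for values in by_residue]


# Two hypothetical frames for a three-residue toy protein.
toy_frames = [
    [0.5, 0.0, 1.2],
    [0.7, 0.0, 1.0],
]

toy_stats = per_residue_stats(toy_frames)
toy_averages = [avg for avg, _ in toy_stats]
toy_summary = {
    "average": mean(toy_averages),
    "minimum": min(toy_averages),
    "maximum": max(toy_averages),
}
print(toy_stats)
print(toy_summary)
```

A residue with a minimum near 0 nm², like the second one here, is buried for the whole simulation, while a small standard deviation means its exposure hardly changes from frame to frame.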


wildtype Mutation 436 Mutation 485
Average (in nm²)
0.031 0.032 0.032
Minimum (in nm²)
0 0 0
Maximum (in nm²)
0.560 0.558 0.561
Figure 38: Solvent accessibility of each atom of the complete protein with standard deviation for the wildtype
Figure 39: Solvent accessibility of each atom of the complete protein with standard deviation for Mutation 436
Figure 40: Solvent accessibility of each atom of the complete protein with standard deviation for Mutation 485

Next, we have a closer look at the solvent accessibility of each atom of the protein. The minimum, maximum and average values for the two mutations and the wildtype are very similar, as are the corresponding solvent accessibility plots. This indicates that the mutations do not cause extreme changes in the solvent accessibility, which is the same result as we obtained for the residues. The amplitude of the fluctuations is low most of the time, with some exceptions, which indicates that the protein moves only slightly most of the time. This again suggests that the actual movement, which is small, is not strongly influenced by the two mutations.

wildtype Mutation 436 Mutation 485
Average (in nm²)
135.036 138.727 135.452
Minimum (in nm²)
129.084 127.066 129.167
Maximum (in nm²)
142.218 146.571 142.977
Figure 41: Area of the protein which is accessible to the solvent during the simulation with standard deviation for the wildtype
Figure 42: Area of the protein which is accessible to the solvent during the simulation with standard deviation for Mutation 436
Figure 43: Area of the protein which is accessible to the solvent during the simulation with standard deviation for Mutation 485

Finally, we examine the solvent accessibility of the whole protein during the whole simulation. The minimum, maximum and average values are very similar for the wildtype, mutation 436 and mutation 485; only for mutation 436 are the average and the maximum a little higher. The plots, which describe the solvent accessibility for different physicochemical properties, look very similar, with no outstanding differences for mutation 436. This indicates that the mutations do not cause large changes in the solvent accessibility of the whole protein. Only mutation 436 possibly causes some very small changes. The fluctuation for all physicochemical properties is very constant, with no outstanding outliers. We therefore suggest that the movement of the protein is small, because there are no big changes of the kind a structural change would cause.

To sum up, all three solvent accessibility analyses deliver the same result. The mutations do not seem to cause large changes in the solvent accessibility. Furthermore, the movement seems to be small, because the fluctuations stay very constant during the whole simulation.

Hydrogen bonds between protein and protein / protein and water

Afterwards, we had a look at the hydrogen bonds, where we differentiate between hydrogen bonds within the protein and hydrogen bonds between the protein and the surrounding water. The following plots display the number of hydrogen bonds as well as the number of possible pairs within 0.35 nm during the simulation. Furthermore, we determine the minimum, maximum and average number of hydrogen bonds for a better comparison.
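The difference between a "possible pair within 0.35 nm" and an actual hydrogen bond can be illustrated with a simple geometric test. The 0.35 nm distance cutoff is taken from the text; the additional angle criterion of 30 degrees (hydrogen-donor-acceptor angle) is an assumption on our side, as the page does not state which angle was used.

```python
import math


def classify_pair(donor, hydrogen, acceptor, max_dist_nm=0.35, max_angle_deg=30.0):
    """Classify a donor/acceptor pair as 'none', 'pair within cutoff' or 'hydrogen bond'.

    All positions are (x, y, z) tuples in nm. The angle is measured at the donor,
    between the donor->hydrogen and donor->acceptor directions (assumed criterion).
    """
    if math.dist(donor, acceptor) > max_dist_nm:
        return "none"
    dh = [h - d for h, d in zip(hydrogen, donor)]
    da = [a - d for a, d in zip(acceptor, donor)]
    cos_angle = sum(x * y for x, y in zip(dh, da)) / (math.hypot(*dh) * math.hypot(*da))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    return "hydrogen bond" if angle <= max_angle_deg else "pair within cutoff"


donor = (0.0, 0.0, 0.0)
hydrogen = (0.1, 0.0, 0.0)
print(classify_pair(donor, hydrogen, (0.29, 0.0, 0.0)))  # close and well aligned
print(classify_pair(donor, hydrogen, (0.0, 0.29, 0.0)))  # close but badly aligned
print(classify_pair(donor, hydrogen, (0.40, 0.0, 0.0)))  # too far away
```

This explains why the number of possible pairs is always larger than the number of hydrogen bonds: every hydrogen bond is a pair within the cutoff, but only pairs with a suitable geometry count as bonds.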

wildtype Mutation 436 Mutation 485
Figure 44: Number of hydrogen-bonds within the protein for the wildtype
Figure 45: Number of hydrogen-bonds within the protein for Mutation 436
Figure 46: Number of hydrogen-bonds within the protein for Mutation 485
Figure 47: Number of hydrogen-bonds and possible hydrogen-bonds within the protein for the wildtype
Figure 48: Number of hydrogen-bonds and possible hydrogen-bonds within the protein for Mutation 436
Figure 49: Number of hydrogen-bonds and possible hydrogen-bonds within the protein for Mutation 485

The plots above show the number of hydrogen bonds formed within the protein during the whole simulation. Looking only at the hydrogen bonds themselves, we can see a different development for the wildtype, mutation 436 and mutation 485. The numbers for the wildtype (Figure 44) and mutation 436 (Figure 45) decrease during the simulation, each in a somewhat different way. In contrast, the number for mutation 485 goes down only a little and increases again at the end (Figure 46). In all three cases the number stays mostly between 300 and 350. An interesting fact is that the wildtype has a higher number of hydrogen bonds than both mutated structures most of the time. The reason is probably that the mutations cause structural changes which influence the hydrogen bonds.

The other plots (Figure 47-49), which additionally contain the number of pairs within 0.35 nm, look almost the same. This is probably because of the scale. However, it shows that the difference is small compared with the number of possible hydrogen bonds that could be formed.

Hydrogen bonds within the protein (bonds in the protein / possible bonds in the protein)
Wildtype Mutation 436 Mutation 485
Average 328.758 / 1537.77 319.787 / 1534.866 323.337 / 1543.024
Minimum 294 / 1486 292 / 1483 300 / 1491
Maximum 361 / 1587 356 / 1584 354 / 1602

The table above contains the minimum, maximum and average number of hydrogen bonds formed within the protein for the wildtype and both mutations. Comparing these values, we can see that there is no big difference between the wildtype and the mutations. This indicates that, even if the development over time differs, the number of hydrogen bonds formed is very similar. Furthermore, it shows that the result for the wildtype is not outstandingly higher.

wildtype Mutation 436 Mutation 485
Figure 50: Number of hydrogen-bonds between the protein and the surrounding water for the wildtype
Figure 51: Number of hydrogen-bonds between the protein and the surrounding water for Mutation 436
Figure 52: Number of hydrogen-bonds between the protein and the surrounding water for Mutation 485
Figure 53: Number of hydrogen-bonds and possible hydrogen-bonds between the protein and the surrounding water for the wildtype
Figure 54: Number of hydrogen-bonds and possible hydrogen-bonds between the protein and the surrounding water for Mutation 436
Figure 55: Number of hydrogen-bonds and possible hydrogen-bonds between the protein and the surrounding water for Mutation 485

The plots above show the number of hydrogen bonds formed between the protein and the surrounding water during the whole simulation. Looking only at the hydrogen bonds themselves, we can see a different development for the wildtype, mutation 436 and mutation 485. The number of hydrogen bonds of the wildtype (Figure 50) and of mutation 436 (Figure 51) increases during the simulation, whereas the number for mutation 485 stays almost constant (Figure 52). Besides, the slope for the wildtype seems to be steeper. In all three cases the number of hydrogen bonds formed between the protein and the surrounding water is between 800 and 900.

The other plots (Figure 53-55), which additionally contain the number of pairs within 0.35 nm, look almost the same. This is probably because of the scale. However, it shows that the difference is small compared with the number of possible hydrogen bonds that could be formed.

Hydrogen bonds between the protein and the surrounding water (bonds / possible bonds)
Wildtype Mutation 436 Mutation 485
Average 836.94 / 981.18 853.403 / 999.847 847.965 / 999.310
Minimum 768 / 853 778 / 905 783 / 882
Maximum 916 / 1091 907 / 1106 912 / 1126

The table above contains the minimum, maximum and average number of hydrogen bonds formed between the protein and the surrounding water for the wildtype and both mutations. Comparing these values, there is no big difference between the wildtype and the mutations. The only notable difference is in the averages, where the wildtype has the smallest value (836.94 compared with 853.403 and 847.965). This indicates that the mutations probably change the protein structure in a way that more hydrogen bonds with the surrounding water are formed.

To sum up, the number of hydrogen bonds within the protein and between the protein and the surrounding water is very similar for all three structures. On closer inspection, the wildtype differs a bit, which indicates that the mutations cause small structural changes that lead to a different number of hydrogen bonds. Relative to the mutated structures, the wildtype forms more hydrogen bonds within the protein and fewer with the water. It is therefore possible that the mutated proteins form more bonds with the surrounding water instead of forming them within the protein.
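The shift from hydrogen bonds within the protein to hydrogen bonds with the water can be read directly from the average values in the two tables. The following worked calculation only restates the tabulated averages; the helper name is ours.

```python
# Average numbers of hydrogen bonds, copied from the two tables above.
average_bonds = {
    "within protein": {"Wildtype": 328.758, "Mutation 436": 319.787, "Mutation 485": 323.337},
    "protein-water": {"Wildtype": 836.94, "Mutation 436": 853.403, "Mutation 485": 847.965},
}


def change_vs_wildtype(values):
    """Difference of each mutant average to the wildtype average."""
    ref = values["Wildtype"]
    return {name: round(v - ref, 3) for name, v in values.items() if name != "Wildtype"}


changes = {kind: change_vs_wildtype(values) for kind, values in average_bonds.items()}
for kind, result in changes.items():
    print(kind, result)
```

Both mutants lose on average about 5 to 9 hydrogen bonds within the protein and gain about 11 to 16 hydrogen bonds with the water, which is the pattern described in the summary above.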

Ramachandran plot

Next, we analyzed the resulting Ramachandran plots for the wildtype, mutation 436 and mutation 485. In addition, we compared them with a typical Ramachandran plot, which displays which groups of points stand for a certain secondary structure element.

wildtype Mutation 436 Mutation 485 Typical Ramachandran Plot
Figure 44: Ramachandran Plot of the wildtype
Figure 45: Ramachandran Plot of Mutation 436
Figure 46: Ramachandran Plot of Mutation 485
Figure 47: Typical Ramachandran Plot (Source: [wikipedia])

The Ramachandran plots which were created from our simulations contain a lot of points. These points form no clear regions like in the typical Ramachandran plot. Instead, one region merges into the next one. The only clear boundary runs vertically. The left half contains the typical secondary structure elements. In all three cases these regions look very smeared, whereas mutation 485 differs a bit, because it lacks the white region at the bottom where no points are found. The right half contains in its middle the left-handed helices, which are strongly represented in all three plots. In this region the three plots differ as well: the wildtype has the clearest contours and mutation 485 displays the most blurred region. The remaining points on both sides represent other kinds of secondary structure.

The high number of dots and the smeared look can be explained by the fact that these Ramachandran plots were created over the whole simulation. Therefore, every movement and structural change contributes additional backbone conformations to the plot. The Ramachandran plots display clear differences between the wildtype, mutation 436 and mutation 485. This indicates that both mutations cause structural changes and influence the movement of the protein. The Ramachandran plot of mutation 485 deviates most, which indicates that this mutation has the greatest effect on the structure and its movement.
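Each point in a Ramachandran plot is one pair of backbone dihedral angles (phi, psi) of one residue in one frame. As a minimal sketch of how such an angle can be computed from four atom positions, consider the following Python function. It is not the code of the analysis software we used, the helper names are only illustrative, and the coordinates in the example are made up.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points.

    For phi the points are C(i-1), N(i), CA(i), C(i);
    for psi they are N(i), CA(i), C(i), N(i+1).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond b1.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Made-up planar test geometries: cis (0 degrees) and trans (180 degrees).
cis = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
trans = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
print(cis, abs(trans))
```

Since phi is plotted on the horizontal axis, residues with a negative phi end up in the left half of the plot and residues with a positive phi in the right half.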

RMSD matrix

As a next step, we compared the RMSD matrices of the wildtype, mutation 436 and mutation 485.

wildtype Mutation 436 Mutation 485
Figure 48: RMSD matrix of our structures during the simulation for the wildtype
Figure 49: RMSD matrix of our structures during the simulation for Mutation 436
Figure 50: RMSD matrix of our structures during the simulation for Mutation 485

The RMSD matrices display some differences between the three structures. The first RMSD matrix, that of the wildtype, has its smallest RMSD values around the diagonal. In contrast, the RMSD matrix of mutation 436 contains a square around the diagonal which holds the lowest values. On closer inspection, this difference is not extreme, which means that the protein with mutation 436 differs only slightly from the wildtype. The RMSD matrix of mutation 485 shows the highest variation from the wildtype: a square at the top right contains really low RMSD values. It is the only RMSD matrix which contains light blue colored regions far away from the diagonal. These differences between the RMSD matrices indicate that the protein with mutation 485 has the most regions with higher density. Mutation 436 causes some density changes as well, but they are probably harmless. Consequently, mutation 485 causes the largest density changes, which probably influences the ability of the protein to move considerably. We suggest that the protein with this mutation is probably more rigid and moves less.
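An RMSD matrix contains the RMSD between every pair of frames of a trajectory, so the diagonal is always zero, and low values far from the diagonal mean that the protein returns to a similar conformation later in the simulation. The following Python sketch illustrates the idea on a made-up toy trajectory. For simplicity it assumes that the frames are already superimposed and skips the fitting step which analysis tools normally perform.

```python
import math


def rmsd(frame_a, frame_b):
    """RMSD between two frames given as lists of (x, y, z) tuples.

    Assumption: both frames are already superimposed (no fitting is done).
    """
    squared = [
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(frame_a, frame_b)
    ]
    return math.sqrt(sum(squared) / len(squared))


def rmsd_matrix(frames):
    """Symmetric matrix with the RMSD of every pair of frames."""
    n = len(frames)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(frames[i], frames[j])
    return matrix


# Made-up toy trajectory: two atoms, three frames.
# Frame 2 returns to the conformation of frame 0.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
matrix = rmsd_matrix(frames)
for row in matrix:
    print(["%.3f" % value for value in row])
```

In this toy example the entry for frames 0 and 2 is zero although it lies away from the diagonal, which is the kind of off-diagonal low-RMSD region described above for mutation 485.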

cluster analysis

Afterwards, we looked at the different clusters for the wildtype, mutation 436 and mutation 485. For this, we compared the visualized clusters and the distribution of the RMSD values over the different clusters. Finally, we aligned selected structures from the clusters, calculated the RMSD and compared these values as well.

wildtype Mutation 436 Mutation 485
Figure 51: Visualization of the 225 different clusters for the wildtype
Figure 52: Visualization of the 231 different clusters for Mutation 436
Figure 53: Visualization of the 225 different clusters for Mutation 485
Figure 54: Distribution of the RMSD values over the different clusters for the wildtype
Figure 55: Distribution of the RMSD values over the different clusters for Mutation 436
Figure 56: Distribution of the RMSD values over the different clusters for Mutation 485

The first pictures (Figure 51, Figure 52, Figure 53) display the clusters for the wildtype, mutation 436 and mutation 485. In these pictures it is very difficult to make out small differences because of the scale, but it is possible to see that they look similar in general. The distributions of the RMSD values (Figure 54, Figure 55, Figure 56) are easier to compare, because some differences are directly visible. The largest difference concerns the shape of the curve. The wildtype curve increases slowly and decreases fast. In contrast, the RMSD distributions of both mutations increase fast and decrease very slowly, with one break in between. Mutation 485 differs most and contains a dip. This shows the influence of both mutations on the clusters and the associated structure or movement.

Cluster 1 Cluster 2 RMSD for the Wildtype RMSD for Mutation 436 RMSD for Mutation 485
cluster 1 cluster 2 0.654 0.880 0.790
cluster 1 cluster 5 0.899 0.068 0.755

Comparing the alignments of selected structures from the clusters, we can see that the RMSD does not differ dramatically. For cluster 1 and cluster 2 the smallest RMSD value is achieved by the structures of the wildtype. In contrast, the smallest RMSD value for cluster 1 and cluster 5 is reached by the structures of mutation 436, and this value is remarkably small. These two examples show that the mutations have an influence on the structure and the associated movement of the protein. With only these two samples it is not possible to determine the strength of the influence of the mutations.

To sum up, we can see that both mutations have some influence on the protein structure and associated movements whereas mutation 485 will cause the strongest damage.

internal RMSD

Finally, we analyzed the internal RMSD. The internal RMSD is based on the distances between the individual atoms within the protein, which helps to characterize the structure of the protein. The following plots (Figure 32, Figure 33, Figure 34) display the internal RMSD during the whole simulation for the wildtype, mutation 436 and mutation 485. Additionally, we extracted the minimal, maximal and average internal RMSD values.

Wildtype Mutation 436 Mutation 485
Average (RMSD in nm) 0.238 0.242 0.243
Minimum (RMSD in nm) 4.89e-07 0.141 4.906e-07
Maximum (RMSD in nm) 0.312 0.409 0.289
Figure 32: Plot of the distance RMS values in the protein for the wildtype
Figure 33: Plot of the distance RMS values in the protein for Mutation 436
Figure 34: Plot of the distance RMS values in the protein for Mutation 485

Looking at the internal RMSD plots, we can see that there are only small differences between the wildtype and the two mutations. All three curves increase very fast at the beginning until they reach a plateau, after which the internal RMSD rises only slowly. They all reach this plateau at about 0.3 nm. The only small difference between the wildtype and both mutations is the slope of the wildtype curve, which is steeper from the point where the plateau begins. Looking at the average values, we can see that the three curves are really similar. Only the minimum and maximum values display some differences. The mutations seem to influence the internal RMSD only slightly, which indicates that the mutations have no large influence on the protein structure.
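The internal RMSD compares the distances between atom pairs within one structure with the same distances in a reference structure, so no superposition is needed. A minimal Python sketch of this idea is given below. It assumes the usual definition of a distance-based RMSD and uses made-up coordinates, so it may differ in detail from the tool we used.

```python
import math
from itertools import combinations


def pair_distances(coords):
    """Distances between all atom pairs of one structure."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_rmsd(coords, reference):
    """Root mean square deviation of the pair distances from the reference."""
    current = pair_distances(coords)
    expected = pair_distances(reference)
    squared = [(c - e) ** 2 for c, e in zip(current, expected)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up three-atom structures.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in reference]
stretched = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]

print(distance_rmsd(reference, reference))  # identical structure
print(distance_rmsd(shifted, reference))    # only translated
print(distance_rmsd(stretched, reference))  # internal geometry changed
```

A structure compared with itself, or with a merely translated copy, gives a value of practically zero. This would be consistent with the minima close to zero for the wildtype and mutation 485 in the table above if the starting structure served as the reference.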

Comparison to Normal Mode Analysis

The comparison of Molecular Dynamics to the Normal Mode Analysis can be found on the [Normal Mode Analysis site].

Discussion

In this section we want to discuss whether the MD simulation gives us hints that the mutations are harmful for the protein.

In the following tables we list whether there are any differences between the wildtype and the mutations.

In the PyMol visualization we can see that there is a difference in the motion of the residues between the wildtype and the structure mutated at position 485, but there is no difference between the wildtype and mutation 436.

protein motion
Difference wt - mut 436 Difference wt - mut 485
Difference no yes

Now we want to compare the energy calculations for the different structures. For the pressure, we count a difference if the average values deviate by more than 0.001 bar. For the minimum and maximum values, we count a difference if the deviation is more than 3 bar.
For the temperature, there has to be a difference of more than 1 K.
For the potential and the total energy, we count a difference if the deviation is more than 20 kJ/mol.

energy calculations
Wildtype Mutation 436 Mutation 485
Pressure
Average (bar) 1.00711 1.0066 0.998385
Minimum (bar) -217.3543 -219.7197 -230.0158
Maximum (bar) 231.9909 238.8288 243.7419
Temperature
Average (K) 297.94 297.94 297.936
Minimum (K) 294.82 294.63 294.99
Maximum (K) 301.31 300.83 301.08
Potential
Average (kJ/mol) -1.2815e+06 -1.28165e+06 -1.28176e+06
Minimum (kJ/mol) -1.2853e+06 -1.2852e+06 -1.28513e+06
Maximum (kJ/mol) -1.2778e+06 -1.2771e+06 -1.27769e+06
Total energy
Average (kJ/mol) -1.05177e+06 -1.0519e+06 -1.05203e+06
Minimum (kJ/mol) -1.05599e+06 -1.0557e+06 -1.05687e+06
Maximum (kJ/mol) -1.04718e+06 -1.0463e+06 -1.04680e+06
energy calculations
Difference wt - mut 436 Difference wt - mut 485
Pressure
Average no no
Minimum no yes
Maximum yes yes
Temperature
Average no no
Minimum no no
Maximum no no
Potential energy
Average no yes
Minimum no no
Maximum no no
Total energy
Average no no
Minimum yes yes
Maximum yes yes

There is no difference in the pressure of the different systems. There are some differences in the minimum and maximum values, but the average value is nearly the same, so we count the pressure as not different. There is also no difference in the temperature between the different systems. If we look at the potential energy of the different structures, we can see that there are differences in the average. The minimum and maximum values are nearly the same, but the average value differs. The difference is more pronounced for the second mutation (mutation 485). Differences in the potential energy are very important for function, because a change in the energy of the protein could change its function. If we look at the total energy of the protein, we can see that the differences are very small. Therefore, there is no difference in the average, although there are differences in the minimum and maximum values.

Now we compare the minimum distance between periodic boundary cells of the different structures. We count a difference between two structures if the two values deviate by more than 0.1 nm.

minimum distance between periodic boundary cells
Wildtype Mutation 436 Mutation 485
Average (nm) 3.139 2.415 3.215
Minimum (nm) 1.770 1.408 1.772
Maximum (nm) 4.081 4.096 4.217
minimum distance between periodic boundary cells
Difference wt - mut 436 Difference wt - mut 485
Average yes no
Minimum yes no
Maximum no yes


In this case there is a difference between the wildtype and mutation 436 in the average and the minimum value, and between the wildtype and mutation 485 in the maximum value. So in this case, the mutation at position 436 seems to change the structure more than the mutation at position 485.
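The yes/no decisions in these tables follow a simple rule: a value counts as different if it deviates from the wildtype value by more than a fixed threshold. The following Python sketch applies this rule to the minimum distance values from the table above; the function and variable names are only illustrative.

```python
THRESHOLD_NM = 0.1  # threshold stated in the text for this comparison


def differs(value_wildtype, value_mutant, threshold):
    """True if the two values deviate by more than the threshold."""
    return abs(value_wildtype - value_mutant) > threshold


# Minimum distance between periodic boundary cells in nm, from the table above:
# (wildtype, mutation 436, mutation 485)
min_distance = {
    "Average": (3.139, 2.415, 3.215),
    "Minimum": (1.770, 1.408, 1.772),
    "Maximum": (4.081, 4.096, 4.217),
}

decisions = {
    row: (differs(wt, mut436, THRESHOLD_NM), differs(wt, mut485, THRESHOLD_NM))
    for row, (wt, mut436, mut485) in min_distance.items()
}

for row, (diff436, diff485) in decisions.items():
    print(row, "yes" if diff436 else "no", "yes" if diff485 else "no")
```

The output reproduces the yes/no table above: mutation 436 differs in the average and the minimum, mutation 485 only in the maximum.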

For the comparison of the RMSF calculations of the different structures, we count only the significant peaks with a value of more than 0.2. If the number of peaks is not the same, we count this as a difference.

RMSF calculation
Wildtype Mutation 436 Mutation 485
RMSF for protein
#high B-factor regions 7 2 8
RMSF for c-alpha
#high B-factor regions 3 1 3
RMSF calculation
Difference wt - mut 436 Difference wt - mut 485
RMSF for protein
#high B-factor regions yes yes
RMSF for c-alpha
#high B-factor regions yes no

As we can see in the table above, there is always a difference in the number of peaks if we calculate the RMSF for the whole protein. Nevertheless, the difference between the wildtype and mutation 436 is far larger than the difference between the wildtype and mutation 485. Furthermore, if we only compare the C-alpha atoms, the number of peaks of the wildtype and mutation 485 is equal. Therefore, the mutation at position 436 seems to change the protein more than the mutation at position 485.
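Counting the significant peaks can be automated. The sketch below assumes that a "peak" is a contiguous stretch of residues whose RMSF lies above the threshold of 0.2; this definition and the RMSF profile in the example are made up for illustration and are not taken from our data.

```python
def count_high_regions(rmsf, threshold=0.2):
    """Count contiguous stretches of residues whose RMSF exceeds the threshold."""
    regions = 0
    inside = False
    for value in rmsf:
        if value > threshold:
            if not inside:
                regions += 1  # a new stretch above the threshold starts here
            inside = True
        else:
            inside = False
    return regions


# Made-up RMSF profile for ten residues.
profile = [0.05, 0.08, 0.25, 0.31, 0.12, 0.07, 0.22, 0.09, 0.21, 0.24]
print(count_high_regions(profile))
```

With this definition, neighbouring residues above the threshold are counted as one region, so a broad flexible loop is not counted several times.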

Now, we want to compare the radius of gyration between the different structures. In this case, we mark a difference if there is a deviation of 0.01 nm.

Radius of gyration
Wildtype Mutation 436 Mutation 485
Rg
Average (nm) 2.407 2.408 2.416
Minimum (nm) 2.346 2.344 2.347
Maximum (nm) 2.440 2.436 2.449
RgX
Average (nm) 2.153 2.094 2.145
Minimum (nm) 2.012 1.986 1.992
Maximum (nm) 2.214 2.179 2.212
RgY
Average (nm) 1.609 1.853 1.630
Minimum (nm) 1.423 1.581 1.444
Maximum (nm) 1.807 2.102 1.927
RgZ
Average (nm) 2.084 1.929 2.094
Minimum (nm) 1.945 1.618 1.809
Maximum (nm) 2.238 2.212 2.219
Radius of gyration
Difference wt - mut 436 Difference wt - mut 485
Rg
Average no yes
Minimum no no
Maximum yes no
RgX
Average yes yes
Minimum yes yes
Maximum yes no
RgY
Average yes yes
Minimum yes yes
Maximum yes yes
RgZ
Average yes yes
Minimum yes yes
Maximum yes yes

In this case, there is a difference between wildtype and mutation most of the time. For the average of the complete radius of gyration, there is only a difference between mutation 485 and the wildtype. But if we have a closer look and compare how the average value of the complete radius is composed, we can see that there are significant differences: in the components RgX, RgY and RgZ, there is always a difference between the average values of the wildtype and the mutated structures.
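How the complete radius of gyration is composed of its components can be checked numerically. The sketch below assumes that RgX, RgY and RgZ are the radii of gyration about the x-, y- and z-axis, as reported for example by the GROMACS gyrate tool. Under this assumption Rg² = (RgX² + RgY² + RgZ²) / 2 holds for a single frame, and approximately for the averages in the table above.

```python
import math

# Average values in nm from the table above: (Rg, RgX, RgY, RgZ)
averages_nm = {
    "wildtype": (2.407, 2.153, 1.609, 2.084),
    "mutation 436": (2.408, 2.094, 1.853, 1.929),
    "mutation 485": (2.416, 2.145, 1.630, 2.094),
}


def rg_from_components(rg_x, rg_y, rg_z):
    """Total radius of gyration from the radii about the three axes.

    Assumption: each component is the radius of gyration about one axis.
    """
    return math.sqrt((rg_x ** 2 + rg_y ** 2 + rg_z ** 2) / 2.0)


reconstructed = {
    name: rg_from_components(rg_x, rg_y, rg_z)
    for name, (_, rg_x, rg_y, rg_z) in averages_nm.items()
}

for name, (rg, _, _, _) in averages_nm.items():
    print(f"{name}: table {rg:.3f} nm, reconstructed {reconstructed[name]:.3f} nm")
```

The reconstructed values agree with the tabulated Rg to within 0.01 nm. This shows how a nearly unchanged total radius, as for mutation 436, can hide a redistribution between the axes.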

Another very important property of each protein is the surface area which is accessible to the solvent. For the comparison of the solvent accessible surface area of each residue and each atom, we count a difference if there is a deviation of 0.01 nm². The last comparison uses an average value for the complete protein over time, and therefore we only count a difference if the deviation is more than 1 nm².

solvent accessible surface area
Wildtype Mutation 436 Mutation 485
Solvent accessible area of each residue
Average (in nm²) 0.537 0.553 0.542
Minimum (in nm²) 0.004 0.003 0.007
Maximum (in nm²) 2.058 2.005 2.014
Solvent accessible area of each atom
Average (in nm²) 0.031 0.032 0.032
Minimum (in nm²) 0 0 0
Maximum (in nm²) 0.560 0.558 0.561
Solvent accessible area of the protein over time
Average (in nm²) 135.036 138.727 135.452
Minimum (in nm²) 129.084 127.066 129.167
Maximum (in nm²) 142.218 146.571 142.977
Solvent accessible surface area
Difference wt - mut 436 Difference wt - mut 485
Solvent accessible area of each residue
Average yes no
Minimum no no
Maximum yes yes
Solvent accessible area of each atom
Average no no
Minimum no no
Maximum no no
Solvent accessible area of the protein over time
Average yes no
Minimum yes no
Maximum yes no

There is a difference in the solvent accessible area of each residue between the wildtype and the mutations, but the solvent accessible area of each atom is equal on average. More important is the solvent accessible surface area of the whole protein over time, and there we can see a difference only between the wildtype and mutation 436. The area of the wildtype and the mutation at position 485 is nearly the same.

Another very important characteristic for the stability of a protein is the number of hydrogen bonds within the protein and between the protein and the water. If there is a difference of more than 5 hydrogen bonds, we count the values as not similar.

hydrogen-bonds
Wildtype Mutation 436 Mutation 485
bonds within the protein
real occurring bonds
Average 328.758 319.787 323.337
Minimum 294 292 300
Maximum 361 356 354
possible bonds
Average 1537.77 1534.866 1543.024
Minimum 1468 1483 1491
Maximum 1587 1584 1602
bonds between protein and water
real occurring bonds
Average 836.94 853.403 847.965
Minimum 768 778 783
Maximum 916 907 982
possible bonds
Average 981.18 999.847 999.310
Minimum 853 905 882
Maximum 1091 1106 1126
Hydrogen-bonds
Difference wt - mut 436 Difference wt - mut 485
bonds within the protein
real occurring bonds
Average yes yes
Minimum no yes
Maximum yes yes
possible bonds
Average yes yes
Minimum yes yes
Maximum no yes
bonds between protein and water
real occurring bonds
Average yes yes
Minimum yes yes
Maximum yes yes
possible bonds
Average yes yes
Minimum yes yes
Maximum yes yes


There is a difference between the wildtype and the mutations in almost every comparison. The number of actually occurring hydrogen bonds as well as the number of possible hydrogen bonds differ between them. Therefore, the structure of the mutated proteins seems to change dramatically.

Now we want to compare the Ramachandran plots of the different structures. Here we only do a visual comparison.

Ramachandran Plot
Difference wt - mut 436 Difference wt - mut 485
Difference no yes

The Ramachandran plots of the wildtype and mutation 436 are nearly identical, whereas there are big differences between the Ramachandran plots of the wildtype and mutation 485.

Furthermore, we want to compare the RMSD matrices of the different structures, which is also done visually.

RMSD matrix
Difference wt - mut 436 Difference wt - mut 485
Difference no yes

The RMSD matrices of the wildtype and mutation 436 are quite similar. There are differences, but in general the colors of the plots are relatively similar, whereas the plot of mutation 485 is much darker and therefore different.

Now we want to have a look at the number of different clusters. If there is a difference of more than 5 clusters, we count it as not equal.

Cluster analysis
Wildtype Mutation 436 Mutation 485
#Clusters 225 231 225
Cluster analysis
Difference wt - mut 436 Difference wt - mut 485
Difference yes no

The algorithm found 225 different clusters for the wildtype and mutation 485, but it found 231 clusters for mutation 436.

As the last point, we compared the internal RMSD of the proteins. If there is a difference of more than 0.01 nm, the structures are counted as different.

Internal RMSD
Wildtype Mutation 436 Mutation 485
Average 0.238 0.242 0.243
Minimum 4.89e-7 0.141 4.906e-07
Maximum 0.312 0.409 0.289
Internal RMSD
Difference wt - mut 436 Difference wt - mut 485
Average no no
Minimum yes no
Maximum yes no


In this analysis we can see that the average internal RMSD of the wildtype structure and of the mutated structures is almost the same. Therefore, we count no difference in the internal RMSD values, although the minimum and maximum values of mutation 436 differ from the wildtype.



Now we want to decide whether the mutations are silent or non-silent. For this, we count how often there is a difference between the average values of the wildtype and the mutated structures.

Wildtype vs. Mutation 436 Wildtype vs. Mutation 485
#Differences 13 13
Ratio 56% 56%
Conclusion non-silent non-silent
correctness wrong right

Both mutations have the same number of differences to the wildtype: in each case 56% of the criteria differ from the wildtype. We therefore predicted both mutations as non-silent.
This is wrong. Only one mutation is non-silent; mutation 436 is in fact silent. There are hints that this mutation is silent, because there is no difference in the motion of the mutated amino acid and in the energy values. But there are a lot of differences in the other analyses.
Therefore, we have a prediction correctness of 50%, which is really bad. So we can see that Molecular Dynamics can be very helpful for analyzing a mutation, but it can also fail. MD is a very time-consuming analysis procedure and therefore, in our case, it was not very helpful. We think it is useful to analyze the mutations first with other methods and to use MD simulation only in special cases or cases of doubt.
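The final tally can be retraced with a few lines of Python. In this sketch the total of 23 criteria is obtained by counting the average rows and the single visual comparisons in the tables above, and the cutoff of 50% for calling a mutation non-silent is an assumption, since no explicit cutoff is given in the text.

```python
CRITERIA_TOTAL = 23       # counted from the tables above (average rows and visual checks)
NON_SILENT_CUTOFF = 0.5   # assumption: more than half of the criteria must differ

differences = {"mutation 436": 13, "mutation 485": 13}
actual = {"mutation 436": "silent", "mutation 485": "non-silent"}

ratio = {name: count / CRITERIA_TOTAL for name, count in differences.items()}
predicted = {
    name: "non-silent" if value > NON_SILENT_CUTOFF else "silent"
    for name, value in ratio.items()
}

correct = sum(predicted[name] == actual[name] for name in actual)
correctness = correct / len(actual)

for name in differences:
    print(f"{name}: ratio {ratio[name]:.3f}, predicted {predicted[name]}, actual {actual[name]}")
print(f"prediction correctness: {correctness:.0%}")
```

Because both mutations reach exactly the same ratio, a simple count of differing criteria cannot separate the silent from the non-silent mutation, whatever cutoff is chosen.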