Molecular Dynamics Analysis GLA

From Bioinformatikpedia

by Benjamin Drexler and Fabian Grandke

Introduction

In this task we analysed the simulation data that were created in task 8. We used several GROMACS tools to analyse the data and PyMOL to visualize them. Because every step followed the tutorial on the task 10 page exactly, this page has no methods section.

Results

Brief check of the results

How many frames are in the trajectory file and what is the time resolution?

{| class="wikitable"
! Wildtype !! Mutation 3 !! Mutation 8
|-
| 2001 frames || 2001 frames || 2001 frames
|-
| time resolution of 5 ps || time resolution of 5 ps || time resolution of 5 ps
|}

How long did the simulation run in real time (hours), what was the simulation speed (ns/day) and how many years would the simulation take to reach a second?

{| class="wikitable"
! Wildtype !! Mutation 3 !! Mutation 8
|-
| 18h06:37 || 18h29:13 || 18h19:11
|-
| 13.252 ns/day || 12.982 ns/day || 13.101 ns/day
|-
| ~206740 years || ~211040 years || ~209123 years
|}
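The values in the table above can be reproduced from the simulation speed alone. The following is a minimal sketch, assuming a year of 365 days and a simulated length of 10 ns (2000 intervals of 5 ps between the 2001 frames). The function names are illustrative and not part of GROMACS.

```python
NS_PER_SECOND = 1e9
DAYS_PER_YEAR = 365  # assumption: calendar year without leap days


def years_to_simulate_one_second(ns_per_day):
    """Years of wall-clock time needed to simulate one second at the given speed."""
    days = NS_PER_SECOND / ns_per_day
    return days / DAYS_PER_YEAR


def wall_clock_hours(simulated_ns, ns_per_day):
    """Wall-clock hours needed to simulate `simulated_ns` nanoseconds."""
    return simulated_ns / ns_per_day * 24


# Simulation speeds taken from the table above.
speeds = {"Wildtype": 13.252, "Mutation 3": 12.982, "Mutation 8": 13.101}
for name, speed in speeds.items():
    years = years_to_simulate_one_second(speed)
    hours = wall_clock_hours(10.0, speed)  # 10 ns is an assumption, see above
    print(f"{name}: {years:.0f} years per simulated second, {hours:.2f} h for 10 ns")
```

The years scale inversely with the speed, which is why the slowest run (mutation 3) has the largest value.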


Which contribution to the potential energy accounts for most of the calculations?

{| class="wikitable"
! Wildtype !! Mutation 3 !! Mutation 8
|-
| -8.52573e+05 kJ/mol || -8.53327e+05 kJ/mol || -8.52539e+05 kJ/mol
|}

Visualization of the results

Wildtype Mutation 3 Mutation 8
Figure 1: Molecular Dynamics simulation of the wildtype protein.
Figure 2: Molecular Dynamics simulation of the mutation 3 protein.
Figure 3: Molecular Dynamics simulation of the mutation 8 protein.

Figures 1, 2 and 3 show roughly every thirtieth frame of the molecular dynamics simulation. An animated figure of all 2001 simulated frames would be too large for this wiki. All animations show similar but not identical movement of the protein.

Quality assurance

Convergence of energy terms

Temperature

{| class="wikitable"
! Description !! Wildtype !! Mutation 3 !! Mutation 8
|-
| Max (K) || 301.3734 || 302.0039 || 301.7407
|-
| Min (K) || 293.9565 || 294.0463 || 293.9268
|-
| Average (K) || 297.9275 || 297.9489 || 297.9303
|}
Plot
Figure 4: Temperature over time for wildtype.
Figure 5: Temperature over time for mutation 3.
Figure 6: Temperature over time for mutation 8.

The average temperatures of the mutated proteins (Figures 5 and 6) are slightly higher than that of the wildtype protein (Figure 4), but the differences are at most about 0.02 K and not significant. The lowest temperature of the mutation 8 protein is even lower than the minimum of the wildtype. The overall shape of the temperature curves differs between the proteins, but there is no trend in a particular direction. Thus, the mutations do not seem to influence the temperature significantly. In all simulations the temperature varies within a narrow range (about 294 K to 302 K) and can therefore be regarded as stabilized.
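The minimum, maximum and average values in the tables of this section can be extracted from the XVG files written by the GROMACS energy tool. The following is a minimal sketch, assuming a two-column file (time, value) in which header lines start with `#` or `@`. The sample lines are hypothetical and are not taken from our simulations.

```python
def read_xvg(lines):
    """Return (times, values) from the lines of a two-column XVG file."""
    times, values = [], []
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments (#) and plot directives (@).
        if not line or line[0] in "#@":
            continue
        columns = line.split()
        times.append(float(columns[0]))
        values.append(float(columns[1]))
    return times, values


def summarize(values):
    """Minimum, maximum and arithmetic mean of a series."""
    return {
        "min": min(values),
        "max": max(values),
        "average": sum(values) / len(values),
    }


# Hypothetical excerpt of a temperature file (time in ps, temperature in K).
sample = """# hypothetical example data
@ title "Temperature"
0.0 297.1
5.0 298.4
10.0 297.9
""".splitlines()

times, temperatures = read_xvg(sample)
stats = summarize(temperatures)
print(stats)
```

For a real file, the lines could be passed in with `read_xvg(open(path))`, where `path` is whatever output name was chosen in the energy step.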

Pressure

{| class="wikitable"
! Description !! Wildtype !! Mutation 3 !! Mutation 8
|-
| Max (bar) || 346.2387 || 385.5467 || 326.9994
|-
| Min (bar) || -308.7182 || -357.4386 || -291.1198
|-
| Average (bar) || 0.4503482 || -1.521002 || -0.7279793
|}
Plot
Figure 7: Pressure over time for wildtype.
Figure 8: Pressure over time for mutation 3.
Figure 9: Pressure over time for mutation 8.

Figures 7, 8 and 9 show the pressure during the simulation. All proteins show large fluctuations (a range of more than 600 bar) around averages close to zero. The averages of both mutated proteins are slightly negative, but given the large overall fluctuation this does not seem significant. The pressure is not expected to converge to a single value.

Potential

{| class="wikitable"
! Description !! Wildtype !! Mutation 3 !! Mutation 8
|-
| Max (kJ/mol) || -849914.5 || -850058.5 || -849804.2
|-
| Min (kJ/mol) || -855729.5 || -856170.2 || -855717.7
|-
| Average (kJ/mol) || -852568.2 || -853314.1 || -852547.7
|}
Plot
Figure 10: Potential over time for wildtype.
Figure 11: Potential over time for mutation 3.
Figure 12: Potential over time for mutation 8.

Figures 10, 11 and 12 show the potential energy during the simulation. In contrast to the pressure, it varies only slightly around one value (about -853,000 kJ/mol), and the simulations seem to have reached their respective equilibria. There is no significant difference between the wildtype protein and the mutated ones, so the impact of the mutations on the potential energy is negligible.

Total Energy

{| class="wikitable"
! Description !! Wildtype !! Mutation 3 !! Mutation 8
|-
| Max (kJ/mol) || -695814.5 || -696662.8 || -695560
|-
| Min (kJ/mol) || -703550.4 || -703769 || -702829.8
|-
| Average (kJ/mol) || -699602.2 || -700202.2 || -699497.3
|}
Plot
Figure 13: Total energy over time for wildtype.
Figure 14: Total energy over time for mutation 3.
Figure 15: Total energy over time for mutation 8.

Figures 13, 14 and 15 show the total energy during the simulation. All diagrams show fluctuations around a value of about -700,000 kJ/mol, and none of the results shows a notable difference between the proteins. The range of the values is small relative to their magnitude, so the total energy seems to have converged.
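The convergence statements in this section are based on visual inspection of the plots. One simple way to back them up numerically is to compare the average of the first half of a series with the average of the second half. The sketch below is an assumption about how such a check could look and is not part of the tutorial; the numbers are hypothetical.

```python
def half_averages(values):
    """Average of the first and of the second half of a series."""
    middle = len(values) // 2
    first, second = values[:middle], values[middle:]
    return sum(first) / len(first), sum(second) / len(second)


def relative_drift(values):
    """Difference between the half averages, relative to the first half."""
    first, second = half_averages(values)
    return abs(second - first) / abs(first)


# Hypothetical energy series in kJ/mol.
energies = [-700.0, -702.0, -698.0, -700.0]
print(half_averages(energies))
print(relative_drift(energies))
```

A relative drift close to zero is compatible with a converged quantity, but it does not prove convergence, because slow changes on time scales longer than the simulation remain invisible.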

Minimum distances between periodic images

{| class="wikitable"
! Description !! Wildtype !! Mutation 3 !! Mutation 8
|-
| Minimum distance (nm) || 1.932 || 1.992 || 2.007
|-
| Time of occurrence (ps) || 370 || 7910 || 5940
|}
Plot
Figure 16: Minimum distance between periodic boundary cells for wildtype.
Figure 17: Minimum distance between periodic boundary cells for mutation 3.
Figure 18: Minimum distance between periodic boundary cells for mutation 8.

Figures 16, 17 and 18 show the minimum distance between periodic images for each protein. The minimum distances of the mutated proteins are slightly larger than that of the wildtype protein. If the minimum distance fell below the interaction cutoff, parts of the protein would interact with its own periodic image. This would strongly distort the dynamics, and the results would be completely different.
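To illustrate what is measured here, the following minimal sketch computes the smallest distance between a set of atoms and all of their periodic images. It assumes a cubic box and considers only the 26 neighbouring images; the box shape used in our simulations may differ, and the coordinates are hypothetical.

```python
import math
from itertools import product


def min_image_distance(coords, box_length):
    """Smallest distance between any atom and any periodic image of any atom.

    Assumes a cubic box with edge `box_length`.
    """
    # All shifts to neighbouring boxes, excluding the box itself.
    shifts = [s for s in product((-1, 0, 1), repeat=3) if s != (0, 0, 0)]
    best = math.inf
    for atom in coords:
        for other in coords:
            for shift in shifts:
                image = tuple(other[k] + shift[k] * box_length for k in range(3))
                best = min(best, math.dist(atom, image))
    return best


# Two hypothetical atoms in a box with an edge of 3 nm.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
distance = min_image_distance(atoms, 3.0)
print(distance)
```

The result has to stay above the cutoff used for the non-bonded interactions. The loop over all atom pairs is quadratic and only meant to show the idea.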

Root mean square fluctuations

Wildtype Mutation 3 Mutation 8
Figure 19: Root mean square fluctuations for wildtype protein.
Figure 20: Root mean square fluctuations for mutation 3 protein.
Figure 21: Root mean square fluctuations for mutation 8 protein.
Figure 22: Root mean square fluctuations for wildtype C-alpha.
Figure 23: Root mean square fluctuations for mutation 3 C-alpha.
Figure 24: Root mean square fluctuations for mutation 8 C-alpha.
Figure 25: Image of aligned average and b-factor protein for wildtype protein.
Figure 26: Image of aligned average and b-factor protein for mutation 3 protein.
Figure 27: Image of aligned average and b-factor protein for mutation 8 protein.
Figure 28: Image of aligned average and b-factor protein for wildtype C-alpha.
Figure 29: Image of aligned average and b-factor protein for mutation 3 C-alpha.
Figure 30: Image of aligned average and b-factor protein for mutation 8 C-alpha.

Figures 19, 20 and 21 show the RMS fluctuations of the proteins. The diagrams are scaled differently, but the patterns of peaks are very similar. The only difference, and the reason for the different scaling, is the final upswing. It is most extreme in the mutation 3 protein (>5) and smallest in the mutation 8 protein (~2.5). The other values are almost the same, so again there is no obvious difference between the wildtype and the mutated proteins.

Figures 22, 23 and 24 show the RMS fluctuations of the C-alpha atoms of the proteins. The results have the same character as the previous ones: again, the only difference is in the final upswing.
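For reference, the RMS fluctuation of an atom is the square root of its mean squared deviation from its average position over all frames. The following is a minimal sketch in plain Python, assuming that the frames have already been fitted to a reference structure. The conversion to b-factors uses the standard relation B = 8π²/3 · RMSF²; the function names and coordinates are illustrative.

```python
import math


def rmsf(frames):
    """Per-atom RMS fluctuation.

    `frames` is a list of frames, each a list of (x, y, z) tuples,
    assumed to be fitted to a common reference beforehand.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Average position of atom i over all frames.
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


def b_factor(rmsf_value):
    """B-factor from an RMS fluctuation, in the squared unit of the input."""
    return 8.0 * math.pi ** 2 / 3.0 * rmsf_value ** 2


# Two hypothetical frames with two atoms each.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
print(rmsf(frames))
```

An atom near a flexible chain end moves far from its average position in many frames, which is what produces an upswing at the end of the plot.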

Figures 25, 26 and 27 show PyMOL alignments of the average structures and the b-factor structures. They look very similar as well. On the left side of each picture is a helix that "points" towards the viewer. It is the only part of the protein that is colored red, and it is the same end of the protein that stood out in the two previous steps. The other parts of the proteins are mostly colored dark blue or light blue. The segments highlighted in light blue are mostly the same in all proteins. Only the mutation 8 protein has some of them colored green, which means that their b-factor values are slightly higher. All three proteins align very well.

Figures 28, 29 and 30 show PyMOL alignments of the average C-alpha structures and the b-factor ones. Despite some small differences, the alignments look very similar. Again, only the end of the chain is colored distinctively. The dark blue average atoms and the light blue b-factor atoms again align very well.

Convergence of radius of gyration

Wildtype Mutation 3 Mutation 8
Figure 31: Radius of gyration over time for wildtype protein.
Figure 32: Radius of gyration over time for mutation 3 protein.
Figure 33: Radius of gyration over time for mutation 8 protein.

Figures 31, 32 and 33 show the radius of gyration of the proteins. The black line shows the overall radius of gyration, i.e. the general change of the protein structure during the simulation. The green, red and blue lines show the values for the X-, Y- and Z-axis, respectively. Apart from very small fluctuations, the overall structure does not seem to change at all. The blue and red lines appear to be anticorrelated: whenever one has an upswing, the other has a downswing. They are not exactly mirrored, but the tendency is obvious. The green line is more independent. It behaves differently in every protein and does not seem to influence the overall change of the protein structures.
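The following minimal sketch shows how the plotted quantities can be computed. It assumes that the per-axis curves follow the GROMACS definition, i.e. that they are the radii of gyration about the x-, y- and z-axes; the coordinates and masses are hypothetical.

```python
import math


def radius_of_gyration(coords, masses):
    """Return the overall radius of gyration and the radii about x, y and z."""
    total_mass = sum(masses)
    # Centre of mass.
    com = [
        sum(m * r[k] for r, m in zip(coords, masses)) / total_mass
        for k in range(3)
    ]
    # Coordinates relative to the centre of mass.
    rel = [[r[k] - com[k] for k in range(3)] for r in coords]
    overall = math.sqrt(
        sum(m * sum(c * c for c in d) for d, m in zip(rel, masses)) / total_mass
    )
    about_axes = []
    for axis in range(3):
        # The radius about an axis uses only the two other components.
        weighted = sum(
            m * sum(d[j] ** 2 for j in range(3) if j != axis)
            for d, m in zip(rel, masses)
        )
        about_axes.append(math.sqrt(weighted / total_mass))
    return overall, about_axes


# Two hypothetical atoms of equal mass on the x-axis.
rg, rg_axes = radius_of_gyration([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)], [1.0, 1.0])
print(rg, rg_axes)
```

Under this definition the squares of the three axis values always add up to twice the square of the overall value. If the overall radius stays nearly constant, an increase in one axis component therefore has to be offset by a decrease in the others, which may contribute to the mirrored behavior of the red and blue lines.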

Solvent accessible surface area

Description Wildtype Mutation 3 Mutation 8
SAS over time
Figure 34: Solvent accessible surface area over time for wildtype protein.
Figure 35: Solvent accessible surface area over time for mutation 3 protein.
Figure 36: Solvent accessible surface area over time for mutation 8 protein.
SAS per atom
Figure 37: Solvent accessible surface area per atom for wildtype protein.
Figure 38: Solvent accessible surface area per atom for mutation 3 protein.
Figure 39: Solvent accessible surface area per atom for mutation 8 protein.
SAS per residue
Figure 40: Solvent accessible surface area per residue for wildtype protein.
Figure 41: Solvent accessible surface area per residue for mutation 3 protein.
Figure 42: Solvent accessible surface area per residue for mutation 8 protein.

Figures 34, 35 and 36 show the solvent accessible surface area of the proteins. The black, red, green and blue lines show the hydrophobic area, the hydrophilic area, the total area and D Gsolv, respectively. All diagrams show that the hydrophobic accessible surface area is very slightly larger than the hydrophilic one, but the two are very close. The green lines indicate that the total surface area stays at a constant level apart from small fluctuations. There is no significant difference between the three diagrams.

Figures 37, 38 and 39 show the solvent accessible surface area of the proteins per atom. The three diagrams look alike but have some differences (e.g. Figure 39 lacks the distinct double peak (>0.4) at atom ~400).


Figures 40, 41 and 42 show the solvent accessible surface area of the proteins per residue. Again, the diagrams are very similar, but there are clear changes at the positions where the mutations occur. In Figure 41 the value at residue 117 is 0.34, an increase compared to the wildtype value of 0.18. In Figure 42 the value at residue 283 changes as well (0.022 versus 0.11 for the wildtype).

Hydrogen bonds

Description Wildtype Mutation 3 Mutation 8
Internal HB
Figure 43: Internal hydrogen bonds over time for wildtype protein.
Figure 44: Internal hydrogen bonds over time for mutation 3 protein.
Figure 45: Internal hydrogen bonds over time for mutation 8 protein.
Protein-Solvent HB
Figure 46: Hydrogen bonds between protein and surrounding solvents for wildtype protein.
Figure 47: Hydrogen bonds between protein and surrounding solvents for mutation 3 protein.
Figure 48: Hydrogen bonds between protein and surrounding solvents for mutation 8 protein.

Figures 43, 44 and 45 show the number of internal hydrogen bonds within the proteins. The diagrams are very similar and show almost no variation. All three proteins have about 350 internal hydrogen bonds, and that number does not change significantly over time.

Figures 46, 47 and 48 show the number of hydrogen bonds between the protein and the surrounding solvent. Again, the diagrams look alike, but there is more variation within the lines. The number of hydrogen bonds during the simulation lies between 650 and 720. The mutations do not seem to influence the number of hydrogen bonds.
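Hydrogen bonds are usually counted with a geometric criterion. The sketch below assumes the default cutoffs of the GROMACS hydrogen bond tool as we understand them (donor-acceptor distance of at most 0.35 nm and hydrogen-donor-acceptor angle of at most 30 degrees). These values are an assumption, as the settings are not listed on this page, and the coordinates are hypothetical.

```python
import math


def is_hydrogen_bond(donor, hydrogen, acceptor,
                     max_distance_nm=0.35, max_angle_deg=30.0):
    """Geometric hydrogen bond test; the default cutoffs are assumptions."""
    donor_acceptor = [acceptor[k] - donor[k] for k in range(3)]
    donor_hydrogen = [hydrogen[k] - donor[k] for k in range(3)]
    distance = math.sqrt(sum(c * c for c in donor_acceptor))
    if distance > max_distance_nm:
        return False
    # Angle between the donor-hydrogen and the donor-acceptor vectors.
    length_dh = math.sqrt(sum(c * c for c in donor_hydrogen))
    cosine = sum(a * b for a, b in zip(donor_acceptor, donor_hydrogen))
    cosine /= distance * length_dh
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding errors
    angle = math.degrees(math.acos(cosine))
    return angle <= max_angle_deg


# Hypothetical coordinates in nm.
donor = (0.0, 0.0, 0.0)
hydrogen = (0.1, 0.0, 0.0)
print(is_hydrogen_bond(donor, hydrogen, (0.3, 0.0, 0.0)))
```

Counting the bonds for one frame means applying this test to every donor-hydrogen-acceptor triple, either within the protein or between protein and solvent.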

Ramachandran (phi/psi) plots

General Ramachandran Wildtype Mutation 3 Mutation 8
Figure 49: General Ramachandran plot.<ref name=rama_wiki>http://en.wikipedia.org/wiki/Ramachandran_plot</ref>
Figure 50: Ramachandran plot for wildtype protein.
Figure 51: Ramachandran plot for mutation 3 protein.
Figure 52: Ramachandran plot for mutation 8 protein.

Figure 49 shows a general Ramachandran plot and only serves as an orientation plot for the other ones. Figures 50, 51 and 52 are the Ramachandran plots for the proteins. Because they contain many data points, they look very similar to each other and differ a lot from the orientation plot. The "usual" regions of alpha-helix, beta-sheet and left-handed helix are very crowded, as are the other regions that lie within the red lines in the orientation plot. In addition, the regions are connected, and only a few areas remain white. Those are angle combinations that are very unlikely.
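Each point in a Ramachandran plot is a pair of backbone dihedral angles: phi is defined by the atoms C(i-1), N, CA and C, and psi by N, CA, C and N(i+1). The following is a minimal sketch of the dihedral calculation for four points in plain Python. The function name is illustrative; the plots above were produced with GROMACS, not with this code.

```python
import math


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = [p0[k] - p1[k] for k in range(3)]
    b1 = [p2[k] - p1[k] for k in range(3)]
    b2 = [p3[k] - p2[k] for k in range(3)]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # Unit vector along the central bond.
    norm = math.sqrt(dot(b1, b1))
    b1 = [c / norm for c in b1]
    # Components of the outer bonds perpendicular to the central bond.
    v = [b0[k] - dot(b0, b1) * b1[k] for k in range(3)]
    w = [b2[k] - dot(b2, b1) * b1[k] for k in range(3)]
    cross = [
        b1[1] * v[2] - b1[2] * v[1],
        b1[2] * v[0] - b1[0] * v[2],
        b1[0] * v[1] - b1[1] * v[0],
    ]
    return math.degrees(math.atan2(dot(cross, w), dot(v, w)))


# Hypothetical points: a planar cis arrangement and a planar trans arrangement.
cis = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, trans)
```

Applying the function to the phi and psi atom quadruples of every residue in every frame yields the point cloud of the plots, which explains why 2001 frames fill the allowed regions so densely.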

Analysis of dynamics and time-averaged properties

Root mean square deviations again

Wildtype Mutation 3 Mutation 8
Figure 53: RMSD matrix for wildtype protein.
Figure 54: RMSD matrix for mutation 3 protein.
Figure 55: RMSD matrix for mutation 8 protein.

Figures 53, 54 and 55 show the RMSD matrices for the proteins. This time the results for the proteins are all different. The matrix of the wildtype protein has a thin light blue diagonal, surrounded by a wider green band that gradually becomes yellow. There are only very few red parts at the bottom right and top left borders. The matrix of the mutation 3 protein has almost no light blue, and the diagonal is green instead. The areas surrounding the diagonal are a mix of green and yellow. There is one very distinct area between 7000 and 8000 ps that is almost completely red and yellow, except for the part very close to the diagonal. In addition, the top left and bottom right corners are yellow and red as well. The matrix of the mutation 8 protein also has no light blue, but a green diagonal. The top right corner is mostly colored green, with some yellow in it. The top left and bottom right quarters are yellow and red.

Because the coloring is very vivid, the results seem to differ a lot, but the overall patterns resemble each other. The values differ slightly, but there is no significant change that could be interpreted as disease causing. Note that the color scales differ between the figures: the red parts in the wildtype matrix correspond to even higher values than the ones in the matrices of the mutated proteins.
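An RMSD matrix contains the RMSD between every pair of frames of a trajectory. The following is a minimal sketch in plain Python. As a simplification it omits the least-squares superposition that is normally applied before each comparison, so it only illustrates the structure of the matrix; the frames are hypothetical.

```python
import math


def rmsd(frame_a, frame_b):
    """RMSD between two frames without prior superposition (simplification)."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(frame_a, frame_b))
    return math.sqrt(squared / len(frame_a))


def rmsd_matrix(frames):
    """Symmetric matrix of pairwise RMSD values with a zero diagonal."""
    n = len(frames)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(frames[i], frames[j])
    return matrix


# Three hypothetical single-atom frames.
frames = [[(0.0, 0.0, 0.0)], [(3.0, 4.0, 0.0)], [(0.0, 0.0, 0.0)]]
matrix = rmsd_matrix(frames)
print(matrix)
```

Entries close to the diagonal compare frames that are close in time, which is why the diagonal band shows the lowest values in all three figures.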

Cluster analysis

Description Wildtype Mutation 3 Mutation 8
RMS distribution
Figure 56: RMS distribution for wildtype protein.
Figure 57: RMS distribution for mutation 3 protein.
Figure 58: RMS distribution for mutation 8 protein.
Pymol image
Figure 59: Image of the largest two clusters of wildtype protein.
Figure 60: Image of the largest two clusters of mutation 3 protein.
Figure 61: Image of the largest two clusters of mutation 8 protein.

Figures 56, 57 and 58 show the RMS distribution of the proteins. All of them are approximately normally distributed and peak around 0.15 nm, but the peak heights differ. The wildtype protein has the lowest maximum (~25000), the mutation 8 protein has a maximum close to 30000, and the mutation 3 protein has a maximum above 35000. Figures 59, 60 and 61 show aligned images of the two largest clusters of each protein. There is no big difference between them: in all cases the two structures are almost perfectly superimposed.
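The RMS distribution is a histogram of the pairwise RMSD values between frames. The following minimal sketch bins the upper triangle of an RMSD matrix. The matrix values and the bin width are hypothetical, and the binning used by the GROMACS cluster tool may differ.

```python
def rmsd_histogram(matrix, bin_width):
    """Count the pairwise RMSD values per bin, using each pair once."""
    counts = {}
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            # Index of the bin that contains the value.
            index = int(matrix[i][j] // bin_width)
            counts[index] = counts.get(index, 0) + 1
    return counts


# Hypothetical RMSD matrix in nm for three frames.
matrix = [
    [0.00, 0.12, 0.31],
    [0.12, 0.00, 0.14],
    [0.31, 0.14, 0.00],
]
histogram = rmsd_histogram(matrix, 0.1)
print(histogram)
```

With 2001 frames there are about two million frame pairs, so bin counts in the tens of thousands, as in the figures, are to be expected.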

Distance RMSD

Wildtype Mutation 3 Mutation 8
Figure 62: RMS deviation for wildtype protein.
Figure 63: RMS deviation for mutation 3 protein.
Figure 64: RMS deviation for mutation 8 protein.

Figures 62, 63 and 64 show the RMS deviation of the distances between the protein atoms. Starting from zero, all curves rise steeply to 0.15 nm and then increase slightly with some fluctuation. At the end of the simulation all RMS deviations are between 0.2 and 0.225 nm. All of them have their highest peak at around 5000 ps.
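Unlike the ordinary RMSD, the distance RMSD compares the interatomic distances of a frame with those of a reference structure and therefore needs no superposition. The following is a minimal sketch in plain Python that uses every atom pair once; the coordinates are hypothetical.

```python
import math
from itertools import combinations


def pair_distances(coords):
    """Distances between all pairs of atoms, each pair counted once."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_rmsd(coords, reference):
    """RMS deviation between the pair distances of a frame and a reference."""
    current = pair_distances(coords)
    initial = pair_distances(reference)
    squared = sum((d - d0) ** 2 for d, d0 in zip(current, initial))
    return math.sqrt(squared / len(current))


# Hypothetical two-atom reference and a stretched frame.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
stretched = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
print(distance_rmsd(stretched, reference))
```

Because only internal distances enter, a pure translation or rotation of the whole protein gives a distance RMSD of zero.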

Conclusion

Almost none of the analysis steps above show significant differences between the wildtype protein and the mutant proteins. The results are either identical or very similar, which makes an interpretation very difficult. No conclusions can be drawn on the basis of the molecular dynamics analysis alone, and it is impossible to classify either mutation as disease causing or not. In our report we assumed, based on the structure-based mutation analysis, that the mutation 3 protein is very likely to cause Fabry disease. This can neither be confirmed nor refuted here. The results for the two mutated proteins are very similar, and since mutation 8 is known to be disease causing, this can be interpreted as evidence that mutation 3 is non-neutral. On the other hand, the results are also similar to those of the non-mutated wildtype protein and can thus be interpreted as evidence for neutrality. The few differences are not strong enough to serve as a basis for the argument that mutation 3 is disease causing. Without the a priori knowledge about mutation 8, we would most probably have declared both mutations neutral because of their similarity to the wildtype protein.

References

<references />