ASPA Molecular Dynamics Simulation Analysis

From Bioinformatikpedia



Since the simulation for one of our three mutations, RS104894553, refused to complete successfully, this analysis part will focus on the reference wild type structure and the CM994594 mutation.


A brief check of results

How many frames are in the trajectory file and what is the time resolution?

2000 frames at a resolution of 5 picoseconds per frame.

How long did the simulation run in real time (hours), what was the simulation speed (ns/day) and how many years would the simulation take to reach a second?

Reference structure: Real time 1d04h28:22, speed 8.429 ns/day, 325035.7 years for one full second

CM994594 Mutation: Real time 1d04h51:42, speed 8.316 ns/day, 329452.3 years for one full second
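
The extrapolation to one full second is plain arithmetic and can be sketched as follows. This is a minimal sketch, assuming a 365-day year (which is what reproduces the figures above); the function names are ours and purely illustrative.

```python
def years_to_simulate_one_second(ns_per_day, days_per_year=365.0):
    """Wall-clock years needed to simulate 1 s at the given speed."""
    ns_per_second = 1e9                   # 1 s = 10^9 ns
    days = ns_per_second / ns_per_day     # wall-clock days for 1 s
    return days / days_per_year


def simulated_ns(n_frames, ps_per_frame):
    """Total simulated time in ns covered by the trajectory."""
    return n_frames * ps_per_frame / 1000.0


if __name__ == "__main__":
    print("simulated time (ns):", simulated_ns(2000, 5))
    for label, speed in (("Reference", 8.429), ("CM994594", 8.316)):
        print(label, "years:", years_to_simulate_one_second(speed))
```

The same numbers also tie the wall-clock time to the speed: 2000 frames at 5 ps are 10 ns, and 10 ns at roughly 8.4 ns/day is a little over 28 hours.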

Visualization of results

Reference structure:

Ref anim.gif

CM mutated structure:

Cm anim small.gif

In both cases, the two domains separated and ended up at quite a large distance from each other:

Dislocated domains and simulation box

We tried to remedy this by improving the equilibration step of the system, but could not solve the issue despite our efforts. As can be seen clearly, neither domain stays within the simulation box; this certainly has some effect on the results of this analysis.

What happens if the protein diffuses over the boundary of the box?

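When a protein leaves the box through, say, the upper face, it re-enters through the bottom face because of the periodic boundary conditions. Physically nothing changes, since the system is periodic, but in a visualization the protein appears displaced or split and leaves a hole in the solvent where it would otherwise be.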
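
The wrap-around can be illustrated with a few lines of Python. This is a minimal sketch, assuming a rectangular box with one corner at the origin; `wrap_into_box` is a hypothetical helper of ours, not a GROMACS function.

```python
def wrap_into_box(position, box):
    """Map a position back into a rectangular periodic box.

    position: (x, y, z) in nm; box: edge lengths (Lx, Ly, Lz) in nm.
    Assumes the box spans [0, L) along each axis.
    """
    return tuple(x % length for x, length in zip(position, box))


if __name__ == "__main__":
    box = (10.0, 10.0, 10.0)
    # An atom that left through the upper x face and the lower y face
    print(wrap_into_box((10.5, -0.5, 3.0), box))
```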

What is the average temperature and what is the heat capacity of the system?

Reference structure temperature graph
CM mutation structure temperature graph

The average temperature of the system is:

Reference Structure:

	Energy                      Average   Err.Est.       RMSD  Tot-Drift
	Temperature                 297.929     0.0024   0.902235 0.00656339  (K)

	Heat capacity at constant pressure Cp: 2.62838e+06 J/mol K

CM994594 mutation:


	Energy                      Average   Err.Est.       RMSD  Tot-Drift
	Temperature                 297.926     0.0059   0.901054 0.00425121  (K) 

	Heat capacity at constant pressure Cp: 2.62312e+06 J/mol K
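
The Average and RMSD columns can be reproduced from the plotted data itself. The following is a minimal sketch, assuming the usual .xvg layout (comment lines starting with `#`, plot directives starting with `@`, then whitespace-separated columns); the function names and the three sample values are illustrative only and are not taken from our runs.

```python
import math


def read_xvg(lines):
    """Parse the numeric rows of an .xvg file given as an iterable of lines."""
    rows = []
    for line in lines:
        line = line.strip()
        # Skip blanks, comments (#), plot directives (@) and set separators (&)
        if not line or line[0] in "#@&":
            continue
        rows.append([float(value) for value in line.split()])
    return rows


def mean_and_rmsd(values):
    """Average and root-mean-square deviation around that average."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    # Made-up sample data: time (ps) and temperature (K)
    sample = ["# example", "@ title \"Temperature\"",
              "0.0 297.0", "5.0 298.0", "10.0 299.0"]
    temperatures = [row[1] for row in read_xvg(sample)]
    print(mean_and_rmsd(temperatures))
```

For a real file, the lines could be passed in with `open(path)` in place of the in-memory list.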

What are the terms plotted in the files energy.xvg and box.xvg?

Reference structure energy graph
CM mutation structure energy graph
Reference structure box graph
CM mutation structure box graph

energy.xvg: the kinetic, potential and total energy over the course of the simulation.

box.xvg: the dimensions of the simulation box around our protein, in nm.

Estimate the plateau values for the pressure, the volume and the density.

Reference structure pressure graph
CM mutation structure pressure graph
Reference structure volume graph
CM mutation structure volume graph
Reference structure density graph
CM mutation structure density graph

Reference structure:

  • Pressure: upper plateau: 226 bar, lower plateau: -210 bar
  • Volume: upper plateau: 1077 nm^3, lower plateau: 1073 nm^3
  • Density: upper plateau: 1008 kg/m^3, lower plateau: 1004 kg/m^3

CM994594 mutated structure:

  • Pressure: upper plateau: 230 bar, lower plateau: -230 bar
  • Volume: upper plateau: 1076 nm^3, lower plateau: 1072 nm^3
  • Density: upper plateau: 1008 kg/m^3, lower plateau: 1005 kg/m^3
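
As a quick sanity check on these estimates, density times volume should give the same total mass at both ends of the range, since the number of atoms is fixed. The sketch below assumes that the upper volume plateau goes with the lower density plateau and vice versa; that pairing is our assumption, and the function names are illustrative.

```python
NM3_TO_M3 = 1e-27  # 1 nm^3 expressed in m^3


def implied_mass(density_kg_m3, volume_nm3):
    """Total mass in kg implied by a density (kg/m^3) and a volume (nm^3)."""
    return density_kg_m3 * volume_nm3 * NM3_TO_M3


def relative_mismatch(vol_hi, vol_lo, rho_hi, rho_lo):
    """Relative difference between the masses implied by the two extremes."""
    mass_a = implied_mass(rho_lo, vol_hi)  # large box, low density
    mass_b = implied_mass(rho_hi, vol_lo)  # small box, high density
    return abs(mass_a - mass_b) / mass_a


if __name__ == "__main__":
    print("Reference:", relative_mismatch(1077, 1073, 1008, 1004))
    print("CM994594: ", relative_mismatch(1076, 1072, 1008, 1005))
```

Both mismatches come out well below one percent, so the volume and density plateaus read off the graphs are consistent with each other.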

What are the terms plotted in the files coulomb-inter.xvg and vanderwaals-inter.xvg?

Reference structure coulomb graph
CM mutation structure coulomb graph
Reference structure van der Waals graph
CM mutation structure van der Waals graph

The Coulomb and van der Waals energies of the protein over time.

What happens if the minimal distance becomes shorter than the cut-off distance used for electrostatic interactions? Is it the case in your simulations?

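The protein would then interact with its own periodic image, which is unphysical. We would expect this to show up as a dramatic increase in the total energy; this appears not to have happened in our simulations.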

Now run g_mindist on the C-alpha group. Does it change the results? What does it mean for your system?

Reference structure minimum periodic distance graph
CM mutation structure minimum periodic distance graph
Reference structure minimum periodic distance (c alpha) graph
CM mutation structure minimum periodic distance (c alpha) graph
  • CM994594: The shortest periodic distance is 0.357548 (nm) at time 1565 (ps), between atoms 4791 and 8299
  • Reference Structure: The shortest periodic distance is 0.384713 (nm) at time 7955 (ps), between atoms 3061 and 10049
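
To make clear what a minimum periodic distance measures, here is a brute-force sketch for a single frame. It assumes a rectangular box and only checks the 26 neighbouring images; it illustrates the idea and is not how g_mindist is implemented. The function name is ours.

```python
import itertools
import math


def min_periodic_distance(coords, box):
    """Shortest distance between any atom and any periodic image of any atom.

    coords: list of (x, y, z) in nm; box: edge lengths (Lx, Ly, Lz) in nm.
    Only the 26 directly neighbouring images are considered.
    """
    shifts = [s for s in itertools.product((-1, 0, 1), repeat=3)
              if s != (0, 0, 0)]
    best = math.inf
    for a in coords:
        for b in coords:  # b may equal a: an atom also sees its own image
            for s in shifts:
                image = [b[k] + s[k] * box[k] for k in range(3)]
                best = min(best, math.dist(a, image))
    return best


if __name__ == "__main__":
    atoms = [(0.5, 5.0, 5.0), (9.5, 5.0, 5.0)]
    print(min_periodic_distance(atoms, (10.0, 10.0, 10.0)))
```

In the toy example the two atoms sit near opposite faces of the box, so each is far closer to the other's image than the box length would suggest.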
Reference structure b-factors visualization
CM mutation structure b-factors visualization
Reference structure RMSF per residue graph
CM mutation structure RMSF per residue graph

Indicate the start and end residue for the most flexible regions and the maximum amplitudes

  • CM994594: Most flexible regions at 50-110, 210-280; maximum amplitude 1.0 (from 1.4 to 2.4)
  • Reference: Most flexible regions at 80-90, 230-250, 260-270; maximum amplitude 0.4 (from 1.9 to 2.3)

Compare the results from the different proteins. Are there differences? If yes, which is the most flexible and which least?

CM994594 appears to be less flexible than the reference: the reference reaches its maximum far more often, has better-defined maxima and has a higher average.
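
For orientation, the per-atom RMSF is the root-mean-square fluctuation of each atom around its time-averaged position. A minimal sketch, assuming the frames have already been superimposed on a common reference (the fitting step is omitted here) and using an illustrative function name:

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF for a trajectory given as frames of (x, y, z) tuples.

    trajectory[f][i] is the position of atom i in frame f.
    Assumes the frames are already fitted to a common reference.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared deviation from that average
        msf = sum(sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
                  for frame in trajectory) / n_frames
        result.append(math.sqrt(msf))
    return result


if __name__ == "__main__":
    frames = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
              [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]]
    print(rmsf(frames))
```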

Reference structure RMSD all atoms vs. average graph
CM mutation structure RMSD all atoms vs. average graph
Reference structure RMSD all atoms vs. start graph
CM mutation structure RMSD all atoms vs. start graph

If observed, at what time and value does the RMSD reach a plateau?

  • CM994594: the RMSD converges to around 4.2 nm at 6000 ps
  • Reference: the RMSD converges to around 5.04 nm at 8000 ps

Briefly discuss two differences between the graphs against the starting structure and against the average structure. Which is a better measure for convergence?

The graphs against the starting structure oscillate far more strongly and show no indication of convergence. The graphs against the average structure are far more indicative. This makes sense, because the average structure should reflect the converged structures better than the starting structure does.

Reference structure radius of gyration graph
CM mutation structure radius of gyration graph

At what time and value does the radius of gyration converge?

  • CM994594: Convergence at about 3600 ps, around 5 nm.
  • Reference: Convergence at about 5500 ps, around 6 nm.
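
The radius of gyration is the mass-weighted root-mean-square distance of the atoms from their centre of mass, so it grows when the structure expands or when parts of it move apart. A minimal sketch for a single frame; the function name and the toy input are ours.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one frame.

    coords: list of (x, y, z) in nm; masses: one mass per atom.
    """
    total_mass = sum(masses)
    # Centre of mass
    com = [sum(m * r[k] for m, r in zip(masses, coords)) / total_mass
           for k in range(3)]
    # Mass-weighted sum of squared distances from the centre of mass
    weighted = sum(m * sum((r[k] - com[k]) ** 2 for k in range(3))
                   for m, r in zip(masses, coords))
    return math.sqrt(weighted / total_mass)


if __name__ == "__main__":
    print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [12.0, 12.0]))
```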

Structural Analysis: Properties Derived From Configurations

Reference structure secondary structure vs. time
CM mutation structure secondary structure vs. time

Discuss some of the changes in the secondary structure, if any.


At around residues 60 to 75, a number of residues constantly switch back and forth between turn and alpha helix. The same can be seen at around 370-375. In both cases, these are followed by a turn/bend structure and a short alpha helix element. The separation from the helix by a turn/bend stretch is somewhat unexpected; a continuous alpha helix would appear to make more sense.

Also, at around 230, a stretch of turns and bends disintegrates into coils over time. This is interesting, since there is no alpha helix nearby and there appears to be little conserving pressure on these residues' secondary structure.

At around 380, a stretch of residues alternates between alpha helix and turn; since it would form a very short alpha helix, this appears to be an artifact of DSSP's assignment method.

Compare the stability of the secondary structures in the different proteins.

The reference has slightly better-defined structure elements with less fuzzy boundaries; the major elements (large sheets and helices) are present and conserved in both cases. Visually, the reference appears to be a bit more stable than CM994594. This is what one would expect, considering the higher energies in the case of CM994594.

Reference structure RMSD matrix (main chain + Cb)
CM mutation structure RMSD matrix (main chain + Cb)

What is interesting about choosing the group "Mainchain+Cb" for this analysis? Think about the different proteins used for this practical.

Mainchain+Cb means that we're using the coordinates of the C beta atoms in addition to those of the atoms in the backbone; since the differences between the structures we used lie in mutated residues, the C beta atoms contain valuable information for our analysis.

How many transitions do you see?

We see seven transitions following the time axis from the starting structure onward: from blue (very small) to green, then to yellow, on to orange, back to yellow, back to green, and finally to yellow and orange again.

What can you conclude from this analysis? Could you expect such a result, justify?

From the starting structure on, the RMSD between it and the current structure stays small for a short time, climbs to a peak of around 1 nm, then decreases and climbs back up. This is consistent with a roughly oscillating motion of large parts of the protein, which indeed appears to be the case.

The large separation of the two domains should play a role here, too, increasing the RMSD values rather pointedly.

How many clusters were found and what were the sizes of the largest two?

  • Reference: 74 clusters; the largest two contain 27 and 20 structures, respectively.
  • CM994594: 169 clusters; the largest two contain 34 and 22 structures, respectively.

Are there notable differences between the two structures?

  • Reference: A slight dislocation of the important helices is visible between the two structures, but it appears to be fairly uniform in terms of distance and type of dislocation.
  • CM994594: Most of the major helices are extremely dislocated, in some cases even rotated. The same is true for the important beta sheets.

At what time and value does the dRMSD converge and how does this graph compare to the standard RMSD?

For the reference structure, convergence is around 1500ps and 1.7; for CM, the corresponding figures are 4000ps and 1.8.

The standard RMSD graphs oscillate far more strongly; they also tend to converge at a local minimum (global in some cases), whereas the dRMSD converges around a local maximum.
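
Part of the difference between the two measures comes from how they are defined: the dRMSD compares internal pairwise distances rather than coordinates, so it needs no superposition and does not change under rigid rotation or translation. A minimal sketch, assuming normalization by the number of atom pairs (conventions differ between tools); the function name is ours.

```python
import itertools
import math


def drmsd(a, b):
    """Distance RMSD between two conformations with identical atom order.

    a, b: lists of (x, y, z). Normalized by the number of atom pairs,
    which is an assumption; other conventions exist.
    """
    pairs = list(itertools.combinations(range(len(a)), 2))
    squared = sum((math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2
                  for i, j in pairs)
    return math.sqrt(squared / len(pairs))


if __name__ == "__main__":
    original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    # Same shape, rotated by 90 degrees about z and shifted
    moved = [(5.0, 5.0, 5.0), (5.0, 6.0, 5.0), (4.0, 5.0, 5.0)]
    print(drmsd(original, moved))
```

A rigidly moved copy gives a dRMSD of zero, whereas a plain coordinate RMSD without fitting would report a large value for the same pair.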

dRMSD for reference structure
dRMSD for CM mutation structure