Difference between revisions of "Molecular Dynamics Simulations HEXA"

From Bioinformatikpedia
(check the trajectory)
(check the trajectory)
Line 171: Line 171:
   
 
{| border="1" style="text-align:center; border-spacing:0;"
 
{| border="1" style="text-align:center; border-spacing:0;"
|rowspan="3" | Wildtype
+
|colspan="3" | Wildtype
 
|rowspan="3" | Mutation 436
 
|rowspan="3" | Mutation 436
 
|rowspan="3" | Mutation 485
 
|rowspan="3" | Mutation 485

Revision as of 13:34, 20 September 2011

Run the MD simulation

Here we want to give a receipt for how to analyse the MD simulation result as we did it in our section.

  • check the trajectory

First of all we checked the trajectory, to see if our simulation finished successfully and the file is not corrupted.

gmxcheck -f traj.xtc


  • Visualistation

Next we want to visualise our results:

trjconv -s topol.tpr -f traj.xtc -o protein.pdb -pbc nojump -dt 10 
pymol protein.pdb 

Still TODO

  • create a movie and skip the g_filter step

Still TODO

  • energy calculations for pressure, temperature, potential and total energy

In the next analysis step we calculated the energy values for pressure, temperature, potential and total energy with following commands:

echo 13 0 | g_energy -f ener.edr -o pressure.xvg 
echo 12 0 | g_energy -f ener.edr -o temperature.xvg 
echo 9 0 | g_energy -f ener.edr -o potential.xvg 
echo 11 0 | g_energy -f ener.edr -o total_energy.xvg 

We visualised the results of the different runs with the xmgrace program:

xmgrace pressure.xvg
xmgrace temperature.xvg
xmgrace potential.xvg
xmgrace total_energy.xvg


  • minimum distance between periodic boundary cells

Next, we calculated the minimum distance between periodic boundary cells. A low distance means, that the part of the protein which is in this boundary cell have contacts with itself. This should not be the case, because one part of the protein should not have contacts with the completely equal part of the protein. Therefore, a low periodic boundary cell shows that the quality of the model is bad and the simulation my be wrong. To calculate the minimum distance we used following command:

g_mindist -f traj.xtc -s topol.tpr -od minimal-periodic-distance.xvg -pi 

We visualised the results with xmgrace:

xmgrace minimal-periodic-distance.xvg


  • RMSF for protein and C-alpha and Pymol analysis of average and bfactor

In the next step, we analysed the root mean square fluctuations for the complete protein and also for the C-alpha atoms. With the RMSF you can calculate the differences between two nearly identical structures. In our case, we have a lot of very similar structures. In general we use the same structure but over the simulation time, the structure moves and therefore we got a lot of very similar, but not equal structures during the simulation. We calculate the RMSF between the start structure and the average structure, which is the average of all structures calculated during the simulation. Furthermore, we also calculated the B-factors of the different residues of the structures. Therefore, we can get a good insight in the flexibility of the protein structure. Furthermore, we calculate this for the complete protein and the C-alpha atoms, to get the possibility to see how flexible the backbone and the residues are. Therefore, we used following commands:

echo 1 0 | g_rmsf -f traj.xtc -s topol.tpr -o rmsf-per-residue.xvg -ox average.pdb -oq bfactors.pdb -res 
echo 3 0 | g_rmsf -f traj.xtc -s topol.tpr -o rmsf-per-residue_c.xvg -ox average_c.pdb -oq bfactors_c.pdb -res 

We visualised the rmsf-per-residue file with xmgrace. The pdb files were visualised with pymol. Furthermore, we aligned the calculated structures with the start structure with pymol to get a RMSD value. Additionally, we looked at the parts of the protein which are really flexible to see how the structure change over time.


  • Radius of gyration

The Radius of gyration is the RMS distance of the protein parts from their centre. So therefore, it is possible to get a good insight into the shape of the protein during simulation, because if the radius is higher, this means the distance between the different protein parts and the protein centre is higher and therefore the protein has a bigger shape than before. We calculate the radius of gyration with following command:

g_gyrate -f traj.xtc -s topol.tpr -o radius-of-gyration.xvg 

To visualise the result of this calculation we use two different xmgrace commands.

With the following command, we got a plot which shows the change of the radius of gyration over simulation time.

xmgrace radius-of-gyration.xvg

With the next command, we got some more detailed information about the radius of gyration. Therefore, we got the individual components of which the radius of gyration consists. These components correspond to the eigenvalues of the matrix of inertia. Therefore, the first component of the plot correspond to the longest axis of the molecule and vice versa.


  • solvent accessible surface area

Another important point is the solvent accessible surface area of the protein. With following command, we calculated the average solvent accessibility per residue and per atom over time, and also the solvent accessibility of the protein over the simulation time.

g_sas -f traj_nojump.xtc -s topol.tpr -o solvent-accessible-surface.xvg -oa atomic-sas.xvg -or residue-sas.xvg 

We visualised all of these files with xmgrace.

xmgrace file.xvg
xmgrace -nxy file.xvg

The second command gave us a more detailed output. For the average solvent accessibility per residue and per atom we also got the standard deviation of this calculation which is very useful. For the solvent-accessibility-surface we additionally got the detailed composition of pysicochemical residues over the simulation.


  • hydrogen-bonds between protein and protein / protein and water

Next, we calculated the numbers of hydrogen bonds between the protein itself and also between the protein and the surrounding water.

echo 1 1 | g_hbond -f traj_nojump.xtc -s topol.tpr -num hydrogen-bonds-intra-protein.xvg 
echo 1 12 | g_hbond -f traj_nojump.xtc -s topol.tpr -num hydrogen-bonds-protein-water.xvg 

Again, we visualised the files with xmgrace.

xmgrace hydrogen-bonds-intra-protein.xvg 
xmgrace -nxy hydrogen-bonds-intra-protein.xvg 

With the first command we got a plot which shows us how many hydrogen bonds are in the protein over the simulation time. The second plot shows us how many possible and real hydrogen bonds could be found in the protein.


  • Ramachandran plot

Now we calculate and visualise the Ramachandran plot. The plot we calculated contains all angels of all residues.

g_rama -f traj_nojump.xtc -s topol.tpr -o ramachandran.xvg 
xmgrace ramachandran.xvg


  • RMSD matrix

To see, if there is a periodic motion during the simulation of are there very similar structure during the simulation, we calculated a RMSD matrix. Therefore, on both axis are the simulation time. In the plot you can see how similar the two structure at these two simulation points are to each other. So therefore, if there are structures at different time points very similar, you can see this on this plot. So in this case, there is an all-against-all structure comparison.

g_rms -s topol.tpr -f traj_nojump.xtc -f2 traj_nojump.xtc -m rmsd-matrix.xpm -dt 10 
xpm2ps -f rmsd-matrix.xpm -o rmsd-matrix.eps -rainbow blue 
gv rmsd-matrix.eps 


  • cluster analysis

Out next analysis is quiet similar to the RMSD matrix analysis. Here we calculated different clusters of very similar structures. This could be structures, which are very near in time, but also structures which are far away during the simulation time. Therefore, we can get a rough insight, how many different structures occur during the simulation.

echo 6 6 | g_cluster -s topol.tpr -f traj_nojump.xtc -dm rmsd-matrix.xpm -dist rmsd-distribution.xvg -o clusters.xpm -sz cluster-sizes.xvg -tr cluster-transitions.xpm -ntr cluster-transitions.xvg -clid cluster-id-over-time.xvg -cl clusters.pdb -cutoff 0.1 -method gromos -dt 10 

In this simulation, we only use the MainChain and the H atoms, because this calculation is really time consuming, and only looking at the MainChain and the H atoms is enough to get a closer look to the different structures and to cluster them together. The different clusters are visualised with pymol

pymol clusters.pdb
disable all
split_states clusters
show cartoon

Furthermore, it is possible to align different cluster structures, to see how big the difference between the different clusters is.

align clusters_x, clusters_y



  • internal RMSD

As last step, we calculate the RMSD values in our protein. Therefore, we calculated for each interatomic distance the RMSD value. This is a good hint to see how big the protein is. If there are only a lot of small RMSD values, the protein has to be very compact, whereas if all values a big, the protein needs a lot of space. We calculated the internal RMSD with following command:

g_rmsdist -s topol.tpr -f traj_nojump.xtc -o distance-rmsd.xvg 
xmgrace distance-rmsd.xvg

Therefore, with this calculation we finished our MD result analysis.

Detailed results

Comparison of the results

In this section, we want to compare the different results of the MD analysis to look if there are differences between the wild type structure and the structures with the mutation. For more information about the single result analysis please look at the point Detailed results.

check the trajectory

 Wildtype  Mutation 436  Mutation 485
Item #frames Timesteps (ps) Item #frames Timesteps (ps) Item #frames Timesteps (ps)
Step 2001 5 Step 2001 5 Step 2001 5
Time 2001 5 Time 2001 5 Time 2001 5
Coords 2001 5 Coords 2001 5 Coords 2001 5

Visualize in pymol

create a movie

energy calculations for pressure, temperature, potential and total energy

minimum distance between periodic boundary cells

RMSF for protein and C-alpha and Pymol analysis of average and bfactor

Radius of gyration

solvent accesible surface area

hydrogen-bonds between protein and protein / protein and water

Ramachandran plot

RMSD matrix

cluster analysis

internal RMSD

Discussion