Molecular Dynamics Simulations Analysis (PKU)

From Bioinformatikpedia
Revision as of 03:50, 14 July 2012 by Boidolj (talk | contribs) (Convergence of Energy Terms)

Short Introduction

We analyze our completed molecular dynamics simulations, following the task description and the tutorial of the Utrecht University Molecular Modeling Practical. We have completed one run each for the wildtype protein and for the mutations ALA322GLY and ARG408TRP; a second run of the wildtype is pending. This second run may be necessary because the wildtype trajectory differs significantly from those of both mutants. The commands used to generate the plots and images can be found in our journal.

Initial Checks

All three simulations ran for the desired 10 ns; each trajectory contains 2000 frames at 5 ps intervals. The wildtype simulation took significantly longer, since we used only 16 cores for the wildtype but 32 for the mutants. Almost half of the calculation time, 44.2% in each run, is spent on calculating the Coulomb and Lennard-Jones interactions of the solvent molecules. A few key statistics can be found in <xr id="tab:simulation_stats"/>.

<figtable id="tab:simulation_stats"> Statistics of the MD simulations

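{| class="wikitable"
! Simulation !! Wall-clock time !! Simulation speed !! Projected time to simulate 1 s
|-
| Wildtype || 11:32 h || 20.8 ns/day || 131,621 years
|-
| ALA322GLY || 4:20 h || 55.3 ns/day || 49,543 years
|-
| ARG408TRP || 4:26 h || 54.1 ns/day || 50,685 years
|}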

</figtable>
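
The speed and the projected time in the table follow from the wall-clock time of each run. The following is a minimal sketch of that arithmetic, assuming a 365-day year and the 10 ns run length stated above; the function names are ours and purely illustrative. Because the wall-clock times are only given to the minute, the results agree with the table to within a fraction of a percent rather than exactly.

```python
def parse_hours(hhmm: str) -> float:
    """Convert a wall-clock string such as '11:32' (hours:minutes) to hours."""
    hours, minutes = hhmm.split(":")
    return int(hours) + int(minutes) / 60.0


def speed_ns_per_day(sim_ns: float, wall_hours: float) -> float:
    """Simulated nanoseconds per day of wall-clock time."""
    return sim_ns / wall_hours * 24.0


def years_to_simulate(target_s: float, ns_per_day: float) -> float:
    """Wall-clock years needed to simulate target_s seconds (365-day year assumed)."""
    target_ns = target_s * 1e9
    return target_ns / ns_per_day / 365.0


# Wall-clock times as listed in the statistics table.
runs = {"Wildtype": "11:32", "ALA322GLY": "4:20", "ARG408TRP": "4:26"}
for name, wall in runs.items():
    speed = speed_ns_per_day(10.0, parse_hours(wall))
    print(f"{name}: {speed:.1f} ns/day, {years_to_simulate(1.0, speed):,.0f} years")
```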

Wildtype analysis

<figure id="fig:1J8U_overlay">

Overlay of all frames of the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U.

</figure>

<xr id="fig:1J8U_overlay"/> shows the overlay of all frames of the wildtype simulation. The trajectory used for this image has already been corrected for jumps across the periodic boundaries and for overall motion in space. The protein remains compact during the simulation, but few details can be discerned from the overlay. In the following sections we analyze this simulation in more detail.

Quality Assurance

Convergence of Energy Terms

<figure id="fig:1J8U_temperature">

Plot of the system temperature during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:1J8U_pressure">

Plot of the system pressure during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:1J8U_volume">

Plot of the system volume during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:1J8U_density">

Plot of the system density during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:1J8U_energies">

Plot of the system's potential, kinetic and total energy during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U.

</figure>


<figure id="fig:1J8U_box">

Plot of the box dimensions in all three directions during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U. The X- and Y-dimensions overlap and cannot be distinguished in the plot.

</figure>

<figure id="fig:1J8U_coulomb">

Plot of the Coulomb interactions during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U.

</figure>

<figure id="fig:1J8Uvdw">

Plot of the van der Waals interactions during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U.

</figure>

<xr id="fig:1J8U_temperature"/> shows the temperature during the simulation. It fluctuates slightly around 297.9 K (24.7 °C) but stays within just 3 degrees. (The calculation of the heat capacity was erroneous in Gromacs and has been disabled in version 4.5.)
<xr id="fig:1J8U_pressure"/> shows that the pressure fluctuates strongly between -200 and +200 bar, with peaks of up to ±400 bar, throughout the simulation. The average nevertheless stays very close to the setting of 1 bar. This could simply be a feature of the simulation, but it can also be considered realistic: the simulation box is very small, so small fluctuations in the volume cause large pressure fluctuations (cf. ambermd.org). Accordingly, <xr id="fig:1J8U_volume"/> shows only small changes in the volume, mostly within 0.5 nm^3 of 365.6 nm^3. The density (cf. <xr id="fig:1J8U_density"/>) remains very stable around 1021.3 kg/m^3, as do the potential and kinetic energy in <xr id="fig:1J8U_energies"/>. The size of the simulation box (cf. <xr id="fig:1J8U_box"/>) remains almost fixed in all three dimensions. The small peaks are probably water molecules crossing the periodic boundaries. All terms behave stably, which suggests that the system was already properly equilibrated in the short runs preceding the production run.

The energies of the van der Waals and Coulomb interactions are shown in <xr id="fig:1J8Uvdw"/> and <xr id="fig:1J8U_coulomb" /> respectively. While the van der Waals energy stays roughly constant, the Coulomb energy first drops steeply and then levels off, but it does not converge.
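
Several of the plots above overlay a running average over a 100 ps window. How that average was produced is not recorded on this page, so the following is only a sketch of one way to compute it: a centered window over a two-column time series as found in the .xvg output files, where lines beginning with `#` or `@` carry comments and plot metadata. The function names are illustrative.

```python
def read_xvg(lines):
    """Parse two-column .xvg text; lines starting with '#' or '@' are headers."""
    times, values = [], []
    for line in lines:
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        cols = line.split()
        times.append(float(cols[0]))
        values.append(float(cols[1]))
    return times, values


def running_average(times, values, window_ps=100.0):
    """Centered running average over samples sorted by ascending time.

    Each output point averages all samples whose time lies within
    window_ps / 2 of the current sample, so the window shrinks at the ends.
    """
    half = window_ps / 2.0
    averaged = []
    lo = 0       # index of the first sample inside the window
    hi = 0       # index one past the last sample inside the window
    total = 0.0  # sum of values[lo:hi]
    for t in times:
        while hi < len(times) and times[hi] <= t + half:
            total += values[hi]
            hi += 1
        while times[lo] < t - half:
            total -= values[lo]
            lo += 1
        averaged.append(total / (hi - lo))
    return averaged
```

Applied to a temperature or pressure series, the smoothed curve makes a drift in the mean visible even when the raw signal fluctuates by several hundred bar.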


Minimum Distance Between Periodic Images

<figure id="fig:1J8U_mindist_c_alpha">

Plot of the minimum distance between periodic images of the C-alpha atoms of the backbone during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U. The curves for the three dimensions overlap and cannot be distinguished in the plot.

</figure>

  • What was the minimal distance between periodic images and at what time did that occur?
  • What happens if the minimal distance becomes shorter than the cut-off distance used for electrostatic interactions? Is it the case in your simulations? (It also matters whether the small distance occurs transiently or persistently. If it is persistent, it is likely to affect the protein dynamics; if it is only transient, it will hardly influence them, if at all.)
  • Now run g_mindist on the C-alpha group. Does it change the results? What does it mean for your system? (Ideally, the minimal distance should therefore not be less than two nanometers.)
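
The distinction between transient and persistent close contacts can be checked directly on the distance time series. The sketch below assumes that times (in ps) and minimum image distances (in nm) have already been read from the g_mindist output; the cut-off value in the usage example is a placeholder, not the value used in our runs.

```python
def below_cutoff_runs(times, distances, cutoff_nm):
    """Return (start_time, end_time) for each contiguous stretch of frames
    in which the minimum image distance is shorter than cutoff_nm."""
    runs = []
    start = None   # time at which the current stretch began, if any
    prev_t = None  # time of the previous frame
    for t, d in zip(times, distances):
        if d < cutoff_nm:
            if start is None:
                start = t
        elif start is not None:
            runs.append((start, prev_t))
            start = None
        prev_t = t
    if start is not None:
        runs.append((start, prev_t))
    return runs


# Toy data with 5 ps frame spacing; 1.0 nm is a hypothetical cut-off.
times = [0, 5, 10, 15, 20, 25]
dists = [2.5, 0.9, 0.8, 2.1, 0.95, 2.0]
for start, end in below_cutoff_runs(times, dists, cutoff_nm=1.0):
    print(f"below cut-off from {start} ps to {end} ps")
```

Many short stretches point to transient contacts, whereas a single stretch covering a large part of the trajectory indicates a persistent interaction of the protein with its own image.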


Root Mean Square Fluctuations

<figure id="fig:1J8U_rmsf">

Plot of the RMSF of all residues of the protein vs. its average position during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U.

</figure>

<figure id="fig:1J8U_average">

The average structure of the wildtype during the simulation. The structure is not physical as atom positions are averaged over the whole simulation.

</figure>

<figure id="fig:1J8U_bfactor_binding_site">

The B-factors of the binding site in the wildtype. Blue indicates little movement, red great flexibility.

</figure>


<figure id="fig:1J8U_bfactor_down_site">

The B-factors of the wildtype, view of the binding pocket. Blue indicates little movement, red great flexibility.

</figure>

<figure id="fig:1J8U_bfactor_up_site">

The B-factors of the wildtype, view of the upper side. Blue indicates little movement, red great flexibility.

</figure>

  • Indicate the start and end residue for the most flexible regions and the maximum amplitudes. ( T )
  • Compare the results from the different proteins. Are there differences? If yes, which is the most flexible and which least?
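
The RMSF plot and the B-factor colouring in the structure figures carry the same information. Assuming the B-factors were derived from the RMSF with the usual isotropic relation B = (8π²/3)·RMSF², the conversion can be sketched as follows. The RMSF is taken in nm, the unit of the Gromacs output, and the B-factor is returned in Å².

```python
import math


def rmsf_to_bfactor(rmsf_nm: float) -> float:
    """Isotropic B-factor in square angstroms from an RMSF given in nm."""
    rmsf_angstrom = rmsf_nm * 10.0  # 1 nm = 10 angstroms
    return 8.0 * math.pi ** 2 / 3.0 * rmsf_angstrom ** 2


def bfactor_to_rmsf(bfactor_a2: float) -> float:
    """Inverse conversion: RMSF in nm from a B-factor in square angstroms."""
    return math.sqrt(3.0 * bfactor_a2 / (8.0 * math.pi ** 2)) / 10.0


print(f"{rmsf_to_bfactor(0.1):.1f}")  # RMSF of 0.1 nm (1 angstrom)
```

Because the relation is quadratic, doubling the RMSF of a residue quadruples its B-factor, which is why flexible loops stand out so strongly in the coloured structures.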


Convergence of RMSD

<figure id="fig:1J8U_rmds_all-atom-vs-start">

Plot of the RMSD of all atoms of the protein vs. the starting structure during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U.

</figure>

<figure id="fig:1J8U_rmds_all-atom-vs-average">

Plot of the RMSD of all atoms of the protein vs. the (theoretical) average structure during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U.

</figure>

<figure id="fig:1J8U_rmds_backbone-vs-start">

Plot of the RMSD of the backbone atoms of the protein vs. the starting structure during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U.

</figure>

<figure id="fig:1J8U_rmds_backbone-vs-average">

Plot of the RMSD of the backbone atoms of the protein vs. the (theoretical) average structure during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U.

</figure>

  • If observed, at what time and value does the RMSD reach a plateau?
  • Briefly discuss differences between the graphs against the starting structure and against the average structure. Which is a better measure for convergence?
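
To make the difference between the two reference structures concrete, the sketch below computes the RMSD of each frame against the first frame and against the average structure. It is a simplification: the frames are assumed to be superimposed already, so the least-squares fit is omitted, and no mass weighting is applied. The toy trajectory is invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Unweighted RMSD between two equally sized lists of (x, y, z) tuples."""
    squared = sum((a - b) ** 2
                  for point_a, point_b in zip(coords_a, coords_b)
                  for a, b in zip(point_a, point_b))
    return math.sqrt(squared / len(coords_a))


def average_structure(frames):
    """Per-atom mean position over all frames (generally not a physical structure)."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    return [tuple(sum(frame[i][k] for frame in frames) / n_frames for k in range(3))
            for i in range(n_atoms)]


# Toy trajectory: a single atom drifting steadily along x.
frames = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
mean = average_structure(frames)
vs_start = [rmsd(frame, frames[0]) for frame in frames]
vs_average = [rmsd(frame, mean) for frame in frames]
print(vs_start, vs_average)
```

In the toy case the RMSD against the start grows with the drift, while the RMSD against the average first falls and then rises again as the trajectory passes the mean position.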


Convergence of Radius of Gyration

<figure id="fig:1J8U_radius_gyration">

Plot of the radius of gyration during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U.

</figure> <figure id="fig:1J8U_inertia">

Plot of the moment of inertia during the 10 ns simulation of the wildtype phenylalanine hydroxylase structure 1J8U.

</figure>


  • Have a look at the radius of gyration and the individual components and note how each of these progresses to an equilibrium value.
  • At what time and value does the radius of gyration converge? ( T )
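
For reference, the radius of gyration is the mass-weighted root mean square distance of the atoms from their centre of mass. The sketch below computes it for one frame, together with a per-axis component. We assume here that the component for an axis is the radius of gyration about that axis, i.e. it uses only the two coordinates perpendicular to it; the function names are illustrative.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a list of (x, y, z) tuples."""
    total = sum(masses)
    return tuple(sum(m * c[k] for c, m in zip(coords, masses)) / total
                 for k in range(3))


def radius_of_gyration(coords, masses):
    """Mass-weighted RMS distance of all atoms from the centre of mass."""
    com = center_of_mass(coords, masses)
    weighted = sum(m * sum((c[k] - com[k]) ** 2 for k in range(3))
                   for c, m in zip(coords, masses))
    return math.sqrt(weighted / sum(masses))


def radius_of_gyration_about_axis(coords, masses, axis):
    """Component about one axis (0 = x, 1 = y, 2 = z): only the two
    coordinates perpendicular to that axis contribute."""
    com = center_of_mass(coords, masses)
    others = [k for k in range(3) if k != axis]
    weighted = sum(m * sum((c[k] - com[k]) ** 2 for k in others)
                   for c, m in zip(coords, masses))
    return math.sqrt(weighted / sum(masses))
```

A compact, roughly spherical protein gives three similar components, while an elongated one shows a markedly smaller value about its long axis.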


Structural Analysis: Properties Derived from Configurations


Solvent accessible surface

  • Which residues are the most accessible to the solvent?


Hydrogen Bonds

  • Discuss the relation between the number of hydrogen bonds for both cases and the fluctuations in each plot.


Salt Bridges


Secondary Structure

  • Discuss some of the changes in the secondary structure, if any.


Ramachandran Plots

  • What can you say about the conformation of the residues, based on the Ramachandran plots?


Analysis of Dynamics and Time-averaged Properties


Root Mean Square Deviations

  • What is interesting about choosing the group "Mainchain+Cb" for this analysis?
  • How many transitions do you see?
  • What can you conclude from this analysis? Could you have expected such a result? Justify your answer.


Cluster Analysis

  • How many clusters were found and what were the sizes of the largest two?
  • Are there notable differences between the two structures?


Distance RMSD

  • At what time and value does the dRMSD converge and how does this graph compare to the standard RMSD?
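
Unlike the standard RMSD, the distance RMSD compares the internal atom-atom distances of a frame with those of a reference, so no prior superposition is needed. A minimal sketch, assuming normalisation by the number of atom pairs (the exact convention of the Gromacs tool may differ):

```python
import math
from itertools import combinations


def pair_distances(coords):
    """Distances between all unique atom pairs, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def drmsd(coords, reference):
    """Root mean square difference between corresponding pair distances."""
    d = pair_distances(coords)
    d_ref = pair_distances(reference)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(d, d_ref)) / len(d))


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
rotated = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0)]  # 90 degrees about z
print(drmsd(rotated, reference))  # a rigid rotation leaves all distances unchanged
```

The measure is therefore insensitive to overall rotation and translation, which is the main practical difference from the fitted RMSD.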


Ala322Gly analysis

<figure id="fig:mut322_overlay">

Overlay of all frames of the 10 ns simulation of the Ala322Gly mutation of phenylalanine hydroxylase structure 1J8U.

</figure>

Quality Assurance

Convergence of Energy Terms

<figure id="fig:Mut322_temperature">

Plot of the system temperature during the 10 ns simulation of the Ala322Gly mutation. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:Mut322_pressure">

Plot of the system pressure during the 10 ns simulation of the Ala322Gly mutation. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:Mut322_volume">

Plot of the system volume during the 10 ns simulation of the Ala322Gly mutation. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:Mut322_density">

Plot of the system density during the 10 ns simulation of the Ala322Gly mutation. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:Mut322_energies">

Plot of the system's potential, kinetic and total energy during the 10 ns simulation of the Ala322Gly mutation.

</figure>

<figure id="fig:Mut322_box">

Plot of the box dimensions in all three directions during the 10 ns simulation of the Ala322Gly mutation. The X- and Y-dimensions overlap and cannot be distinguished in the plot.

</figure>

<figure id="fig:Mut322_coulomb">

Plot of the Coulomb interactions during the 10 ns simulation of the Ala322Gly mutation. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:1J8Uvdw">

Plot of the van der Waals interactions during the 10 ns simulation of the Ala322Gly mutation. A running average in a window of length 100 ps is indicated in red.

</figure>

  • What is the average temperature and what is the heat capacity of the system? ( T )
  • What are the terms plotted in the files energy.xvg and box.xvg?
  • Estimate the plateau values for the pressure, the volume and the density. ( T )
  • What are the terms plotted in the files coulomb-inter.xvg and vanderwaals-inter.xvg?


Minimum Distance Between Periodic Images

<figure id="fig:Mut322_mindist">

Plot of the minimum distance between periodic images of the protein atoms during the 10 ns simulation of the Ala322Gly mutation. The curves for the three dimensions overlap and cannot be distinguished in the plot.

</figure> <figure id="fig:Mut322_mindist_c_alpha">

Plot of the minimum distance between periodic images of the C-alpha atoms of the backbone during the 10 ns simulation of the Ala322Gly mutation. The curves for the three dimensions overlap and cannot be distinguished in the plot.

</figure>

  • What was the minimal distance between periodic images and at what time did that occur?
  • What happens if the minimal distance becomes shorter than the cut-off distance used for electrostatic interactions? Is it the case in your simulations? (It also matters whether the small distance occurs transiently or persistently. If it is persistent, it is likely to affect the protein dynamics; if it is only transient, it will hardly influence them, if at all.)
  • Now run g_mindist on the C-alpha group. Does it change the results? What does it mean for your system? (Ideally, the minimal distance should therefore not be less than two nanometers.)


Root Mean Square Fluctuations

<figure id="fig:Mut322_rmsf-per-residue.png">

Plot of the RMSF of all residues of the protein vs. its average position during the 10 ns simulation of the Ala322Gly mutation.

</figure>


<figure id="fig:Mut322_bfactor_binding_site">

The B-factors of the binding site in the Ala322Gly mutant. Blue indicates little movement, red great flexibility.

</figure>


<figure id="fig:Mut322_bfactor_down_site">

The B-factors of the Ala322Gly mutant, view of the binding pocket. Blue indicates little movement, red great flexibility.

</figure>

<figure id="fig:Mut322_bfactor_up_site">

The B-factors of the Ala322Gly mutant, view of the upper side. Blue indicates little movement, red great flexibility.

</figure>


<figure id="fig:Mut322_bfactor_unmutated_site">

The B-factors of the Ala322Gly mutation site in the wildtype, located in the helix. Blue indicates little movement, red great flexibility.

</figure>

<figure id="fig:Mut322_bfactor_mutation_site">

The B-factors of the Ala322Gly mutation site in the mutant, located in the helix. Blue indicates little movement, red great flexibility.

</figure>

  • Indicate the start and end residue for the most flexible regions and the maximum amplitudes. ( T )
  • Compare the results from the different proteins. Are there differences? If yes, which is the most flexible and which least?


Convergence of RMSD

<figure id="fig:Mut322_rmds_all-atom-vs-start">

Plot of the RMSD of all atoms of the protein vs. the starting structure during the 10 ns simulation of the Ala322Gly mutation.

</figure>

<figure id="fig:Mut322_rmds-all-atom-vs-average">

Plot of the RMSD of all atoms of the protein vs. the (theoretical) average structure during the 10 ns simulation of the Ala322Gly mutation.

</figure>

<figure id="fig:Mut322_rmds-backbone-vs-start">

Plot of the RMSD of the backbone atoms of the protein vs. the starting structure during the 10 ns simulation of the Ala322Gly mutation.

</figure>

<figure id="fig:Mut322_rmds_backbone-vs-average">

Plot of the RMSD of the backbone atoms of the protein vs. the (theoretical) average structure during the 10 ns simulation of the Ala322Gly mutation.

</figure>

  • If observed, at what time and value does the RMSD reach a plateau?
  • Briefly discuss differences between the graphs against the starting structure and against the average structure. Which is a better measure for convergence?


Convergence of Radius of Gyration

<figure id="fig:Mut322_radius-of-gyration">

Plot of the radius of gyration during the 10 ns simulation of the Ala322Gly mutation.

</figure> <figure id="fig:Mut322_inertia">

Plot of the moment of inertia during the 10 ns simulation of the Ala322Gly mutation.

</figure>


  • Have a look at the radius of gyration and the individual components and note how each of these progresses to an equilibrium value.
  • At what time and value does the radius of gyration converge? ( T )


Structural Analysis: Properties Derived from Configurations


Solvent accessible surface

  • Which residues are the most accessible to the solvent?


Hydrogen Bonds

  • Discuss the relation between the number of hydrogen bonds for both cases and the fluctuations in each plot.


Salt Bridges


Secondary Structure

  • Discuss some of the changes in the secondary structure, if any.


Ramachandran Plots

  • What can you say about the conformation of the residues, based on the Ramachandran plots?


Analysis of Dynamics and Time-averaged Properties


Root Mean Square Deviations

  • What is interesting about choosing the group "Mainchain+Cb" for this analysis?
  • How many transitions do you see?
  • What can you conclude from this analysis? Could you have expected such a result? Justify your answer.


Cluster Analysis

  • How many clusters were found and what were the sizes of the largest two?
  • Are there notable differences between the two structures?


Distance RMSD

  • At what time and value does the dRMSD converge and how does this graph compare to the standard RMSD?


Arg408Trp analysis

<figure id="fig:mut408_overlay">

Overlay of all frames of the 10 ns simulation of the Arg408Trp mutation of phenylalanine hydroxylase structure 1J8U.

</figure>

Quality Assurance

Convergence of Energy Terms

<figure id="fig:Mut408_temperature">

Plot of the system temperature during the 10 ns simulation of the Arg408Trp mutation. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:Mut408_pressure">

Plot of the system pressure during the 10 ns simulation of the Arg408Trp mutation. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:Mut408_volume">

Plot of the system volume during the 10 ns simulation of the Arg408Trp mutation. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:Mut408_density">

Plot of the system density during the 10 ns simulation of the Arg408Trp mutation. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:Mut408_energies">

Plot of the system's potential, kinetic and total energy during the 10 ns simulation of the Arg408Trp mutation.

</figure>

<figure id="fig:Mut408_box">

Plot of the box dimensions in all three directions during the 10 ns simulation of the Arg408Trp mutation. The X- and Y-dimensions overlap and cannot be distinguished in the plot.

</figure>

<figure id="fig:Mut408_coulomb">

Plot of the Coulomb interactions during the 10 ns simulation of the Arg408Trp mutation. A running average in a window of length 100 ps is indicated in red.

</figure>

<figure id="fig:1J8Uvdw">

Plot of the van der Waals interactions during the 10 ns simulation of the Arg408Trp mutation. A running average in a window of length 100 ps is indicated in red.

</figure>

  • What is the average temperature and what is the heat capacity of the system? ( T )
  • What are the terms plotted in the files energy.xvg and box.xvg?
  • Estimate the plateau values for the pressure, the volume and the density. ( T )
  • What are the terms plotted in the files coulomb-inter.xvg and vanderwaals-inter.xvg?


Minimum Distance Between Periodic Images

<figure id="fig:Mut408_mindist">

Plot of the minimum distance between periodic images of the protein atoms during the 10 ns simulation of the Arg408Trp mutation. The curves for the three dimensions overlap and cannot be distinguished in the plot.

</figure> <figure id="fig:Mut408_mindist_c_alpha">

Plot of the minimum distance between periodic images of the C-alpha atoms of the backbone during the 10 ns simulation of the Arg408Trp mutation. The curves for the three dimensions overlap and cannot be distinguished in the plot.

</figure>


  • What was the minimal distance between periodic images and at what time did that occur?
  • What happens if the minimal distance becomes shorter than the cut-off distance used for electrostatic interactions? Is it the case in your simulations? (It also matters whether the small distance occurs transiently or persistently. If it is persistent, it is likely to affect the protein dynamics; if it is only transient, it will hardly influence them, if at all.)
  • Now run g_mindist on the C-alpha group. Does it change the results? What does it mean for your system? (Ideally, the minimal distance should therefore not be less than two nanometers.)


Root Mean Square Fluctuations

<figure id="fig:Mut408_rmsf-per-residue.png">

Plot of the RMSF of all residues of the protein vs. its average position during the 10 ns simulation of the Arg408Trp mutation.

</figure>

<figure id="fig:Mut408_bfactor_binding_site">

The B-factors of the binding site in the Arg408Trp mutant. Blue indicates little movement, red great flexibility.

</figure>


<figure id="fig:Mut408_bfactor_down_site">

The B-factors of the Arg408Trp mutant, view of the binding pocket. Blue indicates little movement, red great flexibility.

</figure>

<figure id="fig:Mut408_bfactor_up_site">

The B-factors of the Arg408Trp mutant, view of the upper side. Blue indicates little movement, red great flexibility.

</figure>


<figure id="fig:Mut408_bfactor_unmutated_site">

The B-factors of the Arg408Trp mutation site in the wildtype, located in the loop. Blue indicates little movement, red great flexibility.

</figure>

<figure id="fig:Mut408_bfactor_mutation_site">

The B-factors of the Arg408Trp mutation site in the mutant, located in the loop. Blue indicates little movement, red great flexibility.

</figure>

  • Indicate the start and end residue for the most flexible regions and the maximum amplitudes. ( T )
  • Compare the results from the different proteins. Are there differences? If yes, which is the most flexible and which least?


Convergence of RMSD

<figure id="fig:Mut408_rmds_all-atom-vs-start">

Plot of the RMSD of all atoms of the protein vs. the starting structure during the 10 ns simulation of the Arg408Trp mutation.

</figure>

<figure id="fig:Mut408_rmds-all-atom-vs-average">

Plot of the RMSD of all atoms of the protein vs. the (theoretical) average structure during the 10 ns simulation of the Arg408Trp mutation.

</figure>

<figure id="fig:Mut408_rmds-backbone-vs-start">

Plot of the RMSD of the backbone atoms of the protein vs. the starting structure during the 10 ns simulation of the Arg408Trp mutation.

</figure>

<figure id="fig:Mut408_rmds_backbone-vs-average">

Plot of the RMSD of the backbone atoms of the protein vs. the (theoretical) average structure during the 10 ns simulation of the Arg408Trp mutation.

</figure>

  • If observed, at what time and value does the RMSD reach a plateau?
  • Briefly discuss differences between the graphs against the starting structure and against the average structure. Which is a better measure for convergence?


Convergence of Radius of Gyration

<figure id="fig:Mut408_radius-of-gyration">

Plot of the radius of gyration during the 10 ns simulation of the Arg408Trp mutation.

</figure> <figure id="fig:Mut408_inertia">

Plot of the moment of inertia during the 10 ns simulation of the Arg408Trp mutation.

</figure>


  • Have a look at the radius of gyration and the individual components and note how each of these progresses to an equilibrium value.
  • At what time and value does the radius of gyration converge? ( T )


Structural Analysis: Properties Derived from Configurations


Solvent accessible surface

  • Which residues are the most accessible to the solvent?


Hydrogen Bonds

  • Discuss the relation between the number of hydrogen bonds for both cases and the fluctuations in each plot.


Salt Bridges


Secondary Structure

  • Discuss some of the changes in the secondary structure, if any.


Ramachandran Plots

  • What can you say about the conformation of the residues, based on the Ramachandran plots?


Analysis of Dynamics and Time-averaged Properties


Root Mean Square Deviations

  • What is interesting about choosing the group "Mainchain+Cb" for this analysis?
  • How many transitions do you see?
  • What can you conclude from this analysis? Could you have expected such a result? Justify your answer.


Cluster Analysis

  • How many clusters were found and what were the sizes of the largest two?
  • Are there notable differences between the two structures?


Distance RMSD

  • At what time and value does the dRMSD converge and how does this graph compare to the standard RMSD?