Difference between revisions of "Molecular Dynamics Simulations Analysis (PKU)"
(→Convergence of RMSD) |
(→Convergence of RMSD) |
||
Line 155: | Line 155: | ||
<xr id="fig:1J8U_average"/> shows the average structure of the simulation, which means the position of every atom is the average position of this atom during the simulation. This structure has impossible configurations but can serve as reference for the convergence of the protein.<br/> |
<xr id="fig:1J8U_average"/> shows the average structure of the simulation, which means the position of every atom is the average position of this atom during the simulation. This structure has impossible configurations but can serve as reference for the convergence of the protein.<br/> |
||
<xr id="fig:1J8U_rmds_all-atom-vs-start"/> shows the RMSD of the simulated protein compared to the starting structure <figure id="fig:1J8U_rmds_all-atom-vs-average"> the RMSD compared to the average structure. The RMSD compared to the starting structure rises steep during the first 2 ns, then rises more slowly but without noticeable convergence. In the time window between 6000 and 7500 we see the saw like movement encountered [[#Minimum Distance Between Periodic Images|previously]] again.<br/> |
<xr id="fig:1J8U_rmds_all-atom-vs-start"/> shows the RMSD of the simulated protein compared to the starting structure <figure id="fig:1J8U_rmds_all-atom-vs-average"> the RMSD compared to the average structure. The RMSD compared to the starting structure rises steep during the first 2 ns, then rises more slowly but without noticeable convergence. In the time window between 6000 and 7500 we see the saw like movement encountered [[#Minimum Distance Between Periodic Images|previously]] again.<br/> |
||
− | If the structure continually changes, the RMSD compared to the average structure would follow a hyperbola during the simulation, if the structure converges, we would see a declining RMSD and convergence towards a small value (0, if the final structure were rigid). In fact, we see a hyberbola-like behaviour with a plateau in the middle of the simulation, which fits to the observation from <xr id="fig:1J8U_rmds_all-atom-vs-start"/> that the structure does not converge but changes more slowls towards the end. The most likely conclusion is that the structure has not yet converged towards an equilibrium state. The same applies if we look only at the backbone atoms in < |
+ | If the structure continually changes, the RMSD compared to the average structure would follow a hyperbola during the simulation, if the structure converges, we would see a declining RMSD and convergence towards a small value (0, if the final structure were rigid). In fact, we see a hyberbola-like behaviour with a plateau in the middle of the simulation, which fits to the observation from <xr id="fig:1J8U_rmds_all-atom-vs-start"/> that the structure does not converge but changes more slowls towards the end. The most likely conclusion is that the structure has not yet converged towards an equilibrium state. The same applies if we look only at the backbone atoms in <xr id="fig:1J8U_rmds_backbone-vs-start"/> and <xr id="fig:1J8U_rmds_backbone-vs-average"/>. Here, we see more clearly how the structure does not reach a plateau. Around 6500 ps in <xr id="fig:1J8U_rmds_backbone-vs-average">/ we again see some pronounced shift in the structure. |
<br style="clear:both;"/> |
<br style="clear:both;"/> |
Revision as of 14:41, 14 July 2012
Contents
- 1 Short Introduction
- 2 Initial Checks
- 3 Wildtype analysis
- 4 Gly322Ala analysis
- 5 Arg408Trp analysis
Short Introduction
We will analyze our completed molecular dynamics simulations, following the task description and the tutorial of the Utrecht University Molecular Modeling Practical. We have completed one run for the wildtype protein and for the mutations ALA322GLY and ARG408TRP, a second run of the wildtype is pending. The second run for the wildtype might be necessary as the trajectory of the wildtype differs significantly from both the mutants. The commands used to generate plots, images etc. can be found in our journal.
Initial Checks
All three simulations run for the desired 10 ns, the trajectories contain 2000 frames in 5 ps steps each. The wildtype simulation took significantly longer, since we used only 16 cores for the widtype, 32 for the mutants. Almost half of the calculation time, 44.2% in each run, is spent on calculating Coulomb interactions and the Lennard-Jones potential of the solvent molecules. A few key statistics can be found in <xr id="tab:simulation_stats"/>.
<figtable id="tab:simulation_stats"> Statistics of the MD simulations
Mutation | Sim. time | Sim. speed | time to reach 1 s | ||||
---|---|---|---|---|---|---|---|
Wildtype | 11:32 h | 20.8 ns/day | 131,621 years | ||||
ALA322GLY | 4:20 h | 55.3 ns/day | 49,543 years | ||||
ARG408TRP | 4:26 h | 54.1 ns/day | 50,685 years |
</figtable>
Wildtype analysis
<figure id="fig:1J8U_overlay">
</figure>
<xr id="fig:1J8U_overlay"/> shows the overlay of all frames of the wildtype simulation. The trajectory for this image is already filtered from jumps over the boundaries and motions in space. We see that the protein remains compact during the simulation but little details. In the following sections we analyze this simulation in closer detail.
Quality Assurance
Convergence of Energy Terms
<figure id="fig:1J8U_temperature">
</figure>
<figure id="fig:1J8U_pressure">
</figure>
<figure id="fig:1J8U_volume">
</figure>
<figure id="fig:1J8U_density">
</figure>
<figure id="fig:1J8U_energies">
</figure>
<figure id="fig:1J8U_box">
</figure>
<figure id="fig:1J8U_coulomb">
</figure>
<figure id="fig:1J8Uvdw">
</figure>
<xr id="fig:1J8U_temperature"/> shows the temperature during the simulation. It fluctuates slightly around 297.9° Kelvin or 24.7° Celsius but stays within just 3 degrees. (Calculation of heat capacity was erroneous in Gromacs and has been disabled in 4.5.)
<xr id="fig:1J8U_pressure"/> shows how the pressure fluctuates wildly from -200 to +200 bar and peaks up to +- 400 bar during the whole simulation. The average stays very close to the setting of 1 bar. This could either simply be a feature of the simulation or be considered realistic, as the volume of the simulation box is very small and small fluctuations in the volume cause large pressure fluctuations (cf. ambermd.org). <xr id="fig:1J8U_volume"/> shows accordingly small changes of the volume, mostly within 0.5 nm^3 of 365.6 nm^3. Density (cf. <xr id="fig:1J8U_density"/>) remains very stable around 1021.3 kg/m^3, as do the potential and kinetic energy in <xr id="fig:1J8U_energies"/>. The size of the box containing the simulation (cf. <xr id="fig:1J8U_box"/>) remains almost fix in all three dimensions. The small peaks are probably water molecules crossing the periodic boundaries. The energies of the van-der-Waals interactions and the Coulomb interactions are shown in <xr id="fig:1J8Uvdw"/> and <xr id="fig:1J8U_coulomb" /> respectively. While the energy of the van-der-Waals interactions stays roughly constant, the energy from coulomb interactions first goes down steeply, then stabilizes but does not converge. Altogether, we see for most terms a stable behaviour, and assume, that the initial conditions have already been equilibrated properly in the short runs before the production run.
Minimum Distance Between Periodic Images
<figure id="fig:1J8U_mindist">
</figure>
<figure id="fig:1J8U_mindist_c_alpha">
</figure> Since the protein uses periodic boundaries, it is possible that the protein interacts with another copy of itself. This interaction could even be indirect if the hydration shell of the protein touches over the boundaries, so the distance between periodic images should be at least 2 nm. The minimal distance in this simulation is 1.69 nm around 1350 ps, near the simulation start. There is another valley around 7800 ps, but if there was any interaction, it was only transient and did probably not affect the simulation, as there is no plateau in an unsafe distance as can be seen in <xr id="fig:1J8U_mindist"/>. Looking only at the backbone C alphe atoms in <xr id="fig:1J8U_mindist_c_alpha"/>, the distance is always well above 2 nm. Here, interactions would severely affect the simulation if e.g. hydrogen bonds between the backbone would form. There is a saw teeth like movement between 6000 ps and 7500 ps where the distance reaches a peak and a minimum twice in shost succession. This could indicate spatial movement or a contraction and rebound of the protein in this time window.
Root Mean Square Fluctuations
<figure id="fig:1J8U_rmsf">
</figure>
<figure id="fig:1J8U_bfactor_binding_site">
</figure>
<figure id="fig:1J8U_bfactor_down_site">
</figure>
<figure id="fig:1J8U_bfactor_up_site">
</figure>
The most flexible regions corresponding to the peaks in <xr id="fig:1J8U_rmsf"/> are the loops from residue 18 to 32 with a highly flexible Tyr20 (B-factor 330.63), 153 to 163 with again the most flexible residue Tyr159 (B-factor 203.20) and 258 to 267 with Phe264 as the most flexible (B-factor 148.98). There are other single residues with high B-factors, most of them located at the end of alpha helices and often tyrosines as can be seen in <xr id="fig:1J8U_bfactor_down_site"/> and <xr id="fig:1J8U_bfactor_up_site"/>. <xr id="fig:1J8U_bfactor_binding_site"/> shows a close-up of the binding site.
Convergence of RMSD
<figure id="fig:1J8U_average">
</figure>
<figure id="fig:1J8U_rmds_all-atom-vs-start">
</figure>
<figure id="fig:1J8U_rmds_all-atom-vs-average">
</figure>
<figure id="fig:1J8U_rmds_backbone-vs-start">
</figure>
<figure id="fig:1J8U_rmds_backbone-vs-average">
</figure>
<xr id="fig:1J8U_average"/> shows the average structure of the simulation, which means the position of every atom is the average position of this atom during the simulation. This structure has impossible configurations but can serve as reference for the convergence of the protein.
<xr id="fig:1J8U_rmds_all-atom-vs-start"/> shows the RMSD of the simulated protein compared to the starting structure <figure id="fig:1J8U_rmds_all-atom-vs-average"> the RMSD compared to the average structure. The RMSD compared to the starting structure rises steep during the first 2 ns, then rises more slowly but without noticeable convergence. In the time window between 6000 and 7500 we see the saw like movement encountered previously again.
If the structure continually changes, the RMSD compared to the average structure would follow a hyperbola during the simulation, if the structure converges, we would see a declining RMSD and convergence towards a small value (0, if the final structure were rigid). In fact, we see a hyberbola-like behaviour with a plateau in the middle of the simulation, which fits to the observation from <xr id="fig:1J8U_rmds_all-atom-vs-start"/> that the structure does not converge but changes more slowls towards the end. The most likely conclusion is that the structure has not yet converged towards an equilibrium state. The same applies if we look only at the backbone atoms in <xr id="fig:1J8U_rmds_backbone-vs-start"/> and <xr id="fig:1J8U_rmds_backbone-vs-average"/>. Here, we see more clearly how the structure does not reach a plateau. Around 6500 ps in <xr id="fig:1J8U_rmds_backbone-vs-average">/ we again see some pronounced shift in the structure.
Convergence of Radius of Gyration
<figure id="fig:1J8U_radius_gyration">
</figure> <figure id="fig:1J8U_inertia">
</figure>
- Have a look at the radius of gyration and the individual components and note how each of these progress to an equilibrium value.
- At what time and value does the radius of gyration converge? ( T )
Structural Analysis: Properties Derived from Configurations
Solvent accessible surface
- Which residues are the most accessible to the solvent?
Hydrogen Bonds
- Discuss the relation between the number of hydrogen bonds for both cases and the fluctuations in each plot.
Salt Bridges
Secondary Structure
- Discuss some of the changes in the secondary structure, if any.
Ramachandran Plots
- What can you say about the conformation of the residues, based on the ramachandran plots?
Analysis of Dynamics and Time-averaged Properties
Root Mean Square Deviations
- What is interesting by choosing the group "Mainchain+Cb" for this analysis?
- How many transitions do you see?
- What can you conclude from this analysis? Could you expect such a result, justify?
Cluster Analysis
- How many clusters were found and what were the sizes of the largest two?
- Are there notable differences between the two structures?
Distance RMSD
- At what time and value does the dRMSD converge and how does this graph compare to the standard RMSD?
Gly322Ala analysis
<figure id="fig:mut322_overlay">
</figure>
Quality Assurance
Convergence of Energy Terms
<figure id="fig:Mut322_temperature">
</figure>
<figure id="fig:Mut322_pressure">
</figure>
<figure id="fig:Mut322_volume">
</figure>
<figure id="fig:Mut322_density">
</figure>
<figure id="fig:Mut322_energies">
</figure>
<figure id="fig:Mut322_box">
</figure>
<figure id="fig:Mut322_coulomb">
</figure>
<figure id="fig:1J8Uvdw">
</figure>
- What is the average temperature and what is the heat capacity of the system? ( T )
- What are the terms plotted in the files energy.xvg and box.xvg
- Estimate the plateau values for the pressure, the volume and the density. ( T )
- What are the terms plotted in the files coulomb-inter.xvg and vanderwaals-inter.xvg ?
Minimum Distance Between Periodic Images
<figure id="fig:Mut322_mindist">
</figure> <figure id="fig:Mut322_mindist_c_alpha">
</figure>
- What was the minimal distance between periodic images and at what time did that occur?
- What happens if the minimal distance becomes shorter than the cut-off distance used for electrostatic interactions? Is it the case in your simulations? (It also matters if the small distance occurs transiently or if it is persistent. If it is persistent, it is likely affecting the protein dynamics; but if it's just transiently than it will hardly, if at all, influence.)
- Run now g_mindist on the C-alpha group, does it change the results? What does is mean for your system? (Ideally, the minimal distance should therefore not be less than two nanometers.)
Root Mean Square Fluctuations
<figure id="fig:Mut322_rmsf-per-residue.png">
</figure>
<figure id="fig:Mut322_bfactor_binding_site">
</figure>
<figure id="fig:Mut322_bfactor_down_site">
</figure>
<figure id="fig:Mut322_bfactor_up_site">
</figure>
<figure id="fig:Mut322_bfactor_unmutated_site">
</figure>
<figure id="fig:Mut322_bfactor_mutation_site">
</figure>
- Indicate the start and end residue for the most flexible regions and the maximum amplitudes. ( T )
- Compare the results from the different proteins. Are there differences? If yes, which is the most flexible and which least?
Convergence of RMSD
<figure id="fig:Mut322_rmds_all-atom-vs-start">
</figure>
<figure id="fig:Mut322_rmds-all-atom-vs-average">
</figure>
<figure id="fig:Mut322_rmds-backbone-vs-start">
</figure>
<figure id="fig:Mut322_rmds_backbone-vs-average">
</figure>
- If observed, at what time and value does the RMSD reach a plateau?
- Briefly discuss differences between the graphs against the starting structure and against the average structure. Which is a better measure for convergence?
Convergence of Radius of Gyration
<figure id="fig:Mut322_radius-of-gyration">
</figure> <figure id="fig:Mut322_inertia">
</figure>
- Have a look at the radius of gyration and the individual components and note how each of these progress to an equilibrium value.
- At what time and value does the radius of gyration converge? ( T )
Structural Analysis: Properties Derived from Configurations
Solvent accessible surface
- Which residues are the most accessible to the solvent?
Hydrogen Bonds
- Discuss the relation between the number of hydrogen bonds for both cases and the fluctuations in each plot.
Salt Bridges
Secondary Structure
- Discuss some of the changes in the secondary structure, if any.
Ramachandran Plots
- What can you say about the conformation of the residues, based on the ramachandran plots?
Analysis of Dynamics and Time-averaged Properties
Root Mean Square Deviations
- What is interesting by choosing the group "Mainchain+Cb" for this analysis?
- How many transitions do you see?
- What can you conclude from this analysis? Could you expect such a result, justify?
Cluster Analysis
- How many clusters were found and what were the sizes of the largest two?
- Are there notable differences between the two structures?
Distance RMSD
- At what time and value does the dRMSD converge and how does this graph compare to the standard RMSD?
Arg408Trp analysis
<figure id="fig:mut408_overlay">
</figure>
Quality Assurance
Convergence of Energy Terms
<figure id="fig:Mut408_temperature">
</figure>
<figure id="fig:Mut408_pressure">
</figure>
<figure id="fig:Mut408_volume">
</figure>
<figure id="fig:Mut408_density">
</figure>
<figure id="fig:Mut408_energies">
</figure>
<figure id="fig:Mut408_box">
</figure>
<figure id="fig:Mut408_coulomb">
</figure>
<figure id="fig:1J8Uvdw">
</figure>
- What is the average temperature and what is the heat capacity of the system? ( T )
- What are the terms plotted in the files energy.xvg and box.xvg
- Estimate the plateau values for the pressure, the volume and the density. ( T )
- What are the terms plotted in the files coulomb-inter.xvg and vanderwaals-inter.xvg ?
Minimum Distance Between Periodic Images
<figure id="fig:Mut408_mindist">
</figure> <figure id="fig:Mut408_mindist_c_alpha">
</figure>
- What was the minimal distance between periodic images and at what time did that occur?
- What happens if the minimal distance becomes shorter than the cut-off distance used for electrostatic interactions? Is it the case in your simulations? (It also matters if the small distance occurs transiently or if it is persistent. If it is persistent, it is likely affecting the protein dynamics; but if it's just transiently than it will hardly, if at all, influence.)
- Run now g_mindist on the C-alpha group, does it change the results? What does is mean for your system? (Ideally, the minimal distance should therefore not be less than two nanometers.)
Root Mean Square Fluctuations
<figure id="fig:Mut408_rmsf-per-residue.png">
</figure>
<figure id="fig:Mut408_bfactor_binding_site">
</figure>
<figure id="fig:Mut408_bfactor_down_site">
</figure>
<figure id="fig:Mut408_bfactor_up_site">
</figure>
<figure id="fig:Mut408_bfactor_unmutated_site">
</figure>
<figure id="fig:Mut408_bfactor_mutation_site">
</figure>
- Indicate the start and end residue for the most flexible regions and the maximum amplitudes. ( T )
- Compare the results from the different proteins. Are there differences? If yes, which is the most flexible and which least?
Convergence of RMSD
<figure id="fig:Mut408_rmds_all-atom-vs-start">
</figure>
<figure id="fig:Mut408_rmds-all-atom-vs-average">
</figure>
<figure id="fig:Mut408_rmds-backbone-vs-start">
</figure>
<figure id="fig:Mut408_rmds_backbone-vs-average">
</figure>
- If observed, at what time and value does the RMSD reach a plateau?
- Briefly discuss differences between the graphs against the starting structure and against the average structure. Which is a better measure for convergence?
Convergence of Radius of Gyration
<figure id="fig:Mut408_radius-of-gyration">
</figure> <figure id="fig:Mut408_inertia">
</figure>
- Have a look at the radius of gyration and the individual components and note how each of these progress to an equilibrium value.
- At what time and value does the radius of gyration converge? ( T )
Structural Analysis: Properties Derived from Configurations
Solvent accessible surface
- Which residues are the most accessible to the solvent?
Hydrogen Bonds
- Discuss the relation between the number of hydrogen bonds for both cases and the fluctuations in each plot.
Salt Bridges
Secondary Structure
- Discuss some of the changes in the secondary structure, if any.
Ramachandran Plots
- What can you say about the conformation of the residues, based on the ramachandran plots?
Analysis of Dynamics and Time-averaged Properties
Root Mean Square Deviations
- What is interesting by choosing the group "Mainchain+Cb" for this analysis?
- How many transitions do you see?
- What can you conclude from this analysis? Could you expect such a result, justify?
Cluster Analysis
- How many clusters were found and what were the sizes of the largest two?
- Are there notable differences between the two structures?
Distance RMSD
- At what time and value does the dRMSD converge and how does this graph compare to the standard RMSD?