Difference between revisions of "ASPA Molecular Dynamics Simulation Analysis"
(Created page with "Nothing to see yet, as we are still waiting for our LRZ jobs.") |
|||
Line 1: | Line 1: | ||
+ | ===A brief check of results=== |
||
− | Nothing to see yet, as we are still waiting for our LRZ jobs. |
||
+ | |||
+ | |||
+ | ====How many frames are in the trajectory file and what is the time resolution?==== |
||
+ | 2000 frames at a resolution of 5 picoseconds per frame. |
||
+ | |||
+ | |||
+ | ====How long did the simulation run in real time (hours), what was the simulation speed (ns/day) and how many years would the simulation take to reach a second?==== |
||
+ | |||
+ | Reference structure: Real time 1d04h28:22, speed 8.429 ns/day, 325035.7 years for one full second |
||
+ | |||
+ | CM994594 Mutation: Real time 1d04h51:42, speed 8.316 ns/day, 329452.3 years for one full second |
||
+ | |||
+ | |||
+ | ===Visualization of results=== |
||
+ | |||
+ | Reference structure: |
||
+ | |||
+ | [[File:Ref anim.gif]] |
||
+ | |||
+ | CM mutated structure: |
||
+ | |||
+ | [[File:Cm anim small.gif]] |
||
+ | |||
+ | In both cases, the two domains separated and ended up at quite a large distance from each other: |
||
+ | |||
+ | [[File:Ref domains.png|500px|Dislocated domains and simulation box]] |
||
+ | |||
+ | We tried to remedy this by improving the equilibration step of the system, but could not solve this issue despite our efforts. As can be seen clearly, neither domain stays within the simulation box; this certainly has some effect on the results of this analysis. |
||
+ | |||
+ | |||
+ | ====What happens if the protein diffuses over the boundary of the box?==== |
||
+ | |||
+ | When a protein leaves the box through, say, the upper face, periodic boundary conditions make it re-enter through the bottom face; in the unprocessed trajectory this looks like a hole in the solvent where the protein used to be. |
||
+ | |||
+ | |||
+ | ====What is the average temperature and what is the heat capacity of the system?==== |
||
+ | |||
+ | [[File:Ref temperature CD.png|300px|thumb|Reference structure temperature graph]] |
||
+ | [[File:Cm temperature CD.png|300px|thumb|CM mutation structure temperature graph]] |
||
+ | |||
+ | The average temperature of the system is: |
||
+ | |||
+ | Reference Structure: |
||
+ | <pre> Energy Average Err.Est. RMSD Tot-Drift |
||
+ | ------------------------------------------------------------------------------- |
||
+ | Temperature 297.929 0.0024 0.902235 0.00656339 (K) |
||
+ | |||
+ | Heat capacity at constant pressure Cp: 2.62838e+06 J/mol K</pre> |
||
+ | |||
+ | CM994594: |
||
+ | <pre> Energy Average Err.Est. RMSD Tot-Drift |
||
+ | ------------------------------------------------------------------------------- |
||
+ | Temperature 297.926 0.0059 0.901054 0.00425121 (K) |
||
+ | |||
+ | Heat capacity at constant pressure Cp: 2.62312e+06 J/mol K</pre> |
||
+ | |||
+ | |||
+ | ====What are the terms plotted in the files energy.xvg and box.xvg==== |
||
+ | [[File:Ref energy CD.png|300px|thumb|Reference structure energy graph]] |
||
+ | [[File:Cm energy CD.png|300px|thumb|CM mutation structure energy graph]] |
||
+ | [[File:Ref box CD.png|300px|thumb|Reference structure box graph]] |
||
+ | [[File:Cm box CD.png|300px|thumb|CM mutation structure box graph]] |
||
+ | |||
+ | '''energy.xvg:''' We plot the energy values of kinetic energy, potential energy and total energy over the time of our simulation. |
||
+ | |||
+ | '''box.xvg:''' We plot the size of the box around our protein which is given in nm. |
||
+ | |||
+ | |||
+ | ====Estimate the plateau values for the pressure, the volume and the density.==== |
||
+ | [[File:Ref pressure CD.png|300px|thumb|Reference structure pressure graph]] |
||
+ | [[File:Cm pressure CD.png|300px|thumb|CM mutation structure pressure graph]] |
||
+ | [[File:Ref volume CD.png|300px|thumb|Reference structure volume graph]] |
||
+ | [[File:Cm volume CD.png|300px|thumb|CM mutation structure volume graph]] |
||
+ | [[File:Ref density CD.png|300px|thumb|Reference structure density graph]] |
||
+ | [[File:Cm density CD.png|300px|thumb|CM mutation structure density graph]] |
||
+ | |||
+ | Reference structure: |
||
+ | |||
+ | * '''Pressure:''' upper plateau: 226 bar, lower plateau: -210 bar |
||
+ | * '''Volume:''' upper plateau: 1077 nm^3, lower plateau: 1073 nm^3 |
||
+ | * '''Density:''' upper plateau: 1008 kg/m^3, lower plateau: 1004 kg/m^3 |
||
+ | |||
+ | CM994594 mutated structure: |
||
+ | |||
+ | * '''Pressure:''' upper plateau: 230 bar, lower plateau: -230 bar |
||
+ | * '''Volume:''' upper plateau: 1076 nm^3, lower plateau: 1072 nm^3 |
||
+ | * '''Density:''' upper plateau: 1008 kg/m^3, lower plateau: 1005 kg/m^3 |
||
+ | |||
+ | |||
+ | ====What are the terms plotted in the files coulomb-inter.xvg and vanderwaals-inter.xvg==== |
||
+ | [[File:Ref coloumb CD.png|300px|thumb|Reference structure coulomb graph]] |
||
+ | [[File:Cm coloumb CD.png|300px|thumb|CM mutation structure coulomb graph]] |
||
+ | [[File:Ref vanderwaals CD.png|300px|thumb|Reference structure van der Waals graph]] |
||
+ | [[File:Cm vanderwaals CD.png|300px|thumb|CM mutation structure van der Waals graph]] |
||
+ | |||
+ | The Coulomb and van der Waals energies of the protein over time. |
||
+ | |||
+ | |||
+ | ====What happens if the minimal distance becomes shorter than the cut-off distance used for electrostatic interactions? Is it the case in your simulations?==== |
||
+ | |||
+ | The total energy would increase dramatically; this appears not to have happened in our simulations. |
||
+ | |||
+ | |||
+ | ====Run now g_mindist on the C-alpha group, does it change the results? What does it mean for your system?==== |
||
+ | [[File:Ref minimal periodic distance CD.png|300px|thumb|Reference structure minimum periodic distance graph]] |
||
+ | [[File:Cm minimal periodic distance CD.png|300px|thumb|CM mutation structure minimum periodic distance graph]] |
||
+ | [[File:Ref minimal periodic distance calpha CD.png|300px|thumb|Reference structure minimum periodic distance (c alpha) graph]] |
||
+ | [[File:Cm minimal periodic distance calpha CD.png|300px|thumb|CM mutation structure minimum periodic distance (c alpha) graph]] |
||
+ | |||
+ | * '''CM994594:''' The shortest periodic distance is 0.357548 (nm) at time 1565 (ps), between atoms 4791 and 8299 |
||
+ | * '''Reference Structure:''' The shortest periodic distance is 0.384713 (nm) at time 7955 (ps), between atoms 3061 and 10049 |
||
+ | |||
+ | [[File:Ref pymol b-factors.png|300px|thumb|Reference structure b-factors visualization]] |
||
+ | [[File:Cm pymol b-factors.png|300px|thumb|CM mutation structure b-factors visualization]] |
||
+ | [[File:Ref rms fluctuation CD.png|300px|thumb|Reference structure RMSF per residue graph]] |
||
+ | [[File:Cm rmsf per residue CD.png|300px|thumb|CM mutation structure RMSF per residue graph]] |
||
+ | |||
+ | |||
+ | ====Indicate the start and end residue for the most flexible regions and the maximum amplitudes==== |
||
+ | |||
+ | * '''CM994594:''' Most flexible regions at 50-110, 210-280; maximum amplitude 1.0 (from 1.4 to 2.4) |
||
+ | * '''Reference:''' Most flexible regions at 80-90, 230-250, 260-270; maximum amplitude 0.4 (from 1.9 to 2.3) |
||
+ | |||
+ | |||
+ | ====Compare the results from the different proteins. Are there differences? If yes, which is the most flexible and which least?==== |
||
+ | |||
+ | CM994594 appears to be less flexible than the reference: the reference reaches its maximum far more often, has better-defined maxima, and has a higher average. |
||
+ | |||
+ | [[File:Ref rmsd all atom vs avg CD.png|300px|thumb|Reference structure RMSD all atoms vs. average graph]] |
||
+ | [[File:Cm rmsd all atoms vs avg CD.png|300px|thumb|CM mutation structure RMSD all atoms vs. average graph]] |
||
+ | [[File:Ref rmsd all atom vs start CD.png|300px|thumb|Reference structure RMSD all atoms vs. start graph]] |
||
+ | [[File:Cm rmsd all atoms vs start CD.png|300px|thumb|CM mutation structure RMSD all atoms vs. start graph]] |
||
+ | |||
+ | ====If observed, at what time and value does the RMSD reach a plateau?==== |
||
+ | |||
+ | * '''CM994594:''' the RMSD converges to around 4.2 nm at 6000 ps |
||
+ | * '''Reference:''' the RMSD converges to around 5.04 nm at 8000 ps |
||
+ | |||
+ | |||
+ | ====Briefly discuss two differences between the graphs against the starting structure and against the average structure. Which is a better measure for convergence?==== |
||
+ | |||
+ | The starting-structure graphs oscillate far more strongly and do not show any indication of convergence. |
||
+ | The average-structure graphs are far more indicative. This makes sense, because the average structure should reflect the converging structures better than the starting structure does. |
||
+ | |||
+ | |||
+ | [[File:Ref radius of gyration CD.png|300px|thumb|Reference structure radius of gyration graph]] |
||
+ | [[File:Cm radius of gyration CD.png|300px|thumb|CM mutation structure radius of gyration graph]] |
||
+ | |||
+ | ====At what time and value does the radius of gyration converge?==== |
||
+ | |||
+ | * '''CM994594:''' Convergence at about 3600 ps, around 5 nm. |
||
+ | * '''Reference:''' Convergence at about 5500 ps, around 6 nm. |
||
+ | |||
+ | |||
+ | ===Structural Analysis: Properties Derived From Configurations=== |
||
+ | [[File:Ref ss.png|300px|thumb|Reference structure secondary structure vs. time]] |
||
+ | [[File:Cm ss.png|300px|thumb|CM mutation structure secondary structure vs. time]] |
||
+ | |||
+ | ====Discuss some of the changes in the secondary structure, if any.==== |
||
+ | '''Reference:''' |
||
+ | |||
+ | At around residues 60 to 75, a number of residues constantly switch back and forth between turn and alpha helix. The same can be seen at around 370-375. In both cases, these are followed by a turn/bend structure and a short alpha helix element. The separation from the helix by a turn/bend stretch is somewhat unexpected; a continuous alpha helix would appear to make more sense. |
||
+ | |||
+ | Also, at around 230, a stretch of turns and bends disintegrates into coils over time. This is interesting, since there is no alpha helix nearby and there appears to be little conserving pressure on these residues' secondary structure. |
||
+ | |||
+ | '''CM994594:''' |
||
+ | |||
+ | At around 380, a stretch of residues alternates between alpha helix and turn; since it would form a very short alpha helix, this appears to be an artifact of DSSP's assignment method. |
||
+ | |||
+ | |||
+ | ====Compare the stability of the secondary structures in the different proteins.==== |
||
+ | The reference has slightly better-defined structure elements with less fuzzy boundaries; the major elements (large sheets and helices) are present and conserved in both cases. Visually, the reference appears to be a bit more stable than CM994594. This is what one would expect, considering the higher energies in the case of CM994594. |
||
+ | |||
+ | [[File:Ref rmsd matrix.png|300px|thumb|Reference structure RMSD matrix (main chain + Cb)]] |
||
+ | [[File:Cm rmsd matrix.png|300px|thumb|CM mutation structure RMSD matrix (main chain + Cb)]] |
||
+ | |||
+ | |||
+ | ====What is interesting by choosing the group "Mainchain+Cb" for this analysis? Think about the different proteins used for this practical.==== |
||
+ | Mainchain+Cb means that we're using the coordinates of the C beta atoms in addition to those of the atoms in the backbone; since the differences between the structures we used lie in mutated residues, the C beta atoms contain valuable information for our analysis. |
||
+ | |||
+ | |||
+ | ====How many transitions do you see?==== |
||
+ | We see seven transitions following the time axis from the starting structure onward: from blue (very small) to green, then to yellow, on to orange, back to yellow, back to green, and finally to yellow and orange again. |
||
+ | |||
+ | |||
+ | ====What can you conclude from this analysis? Could you expect such a result, justify?==== |
||
+ | From the starting structure on, the RMSD between it and the current structure stays small for a short time, climbs to a peak of around 1 nm, then decreases and climbs back up; this is consistent with a roughly oscillating motion of large parts of the protein. This indeed appears to be the case. |
||
+ | |||
+ | The large separation of the two domains should play a role here, too, markedly increasing the RMSD values. |
||
+ | |||
+ | |||
+ | ====How many clusters were found and what were the sizes of the largest two? ( T )==== |
||
+ | * '''Reference:''' 74 clusters; the two largest contain 27 and 20 structures, respectively. |
||
+ | * '''CM994594:''' 169 clusters; the two largest contain 34 and 22 structures, respectively. |
||
+ | |||
+ | |||
+ | ====Are there notable differences between the two structures?==== |
||
+ | * '''Reference:''' A slight dislocation of the important helices is visible between the two structures, but it appears to be fairly uniform in terms of distance and type of dislocation. |
||
+ | * '''CM994594:''' Most of the major helices are extremely dislocated, in some cases even rotated. The same is true for the important beta sheets. |
||
+ | |||
+ | |||
+ | ====At what time and value does the dRMSD converge and how does this graph compare to the standard RMSD?==== |
||
+ | |||
+ | For the reference structure, convergence is at around 1500 ps and a value of 1.7; for CM, the corresponding figures are 4000 ps and 1.8. |
||
+ | |||
+ | The standard RMSD graphs oscillate far more strongly; they also tend to converge at a local minimum (global in some cases), whereas dRMSD converges around a local maximum. |
||
+ | |||
+ | [[File:Ref drmsd.png|300px|thumb|dRMSD for reference structure]] |
||
+ | [[File:Cm drmsd.png|300px|thumb|dRMSD for CM mutation structure]] |
Revision as of 06:42, 4 November 2011
Contents
- 1 A brief check of results
- 2 Visualization of results
- 2.1 What happens if the protein diffuses over the boundary of the box?
- 2.2 What is the average temperature and what is the heat capacity of the system?
- 2.3 What are the terms plotted in the files energy.xvg and box.xvg
- 2.4 Estimate the plateau values for the pressure, the volume and the density.
- 2.5 What are the terms plotted in the files coulomb-inter.xvg and vanderwaals-inter.xvg
- 2.6 What happens if the minimal distance becomes shorter than the cut-off distance used for electrostatic interactions? Is it the case in your simulations?
- 2.7 Run now g_mindist on the C-alpha group, does it change the results? What does it mean for your system?
- 2.8 Indicate the start and end residue for the most flexible regions and the maximum amplitudes
- 2.9 Compare the results from the different proteins. Are there differences? If yes, which is the most flexible and which least?
- 2.10 If observed, at what time and value does the RMSD reach a plateau?
- 2.11 Briefly discuss two differences between the graphs against the starting structure and against the average structure. Which is a better measure for convergence?
- 2.12 At what time and value does the radius of gyration converge?
- 3 Structural Analysis: Properties Derived From Configurations
- 3.1 Discuss some of the changes in the secondary structure, if any.
- 3.2 Compare the stability of the secondary structures in the different proteins.
- 3.3 What is interesting by choosing the group "Mainchain+Cb" for this analysis? Think about the different proteins used for this practical.
- 3.4 How many transitions do you see?
- 3.5 What can you conclude from this analysis? Could you expect such a result, justify?
- 3.6 How many clusters were found and what were the sizes of the largest two? ( T )
- 3.7 Are there notable differences between the two structures?
- 3.8 At what time and value does the dRMSD converge and how does this graph compare to the standard RMSD?
A brief check of results
How many frames are in the trajectory file and what is the time resolution?
2000 frames at a resolution of 5 picoseconds per frame.
How long did the simulation run in real time (hours), what was the simulation speed (ns/day) and how many years would the simulation take to reach a second?
Reference structure: Real time 1d04h28:22, speed 8.429 ns/day, 325035.7 years for one full second
CM994594 Mutation: Real time 1d04h51:42, speed 8.316 ns/day, 329452.3 years for one full second
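The figures above follow from simple arithmetic, which can be reproduced with the short sketch below. The function names are our own, and the year length of 365 days is an assumption chosen because it matches the quoted numbers.

```python
NS_PER_SECOND = 1e9
DAYS_PER_YEAR = 365  # assumption: reproduces the figures quoted above


def total_simulated_ns(frames: int, ps_per_frame: float) -> float:
    """Simulated time covered by the trajectory, in nanoseconds."""
    return frames * ps_per_frame / 1000.0


def years_to_simulate_one_second(ns_per_day: float) -> float:
    """Wall-clock years needed to simulate one second at the given speed."""
    days = NS_PER_SECOND / ns_per_day
    return days / DAYS_PER_YEAR


print(total_simulated_ns(2000, 5))                     # 10.0 (ns)
print(round(years_to_simulate_one_second(8.429), 1))   # reference structure
print(round(years_to_simulate_one_second(8.316), 1))   # CM994594
```

Dividing the 10 ns of simulated time by the reported speed also gives back a wall-clock time of roughly 28.5 hours, consistent with the real times listed.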
Visualization of results
Reference structure:
CM mutated structure:
In both cases, the two domains separated and ended up at quite a large distance from each other:
We tried to remedy this by improving the equilibration step of the system, but could not solve this issue despite our efforts. As can be seen clearly, neither domain stays within the simulation box; this certainly has some effect on the results of this analysis.
What happens if the protein diffuses over the boundary of the box?
When a protein leaves the box through, say, the upper face, periodic boundary conditions make it re-enter through the bottom face; in the unprocessed trajectory this looks like a hole in the solvent where the protein used to be.
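The re-entry described above amounts to wrapping each coordinate back into the box. A minimal sketch, assuming a rectangular box with its origin at zero (the box size and atom position are made-up illustration values, not taken from our runs):

```python
def wrap_coordinate(x: float, box_length: float) -> float:
    """Map a coordinate back into [0, box_length) under periodic boundaries."""
    return x % box_length


def wrap_position(position, box):
    """Wrap an (x, y, z) position into a rectangular box with edge lengths `box`."""
    return tuple(wrap_coordinate(x, length) for x, length in zip(position, box))


box = (10.0, 10.0, 10.0)   # hypothetical box edge lengths in nm
atom = (4.0, 5.0, 10.5)    # atom that has drifted past the upper z face
print(wrap_position(atom, box))  # (4.0, 5.0, 0.5)
```

Because every atom is wrapped individually, a molecule sitting on the boundary can appear split between opposite faces even though it is intact.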
What is the average temperature and what is the heat capacity of the system?
The average temperature of the system is:
Reference Structure:
Energy                      Average   Err.Est.       RMSD  Tot-Drift
-------------------------------------------------------------------------------
Temperature                 297.929     0.0024   0.902235 0.00656339  (K)

Heat capacity at constant pressure Cp: 2.62838e+06 J/mol K
CM994594:
Energy                      Average   Err.Est.       RMSD  Tot-Drift
-------------------------------------------------------------------------------
Temperature                 297.926     0.0059   0.901054 0.00425121  (K)

Heat capacity at constant pressure Cp: 2.62312e+06 J/mol K
What are the terms plotted in the files energy.xvg and box.xvg
energy.xvg: We plot the energy values of kinetic energy, potential energy and total energy over the time of our simulation.
box.xvg: We plot the size of the box around our protein which is given in nm.
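The .xvg files are plain text: lines starting with `#` are comments, lines starting with `@` are plotting directives, and the remaining lines hold whitespace-separated numeric columns. The sketch below reads such a file and computes the mean and spread of one column; the helper names and the sample data are hypothetical and not taken from our runs.

```python
from statistics import fmean, pstdev


def parse_xvg(text):
    """Return the numeric rows of an .xvg file, skipping '#' and '@' lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        rows.append([float(value) for value in line.split()])
    return rows


def column_stats(rows, column):
    """Mean and population standard deviation of one column."""
    values = [row[column] for row in rows]
    return fmean(values), pstdev(values)


sample = """\
# hypothetical excerpt for illustration only
@    title "Temperature"
0.0  297.0
5.0  299.0
10.0 298.0
"""

rows = parse_xvg(sample)
mean, spread = column_stats(rows, 1)
print(mean, spread)
```

The first column is the time axis, so column 1 is the first plotted quantity.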
Estimate the plateau values for the pressure, the volume and the density.
Reference structure:
- Pressure: upper plateau: 226 bar, lower plateau: -210 bar
- Volume: upper plateau: 1077 nm^3, lower plateau: 1073 nm^3
- Density: upper plateau: 1008 kg/m^3, lower plateau: 1004 kg/m^3
CM994594 mutated structure:
- Pressure: upper plateau: 230 bar, lower plateau: -230 bar
- Volume: upper plateau: 1076 nm^3, lower plateau: 1072 nm^3
- Density: upper plateau: 1008 kg/m^3, lower plateau: 1005 kg/m^3
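As a plausibility check on these estimates: the mass in the box is fixed, so volume times density should stay roughly constant. The sketch below pairs the upper volume plateau with the lower density plateau and vice versa; this pairing is our assumption, and the function names are our own.

```python
NM3_TO_M3 = 1e-27  # one cubic nanometre in cubic metres


def mass_kg(volume_nm3: float, density_kg_m3: float) -> float:
    """Total mass in the box implied by a (volume, density) pair."""
    return volume_nm3 * NM3_TO_M3 * density_kg_m3


def relative_spread(pairs) -> float:
    """Relative difference between the largest and smallest implied mass."""
    masses = [mass_kg(volume, density) for volume, density in pairs]
    return (max(masses) - min(masses)) / min(masses)


# (volume in nm^3, density in kg/m^3), plateau values from the lists above
plateaus = {
    "reference": ((1077, 1004), (1073, 1008)),
    "CM994594": ((1076, 1005), (1072, 1008)),
}

for name, pairs in plateaus.items():
    print(name, f"{relative_spread(pairs):.2%}")
```

For both systems the implied masses agree to within a tenth of a percent, so the volume and density plateaus are consistent with each other.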
What are the terms plotted in the files coulomb-inter.xvg and vanderwaals-inter.xvg
The Coulomb and van der Waals energies of the protein over time.
What happens if the minimal distance becomes shorter than the cut-off distance used for electrostatic interactions? Is it the case in your simulations?
The total energy would increase dramatically; this appears not to have happened in our simulations.
Run now g_mindist on the C-alpha group, does it change the results? What does it mean for your system?
- CM994594: The shortest periodic distance is 0.357548 (nm) at time 1565 (ps), between atoms 4791 and 8299
- Reference Structure: The shortest periodic distance is 0.384713 (nm) at time 7955 (ps), between atoms 3061 and 10049
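The quantity reported above is the shortest distance between any atom of the group and any periodic image of the group. A brute-force sketch of that idea follows; it assumes a rectangular box, checks only the 26 neighbouring images, and is not the algorithm g_mindist itself uses. The coordinates are made-up illustration values.

```python
from itertools import product
from math import sqrt


def min_periodic_distance(coords, box):
    """Shortest distance between any atom and any periodic image of the group.

    Assumes a rectangular box with edge lengths `box` and considers only the
    26 directly neighbouring images.
    """
    shifts = [s for s in product((-1, 0, 1), repeat=3) if s != (0, 0, 0)]
    best_squared = float("inf")
    for a in coords:
        for b in coords:
            for s in shifts:
                d2 = sum((a[k] - (b[k] + s[k] * box[k])) ** 2 for k in range(3))
                best_squared = min(best_squared, d2)
    return sqrt(best_squared)


box = (5.0, 5.0, 5.0)                          # hypothetical box in nm
coords = [(0.5, 2.5, 2.5), (4.0, 2.5, 2.5)]    # two atoms near opposite faces
print(min_periodic_distance(coords, box))      # 1.5
```

Restricting `coords` to the C-alpha atoms can only leave this distance unchanged or make it larger, since fewer atoms are considered.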
Indicate the start and end residue for the most flexible regions and the maximum amplitudes
- CM994594: Most flexible regions at 50-110, 210-280; maximum amplitude 1.0 (from 1.4 to 2.4)
- Reference: Most flexible regions at 80-90, 230-250, 260-270; maximum amplitude 0.4 (from 1.9 to 2.3)
Compare the results from the different proteins. Are there differences? If yes, which is the most flexible and which least?
CM994594 appears to be less flexible than the reference: the reference reaches its maximum far more often, has better-defined maxima, and has a higher average.
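For reference, the per-atom RMSF is the root of the mean squared deviation of each atom from its time-averaged position. A minimal sketch, assuming the frames have already been superimposed and using made-up coordinates:

```python
from math import sqrt


def rmsf(frames):
    """Per-atom RMSF; frames[t][i] is the (x, y, z) position of atom i in frame t."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that average
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in frames
        ) / n_frames
        result.append(sqrt(msd))
    return result


frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
print(rmsf(frames))  # [0.0, 1.0]
```

The first atom never moves and gets an RMSF of zero; the second swings by one unit around its mean position.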
If observed, at what time and value does the RMSD reach a plateau?
- CM994594: the RMSD converges to around 4.2 nm at 6000 ps
- Reference: the RMSD converges to around 5.04 nm at 8000 ps
Briefly discuss two differences between the graphs against the starting structure and against the average structure. Which is a better measure for convergence?
The starting-structure graphs oscillate far more strongly and do not show any indication of convergence. The average-structure graphs are far more indicative. This makes sense, because the average structure should reflect the converging structures better than the starting structure does.
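The difference between the two reference choices can be illustrated with a toy trajectory. The sketch below uses an unweighted RMSD without any fitting step, which is a simplification of what the analysis tools do; the coordinates are made up.

```python
from math import sqrt


def rmsd(a, b):
    """Unweighted RMSD between two coordinate sets of equal length (no fitting)."""
    squared = sum(sum((p[k] - q[k]) ** 2 for k in range(3)) for p, q in zip(a, b))
    return sqrt(squared / len(a))


def average_structure(frames):
    """Per-atom mean position over all frames."""
    n = len(frames)
    return [
        tuple(sum(frame[i][k] for frame in frames) / n for k in range(3))
        for i in range(len(frames[0]))
    ]


# One atom drifting steadily along x (hypothetical data)
frames = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)], [(4.0, 0.0, 0.0)]]

vs_start = [rmsd(frame, frames[0]) for frame in frames]
vs_average = [rmsd(frame, average_structure(frames)) for frame in frames]
print(vs_start)    # [0.0, 2.0, 4.0]
print(vs_average)  # [2.0, 0.0, 2.0]
```

Against the start, a drifting structure only moves further away; against the average, the values stay bounded, which makes a plateau easier to spot.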
At what time and value does the radius of gyration converge?
- CM994594: Convergence at about 3600 ps, around 5 nm.
- Reference: Convergence at about 5500 ps, around 6 nm.
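The radius of gyration is the mass-weighted root mean square distance of the atoms from their centre of mass. A minimal sketch with made-up coordinates and masses:

```python
from math import sqrt


def radius_of_gyration(coords, masses):
    """Mass-weighted RMS distance of the atoms from their centre of mass."""
    total_mass = sum(masses)
    centre = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass for k in range(3)
    ]
    weighted = sum(
        m * sum((c[k] - centre[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return sqrt(weighted / total_mass)


coords = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]   # hypothetical positions in nm
masses = [12.0, 12.0]                          # hypothetical masses
print(radius_of_gyration(coords, masses))      # 1.0
```

Two separated domains increase every atom's distance from the common centre of mass, so the domain separation noted earlier inflates this value.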
Structural Analysis: Properties Derived From Configurations
Discuss some of the changes in the secondary structure, if any.
Reference:
At around residues 60 to 75, a number of residues constantly switch back and forth between turn and alpha helix. The same can be seen at around 370-375. In both cases, these are followed by a turn/bend structure and a short alpha helix element. The separation from the helix by a turn/bend stretch is somewhat unexpected; a continuous alpha helix would appear to make more sense.
Also, at around 230, a stretch of turns and bends disintegrates into coils over time. This is interesting, since there is no alpha helix nearby and there appears to be little conserving pressure on these residues' secondary structure.
CM994594:
At around 380, a stretch of residues alternates between alpha helix and turn; since it would form a very short alpha helix, this appears to be an artifact of DSSP's assignment method.
Compare the stability of the secondary structures in the different proteins.
The reference has slightly better-defined structure elements with less fuzzy boundaries; the major elements (large sheets and helices) are present and conserved in both cases. Visually, the reference appears to be a bit more stable than CM994594. This is what one would expect, considering the higher energies in the case of CM994594.
What is interesting by choosing the group "Mainchain+Cb" for this analysis? Think about the different proteins used for this practical.
Mainchain+Cb means that we're using the coordinates of the C beta atoms in addition to those of the atoms in the backbone; since the differences between the structures we used lie in mutated residues, the C beta atoms contain valuable information for our analysis.
How many transitions do you see?
We see seven transitions following the time axis from the starting structure onward: from blue (very small) to green, then to yellow, on to orange, back to yellow, back to green, and finally to yellow and orange again.
What can you conclude from this analysis? Could you expect such a result, justify?
From the starting structure on, the RMSD between it and the current structure stays small for a short time, climbs to a peak of around 1 nm, then decreases and climbs back up; this is consistent with a roughly oscillating motion of large parts of the protein. This indeed appears to be the case.
The large separation of the two domains should play a role here, too, markedly increasing the RMSD values.
How many clusters were found and what were the sizes of the largest two? ( T )
- Reference: 74 clusters; the two largest contain 27 and 20 structures, respectively.
- CM994594: 169 clusters; the two largest contain 34 and 22 structures, respectively.
Are there notable differences between the two structures?
- Reference: A slight dislocation of the important helices is visible between the two structures, but it appears to be fairly uniform in terms of distance and type of dislocation.
- CM994594: Most of the major helices are extremely dislocated, in some cases even rotated. The same is true for the important beta sheets.
At what time and value does the dRMSD converge and how does this graph compare to the standard RMSD?
For the reference structure, convergence is at around 1500 ps and a value of 1.7; for CM, the corresponding figures are 4000 ps and 1.8.
The standard RMSD graphs oscillate far more strongly; they also tend to converge at a local minimum (global in some cases), whereas dRMSD converges around a local maximum.
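Unlike the standard RMSD, the dRMSD compares internal pair distances and therefore needs no superposition of the structures. A minimal unweighted sketch with made-up coordinates:

```python
from itertools import combinations
from math import dist, sqrt


def pair_distances(coords):
    """All intramolecular pair distances, in a fixed pair order."""
    return [dist(a, b) for a, b in combinations(coords, 2)]


def drmsd(a, b):
    """RMS difference between the pair-distance lists of two structures."""
    da, db = pair_distances(a), pair_distances(b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(da, db)) / len(da))


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
translated = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in reference]
stretched = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

print(drmsd(reference, translated))  # rigid shift: no change in pair distances
print(drmsd(reference, stretched))   # internal change: non-zero dRMSD
```

A rigid translation or rotation of the whole molecule leaves the dRMSD at zero, which is one reason its curve can look smoother than the standard RMSD.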