MD Mutation485

From Bioinformatikpedia
Revision as of 15:09, 19 September 2011

check the trajectory

We checked the trajectory with the following command:

gmxcheck -f mut436_md.xtc

The command gave the following results:

Reading frame       0 time    0.000   
# Atoms  96545
Precision 0.001 (nm)
Last frame       2000 time 10000.000 

Furthermore, we got some detailed results about the different items during the simulation.

Item #frames Timestep (ps)
Step 2001 5
Time 2001 5
Lambda 0 -
Coords 2001 5
Velocities 0 -
Forces 0 -
Box 2001 5

The simulation finished on node 0 Thu Sep 15 19:12:47 2011.

Time
Node (s) Real (s) %
22336.000 22336.000 100%
6h12:00

The complete simulation took 6 hours and 12 minutes to finish.

Performance
Mnbf/s GFlops ns/day hour/ns
1277.617 93.808 38.682 0.620

As the table above shows, it takes about 0.62 hours (roughly 37 minutes) to simulate 1 ns of the system. It would therefore be possible to simulate about 38.7 ns in one full day of calculation time.
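The numbers reported above can be cross-checked with a few lines of arithmetic. The following is only a small sketch using the values printed by gmxcheck and in the performance table; the variable names are chosen for illustration.

```python
# Values copied from the gmxcheck output and the performance table above.
n_frames = 2001        # number of frames in the trajectory
timestep_ps = 5.0      # time between two saved frames (ps)
hours_per_ns = 0.620   # reported performance (hour/ns)

# Simulated time: the first frame is at t = 0, so there are n_frames - 1 intervals.
simulated_ns = (n_frames - 1) * timestep_ps / 1000.0

# Throughput per day and expected wall time for the whole run.
ns_per_day = 24.0 / hours_per_ns
wall_hours = simulated_ns * hours_per_ns

hours = int(wall_hours)
minutes = round((wall_hours - hours) * 60)

print(f"simulated time: {simulated_ns:.1f} ns")
print(f"throughput:     {ns_per_day:.1f} ns/day")
print(f"wall time:      {hours} h {minutes} min")
```

The throughput computed this way differs slightly from the reported 38.682 ns/day, because the hour/ns value in the table is rounded to three digits.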

Visualize in pymol

First of all, we visualized the simulation with ngmx, because it draws bonds based on the topology file. ngmx lets the user choose between different display groups. We decided to visualize the system with the following parameters:

Group 1 Group 2
System Water
Protein Ion
Backbone NA
MainChain+H CL
SideChain

Figure 1 shows the visualization with ngmx:

Figure 1: Visualisation of the MD simulation for Mutation 436 with ngmx

create a movie

Next, we want to visualize the protein with PyMOL. To do so, we extracted 1000 frames from the trajectory, leaving out the water and removing jumps across the periodic boundaries to obtain continuous trajectories. We used the following command:

trjconv -s file.tpr -f file.xtc -o output_file.pdb -pbc nojump -dt 10

The program asks for a group to write as output. We only want to see the protein, so we chose group 1.
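The effect of the -dt option on the number of extracted frames can be estimated as follows. This is a rough sketch, assuming that a frame is written only when its time is a multiple of the -dt value counted from the first frame; the function name and its parameters are hypothetical and not part of GROMACS.

```python
def frames_written(t_start_ps, t_end_ps, saved_every_ps, dt_ps):
    """Count the saved frames whose time offset from the first frame is a multiple of dt_ps."""
    times = range(t_start_ps, t_end_ps + 1, saved_every_ps)
    return sum(1 for t in times if (t - t_start_ps) % dt_ps == 0)


# Trajectory from 0 to 10000 ps, saved every 5 ps, extracted with -dt 10.
n_extracted = frames_written(0, 10000, 5, 10)
print(n_extracted)
```

Under this assumption the count is 1001, which matches the roughly 1000 frames mentioned above (the frame at t = 0 is included).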

Todo: movie and filtered

energy calculations for pressure, temperature, potential and total energy

Temperature

Average (in K) 297.936
Error Estimation 0.0045
RMSD 0.940566
Tot-Drift 0.00654126

The plot with the temperature distribution of the system can be seen here:

Figure 2: Plot of the temperature distribution of the MD system.
TODO: still to be revised:

As Figure 2 shows, the system stays at a temperature of about 298 K most of the time. The largest difference between this average and the minimum or maximum temperature is only about 4 K, which is small. Still, we have to keep in mind that even a difference of a few degrees can affect the function of a protein. 298 K is about 25°C, which is relatively cold for a human protein, because the temperature in our bodies is about 37°C.
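The temperature conversion used above can be written out explicitly. This is a minimal sketch; the body temperature of 37 °C is a textbook value assumed here, not a result of the simulation.

```python
def kelvin_to_celsius(t_kelvin):
    """Convert a temperature from Kelvin to degrees Celsius."""
    return t_kelvin - 273.15


avg_temperature_k = 297.936   # average temperature from the table above
body_temperature_c = 37.0     # assumed typical human body temperature

avg_temperature_c = kelvin_to_celsius(avg_temperature_k)
gap = body_temperature_c - avg_temperature_c

print(f"simulation average: {avg_temperature_c:.1f} C")
print(f"below body temperature by: {gap:.1f} K")
```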


Potential

Average (in kJ/mol) -1.28176e+06
Error Estimation 85
RMSD 1068.67
Tot-Drift -536.314

The plot with the potential energy distribution of the system can be seen here:

Figure 3: Plot of the potential energy distribution of the MD system.
TODO: still to be revised:

As Figure 3 shows, the potential energy of the system stays between -1.282e+06 and -1.281e+06 kJ/mol, which is a relatively low energy. The fluctuation is also small: the RMSD of about 1069 kJ/mol is less than 0.1% of the average. Note that this value describes the whole system (protein, water and ions), not the protein alone. A low and stable potential energy suggests that the simulated system is stable, so the results of the simulation can be trusted in this respect. If the energy were too high or drifted strongly, we could not trust the results, because the system would be too unstable.

Total energy

Average (in kJ/mol) -1.05203e+06
Error Estimation 83
RMSD 1308.04
Tot-Drift -531.275

The plot with the total energy distribution of the system can be seen here:

Figure 4: Plot of the total energy distribution of the MD system.
TODO: still to be revised:

As Figure 4 shows, the total energy of the system is somewhat higher than the potential energy, because it also contains the kinetic energy. In this case, the energy lies roughly between -1.051e+06 and -1.05e+06 kJ/mol. These values are still in a range where we can assume that the energy of the system is low enough for the protein to be stable.
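The relation between the two energy tables can be checked numerically. The sketch below assumes that the total energy is the sum of potential and kinetic energy, and uses only the averages and drift values reported above.

```python
# Averages and drift from the tables above (kJ/mol).
potential_avg = -1.28176e6
total_avg = -1.05203e6
total_drift = -531.275

# Kinetic energy follows from total = potential + kinetic.
kinetic_avg = total_avg - potential_avg

# Drift of the total energy relative to its average, as a fraction.
relative_drift = abs(total_drift / total_avg)

print(f"kinetic energy: {kinetic_avg:.0f} kJ/mol")
print(f"relative drift of total energy: {relative_drift:.2%}")
```

A relative drift well below 1% over the whole run is one simple indication that the total energy is stable.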

Pressure

Average (in bar) 0.998385
Error Estimation 0.0058
RMSD 71.0317
Tot-Drift -0.0436306

The plot with the pressure distribution of the system can be seen here:

Figure 5: Plot of the pressure distribution of the MD system.
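TODO: still to be revised:

As Figure 5 shows, the pressure in the system is about 1 bar on average, but there are large outliers of about 250 and -250 bar. Large instantaneous pressure fluctuations are common in MD simulations, and the RMSD of about 71 bar reflects this. The average pressure is therefore more informative than individual outliers, but we are not sure how such fluctuations affect the protein.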
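To put the outliers into perspective, they can be expressed in units of the reported fluctuation. This is a rough sketch that treats the RMSD from the table as the standard deviation of the pressure, which is an assumption; the outlier values of +250 and -250 bar are read off the plot.

```python
pressure_avg = 0.998385   # average pressure (bar)
pressure_rmsd = 71.0317   # reported RMSD (bar), used here as standard deviation


def deviations_from_mean(value_bar):
    """Distance of a pressure value from the average, in units of the RMSD."""
    return (value_bar - pressure_avg) / pressure_rmsd


high = deviations_from_mean(250.0)
low = deviations_from_mean(-250.0)
print(f"+250 bar: {high:.2f} RMSD from the mean")
print(f"-250 bar: {low:.2f} RMSD from the mean")
```

Both outliers lie about 3.5 RMSD away from the average, so they are rare but not extreme given how strongly the pressure fluctuates.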

minimum distance between periodic boundary cells

Next, we calculated the minimum distance between periodic boundary cells. As before, the program asks for a group to use for the calculation. We chose only the protein (group 1), because the calculation takes a long time and the whole system is significantly larger than the protein alone.

Here you can see the result of this analysis:

Figure 6: Plot of the minimum distance between periodic boundary cells.
TODO: still to be revised:

As Figure 6 shows, the minimum distance varies strongly over time. The largest distance is about 4 nm, whereas the smallest is only about 1 nm. This suggests that the protein changes its shape or orientation considerably over time. The smallest values should be compared with the non-bonded cutoff of the simulation, because the protein should not interact with its own periodic image.
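The quantity plotted in Figure 6 can be illustrated with a toy example. The following sketch assumes a rectangular box and a brute-force search over the 26 neighbouring cells; it is not the algorithm used by GROMACS, and the function name and the example coordinates are made up for illustration.

```python
import itertools
import math


def min_image_distance(coords, box):
    """Smallest distance between any atom and any periodic image of any atom.

    coords: list of (x, y, z) tuples in nm; box: (lx, ly, lz) edge lengths in nm.
    Only a rectangular box and the 26 directly neighbouring cells are considered.
    """
    shifts = [s for s in itertools.product((-1, 0, 1), repeat=3) if s != (0, 0, 0)]
    best = math.inf
    for a in coords:
        for b in coords:
            for s in shifts:
                image = tuple(b[k] + s[k] * box[k] for k in range(3))
                best = min(best, math.dist(a, image))
    return best


# Two atoms near opposite faces of a 5 nm cubic box (made-up coordinates).
coords = [(0.5, 0.5, 0.5), (4.5, 0.5, 0.5)]
box = (5.0, 5.0, 5.0)
print(min_image_distance(coords, box))
```

In this example the two atoms are 4 nm apart inside the box, but only 1 nm apart across the periodic boundary, which is the distance that matters for interactions with the periodic image.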


RMSF for protein and C-alpha

Protein

First of all, we calculated the RMSF for the whole protein.

The analysis produces two different PDB files: one with the average structure of the protein and one in which the B-factor column holds values derived from the RMSF, so that highly flexible regions of the protein carry high B-factors.

To compare the structures, we aligned them with the original structure in PyMOL.

Comparison              Perspective one   Perspective two   RMSD
original & average      Figure 7          Figure 10         1.519
original & B-Factors    Figure 8          Figure 11         0.349
average & B-Factors     Figure 9          Figure 12         1.727

Figures 7 and 10: Alignment of the original structure (green) and the calculated average structure (turquoise)
Figures 8 and 11: Alignment of the original structure (green) and the calculated structure with high B-Factor values (turquoise)
Figures 9 and 12: Alignment of the structure with high B-Factor values (red) and the calculated average structure (blue)
TODO: still to be revised:

The structure with the high B-factors is the one most similar to the original structure from the PDB (Figure 8 and Figure 11, RMSD 0.349). The average structure is less similar to the original (Figure 7 and Figure 10, RMSD 1.519). We know that regions with high B-factors are very flexible, so the structure downloaded from the PDB may show these regions in a different state. The low RMSD between the B-factor structure and the original structure suggests that the simulation stays close to the experimental structure. This should be interpreted with care, however, if the B-factor file carries the reference coordinates of the simulation rather than a simulated conformation.


Furthermore, we got a plot of the RMSF values of the protein, which can be seen in Figure 13:

Figure 13: Plot of the RMSF values over the whole protein.
TODO: still to be revised:

There are two regions with very high B-factor values: one around position 150 (Figure 14) and one around position 440 (Figure 15). Comparing the original and the average structure, most regions align very well, whereas some regions differ in their position. We therefore want to check whether these are the regions with very high B-factor values.

Figure 14: Part of the alignment between the original structure and the average structure between residue 140 and 160.
Figure 15: Part of the alignment between the original structure and the average structure between residue 430 and 450.

Furthermore, we visualized the B-factors by coloring the structure by B-factor in PyMOL. The B-factors were calculated for the blue protein (Figure 16 and Figure 17). Red marks the parts of the protein that are very flexible. The brighter the color, the higher the flexibility of the residue.

Figure 16: Part of the alignment between the original structure and the average structure between residue 140 and 160. High B-Factor value -> bright color
Figure 17: Part of the alignment between the original structure and the average structure between residue 430 and 450. High B-Factor value -> bright color
TODO: still to be revised:

In the second picture (Figure 17), the region is colored dark blue. A peak lower than 0.3 in the RMSF plot therefore does not indicate high flexibility. Our protein thus has only one very flexible region, located around residue 150 (residues 140 to 160).
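RMSF values and B-factors are two views of the same fluctuation, linked by the standard relation B = (8π²/3)·RMSF². The following sketch converts an RMSF value into a B-factor, assuming the RMSF is given in nm (the unit of the plot is not stated above) and the B-factor is wanted in Å²; the function name is chosen for illustration.

```python
import math


def rmsf_to_bfactor(rmsf_nm):
    """Convert an RMSF in nm into an isotropic B-factor in square Angstrom."""
    rmsf_angstrom = rmsf_nm * 10.0   # 1 nm = 10 Angstrom
    return 8.0 * math.pi ** 2 / 3.0 * rmsf_angstrom ** 2


# B-factor corresponding to the threshold of 0.3 mentioned above.
b_threshold = rmsf_to_bfactor(0.3)
print(f"RMSF 0.3 nm -> B-factor {b_threshold:.1f} A^2")
```

Because the B-factor grows with the square of the RMSF, a peak that is twice as high in the RMSF plot corresponds to a B-factor four times as large, which is why only the highest peak stands out clearly in the coloring.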


As the pictures above show, especially the first one, which covers the highest peak in the plot, the structures differ strongly in position in this part of the protein and align very badly there, although the rest of the alignment is quite good. The different positions of the flexible parts of the protein also explain the relatively high RMSD values.