MD WildType
check the trajectory
We checked the trajectory with the following command:
gmxcheck -f 2GJX_A_md.xtc
The command gave the following results:
Reading frame 0 time 0.000
# Atoms 96543
Precision 0.001 (nm)
Last frame 2000 time 10000.000
Furthermore, we got some detailed results about the different items during the simulation.
Item | #frames | Timestep (ps) |
Step | 2001 | 5 |
Time | 2001 | 5 |
Lambda | 0 | - |
Coords | 2001 | 5 |
Velocities | 0 | - |
Forces | 0 | - |
Box | 2001 | 5 |
The simulation finished on node 0 Thu Sep 15 23:45:08 2011
Time | ||
Node (s) | Real (s) | % |
22438.875 | 22438.875 | 100% |
6h13:58 |
The complete simulation took about 6 hours and 14 minutes (6h13:58) to finish.
Performance | |||
Mnbf/s | GFlops | ns/day | hour/ns |
1271.745 | 93.383 | 38.505 | 0.623 |
As you can see in the table above, it takes about 0.62 hours (roughly 37 minutes) to simulate 1 ns of the system. Therefore, it would be possible to simulate about 38.5 ns in one complete day of calculation time.
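The hour/ns and ns/day columns can be reproduced from the wall-clock time and the simulated length. The following is a minimal sketch of that arithmetic, assuming the 10000 ps trajectory length reported by gmxcheck above; the variable names are our own and not GROMACS output.

```python
# Values copied from the tables above.
wall_seconds = 22438.875        # real time of the run in seconds
simulated_ns = 10000.0 / 1000   # last frame at 10000 ps, i.e. 10 ns

# Derived performance numbers.
hours_per_ns = wall_seconds / 3600 / simulated_ns
ns_per_day = 24 / hours_per_ns
minutes_per_ns = hours_per_ns * 60

print(f"{hours_per_ns:.3f} hour/ns")
print(f"{ns_per_day:.3f} ns/day")
print(f"{minutes_per_ns:.1f} min/ns")
```

The results should agree with the performance table to within rounding.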
Visualize in pymol
First of all, we visualized the simulation with ngmx, because it draws bonds based on the topology file. ngmx lets the user choose different parameters. We decided to visualize the system with the following parameters:
Group 1 | Group 2 |
System | Water |
Protein | Ion |
Backbone | NA |
MainChain+H | CL |
SideChain |
Figure 1 shows the visualization with ngmx:
Create a movie
Next, we want to visualize the protein with pymol. To do so, we extracted 1000 frames from the trajectory, leaving out the water and removing jumps over the periodic boundaries to obtain continuous trajectories. We used the following command:
trjconv -s file.tpr -f file.xtc -o output_file.pdb -pbc nojump -dt 10
The program asks for a group to write as output. We only want to see the protein, so we chose group 1.
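To see how many frames such a command writes, the frame selection can be sketched as follows. This is only an illustration: it assumes that -dt keeps exactly those frames whose time is a multiple of the given value (counted from the first frame at 0 ps) and uses the 5 ps frame spacing and 10000 ps end time reported by gmxcheck.

```python
frame_step_ps = 5      # spacing of the frames in the .xtc file (from gmxcheck)
last_time_ps = 10000   # time of the last frame
dt_ps = 10             # value passed to -dt

# All frame times in the original trajectory.
times = range(0, last_time_ps + 1, frame_step_ps)

# Frames that survive the subsampling (assumption: time must be a multiple of dt).
kept = [t for t in times if t % dt_ps == 0]

print(len(times), "frames in the trajectory")
print(len(kept), "frames written with -dt", dt_ps)
```

Under these assumptions every second frame is kept, which gives roughly 1000 frames (the frame at 0 ps is counted as well).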
Todo: create the movie and make the filtered version
energy calculations for pressure, temperature, potential and total energy
Temperature
Average (in K) | 297.94 |
Error Estimation | 0.0029 |
RMSD | 0.942857 |
Tot-Drift | 0.0066475 |
The plot with the temperature distribution of the system can be seen here:
As you can see in Figure 2, the system has a temperature of about 298 K most of the time. The maximal difference between this average temperature and the minimum/maximum temperature is only about 4 K, which is not that high. But we have to keep in mind that a difference of only a few degrees can destroy the function of a protein. 298 K is about 25°C, which is relatively cold for a protein to work, because the temperature in our bodies is about 37°C.
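The average, RMSD and total drift in the table can in principle be recomputed from the exported time series. The following sketch shows one way to do this, assuming a two-column .xvg file (time, value) and assuming that the drift is taken from a least-squares straight line through the data. The function names and the toy numbers are our own; the toy data are made up and are not from our simulation.

```python
import math

def read_xvg(text):
    """Parse two-column .xvg text, skipping '#' and '@' header lines."""
    times, values = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        cols = line.split()
        times.append(float(cols[0]))
        values.append(float(cols[1]))
    return times, values

def series_stats(times, values):
    """Return (average, RMS fluctuation, total drift) of a time series.

    The drift is the slope of a least-squares line times the covered time span.
    """
    n = len(values)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    rms = math.sqrt(sum((v - mean_v) ** 2 for v in values) / n)
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    drift = sxy / sxx * (times[-1] - times[0])
    return mean_v, rms, drift

# Made-up example data, NOT taken from our simulation.
example = """# toy data
@ title "Temperature"
0 297.0
5 299.0
10 298.0
15 300.0
"""
t, v = read_xvg(example)
avg, rms, drift = series_stats(t, v)
print(avg, rms, drift)
```

The same function could be applied to the potential energy, total energy and pressure series.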
Potential
Average (in kJ/mol) | -1.2815e+06 |
Error Estimation | 100 |
RMSD | 1078.55 |
Tot-Drift | -661.902 |
The plot with the potential energy distribution of the system can be seen here:
As can be seen in Figure 3, the potential energy of the system lies between -1.285e+06 and -1.278e+06 kJ/mol, which is a relatively low energy. This suggests that the protein is stable. A protein with such a low energy should be able to function, and therefore our simulation could be realistic. Otherwise, if the energy of the simulated system were too high, we could not trust the results, because the protein would be too unstable to work.
Total energy
Average (in kJ/mol) | -1.05177e+06 |
Error Estimation | 100 |
RMSD | 1321.31 |
Tot-Drift | -656.777 |
The plot with the total energy distribution of the system can be seen here:
As we can see in Figure 4, the total energy is a little higher than the potential energy. In this case, the energy lies between -1.055e+06 and -1.048e+06 kJ/mol. These values are still in a range where we can suggest that the energy of the protein is low enough for it to work.
Pressure
Average (in bar) | 1.00711 |
Error Estimation | 0.0087 |
RMSD | 71.2473 |
Tot-Drift | -0.0454746 |
The plot with the pressure distribution of the system can be seen here:
As you can see in Figure 5, the pressure in the system fluctuates around 0 bar most of the time (the average is 1.007 bar), but there are large outliers of about 250 bar and about -200 bar. Therefore, we are not sure whether a protein can work at such pressures.
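The table shows a large RMSD (about 71 bar) together with a very small error estimate for the average (0.0087 bar). One common way to estimate the error of an average from a fluctuating series is block averaging. The sketch below is our own simplified version with five blocks and is not claimed to be the exact algorithm used by the GROMACS energy tool; the toy values are made up.

```python
import math

def block_error(values, n_blocks=5):
    """Standard error of the mean estimated from n_blocks block averages.

    Values that do not fit into n_blocks equally sized blocks are ignored.
    """
    size = len(values) // n_blocks
    blocks = [
        sum(values[i * size:(i + 1) * size]) / size for i in range(n_blocks)
    ]
    mean = sum(blocks) / n_blocks
    var = sum((b - mean) ** 2 for b in blocks) / (n_blocks - 1)
    return math.sqrt(var / n_blocks)

# Made-up pressure-like values, NOT taken from our simulation.
toy = [1, 3, 2, 4, 0, 2, 3, 5, 1, 1]
err = block_error(toy)
print(err)
```

With many frames, the block averages lie much closer together than the individual values, so the error of the average can be small even when single frames deviate by hundreds of bar.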
minimum distance between periodic boundary cells
Next, we calculated the minimum distance between periodic boundary cells. As before, the program asks for one group to use for the calculation. We decided to use only the protein (group 1), because the calculation takes a lot of time and the whole system is significantly bigger than the protein alone.
Here you can see the result of this analysis.
As you can see in Figure 6, the distance varies strongly between the different time steps. The highest distance is up to 4 nm, whereas the smallest distance is only about 1 nm. Therefore, we can see that the protein is very flexible over time.
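What this analysis measures can be illustrated with a small brute-force sketch: for one frame, it finds the smallest distance between any atom of the group and any periodic image of the group. The sketch assumes a rectangular box and only the 26 neighbouring images; the function name and the toy coordinates are our own, and the box shape of our actual simulation is not checked here.

```python
import itertools
import math

def min_image_distance(coords, box):
    """Smallest distance between any atom and any periodic image of any atom.

    coords: list of (x, y, z) in nm; box: (lx, ly, lz) of a rectangular box.
    """
    # All 26 shifts to neighbouring cells (the zero shift is the cell itself).
    shifts = [
        s for s in itertools.product((-1, 0, 1), repeat=3) if s != (0, 0, 0)
    ]
    best = math.inf
    for a in coords:
        for b in coords:
            for s in shifts:
                image = [b[k] + s[k] * box[k] for k in range(3)]
                best = min(best, math.dist(a, image))
    return best

# Toy example: two atoms in a 5 nm cubic box (made-up numbers).
atoms = [(0.5, 2.5, 2.5), (4.0, 2.5, 2.5)]
d = min_image_distance(atoms, (5.0, 5.0, 5.0))
print(d)
```

The loop scales with the square of the number of atoms, which matches our experience that the calculation is slow for large groups.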
TODO
RMSF for protein and C-alpha
Protein
First of all, we calculate the RMSF for the whole protein.
The analysis produces two different pdb files: one file with the average structure of the protein and one file with high B-factor values, which means that the highly flexible regions of the protein are not in accordance with the original PDB file.
To compare the structures, we aligned them with the original structure in pymol.
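The relation between RMSF and B-factor can be sketched as follows. The RMSF of an atom is the root of its mean squared deviation from its average position, and the usual conversion is B = 8π²/3 · RMSF². The sketch assumes frames that are already fitted to a reference, coordinates in nm and B-factors in Å²; the function names and the toy trajectory are our own.

```python
import math

def rmsf_per_atom(frames):
    """frames: list of frames, each a list of (x, y, z) in nm, already fitted.

    Returns one RMSF value (nm) per atom.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Average position of atom i over all frames.
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that average position.
        msf = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msf))
    return result

def b_factor(rmsf_nm):
    """B-factor in square angstrom from an RMSF given in nm."""
    rmsf_angstrom = rmsf_nm * 10.0
    return 8.0 * math.pi ** 2 / 3.0 * rmsf_angstrom ** 2

# Toy trajectory: one rigid atom and one atom oscillating by 0.1 nm in x.
toy = [
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.9, 0.0, 0.0)],
]
rmsf = rmsf_per_atom(toy)
print(rmsf, [b_factor(r) for r in rmsf])
```

Because the B-factor grows with the square of the RMSF, flexible regions stand out more strongly in the B-factor coloring than in the RMSF plot.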
original & average | original & B-Factors | average & B-Factors |
Perspective one | ||
Perspective two | ||
RMSD | ||
1.556 | 0.349 | 1.684 |
The structure with the high B-factors is the most similar structure (Figure 8 and Figure 11) compared with the original structure from the PDB (Figure 7 and Figure 10). The average structure is not that similar (Figure 10 and Figure 12). But we know that the regions with high B-factors are very flexible, so the protein in the structure downloaded from the PDB is in another state because of its flexible regions. The low RMSD between the high B-factor structure and the original structure shows that the simulation predicts the structure quite well.
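The RMSD values in the table come from pymol. As a reminder of what the number means, the following minimal sketch computes the RMSD between two coordinate sets that are assumed to be already superposed and to contain the same atoms in the same order. It does not perform the superposition or any outlier rejection that pymol may apply, so it will not reproduce the pymol values exactly; the toy coordinates are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long, already superposed coordinate lists."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures need the same number of atoms")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Toy coordinates in angstrom (made up for illustration).
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, -1.0)]
value = rmsd(ref, model)
print(value)
```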
Furthermore, we got a plot of the RMSF values of the protein, which can be seen in Figure 13:
There are a lot of regions with high B-factor values. The highest B-factor value can be found at position 150 (Figure 14), but there are also high values at positions 110 (Figure 13), 290 (Figure 16), 320 (Figure 17), 410 (Figure 18), 460 (Figure 19) and 490 (Figure 20). If we compare the pictures of the original and the average structure, we can see that most of the regions align very well, whereas some regions vary in their position. Therefore, we want to check whether these are the regions with very high B-factor values.
As we can see in the pictures above (Figure 14 - Figure 20), there is always a small difference between the two structures. Therefore, these regions seem to be flexible.
Furthermore, we visualized the B-factors with the pymol B-factor selection method. We calculated the B-factors for the blue protein (Figure 21 - Figure 27). Red marks parts of the protein that are very flexible. The brighter the color, the higher the flexibility of the residue.
In Figure 21 and Figure 22, the color of the protein is turquoise, which indicates a relatively high B-factor value. Figure 23 is completely dark blue, so the plot seems to be wrong at this position. The pictures which show the center of the protein (Figure 24 - Figure 26) also show only a dark blue protein, which means a low B-factor value, so either the plot or the picture is wrong. The last picture (Figure 27) shows that there is a high B-factor value in the chain.
C-alpha
Now we repeat the analysis done for the whole protein for the C-alpha atoms only, following the same steps as in the section above.
To compare the structures, we aligned them with the original structure in pymol.
original & average | original & B-Factors | average & B-Factors |
Perspective one | ||
Perspective two | ||
RMSD | ||
1.373 | 0.279 | - |
As in the section above, the structure with high B-factor values is the most similar to the original structure (Figure 19 and Figure 22). This was expected, because we used the same model twice, but in this case we neglected the side-chain atoms. The backbone of the protein remains the same. The other two models (Figure 18, Figure 20, Figure 21 and Figure 23) have nearly the same RMSD value and are therefore equally similar.
Furthermore, we got a plot of the RMSF values of the protein, which can be seen in Figure 24:
In this case, there are only three high peaks, at positions 150, 280 and 310. These peaks can also be found in Figure 13. Furthermore, it was possible to observe these high B-factor values in the pictures.