Latest revision as of 20:03, 29 September 2011

Wildtype

A brief check of results

Protocol and background information

How many frames are in the trajectory file and what is the time resolution?

frames: 2001
time resolution: 5 ps

How long did the simulation run in real time (hours), what was the simulation speed (ns/day) and how many years would the simulation take to reach a second?

real time: 9h27:35
simulation speed: 25.370 ns/day
time needed to simulate one second: 107991 years
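The three answers above are linked by simple arithmetic, which the following minimal sketch retraces. It assumes that the simulated time is (frames - 1) x time resolution and that a year has 365 days; the function names are hypothetical and not part of any simulation package.

```python
def parse_wall_time(text):
    """Convert a wall-clock string such as '9h27:35' or '1d03h11:10' to hours."""
    days = 0
    if "d" in text:
        day_part, text = text.split("d")
        days = int(day_part)
    hour_part, rest = text.split("h")
    minutes, seconds = rest.split(":")
    return days * 24 + int(hour_part) + int(minutes) / 60 + int(seconds) / 3600


def simulation_stats(frames, step_ps, wall_time):
    """Return (simulated length in ns, speed in ns/day, years per simulated second)."""
    # Assumption: the first frame is t = 0, so 2001 frames span 2000 steps.
    length_ns = (frames - 1) * step_ps / 1000.0
    wall_days = parse_wall_time(wall_time) / 24.0
    speed_ns_per_day = length_ns / wall_days
    # One second is 1e9 ns; assumption: 365 days per year.
    years_per_second = 1e9 / speed_ns_per_day / 365.0
    return length_ns, speed_ns_per_day, years_per_second


length_ns, speed, years = simulation_stats(2001, 5, "9h27:35")
print(f"{length_ns:.1f} ns, {speed:.3f} ns/day, {years:.0f} years per second")
```

With the wildtype values this gives a speed close to the reported 25.370 ns/day; the small difference in the number of years comes from rounding of the speed.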

Which contribution to the potential energy accounts for most of the calculations?

potential energy: -1.24431e+06 kJ/mol

Visualization of results

Protocol and background information

Figure1a: MD simulation of the movement of BCKDHA

Figure1b: Visualisation of the simulated protein with ngmx

Figure 1a shows the motion of the protein, and especially of its side chains. The region on the right side of the protein, which is colored blue, moves the most. Figure 1b shows another visualisation of the protein, produced with ngmx.

Quality assurance

Quality assurance checks whether the system has reached equilibrium. To this end, the convergence of the thermodynamic parameters (temperature, pressure, potential and kinetic energy) is examined. The following section shows the results of the quality assurance analysis for the wildtype protein.

Energy calculations

Protocol and background information

Pressure

Figure2: Plot of the pressure during the MD simulation

Energy	Average	Err.Est.	RMSD	Tot-Drift (bar)
Pressure	1.01601	0.015	71.2152	-0.0706383

Figure 2 shows the pressure of the molecular dynamics system. The average value is 1.016 bar, as listed in the table above. However, the pressure ranges from about -250 bar to 250 bar, which fits the RMSD of 71.2 bar. Given this large range of about 500 bar, we would say that the pressure of this simulation does not converge to a specific value.
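To make the table columns more concrete, here is a minimal sketch of how an average, an RMSD and a total drift could be derived from a time series. It assumes that the RMSD is the standard deviation about the mean and that the total drift is the slope of a least-squares line multiplied by the length of the series; whether the energy tool computes them in exactly this way is not confirmed by the text, and the function name is hypothetical.

```python
import math


def summarize_series(times, values):
    """Return (average, RMSD about the average, total drift) of a time series."""
    n = len(values)
    mean = sum(values) / n
    rmsd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    # Least-squares slope of values against time.
    t_mean = sum(times) / n
    slope = (sum((t - t_mean) * (v - mean) for t, v in zip(times, values))
             / sum((t - t_mean) ** 2 for t in times))
    # Assumption: total drift = fitted slope times the covered time span.
    drift = slope * (times[-1] - times[0])
    return mean, rmsd, drift


# Synthetic example data, not taken from the simulation.
mean, rmsd, drift = summarize_series([0, 1, 2, 3, 4], [1.0, 3.0, 5.0, 7.0, 9.0])
print(mean, rmsd, drift)
```

A series can have a large RMSD and still a drift close to zero, which is the situation the pressure table describes.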

Temperature

Figure3: Plot of the temperature during the MD simulation

Energy	Average	Err.Est.	RMSD	Tot-Drift (K)
Temperature	297.941	0.0047	0.954498	0.00557078

Figure 3 shows the temperature during the MD simulation. It ranges between 294 K and 302 K, so it deviates only slightly from the average value of 297.9 K. Since the fluctuation is this small, the temperature of the system is quite stable, which means that it has reached equilibrium.

Potential

Figure4: Plot of the potential during the MD simulation

Energy	Average	Err.Est.	RMSD	Tot-Drift (kJ/mol)
Potential	-1.24431e+06	66	1041.57	-463.992

Figure 4 shows the potential energy of the MD system. It ranges from -1.24e+06 kJ/mol to -1.25e+06 kJ/mol. Although this is a large range of about 10000 kJ/mol, the potential energy is very low overall. This low potential energy indicates that the protein is quite stable. Since the structure of a protein determines its function, this stability is important for the function of the protein.

Total Energy

Figure5: Plot of the total energy during the MD simulation

Energy	Average	Err.Est.	RMSD	Tot-Drift (kJ/mol)
Total Energy	-1.02119e+06	65	1279.76	-459.819

The low potential energy already indicated that the total energy of the system has to be quite low. Figure 5 shows that the total energy is somewhat higher than the potential energy but still very low. It also varies slightly less, since it ranges between -1.017e+06 kJ/mol and -1.025e+06 kJ/mol. Again, such a low energy points to a stable protein, which suggests that the simulation ran correctly.

Minimum distance between periodic boundary cells

The determination of the minimum distance between periodic boundary cells is a crucial part of the quality assurance of an MD simulation. In this step we verify that there were no direct interactions between periodic images, since interactions between atoms of the same molecule across the periodic boundary would disturb the native behaviour of the protein and lead to invalid molecular dynamics results. Therefore we have to check that the minimum distance is greater than 0.9 nm.

Protocol and background information

Figure6: Minimum distance between periodic boundary cells

The shortest periodic distance is 1.40945 nm at time 6090 ps between atoms 25 and 6490, which is above the required minimum of 0.9 nm. As we can see in Figure 6, the values range between about 1.4 nm and 4 nm during the whole simulation. Interestingly, most of the variation occurs in the middle part of the simulation. Since the fluctuation of the values corresponds to the movement and flexibility of the protein, we can say that the protein is very flexible between 4000 ps and 7000 ps. During the rest of the time it is also flexible, but less so than in this period.
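The check described above can be sketched in a few lines. The sketch assumes that the distances are available as plain text in the usual .xvg layout (comment lines start with `#`, plot directives with `@`, data rows hold the time in ps and the distance in nm). The sample rows are made up for illustration, except for the minimum reported above; the function names are hypothetical.

```python
def parse_xvg(text):
    """Return the numeric rows of an .xvg-style text, skipping '#' and '@' lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        rows.append([float(field) for field in line.split()])
    return rows


def min_image_check(rows, cutoff_nm=0.9):
    """Return (time, shortest distance, True if the distance exceeds the cutoff)."""
    time_ps, dist_nm = min(((r[0], r[1]) for r in rows), key=lambda pair: pair[1])
    return time_ps, dist_nm, dist_nm > cutoff_nm


# Illustrative data: only the row at 6090 ps reflects a value from the text.
sample = """# minimum distance between periodic images
@ xaxis label "Time (ps)"
0     3.10
6090  1.40945
10000 2.75
"""
print(min_image_check(parse_xvg(sample)))
```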

Root mean square fluctuations

Protocol and background information

RMSF for protein	RMSF for C-alpha
Figure7: RMSF for protein	Figure8: RMSF for C-alpha

Figure 7 shows the RMSF for the whole protein. The beginning of the protein fluctuates the most, which indicates that this region is the most flexible one. In the middle part of the protein there are nearly no peaks, which means that this part is very stable. At the end there is a peak which is a bit higher than the RMSF of the rest of the protein. This could suggest that this terminal region is also somewhat flexible.
In Figure 8 the RMSF was calculated not for the whole protein but only for the C-alpha atoms. These atoms are the central carbon atoms of each amino acid, so this calculation considers only the backbone of the protein. Comparing the two plots shows that in both cases the beginning of the protein is the part with definitely the highest fluctuation. So not only the side chains differ from the average structure but also the backbone, which indicates strong flexibility.
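For readers unfamiliar with the quantity, the RMSF of an atom is the root of its mean squared deviation from its own average position over all frames. The following minimal sketch illustrates this definition on a toy trajectory. It assumes that all frames have already been superposed on a common reference; the function name and the data layout are hypothetical.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF; trajectory[f][i] is the (x, y, z) position of atom i in frame f."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Average position of atom i over all frames.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared deviation from that average position.
        msf = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msf))
    return result


# Toy trajectory (nm): atom 0 is fixed, atom 1 oscillates by 0.1 nm along x.
toy = [
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.9, 0.0, 0.0)],
]
print(rmsf(toy))
```

Restricting the atom list to the C-alpha atoms gives the backbone-only variant discussed for Figure 8.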

Pymol analysis of average and bfactors

The average.pdb file was produced automatically during the calculation of the RMSF because it is needed for comparisons. This file contains the average structure of the protein. Because of the option -oq, the bfactor.pdb file was produced as well. In this file the temperature factors (bfactors) are calculated and written onto a reference structure, so that the regions of the structure can be colored by them. Normally the most flexible parts of the protein also have the highest temperature factors. To find out whether this holds in our case, we used pymol to analyze the average structure and the bfactors structure. Additionally we compared the predicted average structure with the original structure of our protein.
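The link between fluctuation and temperature factor can be illustrated with the standard isotropic relation B = (8π²/3) · RMSF². The sketch below applies it and converts from nm to Å, the unit in which B-factors are normally listed in PDB files. That the bfactor.pdb file was filled using exactly this relation is an assumption here, and the function name is hypothetical.

```python
import math


def rmsf_to_bfactor(rmsf_nm):
    """Convert an RMSF in nm to an isotropic B-factor in square angstroms."""
    rmsf_angstrom = rmsf_nm * 10.0  # 1 nm = 10 angstroms
    return 8.0 * math.pi ** 2 / 3.0 * rmsf_angstrom ** 2


# An atom fluctuating by 0.1 nm (1 angstrom) around its average position.
print(round(rmsf_to_bfactor(0.1), 2))
```

Because the B-factor grows with the square of the RMSF, regions with small fluctuations end up with values that may hardly show in a color scale.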

Protein

1u5b/average	1u5b/bfactors	bfactors/average
Figure9: Alignment of the experimental structure with the average structure	Figure10: Alignment of the experimental structure with the structure containing the bfactors	Figure11: Alignment of the average structure with the structure containing the bfactors
RMSD: 1.169	RMSD: 0.377	RMSD: 1.422

To find out how accurate the calculated average structure is, we aligned it with the experimental structure of our protein (1u5b). As Figure 9 and the RMSD value of 1.169 show, the two structures do not superpose perfectly. The middle part of the protein is aligned quite well. The most deviating parts are the two ends of the structures on the left and on the right side of the picture. The RMSF discussed above shows that these regions are the most flexible ones, so it is possible that the two structures are simply in two different states of movement.
Next we compared the structure of 1u5b with the structure containing the bfactors. According to Figure 10, the reference structure onto which the bfactors are written is the structure of 1u5b, since the two superpose nearly perfectly. There is a minimal shift in the alignment, but since it occurs over the whole structure we consider it to be an error of the superposition tool.
Finally we come to the analysis of the bfactors. Figure 11 shows the alignment of the structure containing the bfactors with the average structure. They differ a bit again because of the different states of movement, but in this figure the bfactors are the most interesting observation. Only the part at the end of the protein (left side of the picture) is colored, indicating that only this part of the protein is flexible. The coloring ranges from yellow to red, where yellow stands for little and red for high flexibility. This flexibility according to the bfactors is reflected by the RMSF values above. The other end of the protein is not colored but only a bit shifted. We thought that this shift could be the result of a movement, but since the region is not colored this theory is perhaps false. The RMSF values above also show only very little fluctuation of the atoms there. So perhaps this part is a little bit flexible, but not enough to be marked as flexible by the bfactors.

C-alpha

1u5b/average	1u5b/bfactors	bfactors/average
Figure12: Alignment of the experimental structure with the C-alpha atoms of the average structure	Figure13: Alignment of the experimental structure with the C-alpha atoms of the structure which contains the bfactors	Figure14: Alignment of the C-alpha atoms of the average structure with the C-alpha atoms of the structure containing the bfactors
RMSD: 0.955	RMSD: 0.300	RMSD: 0.993

To analyse the run in which only the C-alpha atoms of the structure were considered, we again first wanted to find out how well the calculated average structure fits the experimental structure. As Figure 12 shows, there is again a lot of variation between the average structure and the structure of 1u5b. However, the RMSD value (0.955) is smaller than the RMSD value of the average structure for the whole protein (1.169). Since we assumed that the variation is caused by the different states of movement in the different structures, the lower RMSD value suggests that the backbone is somewhat less flexible. Most of the variation is again in both terminal regions of the protein.
Next the experimental structure is compared with the C-alpha atoms of the structure containing the bfactors. Figure 13 and the very low RMSD value show that the superposition of the two structures is very close. Perhaps there is a little bit of variation between the two structures, or we have the same case as in Figure 10, where we assumed a mistake of the program because the shift occurred over the whole alignment. It is hard to see here because of the spheres.
The last analysis is the detailed one of the structure containing the bfactors (Figure 14). Again the part with the highest temperature factors is colored, where red means high flexibility and green low flexibility. Only the end of the protein (right side) is colored, so only this part of the protein shows strong flexibility. This observation agrees with the RMSF because in both cases the beginning of the protein is predicted to be flexible.

Radius of gyration

The radius of gyration reflects how the structure and the shape of the protein change over the course of the simulation.

Protocol and background information

Figure15: Radius of gyration during the MD simulation

According to the black line in Figure 15, the radius of gyration ranges between 2.22 nm and 2.4 nm during the whole simulation. The black line describes the overall change of the shape of the protein. A closer look at the plot reveals a trend: at the beginning the radius has its maximal value of about 2.4 nm, and it then decreases during the first part of the simulation. After about 6300 ps the decline of the radius stops. From then on the value stays between 2.23 nm and 2.27 nm, so the fluctuation is very small. The fact that there is still variation until the end of the simulation shows that the protein moves all the time, which indicates its flexibility. The changes in the profile of the protein are given by the red (x axis), green (y axis) and blue (z axis) lines.
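As a reminder of what the black line measures, the following minimal sketch computes a mass-weighted radius of gyration, i.e. the root of the mass-weighted mean squared distance of the atoms from the centre of mass. The per-axis components shown in Figure 15 are not covered. The function name and the toy coordinates are hypothetical.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of a set of atoms (same unit as coords)."""
    total_mass = sum(masses)
    # Centre of mass of the atoms.
    com = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted sum of squared distances from the centre of mass.
    weighted_sq = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


# Two equal masses 2 nm apart: each is 1 nm away from the centre of mass.
print(radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [12.0, 12.0]))
```

A shrinking value, as seen in the first part of the simulation, means that the atoms move closer to the centre of mass on average, so the protein becomes more compact.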

Structural analysis

Protocol and background information

Solvent accessible surface area

The solvent accessible surface area (SASA) of a protein is the part of the surface which is reachable by a solvent. The SASA can be divided into two subgroups, hydrophilic SASA and hydrophobic SASA. This shows that the possibility that a solvent can reach the surface depends on its properties.

Protocol and background information

SAS over time per residue	SAS over time per atom	Solvent accessible surface
Figure16: Plot of the average solvent accessible surface over time per residue	Figure17: Plot of the average solvent accessible surface over time per atom	Figure18: Plot of the solvent accessible surface of the protein during the MD simulation

Figure 16 shows the average SAS of each residue during the simulation. There is much variation, and the solvent accessible areas of the residues range between 0 nm² and 2.3 nm². Since some regions have an SAS of 0 nm², there are parts which are not accessible to solvents, but most regions are accessible. The most accessible region is at the very beginning, since its peak is definitely the highest one. Additionally there are two high peaks in the middle of the protein which differ completely from the peaks next to them, which are all quite low. This shows that only a few parts in the center of the protein are accessible to solvents, but there the accessibility is very good. Figure 17 shows the average solvent accessible surface over time per atom. Again there is a lot of variation in the SAS, which ranges from 0 nm² to 0.55 nm². The last plot (Figure 18) shows the overall SAS of the whole protein during the simulation. The red line describes the accessibility for hydrophilic solvents and the black line for hydrophobic solvents. The accessibility for hydrophobic solvents is a little higher, but not by much. The green line, which hardly fluctuates, shows the total SAS of the protein during the whole simulation, indicating that the SAS stays nearly the same throughout.

Hydrogen bonds

Protocol and background information

protein and protein	protein and water
Figure19a: Internal hydrogen bonds and pairs within 0.35 nm during the simulation	Figure20a: Hydrogen bonds with the surrounding solvents and pairs within 0.35 nm during the simulation
Figure19b: Internal hydrogen bonds during the simulation	Figure20b: Hydrogen bonds with the surrounding solvents during the simulation

	Donors	Acceptors	avg.# of h-bonds	possible # of h-bonds
protein-protein	594	1158	308.847	343926
protein-water	29470	30034	806.073	4.42551e+08

Figure 19a (left) shows the number of internal hydrogen bonds during the simulation. According to the black line in this plot, which describes the hydrogen bonds, the number of bonds is about 300. This number is supported by the table above. Since the black line shows nearly no variation during the whole simulation, it seems that the number of hydrogen bonds does not change. However, Figure 19b shows that the number of hydrogen bonds does change, since there is much fluctuation in the curve. Although the number of hydrogen bonds varies between 280 and 335, a trend can be seen. In the beginning the average number is about 310, then it goes down to about 300 and rises again to 320. So the number of hydrogen bonds first declines a bit, but after one third of the simulation it rises again. The trend shown in Figure 20b is completely contrary: first the number of hydrogen bonds is low, then it rises a bit, and after about one third of the time it falls again. The number of external hydrogen bonds is always much higher than the number of internal ones, since it ranges from 740 to 860, but it is interesting that the two trends are completely opposed. This is to be expected because of the movement of the shape of the protein. Since the alternating hydrogen bonds indicate movement, we can say that the protein is very flexible during the simulation. The red lines in Figure 19a and Figure 20a display the pairs within 0.35 nm. There are many more pairs within this distance inside the protein (1400-1500 pairs) than with the surrounding solvent (1000-1200 pairs). Additionally, the number of pairs with the solvent varies much more during the simulation than the number inside the protein.
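The difference between "hydrogen bonds" and "pairs within 0.35 nm" becomes clearer with a small sketch of a geometric hydrogen-bond test. The 0.35 nm donor-acceptor distance is taken from the plots above; the additional angle cutoff of 30 degrees for the hydrogen-donor-acceptor angle is an assumption here (a commonly used criterion), and the function name is hypothetical.

```python
import math


def is_hbond(donor, hydrogen, acceptor, max_dist_nm=0.35, max_angle_deg=30.0):
    """Geometric hydrogen-bond test on three (x, y, z) positions given in nm."""
    d_to_a = [a - d for a, d in zip(acceptor, donor)]
    d_to_h = [h - d for h, d in zip(hydrogen, donor)]
    dist_da = math.sqrt(sum(x * x for x in d_to_a))
    dist_dh = math.sqrt(sum(x * x for x in d_to_h))
    if dist_da == 0.0 or dist_dh == 0.0 or dist_da > max_dist_nm:
        return False
    # Angle between the donor-hydrogen and donor-acceptor vectors.
    cos_angle = sum(x * y for x, y in zip(d_to_a, d_to_h)) / (dist_da * dist_dh)
    cos_angle = max(-1.0, min(1.0, cos_angle))
    # Assumption: the bond counts only if this angle stays below the cutoff.
    return math.degrees(math.acos(cos_angle)) <= max_angle_deg


donor, hydrogen = (0.0, 0.0, 0.0), (0.1, 0.0, 0.0)
print(is_hbond(donor, hydrogen, (0.3, 0.0, 0.0)))  # close and linear
print(is_hbond(donor, hydrogen, (0.0, 0.3, 0.0)))  # close, but at a right angle
```

Under such a criterion a pair can be within 0.35 nm without being counted as a hydrogen bond, which is consistent with the red lines lying above the black ones.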

Ramachandran plot

Protocol and background information

Ramachandran plot of our simulation	general Ramachandran plot
Figure21: Ramachandran plot of our protein	Figure22: General Ramachandran plot (<ref>http://en.wikipedia.org/wiki/File:Ramachandran_plot_general_100K.jpg</ref>)

Figure 21 shows the Ramachandran plot of the protein predicted by MD. The regions for beta sheets and alpha helices are very black, and so is the region for left-handed helices. In addition to these fields, the other three corners are black. Compared to the general Ramachandran plot (Figure 22), there are many more black fields in the plot of the simulation. This shows that the angles are not concentrated on one position but vary a lot. Since some regions are completely white, some positions and angle combinations do not occur in the simulated protein. The fact that there are so many different angle positions, and not only the ones in the general Ramachandran plot, could indicate that this protein is flexible.
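Each point in a Ramachandran plot is a pair of backbone dihedral angles: phi is defined by the atoms C(i-1), N, CA, C and psi by N, CA, C, N(i+1). The following minimal sketch shows how one such dihedral angle can be computed from four positions. The helper names are hypothetical and the test coordinates are toy values, not taken from the protein.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees (-180 to 180) defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = [x / norm for x in b1]
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = [x - _dot(b0, b1) * y for x, y in zip(b0, b1)]
    w = [x - _dot(b2, b1) * y for x, y in zip(b2, b1)]
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Toy geometry: the fourth point is rotated by a quarter turn about the z axis.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Evaluating phi and psi for every residue in every frame and binning the pairs yields a density plot like the one in Figure 21.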

Analysis of dynamics and time-averaged properties

RMSD matrix

Protocol and background information

Figure23: RMSD matrix of the structures of our protein during the simulation

Figure 23 shows the pairwise comparison of the structures of our protein during the simulation. There has to be a diagonal which is turquoise and blue, as there is no distance between two identical structures. Only one other part of the matrix is turquoise, and it lies at the end of the simulation between 6000 ps and 10000 ps. This shows that these structures are all quite similar and that the simulation reaches an equilibrium. Additionally there are red parts between 6500 ps and 9000 ps on one axis and 1000 ps and 2500 ps on the other. This shows that the structures which are similar at the end are quite different from the structures in the first part of the simulation. It is also very interesting that the structures at the very beginning of the simulation seem to be very different from all the other structures, since the border of the matrix is red and only the bottom left corner is colored green. This indicates that the structures at the beginning of the MD simulation change a lot, as no energetically favourable structure had been found yet.
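How such a matrix comes about can be sketched as follows: every frame is compared with every other frame, and the RMSD of each pair is stored in a symmetric matrix with zeros on the diagonal. This minimal sketch skips the superposition step and assumes that the frames are already fitted to a common reference; the function names and the toy frames are hypothetical.

```python
import math


def rmsd(frame_a, frame_b):
    """RMSD between two frames given as equally long lists of (x, y, z) positions."""
    squared = sum(
        sum((p[k] - q[k]) ** 2 for k in range(3))
        for p, q in zip(frame_a, frame_b)
    )
    return math.sqrt(squared / len(frame_a))


def rmsd_matrix(frames):
    """Symmetric matrix of pairwise RMSD values; the diagonal is zero."""
    n = len(frames)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(frames[i], frames[j])
    return matrix


# Three toy frames of a two-atom system (nm).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3)],
    [(0.0, 0.0, 0.0), (1.0, 0.4, 0.0)],
]
for row in rmsd_matrix(frames):
    print([round(value, 3) for value in row])
```

Blocks of low values off the diagonal, like the turquoise region between 6000 ps and 10000 ps, correspond to groups of frames that are mutually similar.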

Cluster analysis

Protocol and background information

Figure24: Visualisation of the cluster of structure groups

Figure25: Plot of the RMSD values of the clusters

The program found 542 clusters. Figure 24 visualises the clustered structures of the protein. Figure 25 plots the RMSD values of the clusters. The RMSD values range from 0.07 nm to 0.57 nm. The fact that these values are quite low indicates that the groups of structures are not completely different from each other. Most of the clusters have an RMSD value of about 0.35 nm, which shows that the main part of the structures has some similarity to other groups of structures. There is also a small number of groups with a value of 0.57 nm, which shows that these groups of structures have only little similarity during the simulation. Since the peaks between 0.1 nm and 0.2 nm are very small, there are only a few groups of structures which show a very high similarity during the simulation.
Furthermore we compared two of the clusters to each other by comparing their structures. We chose cluster 1 and cluster 2 for this comparison. With an RMSD value of 0.709, the clusters are not completely different, and they still contain groups of structures which have similarities.

Internal RMSD

Protocol and background information

Figure26: Internal RMSD of our protein during the simulation

The internal RMSD values range from 0.1 nm to 0.45 nm, so there is a lot of fluctuation during the simulation. As Figure 26 shows, the values are very low in the beginning but then rise very fast until 1500 ps. After this point they only range from 0.3 nm to 0.4 nm, which is not a huge variation. After about 5000 ps they rise again a bit, so that the average value for the following time is 0.4 nm. By 10000 ps the RMSD seems to converge towards 0.4 nm.

Mutation M82L

A brief check of results

Protocol and background information

How many frames are in the trajectory file and what is the time resolution?

frames: 2001
time resolution: 5 ps

How long did the simulation run in real time (hours), what was the simulation speed (ns/day) and how many years would the simulation take to reach a second?

real time: 1d03h11:10
simulation speed: 8.828 ns/day
time needed to simulate one second: 310388 years

Which contribution to the potential energy accounts for most of the calculations?

potential energy: -1.24452e+06 kJ/mol

Visualization of results

Protocol and background information

Figure27a: MD simulation of the movement of the mutated BCKDHA

Figure27b: Visualisation of the simulated protein with ngmx

Figure 27a shows the movement of the whole protein, and especially of its side chains. The region on the bottom left side of the protein, which is colored blue, moves the most. In addition, the red part on the right side also seems to show motion, but not as much as the blue part. Figure 27b shows another visualisation of the protein, produced with ngmx.