Normal Mode Analysis BCKDHA
- 1 Introduction
- 2 WEBnm@
- 3 ElNemo
- 4 Anisotropic Network Model web server
- 5 oGNM – Gaussian network model
- 6 NOMAD-Ref
- 7 All-atom NMA using Gromacs on the NOMAD-Ref server
- 8 Discussion
- 9 Advantages and Disadvantages from NMA and MD
- 10 References
Normal Mode Analysis is a powerful tool to examine large global motions of proteins. A normal mode of an oscillating system is a pattern of motion in which all parts of the system move with the same frequency and in phase. Proteins can be modeled as harmonic oscillating systems. NMA methods calculate low-frequency modes for a protein which correspond to collective motions of the complete protein. In this task we applied several NMA-methods to our protein BCKDHA and performed an all-atom NMA with small molecule (2LYZ).
WEBnm@<ref>http://apps.cbu.uib.no/webnma/home</ref> provides two different modes:
The Single Analysis calculates the lowest frequency normal modes of the given protein and offers different types of calculations to analyse the modes that were calculated. The force field used for the Normal Modes Calculations is the C-alpha force field. It uses only the C-alpha atoms of the protein which assigned the masses of the whole residue they represent.
The different types of calculation are:
- deformation energies of each mode
- calculation of normalized squared atomic displacements (results are provided for each low frequency mode, either as raw data or as plots with displacement vs. residue number)
- interactive visualization of the modes using vector field representation or vibrations
Comparative Analysis (beta version):
The Comparative Analysis calculates and compares the normal modes of a set of aligned protein structures. This tool is still under development. It also provides three types of calculations:
- Deformation Energy profiles
- Atomic Fluctuation profiles
- Conformational Overlap Comparison
- Single Analysis: structure file in the pdb format
- Comparative Analysis: a file containing the sequence alignment of the proteins which should be compared and a protein structure file for each of the proteins. The alignment file needs to be written in the Fasta format, and the header line of each sequence should contain the name of the structure file as first field, and the chain in the last field.
Below are the values of the deformation energy for modes 7 to 20
|Mode Index||Deformation Energy|
WEBnm@ visualised the normalized squared atomic displacements for the first five modes (modes 7 to 11).
Figures 1-5 display the first five normal modes of our protein. Figure 6-10 show the square of the displacement of each C-alpha atom, normalized so that the sum over all residues is equal to 100. The highest values correspond to the most displaced regions. Cluster of peaks identify significantly big regions. Isolated peaks reflect local flexibility and are not relevant.
|mode 7||mode 8||mode 9||mode 10||mode 11|
The calculated normal modes of Webnma differ in the amplitude of movement. While modes 7 and 9-11 show the highest peak and therefore the most movement for residues 0-25, mode 8 has the highest peak for the last 40 residues in the sequence.
The normal modes calculated by Webnma show that the most displaced regions of the branched-chain alpha-keto acid dehydrogenase complex are the beginning and the end of the protein sequence. The two ends of the protein sequence which are also the outermost parts of the protein structure show some kind of hinge-movement. The protein motion could be described as an opening and closing complex.
The input for ElNemo is a protein structure in PDB format. From this PDB file only the residues that are encoded by ATOM are used in the calculations. The other residues are not taken into account. If there are other residues which should be used in the calculations they have to be encoded by ATOM. Additionally there are a lot of options which can be chosen.
- Properties of the first 100 lowest frequency modes (frequency, collectivity of atom movement, overlap of each mode with the observed conformational change (if two conformations are available) and its corresponding amplitude)
- 3D animations from three orthogonal viewpoints in large and small sizes
- Comparison of a normal mode perturbed structure and a second conformation in terms of RMSD and number of residues that are closer than 3Å can be done
- Cross plot where the analysis of distance fluctuations between all CA atoms is shown. Red (decreasing) and blue dots (increasing) indicate the residues for which the distance changes significantly in movement. (The upper left corner indicates the first residue. Grey lines are drawn every 10 residues, yellow lines are drawn every 100 residues.)
- ElNémo Webserver<ref>http://www.igs.cnrs-mrs.fr/elnemo/start.html</ref>
- K. Suhre and Y.-H. Sanejouand, ElNémo: a normal mode web server for protein movement analysis and the generation of templates for molecular replacement<ref>Karsten Suhre and Yves-Henri Sanejouand, ElNémo: a normal mode web server for protein movement analysis and the generation of templates for molecular replacement, Nucl. Acids Res, 2004</ref>
CA distance fluctuations for the five modes
|mode 7||mode 8||mode 9||mode 10||mode 11|
Figures 11-14 show that the greatest distance fluctuations are between the 10-20 first amino acids and the rest of the protein (residues 50-400). While mode 7 calculated only distance decreases, mode 8 seemed to have calculated almost only increasing distances between the first ~20 residues and the rest of the protein. The cross plots for mode 9 and 10 (figures 13 and 14) show strong distance fluctuations (decreases for residues 1-10 and increases for residues 10-20) between the first 20 residues and the rest of the protein. Mode 11 as displayed in figure 15 calculated completely different distance fluctuations. Here the highest distance fluctuations are between the last 40 residues and the rest of the protein. There are both increasing and decreasing distances. The totally different cross plot leads to the assumption, that the calculated mode 11 differs quite a lot from the other normal modes. It is very likely, that here the last part of the protein shows the greatest movement.
ElNemo prepared different views from three orthologuous viewpoints with MolScript for each mode.
The mode shown in figure 16 agrees with the distance fluctuation seen in figure 11. The very beginning of the peptide chain moves away from the rest of the protein. It looks like a hinge-movement.
The mode displayed in figure 17 shows that the beginning of the peptide sequence moves towards the protein. This observation can be confirmed when looking at the cross plot given in figure 12, where the decreasing distance for the first residues is given by blue dots. This mode shows another hinge-movement.
As seen at the distance fluctuations plot (figure 13), the distances for the first residues in the peptide chain vary, some are decreasing and some are increasing. This can be explained by a twisting peptide sequence, where some residues come closer to the protein core and other move apart.
Mode 10 behaves similarily to mode 9, only the beginning of the protein chain seems not to be twisting but to be pulled in and out. This observation agrees with the increasing and decreasing distances shown in figure 14.
Figure 20 shows a hinge-movement of the last part of the protein sequence. The helix-part shown in red moves to and apart from the protein core, which is also displayed in figure 15.
ElNemo calculated very different normal modes. Some of the normal modes show some kind of hinge-movement at the one end of the protein, another mode shows the movement of the other end in the protein. All in all we can say, that our protein seems only to be flexible at the outermost parts, while the core of the protein is very stable.
Anisotropic Network Model web server
The ANM Webserver<ref>http://ignmtest.ccbb.pitt.edu/cgi-bin/anm/anm1.cgi</ref> provides NMA Analysis with the anisotropic network model (ANM) which is an elastic network (EN).
- PDB id or PDB file
- Chain id
- Model (for multi-model files such as from NMR)
- Cutoff for interaction between Cα atoms in Å (set to 15Å)
- Distance weight for interaction between Cα atoms (set to 3.0)
Output: The ANM Webserver offers a broad range of output files to analyse the computed normal modes more precisely. On the main page you can visualize the calculated 20 first modes. It is possible to scale the amplitude and frequency of motion, to display vectors and the protein in different ways and colors. Furthermore the following options are available
- Download files
- Create PDB (motion)
- Create PyMol script
- Get anisotropic temp. factors
- B-factors/mode fluctuations
- Distance fluctuations and deformation energy
In the following we show the movements, the distance fluctuations, the deformation energies per position and the B-factors for each mode.
As the first 5 modes could not be displayed via pymol, we will analyse modes 6-11 in the following section.
Overall energy: 108.50491155
Mode 6 shows two centers of movement: First, the beginning of the peptide sequence is twisting and turning as shown in Figure 21a. This fluctuations can also be seen in the fluctuation of individual residues according to experimental b-factors (Figure 21b). The distance matrix (Figure 21c) shows that most of the residues in the protein are correlated (red), only the residues at the beginning of the peptide chain are anti-correlated (blue). The white zones indicate weak correlations. So besides the very anti-correlated beginning of the peptide chain, there is a part in the end of the protein that seems to be correlated weakly. This can also be observed when looking at Figure 21a, where another hinge-movement can be detected at the right.
Overall energy: 105.33659403
Mode 7 shows twisting movements at both ends of the protein (see Figure 22a). The distribution of b-factors (Figure 22b) and the correlation matrix (Figure 22c) agree with this observation.
Overall energy: 114.7349691
The movements given in mode 8 (Figure 23a) seem to be very similar to the ANM mode 7. But looking at the errors in Figure 23a one can see that the movement goes in exactly the opposite direction. The fluctuations per residue for mode 8 (Figure 23b) show also a much higher amplitude than for mode 7.
Overall energy: 187.72640385
Mode 9 displays a small turning movement of the beginning of the peptide sequence (see Figure 24a). This observation can be confirmed when looking at the fluctuations of single residues (Figure 24b), where a peak exists only for the residues 6-30, and at the correlation matrix (Figure 24c), which shows a highly anti-correlated region for the beginning of the peptide sequence. Again, as in mode 6, the end part of the protein shows some evidence of movement, too, as the correlation here is very weak.
Overall energy: 183.87695037
The ANM mode 10 shows the most movement of the protein. The whole protein seems to be turning and both ends are moving like hinges (see Figure 25a). This strong movements can also be detected when looking at the many small peaks in the b-factors (Figure 25b) and the small zones of weak and no correlation in the distance matrix (Figure 25c).
Almost all modes calculated by ANM and discussed above agree in the following point: The flexible regions of our protein are the beginning and the end of the protein sequence. Only mode 10 differs from this observation, as a lot more motions all over the whole protein are visible. These motions however are still very weak according to the twisting and wiggling sequence ends.
oGNM – Gaussian network model
The oGNM Webserver<ref>http://ignm.ccbb.pitt.edu/Online_GNM.htm</ref> calculates the equilibrium dynamics of any structure submitted in PDB format, using the Gaussian Network Model (GNM).
- PDB id or PDB file
- No. of nodes to represent a nucleotide (1 or 3)
- Cutoff for amino acid pairs
- Cutoff for nucleotide pairs
- Preferred visualization engine (Jmol or Chime)
Output: The oGNM Webserver provides an comprehensive overview over the first 20 calculated normal modes. It is possible to display the slow modes, slow eigenvectors, slow average, slow av1-3 and RMSD of two modes side-by-side. The output includes:
- The mobility profiles of residues corresponding to the 20 slowest modes of motion predicted by the GNM
- The average profile resulting from the first 2 slowest modes
- The associated eigenvalues (21 of them, including the zero eigenvalue)
- The predicted and experimental B-factors, and the correlation coefficient between the two sets of B-factors
- The spring constant (g) in units of kcal/mol.Å2
- The cross-correlation between residue fluctuations, plotted as a correlation map (for structures containing less than 2000 nodes)
- The nodes included in the GNM analysis, summarized in the .ca file
In the following section we are going to discuss the 5 lowest frequency modes (after the first six zero modes) calculated by oGNM. The following figures show the mobility of the protein for each computed normal mode, colored from blue to red in the order of increasing mobilities, as well as the fluctuations per residue.
The oGNM mode 7 shows a mobile part at the one end of the protein (Figure 26a). This mobility is also displayed in the fluctuations per residue (Figure 26b), where a peak for residues 18-23 indicates high flexibility, while the rest of the protein seems to be very stable.
As in mode 7 there are peaks in the beginning of the protein sequence in mode 8(see Figure 28b). But contrary to mode 7 there are three peaks, indicating three separated centers of movement with a stable part in between. Additionally to these flexible parts there are some regions in the middle of the protein but these peaks are very low so it seems that these parts are also quite stable. This is also shown in the visualization of the protein since the parts in the middle are only light pink (see Figure 28a) which indicates that they are probably stable.
We have the same picture in mode 9 as in the modes before (see Figure 28a). There are two main peaks between position 16 and 23 and a very small peak at position 10 (see Figure 28b) suggesting that there are two flexible regions in the beginning. The really low peak in the beginning and also the one at position 300 can be neglected since they are so low so that we can not be sure about the flexibility. The rest of the protein is predicted to be stable.
In mode 10 the most flexible part is completely in the beginning of the protein at position 10 and the peaks which are a bit lower are between position 16 and 23 (see Figure 29b). Although it is completely the other way around than in mode 9, both modes point out that these regions are flexible. This is visualized by the red coloring of these parts of the protein (see Figure 29a). As in mode 8 and 9 there is a low peak around position 300. Since this peak is in all three modes we have to implicate it but in all three modes the peak is really low so it is probably a region with only a little flexibility.
Mode 11 is more as mode 7 because of missing a peak in the middle of the protein (see Figure 30a). In this mode there are two peaks in the beginning of the protein indicating that this is a very flexible region. The rest of the protein is completely stable shown by the blue coloring (see Figure 30b) and the fact that there is no peak after position 24.
By comparing the fluctuations of all five modes with each other (see Figure 31a) we can see that although the peaks are not all at the exactly same position they are all in the beginning of the protein indicating that all modes predict this region to be flexible and the rest of the protein is more or less stable. Figure 31b shows the cross correlations between residue fluctuations for modes 7-11. Fluctuation vectors in the same direction have values of +1 and are colored dark red indicating the motions are fully correlated. Fully anti-correlated motions are displayed in dark blue and are given by values of around -1. By the light blue regions on the left and top border we can see that these regions are either correlated nor anti-correlated. This can be explained by the results above because the several peaks at the beginning of the protein are never really the same but variate a lot. But these regions differ completely from the rest of the plot which is partly dark red or dark blue. It is not possible to conclude which regions are flexible and which are stable basing on this plot but we can see that the beginning of the protein sequence has to be different than the rest of the protein.
By comparing all five modes with each other we can see that they are all very similiar. In all five cases the region in the beginning of the protein is predicted to be flexible. It is not completely clear which positions are more flexible or which are more stable but overall the beginning of the protein is flexible and the rest is stable. This is shown very good in the plot where all fluctuations are compared with each other. Additionally the cross correlation plot indicates that the beginning of the protein is different than the rest of the protein.
One disadvantage of this server is that it doesn't provide output pdbs which could be used to generate animated gif-pictures. The color code and the residual fluctuation plots however are very clear and straightforward so it isn't hard to identify flexible and stable regions. It is, however, not possible to determine the way of movement from the still image.
The NOMAD <ref>[]</ref> server provides a lot of information and options. The interface is quite user friendly as all available parameter choices are explained in detail and there is also the runtime listed for an example NMA, which can be used to estimate the runtime for our own jobs.
The following parameters can be set:
- Number of modes to calculate
- As specified in the task description we wanted to obtain 10 modes. NOMAD does six zero modes which are just translation and rotation. Therefore we set the number of modes to calculate to 16.
- Distance weight parameter
- This parameter is used to introduce a smoother cutoff value that in the original Tirion model. All distances are weightend by exp(-(d_ij/d)^2), where d is the distance weight parameter. As proposed by NOMAD a distance weight parameter of 3Å is well suited for CA-only models. As we are doing no all-atom calculation, the distance weight parameter was set to 3.0Å.
- Cutoff to use for mode calculation
- The cutoff describes which pairs of atomes are linked by a spring of universal length according to the Tirion model (Elastic Network Model). The cutoff was set to 15Å.
- Average Rmsd in output trajectories
- For the average RMSD the default value (3.0) was used.
- Method to use
- Full matrix solver
- Sparse matrix solver
- Here we used the default option, the automatic mode.
The output contains one PDB file and one plot per mode. The plot contains the rmsd per residue, which can be interpreted as the amplitude of movement and which is controlled by the average rmsd of trajectory (input parameter).
|mode 7||mode 8||mode 9||mode 10||mode 11|
In general we can say that only one part of the protein shows motion. The five modes calculated by NOMAD-Ref show all the same kind of movement for the loop at the beginning of the protein (see Figures 32-36, green loop on the right). Figures 37-41 confirm this observation as they all show a high amplitude of movement for the first atoms in the protein, and only very little peaks for the rest of the protein. The only difference is mode 11, where a small peak at the end of the protein can be detected. This peak corresponds to the hinge-movement of the end of the protein sequence (in the picture on the left side of the protein).
All-atom NMA using Gromacs on the NOMAD-Ref server
In order to do the all-atom NMA we needed an appropriate small molecule that contained not more than 2000 atoms. This small protein was found by searching for "all atom nma". We found a paper <ref>Hetunandan Kamisetty, Eric P. Xing and Christopher J. Langmead: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation[]</ref>, where they used the structure of a hen egg-white lysozyme for an all atom NMA. So we did all the calculations for the corresponding PDB entry 2lyz.
First, we needed to prepare our PDB file. The PDB file for 2LYZ protein contains 1001 atoms in total, all lines not beginning with "ATOM" were removed from the PDB file.
The following movies show the all-atom NMA for 2LYZ at 600K
|mode 1||mode 2||mode 3|
The following movies show the all-atom NMA for 2LYZ at 2000K
|mode 1||mode 2||mode 3|
Comparison to an Elastic Network
|mode 1||mode 2||mode 3|
The results for the all-atom NMA for 2LYZ look all very similar. The calculated modes show small motions of 2LYZ, looking like a "breathing" molecule. There is no apparent difference in the different modes calculated at one temperature. Furthermore the motion does not seem to be dependent of the temperature as all modes calculated at 600K and 2000K look similar.
We have applied five different methods to calculate normal modes for BCKDHA. All methods agree in their reported movements. Each method returned a large, hinge-like motion of the C-terminal loop region. Some calculations also show a twisting and wiggling motion of this region. Some other modes show a hinge-movement of the N-terminal helix region. As all methods reported that only the terminal protein regions are flexible we conclude that these movements are functionally important. They could possibly be crucial for binding ligands or another protein. With the knowledge that the branched-chain alpha-keto dehydrogenase is an enzyme complex consisting of two alpha-subunits (encoded for by BCKDHA) and two beta-subunits (encoded by BCKDHB) one can assume that these movements might be necessary for building up the protein complex. Figure 55 shows that the flexible regions (especially the C-terminal loop) are intertwined. This structure might be obtained via the movements the flexible C-terminal loop can perform as we have seen before.
Advantages and Disadvantages from NMA and MD
The basis of Molecular Dynamics (MD) is the solvation of Newton’s equations of motion to yield a trajectory of atomic positions. With MD the molecular motion (including small and large structural fluctuations and comformational transitions) can be described realistically. Therefore MD can be used to reveal structural changes in a protein at the atomic level. Moreover, the effect of the solvent can be taken into account. MD is limited by the approximate nature of the force ﬁelds and by the relatively short time scale (of the order of a nanoseconds) that is computationally accessible.
NMA is a very powerful method to gain insight into the large-scale, shape-changing motions in proteins. The motion is modeled asa superposition of a set of independent harmonic oscillations about the equilibrium atomic position. NMA uses the same force ﬁelds as used in molecular dynamics simulations, but the idea behind it is very simplified. It is assumed that the conformational energy surface at an energy minimum can be approximated by a parabola over the range of thermal fluctuations (which is not correct at physiological termperatures).
Elastic Network models are a special form of NMA, where the protein model is drastically simplified. The distances of all of the elastic connections are taken to be at their minimum energy length, therefore there is no need for energy minimization because. Second, the number of atoms is reduced as only the C-alpha atoms are used. This simplification brings a major advantage in the diagonalization task of NMA.
Advantages of NMA over MD are:
- fast calculations
- no sampling problem
Compared to the
very detailed protein motions that are calculated by Molecular Dynamics simulations, NMA shows only large amplitude motions of the protein. A great advantage of NMA is its speed. As only motions of huge protein parts are calculated, it is much faster than MD.
go back to Maple syrup urine disease main page
go back to Task 8: Molecular Dynamics Simulations
go to Task 10: Molecular Dynamics Analysis