Difference between revisions of "Normal Mode Analysis BCKDHA"
(→All-atom NMA using Gromacs on the NOMAD-Ref server)
|Line 1:||Line 1:|
== Introduction ==
Normal Mode Analysis is a powerful tool to examine large global motions of proteins. A normal mode of an oscillating system is a pattern of motion in
which all parts of the system move with the same frequency and in phase. Proteins can be modeled as harmonic oscillating systems. NMA methods calculate low-frequency modes for a protein which correspond to collective motions of the complete protein.
In this task we applied several NMA-methods to our protein BCKDHA and performed an all-atom NMA with small molecule (2LYZ).
== WEBnm@ ==
== WEBnm@ ==
Revision as of 14:28, 13 August 2011
- 1 Introduction
- 2 WEBnm@
- 3 ElNemo
- 4 Anisotropic Network Model web server
- 5 oGNM – Gaussian network model
- 6 NOMAD-Ref
- 7 All-atom NMA using Gromacs on the NOMAD-Ref server
- 8 Discussion
- 9 Advantages and Disadvantages from NMA and MD
- 10 References
Normal Mode Analysis is a powerful tool to examine large global motions of proteins. A normal mode of an oscillating system is a pattern of motion in which all parts of the system move with the same frequency and in phase. Proteins can be modeled as harmonic oscillating systems. NMA methods calculate low-frequency modes for a protein which correspond to collective motions of the complete protein. In this task we applied several NMA-methods to our protein BCKDHA and performed an all-atom NMA with small molecule (2LYZ).
WEBnm@<ref>http://apps.cbu.uib.no/webnma/home</ref> provides two different modes:
The Single Analysis calculates the lowest frequency normal modes of the given protein and offers different types of calculations to analyse the modes that were calculated. The force field used for the Normal Modes Calculations is the C-alpha force field It uses only the Calpha atoms of the protein which are assigned the masses of the whole residue they represent.
The different types of calculation are:
- deformation energies of each mode
- calculation of normalized squared atomic displacements (results are provided for each low frequency mode, either as raw data or as plots with displacement vs. residue number)
- interactive visualization of the modes using vector field representation or vibrations
Comparative Analysis (beta version):
The Comparative Analysis calculates and compares the normal modes of a set of aligned protein structures. This tool is still under development. It also provides three types of calculations:
- Deformation Energy profiles
- Atomic Fluctuation profiles
- Conformational Overlap Comparison
- Single Analysis: structure file in the pdb format
- Comparative Analysis: a file containing the sequence alignment of the proteins which should be compared and a protein structure file for each of the proteins. The alignment file needs to be written in the Fasta format, and the header line of each sequence should contain the name of the structure file as first field, and the chain in the last field.
Below are the values of the deformation energy for modes 7 to 20
|Mode Index||Deformation Energy|
WEBnm@ visualised the normalized squared atomic displacements for the first five modes (modes 7 to 11). Figures 1-5 display the first five normal modes of our protein. Figure 6-10 show the square of the displacement of each C-alpha atom, normalized so that the sum over all residues is equal to 100. The highest values correspond to the most displaced regions. Cluster of peaks identify significantly big regions. Isolated peaks reflect local flexibility and are not relevant.
|mode 7||mode 8||mode 9||mode 10||mode 11|
The calculated normal modes of Webnma differ in the amplitude of movement. While modes 7 and 9-11 show the highest peak and therefore the most movement for residues 0-25, mode 8 has the highest peak for the last 40 residues in the sequence.
The normal modes calculated by Webnma show that the most displaced regions of the branched-chain alpha-keto acid dehydrogenase complex are the beginning and the end of the protein sequence. The two ends of the protein sequence which are also the outermost parts of the protein structure show some kind of hinge-movement. The protein motion could be described as an opening and closing complex.
The input for ElNemo is a protein structure in PDB format. From this PDB file only the residues that are encoded by ATOM are used in the calculations. The other residues are not taken into account. If there are other residues which should be used in the calculations they have to be encoded by ATOM. Additionally there are a lot of options which can be chosen.
- properties of the first 100 lowest frequency modes (frequency, collectivity of atom movement, overlap of each mode with the observed conformational change (if two conformations are available) and its corresponding amplitude)
- 3D animations from three orthogonal viewpoints in large and small sizes
- Comparison of a normal mode perturbed structure and a second conformation in terms of RMSD and number of residues that are closer than 3Å can be done
- cross plot where the analysis of distance fluctuations between all CA atoms is shown. Red (decreasing) and blue dots (increasing) indicate the residues for which the distance changes significantly in movement. (The upper left corner indicates the first residue. Grey lines are drawn every 10 residues, yellow lines are drawn every 100 residues.)
- ElNémo Webserver<ref>http://www.igs.cnrs-mrs.fr/elnemo/start.html</ref>
- K. Suhre and Y.-H. Sanejouand, ElNémo: a normal mode web server for protein movement analysis and the generation of templates for molecular replacement<ref>Karsten Suhre and Yves-Henri Sanejouand, ElNémo: a normal mode web server for protein movement analysis and the generation of templates for molecular replacement, Nucl. Acids Res, 2004</ref>
CA distance fluctuations for the six modes
|mode 7||mode 8||mode 9||mode 10||mode 11|
Figures 11-14 show that the greatest distance fluctuations are between the 10-20 first amino acids and the rest of the protein (residues 50-400). While mode 7 calculated only distance decreases, mode 8 seemed to have calculated almost only increasing distances between the first ~20 residues and the rest of the protein. The cross plots for mode 9 and 10 (figures 13 and 14) show strong distance fluctuations (decreases for residues 1-10 and increases for residues 10-20) between the first 20 residues and the rest of the protein. Mode 11 as displayed in figure 15 calculated completely different distance fluctuations. Here the highest distance fluctuations are between the last 40 residues and the rest of the protein. There are both increasing and decreasing distances. The totally different cross plot leads to the assumption, that the calculated mode 11 differs quite a lot from the other normal modes. It is very likely, that here the last part of the protein shows the greatest movement.
ElNemo prepared different views from three orthologuous viewpoints with MolScript for each mode.
The mode displayed in figure 16 agrees with the distance fluctuation seen in figure 11. The very beginning of the peptide chain moves away from the rest of the protein. It looks like a hinge-movement.
The mode shown in figure 17 shows that the beginning of the peptide sequence moves towards the protein. This observation can be confirmed when looking at the cross plot given in figure 12, where the decreasing distance for the first residues is given by blue dots. This mode shows another hinge-movement.
As seen at the distance fluctations plot (figure 13), the distances for the first residues in the peptide chain vary, some are decreasing and some are increasing. This can be explained by a twisting peptide sequence, where some residues come closer to the protein core and other move apart.
Mode 10 behaves similarily to mode 9, only the beginning of the protein chain seems not to be twisting but to be pulled in and out. This observation agrees with the increasing and decreasing distances shown in figure 14.
Figure 20 shows a hinge-movement of the last part of the protein sequence. The helix-part shown in red moves to and apart from the protein core, which is also displayed in figure 15.
ElNemo calculated very different normal modes. Some of the normal modes show some kind of hinge-movement at the one end of the protein, another mode shows the movement of the other end in the protein. All in all we can say, that our protein seems only to be flexible at the outermost parts, while the core of the protein is very stable.
Anisotropic Network Model web server
The ANM Webserver<ref>http://ignmtest.ccbb.pitt.edu/cgi-bin/anm/anm1.cgi</ref> provides NMA Analysis with the anisotropic network model (ANM) which is an elastic network (EN).
- PDB id or PDB file
- chain id
- model (for multi-model files such as from NMR)
- cutoff for interaction between Cα atoms in Å (set to 15Å)
- distance weight for interaction between Cα atoms (set to 3.0)
Output: The ANM Webserver offers a broad range of output files to analyse the computed normal modes more precisely. On the main page you can visualize the 20 first modes calculated. It is possible to scale the amplitude and frequency of motion, to display vectors and the protein in different ways and colors. Furthermore the following options are available
- download files
- create PDB (motion)
- create PyMol script
- get anisotrpic temp. factors
- B-factors/mode fluctuations
- Distance fluctuations and deformation energy
In the following we show the movements, the distance fluctuations, the deformation energies per position and the B-factors for each mode.
As the first 5 modes could not be displayed via pymol, we will analyse modes 6-11 in the following section.
Overall energy: 108.50491155
Mode 6 shows two centers of movement: First, the beginning of the peptide sequence is twisting and turning as shown in Figure 21a. This fluctuations can also be seen in the fluctuation of individual residues according to experimental b-factors (Figure 21b). The distance matrix (Figure 21c) shows that most of the residues in the protein are correlated (red), only the residues at the beginning of the peptide chain are anti-correlated (blue). The white zones indicate weak correlations. So besides the very anti-correlated beginning of the peptide chain, there is a part in the end of the protein that seems to be correlated weakly. This can also be ovserved when looking at Figure 21a, where another hinge-movement cacn be detected at the right.
Overall energy: 105.33659403 Mode 7 shows twisting movements at both ends of the protein (see Figure 22a). The distribution of b-factors (Figure 22b) and the correlation matrix (Figure 22c) agree with this observation.
Overall energy: 114.7349691 The movements given in mode 8 (Figure 23a) seem to be very similar to the ANM mode 7. But looking at the errors in Figure 23a one can see that the movement goes in exactly the opposite direction. The fluctuations per residue for mode 8 (Figure 23b) show also a much higher amplidude than for mode 7.
Overall energy: 187.72640385 Mode 9 displays a small turning movement of the beginning of the peptide sequence (see Figure 24a). This observation can be confirmed when looking at the fluctuations of single residues (Figure 24b), where a peak exists only for the residues 6-30, and at the correlation matrix (Figure 24c), which shows a highly anti-correlated region for the beginning of the peptide sequence. Again, as in mode 6, the end part of the protein shows some evidence of movement, too, as the correlation here is very weak.
Overall energy: 183.87695037 The ANM mode 10 shows the most movement of the protein. The whole protein seems to be turning and both ends are moving like hinges (see Figure 25a). This strong movements can also be detected when looking at the many small peaks in the b-factors (Figure 25b) and the small zones of weak and no correlation in the distance matrix (Figure 25c).
Almost all modes calculated by ANM and discussed above agree in the following point: The flexible regions of our protein are the beginning and the end of the protein sequence. Only mode 10 differs from this observation, as a lot more motions all over the whole protein are visible. These motions however are still very weak according to the twisting and wiggling sequence ends.
oGNM – Gaussian network model
The oGNM Webserver<ref>http://ignm.ccbb.pitt.edu/Online_GNM.htm</ref> calculates the equilibrium dynamics of any structure submitted in PDB format, using the Gaussian Network Model (GNM).
- PDB id or PDB file
- No. of nodes to represent a nucleotide (1 or 3)
- Cutoff for for amino acid pairs
- Cutoff for nucleotide pairs
- Preferred visualization engine (Jmol or Chime)
Output: The oGNM Webserver provides an comprehensive overview over the first 20 calculated normal modes. It is possible to display the slow modes, slow eigenvectors, slow average, slow av1-3 and RMSD of two modes side-by-side. The output includes:
- the mobility profiles of residues corresponding to the 20 slowest modes of motion predicted by the GNM
- the average profile reuslting from the first 2 slowest modes
- the associated eigenvalues (21 of them, including the zero eigenvalue)
- the predicted and experimental B-factors, and the correlation coefficient between the two sets of B-factors
- the spring constant (g) in units of kcal/mol.Å2
- the cross-correlation between residue fluctuations, plotted as a correlation map (for structures containing less than 2000 nodes)
- the nodes included in the GNM analysis, summarized in the .ca file
The results can also be found .
In the following section we are going to discuss the 5 lowest frequency modes calculated by oGNM. The following figures show the mobility of the protein for each computed normal mode, colored from blue to red in the order of increasing mobilities, as well as the fluctuations per residue.
The oGNM mode 1 shows a mobile part at the one end of the protein (Figure 26a). This mobility is also displayed in the fluctuations per residue (Figure 26b), where a peak for residues 1-30 indicates high felxibility, while the rest of the protein seems to be very stable.
oGNM calculated a very different mobile region of the protein for mode 2. Here only the other end of the protein is flexible, the rest of the protein is more or less stable (compare Figure 27 a and b).
Mode 3 is similar to mode 1, with the exception, that there are two distinct peaks at the beginning of the protein sequence (see Figure 28b), indicating two separated centers of movement with a stable part in between.
The calculated mode 4 is very different form the other oGNM modes. Here almost half of the protein is colored red (FIgure 29a), indicating high mobility. It is noticable, that especially parts, that were moving in the previous modes, are not predicted to move in mode 4. The residue fluctuations (Figure 29b) for mode 4 correlate well with the colored image given in Figure 29a.
Mode 5 is similar to mode 2, where only the last part of the protein seems to be flexible as indicated by the red color in Figure 30a. When taking a closer look at the residue fluctuations for mode 5 (Figure 30b) it is obvious, that there are two separated peaks at the end of the protein, thus there are two centers of movement with a small stable part in between.
Figure 31 shows the cross correlations between residue fluctuations for modes 1-5.
Fluctuation vectors in the same direction have values of +1 and are colored dark red indicating the motions are fully correlated. Fully anti-correlated motions are displayed in dark blue and are given by values of around -1. Figure 31 shows that the first 20-30 residues are correlated among themselves but are totally anti-correlated with the rest of the protein. The same is true for the last 50 residues of the protein chain. The rest of the protein is more or less correlated well, with some parts in the middle that are anti-correlated.
The five first calculated modes from oGNM are very different. They differ in the part of the protein that is flexible as well as in the amount of movement. But as two modes predict a moving peptide sequence at the start of the protein and two modes predict some movement at the end of the protein sequence it is very likely that both the start and the end of the protein sequence are very flexible, while the rest of the protein is quite stable.
This conclusion is confirmed by the cross correlation plot, which shows that only the first and the last residues are correlated well among themselves but totally anti-correlated with the rest of the protein.
One disadvantage of this server is that it doesn't provide output pdbs which could be used to generate animated gif-pictures. The color code and the residual fluctuation plots however are very clear and straightforward so it isn't hard to identify flexible and stable regions. It is, however, not possible to determine the way of movement from the still image.
The NOMAD <ref>[]</ref> server provides a lot of information and options. The interface is quite user friendly as all available parameter choices are explained in detail and there is also the runtime listed for an example NMA, which can be used to estimate the runtime for our own jobs.
The following parameters can be set:
- Number of modes to calculate
- As specified in the task description we wanted to obtain 10 modes. NOMAD does six zero modes which are just translation and rotation. Therefore we set the number of modes to calculate to 16.
- Distance weight parameter
- This parameter is used to introduce a smoother cutoff value that in the original Tirion model. All distances are weightend by exp(-(d_ij/d)^2), where d is the distance weight parameter. As proposed by NOMAD a distance weight parameter of 3Å is well suited for CA-only models. As we are doing no all-atom calculation, the distance weight parameter was set to 3.0Å.
- Cutoff to use for mode calculation
- The cutoff describes which pairs of atomes are linked by a spring of universal length according to the Tirion model (Elastic Network Model). The cutoff was set to 15Å.
- Average Rmsd in output trajectories
- For the average RMSD the default value (3.0) was used.
- Method to use
- Full matrix solver
- Sparse matrix solver
- Here we used the default option, the automatic mode.
The output contains one PDB file and one plot per mode. The plot contains the rmsd per residue, which can be interpreted as the amplitude of movement and which is controlled by the average rmsd of trajectory (input parameter).
|mode 1||mode 2||mode 3||mode 4||mode 5|
In general we can say that only one part of the protein shows motion. The five modes calculated by NOMAD-Ref show all the same kind of movement for the loop at the beginning of the protein (see Figures 32-36, green loop on the right). Figures 37-41 confirm this observation as they all show a high amplitude of movement for the first atoms in the protein, and only very little peaks for the rest of the protein. The only difference is mode 5, where a small peak at the end of the protein can be detected. This peak corresponds to the hinge-movement of the end of the protein sequence (in the picture on the left side of the protein).
All-atom NMA using Gromacs on the NOMAD-Ref server
In order to do the all-atom NMA we needed an appropriate small molecule that contained not more than 2000 atoms. This small protein was found by searching for "all atom nma". We found a paper <ref>Hetunandan Kamisetty, Eric P. Xing and Christopher J. Langmead: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation[]</ref>, where they used the structure of a hen egg-white lysozyme for an all atom NMA. So we did all the calculations for the corresponding PDB entry 2lyz.
First, we needed to prepare our PDB file. The PDB file for 2LYZ protein contains 1001 atoms in total, all lines not beginning with "ATOM" were removed from the PDB file.
The following movies show the all-atom NMA for 2LYZ at 600K
|mode 1||mode 2||mode 3|
The following movies show the all-atom NMA for 2LYZ at 2000K
|mode 1||mode 2||mode 3|
Comparison to an Elastic Network
|mode 1||mode 2||mode 3|
The results for the all-atom NMA for 2LYZ look all very similar. The calculated modes show small motions of 2LYZ, looking like a "breathing" molecule. There is no apparent difference in the different modes calculated at one temperature. Furthermore the motion does not seem to be dependent of the temperature as all modes calculated at 600K and 2000K look similar.
We have applied five different methods to calculate normal modes for BCKDHA. All methods agree in their reported movements. Each method returned a large, hinge-like motion of the C-terminal loop region. Some calculations also show a twisting and wiggling motion of this region. Some other modes show a hinge-movement of the N-terminal helix region. As all methods reported that only the terminal protein regions are flexible we conclude that these movements are functionally important. They could possibly be crucial for binding ligands or another protein. With the knowledge that the branched-chain alpha-keto dehydrogenase is an enzyme complex consisting of two alpha-subunits (encoded for by BCKDHA) and two beta-subunits (encoded by BCKDHB) one can assume that these movements might be necessary for building up the protein complex. Figure 55 shows that the flexible regions (especially the C-terminal loop) are intertwined. This structure might be obtained via the movements the flexible C-terminal loop can perform as we have seen before.
Advantages and Disadvantages from NMA and MD
go back to Maple syrup urine disease main page
go back to Task 8: Molecular Dynamics Simulations
go to Task 10: Molecular Dynamics Analysis