Task 9: Normal Mode Analysis

From Bioinformatikpedia
Revision as of 17:29, 31 August 2011 by Pfeiffenberger (talk | contribs) (Webnma)

There are several forces acting within a protein. Most of these forces have one or more equilibrium states. In reality the protein is flexible. That means, that the atoms of the system swing around these equilibrium states. These swinging around the equilibrium states can be approximated by the harmonic approximation. The forces can then be replaced by a less complex system of springs. With this model it is possible to calculate large motions of the protein by Normal Mode Analysis.

In opposite to Molecular Dynamics this is very fast and it is not a simulation it is a calculation of possible simple and large motions.

In this task we want to try several servers, which calculate normal mode analysis. A detailed description of the task can be found here. For most runs we used the structure 1J8U for our protein PAH. In the case of the all-atom NMA we used a smaller protein to test the server and this kind of approach.

Webnma

The server WEBnm@ uses only the C-alpha atoms to calculate the normal modes. The calculations are performed using the Molecular Modelling Toolkit with the C-alpha force field.

The used input format is very easy (see below).

Figure 1: Input mask of Webnma


Results

The server offers different statistics and visualizations to the performed NMA calculation.

  • An atomic displacement analysis plots the displacement of each C-alpha atom, therefore one can inspect the parts of the protein, which are the most displaced for each mode. These plots are shown for PAH below.
  • Mode visualization and vector field analysis. The vector field representation of the modes shows the directions and (relative) amplitude of the displacements undergone by different parts of the protein as vectors. The visualization and the vector field analysis can be downloaded as vmd-files. But these files seem to be broken. The mode visualizations of PAH are shown below.

Plots of atomic displacement

Mode 7

PAH WEBNMA Mode7plot.png

Mode 8

PAH WEBNMA Mode8plot.png

Mode 9

PAH WEBNMA Mode9plot.png

Mode 10

PAH WEBNMA Mode10plot.png

Mode 11

PAH WEBNMA Mode11plot.png

Visualization of Normal Modes

Mode 7

PAH webnma Mode7.gif

The movement is some kind of shear motion. This shear motion seems to compress and relax the binding pocket and the corresponding canal to the pocket. The protein as a whole seems to breath.

Mode 8

PAH webnma Mode8.gif

The movement is some kind of shear motion, but on another axis compared to Mode 7. The motion seems to influence mostly the canal to the binding pocket.

Mode 9

PAH webnma Mode9.gif

The movement is some kind of breathing. Especially the oligomerization domain seems to be influenced of the motion.

Mode 10

PAH webnma Mode10.gif

The movement is some kind of shear motion. In opposite to Mode 7 the size of the binding pocket and the canal to the pocket are not influenced.

Mode 11

PAH webnma Mode11.gif

Between the catalytic and the oligomerization domain is some kind of hinge motion. The movement of the catalytic domain itself is difficult to describe. The catalytic domain seems to breath somehow.

ElNemo

ElNemo is a webserver using the Elastic Network Model to calculate the normal modes of proteins (Tirion, 1996; Tama et al., 2000; Delarue and Sanejouand, 2002). The Elastic Network is a network model of a protein. Usually the nodes in this network are the C-alpha atoms and the edges are the interactions within the protein simulated as springs. This approach sufficient to calculate the low-frequency normal modes of a protein.

The input form of elNemo seems to be not that simple at the first glance. The form is splitted in several subsections.

In the basic section you just have to define the protein structure by file or by copy-and-paste of the atom-section. Additionally one should define a job-title and specify an e-mail address. The e-mail address is useful, because the jobs need usually some time.

PAH elnemo basic input field.png

In the supplementary section you can define how many slow modes you want to see. Additionally you can define the number of structures calculated for each mode by defining the maximum and minimum amplitude and the step of amplitude.

PAH elnemo supplementary input field.png

The comparing section can be used to define a second conformation of the protein. elNemo calculates then the contribution of each normal mode to the second structure.

PAH elnemo comparing input field.png

The last section is for experts. Here you can define the granularity of the elastic network model. That means you can define the size of the substructures used for nodes and the threshold for interactions used as edges.

PAH elnemo expert input field.png

Compared to the default settings we just changed the cutoff for interactions to 15 Angstrom in the expert section.

Results

ElNemo calculates several statistics and visualizations for the calculated Normal Modes:

  • In an Overall normal mode analysis ElNemo calculates statistics concerning collectivity and overlap. Collectivity means in this case a measure for the collective protein motion in each calculated normal mode. The overlap is calculated if a second initial structure was used in the comparing section of the input form. The overlap measures how much a normal mode contributes to the second observed conformation.
  • B-factors. The B-factor is a measure for flexibility in a protein. Using the first 100 normal modes elNemo can calculate the expected B-factor of each residue and can compare it to the B-factors of the initial structure.
  • Individual normal mode statistics. For each mode elNemo calculates several statistics. If the option was used animated images are generated showing the movement according to the normal mode. C-alpha maximum distance fluctuations between all pairs of C-alpha atoms are calculated for each mode. The normalized mean square displacement of all C-alpha atoms in the protein is calculated. A graph displaying the distance variation between successive pairs of CA atoms in the two extreme conformations that were computed for this mode (maximum/minimum amplitude) is calculated.
  • Additionally one can calculate more fine grained models for each normal mode.
  • Additionally one can calculate superpositions of two normal modes.

Collectivity

Low Collectivity of Mode 4:

PAH ELNEMO COLLECTIVITY.png

B-Factor

The following plot was created by R and the raw numbers calculated by elNemo. ElNemo itself has a very nice webpage representation of the data, but it is not that nice for the wiki.

PAH elNemo Bfactors.png

Visualization of Normal Modes

Mode 7

PAH ELNEMO Mode7 1.gif PAH ELNEMO Mode7 2.gif PAH ELNEMO Mode7 3.gif

The motion is a shear motion. This is especially obvious in the last image.

Mode 8

PAH ELNEMO Mode8 1.gif PAH ELNEMO Mode8 2.gif PAH ELNEMO Mode8 3.gif

The motion is a shear motion. This is especially obvious in the last image.

Mode 9

PAH ELNEMO Mode9 1.gif PAH ELNEMO Mode9 2.gif PAH ELNEMO Mode9 3.gif

The motion is a shear motion. This is especially obvious in the last image.

Mode 10

PAH ELNEMO Mode10 1.gif PAH ELNEMO Mode10 2.gif PAH ELNEMO Mode10 3.gif

The motion is some kind of breathing.

Mode 11

PAH ELNEMO Mode11 1.gif PAH ELNEMO Mode11 2.gif PAH ELNEMO Mode11 3.gif

The motion is a shear motion. This is more obvious in the first illustration.

CA-fluctuations

Mode 7

PAH elnemo caflutuations Mode7.png

Mode 8

PAH elnemo caflutuations Mode8.png

Mode 9

PAH elnemo caflutuations Mode9.png

Mode 10

PAH elnemo caflutuations Mode10.png

Mode 11

PAH elnemo caflutuations Mode11.png

Anisotropic Network Model web server

The Anisotropic Network Model web server is a web server for the calculation of the normal modes of proteins. This web server uses the Anisotropic network model (ANM) for the calculation of the normal modes (Bahar et al., 2000; Bahar et al., 2001). The ANM is a kind of Elastic Network. In this concrete case each node is the Cα atom of a residue and the edges are interactions between interacting nodes. Two nodes are interacting if their distance is below a certain threshold. The interactions are calculated by harmonic potentials. In contrast to the Gaussian Network Model the orientation of each interaction is considered in the calculation of the normal modes.

The input form of the server is very simple. Besides the definition of the used initial protein structure, one just has to/can define the cutoff distance for interactions and a distance weight. The distance weight adjusts the correlation between calculated and experimental B-factors.


PAH ANMA input.png

In our case we just adjusted the weight factor to 3.

Results

The server calculates several statistics and visualizations to the performed normal mode analysis:

  • On the result web page it is possible to play around with several values in the visualization. For example one can choose the normal mode to be displayed and for this mode one can choose the frequency and amplitude of motion. One can add vectors to the visualization, which describe the movement of different parts of the protein. One can also adjust the coloring, the visualization of bonds and atoms and adding labels to chosen amino acids. It is also possible to have a look at the elastic network model for different distance thresholds. It is also possible to focus on the interactions between chains. This visualization on the web site is very powerful.
  • The server offers the option to calculate a pdb file for one normal mode. The user can choose the resolution of the different conformations listed in this file.
  • The user can obtain pymol scripts for the different normal modes.
  • It is possible to calculate the anisotropic temperature factors.
  • The user can calculate the B-factors on the basis of the calculated normal modes.
  • It is possible to calculate the distance fluctuations and deformation energy between residues within the protein for a normal mode. Here the user can choose two different approaches: scalar or vectorial. In the vectorial view differences of orientation are also considered.
  • It is possible to create a plot for the calculated B-factors of different normal modes and the experimental B-factors. However the average calculated B-factors do not work properly and do not appear in the plot. Additionally the B-factors of the normal modes seem to be unnormalized.
  • It is possible to have a look at the different calculated eigenvalues, which represent the frequency of the normal mode in the normal mode analysis.
  • The server offers a feature, where one can calculate the covariance of different normal modes. But at least for us this feature did not work.

The listing above shows the sever is powerful and offers a lot of additional statistics. But some of them did not work properly at least for our analysis.

Visualization of Normal Modes

Mode 1

PAH ANMA Mode1.gif

The motion seems to be a shear motion.

Mode 2

PAH ANMA Mode2.gif

The motion seems to be a shear motion.

Mode 3

PAH ANMA Mode3.gif

There seems to be a shear motion between two loops forming the tunnel to the active site and a hinge motion between the oligomerization domain and the catalytic domain.

Mode 4

PAH ANMA Mode4.gif

There seems to be a hinge motion of between the catalytic and the oligomerization domain.

Mode 5

PAH ANMA Mode5.gif

There seems to be some kind of shear motion. The protein at all seems to breath somehow.

B-factors

The experimental B-factors

PAH ANMA Bfactors experimental.png

The calculated B-factors for the different Normal Modes

PAH ANMA Bfactors modes.png

Residue displacement analysis

Mode 1

PAH ANMA distmapscalar1.png

PAH ANMA dfegraphA1.png

Mode 2

PAH ANMA distmapscalar2.png

PAH ANMA dfegraphA2.png

Mode 3

PAH ANMA distmapscalar3.png

PAH ANMA dfegraphA3.png

Mode 4

PAH ANMA distmapscalar4.png

PAH ANMA dfegraphA4.png

Mode 5

PAH ANMA distmapscalar5.png

PAH ANMA dfegraphA5.png

oGNM – Gaussian network model

Introduction to the method

oGNM is a online web-service for the computation of structural dynamics in proteins by using a Gaussian network model. This means that this method is able to compute isotropic and Gaussian fluctuations of nodes (which are our C-alpha atoms). This web-service originated from the iGNM database which computed the dynamic of 20 058 structures accessible in the PDB database. The main advantages of oGNM are :

  • that it is not limited to relatively small proteins or to a single domain of a protein.
  • it is very fast. results for the different modes are returned within seconds.
  • it allows the computation of not only proteins but also of RNA/DNA alone and proteins with RNA/DNA complexes.

Another advantage of oGNM is their comprehensive offer of different output formats such as:

  • a mobility profile for each residue of the protein for all 20 modes of motion
  • an average profile
  • eigenvalues of each mode
  • predicted and average B-factors
  • the spring constant g
  • a cross correlation map which shows the correlation of residue fluctuation for the selected modes
  • a .ca file which describes the nodes used in the analysis

Execution

Submission of job was done via the webinterface (see figure below). We choose the structure 1J8U for our PAH protein and selected a 15 Angstrom cutoff.

Input.png
Input mask: shows the selected structure (1J8U) and a cutoff of 15 Angstrom selected for computation

Results

For each represented mode below we show a colored 3D structure (colored continuously from blue to red with the meaning that blue areas are less flexible than red areas, which are predicted to be highly flexible) and a fluctuation diagram which represents the residue on the x-axis and its corresponding fluctuation on the y-axis (a higher peak means more fluctation for this residue).

Mode 7

3d mode7.png Mode7 pah.png
Mobility of protein structure 1J8U for mode 7 Fluctuation diagram for protein structure 1J8U for mode 7

The 3D structure is colored entirely in blue this means there was no flexible region predicted. However, the fluctuation diagram shows high peaks at approximated residue positions 150, 340, 360, 410 and 420.

Mode 8

3d mode8.png Mode8 pah.png
Mobility of protein structure 1J8U for mode 8 Fluctuation diagram for protein structure 1J8U for mode 8

The 3D structure is colored entirely in blue this means there was no flexible region predicted. However, the fluctuation diagram shows high peaks at approximated residue positions 150, 340, 410 and 420.

Mode 9

3d mode9.png Mode9 pah.png
Mobility of protein structure 1J8U for mode 9 Fluctuation diagram for protein structure 1J8U for mode 9

The 3D structure is colored entirely in blue this means there was no flexible region predicted. However, the fluctuation diagram shows high peaks at approximated residue positions 150 and 340.

Mode 10

3d mode10.png Mode10 pah.png
Mobility of protein structure 1J8U for mode 10 Fluctuation diagram for protein structure 1J8U for mode 10

The 3D structure is colored entirely in blue this means there was no flexible region predicted. However, the fluctuation diagram shows high peaks at approximated residue positions 150, 340 and 420.

Mode 11

3d mode11.png Mode11 pah.png
Mobility of protein structure 1J8U for mode 11 Fluctuation diagram for protein structure 1J8U for mode 11

The 3D structure is colored entirely in blue this means there was no flexible region predicted. However, the fluctuation diagram shows high peaks at approximated residue positions 150, 180 and 420.

Discussion

The analysis of the colored 3D structures showed that all parts of the 1J8U structure are supposed to be rigid (the whole protein was colored in blue) which is somehow quite unlikely since other methods showed that there is motion of the protein. Therefore, we assume that this is probably due to a bug in the method. However, analyzing the fluctuation diagram of each mode gave us some information regarding the flexibility of our structure. We could identify three distinguished peaks in an overlay of the fluctuation diagrams of modes 7 to 11. To be more precise we identified common peaks on approximated positions 150, 340 and 420. If we spend more time looking at the properties at the mentioned position we can find some interesting facts.

For position 150 we identified that it is close to the know disease associated mutations G148S and D151H and even still close to the mutation R158Q which was more exhaustive analyzed by us in a previous task. Although, there is no known disease associated mutation at position 150 itself. Furthermore, this peak is located on the secondary structure element turn.

For the next peak, position 340, we identified that it is located on the known disease associated mutation I340T. Furthermore, it is very close to the active site position S349 and is located on a beta-strand.

For the last peak, position 420, we identified that it is close to the known disease associated mutations T418N, T418P and E422K. Although, there is no known disease associated mutation at position 420 itself. Furthermore, this peak is located on the secondary structure element beta-strand.


Mode7to11 pah oGNM.png
Overlay of all fluctuation diagrams of modes 7 to 11

NOMAD-Ref

Introduction to the method

NOMAD-Ref is an on-line web-server which allows to study the flexibility and motion of biological macro molecules by using normal mode analysis. Normally normal modes of a biological macro molecule can be calculated by calculating the eigenvectors of the Hessian matrix which is generated from the energy landscape around a local minima. However, computing the local minima is not trivial and takes some time. Although, it was observed that minimization of a structure might have not such a big impact on the lower modes of normal mode analysis. Hence, NOMAD-Ref uses a elastic network model (also known as ENM) which basically represents a molecular network by harmonic potentials for a given cutoff.

Jobs submitted to the NOMAD-Ref server are queued and executed on a Opteron server. User are then redirected to the waiting webpage which shows them their position in the queue and when they can expect to get their result done. Results are then saved for two weeks on the webserver and can be privately accessed via a unique URL. Input of a protein structure to the NOMAD-Ref web service is done via a PDB formated file. For calculations all ATOM and HETATM entries of this PDB file are considered. However, alternative residue lines are discarded.

Execution

Submission of job was done via the webinterface (see figure below). We choose the structure 1J8U for our PAH protein. We selected the following parameter:

  • PDB-File: 1J8U
  • Number of modes to calculate: 16
  • Distance weight parameter: 3.0
  • Cutoff to use for mode calculations: 15.0
  • Average RMSD: 3.0
  • Method to use: automatic
Input j18u nomad ref.png
Input mask: shows the selected parameter to calculate the normal modes for structure 1J8U

Result

Results shown for the modes below show an animated 3D structure which visualizes the motion of the protein for this mode. In addition, there is a cRMS diagram shown for each mode to show the fluctuation of each atom of the structure.

Mode 7

Mode7 animaion.gif Mode 7 pah.png
Mobility of protein structure 1J8U for mode 7 cRMS diagram for protein structure 1J8U for mode 7


In this mode we can observe a shear motion between the two loops at residue position 140 and 240. Which looks like the protein is opening and closing the active side. The cRMS diagram shows three remarkable peaks at approximated atom position 250, 1300 and 2000.

Mode 8

Mode8 animaion.gif Mode 8 pah.png
Mobility of protein structure 1J8U for mode 8 cRMS diagram for protein structure 1J8U for mode 8

In this mode we can observe a shear motion between the N-terminus and the loop at residue position 140.Furthermore we can observe a shear like motion between the two loops at position 140 and 240. Which looks like the protein is opening and closing the active side. The cRMS diagram show several remarkable peaks at approximated atom position 250, 490, 510, 900, 1200, 1400, 1600, 1700, 2000, 2100, 2300 and 2500.

Mode 9

Mode9 animaion.gif Mode 9 pah.png
Mobility of protein structure 1J8U for mode 9 cRMS diagram for protein structure 1J8U for mode 9


In this mode we can observe a shear motion between the two loops at residue position 140 and 240. Furthermore we can observe a hinge like motion at the C-terminus. The cRMS diagram shows one peak at atom position 2400.

Mode 10

Mode10 animaion.gif Mode 10 pah.png
Mobility of protein structure 1J8U for mode 10 cRMS diagram for protein structure 1J8U for mode 10

In this mode we can observe a shear motion between the two loops at residue position 140 and 240. Furthermore we can observe a hinge like motion at the C-terminus and N-terminus (which are pretty close that is why they form one big hinge motion). The cRMS diagram shows one peak at atom position 300.

Mode 11

Mode11 animaion.gif Mode 11 pah.png
Mobility of protein structure 1J8U for mode 11 cRMS diagram for protein structure 1J8U for mode 11

In this mode we can observe a shear motion between the two loops at residue position 280 and 380. Furthermore we can observe a hinge like motion at the C-terminus . The cRMS diagram shows several peaks at atom position 200, 300, 600 and 2400.

Discussion

For NOMAD-Ref we could observe strong shear motion between the loops at position 140 and 240 in almost all modes. Except for mode 10 where we had a shear like motion between 280 and 380. In addition to this, we observed in some modes a hinge like motion of the N-terminal and C-terminal end.

All-atom NMA using Gromacs on the NOMAD-Ref server

Introduction to the method

NOMAD-Ref is an on-line web-server which allows to study the flexibility and motion of biological macro molecules by using normal mode analysis. In opposite to the standard normal mode analysis of NOMAD-Ref with an elastic network a GROMACS force field-based method is used. This involves a minimization step of the structure and thus takes more time than the standard method.

The disadvantage of this method is that it only allows the submission of very small biological molecules with a size of up to 2000 atoms (hydrogen atoms included).

Execution

We had to do the analysis for this method with a different protein. This was necessary because our protein has more than 2000 atoms. So we decided to use the structure 1BPT which is a bovine pancreatic trypsin inhibitor. In order to get our submission working we had to modify our PDB file first. We removed all lines which do not start with the ATOM identifier. Then we set the temperature to 600k and 2000k and defined that we want to calculate the first 16 modes.

Input gromacs 600k.png
Input mask: shows the selected parameter to calculate the normal modes for structure 1BPT

Result

In the following subsections are the 3D motions of the 1BPT structure visualized. This was done at temperatures at 600k and 2000k. In addition we also did a normal mode analysis with the normal NOMAD-Ref server to compare the results.

600k

mode 7 mode 8 mode 9 mode 10 mode 11
Mode7 600k animaion.gif Mode8 600k animaion.gif Mode9 600k animaion.gif Mode10 600k animaion.gif Mode11 600k animaion.gif
Mode12 600k animaion.gif Mode13 600k animaion.gif Mode14 600k animaion.gif Mode15 600k animaion.gif
mode 12 mode 13 mode 14 mode 15


2000k

mode 7 mode 8 mode 9 mode 10 mode 11
Mode7 2000k animaion.gif Mode8 2000k animaion.gif Mode9 2000k animaion.gif Mode10 2000k animaion.gif Mode11 2000k animaion.gif
Mode12 2000k animaion.gif Mode13 2000k animaion.gif Mode14 2000k animaion.gif Mode15 2000k animaion.gif
mode 12 mode 13 mode 14 mode 15

Comparison to elastic network calculation (NOMAD-Ref)

mode 7 mode 8 mode 9 mode 10 mode 11
Mode7 1bpt animaion.gif Mode8 1bpt animaion.gif Mode9 1bpt animaion.gif Mode10 1bpt animaion.gif Mode11 1bpt animaion.gif

Discussion

When looking at the animated pictures we can see that there is not very much difference in the movement of 1BPT at 600k and 2000k. It seems like there is slightly more distinguished movement at 2000k than at 600k. Even the movement between the different modes is not very visible. However, it seems like that the long loop connecting the C-terminal alpha-helix with the beta-strand seems to be most flexible and moving in our normal mode analysis experiments. Furthermore we can say that the movements in the NOMAD-Ref simulation seem to be more exaggerated compared to the GROMACS based simulations.