Fabry:Normal mode analysis

From Bioinformatikpedia
Revision as of 10:55, 16 August 2012 by Rackersederj (talk | contribs) (Mode Visualization)



For further information on the execution, please refer to our Journal

Introduction

<figure id="fig:bindSite">

This figure shows the substrate binding site of the structures 3HG2 (pale blue) and 3HG3 (pale green). The residues involved in binding the substrate α-D-galactose (residues 203 – 207) are highlighted in dark blue and dark green; the substrate itself is shown in red.

</figure>

One of the first questions that can be asked in this task is why we use low-frequency normal modes. This is explained in the paper by Marc Delarue and Philippe Dumas<ref>Marc Delarue and Philippe Dumas On the use of low-frequency normal modes to enforce collective movements in refining macromolecular structural models, Proc. Natl. Acad. Sci. (USA), 101, 6957-6962 (2004)</ref>, who claim that "many of the structural transitions (...) can be explained by just a few of the lowest-frequency normal modes". The general motion of a system can be generated by superposing its normal modes. Thus we could in principle infer from our analysis in this task how the alpha-galactosidase A that we examine hydrolyses the terminal alpha-galactosyl moiety of its substrate<ref>Normal mode http://en.wikipedia.org/wiki/Normal_mode, July 5th, 2012</ref>.
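
The superposition idea can be illustrated with a minimal sketch. It assumes that each mode is given as one displacement vector per atom and that the amplitudes are free parameters; the function name `displace_along_modes` and the toy numbers are hypothetical and do not come from WEBnm@ or ElNémo.

```python
def displace_along_modes(coords, modes, amplitudes):
    """Return coords shifted by a weighted sum (superposition) of mode vectors.

    coords     -- list of (x, y, z) per atom
    modes      -- list of modes, each a list of (dx, dy, dz) per atom
    amplitudes -- one scalar weight per mode (made-up values, arbitrary units)
    """
    if len(modes) != len(amplitudes):
        raise ValueError("need exactly one amplitude per mode")
    displaced = [list(atom) for atom in coords]
    for mode, amplitude in zip(modes, amplitudes):
        if len(mode) != len(coords):
            raise ValueError("mode and structure must have the same number of atoms")
        for atom, vector in zip(displaced, mode):
            for axis in range(3):
                atom[axis] += amplitude * vector[axis]
    return [tuple(atom) for atom in displaced]


# Toy example: two atoms and two made-up modes
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
modes = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)],
]
print(displace_along_modes(coords, modes, [2.0, 0.5]))
```

Varying a single amplitude between a negative and a positive value gives the two extreme conformations of one mode; combining several amplitudes gives a mixed motion.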

We decided to use the structures 3HG2 and 3HG3, which represent the human α-galactosidase catalytic mechanism with an empty active site and with bound substrate, respectively (see <xr id="fig:bindSite"/>). From this, we hope to gain insight into the mechanism and to be able to compare the normal modes with the behaviour of the molecule.

Possible normal modes:
The structure 3HG2 has 781 C-alpha atoms in its PDB file; thus 781*3 - 6 = 2337 normal modes could in principle be calculated for this structure by elNémo without any cutoff. Since the structure has a total of 6765 atoms, 6765*3 - 6 = 20289 normal modes could be calculated by WEBnm@.
The structure 3HG3 has 793 C-alpha atoms in its PDB file; thus 793*3 - 6 = 2373 normal modes could in principle be calculated for this structure by elNémo without any cutoff. Since the structure has a total of 7537 atoms, 7537*3 - 6 = 22605 normal modes could be calculated by WEBnm@.
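
The counts above follow the usual 3N - 6 rule (three coordinates per atom minus six rigid-body translations and rotations). A small sketch that reproduces the arithmetic; the function name is ours, the atom counts are those quoted in the text.

```python
def normal_mode_count(n_atoms: int) -> int:
    """Number of non-trivial normal modes of a non-linear system of n_atoms."""
    if n_atoms < 3:
        raise ValueError("the 3N - 6 rule needs at least 3 atoms")
    return 3 * n_atoms - 6


# Atom counts quoted in the text above
for label, n_atoms in [
    ("3HG2, C-alpha only", 781),
    ("3HG2, all atoms", 6765),
    ("3HG3, C-alpha only", 793),
    ("3HG3, all atoms", 7537),
]:
    print(label, normal_mode_count(n_atoms))
```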


WEBnm@

WEBnm@ <ref>Hollup SM, Sælensminde G, Reuter N. WEBnm@: a web application for normal mode analysis of proteins BMC Bioinformatics. 2005 Mar 11;6(1):52 </ref> offers simple and automated computation of low-frequency normal modes for proteins, together with analyses that help to decide whether a complete study of the protein in question is worthwhile.
The server calculates normal modes with the help of the MMTK package <ref>Hinsen K, The Molecular Modelling Toolkit: a new approach to molecular simulations, J Comput Chem, 21:79-85, 2000</ref>, an open-source program library for molecular simulation applications. A C-alpha force field <ref>Hinsen K, Petrescu AJ, Dellerue S, Bellissent-Funel MC, Kneller GR, Harmonicity in slow protein dynamics, Chemical Physics, 261:25-37, 2000</ref> is used: only the C-alpha atoms are considered, but each is assigned a weight that corresponds to the mass of the whole residue it represents.
The server provides several analysis tools, and all results can be downloaded without any problems. The tools are:

  • deformation energies of each mode
  • eigenvalues
  • calculation of normalized squared atomic displacements
  • calculation of normalized squared fluctuations
  • interactive visualization of the modes using vector field representation or vibrations
  • correlation matrix



Deformation energies and eigenvalues

<figtable id="tab:eigen">

<figure id="fig:eigen3HG2">
Shown are the eigenvalues of modes 7 to 57 of the structure 3HG2. Except for a jump between modes 16 and 17, the increase is almost linear.
</figure>

<figure id="fig:eigen3HG3">
Shown are the eigenvalues of modes 7 to 57 of the structure 3HG3. Due to several jumps in the values, the increase is not really linear.
</figure>

</figtable>

<figure id="fig:avEn">

In this plot the average energies of the lowest 14 modes, calculated by WEBnm@, are compared for 3HG2 (green) and 3HG3 (blue).

</figure>

In <xr id="fig:eigen3HG2"/> and <xr id="fig:eigen3HG3"/> the eigenvalues of modes 7 to 57 are plotted for both examined structures. The increase of the values reflects a decrease in the amplitude of motion, since there is an inverse relationship between the eigenvalue of a mode and its amplitude. Hence, mode 7 has the highest amplitude.

Since the eigenvalues correspond to the frequencies, and a low frequency tends to describe a global movement of the protein, our assumption is confirmed that the lower modes express global movement, while higher modes rather show many smaller local movements. <ref>Normal Mode (Harmonic) Analysis http://cmm.cit.nih.gov/intro_simulation/node26.html, August 14th, 2012</ref>
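
The inverse relationship can be made concrete with a minimal sketch. It assumes the harmonic approximation, in which the mean squared thermal amplitude of a mode is proportional to the inverse of its eigenvalue; the eigenvalues below are made up for illustration and are not the WEBnm@ results.

```python
import math


def relative_amplitudes(eigenvalues):
    """Amplitude of each mode relative to the softest (lowest-eigenvalue) mode.

    Assumes <q_k^2> is proportional to 1 / eigenvalue_k, so the amplitude
    scales with 1 / sqrt(eigenvalue_k).
    """
    if any(value <= 0 for value in eigenvalues):
        raise ValueError("pass only non-trivial modes with positive eigenvalues")
    softest = min(eigenvalues)
    return [math.sqrt(softest / value) for value in eigenvalues]


# Hypothetical eigenvalues for modes 7, 8, 9 and 10
print(relative_amplitudes([1.0, 4.0, 9.0, 16.0]))
```

With increasing eigenvalue the relative amplitude falls, which is why the first non-trivial mode dominates the large-scale motion.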


Average Energies

In figure <xr id="fig:avEn"/> the average deformation energies of the lowest 14 modes of both structures are compared. For most modes the energies for 3HG2 are higher than those for 3HG3. This means that the amplitude of the motion is in general slightly lower in 3HG2 than in 3HG3 (see also Deformation Energies Table for the values of the average energies).

Mode Visualization

In this section, we visually inspect the motion of the protein in the six lowest non-trivial modes (7 to 12) that were identified by WEBnm@. For a description of the modes of the molecules 3HG2 and 3HG3 see <xr id="tab:webnma_3hg2" /> and <xr id="tab:webnma_3hg3"/>, respectively. Modes 7 through 9 are similar for both molecules, and mode 11 of 3HG2 corresponds to mode 10 of 3HG3. Mode 7, however, moves outwards in 3HG2 and inwards in 3HG3, so the two move in opposite directions. All similar modes can be explained in both states of the α-galactosidase catalytic mechanism, especially mode 11/10, where one active site is in the process of releasing the substrate while the other is binding it. What remains is to compare both modes 12: in 3HG2 the protein looks like it is closing in on the substrate that is to be bound, while mode 12 of 3HG3 could be the beginning of releasing the hydrolyzed sugar.
For mode 10 of 3HG2 and mode 11 of 3HG3 we have no explanation of how they might relate to the function of our protein.
In conclusion, the protein α-galactosidase is rather rigid in most parts, except for the residues that connect chains A and B and those that connect the big and the small part of each chain.

<figtable id="tab:webnma_3hg2"> This table shows the 6 modes that were calculated by WEBnm@. Depicted in cyan is the structure 3HG2, which represents the human α-galactosidase catalytic mechanism with an empty active site; the substrate binding site at positions 203 to 207 is highlighted in red.

WEBnm@ mode 7 shows a shearing motion in the groove between chains A and B, which might relate to the function of the galactosidase, namely the hydrolysis of terminal α-D-galactose. Here almost the whole protein stays rigid, while the small part linking both chains moves.




In WEBnm@ mode 8 the two monomers of the structure make an opening movement, thus enlarging the groove between them and exposing the binding pockets for the substrate and the active site. Again there is not much flexibility except in the linking part of the dimer.



Again, in WEBnm@ mode 9 it seems that mostly the region that connects chains A and B is flexible (cf. modes 7 and 8), now bending both upper parts of the molecule to the left and both lower parts to the right. This may be explained by "shoving" the sugar into the active site, which is located in the front of the picture, while releasing the sugar that has been hydrolyzed in the active site in the back of the picture (which cannot be seen).
WEBnm@ mode 10: Here all four parts of the protein (both upper parts of the chains and both lower parts) seem to bend towards the center of the groove while turning to the left, which requires somewhat more than just the bridge between the chains to be flexible. We cannot think of an explanation for this kind of movement, other than that it might again be involved in bringing the sugar into the active site.






WEBnm@ mode 11 looks like an opening or closing motion if only either the front part or the back part of the dimer (the bigger part of the right chain and the smaller part of the left chain, or vice versa) is considered. Looking at the whole molecule, we see that both upper parts move to the right and both lower parts move to the left, which gives the impression of turning an imaginary bound molecule in the groove between the chains. In this movement the part that binds both chains together again has to be non-rigid, but the part that lies between the bigger part, which contains the substrate binding site, and the smaller part also has to be elastic.
In WEBnm@ mode 12 the molecule again seems to make some kind of closing motion, in which the central part of the dimer (the part that connects both chains) is pushed downwards, bringing all four parts (the big and small part of both chains) closer to the middle. This could again be due to having to hold on to a sugar that has to be bound to the active site in order to be cleaved. To do so, again mainly the connecting part has to be flexible. In addition, some parts of the chains have to bend to really close up the opening.



</figtable>

<figtable id="tab:webnma_3hg3"> This table shows the 6 modes that were calculated by WEBnm@. Depicted in cyan is the structure 3HG3, which represents the human α-galactosidase catalytic mechanism with bound substrate (green, α-D-galactose with bound α-D-glucose); the substrate binding site at positions 203 to 207 is highlighted in red.

WEBnm@ mode 7 shows a shearing motion in the groove between chains A and B, which might relate to the function of the galactosidase, namely the hydrolysis of terminal α-D-galactose. It can be observed that the binding groove moves around the substrate. Here almost the whole protein stays rigid, while the small part linking both chains moves.



In WEBnm@ mode 8 the two monomers of the structure make an opening movement, thus enlarging the groove between them and exposing the binding pockets for the substrate and the active site. Again there is not much flexibility except in the linking part of the dimer. The movement makes it possible for the substrate to enter the binding site, which becomes exposed.


Again, in WEBnm@ mode 9 it seems that mostly the region that connects chains A and B is flexible (cf. modes 7 and 8), now bending both upper parts of the molecule to the left and both lower parts to the right. It can be observed that the binding pocket closes in on the sugar and the smaller part of the left chain helps to fix the substrate in place, while the sugar that has been hydrolyzed in the active site in the back of the picture (which cannot be seen) is released.
WEBnm@ mode 10 looks like an opening or closing motion if only either the front part or the back part of the dimer (the bigger part of the right chain and the smaller part of the left chain, or vice versa) is considered. Looking at the whole molecule, it seems that the active site in the front releases the substrate, while the active site in the back is in the process of binding a new sugar that has to be hydrolyzed. In this movement the part that binds both chains together again has to be non-rigid, but the part that lies between the bigger part, which contains the substrate binding site, and the smaller part also has to be elastic.
WEBnm@ mode 11 is the first mode that differs from all the other modes examined so far. It gives the impression of breathing or of a relaxing spring. The whole dimer expands outwards by stretching many bonds throughout the molecule. In this motion the active site moves away from the substrate.








In WEBnm@ mode 12 the front part of the dimer (the small part of the left chain and the big part of the right chain) moves away from the back part, resulting in a small rotation of each active site around its substrate. Again, both the connecting part between the two chains and the part between the bigger and the smaller part of each chain have to be elastic.







</figtable>


Motion

Atomic Displacement Analysis

<figure id="fig:atomDisplacement3HG2">

This figure displays the normalized square of the atomic displacements of the C-alpha atoms of the structure 3HG2 for modes 7 to 12 calculated by WEBnm@. This molecule is a homodimer and hence both chains are shown: chain A in green, chain B in blue. Positions 1 through 31 are skipped, since they form the signal peptide and are cleaved.

</figure>

<figure id="fig:atomDisplacement3HG3">

This figure displays the normalized square of the atomic displacements of the C-alpha atoms of the structure 3HG3 for modes 7 to 12 calculated by WEBnm@. This molecule is a homodimer and hence both chains are shown: chain A in green, chain B in blue. Positions 1 through 31 are skipped, since they form the signal peptide and are cleaved.

</figure>

In <xr id="fig:atomDisplacement3HG2"/> and <xr id="fig:atomDisplacement3HG3"/> the squares of the atomic displacements of the C-alpha atoms of the examined structures are shown. They are normalized such that the sum over all residues equals 100. With these plots we can find out which regions are displaced most, i.e. which move the most; these are the regions with the highest values. It is recommended to look for clustered peaks, which identify regions of significant size. Local flexibility (a single peak) is of less importance.
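
The normalization described above can be sketched as follows. The mode is assumed to be given as one displacement vector per C-alpha atom; the function name and the toy vector are hypothetical and only illustrate the "sum equals 100" convention.

```python
def normalized_squared_displacements(mode):
    """Per-atom squared displacement of one mode, scaled so that the sum is 100.

    mode -- list of (dx, dy, dz), one displacement vector per C-alpha atom
    """
    squared = [dx * dx + dy * dy + dz * dz for dx, dy, dz in mode]
    total = sum(squared)
    if total == 0:
        raise ValueError("the mode contains no displacement at all")
    return [100.0 * value / total for value in squared]


# Toy mode with four atoms (made-up numbers)
toy_mode = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 2.0), (1.0, 1.0, 0.0)]
print(normalized_squared_displacements(toy_mode))
```

In this toy mode the third atom carries half of the total displacement, which is the kind of peak one would look for in the plots.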

In both figures, chains A and B are colored differently. From this we can see that, although the motion of both chains in a mode looks alike in <xr id="tab:webnma_3hg2"/> and <xr id="tab:webnma_3hg3"/>, it is in general not perfectly equal and can even differ a lot. A good example of a significant variation is mode 7 of the structure 3HG2 (see <xr id="fig:atomDisplacement3HG2"/>, upper left). While the first part of both chains (approximately up to position 200) behaves similarly, apart from a different amplitude, the second half differs, with chain B showing much more movement. This can be observed in <xr id="tab:webnma_3hg2"/> only after very close inspection.
It seems to be easier for the two chains of the dimer to act differently when no substrate is bound, since the atomic displacements of the chains in the modes of 3HG3 are much more alike than those of 3HG2.

All in all, the substrate binding site itself (residues 203 to 207) seems to be rather rigid, except perhaps in mode 10 of 3HG3, while the ends of both chains (the last 50 residues) are fairly flexible. This can best be seen in mode 12 of both structures.

Fluctuations

<figure id="fig:fluctuation_3HG2">

Here the normalized square of the fluctuation of each C-alpha atom (for all non-trivial modes) is shown for the structure 3HG2. This molecule is a homodimer and hence both chains are shown: chain A in green, chain B in blue. Positions 1 through 31 are skipped, since they form the signal peptide and are cleaved. In the left plot, both chains are superimposed to show to what extent they behave alike.

</figure>

<figure id="fig:fluctuation_3HG3">

Here the normalized square of the fluctuation of each C-alpha atom (for all non-trivial modes) is shown for the structure 3HG3. This molecule is a homodimer and hence both chains are shown: chain A in green, chain B in blue. Positions 1 through 31 are skipped, since they form the signal peptide and are cleaved. In the left plot, both chains are superimposed to show to what extent they behave alike.

</figure>

The fluctuation of a C-alpha atom is the sum of its atomic displacements over all non-trivial modes, each weighted by the inverse of the corresponding eigenvalue. The fluctuations are normalized such that the sum over all residues equals 100. The fluctuations of the structures 3HG2 and 3HG3 are shown in <xr id="fig:fluctuation_3HG2"/> and <xr id="fig:fluctuation_3HG3"/>, respectively.
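
A minimal sketch of this weighting, assuming that "atomic displacement" means the squared displacement of an atom in a mode (as in the previous section) and that all non-trivial modes and their eigenvalues are available. The function name and the toy data are hypothetical.

```python
def normalized_fluctuations(modes, eigenvalues):
    """Per-atom fluctuation: squared displacements summed over modes,
    each weighted by 1 / eigenvalue, then scaled so that the sum is 100.

    modes       -- list of modes, each a list of (dx, dy, dz) per atom
    eigenvalues -- one positive eigenvalue per mode
    """
    if not modes or len(modes) != len(eigenvalues):
        raise ValueError("need at least one mode and one eigenvalue per mode")
    fluctuations = [0.0] * len(modes[0])
    for mode, eigenvalue in zip(modes, eigenvalues):
        if eigenvalue <= 0:
            raise ValueError("trivial modes (eigenvalue <= 0) must be excluded")
        for index, (dx, dy, dz) in enumerate(mode):
            fluctuations[index] += (dx * dx + dy * dy + dz * dz) / eigenvalue
    total = sum(fluctuations)
    if total == 0:
        raise ValueError("the modes contain no displacement at all")
    return [100.0 * value / total for value in fluctuations]


# Two atoms, two made-up modes: the stiffer mode (eigenvalue 4) counts less
toy_modes = [[(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)], [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]]
print(normalized_fluctuations(toy_modes, [1.0, 4.0]))
```

Because of the 1/eigenvalue weight, the lowest modes dominate the fluctuation profile even when all non-trivial modes are included.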

The plots support our previous assumption that the chains of the substrate-bound structure 3HG3 behave much more similarly than those of the structure with an empty active site, since the overlap is almost perfect in the left plot of <xr id="fig:fluctuation_3HG3"/>.

Again we can observe that the binding site is a rigid island between two moderately flexible regions, which are probably responsible for opening and closing the binding pocket and for the movement needed to break down the bound sugar.
Towards the end of the chain more motion can be observed, which is needed for the independent movement of both chains.


Correlation Matrix

<figtable id="tab:corrMatr">

<figure id="fig:corrMatr3HG2">
In this plot the correlation of motions between all Cα in 3HG2 is shown, where a red coloring means a positive correlation ranging from 0 to 1 and blue indicates a negative relationship in the same range. The darker the color, the more extreme the value is. From the origin, first chain A is plotted (green numbers), followed by chain B where the position numbers are shown in blue.
</figure>

<figure id="fig:corrMatr3HG3">
In this plot the correlation of motions between all Cα in 3HG3 is shown, where a red coloring means a positive correlation ranging from 0 to 1 and blue indicates a negative relationship in the same range. The darker the color, the more extreme the value is. From the origin, first chain A is plotted (green numbers), followed by chain B where the position numbers are shown in blue.
</figure>

</figtable>

The plots in <xr id="tab:corrMatr"/> show the correlation matrices of 3HG2 and 3HG3. Overall, the plots look very similar: the first halves of chains A and B are positively correlated with each other, as are the two second halves, while the first half of chain A is negatively correlated with the second half of chain B and vice versa. Within each chain, the residues are rather strongly positively correlated along the somewhat blurred diagonal and within the second half of the chain, and negatively correlated elsewhere. This underlines our statement (see section Mode Visualization) that the protein is quite rigid, with only the connecting parts being flexible, and that the two chains can move away from or towards each other just as the two halves of each chain can move independently.
The only difference between the plots of 3HG2 and 3HG3 is the strength of the correlations: the colors in the second plot are darker, indicating a stronger correlation and therefore a higher amplitude, which we have already seen in section Average Energies.
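
How such a correlation matrix can be obtained from normal modes is sketched below. This is a common textbook construction and an assumption on our part, not necessarily the exact procedure of WEBnm@: the covariance of two atoms is the dot product of their displacement vectors, summed over the modes with weight 1/eigenvalue, and is then normalized by the self-terms so that values lie between -1 and 1.

```python
import math


def correlation_matrix(modes, eigenvalues):
    """Normalized cross-correlation of atomic motions from a set of modes.

    modes       -- list of modes, each a list of (dx, dy, dz) per atom
    eigenvalues -- one positive eigenvalue per mode
    """
    n_atoms = len(modes[0])
    covariance = [[0.0] * n_atoms for _ in range(n_atoms)]
    for mode, eigenvalue in zip(modes, eigenvalues):
        for i in range(n_atoms):
            for j in range(n_atoms):
                dot = sum(a * b for a, b in zip(mode[i], mode[j]))
                covariance[i][j] += dot / eigenvalue
    if any(covariance[i][i] == 0 for i in range(n_atoms)):
        raise ValueError("an atom does not move in any of the given modes")
    return [
        [
            covariance[i][j] / math.sqrt(covariance[i][i] * covariance[j][j])
            for j in range(n_atoms)
        ]
        for i in range(n_atoms)
    ]


# One made-up mode: atoms 0 and 1 move together, atom 2 moves the opposite way
toy_mode = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
for row in correlation_matrix([toy_mode], [1.0]):
    print(row)
```

Positive entries (red in the plots) mean that two residues move in the same direction, negative entries (blue) that they move in opposite directions.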

Overlap Analysis

The overlap analysis failed with a general error. A trouble ticket was sent on 04.07.2012 11:22.

ElNémo

ElNémo is the web interface to the "Elastic Network Model", a tool to compute the low-frequency normal modes of a protein. It performs several analyses, including

  • an overall normal mode analysis taking into account frequency, collectivity of atom movement, overlap of each mode with the observed conformational change (if two conformations are provided) and the corresponding amplitude
  • an individual normal mode analysis with animations and RMSD for each mode
  • an analysis of distance fluctuations between all CA atoms (cross plot)
  • a B-factor analysis
  • a comparison of a normal-mode-perturbed structure and a second conformation in terms of RMSD and number of residues that are closer than 3 Angstrom

Additionally, combined models can be computed from two modes. The server also claims that all results can be downloaded at once, but this only produced errors in our case.
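
The overlap mentioned in the list above is commonly defined as the normalized projection of the observed conformational change onto a mode. The sketch below uses that common definition as an assumption; it is not taken from the ElNémo source. It also assumes that both conformations are already superposed and contain the same atoms in the same order, which would require matching residues first for 3HG2 and 3HG3 (781 versus 793 C-alpha atoms).

```python
import math


def mode_overlap(mode, conformation_a, conformation_b):
    """Overlap between one mode and the change from conformation_a to conformation_b.

    Returns a value between 0 (orthogonal) and 1 (the mode alone
    describes the direction of the conformational change).
    """
    if not (len(mode) == len(conformation_a) == len(conformation_b)):
        raise ValueError("mode and both conformations need the same atoms")
    change = [
        tuple(b - a for a, b in zip(atom_a, atom_b))
        for atom_a, atom_b in zip(conformation_a, conformation_b)
    ]
    dot = sum(m * c for vec_m, vec_c in zip(mode, change) for m, c in zip(vec_m, vec_c))
    norm_mode = math.sqrt(sum(m * m for vec in mode for m in vec))
    norm_change = math.sqrt(sum(c * c for vec in change for c in vec))
    if norm_mode == 0 or norm_change == 0:
        raise ValueError("mode and conformational change must be non-zero")
    return abs(dot) / (norm_mode * norm_change)


# Toy data: the change between the two conformations points exactly along the mode
start = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
end = [(2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
print(mode_overlap([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], start, end))
```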

We decided to examine the 6 lowest modes that ElNémo provides to obtain a result that is comparable to the 6 modes by WEBnm@.

B-factor analysis

3HG2: Correlation= 0.651 for 781 C-alpha atoms.
3HG3: Correlation= 0.779 for 793 C-alpha atoms.
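
A correlation of this kind compares the B-factors predicted from the normal modes with the experimental B-factors of the C-alpha atoms. As a hedged illustration we assume a plain Pearson correlation; whether ElNémo applies additional scaling is not stated here. The B-factor values below are made up.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long value lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    if variance_x == 0 or variance_y == 0:
        raise ValueError("correlation is undefined for constant values")
    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical predicted and experimental B-factors for five C-alpha atoms
predicted = [12.0, 15.0, 30.0, 22.0, 18.0]
experimental = [14.0, 16.0, 27.0, 25.0, 15.0]
print(round(pearson_correlation(predicted, experimental), 3))
```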

Mode Visualization

<figtable id="tab:elnemo_3hg2"> This table shows the 6 lowest modes that were calculated by ElNémo. Depicted in cyan is the structure 3HG2, which represents the human α-galactosidase catalytic mechanism with an empty active site; the substrate binding site at positions 203 to 207 is highlighted in red.

ElNémo mode 7 of 3HG2
ElNémo mode 8 of 3HG2
ElNémo mode 9 of 3HG2
ElNémo mode 10 of 3HG2
ElNémo mode 11 of 3HG2
ElNémo mode 12 of 3HG2

</figtable>

<figtable id="tab:elnemoSURF_3hg3"> This table shows the 6 lowest modes that were calculated by ElNémo. Depicted in cyan is the structure 3HG3, which represents the human α-galactosidase catalytic mechanism with bound substrate (green, α-D-galactose with bound α-D-glucose); the substrate binding site at positions 203 to 207 is highlighted in red.

ElNémo mode 7 of 3HG3
ElNémo mode 8 of 3HG3
ElNémo mode 9 of 3HG3
ElNémo mode 10 of 3HG3
ElNémo mode 11 of 3HG3
ElNémo mode 12 of 3HG3

</figtable>


Distance variation

<figure id="fig:FABRY_caStrain3HG3">

...

</figure>

<figure id="fig:FABRY_caStrain3HG2">

...

</figure>

This graph displays the distance variation between successive pairs of CA atoms in the two extreme conformations that were computed for this mode (DQMIN/DQMAX). Large distance variations can be an indicator for residue pairs that support the important strain in that particular normal mode movement. Note that residue pairs between chain breaks or at flexible ends of the protein may also exhibit large CA-CA distance variations. If more than one residue is grouped together into a rigid block (NRBL>1), CA-CA distance variations between CA atoms in the same block will be very low.

This feature is still experimental and will be further developed in the future.
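
The quantity described above can be sketched as follows, assuming that the two extreme conformations are available as C-alpha coordinate lists in the same residue order and that the variation is taken as the absolute difference of the two distances. The function name and the coordinates are hypothetical.

```python
import math


def successive_ca_distance_variation(conf_min, conf_max):
    """Absolute change of the distance between each pair of successive CA atoms.

    conf_min, conf_max -- CA coordinates (x, y, z) of the two extreme
                          conformations of one mode, in the same residue order
    """
    if len(conf_min) != len(conf_max):
        raise ValueError("both conformations need the same number of CA atoms")
    variations = []
    for i in range(len(conf_min) - 1):
        distance_min = math.dist(conf_min[i], conf_min[i + 1])
        distance_max = math.dist(conf_max[i], conf_max[i + 1])
        variations.append(abs(distance_max - distance_min))
    return variations


# Three made-up CA atoms: only the bond between the last two is stretched
low = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
high = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
print(successive_ca_distance_variation(low, high))
```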


Distance fluctuations

<figtable id="tab:elnemo_fluct_3hg2"> ...

Distance fluctuations of mode 7 (3HG2)
Distance fluctuations of mode 8 (3HG2)
Distance fluctuations of mode 9 (3HG2)
Distance fluctuations of mode 10 (3HG2)
Distance fluctuations of mode 11 (3HG2)
Distance fluctuations of mode 12 (3HG2)

</figtable>

<figtable id="tab:elnemo_fluct_3hg2"> ...

Distance fluctuations of mode 7 (3HG3)
Distance fluctuations of mode 8 (3HG3)
Distance fluctuations of mode 9 (3HG3)
Distance fluctuations of mode 10 (3HG3)
Distance fluctuations of mode 11 (3HG3)
Distance fluctuations of mode 12 (3HG3)

</figtable>

This matrix displays the maximum distance fluctuations between all pairs of CA atoms and between the two extreme conformations that were computed for this mode (DQMIN/DQMAX). Distance increases are plotted in blue and decreases in red for the strongest 10% of the residue pair distance changes. Every pixel corresponds to a single residue. Grey lines are drawn every 10 residues, yellow lines every 100 residues (counting from the upper left corner).
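
A minimal sketch of how the strongest 10% of pairwise distance changes could be selected. It assumes the two extreme conformations are given as C-alpha coordinate lists in the same order; the ranking by absolute change and the function name are our assumptions, not the ElNémo implementation.

```python
import math


def strongest_distance_changes(conf_min, conf_max, fraction=0.1):
    """Return the residue pairs whose CA-CA distance changes most.

    Each entry is (i, j, change); a positive change is a distance increase
    (blue in the plots), a negative change a decrease (red).
    """
    if len(conf_min) != len(conf_max):
        raise ValueError("both conformations need the same number of CA atoms")
    changes = []
    for i in range(len(conf_min)):
        for j in range(i + 1, len(conf_min)):
            before = math.dist(conf_min[i], conf_min[j])
            after = math.dist(conf_max[i], conf_max[j])
            changes.append((i, j, after - before))
    # Rank by magnitude and keep the requested fraction (at least one pair)
    changes.sort(key=lambda entry: abs(entry[2]), reverse=True)
    keep = max(1, math.ceil(fraction * len(changes)))
    return changes[:keep]


# Five made-up CA atoms on a line; the last atom moves 2 units away
low = [(float(x), 0.0, 0.0) for x in range(5)]
high = low[:4] + [(6.0, 0.0, 0.0)]
print(strongest_distance_changes(low, high))
```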


Comparison

Domains

CATH

CATH divides each chain into two domains: an Aldolase class I domain from position 32 to 324 and a Golgi alpha-mannosidase II domain from position 325 to 421. The latter is a "mainly beta" domain and thus contains only loops and beta-sheets in our protein. TODO: Can be seen?

SCOP

No result could be obtained from this resource.

Pfam

No additional information could be found on Pfam.

References

<references/>