Fabry:Normal mode analysis
For further information on the execution, please refer to our Journal
Introduction
One of the first questions that arises in this task is why we use low-frequency normal modes. This is explained in the paper by Marc Delarue and Philippe Dumas<ref>Marc Delarue and Philippe Dumas On the use of low-frequency normal modes to enforce collective movements in refining macromolecular structural models, Proc. Natl. Acad. Sci. (USA), 101, 6957-6962 (2004)</ref>, who claim that "many of the structural transitions (...) can be explained by just a few of the lowest-frequency normal modes". The general motion of a system can be generated by superposing its normal modes. Thus we could in principle infer from our analysis in this task how alpha-galactosidase A, the enzyme we examine, hydrolyses the terminal alpha-galactosyl moiety of its substrate<ref>Normal mode http://en.wikipedia.org/wiki/Normal_mode, July 5th, 2012</ref>.
We decided to use the structures 3HG2 and 3HG3, which represent the human α-galactosidase catalytic mechanism with empty active site and with bound substrate, respectively (see <xr id="fig:bindSite"/>). From this, we hope to gain an insight into the mechanism and a possibility to compare the normal modes with the behaviour of the molecule.
Possible normal modes:
The structure 3HG2 has 781 C-alpha atoms in its PDB file, so in principle 781*3 - 6 = 2337 normal modes could be calculated for this structure by elNémo without any cutoff. Since the structure has a total of 6765 atoms, 6765*3 - 6 = 20289 normal modes could be calculated by WEBnm@.
The structure 3HG3 has 793 C-alpha atoms in its PDB file, so in principle 793*3 - 6 = 2373 normal modes could be calculated for this structure by elNémo without any cutoff. Since the structure has a total of 7537 atoms, 7537*3 - 6 = 22605 normal modes could be calculated by WEBnm@.
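The counts above follow from three degrees of freedom per atom minus the six rigid-body motions (three translations and three rotations). A minimal sketch of this arithmetic; the function name `count_normal_modes` is our own and not part of either server:

```python
def count_normal_modes(n_atoms: int) -> int:
    """Number of non-trivial normal modes of a non-linear system of n_atoms atoms."""
    if n_atoms < 3:
        raise ValueError("the 3N - 6 rule needs at least three atoms")
    # 3 coordinates per atom, minus 3 translations and 3 rotations
    return 3 * n_atoms - 6


# Atom counts taken from the text above
atom_counts = {
    "3HG2": {"c_alpha": 781, "all_atoms": 6765},
    "3HG3": {"c_alpha": 793, "all_atoms": 7537},
}

for name, counts in atom_counts.items():
    print(
        name,
        count_normal_modes(counts["c_alpha"]),
        count_normal_modes(counts["all_atoms"]),
    )
```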
WEBnm@
WEBnm@ <ref>Hollup SM, Sælensminde G, Reuter N. WEBnm@: a web application for normal mode analysis of proteins BMC Bioinformatics. 2005 Mar 11;6(1):52 </ref> aims to provide simple and automated computation of low-frequency normal modes for proteins, together with their analysis, in order to clarify whether a complete study of the protein in question would be worthwhile.
The server calculates normal modes with the help of the MMTK package <ref>Hinsen K, The Molecular Modelling Toolkit: a new approach to molecular simulations, J Comput Chem, 21:79-85, 2000</ref>, an open-source program library for molecular simulation applications. It uses a C-alpha force field <ref>Hinsen K, Petrescu AJ, Dellerue S, Bellissent-Funel MC, Kneller GR, Harmonicity in slow protein dynamics, Chemical Physics, 261:25-37, 2000</ref>: only the C-alpha atoms are included, but each is assigned a weight that corresponds to the mass of the whole residue it represents.
The server provides several analysis tools, and all results can be downloaded. The tools are:
- deformation energies of each mode
- eigenvalues
- calculation of normalized squared atomic displacements
- calculation of normalized squared fluctuations
- interactive visualization of the modes using vector field representation or vibrations
- correlation matrix
Deformation energies and eigenvalues
<figtable id="tab:eigen">
<figure id="fig:eigen3HG2"> </figure> |
<figure id="fig:eigen3HG3"> </figure>
</figtable>
<figure id="fig:avEn">
</figure>
In <xr id="fig:eigen3HG2"/> and <xr id="fig:eigen3HG3"/> the eigenvalues of the first 50 modes are plotted for both examined structures. The increase of the values reflects a decrease in the amplitude of motion, since the amplitude is inversely related to the eigenvalue. Hence mode 7, the lowest non-trivial mode, has the highest amplitude.
Since the eigenvalues correspond to the frequencies, and a low frequency tends to describe a global movement of the protein, this is consistent with our assumption that the lower modes express global movement, while the higher modes rather show many smaller local movements. <ref>Normal Mode (Harmonic) Analysis http://cmm.cit.nih.gov/intro_simulation/node26.html, August 14th, 2012</ref>
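The inverse relationship between eigenvalue and amplitude can be illustrated with a small sketch. It assumes a harmonic model in which the mean squared thermal amplitude of a mode is proportional to 1/eigenvalue, so the amplitude scales as 1/sqrt(eigenvalue). The eigenvalues below are hypothetical illustration values, not WEBnm@ output, and `relative_amplitudes` is our own helper name.

```python
import math
from typing import List, Sequence


def relative_amplitudes(eigenvalues: Sequence[float]) -> List[float]:
    """Amplitude of each mode relative to the first one in the list.

    Assumption: amplitude is proportional to 1 / sqrt(eigenvalue),
    as suggested by equipartition in a harmonic model.
    """
    if any(ev <= 0 for ev in eigenvalues):
        raise ValueError("eigenvalues of non-trivial modes must be positive")
    reference = math.sqrt(eigenvalues[0])
    return [reference / math.sqrt(ev) for ev in eigenvalues]


# Hypothetical eigenvalues for modes 7 to 10 (illustration only)
hypothetical_eigenvalues = [1.0, 4.0, 9.0, 16.0]
amplitudes = relative_amplitudes(hypothetical_eigenvalues)
for mode, amplitude in zip(range(7, 11), amplitudes):
    print(f"mode {mode}: relative amplitude {amplitude:.3f}")
```

With increasing eigenvalues the relative amplitude falls monotonically, which is why the lowest non-trivial mode dominates the motion.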
Average Energies
In <xr id="fig:avEn"/> the average deformation energies of the 14 lowest modes of both structures are compared. For most modes the energies for 3HG2 are higher than those for 3HG3. This means that the amplitude of the motion is in general slightly lower in 3HG2 than in 3HG3 (see also the Deformation Energies table for the values of the average energies).
Mode Visualization
In this section we visually inspect the motion of the protein in the six lowest modes that were identified by WEBnm@.
For a description of the modes of 3HG2 and 3HG3 see <xr id="tab:webnma_3hg2" /> and <xr id="tab:webnma_3hg3"/>, respectively. Modes 7 through 9 are similar for both molecules, and mode 11 of 3HG2 corresponds to mode 10 of 3HG3. There is one difference in mode 7: in 3HG2 it moves outwards, while in 3HG3 it moves inwards, so the two structures move in opposite directions. All similar modes can be interpreted in both states of the α-galactosidase catalytic mechanism, in particular mode 11/10, where one active site is in the process of releasing the substrate and the other is binding it. What remains is to compare mode 12 of both structures: in 3HG2 it looks like the protein is closing in on the substrate that is to be bound, while in 3HG3 it could be the beginning of the release of the hydrolysed sugar.
For mode 10 of 3HG2 and mode 11 of 3HG3 we cannot explain how they relate to the function of our protein.
In conclusion, the protein α-galactosidase is rather rigid in most parts, except for the residues that connect chains A and B and those that connect the large and small parts of each chain.
<figtable id="tab:webnma_3hg2">
This table shows the six modes that were calculated by WEBnm@. Depicted in cyan is the structure 3HG2, which represents the human α-galactosidase catalytic mechanism with empty active site; the substrate binding site at positions 203 to 207 is highlighted in red.
</figtable>
<figtable id="tab:webnma_3hg3"> This table shows the six modes that were calculated by WEBnm@. Depicted in cyan is the structure 3HG3, which represents the human α-galactosidase catalytic mechanism with bound substrate (green, α-D-Galactose with bound α-D-Glucose); the substrate binding site at positions 203 to 207 is highlighted in red.
</figtable>
Motion
Atomic Displacement Analysis
In <xr id="fig:atomDisplacement3HG2"/> and <xr id="fig:atomDisplacement3HG3"/> the squared atomic displacements of the C-alpha atoms of the examined structures are shown. They are normalized such that the sum over all residues equals 100. These plots reveal which regions are displaced most, i.e. which move the most; these regions show the highest values. It is recommended to look for clustered peaks, which identify significantly large regions. Local flexibility (a single peak) is of less importance.
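The normalization described above can be sketched as follows. This is a minimal illustration, assuming that a mode is given as a flat list of x, y, z components per C-alpha atom; the function name and the example vector are hypothetical and do not come from WEBnm@.

```python
from typing import List, Sequence


def normalized_squared_displacements(mode_vector: Sequence[float]) -> List[float]:
    """Per-atom squared displacement of one mode, scaled to sum to 100.

    mode_vector is assumed to be laid out as [x1, y1, z1, x2, y2, z2, ...].
    """
    if len(mode_vector) % 3 != 0:
        raise ValueError("mode vector length must be a multiple of 3")
    squared = [
        sum(component * component for component in mode_vector[i:i + 3])
        for i in range(0, len(mode_vector), 3)
    ]
    total = sum(squared)
    if total == 0:
        raise ValueError("mode vector must not be all zeros")
    return [100.0 * value / total for value in squared]


# Hypothetical mode for four atoms (illustration only)
example_mode = [1, 0, 0, 0, 2, 0, 0, 0, 1, 1, 1, 1]
profile = normalized_squared_displacements(example_mode)
print([round(value, 2) for value in profile])
```

A cluster of neighbouring residues with high values in such a profile corresponds to the clustered peaks mentioned above.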
In both figures, chains A and B are colored differently. From this we can see that, although the motion of both chains in a mode looks alike in <xr id="tab:webnma_3hg2"/> and <xr id="tab:webnma_3hg3"/>, it is in general not perfectly equal and sometimes differs a lot. A good example of a significant variation is mode 7 of the structure 3HG2 (see <xr id="fig:atomDisplacement3HG2"/>, upper left). While the first part of both chains (approximately up to position 200) behaves similarly, except for a different amplitude, the second half differs, with chain B showing much more movement. This can be observed in <xr id="tab:webnma_3hg2"/> only after very close inspection.
It seems to be easier for the two chains of the dimer to behave differently when no substrate is bound, since the atomic displacements of the chains in the modes of 3HG3 are much more alike than those of 3HG2.
All in all, the substrate binding site itself (residues 203 to 207) seems to be rather rigid, except perhaps in mode 10 of 3HG3, while the ends of both chains (the last 50 residues) are fairly flexible. This can best be seen in mode 12 of both structures.
Fluctuations
The fluctuation of each C-alpha atom is the sum of its atomic displacements over all non-trivial modes, each weighted by the inverse of the corresponding eigenvalue. The fluctuations are normalized such that the sum over all residues equals 100. The fluctuations of the structures 3HG2 and 3HG3 are shown in <xr id="fig:fluctuation_3HG2"/> and <xr id="fig:fluctuation_3HG3"/>, respectively.
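The definition above can be sketched in a few lines. We assume here that squared displacements are summed (in line with the "normalized squared fluctuations" listed among the server tools) and that each mode is a flat list of x, y, z components per atom. The two-atom example is hypothetical and `normalized_fluctuations` is our own name.

```python
from typing import List, Sequence


def normalized_fluctuations(
    modes: Sequence[Sequence[float]], eigenvalues: Sequence[float]
) -> List[float]:
    """Per-atom sum of squared displacements weighted by 1/eigenvalue, scaled to 100."""
    if len(modes) != len(eigenvalues) or not modes:
        raise ValueError("need one positive eigenvalue per mode")
    n_atoms = len(modes[0]) // 3
    fluctuations = [0.0] * n_atoms
    for vector, eigenvalue in zip(modes, eigenvalues):
        if eigenvalue <= 0:
            raise ValueError("only non-trivial modes (eigenvalue > 0) are allowed")
        for atom in range(n_atoms):
            x, y, z = vector[3 * atom:3 * atom + 3]
            fluctuations[atom] += (x * x + y * y + z * z) / eigenvalue
    total = sum(fluctuations)
    return [100.0 * value / total for value in fluctuations]


# Hypothetical system: two atoms, two modes (illustration only)
example_modes = [
    [1, 0, 0, 0, 0, 0],  # low-frequency mode moves atom 0
    [0, 0, 0, 0, 1, 0],  # higher-frequency mode moves atom 1
]
example_eigenvalues = [1.0, 4.0]
print(normalized_fluctuations(example_modes, example_eigenvalues))
```

Because of the 1/eigenvalue weighting, the atom moved by the low-frequency mode contributes most of the total fluctuation in this example.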
The plots support our previous assumption that the chains of the substrate-bound structure 3HG3 behave much more similarly than those of the structure with an empty active site, since the overlap is almost perfect in the left plot of <xr id="fig:fluctuation_3HG3"/>.
Again we can observe that the binding site is a rigid island between two moderately flexible regions, which are probably responsible for opening and closing the binding pocket and for the movement needed to break down the bound sugar.
Towards the end of the chain more motion can be observed, which is needed for the independent movement of both chains.
Correlation Matrix
<figtable id="tab:corrMatr">
<figure id="fig:corrMatr3HG2"> </figure> |
<figure id="fig:corrMatr3HG3"> </figure>
</figtable>
The plots in <xr id="tab:corrMatr"/> show the correlation matrices of 3HG2 and 3HG3. Overall, the plots look very similar: the first halves of chains A and B are positively correlated with each other, as are the two second halves, whereas the first half of chain A is negatively correlated with the second half of chain B and vice versa. Within each chain, there is a rather strong positive correlation along the somewhat diffuse diagonal and within the second half of the chain, and a negative correlation elsewhere. This underlines our statement (see section Mode Visualization) that the protein is quite rigid, with only the connecting parts being flexible, and that the two chains can move towards or away from each other, just as the two halves of each chain can move independently.
The only difference between the plots of 3HG2 and 3HG3 is the strength of the correlations: the colors in the second plot are darker, indicating a stronger correlation and therefore a higher amplitude, which we have already seen in section Average Energies.
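To make the notion of a correlation matrix concrete, the following sketch uses a common normal-mode definition: the covariance between atoms i and j is the sum over modes of the dot product of their displacement vectors divided by the eigenvalue, normalized by the square root of the product of the two diagonal elements. Whether WEBnm@ uses exactly this formula is an assumption on our part, and the three-atom example is hypothetical.

```python
import math
from typing import List, Sequence


def correlation_matrix(
    modes: Sequence[Sequence[float]], eigenvalues: Sequence[float]
) -> List[List[float]]:
    """Normalized cross-correlation between atoms, values between -1 and 1."""
    n_atoms = len(modes[0]) // 3
    covariance = [[0.0] * n_atoms for _ in range(n_atoms)]
    for vector, eigenvalue in zip(modes, eigenvalues):
        for i in range(n_atoms):
            for j in range(n_atoms):
                # dot product of the displacement vectors of atoms i and j
                dot = sum(vector[3 * i + axis] * vector[3 * j + axis] for axis in range(3))
                covariance[i][j] += dot / eigenvalue
    if any(covariance[i][i] <= 0 for i in range(n_atoms)):
        raise ValueError("every atom must move in at least one mode")
    return [
        [
            covariance[i][j] / math.sqrt(covariance[i][i] * covariance[j][j])
            for j in range(n_atoms)
        ]
        for i in range(n_atoms)
    ]


# Hypothetical mode: atoms 0 and 1 move together, atom 2 moves the opposite way
example_modes = [[1, 0, 0, 1, 0, 0, -1, 0, 0]]
example_eigenvalues = [2.0]
for row in correlation_matrix(example_modes, example_eigenvalues):
    print(row)
```

Atoms that move in the same direction obtain a positive entry and atoms that move in opposite directions a negative one, which corresponds to the positive and negative blocks described above.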
Overlap Analysis
The overlap analysis could not be carried out: WEBnm@ returned a general error. A trouble ticket was sent on 04.07.2012 11:22.
ElNémo
ElNémo is the web interface to the "Elastic Network Model", a tool to compute the low-frequency normal modes of a protein.
We decided to examine the 6 lowest modes that ElNémo provides to obtain a result that is comparable to the 6 modes by WEBnm@.
B-factor analysis
3HG2: Correlation= 0.651 for 781 C-alpha atoms.
3HG3: Correlation= 0.779 for 793 C-alpha atoms.
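A correlation of this kind compares predicted and experimental B-factors per C-alpha atom. The sketch below assumes that the reported value is a Pearson correlation coefficient; the input lists are hypothetical and are not the 3HG2 or 3HG3 data.

```python
import math
from typing import Sequence


def pearson_correlation(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    if variance_x == 0 or variance_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical B-factors for five residues (illustration only)
predicted_b = [12.0, 15.0, 30.0, 22.0, 18.0]
observed_b = [10.0, 17.0, 28.0, 25.0, 15.0]
print(round(pearson_correlation(predicted_b, observed_b), 3))
```

Under this assumption, the higher value for 3HG3 means that the computed fluctuations follow the crystallographic B-factors more closely for the substrate-bound structure.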
Mode Visualization
<figtable id="tab:elnemo_3hg2">
This table shows the six lowest modes that were calculated by ElNémo. Depicted in cyan is the structure 3HG2, which represents the human α-galactosidase catalytic mechanism with empty active site; the substrate binding site at positions 203 to 207 is highlighted in red.
</figtable>
<figtable id="tab:elnemoSURF_3hg3"> This table shows the six lowest modes that were calculated by ElNémo. Depicted in cyan is the structure 3HG3, which represents the human α-galactosidase catalytic mechanism with bound substrate (green, α-D-Galactose with bound α-D-Glucose); the substrate binding site at positions 203 to 207 is highlighted in red.
</figtable>
Distance variation
This graph displays the distance variation between successive pairs of CA atoms in the two extreme conformations that were computed for this mode (DQMIN/DQMAX). Large distance variations can indicate residue pairs that bear the main strain in that particular normal mode movement. Note that residue pairs between chain breaks or at flexible ends of the protein may also exhibit large CA-CA distance variations. If more than one residue is grouped into a rigid block (NRBL>1), the CA-CA distance variations between CA atoms in the same block will be very low.
This feature is still experimental and will be further developed in the future.
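The distance variation between successive CA atoms can be sketched as follows, assuming that the two extreme conformations are available as lists of (x, y, z) coordinates in the same residue order. The coordinates are hypothetical, and the function is our own illustration, not ElNémo code.

```python
import math
from typing import List, Sequence, Tuple

Coordinate = Tuple[float, float, float]


def successive_ca_distance_variation(
    conf_min: Sequence[Coordinate], conf_max: Sequence[Coordinate]
) -> List[float]:
    """Change of the distance between CA i and CA i+1 from conf_min to conf_max."""
    if len(conf_min) != len(conf_max):
        raise ValueError("both conformations must contain the same atoms")

    def successive_distances(conf: Sequence[Coordinate]) -> List[float]:
        return [math.dist(a, b) for a, b in zip(conf, conf[1:])]

    return [
        d_max - d_min
        for d_min, d_max in zip(successive_distances(conf_min), successive_distances(conf_max))
    ]


# Hypothetical three-residue chain: only the last CA moves (illustration only)
conformation_min = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
conformation_max = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (8.6, 0.0, 0.0)]
print(successive_ca_distance_variation(conformation_min, conformation_max))
```

In this toy chain only the second residue pair is stretched, so it is the only one with a distance variation clearly different from zero.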
Distance fluctuations
This matrix displays the maximum distance fluctuations between all pairs of CA atoms and between the two extreme conformations that were computed for this mode (DQMIN/DQMAX). Distance increases are plotted in blue and decreases in red for the strongest 10% of the residue pair distance changes. Every pixel corresponds to a single residue. Grey lines are drawn every 10 residues, yellow lines every 100 residues (counting from the upper left corner).
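The selection of the strongest 10% of distance changes can be sketched like this. We assume that the two extreme conformations are given as coordinate lists and that "strongest" refers to the absolute change. The labels follow the color convention described above; the coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coordinate = Tuple[float, float, float]


def strongest_distance_changes(
    conf_min: Sequence[Coordinate],
    conf_max: Sequence[Coordinate],
    fraction: float = 0.10,
) -> List[Tuple[int, int, str]]:
    """Residue pairs with the largest absolute distance change, labelled by color.

    "blue" marks a distance increase and "red" a decrease; pairs whose
    distance does not change are ignored.
    """
    n_atoms = len(conf_min)
    changes = []
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            delta = math.dist(conf_max[i], conf_max[j]) - math.dist(conf_min[i], conf_min[j])
            if delta != 0:
                changes.append((i, j, delta))
    # largest absolute change first
    changes.sort(key=lambda change: abs(change[2]), reverse=True)
    n_pairs = n_atoms * (n_atoms - 1) // 2
    n_keep = max(1, round(fraction * n_pairs))
    return [(i, j, "blue" if delta > 0 else "red") for i, j, delta in changes[:n_keep]]


# Hypothetical five-residue chain in which the last residue moves away
conformation_min = [(float(x), 0.0, 0.0) for x in range(5)]
conformation_max = conformation_min[:4] + [(6.0, 0.0, 0.0)]
print(strongest_distance_changes(conformation_min, conformation_max))
```

With five residues there are ten pairs, so only one pair is kept at a fraction of 10%; it involves the residue that moves away and is labelled as an increase.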
Comparison
Domains
CATH
CATH divides each chain into two domains: an Aldolase class I domain from position 32 to 324 and a Golgi alpha-mannosidase II domain from position 325 to 421. The latter is a "mainly beta" domain and thus contains only loops and beta-sheets in our protein. TODO: Can be seen?
SCOP
No result could be obtained from this resource.
Pfam
No additional information could be found on Pfam.
References
<references/>