Difference between revisions of "Normal mode analysis TSD"

From Bioinformatikpedia
(Normal modes)
(Conclusion)
 
(36 intermediate revisions by 2 users not shown)
Line 1: Line 1:
  +
<div style = "align:left;float: left;"> « Previous [[Molecular Dynamics Simulations TSD]] </div> <div style = "align:right;float: right;"> [[MD simulation analysis TSD]] Next » </div>
  +
<br>
  +
 
The journal for this task can be found [[Normal mode analysis TSD Journal|here]].
 
The journal for this task can be found [[Normal mode analysis TSD Journal|here]].
   
Line 15: Line 18:
 
Calculations are performed using the C-alpha force field where only the C-alpha atoms are considered and assigned the according whole residue masses. A coarse-grained model is employed and frequencies and energies are interpreted on relative scales and therefore reported without units.<br>
 
Calculations are performed using the C-alpha force field where only the C-alpha atoms are considered and assigned the according whole residue masses. A coarse-grained model is employed and frequencies and energies are interpreted on relative scales and therefore reported without units.<br>
 
Provided are: '''deformation energies''' of each mode, '''eigenvalues''', calculation of normalized squared '''atomic displacements''', calculation of normalized squared '''fluctuations''', interactive visualization of the modes using '''vector field representation''' or vibrations and the '''correlation matrix'''.<br>
 
Provided are: '''deformation energies''' of each mode, '''eigenvalues''', calculation of normalized squared '''atomic displacements''', calculation of normalized squared '''fluctuations''', interactive visualization of the modes using '''vector field representation''' or vibrations and the '''correlation matrix'''.<br>
For comparison of dynamics of related protein, the web server additionally features comparative analyses of protein structures<ref name=webnam>http://apps.cbu.uib.no/webnma/about</ref>.
+
For comparison of dynamics to related proteins, the web server additionally features comparative analyses of protein structures<ref name=webnam>http://apps.cbu.uib.no/webnma/about</ref>.
   
   
Line 22: Line 25:
   
 
<figtable id="tbl:defengs">
 
<figtable id="tbl:defengs">
{| class="wikitable" style="float: right; border: 2px solid darkgray; width:500px;" cellpadding="2"
+
{| class="wikitable" style="float: right; border: 2px solid darkgray; width:500px;margin-left:20px;margin-top:20px;" cellpadding="2"
|+ <caption><font size="-1">Deformation energies and Eigenvalues.</font></caption>
 
 
! scope="row" align="left" |
 
! scope="row" align="left" |
 
| align="right" | [[File:TSD DefEnergies.png|thumb|200px]]
 
| align="right" | [[File:TSD DefEnergies.png|thumb|200px]]
 
| align="right" | [[File:TSD_Eigenvalues.png|thumb|200px]]
 
| align="right" | [[File:TSD_Eigenvalues.png|thumb|200px]]
 
|-
 
|-
  +
|+ style="caption-side: bottom; text-align: left;" | <font size="1"><div align="justify">'''Table 1:''' Deformation energies and Eigenvalues.</div></font>
 
|}
 
|}
 
</figtable>
 
</figtable>
Line 40: Line 43:
 
<figtable id="tbl:denerg">
 
<figtable id="tbl:denerg">
 
{| class="wikitable", style="width:550px; border-collapse: collapse; border-style: solid; border-width:0px; border-color: #000"
 
{| class="wikitable", style="width:550px; border-collapse: collapse; border-style: solid; border-width:0px; border-color: #000"
|+ <caption><font size="-1">Energy scores. All modes with their corresponding deformation energies.</font></caption>
+
|+ style="caption-side: bottom; text-align: left;" |<font size="1"><div align="justify">'''Table 2:''' Energy scores. All modes with their corresponding deformation energies.</div></font>
 
|- align="center"
 
|- align="center"
 
! style="border-style: solid; border-width: 0 0 2px 0" | Mode index
 
! style="border-style: solid; border-width: 0 0 2px 0" | Mode index
Line 89: Line 92:
   
 
<figure id="webnma_correlation">
 
<figure id="webnma_correlation">
[[File:TSD_Corr.png|thumb|250px|<font size=1>'''Figure:''' Correlation matrix. Shown is the correlated movement of the C-alphas in the protein on a range from -1 (anti-correlated) in blue, 0 (uncorrelated) with no color to 1 (correlated) in red.]]
+
[[File:TSD_Corr.png|thumb|250px|<font size="1"><div align="justify">'''Figure 1:''' Correlation matrix. Shown is the correlated movement of the C-alphas in the protein on a range from -1 (anti-correlated) in blue, 0 (uncorrelated) with no color to 1 (correlated) in red.</div></font>]]
 
</figure>
 
</figure>
   
Line 98: Line 101:
 
<!--<br style="clear:both;"/>
 
<!--<br style="clear:both;"/>
 
<figure id="atomicdisplacement">
 
<figure id="atomicdisplacement">
[[File:2GK1A_NGTdeleted.pdb.mode7to12plot.png|thumb|400px|<font size=1>'''Figure:''' Atomic displacement.]]
+
[[File:2GK1A_NGTdeleted.pdb.mode7to12plot.png|thumb|400px|<font size="1"><div align="justify">'''Figure 2:''' Atomic displacement.</div></font>]]
 
</figure> -->
 
</figure> -->
   
Line 105: Line 108:
   
 
<figtable id="tbl:fluct">
 
<figtable id="tbl:fluct">
{| class="wikitable" style="float: right; border: 2px solid darkgray; width:500px;" cellpadding="2"
+
{| class="wikitable" style="float: right; border: 2px solid darkgray; width:500px;margin:20px;" cellpadding="2"
|+ <caption><font size="-1">Fluctuations. The fluctuations are the sum of the atomic displacements in each mode weighted by the inverse of their corresponding eigenvalues. The fluctuations are plotted on the right and displayed on the structure on the left (high fluctuation red, low fluctuation blue).</font></caption>
+
|+ style="caption-side: bottom; text-align: left;" |<font size="1"><div align="justify">'''Table 2:''' Fluctuations. The fluctuations are the sum of the atomic displacements in each mode weighted by the inverse of their corresponding eigenvalues. The fluctuations are plotted on the right and displayed on the structure on the left (high fluctuation red, low fluctuation blue).</div></font>
 
! scope="row" align="left" |
 
! scope="row" align="left" |
 
| align="right" | [[File:TSD_Rendered.png|thumb|200px]]
 
| align="right" | [[File:TSD_Rendered.png|thumb|200px]]
Line 115: Line 118:
   
   
The fluctuations of the modes are shown in <xr id="tbl:fluct"/>. These are the normalized squares of the displacement of each C-alpha atom, summed over all modes and weighted by the inverse of the corresponding eigenvalue. Local movements can be recognized by narrow peaks, whereas wide peaks indicate larger flexible protein regions. The highest displacement is shown by a loop in the middle of the sequence around the residue SER281. Furthermore, there is a clear distinction between loops and ends of structural elements, which show increased fluctuations, and regions with a solid secondary structure, which show no variation. There is likewise no fluctuation around the active site.
+
The fluctuations of the modes are shown in <xr id="tbl:fluct"/>. These are the normalized squares of the displacement of each C-alpha atom, summed over all modes and weighted by the inverse of the corresponding eigenvalue. Local movements can be recognized by narrow peaks, whereas wide peaks indicate larger flexible protein regions. The highest displacement is shown by a loop in the middle of the sequence around the residue SER281. Furthermore, there is a clear distinction between loops and ends of structural elements, which show increased fluctuations, and regions with a solid secondary structure, which show no variation. There is likewise no fluctuation around the active site. This suggests that the main protein framework stays stable and undergoes no significant structural changes during the interaction with its environment.
   
 
<br style="clear:both;"/>
 
<br style="clear:both;"/>
   
 
<figtable id="tbl:overlap">
 
<figtable id="tbl:overlap">
{| class="wikitable" style="float: right; border: 2px solid darkgray; width:500px;" cellpadding="2"
+
{| class="wikitable" style="float: right; border: 2px solid darkgray; width:500px; margin:20px;" cellpadding="2"
|+ <caption><font size="-1">Overlap analysis with 2gjx chain A.</font></caption>
+
|+ style="caption-side: bottom; text-align: left;" |<font size="1"><div align="justify">'''Table 3:''' Overlap analysis with 2gjx chain A.</div></font>
 
! scope="row" align="left" |
 
! scope="row" align="left" |
 
| align="right" | [[File:TSD_Overlapplot2.png|thumb|200px]]
 
| align="right" | [[File:TSD_Overlapplot2.png|thumb|200px]]
Line 135: Line 138:
 
<div style="float:right">
 
<div style="float:right">
 
<figtable id="tbl:web">
 
<figtable id="tbl:web">
{| class="wikitable", style="width:800px; border-collapse: collapse; border-style: solid; border-width:0px; border-color: #000"
+
{| class="wikitable", style="width:800px; border-collapse: collapse; border-style: solid; border-width:0px; border-color: #000;margin:20px;"
|+ <caption><font size="-1">Lowest modes from WEBnm@ calculation.</font></caption>
+
|+ <font size="1"><div align="justify">'''Table 4:''' Lowest modes from WEBnm@ calculation with their visualization, deformation energy and atomic displacement. Within the visualization there is the NGT ligand displayed in orange and the important residues in yellow.</div></font>
 
|- align="left"
 
|- align="left"
 
! style="border-style: solid; border-width: 0 0 2px 0" | Mode
 
! style="border-style: solid; border-width: 0 0 2px 0" | Mode
Line 177: Line 180:
   
   
  +
<xr id="tbl:web"/> shows all mode animations with their corresponding deformation energies and atomic displacements. In the mode visualization on the base structure the NGT ligand is shown in orange and the important residues in yellow.
<xr id="tbl:web"/> shows all modes with their corresponding deformation energies and atomic displacements. All modes differ in their displacement plots as well as in their distinct motions. The atomic displacement reaches much higher values in the lower modes (up to 3) and only up to 1.5 in mode 12, which is a sign of the large and more global motions in low deformation energy modes. The loop on the left of the protein that was identified as having a lot of displacement in <xr id="tbl:fluct"/> can be spotted in most modes, but more evidently in the lower ones. Why this region is so motile can only be speculated about, as it is not near the catalytic site and therefore no functional advantage is apparent.
 
  +
  +
All modes differ in their displacement plots as well as in their distinct motions. The atomic displacement reaches much higher values in the lower modes (up to 3) and only up to 1.5 in mode 12, which is a sign of the large and more global motions in low deformation energy modes. The loop on the left of the protein that was identified as having a lot of displacement in <xr id="tbl:fluct"/> can be spotted in most modes, but more evidently in the lower ones. Why this region is so motile, and whether it could be important for function, can only be speculated about, as it is not near the catalytic site and therefore no functional advantage is apparent.
   
 
Mode 7 exhibits a hinge movement with many parts of the protein staying rigid.
 
Mode 7 exhibits a hinge movement with many parts of the protein staying rigid.
Mode 8 has more motile protein ends that move in contrary directions to the rest of the protein. In mode 9 the right and left end of the protein twist even further in opposite directions to each other. In mode 10 the loop around SER281 develops a life of its own where as the rest of the protein stays comparably still. From mode 11 on the protein rotates around its own axis but besides there are no more larger parts moving independently but rather small parts that vibrate.
+
Mode 8 has more motile protein ends that move in contrary directions to the rest of the protein. In mode 9 the right and left end of the protein twist even further in opposite directions to each other. In mode 10 the loop around SER281 develops a life of its own whereas the rest of the protein stays comparably still. From mode 11 on the protein exhibits center oriented breathing-like motions mostly along the y-axis but besides there are no more larger parts moving independently but rather small parts that vibrate. The residues around the active site form a comparably stable and fixed part of the whole protein which suggests that this special conformation is vital for the proper binding of the ligand.
   
The domains colored blue and red can be distinguished best in modes 8 and 9, where they move like a hinge against each other. Modes 11 and 12, on the contrary, are too rigid in their y-axis motions to show signs of separate domains.
+
The 2 domains colored blue and red can be distinguished best in modes 8 and 9, where they move like a hinge against each other. Modes 11 and 12, on the contrary, are too rigid in their y-axis motions to show signs of separate domains. In mode 7 the domain separation is not apparent, as the domain boundaries remain rigid.
   
 
<br style="clear:both;"/>
 
<br style="clear:both;"/>
Line 195: Line 200:
 
Finally there is a feature still termed experimental, that shows the '''distance variation between successive residues''', again using the two most extreme conformations for the current mode. These values are shown per-residue and should therefore allow one to point out residue pairs that seem to play an important role in mediating the movement of the mode.
 
Finally there is a feature still termed experimental, that shows the '''distance variation between successive residues''', again using the two most extreme conformations for the current mode. These values are shown per-residue and should therefore allow one to point out residue pairs that seem to play an important role in mediating the movement of the mode.
   
 
===Normal Modes===
 
How are the normal modes calculated, that is from which part of the structure? How many normal modes could in principle be calculated for your protein without any cutoff.
 
describe what movements you observe: hinge-movement, “breathing”…
 
Which regions of your protein are most flexible, most stable?
 
Try the comparison/upload of second structure option, if: (i) you have PDB structures in different conformations or (ii) your protein has a bound ligand. Then either upload a structure with and one without the ligand, or delete the ligand in your structure. Note: Due to the force field that considers only C_alpha atoms, only changes in the backbone will give results. The model does not resolve changes in side-chain positions or SNPs.
 
   
 
==== 2gk1:A without NGT ====
 
==== 2gk1:A without NGT ====
 
<div style="float:right; display:inline-block;">
 
<div style="float:right; display:inline-block;">
 
<figtable id="tbl:2gk1a_nongt_elnemo_intro">
 
<figtable id="tbl:2gk1a_nongt_elnemo_intro">
{| class="wikitable", style="width:1050px; border-collapse: collapse; border-style: solid; border-width:0px; border-color: #000"
+
{| class="wikitable", style="width:1050px;margin-left:20px; border-collapse: collapse; border-style: solid; border-width:0px; border-color: #000"
  +
|+ <font size="1"><div align="justify">'''Table 5: Lowest frequency modes obtained from elNemo'''. Shown is a visualisation of the mode, as well as distance fluctuation map as explained in the [[Normal_mode_analysis_TSD#elNemo|introduction]] and the per-residue mean squared displacement. For the visualization and mean squared displacement, the first domain is highlighted in red and the second one in blue. For the distance fluctuation map, red denotes a distance decrease and blue an increase.</div></font>
|+ <caption><font size="-1">TODO caption</font></caption>
 
 
|- align="left"
 
|- align="left"
 
! style="border-style: solid; border-width: 0 0 2px 0" | Mode
 
! style="border-style: solid; border-width: 0 0 2px 0" | Mode
Line 257: Line 256:
   
 
<figure id="fig:collandfreq">
 
<figure id="fig:collandfreq">
[[Image:TSD 2gk1 elnemo CollectAFrequ.png|300px|thumb|<caption>'''Frequency and Collectivity values of the first 100 modes found by elNemo'''.</caption>]]
+
[[Image:TSD 2gk1 elnemo CollectAFrequ.png|300px|thumb|<font size="1"><div align="justify">'''Figure 2 :''' Frequency and Collectivity values of the first 100 modes found by elNemo.</div></font>]]
 
</figure>
 
</figure>
  +
<!-- TODO which type of movement, b-factor correlation, collectivity -->
 
 
TODO which type of movement, b-factor correlation, collectivity
 
 
 
Considering the distinction of the two domains (highlighted in different colors), they are most apparent in mode 8 and to some degree in modes 9 and 10. On the other hand, judging from the motion alone, they would be very hard to spot in mode 7. It is generally evident, that the two alpha-helices of the second domain that are situated towards the face of the first domain, remain comparably flexible. This is visible, especially in mode 8 where both are significantly stretched in a spring-like motion. Furthermore concerning the distinction of the two domains, the previous findings are also found in the distance fluctuation maps which show the most evident distinction in the map of mode 8 while two domains are essent
 
Considering the distinction of the two domains (highlighted in different colors), they are most apparent in mode 8 and to some degree in modes 9 and 10. On the other hand, judging from the motion alone, they would be very hard to spot in mode 7. It is generally evident, that the two alpha-helices of the second domain that are situated towards the face of the first domain, remain comparably flexible. This is visible, especially in mode 8 where both are significantly stretched in a spring-like motion. Furthermore concerning the distinction of the two domains, the previous findings are also found in the distance fluctuation maps which show the most evident distinction in the map of mode 8 while two domains are essent
 
<!-- intro domain movement should also be observable in the map by large blocks! -->
 
<!-- intro domain movement should also be observable in the map by large blocks! -->
Line 269: Line 265:
   
 
Analysing the mean square displacement of all C-alpha atoms also gives some additional insight. The first thing that becomes apparent is that every single mode contains a peak in the middle of the plot, corresponding to residues around SER281 in the original PDB file. Indeed, the mode animations show that this loop region, at the very left of the figures, exhibits strong movement in every mode observed. The movement is even somewhat similar in modes 8, 9 and 10. An initial idea that comes to mind is that this loop somehow mediates binding to the beta-subunit; however, the binding interface of the two subunits forming the catalytically active Hexosaminidase A dimer is far from the loop in question. Instead it seems to be protruding into free space. The strong movement could therefore either be an artefact or fulfil some function that is not apparent from the data obtained so far. It should be noted that the loop also has a high B-factor in the original resolved structure, therefore contributing positively towards the aforementioned B-factor correlation value.<br/>
 
Analysing the mean square displacement of all C-alpha atoms also gives some additional insight. The first thing that becomes apparent is that every single mode contains a peak in the middle of the plot, corresponding to residues around SER281 in the original PDB file. Indeed, the mode animations show that this loop region, at the very left of the figures, exhibits strong movement in every mode observed. The movement is even somewhat similar in modes 8, 9 and 10. An initial idea that comes to mind is that this loop somehow mediates binding to the beta-subunit; however, the binding interface of the two subunits forming the catalytically active Hexosaminidase A dimer is far from the loop in question. Instead it seems to be protruding into free space. The strong movement could therefore either be an artefact or fulfil some function that is not apparent from the data obtained so far. It should be noted that the loop also has a high B-factor in the original resolved structure, therefore contributing positively towards the aforementioned B-factor correlation value.<br/>
In addition, one can also observe that the first domain shows an above-average mean square displacement in all modes but 11. This is interesting, as it was not immediately apparent from the animations alone, and this first domain is not the one that contains the catalytic site or any of the residues previously considered important for the enzyme's function. Nor does it seem to be essential for dimer formation. Generally there seems to be hardly any information on the function of this domain. Another possibility might be that it mediates binding to the GM2 activator protein, which is essential for catalytic activity. However, where on Hex A the activator protein binds also seems to be unknown. <br/>
+
In addition, one can also observe that the first domain shows an above-average mean square displacement in all modes but 11. This is interesting, as it was not immediately apparent from the animations alone, and this first domain is not the one that contains the catalytic site or any of the residues previously considered important for the enzyme's function. Nor does it seem to be essential for dimer formation. Generally there seems to be hardly any information in the literature on the function of this domain that might help in the analysis of the modes. Another possibility might be that it mediates binding to the GM2 activator protein, which is essential for catalytic activity. However, where on Hex A the activator protein binds also seems to be unknown. <br/>
 
Finally, a region that shows almost no movement in all of the plots is the beginning of the second domain. This can likely be explained by the fact that there is only a short loop region, which is preceded by a beta-strand participating in a large beta-sheet and followed by a beta-strand that is part of the central beta-barrel of the second domain. Therefore there seems to be little room left for movement without introducing major changes in the whole structure.
 
Finally, a region that shows almost no movement in all of the plots is the beginning of the second domain. This can likely be explained by the fact that there is only a short loop region, which is preceded by a beta-strand participating in a large beta-sheet and followed by a beta-strand that is part of the central beta-barrel of the second domain. Therefore there seems to be little room left for movement without introducing major changes in the whole structure.
   
  +
<!-- TODO mention that the loop moves, but the articulation point is somewhere farther in, the structure itself remains rigid -->
 
TODO mention that the loop moves, but the articulation point is somewhere farther in, the structure itself remains rigid
 
 
<br style="clear:both;">
 
<br style="clear:both;">
   
 
==== 2gk1:A with NGT ====
 
==== 2gk1:A with NGT ====
  +
Although highly interesting, there seems to be no possibility to perform this kind of analysis with elNemo.
Couldn't find out yet how to do it.
 
 
 
 
 
 
 
<!-- for each mode: fig1-3,connect,correlation -->
 
<!-- for each mode: fig1-3,connect,correlation -->
 
 
<!-- find out maybe why RMSD mostly not present, find out DEF. why -->
 
<!-- find out maybe why RMSD mostly not present, find out DEF. why -->
 
<!-- correlation between msd and the other distance ion the map. remember those in the map are chosen from the maximum difference! (these however seem to be one per mode, so its different in any case) -->
 
<!-- correlation between msd and the other distance ion the map. remember those in the map are chosen from the maximum difference! (these however seem to be one per mode, so its different in any case) -->
 
 
<!-- correlation between freq and something else, maybe coverage?
 
<!-- correlation between freq and something else, maybe coverage?
 
b-factor per residue correlation?
 
b-factor per residue correlation?
Line 294: Line 282:
   
 
==Comparison==
 
==Comparison==
  +
<!-- <div style="float:right; display:inline-block;">
Can you observe notable differences between the normal modes calculated by the different servers?
 
 
<div style="float:right; display:inline-block;">
 
 
<figtable id="tbl:comp_elnemo_webnma_modes">
 
<figtable id="tbl:comp_elnemo_webnma_modes">
{| class="wikitable", style="width:300px; border-collapse: collapse; border-style: solid; border-width:0px; border-color: #000"
+
{| class="wikitable", style="width:300px;margin:20px; border-collapse: collapse; border-style: solid; border-width:0px; border-color: #000"
|+ <caption><font size="-1">'''Mapping between similar normal modes of both methods.''' Concerning modes 10 and 11, the movement is more pronounced in elNemo, however the directionality clearly shows that both hits describe the same normal modes.</font></caption>
+
|+ <font size="1"><div align="justify">'''Table 6:''' Mapping between similar normal modes of both methods. Concerning modes 10 and 11, the movement is more pronounced in elNemo, however the directionality clearly shows that both hits describe the same normal modes.</div></font>
 
|- align="left"
 
|- align="left"
 
! style="border-style: solid; border-width: 0 0 2px 0" | Mode WEBnm@
 
! style="border-style: solid; border-width: 0 0 2px 0" | Mode WEBnm@
Line 322: Line 308:
 
</figtable>
 
</figtable>
 
</div>
 
</div>
  +
-->
 
Mappings between the normal modes predicted by both servers are shown in <xr id="tbl:comp_elnemo_webnma_modes"/>. Not only can every mode be associated between WEBnm@ and elNemo, the methods also agree on the ranking by frequency. Generally, the modes by elNemo seem to have a higher amplitude WHY_TODO(Can we someplot the amplitude in comparison? overlay the distance travelled maybe?).
+
Normal modes predicted by both servers are nearly equal. Not only can every mode be associated between WEBnm@ and elNemo, the methods also agree on the ranking by frequency. Generally, the modes by elNemo seem to have a higher amplitude, while the direction of movement remains the same.
   
 
<br style="clear:both;">
 
<br style="clear:both;">
   
 
= Comparison to Molecular Dynamics =
 
= Comparison to Molecular Dynamics =
  +
The highly flexible loop can also be identified as a flexible region in the MD analysis. This feature is retained through all simulations, whether a mutation is present or not. However, the amount of movement shown is small compared to the normal modes.<br/>
When your MD simulations are finished, compare the lowest-frequency normal modes with your MD simulation using visualization software, e.g. PyMol or VMD. Can you observe different movements or similar dynamics? If possible, compare an overlay of the lowest-frequency modes to your MD simulation. You can superimpose the normal modes for example in VMD.
 
  +
The resulting movement of the MD analysis, even when filtered for only global position changes, seems less coordinated than that of the NMA. In addition, there is no mode that seems comparable in motion to the MD trajectory and would therefore suggest an overlay.<br/>
What are the advantages and disadvantages of NMA compared to MD?
 
  +
A major drawback of the MD analysis, apart from the obvious increase in runtime, is the fact that gaps in the structure are problematic and therefore the first domain had to be excluded. This could contribute to the lack of similarity between MD and NMA. While only the second domain is directly involved in catalysis, MD analysis suggests that it might still be important to maintain structure and rigidity in some parts of the second domain. NMA cannot compete with the capillary motions of MD, but it has the advantage that it can model the entire alpha subunit with both domains. As shown above, this can be vital, since several modes show significant movement within this domain and, more importantly, movement of the two domains relative to each other, providing further insights into the protein dynamics.
  +
 
=Conclusion=
 
=Conclusion=
  +
NMA has proven to be an easy-to-apply and efficient method for analysing protein motions. Although only the backbone is considered, most of the computed modes provided substantial base knowledge for identifying, or possibly reinforcing, the interactions between the two domains of the Hex A alpha subunit. Unfortunately it was not possible to integrate a ligand into the analysis, which might have shed light on the global movement of the protein during catalysis. Furthermore, it would be interesting to see how the prediction methods handle the heterodimer Hex A, which is the actual biologically active unit.
  +
 
= References =
 
= References =
 
<references/>
 
<references/>

Latest revision as of 21:13, 31 August 2012


The journal for this task can be found here.

Introduction

Normal mode analysis aims to give an overview of large-scale, global protein motions by treating the molecule as an oscillating system and displaying its harmonic motions. Anharmonic motions are neglected and only the backbone is considered.

For this task 2gk1 is chosen as the reference structure, since it contains a ligand similar to the native one and therefore makes it possible to observe the effect that the presence of the ligand has on the normal modes. If not otherwise noted, the focus will be on low-frequency modes, since they are thought to play the most important roles in protein conformational changes <ref name=lowfreq1>Delarue,M. and Dumas,P. (2004) On the use of low-frequency normal modes to enforce collective movements in refining macromolecular structural models. Proceedings of the National Academy of Sciences of the United States of America, 101, 6957-62.</ref><ref name=lowfreq2>Chou,K.C. (1983) Identification of low-frequency modes in protein molecules. The Biochemical journal, 215, 465-9.</ref>.

The alpha subunit of Hexosaminidase A consists of two domains, which are annotated almost identically in Pfam (first domain from 35-165), SCOP (23-166) and CATH (23-164). The annotations will therefore not be evaluated separately.

Elastic network models

WEBnm@

WEBnm@ is a web server that provides simple, automated computation of low-frequency normal modes. The analysis makes it possible to ascertain whether a protein or a specific protein region undergoes large-amplitude movements and is thus a candidate for a more thorough analysis.
Calculations are performed using the C-alpha force field, in which only the C-alpha atoms are considered and each is assigned the mass of its whole residue. Because a coarse-grained model is employed, frequencies and energies are interpreted on relative scales and are therefore reported without units.
Provided are: deformation energies of each mode, eigenvalues, normalized squared atomic displacements, normalized squared fluctuations, interactive visualization of the modes as a vector field representation or as vibrations, and the correlation matrix.
For comparison of dynamics to related proteins, the web server additionally features comparative analyses of protein structures<ref name=webnam>http://apps.cbu.uib.no/webnma/about</ref>.
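
To make the idea of a C-alpha-only elastic network more concrete, the following is a minimal sketch of how a Hessian for such a coarse-grained model can be assembled. It uses a simple cutoff-based spring network as a stand-in; this is an assumption for illustration and not necessarily the force field WEBnm@ applies. The function name <code>anm_hessian</code>, the cutoff, the spring constant and the toy coordinates are all hypothetical.

```python
def anm_hessian(coords, cutoff=13.0, gamma=1.0):
    """Build the 3N x 3N Hessian of a cutoff-based elastic network.

    coords: list of (x, y, z) C-alpha positions.
    cutoff and gamma are illustrative values, not WEBnm@ parameters.
    """
    n = len(coords)
    hess = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [coords[j][k] - coords[i][k] for k in range(3)]
            dist2 = sum(c * c for c in d)
            if dist2 > cutoff * cutoff:
                continue  # no spring between distant residues
            for a in range(3):
                for b in range(3):
                    block = -gamma * d[a] * d[b] / dist2
                    # off-diagonal 3x3 blocks couple residues i and j
                    hess[3 * i + a][3 * j + b] += block
                    hess[3 * j + a][3 * i + b] += block
                    # diagonal blocks are minus the sum of the off-diagonal ones
                    hess[3 * i + a][3 * i + b] -= block
                    hess[3 * j + a][3 * j + b] -= block
    return hess


# Four hypothetical C-alpha positions (toy coordinates, not taken from 2gk1).
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 3.8)]
toy_hessian = anm_hessian(toy_coords)
```

Diagonalizing such a matrix (after mass weighting) yields the eigenvalues and mode vectors that the server reports; the diagonalization itself is left out here to keep the sketch dependency-free.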


Normal modes

The normal mode analysis was conducted on 2gk1 chain A with the ligand, without the ligand, and with the ligand's HETATM records changed to ATOM. All three runs produced identical output, so the NGT ligand appears to have no effect on the mode calculation. Altogether 20 modes are calculated, of which the first 6 are reported in more detail.
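
As a side note on mode numbering: a model with N C-alpha atoms has 3N degrees of freedom, of which six describe rigid-body translation and rotation and have zero frequency. This is why the modes discussed in detail on this page start at index 7. The small sketch below only illustrates this counting; the residue count of 500 is a hypothetical value, not the actual size of 2gk1 chain A.

```python
def count_modes(n_calpha):
    """Return (total modes, non-trivial modes) for a C-alpha model."""
    total = 3 * n_calpha   # three Cartesian coordinates per atom
    trivial = 6            # three translations plus three rotations
    return total, total - trivial


# Hypothetical chain length, chosen only for illustration.
total_modes, internal_modes = count_modes(500)
print(total_modes, internal_modes)
```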

<figtable id="tbl:defengs">

TSD DefEnergies.png
TSD Eigenvalues.png
Table 1: Deformation energies and Eigenvalues.

</figtable>

The deformation energies are displayed in <xr id="tbl:defengs"/> on the left and the eigenvalues in <xr id="tbl:defengs"/> on the right. Deformation energies and eigenvalues reflect the energy associated with each mode and are inversely related to the amplitude of the motion: the lower the eigenvalue and deformation energy, the lower the frequency and mode number and, more importantly, the larger and more global the motion. Thus the low-numbered modes leave large regions rigid, whereas the higher-numbered modes should show more distributed and intricate motions. The deformation energies in this case range from 1026.72 to 7962.67.
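
The inverse relation between eigenvalue and amplitude can be sketched numerically. Assuming a purely harmonic model in which every mode carries the same thermal energy, the frequency scales with the square root of the eigenvalue and the amplitude with its inverse square root. The eigenvalues below are toy numbers on a relative scale, not the values reported by WEBnm@, and the function names are hypothetical.

```python
import math


def relative_frequency(eigenvalue):
    """Frequency of a harmonic mode, up to a constant factor."""
    return math.sqrt(eigenvalue)


def relative_amplitude(eigenvalue):
    """Thermal amplitude of a harmonic mode, up to a constant factor."""
    return 1.0 / math.sqrt(eigenvalue)


# Toy eigenvalues in ascending order (illustrative, unitless).
toy_eigenvalues = [1.0, 4.0, 16.0]
for lam in toy_eigenvalues:
    print(lam, relative_frequency(lam), relative_amplitude(lam))
```

In this toy series each fourfold increase of the eigenvalue doubles the frequency and halves the amplitude, which is the trend described above for low versus high mode numbers.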



<figure id="webnma_correlation">

Figure 1: Correlation matrix. Shown is the correlated movement of the C-alphas in the protein on a range from -1 (anti-correlated) in blue, 0 (uncorrelated) with no color to 1 (correlated) in red.

</figure>


The correlation of all residue movements can be viewed in <xr id="webnma_correlation"/>. Red regions recur at a constant distance from the diagonal, which shows a high positive correlation within local clusters of residues in a recurrent pattern. Blue regions, which show anti-correlation, occur throughout the whole sequence at arbitrary distances. Residues are thus more likely to move in the same direction if they are near each other, whereas opposing movement is found between many different C-alpha pairs. Interestingly, most residues are correlated to some degree, as the matrix contains more colored than white regions, which suggests that the overall movement is rather coordinated.
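
As an illustration of how such a correlation matrix can be derived from normal modes, the sketch below sums the per-residue dot products of the mode displacements, weights each mode by the inverse of its eigenvalue, and normalizes to the range -1 to 1. This is a common formulation and is assumed here; the function name and the two-atom toy modes are hypothetical and not taken from the WEBnm@ output.

```python
import math


def correlation_matrix(modes, eigenvalues):
    """Cross-correlation of atomic motions from a set of normal modes.

    modes: list of mode vectors, each a flat list [x1, y1, z1, x2, ...].
    eigenvalues: one positive eigenvalue per mode (rigid-body modes excluded).
    Assumes every atom moves in at least one mode (otherwise division by zero).
    """
    n_atoms = len(modes[0]) // 3
    cov = [[0.0] * n_atoms for _ in range(n_atoms)]
    for vec, lam in zip(modes, eigenvalues):
        for i in range(n_atoms):
            for j in range(n_atoms):
                dot = sum(vec[3 * i + a] * vec[3 * j + a] for a in range(3))
                cov[i][j] += dot / lam  # low-eigenvalue modes dominate
    return [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n_atoms)]
        for i in range(n_atoms)
    ]


# Toy system: two atoms, one mode moving them apart along x,
# a second, stiffer mode moving them together along y.
toy_modes = [[1, 0, 0, -1, 0, 0], [0, 1, 0, 0, 1, 0]]
toy_eigenvalues = [1.0, 2.0]
corr = correlation_matrix(toy_modes, toy_eigenvalues)
```

In the toy system the softer, opposing mode outweighs the stiffer, concerted one, so the off-diagonal entry is negative (anti-correlated, blue in the figure).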




<figtable id="tbl:fluct">

Table 2: Fluctuations. The fluctuations are the sum of the atomic displacements in each mode weighted by the inverse of their corresponding eigenvalues. The fluctuations are plotted on the right and displayed on the structure on the left (high fluctuation red, low fluctuation blue).

TSD Rendered.png
2GK1A NGTdeleted fluctuationsplot.png

</figtable>


The fluctuations of the modes are shown in <xr id="tbl:fluct"/>. These are the normalized squared displacements of each C-alpha atom summed over all modes, each weighted by the inverse of its eigenvalue. Local movements can be recognized by narrow peaks, whereas wide peaks indicate larger flexible protein regions. The highest fluctuation is found in a loop in the middle of the sequence around residue SER281. Furthermore, there is a clear distinction between loops and ends of structural elements, which show increased fluctuations, and regions with a solid secondary structure, which show no variation. There is no fluctuation around the active site either. This suggests that the main protein framework stays stable and undergoes no significant structural changes during the interaction with its environment.
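
The definition above can be sketched in a few lines. The normalization used here (values sum to 1) is an assumption for illustration, as the exact normalization applied by the server is not stated; the function name and toy data are hypothetical.

```python
def normalized_fluctuations(modes, eigenvalues):
    """Per-atom squared displacement summed over modes, weighted by 1/eigenvalue.

    modes: list of flat mode vectors [x1, y1, z1, x2, ...].
    eigenvalues: one positive eigenvalue per mode.
    The result is scaled to sum to 1 (an illustrative choice).
    """
    n_atoms = len(modes[0]) // 3
    raw = [0.0] * n_atoms
    for vec, lam in zip(modes, eigenvalues):
        for i in range(n_atoms):
            squared = sum(vec[3 * i + a] ** 2 for a in range(3))
            raw[i] += squared / lam
    total = sum(raw)
    return [value / total for value in raw]


# Toy system: atom 1 moves only in the first mode, atom 2 only in the second.
toy_modes = [[1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 2, 0]]
toy_eigenvalues = [1.0, 2.0]
fluctuations = normalized_fluctuations(toy_modes, toy_eigenvalues)
```

Because of the inverse weighting, a residue that moves strongly in a low-eigenvalue mode, such as the loop around SER281, dominates the fluctuation profile.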


<figtable id="tbl:overlap">

Table 3: Overlap analysis with 2gjx chain A.

TSD Overlapplot2.png
TSD Overlapplot.png

</figtable>

In addition, an overlap analysis was conducted with 2gjx chain A, which is the Hexosaminidase A subunit without a bound NGT, to see to what extent the normal modes of 2gk1 describe the transition to the 2gjx conformation. The alpha subunits of 2gjx and 2gk1 are very similar in conformation, with an RMSD of 0.218 Å. The overlap of the 2gk1 normal modes with 2gjx is displayed in <xr id="tbl:overlap"/>. The overlap of a mode is the squared dot product between the difference vector of the two structures and that mode, calculated for the full set of normal modes. The modes contributing most to the conformational change between 2gk1 and 2gjx are 51, 77, 128 and 136, but their squared overlap is still comparatively low. All modes together reach a cumulative overlap of 0.3, which shows that the modes do not to any great extent transform the 2gk1 conformation into the 2gjx conformation.
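
The overlap calculation described above can be sketched as follows. The example assumes that both the mode and the difference vector are normalized before the dot product is taken and that the cumulative overlap is the running sum of the squared overlaps; both are common conventions but are assumptions here. The function name and the toy vectors are hypothetical.

```python
import math
from itertools import accumulate


def squared_overlaps(modes, difference):
    """Squared normalized dot product of each mode with a difference vector.

    modes: list of flat mode vectors.
    difference: flat vector of coordinate differences between two
                superimposed structures (same atom order as the modes).
    """
    norm_diff = math.sqrt(sum(x * x for x in difference))
    result = []
    for vec in modes:
        norm_vec = math.sqrt(sum(x * x for x in vec))
        dot = sum(v * d for v, d in zip(vec, difference))
        result.append((dot / (norm_vec * norm_diff)) ** 2)
    return result


# Toy example: a single atom displaced by (3, 4, 0) and three axis-aligned modes.
toy_modes = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
toy_difference = [3, 4, 0]
overlaps = squared_overlaps(toy_modes, toy_difference)
cumulative = list(accumulate(overlaps))
```

For a complete orthonormal set of modes the cumulative value reaches 1. A value that stays well below 1, as reported above, means that the modes considered capture only part of the structural difference.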


<figtable id="tbl:web">

Table 4: Lowest modes from WEBnm@ calculation with their visualization, deformation energy and atomic displacement. Within the visualization there is the NGT ligand displayed in orange and the important residues in yellow.

{| class="wikitable"
! Mode !! Deformation energy !! Visualisation !! Atomic displacement
|-
| 7 || 1026.72 || TSD 7Animation.gif || 2GK1A NGTdeleted.pdb.mode7plot.png
|-
| 8 || 1309.04 || TSD 8Animation.gif || 2GK1A NGTdeleted.pdb.mode8plot.png
|-
| 9 || 1746.16 || TSD 9Animation.gif || 2GK1A NGTdeleted.pdb.mode9plot.png
|-
| 10 || 2851.91 || TSD 10Animation.gif || 2GK1A NGTdeleted.pdb.mode10plot.png
|-
| 11 || 3074.36 || TSD 11Animation.gif || 2GK1A NGTdeleted.pdb.mode11plot.png
|-
| 12 || 3970.00 || TSD 12Animation.gif || 2GK1A NGTdeleted.pdb.mode12plot.png
|}

</figtable>


<xr id="tbl:web"/> shows all mode animations with their corresponding deformation energies and atomic displacements. In the mode visualization on the base structure the NGT ligand is shown in orange and the important residues in yellow.

All modes differ in their displacement plots as well as in their distinct motions. The atomic displacement reaches much higher values in the lower modes (up to 3) than in mode 12 (only up to 1.5), which is a sign of the larger and more global motions in modes with low deformation energy. The loop on the left of the protein, which was identified as highly mobile in <xr id="tbl:fluct"/>, can be spotted in most modes but more evidently in the lower ones. Why this region is so mobile and whether it is important for function can only be speculated, as it is not near the catalytic site and no functional advantage is apparent.

Mode 7 exhibits a hinge movement with many parts of the protein staying rigid. In mode 8 the ends of the protein are more mobile and move in directions opposite to the rest of the protein. In mode 9 the right and left ends of the protein twist even further in opposite directions. In mode 10 the loop around SER281 moves largely independently, whereas the rest of the protein stays comparatively still. From mode 11 on, the protein exhibits center-oriented, breathing-like motions, mostly along the y-axis; apart from that, no larger parts move independently any more, only small parts that vibrate. The residues around the active site form a comparatively stable and fixed part of the whole protein, which suggests that this particular conformation is vital for proper binding of the ligand.

The two domains, colored blue and red, can be distinguished best in modes 8 and 9, where they move like a hinge against each other. Modes 11 and 12, in contrast, are too rigid in their y-axis motions to show signs of separate domains. In mode 7 the domain separation is not apparent, as the domain boundaries remain rigid.


elNemo

elNemo is available only as a web server and employs elastic network models <ref name=elnemo>Suhre,K. and Sanejouand,Y.-H. (2004) ElNemo: a normal mode web server for protein movement analysis and the generation of templates for molecular replacement. Nucleic acids research, 32, W610-4.</ref>. Only C-alpha atoms are considered. Using a technique termed 'rotation-translation-block', which groups residues into so-called super-residues, there are hardly any restrictions on the size of the input protein. This approximation is reported to have only little effect on low-frequency modes, and the number of residues grouped together is chosen depending on the protein's size. Small proteins may therefore contain only one residue per super-residue.

Various measurements are reported as results. For each mode, a collectivity is calculated that expresses how many atoms are affected by the motion of the mode. The larger the value, the more atoms are significantly affected.
Additionally, B-factors are calculated from the first 100 normal modes and scaled to the observed values, reported in the PDB file. From this a global correlation value is calculated which describes how well the modes approximate the general global flexibility of the input structure.
For any given mode there is also a map of distance fluctuations created that shows the movement of each residue during the movement of the mode. For this calculation the two most extreme conformations in the mode are used.
Furthermore, the mean square displacement, describing the distance travelled by an atom, is calculated for every C-alpha atom in a mode. This is different from the distance fluctuations above, since here the absolute distances are summed up for every time step, resulting in the actual distance travelled, which is a different type of information <ref name=msd1> Weisstein, Eric W. "Mean Square Displacement." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/MeanSquareDisplacement.html</ref> <ref name=msd2>http://isaacs.sourceforge.net/phys/msd.html</ref>.
Finally there is a feature still termed experimental, that shows the distance variation between successive residues, again using the two most extreme conformations for the current mode. These values are shown per-residue and should therefore allow one to point out residue pairs that seem to play an important role in mediating the movement of the mode.
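
To illustrate what a collectivity value measures, the sketch below uses a widely used entropy-based definition, in which the squared displacement of each atom is treated as a probability. Whether elNemo uses exactly this formula is an assumption; masses are ignored and the function name and toy modes are hypothetical.

```python
import math


def collectivity(mode):
    """Entropy-based degree of collectivity of a single mode.

    mode: flat displacement vector [x1, y1, z1, x2, ...].
    Returns a value between 1/N (one atom moves) and 1 (all atoms move equally).
    """
    n_atoms = len(mode) // 3
    squared = [sum(mode[3 * i + a] ** 2 for a in range(3)) for i in range(n_atoms)]
    total = sum(squared)
    entropy = 0.0
    for value in squared:
        q = value / total  # fraction of the motion carried by this atom
        if q > 0.0:
            entropy -= q * math.log(q)
    return math.exp(entropy) / n_atoms


# Four atoms moving equally versus a single moving atom.
uniform_mode = [1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0]
local_mode = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
kappa_uniform = collectivity(uniform_mode)
kappa_local = collectivity(local_mode)
```

With this definition a mode in which all atoms move by the same amount has a collectivity of 1, while a mode confined to one of N atoms has a collectivity of 1/N.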


2gk1:A without NGT

<figtable id="tbl:2gk1a_nongt_elnemo_intro">

Table 5: Lowest frequency modes obtained from elNemo. Shown is a visualisation of the mode, as well as distance fluctuation map as explained in the introduction and the per-residue mean squared displacement. For the visualization and mean squared displacement, the first domain is highlighted in red and the second one in blue. For the distance fluctuation map, red denotes a distance decrease and blue an increase.

{| class="wikitable"
! Mode !! Frequency !! Collectivity !! Visualisation !! Distance fluctuation !! Mean squared displacement
|-
| 7 || 1.0 || 0.49 || 2gk1 elnemo nongt mode7.gif || 2gk1 elnemo nongt mode7 fluctu.png || 2gk1 elnemo nongt mode7 msd.png
|-
| 8 || 1.1 || 0.54 || 2gk1 elnemo nongt mode8.gif || 2gk1 elnemo nongt mode8 fluctu.png || 2gk1 elnemo nongt mode8 msd.png
|-
| 9 || 1.29 || 0.53 || 2gk1 elnemo nongt mode9.gif || 2gk1 elnemo nongt mode9 fluctu.png || 2gk1 elnemo nongt mode9 msd.png
|-
| 10 || 1.66 || 0.39 || 2gk1 elnemo nongt mode10.gif || 2gk1 elnemo nongt mode10 fluctu.png || 2gk1 elnemo nongt mode10 msd.png
|-
| 11 || 1.73 || 0.30 || 2gk1 elnemo nongt mode11.gif || 2gk1 elnemo nongt mode11 fluctu.png || 2gk1 elnemo nongt mode11 msd.png
|}

</figtable>

The five lowest-frequency modes found by elNemo are shown in <xr id="tbl:2gk1a_nongt_elnemo_intro"/>. As can be seen from the collectivity values, more atoms are significantly affected by the movement in modes 7, 8 and 9 than in modes 10 and 11. However, this is only a crude measure. <xr id="fig:collandfreq"/> shows the relation between the frequency of a mode and its collectivity. Modes with very low frequency seem to have higher collectivity; however, the data are very sparse in this region and must be interpreted cautiously.

<figure id="fig:collandfreq">

Figure 2 : Frequency and Collectivity values of the first 100 modes found by elNemo.

</figure>

Considering the distinction of the two domains (highlighted in different colors), they are most apparent in mode 8 and to some degree in modes 9 and 10. Judging from the motion alone, on the other hand, they would be very hard to spot in mode 7. It is generally evident that the two alpha-helices of the second domain that are situated towards the face of the first domain remain comparatively flexible. This is visible especially in mode 8, where both are significantly stretched in a spring-like motion. Concerning the distinction of the two domains, these findings are also reflected in the distance fluctuation maps, which show the most evident distinction in the map of mode 8 while two domains are essent

Generally, the first domain retains a very rigid structure in all modes but 11. It should be noted, however, that the lack of movement within this domain might be attributed to the unresolved region between residues 74 and 89. Other stable structures are most of the alpha-helices in the second domain, with the exception of the two at the binding face already mentioned above. The central beta-barrel undergoes movement in every mode, but never by a large distance. How the movement of this region might affect the binding strength of the ligand remains to be analysed later on. In total, the protein shows far less movement than many of the disease-related proteins analysed by the other groups.

Analysing the mean square displacement of all C-alpha atoms gives some additional insight. The first thing that becomes apparent is that every single mode contains a peak in the middle of the plot, corresponding to residues around SER281 in the original PDB file. Indeed, the mode animations show that this loop region at the very left of the figures exhibits strong movement in every mode observed. The movement is even somewhat similar in modes 8, 9 and 10. An initial idea is that this loop somehow mediates binding to the beta-subunit; however, the binding interface of the two subunits forming the catalytically active Hexosaminidase A dimer is far from the loop in question. Instead, the loop seems to protrude into free space. The strong movement could therefore either be an artefact or fulfil some function that is not apparent from the data obtained so far. It should be noted that the loop also has a high B-factor in the original resolved structure and therefore contributes positively to the aforementioned B-factor correlation value.
In addition, the first domain shows an above-average mean square displacement in all modes but 11. This is interesting, as it was not immediately apparent from the animations alone and this first domain is not the one that contains the catalytic site or any of the residues previously considered important for the enzyme's function. Nor does it seem to be essential for dimer formation. Generally, there seems to be hardly any information on the function of this domain in the literature that might help in the analysis of the modes. Another possibility is that it mediates binding to the GM2 activator protein, which is essential for catalytic activity. However, where on Hex A the activator protein binds also seems to be unknown.
Finally, a region that shows almost no movement in any of the plots is the beginning of the second domain. This can likely be explained by the fact that it is only a short loop region, preceded by a beta-strand participating in a large beta-sheet and followed by a beta-strand that is part of the central beta-barrel of the second domain. There therefore seems to be little room for movement without introducing major changes in the whole structure.


2gk1:A with NGT

Although this would be highly interesting, there seems to be no possibility to perform this kind of analysis with elNemo.

Comparison

Normal modes predicted by both servers are nearly identical. Not only can every mode be matched between WEBnm@ and elNemo, the methods also agree on the ranking by frequency. In general, the modes from elNemo seem to have a higher amplitude, while the direction of movement remains the same.
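
The comparison above was made by visual inspection. If the mode vectors of both servers were available in the same atom order, the matching could be quantified as in the following sketch, which pairs each mode of one set with the most similar mode of the other set by absolute normalized dot product. The function names and the toy vectors are hypothetical, and no such export was used for this page.

```python
import math


def normalized_dot(u, v):
    """Cosine of the angle between two mode vectors."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def match_modes(set_a, set_b):
    """For each mode in set_a, return (index of best match in set_b, overlap).

    The absolute value is used because the sign of a mode is arbitrary;
    normalization removes any difference in amplitude.
    """
    matches = []
    for u in set_a:
        overlaps = [abs(normalized_dot(u, v)) for v in set_b]
        best = max(range(len(set_b)), key=lambda index: overlaps[index])
        matches.append((best, overlaps[best]))
    return matches


# Toy example: the second set holds the same directions, reordered,
# with a flipped sign and larger amplitudes.
modes_server_a = [[1, 0, 0], [0, 1, 0]]
modes_server_b = [[0, -2, 0], [3, 0, 0]]
pairs = match_modes(modes_server_a, modes_server_b)
```

An overlap close to 1 for every pair, with the pairs in the same order, would correspond to the observation that the two servers agree on direction and ranking while differing in amplitude.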


Comparison to Molecular Dynamics

The highly flexible loop can also be identified as a flexible region in the MD analysis. This feature is retained through all simulations, whether a mutation is present or not. However, the amount of movement is small compared to the normal modes.
The movement resulting from the MD analysis, even when filtered for global position changes only, seems less coordinated than that of the NMA. In addition, no mode seems comparable in motion to the MD trajectory and would therefore suggest an overlay.
A major drawback of the MD analysis, apart from the obvious increase in runtime, is that gaps in the structure are problematic, so the first domain had to be excluded. This could contribute to the lack of similarity between MD and NMA. While only the second domain is directly involved in catalysis, MD analysis suggests that it might still be important to maintain structure and rigidity in some parts of the second domain. NMA cannot compete with the fine-grained motions captured by MD, but it has the advantage that it can model the entire alpha subunit with both domains. As shown above, this can be vital, since several modes show significant movement within this domain and, more importantly, movement of the two domains relative to each other, providing further insights into the protein dynamics.

Conclusion

NMA has proven to be an easy-to-apply and efficient approach to the analysis of protein motions. Although only the backbone is considered, most of the computed modes provided a substantial knowledge base for identifying or supporting the interactions between the two domains of the Hex A alpha subunit. Unfortunately, it was not possible to integrate a ligand into the analysis, which might have had implications for the global movement of the protein during catalysis. Furthermore, it would be interesting to see how the prediction methods handle the Hex A heterodimer, which is the actual biologically active unit.

References

<references/>