Difference between revisions of "Molecular Dynamics Simulations Analysis Hemochromatosis"

From Bioinformatikpedia
(cluster analysis)
(Pymol analysis of average and bfactor)
 
(39 intermediate revisions by 2 users not shown)
Line 7: Line 7:
   
 
Detailed description: [[Task_10_-_Molecular_Dynamics_Simulations|Molecular dynamics simulations analysis]]
 
Detailed description: [[Task_10_-_Molecular_Dynamics_Simulations|Molecular dynamics simulations analysis]]
  +
  +
Continuation of [[Molecular_Dynamics_Simulations_Hemochromatosis|task 8]]. Analysis of the MD simulations for the wildtype and two mutants (R224W and C282S).
   
 
<br style="clear:both;">
 
<br style="clear:both;">
Line 18: Line 20:
   
 
== Molecular dynamics simulations for HFE ==
 
== Molecular dynamics simulations for HFE ==
Note: All pictures/graphs shown here are from the first run (in case of 1a6zC[wildtype]-pictures) or second run (in case of R224W- or C282S-mutation). The reason for this is described under LINKTOMINDISTTODO.
+
Note: All pictures/graphs shown here are from the first run (in case of 1a6zC[wildtype]-pictures) or second run (in case of R224W- or C282S-mutation). The reason for this is described under section [[Molecular_Dynamics_Simulations_Analysis_Hemochromatosis#Minimum_distance_between_periodic_boundary_cells| Minimum distance between periodic boundary cells]].
   
   
Line 161: Line 163:
   
 
gcq#236: "Wait a Minute, aren't You.... ? (gunshots) Yeah." (Bodycount)
 
gcq#236: "Wait a Minute, aren't You.... ? (gunshots) Yeah." (Bodycount)
  +
-----------------Run2
  +
  +
Checking file /mnt/home/student/bernhoferm/mstrprkt/task8/run2/models/1a6zC/mdrun_1a6zC/1a6zC_md.xtc
  +
Reading frame 0 time 0.000
  +
# Atoms 68607
  +
Precision 0.001 (nm)
  +
Last frame 2000 time 10000.000
  +
  +
  +
Item #frames Timestep (ps)
  +
Step 2001 5
  +
Time 2001 5
  +
Lambda 0
  +
Coords 2001 5
  +
Velocities 0
  +
Forces 0
  +
Box 2001 5
  +
  +
gcq#266: "Why Weren't You at My Funeral ?" (G. Groenhof)
  +
  +
Checking file /mnt/home/student/bernhoferm/mstrprkt/task8/run2/models/scwrl_C282S/mdrun_scwrl_C282S/scwrl_C282S_md.xtc
  +
Reading frame 0 time 0.000
  +
# Atoms 68603
  +
Precision 0.001 (nm)
  +
Last frame 2000 time 10000.000
  +
  +
  +
Item #frames Timestep (ps)
  +
Step 2001 5
  +
Time 2001 5
  +
Lambda 0
  +
Coords 2001 5
  +
Velocities 0
  +
Forces 0
  +
Box 2001 5
  +
  +
gcq#106: "Count the Bubbles In Your Hair" (The Breeders)
  +
  +
Checking file /mnt/home/student/bernhoferm/mstrprkt/task8/run2/models/scwrl_R224W/mdrun_scwrl_R224W/scwrl_R224W_md.xtc
  +
Reading frame 0 time 0.000
  +
# Atoms 68602
  +
Precision 0.001 (nm)
  +
Last frame 2000 time 10000.000
  +
  +
  +
Item #frames Timestep (ps)
  +
Step 2001 5
  +
Time 2001 5
  +
Lambda 0
  +
Coords 2001 5
  +
Velocities 0
  +
Forces 0
  +
Box 2001 5
  +
  +
gcq#281: "I'll Match Your DNA" (Red Hot Chili Peppers)
  +
   
 
-->
 
-->
Line 180: Line 238:
 
<br style="clear:both;">
 
<br style="clear:both;">
   
<xr id="tab:traj_vis"/> shows the trajectories for the three MD simulations (WT, R224W, and C282S). The wildtype and R224W mostly maintain the starting structure, though some bigger parts are shifted as a whole. In both cases the upper and lower helices of the MHC I domain seem to move closer together and the M1 domain (right part in the figure) rotates a bit around its length-axis. The trajectory for C282S exhibits much less movement as the previous two which are quite flexible. Towards the end of the simulation there is a deformation of the M1 domain which is most likely caused by the missing disulfide bond due to the mutation. This deformation presumably prevents the later (but crucial) binding of beta-2-microglobulin to HFE.
+
<xr id="tab:traj_vis"/> shows the trajectories for the three MD simulations (WT, R224W, and C282S). The wildtype and R224W mostly maintain the starting structure, though some larger parts shift as a whole. In both cases the upper and lower helices of the MHC I domain seem to move closer together, and the C1 domain (right part in the figure) shifts around but retains its general structure. The trajectory for C282S exhibits much less movement than the previous two, which are quite flexible. Towards the end of the simulation, however, the C1 domain deforms, most likely because the disulfide bond is missing due to the mutation. This deformation presumably prevents the later (but crucial) binding of beta-2-microglobulin to HFE.
   
 
<br style="clear:both;">
 
<br style="clear:both;">
Line 267: Line 325:
   
   
<figtable id="comparison">
+
<figtable id="minDistRun1">
 
{| class="wikitable" style="float: left; margin: 0 0 1em 0; border: 2px solid darkgray;" cellpadding="0"
 
{| class="wikitable" style="float: left; margin: 0 0 1em 0; border: 2px solid darkgray;" cellpadding="0"
 
! scope="row" align="left" |
 
! scope="row" align="left" |
Line 279: Line 337:
 
<br style="clear:both;">
 
<br style="clear:both;">
   
The first calculations for the mutations resulted in the minimum distances of TABLETODO. As there should be at least 2 nm distance in between at all time one can see that the mutations show the opposite. Therefore it might be possible that the protein affects itself which is not desired. To see if this states were calculated just by chance (random fluctuations that built up over time into an undesired direction) we repeated the calculations for all three models.
+
The first calculations for the mutations resulted in the minimum distances shown in <xr id="minDistRun1"/>. The minimum distance should stay at or above 2 nm at all times, but the mutant simulations fall below this threshold. The protein might therefore interact with itself, which is not desired. To see whether these states arose just by chance (random fluctuations that built up over time in an undesired direction), we repeated the calculations for all three models.
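The 2 nm criterion can be checked automatically once the minimum distances are available as a time series. The following is only a minimal sketch: it assumes the distances are stored in the plain-text XVG layout (header lines starting with <code>#</code> or <code>@</code>, then one <code>time value</code> pair per line), and the sample numbers are made up for illustration; they are not values from our simulations. The function names are hypothetical.

```python
def read_xvg(text):
    """Parse XVG text into (time, value) pairs, skipping '#' and '@' header lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        fields = line.split()
        rows.append((float(fields[0]), float(fields[1])))
    return rows


def frames_below(rows, threshold_nm=2.0):
    """Return all (time, distance) pairs whose distance is below the threshold."""
    return [(t, d) for t, d in rows if d < threshold_nm]


# Made-up sample data for illustration only (time in ps, distance in nm).
sample = """# illustrative data, not simulation output
@ title "Minimum distance between periodic images"
0 2.41
5 2.10
10 1.87
15 1.95
"""

rows = read_xvg(sample)
violations = frames_below(rows)
print(len(violations), "of", len(rows), "frames are below 2 nm")
```

A run would be accepted only if the list of violations is empty (or, more leniently, short and confined to a small time window).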
   
 
<br style="clear:both;">
 
<br style="clear:both;">
   
<figtable id="comparison">
+
<figtable id="minDistRun2">
 
{| class="wikitable" style="float: left; margin: 0 0 1em 0; border: 2px solid darkgray;" cellpadding="0"
 
{| class="wikitable" style="float: left; margin: 0 0 1em 0; border: 2px solid darkgray;" cellpadding="0"
 
! scope="row" align="left" |
 
! scope="row" align="left" |
Line 295: Line 353:
 
<br style="clear:both;">
 
<br style="clear:both;">
   
The resulting minimal distances can be seen in TABLETODO.
+
The resulting minimal distances can be seen in <xr id="minDistRun2"/>.
 
We therefore decided to use the model of the first calculation for the wildtype, and the models of the second calculation for the mutation types.
 
We therefore decided to use the model of the first calculation for the wildtype, and the models of the second calculation for the mutation types.
   
Line 316: Line 374:
 
<br style="clear:both;">
 
<br style="clear:both;">
   
In general the RMS fluctuations of all three models look similar. The most differing graph is the one of R224W which shows a major peak at residues 220-235 as well as only a small peak at around residue 20.
+
In general the RMS fluctuations of all three models look similar (see <xr id="tab:rmsf_prot"/>). The graph that differs most is that of R224W, which shows a major peak at residues 220-235 and only a small peak at around residue 20.
   
 
==== C-Alpha based ====
 
==== C-Alpha based ====
Line 332: Line 390:
   
 
<br style="clear:both;">
 
<br style="clear:both;">
The c-alpha based RMSF shows the same behavior as the whole-protein-based ones: the major differences are at around residue 20 and 220-235 of the R224W mutation.
+
The C-alpha based RMSF (see <xr id="tab:rmsf_ca"/>) shows the same behavior as the whole-protein-based one: the major differences are at around residue 20 and at residues 220-235 of the R224W mutant.
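For reference, the RMSF of an atom is the root of its mean squared deviation from its own time-averaged position. The sketch below illustrates this definition on toy coordinates; it is not the tool used for the graphs above, it assumes the frames are already superimposed on a common reference, and the function name <code>rmsf</code> is hypothetical.

```python
import math


def rmsf(frames):
    """Per-atom RMSF.

    frames: list of frames, each frame a list of (x, y, z) tuples, one per atom.
    Frames are assumed to be already superimposed (no fitting is done here).
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy trajectory: atom 0 is fixed, atom 1 oscillates along x.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.8, 0.0, 0.0)],
]
print(rmsf(toy))
```

Restricting the atom list to C-alpha atoms gives the C-alpha based variant; a peak in the resulting profile marks a residue range that moves a lot around its average position.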
   
 
==== Statistical values ====
 
==== Statistical values ====
Line 413: Line 471:
 
-->
 
-->
   
  +
With a script (provided by Dr. Marc Offman) we calculated the p-values for the RMS fluctuations to identify significant differences between the wildtype and the mutants. Surprisingly all three pairwise comparisons had very low p-values:
With the GIVENSCRIPTTODO we calculated the values for the t-Test:
 
   
  +
* Wildtype and R224W: 8.079405e-57
1a6zC to R224W :
 
  +
* Wildtype and C282S: 2.87453e-56
8.079405e-57**
 
  +
* R224W and C282S: 5.45069e-25
   
1a6zC to C282S :
 
2.87453e-56**
 
   
  +
These results should be treated with care, though, as the significance need not be caused by the mutations; it could also be caused by random events during the simulation. In fact, even the wildtype alone (simulation 1 compared to simulation 2) showed a highly significant difference.
R224W to C282S :
 
5.45069e-25**
 
   
 
<br style="clear:both;">
 
<br style="clear:both;">
Line 441: Line 497:
 
<br style="clear:both;">
 
<br style="clear:both;">
   
In these part we evaluate the model averages and b-factors of each position. Because it is an average over all timesteps this averaged structure can be impossible in nature.
+
In this part we evaluate the model averages and b-factors of each position. Because the structure is an average over all timesteps, it may be physically impossible.
   
As one can see there is a big change of b-factors when comparing the wildtype and both mutations we calculated. From both mutations the R224W one shows a bigger difference to the wildtype. As expected the positions "at the edges" <!-- most exposed?--> and those not in a secondary structure tend to have higher fluctuations/higher b-factors. It is worth noting, that even the beta sheets on the right side of the R224W picture (position 180 and higher) have (compared to wildtype and C282S mutation) pretty high b-factors. Also one can see that a little helix is inserted into the average structure of the R224W average model (right part of the picture, high b-factor [red], position ~220-225).
+
As one can see (compare: <xr id="tab:avg_bfactor"/>), the b-factors change considerably between the wildtype and the two mutants we calculated. Of the two mutants, R224W differs more from the wildtype. As expected, the positions "at the edges" <!-- most exposed TODO?--> and those not in a secondary structure element tend to have higher fluctuations and thus higher b-factors. It is worth noting that even the beta sheets on the right side of the R224W picture (position 180 and higher) have fairly high b-factors compared to the wildtype and the C282S mutant. One can also see that a small helix appears in the R224W average structure (right part of the picture, high b-factor [red], position ~220-225).
   
  +
Also worth noting is that the b-factors calculated through NMA (cf. <xr id="tab:avg_bfactor"/> and [[Normal_Mode_Analysis_Hemochromatosis#Atomic_fluctuations| WebNMA fluctuations]]) are similar to our MD-calculated wildtype b-factors. Elnemo (cf. [[Normal_Mode_Analysis_Hemochromatosis#B-factors| elNemo b-factors]]) may also have calculated b-factors similar to the MD ones, but these could be hidden in this depiction by our threshold.
 
<br style="clear:both;">
 
<br style="clear:both;">
   
Line 473: Line 530:
   
 
<br style="clear:both;">
 
<br style="clear:both;">
These graphs show the radii of gyration in general als well as in each dimension. From start to end there is a slight increase in general gyrationradius for all models. However the fluctuation of the radius is the weakest for the wildtype and the C282S mutation. There seem to be most fluctuation in the R224W model. Also the R224W mutation has in general the highes radius of gyration.
+
These graphs (<xr id="tab:gyration_ca"/> and <xr id="tab:gyration_prot"/>) show the radii of gyration overall as well as in each dimension. From start to end there is a slight increase in the overall radius of gyration for all models. The fluctuation of the radius is weakest for the wildtype and the C282S mutant and strongest for the R224W model. R224W also has the highest radius of gyration overall.
   
 
Another striking difference between the models is the low radius of gyration in the Z dimension for the mutants, whereas in the wildtype it is low only for the first ~3000ps. In exchange, both mutants gain radius of gyration in the Y dimension compared to the wildtype.

Another striking difference between the models is the low radius of gyration in the Z dimension for the mutants, whereas in the wildtype it is low only for the first ~3000ps. In exchange, both mutants gain radius of gyration in the Y dimension compared to the wildtype.
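As a reminder of what is plotted, the overall radius of gyration is the mass-weighted root mean square distance of the atoms from their center of mass. The sketch below shows this standard definition on toy data; the function name is hypothetical, and the per-dimension components shown in the graphs are not reproduced here because their exact definition depends on the analysis tool.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of a set of atoms.

    coords: list of (x, y, z) tuples; masses: list of atomic masses.
    """
    total = sum(masses)
    # Center of mass.
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    # Mass-weighted sum of squared distances from the center of mass.
    weighted = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total)


# Toy example: two atoms 2 units apart.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(radius_of_gyration(coords, [1.0, 1.0]))  # equal masses
print(radius_of_gyration(coords, [3.0, 1.0]))  # heavier first atom
```

A rising value over the trajectory means the structure becomes less compact, which is how the slight overall increase described above should be read.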
Line 479: Line 536:
 
<br style="clear:both;">
 
<br style="clear:both;">
   
=== solvent accessible surface area ===
+
=== Solvent accessible surface area ===
   
 
<figtable id="tab:sas">
 
<figtable id="tab:sas">
Line 492: Line 549:
 
</figtable>
 
</figtable>
 
<br style="clear:both;">
 
<br style="clear:both;">
The preceding graphs show the Solvent accessible surface. As the hydrophilic, total and D Gsolv values are very similar we will just talk about the hydrophobic values.
+
The preceding graphs (see <xr id="tab:sas"/>) show the solvent accessible surface. As the hydrophilic, total, and D Gsolv values are very similar, we discuss only the hydrophobic values.
   
 
Of the three, the wildtype and the R224W mutant are the most similar: they fluctuate around about the same value and differ only in the strength of the fluctuation, which is higher for the mutant.

Of the three, the wildtype and the R224W mutant are the most similar: they fluctuate around about the same value and differ only in the strength of the fluctuation, which is higher for the mutant.
Line 511: Line 568:
   
 
<br style="clear:both;">
 
<br style="clear:both;">
In the area per residue averaged over the trajectory no clear difference can be seen.
+
In the area per residue averaged over the trajectory no clear difference can be seen (<xr id="tab:sas_res"/>).
   
 
<br style="clear:both;">
 
<br style="clear:both;">
   
=== hydrogen-bonds between protein and protein / protein and water ===
+
=== Hydrogen-bonds between protein and protein / protein and water ===
 
==== Protein-Protein ====
 
==== Protein-Protein ====
   
Line 530: Line 587:
 
<br style="clear:both;">
 
<br style="clear:both;">
   
The calculations show that in each of the three cases the number of bonds within the protein as well as their distances tend to stay the same.
+
The calculations show that in each of the three cases (see <xr id="tab:hbonds_pp"/>) the number of bonds within the protein as well as their distances tend to stay the same.
   
 
<br style="clear:both;">
 
<br style="clear:both;">
Line 548: Line 605:
 
<br style="clear:both;">
 
<br style="clear:both;">
   
Although the inner-protein bonds seem to stay the same (see PREVIOUSSECTIONTODO), the bonds formed with hydrogen show a different behavior.
+
Although the intra-protein hydrogen bonds seem to stay the same (compare section [[Molecular_Dynamics_Simulations_Analysis_Hemochromatosis#Protein-Protein|Protein-Protein]]), the hydrogen bonds formed with water show a different behavior (see <xr id="tab:hbonds_pw"/>).
   
In case of the wildtype, both the number of hydrogen bonds as well as the number of pairs within 0.35nm are (compared to the rest) fairly low at first, rising in the first ~600ps. This may be an indication that at first a dense protein state is existent. Also from ~8000-8600ps there is a drop in the number of hydrogen bonds, whereas the number of pairs within 0.35nm does not show an equal behavior. Overall the numbers tend to be at the same level each after the first 600ps rise.
+
In the case of the wildtype, both the number of hydrogen bonds and the number of pairs within 0.35nm are lower at first (compared to the values of the following timesteps) and rise during the first ~600ps. This may indicate that the protein is initially in a compact state. From ~8000-8600ps there is also a drop in the number of hydrogen bonds, whereas the number of pairs within 0.35nm does not show a corresponding drop. Overall, both numbers stay at a roughly constant level after the initial 600ps rise.
   
 
The R224W model shows a different behavior over time:
 
The R224W model shows a different behavior over time:
Line 614: Line 671:
   
 
The RMSD fluctuations for C282S exhibit the opposite behavior to R224W. There is a major structural change at 2000ps and some minor changes up to 3000ps. After 3000ps the structure remains almost the same until the end of the simulation. As the mutation is harmful, it can be assumed that the protein is trapped in a non-functional structure.

The RMSD fluctuations for C282S exhibit the opposite behavior to R224W. There is a major structural change at 2000ps and some minor changes up to 3000ps. After 3000ps the structure remains almost the same until the end of the simulation. As the mutation is harmful, it can be assumed that the protein is trapped in a non-functional structure.
  +
  +
<br style="clear:both;">
  +
  +
=== Internal RMSD ===
  +
  +
  +
<figtable id="tab:rmsd_vs_start">
  +
{| class="wikitable" style="float: left; margin: 0 0 1em 0; border: 2px solid darkgray;" cellpadding="0"
  +
<!--
  +
! scope="row" align="left" |
  +
| align="right" | [[File:Hemo_MD_1a6zC_rmsd-ca-vs-start.png|thumb|300px|Wildtype]]
  +
| align="right" | [[File:Hemo_MD_R224W_rmsd-ca-vs-start_Run2.png|thumb|300px|R224W]]
  +
| align="right" | [[File:Hemo_MD_C282S_rmsd-ca-vs-start_Run2.png|thumb|300px|C282S]]
  +
|-
  +
-->
  +
! scope="row" align="left" |
  +
| align="right" | [[File:Hemo_MD_1a6zC_rmsd-allProt-atom-vs-start.png|thumb|300px|Wildtype]]
  +
| align="right" | [[File:Hemo_MD_R224W_rmsd-allProt-atom-vs-start_Run2.png|thumb|300px|R224W]]
  +
| align="right" | [[File:Hemo_MD_C282S_rmsd-allProt-atom-vs-start_Run2.png|thumb|300px|C282S]]
  +
|-
  +
|+ style="caption-side: bottom; text-align: left" |<font size=1>'''Table 21:''' RMSD of the calculated models over time against the beginning structure. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)
  +
|}
  +
</figtable>
  +
<br style="clear:both;">
  +
  +
<figtable id="tab:rmsd_vs_avg">
  +
{| class="wikitable" style="float: left; margin: 0 0 1em 0; border: 2px solid darkgray;" cellpadding="0"
  +
<!--
  +
! scope="row" align="left" |
  +
| align="right" | [[File:Hemo_MD_1a6zC_rmsd-ca-vs-average.png|thumb|300px|Wildtype]]
  +
| align="right" | [[File:Hemo_MD_R224W_rmsd-ca-vs-average_Run2.png|thumb|300px|R224W]]
  +
| align="right" | [[File:Hemo_MD_C282S_rmsd-ca-vs-average_Run2.png|thumb|300px|C282S]]
  +
|-
  +
-->
  +
! scope="row" align="left" |
  +
| align="right" | [[File:Hemo_MD_1a6zC_rmsd-all-atom-vs-average.png|thumb|300px|Wildtype]]
  +
| align="right" | [[File:Hemo_MD_R224W_rmsd-all-atom-vs-average_Run2.png|thumb|300px|R224W]]
  +
| align="right" | [[File:Hemo_MD_C282S_rmsd-all-atom-vs-average_Run2.png|thumb|300px|C282S]]
  +
|-
  +
|+ style="caption-side: bottom; text-align: left" |<font size=1>'''Table 22:''' RMSD of the calculated models over time against the average structure (average based on all models over time). From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)
  +
|}
  +
</figtable>
  +
<br style="clear:both;">
  +
  +
<xr id="tab:rmsd_vs_start"/> and <xr id="tab:rmsd_vs_avg"/> show the RMSD values for the different structural states compared to the starting and average structure respectively.
  +
  +
The first (against starting structure) in particular reflects what could already be seen in the [[Molecular_Dynamics_Simulations_Analysis_Hemochromatosis#RMSD_matrix|RMSD matrix]] section. The wildtype quickly assumes what seems to be its average structure and then periodically fluctuates around it. R224W slowly deviates from the starting structure until 6500ps where it has a moderate structural change and a strong one later at around 9500ps. C282S has a strong deviation at 2000ps and then remains quite stable till the end.
  +
  +
When compared to the average structure, the wildtype reaches its favored conformation at around 2000ps. There are some peaks around 4300ps, 5100ps, 7900ps, and 9100ps, but the RMSD always goes down afterwards. The highest peak, though, is at the very end, which suggests a disruption of the structural equilibrium. This might be caused by the drop in the [[Molecular_Dynamics_Simulations_Analysis_Hemochromatosis#Minimum_distance_between_periodic_boundary_cells|MinDist]] between the boundaries below 2nm at the end of the simulation and would then be a result of unwanted interactions between the simulated proteins.
  +
  +
In contrast to the wildtype, R224W has a higher RMSD to its average structure (0.3 compared to 0.2). This suggests a higher number of different states which are still quite similar to the starting structure (RMSD ~0.4). At the end of the simulation the equilibrium is disrupted (as in the wildtype simulation), but this time the reason is unknown, as the minimum distance between boundaries is at one of its highest values in this time period.
  +
  +
Of the three simulations, C282S shows the most stable average structure. It is reached after two major structural changes at around 2000ps and 3000ps. After that, C282S exhibits the lowest RMSD fluctuations of the trio.
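The two kinds of curves discussed in this section differ only in the reference structure. The following minimal sketch makes that explicit on toy coordinates: it computes the RMSD of each frame against the first frame and against the coordinate average over all frames. It assumes the frames are already superimposed (no least-squares fit is performed), and the helper names are hypothetical rather than taken from the tool used for the graphs.

```python
import math


def rmsd(a, b):
    """RMSD between two structures given as equally long lists of (x, y, z)."""
    squared = sum((p[k] - q[k]) ** 2 for p, q in zip(a, b) for k in range(3))
    return math.sqrt(squared / len(a))


def average_structure(frames):
    """Coordinate-wise average over all frames."""
    n = len(frames)
    return [
        tuple(sum(f[i][k] for f in frames) / n for k in range(3))
        for i in range(len(frames[0]))
    ]


# Toy trajectory with two atoms and two frames.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)],
]
start = frames[0]
average = average_structure(frames)

vs_start = [rmsd(f, start) for f in frames]
vs_average = [rmsd(f, average) for f in frames]
print(vs_start)
print(vs_average)
```

Against the starting structure the curve begins at zero and grows as the protein drifts; against the average it is lowest while the protein stays in its dominant conformation, which is why the peaks in the second set of graphs mark excursions away from that conformation.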
   
 
<br style="clear:both;">
 
<br style="clear:both;">
Line 637: Line 747:
 
| align="right" | [[File:Hemo_MD_C282S_PP cluster-sizes_Run2.png|thumb|300px]]
 
| align="right" | [[File:Hemo_MD_C282S_PP cluster-sizes_Run2.png|thumb|300px]]
 
|-
 
|-
|+ style="caption-side: bottom; text-align: left" |<font size=1>'''Table 21:''' Graphs showing the cluster sizes of the three models. The clustering was based on the whole protein. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)
+
|+ style="caption-side: bottom; text-align: left" |<font size=1>'''Table 23:''' Graphs showing the cluster sizes of the three models. The clustering was based on the whole protein. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)
 
|}
 
|}
 
</figtable>
 
</figtable>
 
<br style="clear:both;">
 
<br style="clear:both;">
   
<figtable id="cluster_size_ca">
+
<figtable id="tab:cluster_size_ca">
 
{| class="wikitable" style="float: left; margin: 0 0 1em 0; border: 2px solid darkgray;" cellpadding="0"
 
{| class="wikitable" style="float: left; margin: 0 0 1em 0; border: 2px solid darkgray;" cellpadding="0"
 
<!-- Mit reinnehmen? Unklar was man sagen könnt [Vadim]
 
<!-- Mit reinnehmen? Unklar was man sagen könnt [Vadim]
Line 658: Line 768:
 
| align="right" | [[File:Hemo_MD_C282S_MCB cluster-sizes_Run2.png|thumb|300px]]
 
| align="right" | [[File:Hemo_MD_C282S_MCB cluster-sizes_Run2.png|thumb|300px]]
 
|-
 
|-
|+ style="caption-side: bottom; text-align: left" |<font size=1>'''Table 22:''' Graphs showing the cluster sizes of the three models. The clustering was based on the C-alpha atoms of the protein. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)
+
|+ style="caption-side: bottom; text-align: left" |<font size=1>'''Table 24:''' Graphs showing the cluster sizes of the three models. The clustering was based on the C-alpha atoms of the protein. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)
 
|}
 
|}
 
</figtable>
 
</figtable>
 
<br style="clear:both;">
 
<br style="clear:both;">
   
  +
<xr id="tab:cluster_size_prot"/> and <xr id="tab:cluster_size_ca"/> show a clustering of the different structural states during the simulation. A cutoff of 0.18 was used for these clusters and they were separately calculated based on the whole protein (cf. <xr id="tab:cluster_size_prot"/>) and the C-alpha atoms (cf. <xr id="tab:cluster_size_ca"/>) only. As expected there are fewer clusters (conformations) for the C-alpha atoms only due to the exclusion of residue variations (rotations).
   
  +
The wildtype and C282S show a similar behavior in that the first two to three clusters represent the majority conformations (the ones that are present most often during the simulation). There are 33 (14 C-alpha only) clusters for the wildtype and 25 (13 C-alpha only) clusters for C282S. R224W, in contrast, has 42 (20 C-alpha only) different clusters, and the majority of its conformations is spread over more clusters. This suggests that there are not only more conformations, but that they are also maintained for a longer time or assumed more often by the HFE mutant (i.e. it is more flexible).
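To illustrate how a cutoff turns pairwise RMSD values into clusters, here is a minimal sketch of single-linkage clustering, in which two structures end up in the same cluster whenever a chain of pairwise distances at or below the cutoff connects them. This is an assumption made for illustration: the clustering method actually used for the graphs is not stated in this section and may differ. The distance matrix is made up, and the function name is hypothetical.

```python
def cluster_by_cutoff(dist, cutoff):
    """Single-linkage clustering on a symmetric distance matrix.

    Returns a list of clusters (sorted lists of frame indices), largest first.
    """
    unassigned = set(range(len(dist)))
    clusters = []
    while unassigned:
        seed = min(unassigned)
        unassigned.remove(seed)
        members, stack = [seed], [seed]
        while stack:
            current = stack.pop()
            # All still-unassigned frames within the cutoff of the current frame.
            near = sorted(j for j in unassigned if dist[current][j] <= cutoff)
            for j in near:
                unassigned.remove(j)
                members.append(j)
                stack.append(j)
        clusters.append(sorted(members))
    clusters.sort(key=len, reverse=True)
    return clusters


# Made-up pairwise RMSD matrix for five frames (illustration only).
dist = [
    [0.00, 0.10, 0.25, 0.60, 0.90],
    [0.10, 0.00, 0.17, 0.55, 0.80],
    [0.25, 0.17, 0.00, 0.40, 0.70],
    [0.60, 0.55, 0.40, 0.00, 0.30],
    [0.90, 0.80, 0.70, 0.30, 0.00],
]
clusters = cluster_by_cutoff(dist, cutoff=0.18)
print(len(clusters), [len(c) for c in clusters])
```

In this toy case frames 0 and 2 are farther apart than the cutoff but still share a cluster because frame 1 links them. Counting clusters and their sizes in this way is what the cluster-size graphs summarize: many clusters of comparable size point to a more flexible structure.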
<br style="clear:both;">
 
   
  +
<figtable id="tab:top_clusters">
=== Internal RMSD ===
 
  +
{| class="wikitable" style="float: left; margin: 1em 0 1em 0; border: 2px solid darkgray;" cellpadding="0"
 
 
<figtable id="tab:rmsd_vs_start">
 
{| class="wikitable" style="float: left; margin: 0 0 1em 0; border: 2px solid darkgray;" cellpadding="0"
 
<!--
 
 
! scope="row" align="left" |
 
! scope="row" align="left" |
| align="right" | [[File:Hemo_MD_1a6zC_rmsd-ca-vs-start.png|thumb|300px|Wildtype]]
+
| align="right" | [[File:Hemo_MD_1a6zC_PP_Clusters.png|thumb|300px|Wildtype]]
| align="right" | [[File:Hemo_MD_R224W_rmsd-ca-vs-start_Run2.png|thumb|300px|R224W]]
+
| align="right" | [[File:Hemo_MD_R224W_PP_Run2_Clusters.png|thumb|300px|R224W]]
| align="right" | [[File:Hemo_MD_C282S_rmsd-ca-vs-start_Run2.png|thumb|300px|C282S]]
+
| align="right" | [[File:Hemo_MD_C282S_PP_Run2_Clusters.png|thumb|300px|C282S]]
 
|-
 
|-
-->
 
! scope="row" align="left" |
 
| align="right" | [[File:Hemo_MD_1a6zC_rmsd-allProt-atom-vs-start.png|thumb|300px|Wildtype]]
 
| align="right" | [[File:Hemo_MD_R224W_rmsd-allProt-atom-vs-start_Run2.png|thumb|300px|R224W]]
 
| align="right" | [[File:Hemo_MD_C282S_rmsd-allProt-atom-vs-start_Run2.png|thumb|300px|C282S]]
 
|-
 
|+ style="caption-side: bottom; text-align: left" |<font size=1>'''Table 23:''' RMSD of the calculated models over time against the beginning structure. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)
 
|}
 
</figtable>
 
<br style="clear:both;">
 
 
<figtable id="tab:rmsd_vs_avg">
 
{| class="wikitable" style="float: left; margin: 0 0 1em 0; border: 2px solid darkgray;" cellpadding="0"
 
 
<!--
 
<!--
 
! scope="row" align="left" |
 
! scope="row" align="left" |
| align="right" | [[File:Hemo_MD_1a6zC_rmsd-ca-vs-average.png|thumb|300px|Wildtype]]
+
| align="right" | [[File:Hemo_MD_1a6zC_MCB_Clusters.png|thumb|300px|Wildtype]]
| align="right" | [[File:Hemo_MD_R224W_rmsd-ca-vs-average_Run2.png|thumb|300px|R224W]]
+
| align="right" | [[File:Hemo_MD_R224W_MCB_Run2_Clusters.png|thumb|300px|R224W]]
| align="right" | [[File:Hemo_MD_C282S_rmsd-ca-vs-average_Run2.png|thumb|300px|C282S]]
+
| align="right" | [[File:Hemo_MD_C282S_MCB_Run2_Clusters.png|thumb|300px|C282S]]
 
|-
 
|-
 
-->
 
-->
  +
|+ style="caption-side: bottom; text-align: left" |<font size=1>'''Table 25:''' Comparison between the two most abundant clusters based on a cutoff of 0.18. The bigger cluster is colored green, the smaller red. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)
! scope="row" align="left" |
 
| align="right" | [[File:Hemo_MD_1a6zC_rmsd-all-atom-vs-average.png|thumb|300px|Wildtype]]
 
| align="right" | [[File:Hemo_MD_R224W_rmsd-all-atom-vs-average_Run2.png|thumb|300px|R224W]]
 
| align="right" | [[File:Hemo_MD_C282S_rmsd-all-atom-vs-average_Run2.png|thumb|300px|C282S]]
 
|-
 
|+ style="caption-side: bottom; text-align: left" |<font size=1>'''Table 24:''' RMSD of the calculated models over time against the average structure (average based on all models over time). From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)
 
 
|}
 
|}
 
</figtable>
 
</figtable>
 
<br style="clear:both;">
 
<br style="clear:both;">
   
  +
The two biggest clusters for each simulation are shown in <xr id="tab:top_clusters"/>. The clusters are aligned by the helices only, which explains the good match for the MHC I domain and also helps to show the relative motion between the two domains. The wildtype and C282S are again quite similar in that both clusters have about the same structure. R224W, in contrast, shows quite strong inter-domain movement between its two most favored conformations. This further supports the suspected increased flexibility of the R224W mutant.
<xr id="tab:rmsd_vs_start"/> and <xr id="tab:rmsd_vs_avg"/> show the RMSD values for the different structural states compared to the starting and average structure respectively.
 
   
  +
<br style="clear:both;">
The first (against starting structure) in particular reflects what could already be seen in the [[Molecular_Dynamics_Simulations_Analysis_Hemochromatosis#RMSD_matrix|RMSD matrix]] section. The wildtype quickly assumes what seems to be its average structure and then periodically fluctuates around it. R224W slowly deviates from the starting structure until 6500ps where it has a moderate structural change and a strong one later at around 9500ps. C282S has a strong deviation at 2000ps and then remains quite stable till the end.
 
   
  +
== Conclusion ==
When compared to the average structure the wildtype reaches its favored conformation at around 2000ps. There are some peaks around 4300ps, 5100ps, 7900ps, and 9100ps, but the RMSD alsways goes down afterwards. At the very end, though, is the highest peak which suggests a disruption of the structural equilibrium. This might be caused by the drop in the [[Molecular_Dynamics_Simulations_Analysis_Hemochromatosis#Minimum_distance_between_periodic_boundary_cells|MinDist]] between the boundaries below 2nm at the end of the simulation and therefore be a result of unwanted interactions between the simulated proteins.
 
   
In contrast to the wildtype R224W has a higher RMSD to its average structure (0.3 compared to 0.2). This suggests a higher number of different states which are still quite similar to the starting structure (RMSD ~0.4). At the end of the simulation the equilibrium is disrupted (like in the wildtype simulation), but this time the reason is unkown as the minimum distance between boundaries is one of the highes values in this time period.
 
   
  +
When comparing the mutants' simulations to the wildtype, R224W seems to differ somewhat more than C282S. In particular, the flexibility of the whole protein seems to be increased, but whether this is a blessing or a curse is as yet unknown. Based on the MD results, C282S would seem to be the less harmful of the two mutations, yet it is known to be one of the more harmful ones. Overall we have gained some insight into the behavior of the three variants during the MD simulations, though it was sometimes hard to make sense of the information or to decide whether an observation is good or bad for the protein's function.
Of the three simulations C282S shows the most stable average structure. It is accomplished by two major structural changes at around 2000ps and 3000ps. After that C282S exhibits the lowest RMSD fluctuations of the trio.
 
  +
  +
This task showed the power of molecular dynamics simulations, but also their weakness. While they provide a lot of data, one can easily be misled by the results of a single run (as the differences between our two runs showed). To get reliable results, multiple simulations (5+) should be performed to obtain some kind of average or consensus and to eliminate outliers.
   
 
<br style="clear:both;">
 
<br style="clear:both;">

Latest revision as of 21:26, 31 August 2012

Hemochromatosis>>Task 10: Molecular dynamics simulations analysis


Short task description

Detailed description: Molecular dynamics simulations analysis

Continuation of task 8. Analysis of the MD simulations for the wildtype and two mutants (R224W and C282S).


Protocol

A protocol with a description of the data acquisition and other scripts used for this task is available here.


Molecular dynamics simulations for HFE

Note: All pictures/graphs shown here are from the first run (in case of 1a6zC[wildtype]-pictures) or second run (in case of R224W- or C282S-mutation). The reason for this is described under section Minimum distance between periodic boundary cells.


Calculation statistics

<figtable id="tab:simulation_stats"> Statistics of the MD simulations

{| class="wikitable"
! Input !! Calc. time (h:mm:ss) !! Calc. speed !! Time to reach 1 s
|-
| Wildtype || 13:31:15 || 17.750 ns/day || 154350.8 years
|-
| C282S || 13:35:05 || 17.667 ns/day || 155075.9 years
|-
| R224W || 13:35:02 || 17.668 ns/day || 155067.1 years
|}

</figtable>

gmxcheck confirmed for all simulations that all 2001 frames (one frame every 5 ps) were written, resulting in a 10 ns trajectory for each model.
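The numbers in the statistics table can be reproduced with simple arithmetic. The following is a minimal sketch; the function names are ours, and the 365-day year is an assumption (it is the value that reproduces the table).

```python
NS_PER_SECOND = 1e9
DAYS_PER_YEAR = 365  # assumption: plain 365-day year, which matches the table


def trajectory_length_ps(n_frames, frame_spacing_ps):
    """Simulated time covered by n_frames, counting the frame at t = 0."""
    return (n_frames - 1) * frame_spacing_ps


def years_to_simulate_one_second(speed_ns_per_day):
    """Wall-clock years needed for 1 s of simulated time at the given speed."""
    days = NS_PER_SECOND / speed_ns_per_day
    return days / DAYS_PER_YEAR


if __name__ == "__main__":
    # 2001 frames, 5 ps apart (values reported by gmxcheck)
    print(trajectory_length_ps(2001, 5), "ps")
    for name, speed in [("Wildtype", 17.750), ("C282S", 17.667), ("R224W", 17.668)]:
        print(name, round(years_to_simulate_one_second(speed), 1), "years")
```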

Trajectory visualization

<figtable id="tab:traj_vis">

Wildtype
R224W
C282S
Table 2: Visualization of the molecular dynamics simulation trajectories. The calculated states (red) are superimposed on the PDB structure of 1a6zC (green). The timeframe of 10000ps was divided into 51 frames. The trajectories from left to right are: wildtype, R224W, and C282S.

</figtable>

<xr id="tab:traj_vis"/> shows the trajectories for the three MD simulations (WT, R224W, and C282S). The wildtype and R224W mostly maintain the starting structure, though some larger parts are shifted as a whole. In both cases the upper and lower helices of the MHC I domain seem to move closer together and the C1 domain (right part in the figure) shifts around but retains its general structure. The trajectory for C282S exhibits much less movement than the previous two, which are quite flexible. Towards the end of the simulation there is a deformation of the C1 domain which is most likely caused by the missing disulfide bond due to the mutation. This deformation presumably prevents the later (but crucial) binding of beta-2-microglobulin to HFE.


Energies

Pressure

<figtable id="tab:pressure">

Hemo MD 1a6zC pressure.png
Hemo MD R224W pressure Run2.png
Hemo MD C282S pressure Run2.png
Table 3: Pressures of the three calculated models over time. The red line denotes the average over 100 steps (500ps). From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

The plots in <xr id="tab:pressure"/> show the pressures of the calculated systems over time. These show that, although the pressures differ greatly in some cases, the average is still at about 0 (with minor fluctuations).
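The red lines in the energy plots are averages over 100 steps. A minimal sketch of such a running average is given below; whether the plotted average is trailing, centered or block-wise is not stated here, so the trailing window used in this example is an assumption.

```python
def running_average(values, window=100):
    """Trailing running average: one mean per window position, sliding by one frame."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    window_sum = sum(values[:window])
    averages = [window_sum / window]
    for i in range(window, len(values)):
        # slide the window: add the newest value, drop the oldest one
        window_sum += values[i] - values[i - window]
        averages.append(window_sum / window)
    return averages


if __name__ == "__main__":
    # toy series; with 5 ps per frame a window of 100 frames spans 500 ps
    print(running_average([1, 2, 3, 4, 5], window=2))
```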


Temperature

<figtable id="tab:temperature">

Hemo MD 1a6zC temperature.png
Hemo MD R224W temperature Run2.png
Hemo MD C282S temperature Run2.png
Table 4: Temperatures of the three calculated models over time. The red line denotes the average over 100 steps (500ps). From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

Next we calculated the temperatures, which are shown for all three models in <xr id="tab:temperature"/>. The maximal deviation from the average is about 4 degrees for all models.


Potential

<figtable id="tab:potential">

Hemo MD 1a6zC potential.png
Hemo MD R224W potential Run2.png
Hemo MD C282S potential Run2.png
Table 5: different potential energies of the three calculated models over time. The red line denotes the average over 100 steps (500ps). From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

With GROMACS we could also extract the potential energies, as can be seen in <xr id="tab:potential"/>. As in the previous plots, the average of each model fluctuates around a constant value. However, these values differ slightly between the models:

The average potential energies of the wildtype and the C282S mutant are about the same (~-9.195e+05), whereas that of the R224W mutant is slightly higher (~-9.19e+05).

Total energy

<figtable id="tab:total_energy">

Hemo MD 1a6zC totalEnergy.png
Hemo MD R224W totalEnergy Run2.png
Hemo MD C282S totalEnergy Run2.png
Table 6: different total energies of the three calculated models over time. The red line denotes the average over 100 steps (500ps). From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

In <xr id="tab:total_energy"/> the total energies are plotted over time. Again the average of each model shows only minor fluctuations around a constant value.


All these plots show the same behavior, with one exception: for each quantity the averages lie around the same value, while the individual curves look different between the three models. This can be expected, as minor changes can introduce or break interactions, thereby changing the overall energies, which then influence all further steps. The exception is the potential energy of the R224W mutant, which is slightly higher than that of the other two models.


Minimum distance between periodic boundary cells

<figtable id="minDistRun1">

Hemo MD 1a6zC minPeriodicDist.png
Hemo MD R224W minPeriodicDist.png
Hemo MD C282S minPeriodicDist.png
Table 7: Minimum distances between periodic images of the three calculated models over time (first run). From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

The first calculations resulted in the minimum distances shown in <xr id="minDistRun1"/>. The distance should stay at or above 2 nm at all times, but for the mutants it falls below this value. It is therefore possible that the protein interacts with its own periodic image, which is not desired. To see if these states arose just by chance (random fluctuations that built up over time in an undesired direction) we repeated the calculations for all three models.


<figtable id="minDistRun2">

Hemo MD 1a6zC minPeriodicDist Run2.png
Hemo MD R224W minPeriodicDist Run2.png
Hemo MD C282S minPeriodicDist Run2.png
Table 8: Minimum distances between periodic images of the three calculated models over time. The calculations are from the second run. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

The resulting minimum distances can be seen in <xr id="minDistRun2"/>. Based on both runs we decided to use the first run for the wildtype and the second run for the two mutants.
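The 2 nm criterion can also be checked programmatically instead of by eye. The sketch below assumes the minimum distances are available as an .xvg text file with the time in the first column and the distance in the second; lines starting with # or @ are header lines in that format. The example data is hypothetical and not taken from our runs.

```python
def parse_xvg(text):
    """Return (time, value) pairs from .xvg text, skipping '#' and '@' header lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        fields = line.split()
        rows.append((float(fields[0]), float(fields[1])))
    return rows


def frames_below_cutoff(rows, cutoff_nm=2.0):
    """All (time, distance) pairs whose distance is below the cutoff."""
    return [(t, d) for t, d in rows if d < cutoff_nm]


# hypothetical excerpt, for illustration only
EXAMPLE = """# minimum distance to periodic image
@ xaxis label "Time (ps)"
0 2.41
5 2.38
10 1.93
15 2.05
"""

if __name__ == "__main__":
    for time_ps, dist_nm in frames_below_cutoff(parse_xvg(EXAMPLE)):
        print(f"t = {time_ps} ps: {dist_nm} nm is below the cutoff")
```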


RMSF for protein and C-alpha

Protein based

<figtable id="tab:rmsf_prot">

Hemo MD 1a6zC prot rmsf.png
Hemo MD R224W prot rmsf Run2.png
Hemo MD C282S prot rmsf Run2.png
Table 9: RMS fluctuations per residue (based on the whole protein) of the three calculated models. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

In general the RMS fluctuations of all three models look similar (see <xr id="tab:rmsf_prot"/>). The most differing graph is the one of R224W which shows a major peak at residues 220-235 as well as only a small peak at around residue 20.

C-Alpha based

<figtable id="tab:rmsf_ca">

Hemo MD 1a6zC ca rmsf.png
Hemo MD R224W ca rmsf Run2.png
Hemo MD C282S ca rmsf Run2.png
Table 10: RMS fluctuations per residue (based on the C-alpha atoms of the protein) of the three calculated models. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>


The C-alpha based RMSF (see <xr id="tab:rmsf_ca"/>) shows the same behavior as the whole-protein-based one: the major differences are at around residue 20 and residues 220-235 of the R224W mutant.
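For reference, the RMSF of an atom is the root of its mean squared deviation from its own time-averaged position. The following minimal sketch illustrates the definition; it assumes that all frames are already superimposed on a common reference, and the data layout (a list of frames, each a list of (x, y, z) tuples) is our choice for illustration.

```python
import math


def rmsf(trajectory):
    """RMSF per atom; trajectory is a list of frames, each a list of (x, y, z)."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for a in range(n_atoms):
        # time-averaged position of atom a
        mean = [sum(frame[a][k] for frame in trajectory) / n_frames for k in range(3)]
        # mean squared deviation from that average position
        msd = sum(
            sum((frame[a][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


if __name__ == "__main__":
    # toy trajectory: atom 0 is fixed, atom 1 oscillates along x
    toy = [[(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (-1, 0, 0)]]
    print(rmsf(toy))
```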

Statistical values

With a script (provided by Dr. Marc Offman) we calculated the p-values for the RMS fluctuations to identify significant differences between the wildtype and the mutants. Surprisingly all three pairwise comparisons had very low p-values:

  • Wildtype and R224W: 8.079405e-57
  • Wildtype and C282S: 2.87453e-56
  • R224W and C282S: 5.45069e-25


These results are to be considered with care though as the significance does not have to be caused by the mutations, but could also be caused by random events during the simulation. In fact, the wildtype alone (simulation 1 compared to simulation 2) showed up as highly significant (different).


Pymol analysis of average and bfactor

<figtable id="tab:avg_bfactor">

Hemo MD 1a6zCProtAvg.png
Hemo MD R224WProtAvg Run2.png
Hemo MD C282SProtAvg Run2.png
Table 11: Pictures of the model averages (average over MD calculated states) colored by the b-factor. The range is from blue (bfactor value beneath threshold [500]) to red (high b-factor values). From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

In this part we evaluate the averaged structures and the b-factors of each position. Because each structure is an average over all timesteps, it may not correspond to a physically realistic conformation.

As one can see in <xr id="tab:avg_bfactor"/>, the b-factors change considerably between the wildtype and both mutants. Of the two mutants, R224W shows the bigger difference to the wildtype. As expected, the positions "at the edges" and those not in a secondary structure element tend to have higher fluctuations/higher b-factors. It is worth noting that even the beta sheets on the right side of the R224W picture (position 180 and higher) have fairly high b-factors compared to the wildtype and the C282S mutant. One can also see that a small helix appears in the average structure of R224W (right part of the picture, high b-factor [red], position ~220-225).

Also worth noting is that the b-factors calculated through NMA (cf. <xr id="tab:avg_bfactor"/> and WebNMA fluctuations) are similar to our MD-derived wildtype b-factors. It is also possible that elNemo (cf. elNemo b-factors) produced b-factors similar to the MD ones, but that these are hidden in this depiction by our threshold.
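B-factors and RMS fluctuations are related by the standard isotropic relation B = (8π²/3)·RMSF². The sketch below shows the conversion. That the b-factors shown here were derived exactly this way, and that the threshold of 500 is given in Å², are assumptions made for illustration.

```python
import math

# isotropic relation between B-factor and RMSF (both in Angstrom units)
B_PER_SQUARED_RMSF = 8.0 * math.pi ** 2 / 3.0


def bfactor_from_rmsf(rmsf_nm):
    """B-factor in A^2 from an RMSF given in nm."""
    rmsf_angstrom = rmsf_nm * 10.0
    return B_PER_SQUARED_RMSF * rmsf_angstrom ** 2


def rmsf_from_bfactor(bfactor):
    """RMSF in nm from a B-factor given in A^2."""
    return math.sqrt(bfactor / B_PER_SQUARED_RMSF) / 10.0


if __name__ == "__main__":
    # assumption: the colouring threshold of 500 is in A^2
    print(rmsf_from_bfactor(500.0), "nm")
```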

Radius of gyration

<figtable id="tab:gyration_ca">

Hemo MD 1a6zC ca gyration.png
Hemo MD R224W ca gyration Run2.png
Hemo MD C282S ca gyration Run2.png
Table 12: Radii of gyration (based on the C-alpha atoms of the protein backbone) of the three calculated models over time. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

<figtable id="tab:gyration_prot">

Hemo MD 1a6zC prot gyration.png
Hemo MD R224W prot gyration Run2.png
Hemo MD C282S prot gyration Run2.png
Table 13: Radii of gyration (based on the whole protein) of the three calculated models over time. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>


These graphs (<xr id="tab:gyration_ca"/> and <xr id="tab:gyration_prot"/>) show the overall radius of gyration as well as its component in each dimension. From start to end there is a slight increase in the overall radius of gyration for all models. The fluctuation of the radius is weakest for the wildtype and the C282S mutant and strongest for the R224W model. The R224W mutant also has, in general, the highest radius of gyration.

Another striking difference between the models is the low radius of gyration in the Z dimension for the mutants, whereas in the wildtype it is low only for the first ~3000ps. In exchange, both mutants gain radius of gyration in the Y dimension compared to the wildtype.
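As a reminder of what is plotted, the radius of gyration is the mass-weighted RMS distance of the atoms from their center of mass. The sketch below also computes a per-axis value as the radius about a coordinate axis (using only the two perpendicular coordinates); that the X/Y/Z curves in our plots follow this convention is an assumption.

```python
import math


def center_of_mass(coords, masses):
    total = sum(masses)
    return tuple(
        sum(m * c[k] for c, m in zip(coords, masses)) / total for k in range(3)
    )


def radius_of_gyration(coords, masses):
    """Mass-weighted RMS distance of all atoms from the center of mass."""
    com = center_of_mass(coords, masses)
    weighted = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for c, m in zip(coords, masses)
    )
    return math.sqrt(weighted / sum(masses))


def radius_of_gyration_about_axis(coords, masses, axis):
    """Radius about one axis (0 = x, 1 = y, 2 = z): only perpendicular coordinates count."""
    com = center_of_mass(coords, masses)
    others = [k for k in range(3) if k != axis]
    weighted = sum(
        m * sum((c[k] - com[k]) ** 2 for k in others)
        for c, m in zip(coords, masses)
    )
    return math.sqrt(weighted / sum(masses))


if __name__ == "__main__":
    # two equal masses on the x-axis, 2 units apart
    coords, masses = [(1, 0, 0), (-1, 0, 0)], [1.0, 1.0]
    print(radius_of_gyration(coords, masses))
    print([radius_of_gyration_about_axis(coords, masses, a) for a in range(3)])
```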


Solvent accessible surface area

<figtable id="tab:sas">

Hemo MD 1a6zC SAS.png
Hemo MD R224W SAS Run2.png
Hemo MD C282S SAS Run2.png
Table 14: display of the different solvent accessible surface sizes of the three calculated models over time. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>
The preceding graphs (see <xr id="tab:sas"/>) show the solvent accessible surface. As the hydrophilic, total and D Gsolv values are very similar between the models, we only discuss the hydrophobic values.

The two most similar curves are those of the wildtype and the R224W mutant, which differ only in the strength of the fluctuation (higher for the mutant). Both fluctuate around about the same value.

The C282S hydrophobic curve is less similar. It shows about the same fluctuation strength as the wildtype, but decreases towards the end. This might indicate that hydrophobic residues turn to the inside, but we cannot be sure as there is no increase in the other values.


<figtable id="tab:sas_res">

Hemo MD 1a6zC resSAS.png
Hemo MD R224W resSAS Run2.png
Hemo MD C282S resSAS Run2.png
Table 15: display of the different solvent accessible surface sizes (normalized to per residue values) of the three calculated models over time. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>


In the area per residue averaged over the trajectory no clear difference can be seen (<xr id="tab:sas_res"/>).


Hydrogen-bonds between protein and protein / protein and water

Protein-Protein

<figtable id="tab:hbonds_pp">

Hemo MD 1a6zC hBondsP2P.png
Hemo MD R224W hBondsP2P Run2.png
Hemo MD C282S hBondsP2P Run2.png
Table 16: the number of hydrogen bonds inside the protein of the three calculated models over time. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

The calculations show that in each of the three cases (see <xr id="tab:hbonds_pp"/>) the number of bonds within the protein as well as their distances tend to stay the same.


Protein-Water

<figtable id="tab:hbonds_pw">

Hemo MD 1a6zC hBondsP2W.png
Hemo MD R224W hBondsP2W Run2.png
Hemo MD C282S hBondsP2W Run2.png
Table 17: the number of hydrogen bonds of the protein with water of the three calculated models over time. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

Although the inner-protein bonds seem to stay the same (compare section Protein-Protein), the hydrogen bonds formed with water show a different behavior (see <xr id="tab:hbonds_pw"/>).

In case of the wildtype, both the number of hydrogen bonds and the number of pairs within 0.35nm are low at first (compared to the following timesteps) and rise during the first ~600ps. This may indicate that the protein starts in a dense state. From ~8000-8600ps there is a drop in the number of hydrogen bonds, whereas the number of pairs within 0.35nm does not show an equivalent drop. Overall both numbers stay at about the same level after the initial 600ps rise.

The R224W model shows a different behavior over time:

The first steps show a similar pattern, with both numbers low and rising until 600ps. But instead of fluctuating around one constant value there seems to be a slight decrease over time as well as bigger fluctuations. This affects both the number of bonds and the number of pairs within 0.35nm.


The calculated model of the C282S mutant behaves differently at the start than the two described above, but afterwards similarly to the R224W mutant:

For the number of hydrogen bonds there are very high fluctuations in the first 1000ps (rising till 100ps, dropping till 300ps, rising till 1000ps) with a slight overall decrease afterwards, as for the R224W mutant. The number of pairs within 0.35nm seems to show a similar behavior; however, because of the fluctuations this is not very clear.
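The two curves discussed above (hydrogen bonds and pairs within 0.35nm) differ by an angle criterion. The sketch below illustrates a geometric hydrogen bond test with a donor-acceptor distance cutoff of 0.35 nm and a hydrogen-donor-acceptor angle cutoff. The 30 degree angle is an assumption (a commonly used default), as the value used for our plots is not stated here.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def is_within_cutoff(donor, acceptor, r_cut=0.35):
    """Distance criterion only: donor and acceptor closer than r_cut (nm)."""
    return _norm(_sub(acceptor, donor)) <= r_cut


def is_hydrogen_bond(donor, hydrogen, acceptor, r_cut=0.35, angle_cut_deg=30.0):
    """Distance criterion plus hydrogen-donor-acceptor angle criterion."""
    donor_to_acceptor = _sub(acceptor, donor)
    donor_to_hydrogen = _sub(hydrogen, donor)
    if _norm(donor_to_acceptor) > r_cut:
        return False
    cos_angle = sum(x * y for x, y in zip(donor_to_acceptor, donor_to_hydrogen)) / (
        _norm(donor_to_acceptor) * _norm(donor_to_hydrogen)
    )
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding errors
    return math.degrees(math.acos(cos_angle)) <= angle_cut_deg


if __name__ == "__main__":
    donor, hydrogen = (0.0, 0.0, 0.0), (0.1, 0.0, 0.0)
    print(is_hydrogen_bond(donor, hydrogen, (0.3, 0.0, 0.0)))  # linear arrangement
    print(is_hydrogen_bond(donor, hydrogen, (0.0, 0.3, 0.0)))  # close, but 90 degrees
```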


Ramachandran plots

<figtable id="tab:ramachandran">

Wildtype
R224W
C282S
Table 18: Ramachandran Plots of the three calculated models. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

The Ramachandran plots (cf. <xr id="tab:ramachandran"/>) seem to follow the rules for the allowed and forbidden regions with a few exceptions. What is noticeable, though, is that all of the allowed regions are almost filled to the maximum (i.e. no trend towards certain regions). Another difference concerns the regions above and below the left-handed alpha helix area (Psi: 110 to 180 and -180 to -150; Phi: 50 to 80), which are missing in the R224W mutant and almost non-existent in the C282S mutant. These areas, however, have no significant structural element associated with them. Apart from that the plots appear almost the same, although they are hard to analyse as the dots are spread over a wide area and not clustered within distinct regions.
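Each dot in a Ramachandran plot is a (Phi, Psi) pair of backbone dihedral angles (Phi from the atoms C(i-1)-N-CA-C, Psi from N-CA-C-N(i+1)). The sketch below shows how a dihedral angle is computed from four points and how the region mentioned above could be tested. The helper names are ours and are for illustration only.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, -180..180) around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / length, b1[1] / length, b1[2] / length)
    # project b0 and b2 onto the plane perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def in_noted_region(phi, psi):
    """Regions above and below the left-handed helix area as given in the text."""
    return 50 <= phi <= 80 and (110 <= psi <= 180 or -180 <= psi <= -150)


if __name__ == "__main__":
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
    print(in_noted_region(60, 150))
```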


RMSD matrix

<figtable id="tab:rmsd_matrix_prot">

Wildtype
R224W
C282S
Table 19: RMSD matrices of the three calculated models over time (based on the whole protein) showing the RMSD between two models. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

<figtable id="tab:rmsd_matrix_mcb">

Wildtype
R224W
C282S
Table 20: RMSD matrices of the three calculated models over time (based on the mainchain and C-betas) showing the RMSD between two models. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

The RMSD matrices for the whole protein (cf. <xr id="tab:rmsd_matrix_prot"/>) and the mainchain and C-beta atoms (cf. <xr id="tab:rmsd_matrix_mcb"/>) are almost identical. The mainchain and C-beta matrices have slightly lower RMSD values, which is not surprising as only a subset of the atoms is taken into consideration. The small difference between the two matrix groups suggests that most of the structural changes are based on backbone rearrangements and not on orientational changes (rotations) of the side chains.

The wildtype seems to periodically change between conformations as the RMSD goes up (green-yellow) and down (light blue) along the x-axis (time) for most of the different structural states (y-axis). The only noticeable changes are at the beginning, as the structure that is present in the first 1000ps seems to be quickly discarded (highest RMSD with the other states). Overall the changes are minor as the maximum RMSD is about 0.65, which is still considered quite low.

The R224W mutant appears to be very stable in the beginning, but has two structural changes towards the end: a moderate one between 6500-8000ps and a strong one between 9500-10000ps. For the rest of the time it has even fewer structural fluctuations than the wildtype.

The RMSD fluctuations for C282S exhibit the opposite behavior to R224W. There is a major structural change at 2000ps and some minor changes up to 3000ps. After 3000ps the structure remains almost the same until the end of the simulation. As the mutation is malign it can be assumed that this is a non-functional structure in which the protein seems to be trapped.
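An RMSD matrix contains the RMSD between every pair of frames of a trajectory. The following minimal sketch shows the construction. It assumes that the frames have already been superimposed (no fitting step is included), and the data layout is our choice for illustration.

```python
import math


def rmsd(frame_a, frame_b):
    """RMSD between two frames with identical atom order (no superposition)."""
    squared = sum(
        sum((a[k] - b[k]) ** 2 for k in range(3)) for a, b in zip(frame_a, frame_b)
    )
    return math.sqrt(squared / len(frame_a))


def rmsd_matrix(trajectory):
    """Symmetric matrix of pairwise RMSD values between all frames."""
    n = len(trajectory)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(trajectory[i], trajectory[j])
    return matrix


if __name__ == "__main__":
    # toy trajectory: three frames of a single atom moving along x
    toy = [[(0, 0, 0)], [(1, 0, 0)], [(3, 0, 0)]]
    for row in rmsd_matrix(toy):
        print(row)
```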


Internal RMSD

<figtable id="tab:rmsd_vs_start">

Wildtype
R224W
C282S
Table 21: RMSD of the calculated models over time against the beginning structure. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

<figtable id="tab:rmsd_vs_avg">

Wildtype
R224W
C282S
Table 22: RMSD of the calculated models over time against the average structure (average based on all models over time). From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

<xr id="tab:rmsd_vs_start"/> and <xr id="tab:rmsd_vs_avg"/> show the RMSD values for the different structural states compared to the starting and average structure respectively.

The first (against starting structure) in particular reflects what could already be seen in the RMSD matrix section. The wildtype quickly assumes what seems to be its average structure and then periodically fluctuates around it. R224W slowly deviates from the starting structure until 6500ps where it has a moderate structural change and a strong one later at around 9500ps. C282S has a strong deviation at 2000ps and then remains quite stable till the end.

When compared to the average structure the wildtype reaches its favored conformation at around 2000ps. There are some peaks around 4300ps, 5100ps, 7900ps, and 9100ps, but the RMSD always goes down afterwards. At the very end, though, is the highest peak, which suggests a disruption of the structural equilibrium. This might be caused by the drop in the MinDist between the boundaries below 2nm at the end of the simulation and therefore be a result of unwanted interactions between the simulated proteins.

In contrast to the wildtype, R224W has a higher RMSD to its average structure (0.3 compared to 0.2). This suggests a higher number of different states which are still quite similar to the starting structure (RMSD ~0.4). At the end of the simulation the equilibrium is disrupted (like in the wildtype simulation), but this time the reason is unknown, as the minimum distance between boundaries is at one of its highest values in this time period.

Of the three simulations C282S shows the most stable average structure. It is reached after two major structural changes at around 2000ps and 3000ps. After that C282S exhibits the lowest RMSD fluctuations of the trio.
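The two kinds of curves compared in this section differ only in the reference structure. The sketch below computes the RMSD of every frame against a reference, once with the first frame and once with the coordinate average over all frames. As before, frames are assumed to be superimposed already, and the function names are ours.

```python
import math


def rmsd(frame_a, frame_b):
    """RMSD between two frames with identical atom order (no superposition)."""
    squared = sum(
        sum((a[k] - b[k]) ** 2 for k in range(3)) for a, b in zip(frame_a, frame_b)
    )
    return math.sqrt(squared / len(frame_a))


def average_structure(trajectory):
    """Coordinate-wise mean over all frames (may be physically unrealistic)."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    return [
        tuple(sum(frame[a][k] for frame in trajectory) / n_frames for k in range(3))
        for a in range(n_atoms)
    ]


def rmsd_series(trajectory, reference):
    """RMSD of every frame against one reference structure."""
    return [rmsd(frame, reference) for frame in trajectory]


if __name__ == "__main__":
    toy = [[(0, 0, 0)], [(2, 0, 0)]]
    print(rmsd_series(toy, toy[0]))                  # against the starting structure
    print(rmsd_series(toy, average_structure(toy)))  # against the average structure
```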


Cluster analysis

<figtable id="tab:cluster_size_prot">

Hemo MD 1a6zC PP cluster-sizes.png
Hemo MD R224W PP cluster-sizes Run2.png
Hemo MD C282S PP cluster-sizes Run2.png
Table 23: Graphs showing the cluster sizes of the three models. The clustering was based on the whole protein. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

<figtable id="tab:cluster_size_ca">

Hemo MD 1a6zC MCB cluster-sizes.png
Hemo MD R224W MCB cluster-sizes Run2.png
Hemo MD C282S MCB cluster-sizes Run2.png
Table 24: Graphs showing the cluster sizes of the three models. The clustering was based on the C-alpha atoms of the protein. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

<xr id="tab:cluster_size_prot"/> and <xr id="tab:cluster_size_ca"/> show a clustering of the different structural states during the simulation. A cutoff of 0.18 was used for these clusters and they were separately calculated based on the whole protein (cf. <xr id="tab:cluster_size_prot"/>) and the C-alpha atoms (cf. <xr id="tab:cluster_size_ca"/>) only. As expected there are fewer clusters (conformations) for the C-alpha atoms only due to the exclusion of residue variations (rotations).

The wildtype and C282S show a similar behavior in that the first two to three clusters represent the majority conformations (the ones that are present the most during the simulation). There are 33 (14 C-alpha only) clusters for the wildtype and 25 (13 C-alpha only) clusters for C282S. R224W in contrast has 42 (20 C-alpha only) different clusters and the majority of the conformations is spread throughout more clusters. This suggests that there are not only more conformations, but they are also maintained for a longer time or assumed more often by the HFE mutant (i.e. it is more flexible).
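To illustrate how a cutoff turns an RMSD matrix into clusters, the sketch below implements a simple neighbour-count scheme: the structure with the most neighbours within the cutoff becomes a cluster center, it is removed together with its neighbours, and the procedure is repeated. The clustering method used for our graphs is not stated here, so this particular algorithm is an assumption chosen for illustration.

```python
def neighbour_count_clusters(dist, cutoff):
    """Cluster frames from a symmetric distance matrix.

    Returns a list of (center, members) tuples, largest-neighbourhood first.
    """
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        best_center, best_members = None, []
        for i in sorted(remaining):
            # a frame always counts as its own neighbour (dist[i][i] == 0)
            members = [j for j in remaining if dist[i][j] <= cutoff]
            if len(members) > len(best_members):
                best_center, best_members = i, members
        clusters.append((best_center, sorted(best_members)))
        remaining -= set(best_members)
    return clusters


if __name__ == "__main__":
    # toy "RMSD" matrix built from points on a line; 0.18 is the cutoff from the text
    points = [0.0, 0.1, 0.15, 1.0, 1.1]
    dist = [[abs(a - b) for b in points] for a in points]
    for center, members in neighbour_count_clusters(dist, cutoff=0.18):
        print("center", center, "members", members)
```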

<figtable id="tab:top_clusters">

Wildtype
R224W
C282S
Table 25: Comparison between the two most abundant clusters based on a cutoff of 0.18. The bigger cluster is colored green, the smaller red. From left to right: 1a6zC (wildtype), mutation at position 224 (R224W) and mutation at position 282 (C282S)

</figtable>

The two biggest clusters for each simulation are shown in <xr id="tab:top_clusters"/>. The clusters are aligned by the helices only, which explains the good match for the MHC I domain, but it also helps to see the relative motions between both domains. The wildtype and C282S again are quite similar in that both clusters have about the same structure. R224W in contrast shows quite strong inter-domain movement between the two most favored conformations. This further supports the suspected increased flexibility of the R224W mutant.


Conclusion

When comparing the mutants' simulations to the wildtype, R224W seems to show somewhat more differences than C282S. In particular, the flexibility of the whole protein seems to be increased, but whether this is a blessing or a curse is not yet known. Based on the MD results C282S would seem to be the less harmful mutation of the two, yet it is known to be one of the more harmful ones. Overall we have gained some insight into the behavior of the three variants during the MD simulations, though it was sometimes hard to make sense of the information or to decide if it might be good or bad for the protein's function.

This task showed the power of molecular dynamics simulations, but it also showed their weakness. While they provide a lot of data, one could easily be misled by the results of a single run (as the differences between our two runs showed). In order to get reliable results, multiple simulations (5+) should be performed to get some kind of average or consensus and to eliminate outliers.

