Canavan Task 7 - Structure-based mutation analysis

From Bioinformatikpedia
Revision as of 14:54, 26 June 2012 by Vorbergs (talk | contribs)
A mermaid found a swimming lad,
Picked him for her own,
Pressed her body to his body,
Laughed; and plunging down
Forgot in cruel happiness
That even lovers drown.

William Butler Yeats

Protocol

Further information can be found in the protocol.

Mapping of mutations

<figure id="aspa_mut_overview">

<xr nolink id="aspa_mut_overview"/>
Chain A of the ASPA structure is shown in cartoon representation. Chain B is shown in surface representation to emphasize the dimer interface. The intermediate substrate analog is shown in blue. Disease-causing mutations are shown in red and non-disease-causing SNPs in orange.

</figure>

We use the same mutations for this analysis as for the sequence-based mutation analysis.

As a structure we used 2O4H, which was crystallized as a dimer (see the protocol for further information). For our analysis we consider only chain A.

<xr id="aspa_mut_overview"/> gives an overview of the ASPA structure with the mutations mapped onto it.


In <xr id="mutation_vis_table"/> we analyse each mutation separately with regard to its location in the structure. We also compare the Scwrl output with the PyMOL output that we created manually with the mutagenesis plugin. Where the residue is buried and space is restricted, there are hardly any differences between the PyMOL and the Scwrl output. Large differences are not to be expected anyway, since Scwrl only picks the best-fitting rotamer and inserts it into the structure without optimising the mutated structure any further. Only solvent-exposed residues on the protein surface have enough degrees of freedom for Scwrl to choose a different rotamer.
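The comparison in the table below is visual. To put a number on the difference between two models of the same mutant, one could compare the coordinates of the mutated residue directly. The following is a minimal sketch under stated assumptions, not the procedure used for this page: it assumes both models are standard fixed-width PDB files built on the same backbone (so no superposition is needed), and the function names `residue_coords` and `residue_rmsd` are hypothetical.

```python
import math


def residue_coords(pdb_text, chain, resnum):
    """Return {atom_name: (x, y, z)} for one residue from fixed-width PDB records."""
    coords = {}
    for line in pdb_text.splitlines():
        # Only coordinate records that are long enough to hold x, y and z.
        if not line.startswith(("ATOM", "HETATM")) or len(line) < 54:
            continue
        if line[21] == chain and int(line[22:26]) == resnum:
            name = line[12:16].strip()
            coords[name] = (
                float(line[30:38]),
                float(line[38:46]),
                float(line[46:54]),
            )
    return coords


def residue_rmsd(pdb_a, pdb_b, chain, resnum):
    """RMSD over the atoms that one residue has in common in both models.

    Assumption: both models share the same coordinate frame, so the
    coordinates are compared as they are, without any fitting step.
    """
    a = residue_coords(pdb_a, chain, resnum)
    b = residue_coords(pdb_b, chain, resnum)
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("the two models have no atoms in common for this residue")
    squared = sum(math.dist(a[name], b[name]) ** 2 for name in shared)
    return math.sqrt(squared / len(shared))
```

Called with the text of two model files and, for example, chain `"A"` and residue `305`, a value close to zero would correspond to "no difference between the PyMOL and the Scwrl output", while larger values would flag the solvent-exposed positions where the two tools pick different rotamers.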


<figtable id="mutation_vis_table">

<xr nolink id="mutation_vis_table"/> Description of the structural environment of each mutation and visualization of the pymol mutation and the scwrl mutation output.
Mutation Comment Pymol Scwrl
E285A E285 is located in the binding pocket at a distance of 3.6 Å from the substrate analog of 2O4H. It does not interact with the substrate, but forms hydrogen bonds to T118 and to the backbone of Y288. The mutant residue alanine cannot form these hydrogen bonds.
There are no differences between the PyMOL mutation and the Scwrl output.
<figure id="e285a_hb" >
<xr nolink id="e285a_hb"/>
</figure>
<figure id="e285a_scwrl" >
<xr nolink id="e285a_scwrl"/>
</figure>
A305E A305 is located at the end of the 13th beta sheet at the C-terminus of the protein. In <xr nolink id="a305e_crowded"/> the mutated residue glutamic acid is shown in red. The space at this position is rather crowded, so the small residue alanine fits very well. Glutamic acid, in contrast, hardly finds space and overlaps with neighboring residues.
There is a slight difference in the orientation of the carboxyl group between the PyMOL and the Scwrl output. Although this group is rotated in the Scwrl output, there are still steric clashes with neighboring residues.
<figure id="a305e_crowded">
<xr nolink id="a305e_crowded"/>
</figure>
<figure id="a305e_scwrl">
<xr nolink id="a305e_scwrl"/>
</figure>
G123E G123 is located at the beginning of the fourth beta strand of aspartoacylase. This strand is not buried but solvent accessible. In <xr nolink id="g123e_space"/>, the mutated residue E123 is shown in red. Even though glutamic acid is much larger than glycine, there is enough space and no clashes occur. Furthermore, glycine is not involved in any interactions.
The orientation of glutamic acid differs only slightly between the two outputs. This should not have a large influence, since there is enough distance to neighboring residues and no interactions with them are possible.
<figure id="g123e_space">
<xr nolink id="g123e_space"/>
</figure>
<figure id="g123e_scwrl">
<xr nolink id="g123e_scwrl"/>
</figure>
R71H R71 is located at the end of the fourth helix in the active site of the enzyme. It is involved in substrate binding via one Hbond and forms further Hbonds with an active water molecule and D68. In <xr nolink id="r71h_hbonds"/> the mutated residue H71 is shown in red and the intermediate substrate analog in blue. Unlike R71, H71 is not able to form Hbonds with the substrate or with neighboring residues.
PyMOL and Scwrl output the same conformation for histidine. In both cases the mutated residue cannot form the Hbonds that arginine could form.
<figure id="r71h_hbonds">
<xr nolink id="r71h_hbonds"/>
</figure>
<figure id="r71h_scwrl">
<xr nolink id="r71h_scwrl"/>
</figure>
R71K R71 is located at the end of the fourth helix in the active site of the enzyme. It is involved in substrate binding via one Hbond and forms further Hbonds with an active water molecule and D68. In <xr nolink id="r71k_hbonds"/> the mutated residue K71 is shown in red and the intermediate substrate analog in blue. K71 has almost the same shape as R71 and might also be able to form an Hbond to the substrate or D68.
There is no difference between the PyMOL and the Scwrl mutation result. Lysine cannot form Hbonds in this specific orientation, but they might still be possible in a different one.
<figure id="r71k_hbonds">
<xr nolink id="r71k_hbonds"/>
</figure>
<figure id="r71k_scwrl">
<xr nolink id="r71k_scwrl"/>
</figure>
K213E K213 is located on the loop connecting the N- and C-terminal regions of the enzyme. It is on the surface of the protein, far away from the binding site and the dimer interaction site. Note the change of colour (sorry, poor coordination within the team ;-)): in <xr nolink id="k213e_pymol"/>, the mutated residue E213 as computed by PyMOL is shown in blue and the reference structure and residue in green. The mutated residue glutamic acid is able to form an Hbond with the neighbouring helix. <xr nolink id="k213e_scwrl"/> shows the mutated residue as computed by Scwrl. <figure id="k213e_pymol">
<xr nolink id="k213e_pymol"/>
</figure>
<figure id="k213e_scwrl">
<xr nolink id="k213e_scwrl"/>
</figure>
V278M V278 is located on a loop in the C-terminal region of the enzyme. It is also on the surface of the protein, far away from the binding site and the dimer interaction site. In <xr nolink id="v278m_pymol"/>, the mutated residue M278 as computed by PyMOL is shown in red and the reference structure and residue in green. Both the mutated and the original residue are part of a beta-sheet and form two Hbonds each. <xr nolink id="v278m_scwrl"/> shows the mutated residue as computed by Scwrl. <figure id="v278m_pymol">
<xr nolink id="v278m_pymol"/>
</figure>
<figure id="v278m_scwrl">
<xr nolink id="v278m_scwrl"/>
</figure>
M82T M82 is located on a loop in the N-terminal region of the enzyme. It is on the surface of the protein, not close to the binding site, but in the neighbourhood of the dimer interaction site. In <xr nolink id="m82t_pymol"/>, the mutated residue T82 as computed by PyMOL is shown (colours as before). This residue does not form any Hbonds. <xr nolink id="m82t_scwrl"/> shows the mutated residue as computed by Scwrl. <figure id="m82t_pymol">
<xr nolink id="m82t_pymol"/>
</figure>
<figure id="m82t_scwrl">
<xr nolink id="m82t_scwrl"/>
</figure>
E235K This residue is again located on a loop in the C-terminal region of the enzyme. It also lies on the surface of the protein, in close neighbourhood to the dimer interaction site. In <xr nolink id="e235k_pymol"/>, the mutated residue E235K as computed by PyMol is presented (colours as before). This residue points out into the solvent and does not form any intra-molecular HBonds. <xr nolink id="e235k_scwrl"/> shows the mutated residue as computed by scwrl. <figure id="e235k_pymol">
<xr nolink id="e235k_pymol"/>
</figure>
<figure id="e235k_scwrl">
<xr nolink id="e235k_scwrl"/>
</figure>
I270T This residue is located on a beta-strand in the C-terminal region of the protein, where it forms Hbonds to a neighbouring beta-strand. It lies on the surface of the protein, again close to the dimer interaction site. In <xr nolink id="i270t_pymol"/>, the mutated residue T270 as computed by PyMOL is shown (colours as before). <xr nolink id="i270t_scwrl"/> shows the mutated residue as computed by Scwrl. <figure id="i270t_pymol">
<xr nolink id="i270t_pymol"/>
</figure>
<figure id="i270t_scwrl">
<xr nolink id="i270t_scwrl"/>
</figure>

</figtable>

Comparing SCWRL4 and FoldX mutants

In general, FoldX and Scwrl choose the same rotamers. Differences occur only for solvent-exposed mutant residues. Looking beyond the mutant residue, however, one finds that FoldX also optimizes the conformation of neighboring residues. In one extreme case, FoldX chooses a rotamer for a neighboring residue that is rotated by 180° relative to the original position (compare C218 in mutation A305E). This is due to steric hindrance by the mutated residue glutamic acid, which is significantly larger than the original residue alanine.


<figtable id="mutation_vis_table">

<xr nolink id="mutation_vis_table"/> Comparison of the scwrl and foldx outputs as well as comparison of the minimized structures for foldx and scwrl respectively.
Mutation Comment Superposition FoldX minimised
Detailed description of the differences between the Scwrl, FoldX and minimized structures for each mutation, based on the pictures shown in the right columns. Comparison of the Scwrl (blue) and FoldX (red) outputs relative to the original structure of aspartoacylase 2O4H shown in green. Comparison of the FoldX (red) and minimized FoldX (pale red) outputs relative to the original structure of aspartoacylase 2O4H shown in green.
A305E There is only a slight difference in the orientation of the carboxyl group of the mutated residue E305. In both cases E305 is able to form a hydrogen bond to Gln138, an interaction that A305 could not form. The most important difference, however, is that FoldX changed the orientation of C218. This prevents the clash between E305 and C218 that occurs in the Scwrl output.
After the minimisation of the FoldX structure, C218 has moved slightly further away from E305 due to steric hindrance.
<figure id="a305e_compare_scwrl_fold">
A305e compare scwrl fold.png
</figure>
A305e foldx min.png
E235K FoldX and Scwrl chose different rotamers for K235. This position is solvent exposed on the surface and no interaction partners are within reach. Lysine is positively charged and therefore suitable for a position on the protein surface, where it can interact with water molecules.
After the minimisation of the FoldX structure, the orientation of lysine changed only slightly.
E235k compare scwrl fold.png
E235k foldx min.png
E285A Alanine has no rotamers, so the FoldX and Scwrl outputs do not differ for this residue. For surrounding residues (Y288, Q184, ...), however, FoldX again generates different rotamers.
There are no differences between the original foldx structure and the minimized one.
E285a compare scwrl fold.png
E285a foldx min.png
G123E G123 had no interactions with neighboring residues or with the solvent. E123 has the same rotamer in the foldx and scwrl model. In the case of foldx, the residue Y153 is moved slightly, so that E123 can form an Hbond with its hydroxyl group. The substitution of E for G is furthermore suitable for the solvent exposed position for interaction with water molecules.
There are no differences between the original foldx structure and the minimized one.
G123e compare scwrl fold.png
G123e foldx min.png
I270T Both methods chose the same rotamer for threonine, which closely matches the position of the native isoleucine residue. Neither isoleucine nor threonine can interact with neighboring residues; the nearest residue is P232 at a distance of 3.9 Å.
There are no differences between the original foldx structure and the minimized one.
I270t compare scwrl fold.png
I270t foldx min.png
K213E FoldX and Scwrl chose different rotamers for this solvent-exposed residue. No neighboring residues are close enough to interact with. Like lysine, however, the mutant residue glutamic acid is able to form Hbonds with water molecules of the solvent.
There are no differences between the original foldx structure and the minimized one.
K213e compare scwrl fold.png
K213e foldx min.png
M82T Both methods found the same rotamer, which has no interactions with other residues. M82 likewise has no interactions with neighboring residues. This position is solvent exposed. M82 is hydrophobic and does not interact with the solvent; the mutant residue threonine carries a polar hydroxyl group and could in principle form Hbonds with water.
There are no differences between the original foldx structure and the minimized one.
M82t compare scwrl fold.png
M82t foldx min.png
R71H The histidine rotamer that Scwrl chose is able to form an Hbond with D68, whereas the slightly differently oriented FoldX rotamer is not. The native residue R71, in contrast, can form three interactions: with D68, with the substrate and with one coordinated water molecule of the binding site.

In the FoldX model, on the other hand, the neighboring residue K291 moves closer to H71. Because of the reduced distance, K291 and H71 might be able to form hydrophobic interactions.

There are no differences between the original FoldX structure and the minimized one.
R71h compare scwrl fold.png
R71h foldx min.png
R71K Scwrl and FoldX chose different rotamers, neither of which forms any Hbonds. This differs from the Scwrl result for the R71H mutant, which could form at least one Hbond. The result is surprising, since lysine is far more similar to arginine than histidine is, and we expected arginine and lysine to allow similar interactions.
There are no differences between the original foldx structure and the minimized one.
R71k compare scwrl fold.png
R71k foldx min.png
V278M Scwrl and FoldX chose methionine rotamers that point in opposite directions. FoldX chose the rotamer oriented towards the interior of the protein, Scwrl the one oriented towards the solvent. In this case the FoldX rotamer seems more reasonable, since methionine is a fairly hydrophobic residue.
After the minimization of the FoldX structure, the orientation of the neighboring residue K297 changed slightly.
V278m compare scwrl fold.png
V278m foldx min.png

</figtable>

Foldx Energies

In the following you can see the energies that Foldx calculated for the Wildtype protein.

BackHbond       =               -173.03
SideHbond       =               -55.19
Energy_VdW      =               -363.71
Electro         =               -16.40
Energy_SolvP    =               478.93
Energy_SolvH    =               -482.34
Energy_vdwclash =               39.18
energy_torsion  =               28.46
backbone_vdwclash=              176.45
Entropy_sidec   =               182.61
Entropy_mainc   =               461.61
water bonds     =               -2.66
helix dipole    =               -1.74
loop_entropy    =               0.00
cis_bond        =               4.69 
disulfide       =               0.00
kn electrostatic=               0.00
partial covalent interactions = -12.90
Energy_Ionisation =             1.17
Entropy Complex =               0.00
-----------------------------------------------------------
Total           =               88.70
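As a quick plausibility check, the individual terms can be added up and compared with the reported total. The sketch below copies the values from the listing above; the observation that the total only matches when `backbone_vdwclash` is left out is read off these numbers and is marked as an assumption in the code. The function name `foldx_total` is hypothetical.

```python
# FoldX energy terms for the wild type, copied from the listing above.
terms = {
    "BackHbond": -173.03,
    "SideHbond": -55.19,
    "Energy_VdW": -363.71,
    "Electro": -16.40,
    "Energy_SolvP": 478.93,
    "Energy_SolvH": -482.34,
    "Energy_vdwclash": 39.18,
    "energy_torsion": 28.46,
    "backbone_vdwclash": 176.45,
    "Entropy_sidec": 182.61,
    "Entropy_mainc": 461.61,
    "water bonds": -2.66,
    "helix dipole": -1.74,
    "loop_entropy": 0.00,
    "cis_bond": 4.69,
    "disulfide": 0.00,
    "kn electrostatic": 0.00,
    "partial covalent interactions": -12.90,
    "Energy_Ionisation": 1.17,
    "Entropy Complex": 0.00,
}

# Assumption: backbone_vdwclash is reported for information only and is
# not part of the total; the numbers above only add up under this reading.
NOT_IN_TOTAL = frozenset({"backbone_vdwclash"})


def foldx_total(terms, excluded=NOT_IN_TOTAL):
    """Sum all energy terms except the excluded ones."""
    return sum(value for name, value in terms.items() if name not in excluded)


total = foldx_total(terms)
total_with_backbone = foldx_total(terms, excluded=frozenset())
print(f"sum without backbone_vdwclash: {total:.2f}")
print(f"sum with backbone_vdwclash:    {total_with_backbone:.2f}")
```

The sum without `backbone_vdwclash` agrees with the reported total of 88.70 up to the rounding of the individual terms, whereas the sum over all listed terms does not.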

Further minimization of Scwrl and FoldX structures

In general, the FoldX structures had slightly better energy scores after the first minimization round:

  • average energy Scwrl structures (1x min.): -7002 (without A305E mutant)
  • average energy Foldx structures (1x min.): -7365


For all structures except one, the energies were higher after five rounds of minimization:

  • average energy Scwrl structures (5x min.): -6910
  • average energy Foldx structures (5x min.): -7193

In <xr id="plot_minimise"/> the distribution of the energy scores is visualized. The energy differences between the mutations are clearly only minor. Every mutant structure could be minimized to an energy comparable to that of the wild-type structure. Hence, one cannot infer the severity of a mutation from these small differences in energy.

<figtable id="min_eval">

<figure id="compare_a305e_min">
<xr nolink id="compare_a305e_min"/>
The crystal structure 2O4H is shown in green. The Foldx structure after first round of minimization is shown in red. The Scwrl mutant structure after the first round of minimization is shown in dark blue and after the fifth round in cyan. C218 in the Foldx structure has already a different conformation compared to the original conformation. For Scwrl, only in the five times minimized structure does this residue have an orientation that does not lead to steric clashes with E305.
</figure>
<figure id="plot_minimise">
<xr nolink id="plot_minimise"/>
Distribution of the energy scores for the mutant structures created with Foldx and Scwrl and minimized for one and five rounds. The y-axis was scaled so that the Scwrl energy for A305E after one minimisation step is not visible.
</figure>

</figtable>

<figtable id="min_values">

<xr nolink id="min_values"/> This table lists the energy values calculated after the minimization steps for the Scwrl, FoldX and WT structures. Diff is the energy after one minimization round minus the energy after five rounds.
             E285A    A305E    G123E    R71H     R71K     K213E    V278M    M82T     E235K    I270T    WT
Scwrl Min    -6993    -3605    -6979    -7017    -7079    -6939.5  -7002    -6982    -7033.5  -6995
Foldx Min    -7330    -7273    -7363    -7404    -7456    -7308    -7364    -7369    -7412    -7373
Scwrl 5xMin  -6893    -6884    -6884.5  -6929    -6993    -6892    -6909    -6885    -6933    -6896
Foldx 5xMin  -7189    -7169    -7172    -7211    -7260    -7149    -7176.5  -7191    -7225    -7187.5
WT Min                                                                                                 -7379
WT 5xMin                                                                                               -7040
Diff Scwrl   -100     3279     -95      -88      -86      -47.5    -93      -97      -101     -99
Diff Foldx   -141     -105     -190     -192     -196     -159     -188     -178     -186.5   -186

</figtable>
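The averages quoted above and the Diff rows can be recomputed from the table. The following sketch copies the values from the table; the Scwrl value for M82T after one round is read as -6982, which is the value the Diff row implies. The helper names `average` and `diff` are hypothetical.

```python
from statistics import mean

mutations = ["E285A", "A305E", "G123E", "R71H", "R71K",
             "K213E", "V278M", "M82T", "E235K", "I270T"]

# Energies after one ("Min") and five ("5xMin") minimization rounds.
energies = {
    "Scwrl Min":   [-6993, -3605, -6979, -7017, -7079, -6939.5, -7002, -6982, -7033.5, -6995],
    "Foldx Min":   [-7330, -7273, -7363, -7404, -7456, -7308, -7364, -7369, -7412, -7373],
    "Scwrl 5xMin": [-6893, -6884, -6884.5, -6929, -6993, -6892, -6909, -6885, -6933, -6896],
    "Foldx 5xMin": [-7189, -7169, -7172, -7211, -7260, -7149, -7176.5, -7191, -7225, -7187.5],
}


def average(row, skip=()):
    """Mean energy of one table row, optionally leaving out some mutations."""
    return mean(e for m, e in zip(mutations, energies[row]) if m not in skip)


def diff(method):
    """Energy after one round minus energy after five rounds, per mutation."""
    one = energies[f"{method} Min"]
    five = energies[f"{method} 5xMin"]
    return {m: a - b for m, a, b in zip(mutations, one, five)}


print(round(average("Scwrl Min", skip={"A305E"})))  # A305E excluded as an outlier
print(round(average("Foldx Min")))
print(round(average("Scwrl 5xMin")))
print(round(average("Foldx 5xMin")))
print(diff("Scwrl"))
```

A negative difference means that the energy went up with further minimization; the Scwrl model of A305E is the only entry with a positive difference.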


There was one interesting case, namely the Scwrl A305E mutant. After the first minimization round, this structure had an energy score of only -3605. After five times minimizing it had a score of -6884. So in this case, the energy dropped after further minimization. In comparison, the Foldx structure for this mutant had an energy of -7273 after one minimization round.

We investigated the structures and found that FoldX chose a different conformer for C218, which neighbors the mutant residue E305 (as already mentioned in <xr id="mutation_vis_table"/>), to avoid steric clashes. Scwrl did not do that. After five minimization rounds, however, C218 and other neighboring residues had moved as well and the steric clash was resolved. This might be why the Scwrl structure has a lower energy after five minimization rounds than after only one (compare <xr id="compare_a305e_min"/>).

Yet many other residues were moved during the minimization as well, and investigating all changes that occurred is beyond the scope of this analysis.

Gromacs

Scwrl and FoldX produced very similar results in our case, so we did not favour one over the other as input for Gromacs and simply chose the Scwrl models. fetchPDB is a script that uses wget to download a PDB file for a given identifier.
For repairPDB, we used the option -jprot to extract the protein only. Other options include removing hydrogens or waters, renumbering the sequence, extracting the DNA, selecting individual chains etc.
We called Scwrl4 again to make sure there are no chain breaks, and then proceeded with the instructions given in the task description. For more detailed information on our calls, please check the protocol.


The .mdp File

  • (cpp = C preprocessor)
  • define = -DFLEXIBLE -DPOSRES: -DFLEXIBLE switches on flexible water in the topology, -DPOSRES includes position restraints
  • implicit_solvent = GBSA: implicit solvent via the Generalised Born formalism
  • integrator = steep: use the steepest descent algorithm for energy minimisation
  • emtol = 1.0: energy minimisation tolerance; the minimisation has converged when the maximum force is smaller than this value
  • nsteps: maximum number of steps
  • nstenergy = 1: frequency (in steps) at which energies are written to the energy file
  • energygrps = System: group(s) for which energies are written to the energy file
  • ns_type = grid: neighbour searching via a grid
  • coulombtype, rcoulomb, rvdw: treatment of electrostatics and the cut-off distances for Coulomb and van der Waals interactions
  • constraints = none: no constraints apart from those specified in the topology file, i.e. bonds are described by a harmonic or Morse potential
  • pbc = no: no periodic boundary conditions
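The options above can be collected and written out in the `key = value` layout of an .mdp file. The sketch below includes only the parameters whose values are stated in the list; nsteps, coulombtype, rcoulomb and rvdw are left out because their values are not given here. The function name `render_mdp` is hypothetical.

```python
# Settings taken from the list above. Parameters without a stated value
# (nsteps, coulombtype, rcoulomb, rvdw) are deliberately not included.
settings = {
    "define": "-DFLEXIBLE -DPOSRES",
    "implicit_solvent": "GBSA",
    "integrator": "steep",
    "emtol": "1.0",
    "nstenergy": "1",
    "energygrps": "System",
    "ns_type": "grid",
    "constraints": "none",
    "pbc": "no",
}


def render_mdp(settings):
    """Format the settings as aligned 'key = value' lines, one per option."""
    width = max(len(key) for key in settings)
    lines = [f"{key:<{width}} = {value}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"


mdp_text = render_mdp(settings)
print(mdp_text)
```

Keeping the options in a dictionary makes it easy to produce one file per mutant with identical settings; the text would still have to be completed with the omitted parameters before it can be used.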


Gromacs Results

The minimisation took between 212 and 363 steps until it converged.


<figtable id="gromacs_energies"> <xr nolink id="gromacs_energies"/> Gromacs bond, angle and potential changes for the ten analysed mutations. Mutations colored in red are disease causing.

Mutant                    Wildtype   A305E      E285A      E235K      R71H       R71K       K213E      V278M      M82T       G123E      I270T
Bond                      784.267    1552.99    993.458    992.481    1205.1     1196.9     1076.31    948.572    1195.51    1204.47    956.843
Angle                     2891.77    2812.05    2762.23    2768.67    2850.82    2777.43    2790.79    2774.97    2760.1     2793.93    2772.99
Potential                 -32496.8   -19478.2   -31389.9   -31528.5   -30720     -30872     -32140.8   -31939.8   -31075.1   -31741.8   -32043
Potential (difference)               (Δ13018)   (Δ1107)    (Δ968)     (Δ1776)    (Δ1624)    (Δ356)     (Δ557)     (Δ1421)    (Δ755)     (Δ453)

</figtable>


<xr id="gromacs_energies"/> shows that the average bond energy is much lower for the wildtype than for any of the mutants.

The same, however, does not hold true for the angle energies, which are much closer to each other.

There is a definite change in potential for every mutation, as can be seen in the table above. However, the values vary greatly, and it is therefore difficult to generalise or to define a cutoff. Additionally, some disease-causing mutations show a smaller potential difference than some that are not disease-causing.
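The potential differences can be recomputed from the potentials in the table and ranked, which makes the spread of the values easier to see. This is a minimal sketch with the values copied from the table; the function name `potential_differences` is hypothetical.

```python
# Potential energies after minimisation, copied from the table above.
potential = {
    "Wildtype": -32496.8,
    "A305E": -19478.2,
    "E285A": -31389.9,
    "E235K": -31528.5,
    "R71H": -30720.0,
    "R71K": -30872.0,
    "K213E": -32140.8,
    "V278M": -31939.8,
    "M82T": -31075.1,
    "G123E": -31741.8,
    "I270T": -32043.0,
}


def potential_differences(potential, reference="Wildtype"):
    """Potential of each mutant minus the potential of the reference."""
    ref = potential[reference]
    return {name: value - ref
            for name, value in potential.items() if name != reference}


diffs = potential_differences(potential)

# Print the mutants from the largest to the smallest difference.
for name, delta in sorted(diffs.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:6s} {delta:9.1f}")
```

In this ranking A305E stands far apart from all other mutants, while the remaining differences lie between a few hundred and less than two thousand, with no obvious gap at which a cutoff could be placed.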

Discussion

In general we find it very hard to draw conclusions from the scores alone. As the Gromacs and minimisation results show, there is no correlation between deleterious mutants and the energy of the structure. We do not find this surprising, though. A deleterious mutant might well have a good structure and still be non-functional: functional implications do not necessarily go along with structural defects.

Only major deviations in the energy of the structure, as in the case of the Scwrl mutant structure for A305E, hint that something is definitely wrong.