Canavan Disease: Task 09 - Structure-based Mutation Analysis

From Bioinformatikpedia

Structure-based mutation analysis examines the effects of mutations on the protein structure. This is done via the calculation of energetic changes.


Several PDB structures are available for aspartoacylase. For structure-based mutation analysis, the chosen structure should meet several criteria: it should have a high resolution with a small R-factor, its pH value should be near the physiological pH, and it should contain no gaps.
No structure was entirely free of gaps, but the missing residues were located only at the beginning and end of the structure. Only one PDB structure provided further information such as the pH value. Therefore 2O4H was chosen: the human brain aspartoacylase complex with intermediate analog (N- 2 phosphonomethyl-L-aspartate), at a resolution of 2.7 Å. The structure was determined from a single crystal by X-ray diffraction in October 2006 at a temperature of 100 K and a pH of 6.0. Its free R value is 0.271. The missing residues are positions 1-9 at the beginning and 311-313 at the end of the protein.
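The gap criterion can be checked mechanically. The following is a minimal sketch, assuming the residue numbers observed in a structure are already available as a list. The function names `missing_ranges` and `only_terminal_gaps` are illustrative, and the chain length of 313 is inferred from the last missing position given above.

```python
def missing_ranges(observed, length):
    """Return (start, end) ranges of positions 1..length that are not observed."""
    observed = set(observed)
    gaps, start = [], None
    # Run one position past the end so a gap reaching the C-terminus is closed.
    for pos in range(1, length + 2):
        missing = pos <= length and pos not in observed
        if missing and start is None:
            start = pos
        elif not missing and start is not None:
            gaps.append((start, pos - 1))
            start = None
    return gaps


def only_terminal_gaps(gaps, length):
    """True if every gap touches the first or the last residue of the chain."""
    return all(start == 1 or end == length for start, end in gaps)


# Residues 10-310 are resolved in 2O4H according to the text above.
gaps_2o4h = missing_ranges(range(10, 311), length=313)
print(gaps_2o4h, only_terminal_gaps(gaps_2o4h, 313))
```

A structure with an internal gap would fail the second check and could be excluded before comparing resolution and R-factor.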


Some mutations from the previous Task were chosen for this analysis. They are shown in <xr id="visu"></xr>:

<figtable id="visu">

Visualization of Mutations of Interest
| Mutation | Description | Visualization | Sec. Struc. | Information |
|---|---|---|---|---|
| Asn121Asp | From the previous Task it is known that Asn->Ile is disease causing. Here a detailed look at this specific mutation is of interest. | CanvanT9 Mutation Asn121Asp.png | LOOP | possibly disease causing |
| His21Pro | This mutation is definitely disease causing, since the residue is important for binding the zinc ion. The position is part of the active center and therefore of special interest. | CanvanT9 Mutation His21Pro.png | LOOP | metal binding, disease causing |
| Pro149Ala | Since Proline is known to be a HELIX breaker and this mutation is very near to a HELIX, the Proline might be necessary for the structure. | CanvanT9 Mutation Pro149Ala.png | near HELIX | |
| Pro257Arg | According to Task 08 the mutation is expected to have no effect on the protein. | CanvanT9 Mutation Pro257Arg.png | | |
| Thr166Ile | This position is known to be part of the binding region. It is listed as disease causing and lies exactly at the transition between LOOP and HELIX. | CanvanT9 Mutation Thr166Ile.png | near HELIX | in binding region, disease causing |
Visualization of some mutations in 2O4H (orange). The mutated amino acid is colored blue.
Information about functional residues was taken from UniProt.



To compare wild type and mutant amino acids, two approaches were used: a comparison of H-bonds and a prediction of structural changes using SCWRL and foldX.
Comparing wild type and mutant structures, no potential clashes were found for any of the mutations, and no holes were introduced into the structure. The mutations should not strongly change hydrophilicity in the core or hydrophobicity on the surface. Only Thr->Ile, which is part of the core region, changes markedly from a more hydrophilic to a more hydrophobic amino acid. A visual comparison can be found in <xr id="engy"></xr>.
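The hydrophobicity argument can be made quantitative with a hydropathy scale. The sketch below uses the Kyte-Doolittle index, which is an assumption made here for illustration: the analysis above does not state which scale, if any, was used. The helper name `hydropathy_change` is likewise illustrative.

```python
import re

# Kyte-Doolittle hydropathy index (assumed scale; higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "Ala": 1.8, "Arg": -4.5, "Asn": -3.5, "Asp": -3.5, "Cys": 2.5,
    "Gln": -3.5, "Glu": -3.5, "Gly": -0.4, "His": -3.2, "Ile": 4.5,
    "Leu": 3.8, "Lys": -3.9, "Met": 1.9, "Phe": 2.8, "Pro": -1.6,
    "Ser": -0.8, "Thr": -0.7, "Trp": -0.9, "Tyr": -1.3, "Val": 4.2,
}

MUTATIONS = ["Asn121Asp", "His21Pro", "Pro149Ala", "Pro257Arg", "Thr166Ile"]


def hydropathy_change(mutation):
    """Hydropathy of the mutant residue minus that of the wild type residue."""
    match = re.fullmatch(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})", mutation)
    if match is None:
        raise ValueError(f"unexpected mutation format: {mutation}")
    wild_type, _position, mutant = match.groups()
    return KYTE_DOOLITTLE[mutant] - KYTE_DOOLITTLE[wild_type]


for name in MUTATIONS:
    print(f"{name}: {hydropathy_change(name):+.1f}")
```

On this scale Thr->Ile gives the largest shift of the five mutations, in line with the observation above. The scale ignores whether a residue is buried or exposed, so it complements rather than replaces the structural inspection.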

<figtable id="engy">

Comparisons on a Visual Basis
| Mutation | H-bonds | SCWRL and foldX |
|---|---|---|
| Asn121Asp | Canavan Mutation Asn121Asp hbond.png | Canavan Scwrl FoldX Asn121Asp.png |
| His21Pro | Canavan Mutation His21Pro hbond.png | CanavanScwrl-foldx His21Pro.png |
| Pro149Ala | Canavan Mutation Pro149Ala hbond.png | Canavan Scwrl FoldX Pro149Ala.png |
| Pro257Arg | Canavan Mutation Pro257Arg hbond.png | Canavan Scwrl FoldX Pro257Arg.png |
| Thr166Ile | Canavan Mutation Thr166Ile hbond.png | Canavan Scwrl FoldX Thr166Ile.png |
For the visualization of H-bonds, the original sequence is colored green and the mutated sequence blue.
Additionally there is the comparison of the predicted structure changes by SCWRL (red) and foldX (cyan).


As can be seen in <xr id="engy"></xr>, mutating the side chains leads in some approaches (mutation via PyMOL (blue), or building a new structure with SCWRL (red) or foldX (cyan)) to the loss of one potential hydrogen bond. However, as the residue is not located within a secondary structure element that relies on stabilization via H-bonds, the impact is not expected to be severe. Unfortunately, the approaches do not consider whether the site is a functional residue.
The observable differences in side chain conformation between PyMOL, SCWRL and foldX are of limited significance, as only one state of the side chain dynamics is displayed at a time. The natural movement of the sites is not taken into account (compare Asn->Asp).
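The loss of a potential hydrogen bond can also be detected by comparing polar contacts between two models. The following sketch uses a crude distance-only criterion (N/O atoms of different residues within 3.5 Å), which is an assumption and not necessarily the criterion applied by PyMOL. The coordinates and the partner residue are toy values and are not taken from 2O4H.

```python
import math
from itertools import combinations

POLAR_ELEMENTS = {"N", "O"}


def polar_contacts(atoms, cutoff=3.5):
    """Pairs of N/O atoms from different residues within `cutoff` Angstrom.

    Each atom is a tuple (residue_number, atom_name, element, (x, y, z)).
    """
    contacts = set()
    for (res_a, name_a, elem_a, xyz_a), (res_b, name_b, elem_b, xyz_b) in combinations(atoms, 2):
        if res_a == res_b:
            continue  # ignore contacts within the same residue
        if elem_a in POLAR_ELEMENTS and elem_b in POLAR_ELEMENTS:
            if math.dist(xyz_a, xyz_b) <= cutoff:
                contacts.add(frozenset({(res_a, name_a), (res_b, name_b)}))
    return contacts


# Toy example: a hydroxyl oxygen is replaced by a carbon at the same place.
wild_type = [(166, "OG1", "O", (0.0, 0.0, 0.0)), (162, "O", "O", (2.8, 0.0, 0.0))]
mutant = [(166, "CG1", "C", (0.0, 0.0, 0.0)), (162, "O", "O", (2.8, 0.0, 0.0))]

lost = polar_contacts(wild_type) - polar_contacts(mutant)
print(lost)
```

Because only a single side chain conformation per model enters the comparison, the result carries the same limitation as the visual inspection described above.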

Energy Comparison

<figtable id="energy_comp">

Energy Comparison
| Mutation | SCWRL MT | SCWRL Corr. | foldX MT | foldX Corr. |
|---|---|---|---|---|
| Wild type | 400.00 | | 144.46 | |
| Asn121Asp | 403.94 | 1.01 | 145.46 | 1.01 |
| His21Pro | 395.75 | 0.99 | 145.85 | 1.01 |
| Pro149Ala | 401.93 | 1.00 | 148.04 | 1.02 |
| Pro257Arg | 414.53 | 1.04 | 149.63 | 1.04 |
| Thr166Ile | 432.25 | 1.08 | 147.35 | 1.02 |

Energy comparison of SCWRL and foldX. The correlation (Corr.) was calculated as the mutation energy divided by the wild type energy.
Therefore any larger deviation from 1.0 is predicted to have an effect on the protein. Thus, only Thr->Ile would show an effect, and only in the SCWRL approach.
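The ratio in the Corr. columns can be reproduced directly from the tabulated energies. This is a minimal sketch: the tolerance of 0.05 is a hypothetical cut-off chosen here so that only the clearest deviation is flagged, since no explicit threshold is stated above. The function names are illustrative.

```python
# Energies as listed in the table (SCWRL, foldX).
ENERGIES = {
    "Wild type": {"SCWRL": 400.00, "foldX": 144.46},
    "Asn121Asp": {"SCWRL": 403.94, "foldX": 145.46},
    "His21Pro": {"SCWRL": 395.75, "foldX": 145.85},
    "Pro149Ala": {"SCWRL": 401.93, "foldX": 148.04},  # position as in the list of mutations of interest
    "Pro257Arg": {"SCWRL": 414.53, "foldX": 149.63},
    "Thr166Ile": {"SCWRL": 432.25, "foldX": 147.35},
}


def energy_ratios(energies, reference="Wild type"):
    """Mutant energy divided by the reference energy, per mutation and tool."""
    ref = energies[reference]
    return {
        name: {tool: value / ref[tool] for tool, value in per_tool.items()}
        for name, per_tool in energies.items()
        if name != reference
    }


def flagged(ratios, tolerance=0.05):
    """(mutation, tool) pairs whose ratio deviates from 1.0 by more than tolerance."""
    return sorted(
        (name, tool)
        for name, per_tool in ratios.items()
        for tool, ratio in per_tool.items()
        if abs(ratio - 1.0) > tolerance  # hypothetical cut-off, see lead-in
    )


ratios = energy_ratios(ENERGIES)
for name, per_tool in ratios.items():
    print(name, {tool: round(r, 2) for tool, r in per_tool.items()})
print(flagged(ratios))
```

With this cut-off only Thr166Ile in the SCWRL column is flagged. A ratio of this kind is only meaningful within one tool, as SCWRL and foldX energies are on different scales.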


<xr id="energy_comp"></xr> shows that only the mutation from Threonine to Isoleucine at position 166 seems to have an effect, owing to its higher value in SCWRL. foldX would not predict an effect for any of the mutations. The effect on protein function that is introduced in ASPA by changing Histidine to Proline at position 21 cannot be captured by the energy calculations, as they do not take into account that the metal binding site is disrupted without the Histidine. The same holds for Threonine to Isoleucine, which is part of a substrate binding region; this may be the reason why it is not detected by foldX. The change from Asparagine to Aspartic acid at position 121, the only mutation for which it is uncertain whether it is disease causing, was likewise not predicted to have an effect by the structure-based mutation analysis.
The complete result list from foldX can be found in the Supplement at the end of this Task.


To minimize the energy of the models, the program minimise was used. Unfortunately, SCWRL was not able to produce any output. Therefore <xr id="minim"></xr> shows only the energies of foldX over five iterations, where each iteration uses the output of the previous minimization as input. The detailed values can be found in the Supplement at the end of this Task.

<figure id="minim">

Development of the calculated energies from foldX. The minimization for each mutation was an iterative process: each iteration uses the output of the previous minimization as input.
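The iterative scheme described above, where each round starts from the previous output, can be sketched as a simple loop. The callable `minimise_step` is a hypothetical stand-in for one run of the minimization program. The toy step and its energy values are invented purely to make the example runnable and do not reproduce the measured energies.

```python
def iterate_minimization(model, minimise_step, n_iterations=5):
    """Run `minimise_step` repeatedly, feeding each output model into the next run.

    `minimise_step` takes a model and returns (minimized_model, energy).
    Returns the energies per iteration and the 1-based iteration with the lowest energy.
    """
    energies = []
    for _ in range(n_iterations):
        model, energy = minimise_step(model)
        energies.append(energy)
    best_iteration = min(range(len(energies)), key=energies.__getitem__) + 1
    return energies, best_iteration


# Toy stand-in: the "model" is just a counter and the energies are invented.
TOY_PROFILE = [10.0, 8.0, 9.0, 9.5, 10.5]


def toy_step(model):
    return model + 1, TOY_PROFILE[model]


energies, best = iterate_minimization(0, toy_step)
print(energies, best)
```

Tracking the energy of every iteration, instead of keeping only the final model, makes it possible to pick the iteration with the lowest energy afterwards.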


As can be seen in <xr id="minim"></xr>, the second iteration yields the optimal energy for the structure, with further iterations raising the level. In addition, when comparing all mutated structures, the known disease causing mutations (His21Pro and Thr166Ile) show the highest energy levels. According to the energy comparison, Asn121Asp is predicted to have no effect. In the minimization, however, this mutation is on the same level as His21Pro and might therefore be disease causing. This agrees with the initial assumption, but whether the mutation is actually disease causing remains unclear.