Difference between revisions of "Structure-based mutation analysis (Phenylketonuria)"

From Bioinformatikpedia
(SCWRL)
(Energy comparisons)
 
(107 intermediate revisions by 2 users not shown)
Line 1: Line 1:
'''Page still under construction!!!'''
 
 
== Summary ==
 
== Summary ==
In [https://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Sequence-based_mutation_analysis_%28Phenylketonuria%29 Task 8] the sequence of PAH was used for finding mutational effects, now the structure will be taken for these analysis. But how to find out, if a mutation changes the structure? Therefore, one calculates the energy of all atoms for the wildtype and the mutated structure and compares the results for changes. There are two different methods for this calculations given: Quantum Mechanics (QM) and Molecular Mechanics (MM). In QM the energy of all electrons in a protein is calculated. It is one of the most accurated methods, but it is very time consuming. In MM the energy of a system is calculated as a function of nuclear positions. It is very fast and easy to calculate, but it ignores electronic motions and is not as accurate as QM. Since QM is too time intensive and the results of MM are nearly as good as the ones calculated with QM, we use MM for the further analysis. Molecular Mechanics uses force fields for the energy calculation, which is defined as a sum of terms. The terms are non-bonded (electrostatic and Van-der-Waals) and bonded (Bond stretching, Angle stretching, bond rotation) interactions. For the structure based mutation analysis the SCWRL and FoldX webserver were used.
+
In [https://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Sequence-based_mutation_analysis_%28Phenylketonuria%29 Task 8] the sequence of PAH was used to find mutational effects; now the structure is used for this analysis. But how can one find out whether a mutation changes the structure? One calculates the energy of all atoms for the wildtype and the mutated structure and compares the results. Two different methods are available for these calculations: Quantum Mechanics (QM) and Molecular Mechanics (MM). In QM the energy of all electrons in a protein is calculated. It is one of the most accurate methods, but it is very time-consuming. In MM the energy of a system is calculated as a function of the nuclear positions. It is very fast and easy to calculate, but it ignores electronic motions and is not as accurate as QM. Since QM is too time-intensive and the results of MM are nearly as good as the ones calculated with QM, we use MM for the further analysis. Molecular Mechanics uses force fields for the energy calculation, in which the energy is defined as a sum of terms. The terms are non-bonded (electrostatic and Van der Waals) and bonded (bond stretching, angle bending, bond rotation) interactions. <ref name="molmech"> Andrew R. Leach (2001): "[http://www.fis.unam.mx/~ramon/CursoDF/Material%20Didactico/LIBROS/Molecular%20Modelling.%20Principles%20and%20Applications%20%282nd%20Edition%29%20by%20Andrew%20R.%20Leach.pdf Molecular Modelling: Principles and Applications (Second Edition)]". Prentice Hall, ISBN: 9780582382107. </ref> For the structure-based mutation analysis the tools SCWRL and FoldX were used.
   
 
== Structure selection ==
 
== Structure selection ==
Line 59: Line 58:
 
| [http://www.pdb.org/pdb/explore/explore.do?pdbId=6PAH 6PAH] || X-ray || 2.15 || 0.17 || 6.80 || - || A || [http://www.uniprot.org/blast/?about=P00439%5B117-424%5D 117-424] || 68.14
 
| [http://www.pdb.org/pdb/explore/explore.do?pdbId=6PAH 6PAH] || X-ray || 2.15 || 0.17 || 6.80 || - || A || [http://www.uniprot.org/blast/?about=P00439%5B117-424%5D 117-424] || 68.14
 
|}
 
|}
  +
<center><small>'''<caption>''' Comparison of all pdb structures of P00439 in the used method, Resolution (in Å), R-factor, pH, included Gaps and chain/s, positions in P00439 (PAH) and the coverage to this sequence. </caption></small></center>
<small><center>'''<caption>''' ...</caption></center></small>
 
 
</figtable>
 
</figtable>
All proteins were found with the X-ray diffraction method. In <xr id="pro-struc"/> we can see, that the structure of [http://www.rcsb.org/pdb/explore/explore.do?structureId=1J8U 1J8U] has a better resolution value as well as R-factor than the other structures. Although 2PAH has a better pH-value, a higher coverage and even two domains, however, the structure includes one gap. For this reason as well as the better R-factor and higher resolution value, we have chosen the structure of 1J8U (no gaps) for further analysis. Moreover, the structure includes the second highest coverage and also a very good pH-value.
+
None of the structures covers the whole PAH protein (coverage below 100%), and all were determined by X-ray diffraction. In <xr id="pro-struc"/> we can see that the structure of [http://www.rcsb.org/pdb/explore/explore.do?structureId=1J8U 1J8U] has a better resolution and a better R-factor than the other structures. [http://www.rcsb.org/pdb/explore/explore.do?structureId=2PAH 2PAH] has a better pH value, a higher coverage and even two chains, but the structure includes one gap. For this reason, as well as because of its better R-factor and resolution, we have chosen the structure of [http://www.rcsb.org/pdb/explore/explore.do?structureId=1J8U 1J8U] for further analysis. [http://www.rcsb.org/pdb/explore/explore.do?structureId=1J8U 1J8U] is the catalytic domain of human phenylalanine hydroxylase Fe(II) and does not contain any gaps. Moreover, the structure has the second highest coverage and also a very good pH value.
   
The structure of 1J8U as well as its ligands are shown in the <xr id="1j8u"/> below. The binding sites which belong to the ligands are shown in <xr id="bindingsite"/>.
+
The 3D structure of [http://www.rcsb.org/pdb/explore/explore.do?structureId=1J8U 1J8U] together with its ligands is shown in <xr id="1j8u"/> below. The binding site of the ligand FE(II) consists of the residues His285, His290 and Glu330, and that of the ligand H4B consists of Val245, Gly247, Leu249 and Ser251. Both were taken from the PDB entry of [http://www.rcsb.org/pdb/explore/remediatedSequence.do?structureId=1J8U 1J8U]. These binding sites are shown in detail in <xr id="bindingsite"/>.
   
  +
<small>
 
{|align=center
 
{|align=center
|<figure id="1j8u"><small>[[File:1J8U.png‎|thumb|400px|'''<caption>''' 3D structure of 1J8U (green) in cartoon style with its two ligands H4B - C<sub>9</sub>H<sub>15</sub>N<sub>5</sub>O<sub>3</sub> (blue) and FE(II) (grey). </caption>]]</small></figure>
+
|<figure id="1j8u">[[File:1J8U.png‎|thumb|400px|'''<caption>''' 3D-structure of [http://www.rcsb.org/pdb/explore/explore.do?structureId=1J8U 1J8U] (green) in cartoon style with its two ligands H4B - C<sub>9</sub>H<sub>15</sub>N<sub>5</sub>O<sub>3</sub> (blue sticks) and FE(II) (grey sphere). The binding site of FE(II) is shown in orange and the one of H4B in pink. Both are illustrated in sticks with surrounding surfaces. </caption>]]</figure>
|<figure id="bindingsite"><small>[[File:1J8U_Ligands_bindingsite.png|thumb|400px|'''<caption>'''Structure of 1J8U (green) in cartoon style with zoom to the ligands H4B - C<sub>9</sub>H<sub>15</sub>N<sub>5</sub>O<sub>3</sub> (blue) with corresponding binding site (red) and FE(II) (grey) with corresponding binding site (orange). The binding sites are shown in sticks and their belonging surface structure. </caption>]]</small></figure>
+
|<figure id="bindingsite">[[File:1J8U_Ligands_bindingsite.png|thumb|400px|'''<caption>''' 3D structure of [http://www.rcsb.org/pdb/explore/explore.do?structureId=1J8U 1J8U] (green) in cartoon style, zoomed in on the ligands H4B - C<sub>9</sub>H<sub>15</sub>N<sub>5</sub>O<sub>3</sub> (blue) with its binding site (pink) and FE(II) (grey) with its binding site (orange). The binding sites are shown as sticks with their surrounding surfaces. </caption>]]</figure>
 
|}
 
|}
  +
</small>
   
 
== Visualisation of used mutations ==
 
== Visualisation of used mutations ==
Line 90: Line 91:
 
|}
 
|}
   
  +
To convert the residues affected by a mutation, we used the Mutagenesis tool in PyMOL. A brief overview of the mutations is given in the following subsections together with the associated figures.
   
 
===Gln172His===
 
===Gln172His===
<figure id="Q172H">
+
<figure id="Q172H"><small>
[[File:Mut_gln172his.png|thumb|right|'''<caption>''' Mutation of glutamine (yellow) to histidine (purple) at position 172 of 1J8U (green).</caption>]]
+
[[File:Mut_gln172his.png|thumb|300px|right|'''<caption>''' Mutation of glutamine (yellow) to histidine (purple) with their polar contacts located at position 172 of 1J8U (green). The ligands and binding sites are shown in the same colors and representations as given in Figure 1 and 2. </caption>]]
</figure>
+
</small></figure>
  +
This mutation is located in a coiled region far away from both binding sites (see <xr id="Q172H"/>), so one would not expect a large effect on the protein. This supports our assumption from last week's [https://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Sequence-based_mutation_analysis_%28Phenylketonuria%29 task] that this mutation is neutral, although the mutated residue retains only two of the three polar contacts of the wildtype residue.
...
 
 
<br clear=all>
 
<br clear=all>
   
 
===Ala259Val===
 
===Ala259Val===
<figure id="A259V">
+
<figure id="A259V"><small>
[[File:A259V.png|thumb|right|'''<caption>''' Mutation of alanine (yellow) to valine (purple) at position 172 of 1J8U (green).</caption>]]
+
[[File:Mut_ala259val.png|thumb|300px|right|'''<caption>''' Mutation of alanine (yellow) to valine (purple) with their polar contacts located at position 259 of 1J8U (green). The ligands and binding sites are shown in the same colors and representations as given in Figure 1 and 2. </caption>]]</small>
 
</figure>
 
</figure>
  +
As one can see in <xr id="A259V"/>, this mutation lies at a moderate distance from both binding sites. In contrast to [https://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Sequence-based_mutation_analysis_%28Phenylketonuria%29 Task 8], the mutation here lies exactly on a helix and not directly beneath one. The reason for this change is the different protein structure used in the two tasks (2PAH in Task 8 and 1J8U here). Even though it is not very close to either binding site and both polar contacts of the wildtype residue remain in the mutant, we assume that this mutation is disease-causing.
...
 
 
<br clear=all>
 
<br clear=all>
   
 
===Thr266Ala===
 
===Thr266Ala===
<figure id="T266A">
+
<figure id="T266A"><small>
[[File:A259V.png|thumb|right|'''<caption>''' Mutation of threonine (yellow) to alanine (purple) at position 172 of 1J8U (green).</caption>]]
+
[[File:Mut_thr266ala.png|thumb|300px|right|'''<caption>''' Mutation of threonine (yellow) to alanine (purple) with their polar contacts located at position 266 of 1J8U (green). The ligands and binding sites are shown in the same colors and representations as given in Figure 1 and 2. </caption>]]</small>
 
</figure>
 
</figure>
  +
In <xr id="T266A"/> we can see that this residue is not located directly at the binding sites, but lies between them. In last week's task Thr266Ala was located at the end of a helix, but here it is found in a coiled region. Because of its proximity to the binding sites, we would expect this mutation to be disease-causing. As with Gln172His, the mutated residue exhibits only two of the three polar contacts of the wildtype residue.
...
 
 
<br clear=all>
 
<br clear=all>
   
 
===Phe392Ser===
 
===Phe392Ser===
<figure id="F392S">
+
<figure id="F392S"><small>
[[File:A259V.png|thumb|right|'''<caption>''' Mutation of phenylalanine (yellow) to serine (purple) at position 172 of 1J8U (green).</caption>]]
+
[[File:Mut_phe392ser.png|thumb|300px|right|'''<caption>''' Mutation of phenylalanine (yellow) to serine (purple) with their polar contacts located at position 392 of 1J8U (green). The ligands and binding sites are shown in the same colors and representations as given in Figure 1 and 2. </caption>]]</small>
 
</figure>
 
</figure>
  +
Since this mutation lies far away from both binding sites (<xr id="F392S"/>), we would expect only a small effect on the protein. Nevertheless, we think this mutation is disease-causing, as it is located at the beginning of an alpha helix and now has two more polar contacts than the wildtype residue. Moreover, the two residues have very different structures, and last week's prediction tools were unanimous that this mutation affects the protein.
...
 
 
<br clear=all>
 
<br clear=all>
   
 
===Pro416Gln===
 
===Pro416Gln===
<figure id="P416Q">
+
<figure id="P416Q"><small>
[[File:A259V.png|thumb|right|'''<caption>''' Mutation of proline (yellow) to glutamine (purple) at position 172 of 1J8U (green).</caption>]]
+
[[File:Mut_pro416gln.png|thumb|300px|right|'''<caption>''' Mutation of proline (yellow) to glutamine (purple) with their polar contacts located at position 416 of 1J8U (green). The ligands and binding sites are shown in the same colors and representations as given in Figure 1 and 2. </caption>]]</small>
 
</figure>
 
</figure>
  +
In <xr id="P416Q"/> we can see that this mutation, like Phe392Ser, is far away from the binding sites. So we would again not expect a large effect on the protein. As before, last week's tools predict this mutation as non-neutral. Furthermore, the wildtype proline at position 416 has no polar contacts to any other atoms (excluding solvent), whereas the mutated residue has one. We therefore think that this mutation is disease-causing.
...
 
 
<br clear=all>
 
<br clear=all>
   
 
== Mutated structure creation ==
 
== Mutated structure creation ==
=== SCWRL ===
+
=== SCWRL4 ===
  +
SCWRL4 (Side-chain Conformation Prediction With Rotamer Library) predicts protein side-chain conformations. To do so, it uses a backbone-dependent rotamer library. The tool is based on graph theory, easy to use, accurate and very fast. The output includes a 3D structure of the prediction. <ref name="scwrl4"> Georgii G. Krivov, Maxim V. Shapovalov and Roland L. Dunbrack Jr. (2009): "[http://dunbrack.fccc.edu/scwrl4/SCWRL4Paper.pdf Improved prediction of protein side-chain conformations with SCWRL4]". Proteins Vol.77(4):778-95. [http://en.wikipedia.org/wiki/Digital_object_identifier doi]:[http://www.ncbi.nlm.nih.gov/pubmed/19603484 10.1002/prot.22488]</ref> An online [http://www1.jcsg.org/prod/scripts/scwrl/serve.cgi SCWRL Server] is also available.
  +
  +
Since we have already compared the wildtype and the mutated residues in the previous section, we now only compare the mutation modelled with SCWRL and the mutation introduced into the wildtype structure with the Mutagenesis tool of PyMOL (called the wildtype mutation below). The SCWRL structure is colored purple, whereas the wildtype mutation is colored green.
  +
<center><small>
  +
<gallery widths=250px heights=150px perrow=2>
  +
File:Scwrl_Q172H.png|'''Q172H:''' Only a slight twist can be seen between the histidine of the wildtype mutation and that of the SCWRL structure. In the SCWRL mutation one of the polar contacts starts at an H-atom, which the wildtype mutation does not include; there the contact starts at the neighbouring atom.
  +
File:Scwrl_A259V.png|'''A259V:''' In the SCWRL mutation the arms of the side chain are rotated by approximately 90 degrees. As before, one of the polar contacts starts at an H-atom in the SCWRL structure and at the neighbouring atom in the wildtype mutation.
  +
File:Scwrl_T266A.png|'''T266A:''' In this figure both mutations are at the same position, but the one generated with SCWRL again includes H-atoms. We make the same observation as before: the polar contacts start at the H-atom in the SCWRL mutation and, in the wildtype mutation, at the atom to which the hydrogen is bonded.
  +
File:Scwrl_F392S.png|'''F392S:''' Here the serine in the SCWRL structure is twisted relative to the one in the wildtype mutation. In addition, two of the four polar contacts of the wildtype mutation are lost in the SCWRL mutation, so the contacts resemble those of the unmutated residue more closely.
  +
File:Scwrl_P416Q.png|'''P416Q:''' Here we see a bigger conformational change between the two mutations. One part of the side chain has been rotated by nearly 180 degrees. This could be the reason why the wildtype mutation has a polar contact that the SCWRL mutation does not include. However, the SCWRL glutamine forms a new one instead.
  +
</gallery>
  +
</small></center>
  +
The main difference between the SCWRL structure and the wildtype mutation is that SCWRL includes H-atoms, which the other structure does not have. This is why the polar contacts sometimes look different although they belong to the same position: where the H-atom is missing, the contact starts at the neighbouring atom.
  +
  +
=== FoldX ===
  +
FoldX is an empirical force field that provides a fast and accurate estimate of the free energy changes caused by mutations (e.g. the effect of SNPs) on protein stability<ref name="foldef"> J. Schymkowitz, J. Borg, F. Stricher, R. Nys, F. Rousseau and L. Serrano (2005): "[http://nar.oxfordjournals.org/content/33/suppl_2/W382.full The FoldX web server: an online force field]". Nucleic Acids Research Vol.33: W382–W388. [http://en.wikipedia.org/wiki/Digital_object_identifier doi]:[http://nar.oxfordjournals.org/content/33/suppl_2/W382.full 10.1093/nar/gki387]</ref><sup>,</sup><ref name="foldx"> Raphael Guerois, Jens Erik Nielsen and Luis Serrano (2002): "[ftp://ftp.bork.embl.de/users/lercher/Ka/Guerois_Serrano2002JMB_FOLDEF.pdf Predicting Changes in the Stability of Proteins and Protein Complexes: A Study of More Than 1000 Mutations]". J. Mol. Biol. Vol.320: 369–387. [http://en.wikipedia.org/wiki/Digital_object_identifier doi]:[http://dx.doi.org/10.1016/S0022-2836(02)00442-4 10.1016/S0022-2836(02)00442-4]</ref>. FoldX provides different run modes, but normally it takes a PDB file and calculates several energy terms (e.g. backbone H-bond, Van der Waals, water bridge, etc.). The output lists all calculated energy terms, such as those for our mutations shown in <xr id="foldx"/>.
  +
  +
<figtable id="foldx">
  +
{|border="1" cellspacing="0" cellpadding="5" align="center" style="text-align:center"
  +
|-
  +
! style="background:#32CD32;" | Type
  +
! style="background:#32CD32;" | Total energy
  +
! style="background:#32CD32;" | Backbone Hbond
  +
! style="background:#32CD32;" | Sidechain Hbond
  +
! style="background:#32CD32;" | Van der Waals
  +
! style="background:#32CD32;" | Electrostatics
  +
! style="background:#32CD32;" | Solvation Polar
  +
! style="background:#32CD32;" | Solvation Hydrophobic
  +
! style="background:#32CD32;" | Van der Waals clashes
  +
! style="background:#32CD32;" | Entropy sidechain
  +
! style="background:#32CD32;" | Entropy mainchain
  +
! style="background:#32CD32;" | Sloop entropy
  +
! style="background:#32CD32;" | Mloop entropy
  +
! style="background:#32CD32;" | Cis bond
  +
! style="background:#32CD32;" | Torsional clash
  +
! style="background:#32CD32;" | Backbone clash
  +
! style="background:#32CD32;" | Helix dipole
  +
! style="background:#32CD32;" | Water bridge
  +
! style="background:#32CD32;" | Disulfide
  +
! style="background:#32CD32;" | Electrostatic kon
  +
! style="background:#32CD32;" | Partial covalent bonds
  +
! style="background:#32CD32;" | Energy Ionisation
  +
! style="background:#32CD32;" | Entropy Complex
  +
|-
  +
| '''WT''' || 14.00 || -196.05 || -55.77 || -379.28 || -19.47 || 492.46 || -495.68 || 34.69 || 194.42 || 454.16 || 0.00 || 0.00 || 0.00 || 11.69 || 227.66 || -15.13 || -13.50 || 0.00 || 0.00 || 0.00 || 1.45 || 0.00
  +
|-
  +
| '''Q172H''' || 13.84 || -196.19 || -56.45 || -378.50 || -19.95 || 492.48 || -495.07 || 33.39 || 193.60 || 453.85 || 0.00 || 0.00 || 0.00 || 11.68 || 227.36 || -14.78 || -11.79 || 0.00 || 0.00 || 0.00 || 1.55 || 0.00
  +
  +
|-
  +
| '''A259V''' || 14.08 || -198.14 || -59.00 || -379.98 || -19.12 || 495.02 || -497.16 || 34.93 || 194.60 || 454.17 || 0.00 || 0.00 || 0.00 || 13.19 || 227.82 || -15.57 || -10.30 || 0.00 || 0.00 || 0.00 || 1.45 || 0.00
  +
|-
  +
| '''T266A''' || 15.08 || -196.56 || -56.37 || -378.45 || -19.86 || 493.23 || -494.94 || 32.68 || 193.42 || 453.13 || 0.00 || 0.00 || 0.00 || 11.94 || 227.53 || -15.13 || -9.46 || 0.00 || 0.00 || 0.00 || 1.45 || 0.00
  +
  +
|-
  +
| '''F392S''' || 24.59 || -195.99 || -57.60 || -377.49 || -19.29 || 495.41 || -492.28 || 34.80 || 193.85 || 453.89 || 0.00 || 0.00 || 0.00 || 11.71 || 227.42 || -15.39 || -8.47 || 0.00 || 0.00 || 0.00 || 1.45 || 0.00
  +
|-
  +
| '''P416Q''' || 21.20 || -196.09 || -57.10 || -379.58 || -19.53 || 496.13 || -495.90 || 34.25 || 195.06 || 454.65 || 0.00 || 0.00 || 0.00 || 11.44 || 228.58 || -15.11 || -8.47 || 0.00 || 0.00 || 0.00 || 1.45 || 0.00
  +
|}
  +
<small>'''<caption>''' All energy components given in the FoldX output for the wildtype and the five mutations. </caption></small>
  +
</figtable>
  +
The energies of the wildtype and the five mutations are very similar. Only the total energies of the mutations F392S and P416Q are noticeably higher than that of the wildtype. Based on these values, we would predict that these two mutations are non-neutral and the other ones are neutral.
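The comparison of total energies can be made explicit by subtracting the wildtype value from each mutant value. The following is a minimal sketch that uses only the "Total energy" column of the FoldX table; the names `FOLDX_TOTAL` and `energy_differences` are hypothetical and not part of FoldX.

```python
# Total energies copied from the "Total energy" column of the FoldX table.
FOLDX_TOTAL = {
    "WT": 14.00,
    "Q172H": 13.84,
    "A259V": 14.08,
    "T266A": 15.08,
    "F392S": 24.59,
    "P416Q": 21.20,
}


def energy_differences(totals, reference="WT"):
    """Return mutant minus reference total energy for every mutant."""
    ref = totals[reference]
    return {
        name: round(value - ref, 2)
        for name, value in totals.items()
        if name != reference
    }


if __name__ == "__main__":
    # Print the mutants sorted by how much their total energy rises.
    diffs = energy_differences(FOLDX_TOTAL)
    for name, diff in sorted(diffs.items(), key=lambda item: item[1]):
        print(f"{name}: {diff:+.2f}")
```

Sorted this way, F392S and P416Q stand out with differences of several energy units, while the other three mutants stay close to the wildtype.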
  +
  +
=== Comparison ===
  +
Now we want to compare the results of SCWRL and FoldX. To do so, we loaded the PDB structures into PyMOL and looked for differences between the two results. There was only one difference between the 3D structures of the two programs, which can be seen in <xr id="comparison_all"/> and <xr id="comparison_part"/>. <small>
  +
{|align=center
  +
|<figure id="comparison_all">[[File:SCWRL_FoldX_all.png‎|thumb|300px|'''<caption>''' Comparison of the structures predicted with SCWRL (purple) and FoldX (teal). The only change is one beta strand. </caption>]]</figure>
  +
|<figure id="comparison_part">[[File:SCWRL_FoldX_part.png|thumb|300px|'''<caption>'''Comparison of the structures predicted with SCWRL (purple) and FoldX (teal). Zoom into the region of the only changed beta strand. </caption>]]</figure>
  +
|}</small>
  +
We also compared the two structures with the wildtype 1J8U structure. There are some small changes, but the interesting part is the difference between SCWRL and FoldX: FoldX keeps the beta strand at its position in the wildtype structure, whereas in the SCWRL structure it has changed. Since this is only a small difference between the structures, we do not think that it has a large consequence for the protein.
  +
  +
It is also interesting to see which differences the SCWRL (purple) and FoldX (teal) outputs show for each mutation. Hence, we analyse every mutation on its own. We also added polar contacts to the mutations; these can overlap and are therefore sometimes not visible for both tools.
  +
  +
<center><small>
  +
<gallery widths=200px heights=150px perrow=2>
  +
File:SCWRL_FoldX_Q172H.png|'''Q172H:''' Comparison of the mutation Gln172His generated with SCWRL (purple) and FoldX (teal) in stick form. The only difference is a slight change in the angle of the two residues. Furthermore, the SCWRL mutation includes H-atoms, which the FoldX mutation does not. This is why one of the polar contacts starts at the H-atom in SCWRL and at the neighbouring atom in FoldX. Both mutations have three polar contacts to any atoms excluding solvent.
  +
File:SCWRL_FoldX_A259V.png|'''A259V:''' Comparison of the mutation Ala259Val generated with SCWRL (purple) and FoldX (teal) in stick form. Here we see the same as for Q172H: a minimal rotation of the structures and H-atoms included in SCWRL. Both mutations have three polar contacts, but one of them starts at the H-atom in the SCWRL mutation and at the atom bonded to the hydrogen in the FoldX structure.
  +
File:SCWRL_FoldX_T266A.png|'''T266A:''' Comparison of the mutation Thr266Ala generated with SCWRL (purple) and FoldX (teal) in stick form. The structures mostly overlap. The only differences are the H-atoms included in the SCWRL structure, which again change the polar contacts, and one additional contact in the FoldX mutation.
  +
File:SCWRL_FoldX_F392S.png|'''F392S:''' Comparison of the mutation Phe392Ser generated with SCWRL (purple) and FoldX (teal) in stick form. There are hardly any differences between the structures of the mutations, except that SCWRL includes two H-atoms. FoldX has one more polar contact than the mutation generated with SCWRL. The two other contacts are exactly the same.
  +
File:SCWRL_FoldX_P416Q.png|'''P416Q:''' Comparison of the mutation Pro416Gln generated with SCWRL (purple) and FoldX (teal) in stick form. This is the only mutation where the arm of the mutated residue has been rotated by nearly 180 degrees. The FoldX mutation includes two polar contacts that the SCWRL mutation does not have, but the SCWRL mutation also has one additional contact.
  +
</gallery></small>
  +
</center>
  +
  +
All in all, one can see that SCWRL includes hydrogen atoms in its structures, which FoldX does not. Furthermore, all mutations show a slight twist. On the whole, however, there are only small differences between the two tools.
  +
  +
== Energy comparisons ==
  +
In this part the energies of the wildtype and the mutations are compared for both SCWRL and FoldX. <xr id="scwrl"/> contains all results as well as the predictions made on the basis of those energies.
 
<figtable id="scwrl">
 
<figtable id="scwrl">
 
{|border="1" cellspacing="0" cellpadding="5" align="center" style="text-align:center"
 
{|border="1" cellspacing="0" cellpadding="5" align="center" style="text-align:center"
 
|-
 
|-
 
! colspan="4" style="background:#32CD32;" | SCWRL results
 
! colspan="4" style="background:#32CD32;" | SCWRL results
  +
! colspan="3" style="background:#32CD32;" | FoldX results
 
|-
 
|-
 
! style="background:#90EE90;" | Type
 
! style="background:#90EE90;" | Type
  +
! style="background:#90EE90;" | Energy
  +
! style="background:#90EE90;" | Energy Mutation / <br> Energy Wildtype
  +
! style="background:#90EE90;" | Prediction
 
! style="background:#90EE90;" | Energy
 
! style="background:#90EE90;" | Energy
 
! style="background:#90EE90;" | Energy Mutation / <br> Energy Wildtype
 
! style="background:#90EE90;" | Energy Mutation / <br> Energy Wildtype
 
! style="background:#90EE90;" | Prediction
 
! style="background:#90EE90;" | Prediction
 
|-
 
|-
| '''WT''' || 164.210 || 1.00 || -
+
| '''WT''' || 164.210 || 1.00 || - || 14.00 || 1.00 || -
 
|-
 
|-
| '''Q172H''' || 169.699 || 1.03 || x
+
| '''Q172H''' || 169.699 || 1.03 || neutral || 13.84 || 0.99 || neutral
 
|-
 
|-
| '''A259V''' || 197.235 || 1.20 || x
+
| '''A259V''' || 197.235 || 1.20 || non-neutral || 14.08 || 1.01 || neutral
 
|-
 
|-
| '''T266A''' || 167.116 || 1.02 || x
+
| '''T266A''' || 167.116 || 1.02 || neutral || 15.08 || 1.08 || neutral
 
|-
 
|-
| '''F392S''' || 171.409 || 1.04 || x
+
| '''F392S''' || 171.409 || 1.04 || non-neutral || 24.59 || 1.76 || non-neutral
 
|-
 
|-
| '''P416Q''' || 169.007 || 1.03 || x
+
| '''P416Q''' || 169.007 || 1.03 || neutral || 21.20 || 1.51 || non-neutral
 
|}
 
|}
<small>'''<caption>''' Comparison of the SCWRL results between the wildtype and the mutant structures. In the first column the type (mutation or wildtype) is given, then the resulting total minimal energy of the graph from the SCWRL results. In the third column this energy is divided through the wildtype resulting energy, to check the difference between this two types, and in the last column the prediction of the SCWRL results is represented. </caption></small>
+
<center><small>'''<caption>''' Comparison of the SCWRL and FoldX energy results between the wildtype and the mutant structures. </caption></small></center>
 
</figtable>
 
</figtable>
  +
For the mutation '''Q172H''' we cannot say with certainty whether it is neutral or non-neutral. However, since the sequence-based prediction in last week's task gave the same result as obtained here, we assume that this mutation really is neutral. The SCWRL prediction for '''A259V''' very clearly indicates a disease-causing mutation, although we would not expect this from the FoldX result. Nevertheless, we know from the HGMD website that this mutation really is non-neutral. The mutation '''T266A''' is actually disease-causing, but with both tools we would predict a neutral mutation. The situation is different for '''F392S''', which we would predict as non-neutral with both tools. However, we are not sure whether the SCWRL prediction is meaningful enough. The last mutation, '''P416Q''', is also non-neutral, but this time SCWRL did not predict a disease-causing mutation whereas FoldX did. Predicting disease-causing mutations with either tool is very difficult, since there is no clear cutoff at which a mutation should be called non-neutral rather than neutral, and the results of the two tools are not very consistent.
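The ratio column of the table and the cutoff problem mentioned above can be illustrated with a short sketch. The energies are taken from the table; the function names and the two cutoff values are assumptions made for this illustration only. The cutoffs were chosen after the fact so that they reproduce the predictions listed in the table and are not recommended by either tool.

```python
# Energies copied from the energy comparison table.
SCWRL = {"WT": 164.210, "Q172H": 169.699, "A259V": 197.235,
         "T266A": 167.116, "F392S": 171.409, "P416Q": 169.007}
FOLDX = {"WT": 14.00, "Q172H": 13.84, "A259V": 14.08,
         "T266A": 15.08, "F392S": 24.59, "P416Q": 21.20}

# Hypothetical cutoffs on |ratio - 1|, back-fitted to the table above.
SCWRL_CUTOFF = 0.04
FOLDX_CUTOFF = 0.10


def energy_ratios(energies, reference="WT"):
    """Divide every mutant energy by the reference (wildtype) energy."""
    ref = energies[reference]
    return {name: e / ref for name, e in energies.items() if name != reference}


def classify(ratio, cutoff):
    """Call a mutation non-neutral if its ratio deviates from 1 by more than cutoff."""
    return "non-neutral" if abs(ratio - 1.0) > cutoff else "neutral"


if __name__ == "__main__":
    scwrl = energy_ratios(SCWRL)
    foldx = energy_ratios(FOLDX)
    for name in scwrl:
        print(name,
              f"{scwrl[name]:.2f}", classify(scwrl[name], SCWRL_CUTOFF),
              f"{foldx[name]:.2f}", classify(foldx[name], FOLDX_CUTOFF))
```

Two different cutoffs are needed to reproduce both prediction columns: with the SCWRL cutoff applied to the FoldX ratios, T266A would be called non-neutral. This is one way to see why the choice of cutoff is so uncertain.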
...
 
 
=== FoldX ===
 
...
 
 
== Energy comparisons ==
 
...
 
   
 
== Minimise ==
 
== Minimise ==
In the table below, the energy for all five runs of the minisation are given. Since the SCWRL output could not be minimised, we only can see the difference between the wildtype (WT) and the five mutation structures constructed with foldX.
+
The Minimise tool is used to minimise the energy of a model. In the table below, the energies for all five runs of Minimise are given. Since the SCWRL output could not be minimised, we can only see the difference between the wildtype (WT) and the five mutated structures constructed with FoldX. The results are shown in <xr id="minimise"/>.
 
<figtable id="minimise">
 
<figtable id="minimise">
 
{|border="1" cellspacing="0" cellpadding="5" align="center" style="text-align:center"
 
{|border="1" cellspacing="0" cellpadding="5" align="center" style="text-align:center"
 
|-
 
|-
! colspan="6" style="background:#32CD32;" | minimisation run
+
! colspan="6" style="background:#32CD32;" | minimise run
 
|-
 
|-
 
! style="background:#90EE90;" | Type
 
! style="background:#90EE90;" | Type
Line 186: Line 272:
 
| '''P416Q''' || -7556.57 || -7542.79 || -7299.39 || -7151.21 || -7040.37
 
| '''P416Q''' || -7556.57 || -7542.79 || -7299.39 || -7151.21 || -7040.37
 
|}
 
|}
  +
<center><small>'''<caption>''' Comparison of the energies calculated in the five minimise runs of the wildtype and the five mutations generated with FoldX. The minimise runs did not function with the SCWRL outputs! </caption></small></center>
<small><center>'''<caption>''' ...</caption></center></small>
 
 
</figtable>
 
</figtable>
The energies of the wildtype and the mutated structures is very similar and is per run increasing slightly. Only for the structures of the wildtype and the mutation F392S has the second run a small decreased value.
+
The energies of the wildtype and the mutated structures are very similar and increase slightly with each run. Only for the wildtype and the mutation F392S does the second run show a slightly decreased value.
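Whether the energy rises from run to run can be checked with a few lines of code. The sketch below uses only the P416Q row, which is the only row of the Minimise table visible in this revision comparison; the function names are hypothetical.

```python
# Energies of the five Minimise runs for P416Q, copied from the table.
P416Q_RUNS = [-7556.57, -7542.79, -7299.39, -7151.21, -7040.37]


def run_to_run_changes(energies):
    """Return the energy change between each pair of consecutive runs."""
    return [round(b - a, 2) for a, b in zip(energies, energies[1:])]


def increases_every_run(energies):
    """True if every run has a higher energy than the run before it."""
    return all(change > 0 for change in run_to_run_changes(energies))


if __name__ == "__main__":
    print(run_to_run_changes(P416Q_RUNS))
    print(increases_every_run(P416Q_RUNS))
```

For a row such as the wildtype or F392S, where the second run is described as slightly lower, `increases_every_run` would return `False`.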
   
 
== References ==
 
== References ==

Latest revision as of 09:18, 27 August 2013

Summary

In Task 8 the sequence of PAH was used to find mutational effects; now the structure is used for this analysis. But how can one find out whether a mutation changes the structure? One calculates the energy of all atoms for the wildtype and the mutated structure and compares the results. Two different methods are available for these calculations: Quantum Mechanics (QM) and Molecular Mechanics (MM). In QM the energy of all electrons in a protein is calculated. It is one of the most accurate methods, but it is very time-consuming. In MM the energy of a system is calculated as a function of the nuclear positions. It is very fast and easy to calculate, but it ignores electronic motions and is not as accurate as QM. Since QM is too time-intensive and the results of MM are nearly as good as the ones calculated with QM, we use MM for the further analysis. Molecular Mechanics uses force fields for the energy calculation, in which the energy is defined as a sum of terms. The terms are non-bonded (electrostatic and Van der Waals) and bonded (bond stretching, angle bending, bond rotation) interactions. <ref name="molmech"> Andrew R. Leach (2001): "Molecular Modelling: Principles and Applications (Second Edition)". Prentice Hall, ISBN: 9780582382107. </ref> For the structure-based mutation analysis the tools SCWRL and FoldX were used.
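The "sum of terms" idea behind a force field can be illustrated with a small sketch. The functional forms below (harmonic bond and angle terms, a cosine torsion term, a Lennard-Jones and a Coulomb term) are common textbook choices and are used here as an assumption; they are not the actual energy functions of SCWRL or FoldX. All function names, parameters and the unit system are hypothetical.

```python
import math

# Assumed unit system: kcal/mol, Angstrom, elementary charges.
COULOMB_CONSTANT = 332.06


def bond_stretching(r, r0, k):
    """Harmonic penalty for a bond of length r deviating from r0."""
    return 0.5 * k * (r - r0) ** 2


def angle_bending(theta, theta0, k):
    """Harmonic penalty for a bond angle (radians) deviating from theta0."""
    return 0.5 * k * (theta - theta0) ** 2


def bond_rotation(phi, barrier, periodicity, phase=0.0):
    """Periodic torsion term for rotation about a bond (angles in radians)."""
    return 0.5 * barrier * (1.0 + math.cos(periodicity * phi - phase))


def van_der_waals(r, epsilon, sigma):
    """Lennard-Jones 12-6 potential between two non-bonded atoms."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def electrostatic(r, q1, q2):
    """Coulomb interaction between two point charges at distance r."""
    return COULOMB_CONSTANT * q1 * q2 / r


def total_energy(bonds, angles, torsions, pairs):
    """Sum the bonded and non-bonded terms; each argument is a list of tuples."""
    bonded = (
        sum(bond_stretching(*b) for b in bonds)
        + sum(angle_bending(*a) for a in angles)
        + sum(bond_rotation(*t) for t in torsions)
    )
    non_bonded = sum(
        van_der_waals(r, eps, sig) + electrostatic(r, q1, q2)
        for r, eps, sig, q1, q2 in pairs
    )
    return bonded + non_bonded
```

Because the energy depends only on atomic positions and fixed parameters, it is cheap to evaluate, which is the speed advantage of MM over QM described above.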

Structure selection

Lab journal

In some earlier tasks we used the protein structure 2PAH as reference, but now we have to check additional criteria for the structure selection:

  • Structure with the highest resolution (small Å value),
  • smallest R-factor,
  • highest coverage,
  • pH-value ideally near physiological pH of 7.4 and
  • no gaps (missing residues) included in the structure, so a consecutive numbering of residues should be given.

To decide which protein structure to use for the further analysis, we compared these criteria for all structures listed in the PAH (P00439) UniProt entry. <figtable id="pro-struc">

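{| class="wikitable"
! Protein !! Method !! Resolution (Å) !! R-factor !! pH !! Gaps !! Chain !! Positions !! Coverage (%)
|-
| 1DMW || X-ray || 2.00 || 0.20 || 6.80 || - || A || 118-424 || 67.92
|-
| 1J8T || X-ray || 1.70 || 0.20 || 6.80 || - || A || 103-427 || 71.90
|-
| 1J8U || X-ray || 1.50 || 0.16 || 6.80 || - || A || 103-427 || 71.90
|-
| 1KW0 || X-ray || 2.50 || 0.22 || 6.80 || - || A || 103-427 || 71.90
|-
| 1LRM || X-ray || 2.10 || 0.21 || 6.80 || - || A || 103-427 || 71.90
|-
| 1MMK || X-ray || 2.00 || 0.20 || 6.80 || - || A || 103-427 || 71.90
|-
| 1MMT || X-ray || 2.00 || 0.21 || 6.80 || - || A || 103-427 || 71.90
|-
| 1PAH || X-ray || 2.00 || 0.18 || 6.80 || - || A || 117-424 || 68.14
|-
| 1TDW || X-ray || 2.10 || 0.21 || 6.80 || - || A || 117-424 || 68.14
|-
| 1TG2 || X-ray || 2.20 || 0.21 || 6.80 || - || A || 117-424 || 68.14
|-
| 2PAH || X-ray || 3.10 || 0.25 || 7.00 || 136LEU-143ASP || A/B || 118-452 || 74.12
|-
| 3PAH || X-ray || 2.00 || 0.18 || 6.80 || - || A || 117-424 || 68.14
|-
| 1ANP || X-ray || 2.11 || 0.20 || 6.80 || - || A || 104-427 || 71.68
|-
| 4PAH || X-ray || 2.00 || 0.17 || 6.80 || - || A || 117-424 || 68.14
|-
| 5PAH || X-ray || 2.10 || 0.16 || 6.80 || - || A || 117-424 || 68.14
|-
| 6PAH || X-ray || 2.15 || 0.17 || 6.80 || - || A || 117-424 || 68.14
|}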
Comparison of all PDB structures of P00439 by experimental method, resolution (in Å), R-factor, pH, gaps, chain(s), covered positions in P00439 (PAH) and coverage of this sequence (in %).

</figtable> None of the structures covers the whole PAH protein (coverage of 100%), and all were determined by X-ray diffraction. In <xr id="pro-struc"/> we can see that 1J8U has the best resolution (1.50 Å) and, together with 5PAH, the lowest R-factor (0.16). Although 2PAH has a pH closer to the physiological value, a higher coverage and even two chains, it contains one gap. For this reason, and because of its lower R-factor and better resolution, we chose 1J8U for the further analysis. 1J8U is the catalytic domain of human phenylalanine hydroxylase with bound Fe(II) and does not contain any gaps. Moreover, it has the second-highest coverage (71.90%) and a pH value (6.80) reasonably close to the physiological pH.
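The selection logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ordering rule (discard structures with gaps, then sort by resolution and R-factor) is one possible reading of the criteria above, the sequence length of 452 residues is inferred from the coverage column, and only a subset of the table rows is included.

```python
# Values copied from the structure table.
# Tuple layout: (PDB id, resolution in Å, R-factor, has_gap, first, last)
PAH_LENGTH = 452  # assumed length of P00439, consistent with the coverage column

structures = [
    ("1DMW", 2.00, 0.20, False, 118, 424),
    ("1J8T", 1.70, 0.20, False, 103, 427),
    ("1J8U", 1.50, 0.16, False, 103, 427),
    ("2PAH", 3.10, 0.25, True, 118, 452),
    ("5PAH", 2.10, 0.16, False, 117, 424),
]


def coverage(first, last, length=PAH_LENGTH):
    """Percentage of the full sequence covered by residues first..last."""
    return round(100.0 * (last - first + 1) / length, 2)


def rank(candidates):
    """Drop structures with gaps, then sort by resolution and R-factor (lower is better)."""
    gap_free = [s for s in candidates if not s[3]]
    return sorted(gap_free, key=lambda s: (s[1], s[2]))


best = rank(structures)[0]
print(best[0], coverage(best[4], best[5]))
```

With these inputs the gap-free structure with the lowest resolution value is 1J8U, in line with the choice made above.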

The 3D structure of 1J8U and its ligands is shown in <xr id="1j8u"/> below. The binding site for the ligand Fe(II) consists of the residues His285, His290 and Glu330, and the one for the ligand H4B consists of Val245, Gly247, Leu249 and Ser251. Both were taken from the PDB entry of 1J8U. These binding sites are shown in detail in <xr id="bindingsite"/>.

<figure id="1j8u">
3D structure of 1J8U (green) in cartoon style with its two ligands H4B - C9H15N5O3 (blue sticks) and Fe(II) (grey sphere). The binding site of Fe(II) is shown in orange and that of H4B in pink. Both are shown as sticks with surrounding surfaces.
</figure>
<figure id="bindingsite">
3D structure of 1J8U (green) in cartoon style, zoomed in on the ligands H4B - C9H15N5O3 (blue) with its binding site (pink) and Fe(II) (grey) with its binding site (orange). The binding sites are shown as sticks together with their surfaces.
</figure>

Visualisation of used mutations

The following five mutations, chosen from the mutations selected in Task 8, are mapped onto the crystal structure:

{| class="wikitable"
! Substitution !! Prediction !! Database
|-
| Gln172His || neutral || dbSNP
|-
| Ala259Val || non-neutral || HGMD
|-
| Thr266Ala || non-neutral || dbSNP
|-
| Phe392Ser || non-neutral || dbSNP
|-
| Pro416Gln || non-neutral || HGMD
|}

To introduce each mutation into the structure, we used the Mutagenesis tool in PyMOL. The following subsections give a short overview of each mutation with the associated figure.
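The mutations are written in three-letter notation here (e.g. Gln172His) and in one-letter notation in the energy tables further down (e.g. Q172H). The following small helper, which is only a convenience sketch and not part of the original workflow, converts between the two; the function names are made up for this example.

```python
import re

# Standard three-letter to one-letter amino acid codes (only those needed here).
THREE_TO_ONE = {
    "Gln": "Q", "His": "H", "Ala": "A", "Val": "V",
    "Thr": "T", "Phe": "F", "Ser": "S", "Pro": "P",
}


def parse_substitution(label):
    """Split e.g. 'Gln172His' into (wildtype, position, mutant)."""
    match = re.fullmatch(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})", label)
    if match is None:
        raise ValueError(f"not a three-letter substitution: {label!r}")
    wildtype, position, mutant = match.groups()
    return wildtype, int(position), mutant


def short_name(label):
    """Convert e.g. 'Gln172His' to 'Q172H'."""
    wildtype, position, mutant = parse_substitution(label)
    return f"{THREE_TO_ONE[wildtype]}{position}{THREE_TO_ONE[mutant]}"


mutations = ["Gln172His", "Ala259Val", "Thr266Ala", "Phe392Ser", "Pro416Gln"]
short_names = [short_name(m) for m in mutations]
print(short_names)
```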

Gln172His

<figure id="Q172H">

Mutation of glutamine (yellow) to histidine (purple) with their polar contacts located at position 172 of 1J8U (green). The ligands and binding sites are shown in the same colors and representations as given in Figure 1 and 2.

</figure> This mutation is located in a coil region far away from both binding sites (see <xr id="Q172H"/>), so one would not expect a large effect on the protein. This supports our assumption from last week's task that the mutation is neutral, although the mutant residue retains only two of the three polar contacts of the wildtype residue.

Ala259Val

<figure id="A259V">

Mutation of alanine (yellow) to valine (purple) with their polar contacts located at position 259 of 1J8U (green). The ligands and binding sites are shown in the same colors and representations as given in Figure 1 and 2.

</figure> As one can see in <xr id="A259V"/>, this mutation lies at a moderate distance from both binding sites. In contrast to Task 8, the mutation here lies exactly on a helix and not directly beneath one. The reason for this difference is that different protein structures were used in the two tasks (2PAH in Task 8 and 1J8U here). Even though it is not very close to either binding site and both polar contacts of the wildtype residue are retained in the mutant, we assume that this mutation is disease-causing.

Thr266Ala

<figure id="T266A">

Mutation of threonine (yellow) to alanine (purple) with their polar contacts located at position 266 of 1J8U (green). The ligands and binding sites are shown in the same colors and representations as given in Figure 1 and 2.

</figure> In <xr id="T266A"/> we can see that this residue is not located directly at the binding sites but lies between them. In last week's task Thr266Ala lay at the end of a helix, but here it is found in a coil region. Given its proximity to the binding sites, we would expect this mutation to be disease-causing. As for Gln172His, the mutant residue exhibits only two of the three polar contacts of the wildtype residue.

Phe392Ser

<figure id="F392S">

Mutation of phenylalanine (yellow) to serine (purple) with their polar contacts located at position 392 of 1J8U (green). The ligands and binding sites are shown in the same colors and representations as given in Figure 1 and 2.

</figure> Since this mutation lies far away from both binding sites (<xr id="F392S"/>), we would expect only a small effect on the protein. Nevertheless, we think this mutation is disease-causing, as it is located at the beginning of an alpha helix and the mutant residue has two more polar contacts than the wildtype residue. Moreover, the two residues have very different structures, and last week's prediction tools unanimously predicted that this mutation affects the protein.

Pro416Gln

<figure id="P416Q">

Mutation of proline (yellow) to glutamine (purple) with their polar contacts located at position 416 of 1J8U (green). The ligands and binding sites are shown in the same colors and representations as given in Figure 1 and 2.

</figure> In <xr id="P416Q"/> we can see that this mutation, like Phe392Ser, is far away from the binding sites. So here, too, we would not expect a large effect on the protein. As before, last week's tools predict this mutation as non-neutral. Furthermore, the wildtype proline at position 416 has no polar contacts to any other residue (solvent excluded), whereas the mutant residue has one. We therefore think that this mutation is disease-causing.

Mutated structure creation

SCWRL4

SCWRL4 (Side-chain Conformation Prediction With Rotamer Library) predicts protein side-chain conformations. For this purpose it uses a backbone-dependent rotamer library. The tool is based on graph theory and is easy to use, accurate and very fast. The output is a 3D structure of the prediction. <ref name="scwrl4"> Georgii G. Krivov, Maxim V. Shapovalov and Roland L. Dunbrack Jr. (2009): "Improved prediction of protein side-chain conformations with SCWRL4". Proteins Vol.77(4):778-95. doi:10.1002/prot.22488</ref> There is also an online SCWRL Server available.
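One way to request a single point mutation from the command-line version of SCWRL4 is, to our understanding, a sequence file in which lower-case residues are kept as they are and upper-case residues are rebuilt. The sketch below only builds such a sequence string. It rests on that assumption about the file convention, the function name is made up, and the six-residue fragment is a toy example, not the real PAH sequence. For 1J8U the first modelled residue would be 103.

```python
def scwrl_sequence(sequence, first_residue, position, mutant):
    """Lower-case every residue (kept as is) except the mutated one (upper case, to be rebuilt)."""
    index = position - first_residue
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the modelled chain")
    residues = list(sequence.lower())
    residues[index] = mutant.upper()
    return "".join(residues)


# Hypothetical 6-residue fragment starting at residue 170; not the real PAH sequence.
toy = scwrl_sequence("ACQDEF", first_residue=170, position=172, mutant="H")
print(toy)
```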

Since the wildtype and mutated residues were already compared in the previous section, we now only compare the mutant modelled by SCWRL with the mutant built from the wildtype structure using the Mutagenesis tool of PyMOL. The SCWRL structure is colored purple, whereas the PyMOL mutant is colored green.

The main difference between the two structures is that the SCWRL structure includes hydrogen atoms, which the other structure lacks. This is why the bonds sometimes look different although they originate from the same position (namely where the hydrogen atom is not included in the SCWRL structure).

FoldX

FoldX is an empirical force field that provides a fast and accurate estimate of the effect of mutations (e.g. SNPs) on protein stability, expressed as free energy changes<ref name="foldef"> Joost Schymkowitz, Jesper Borg, Francois Stricher, Robby Nys, Frederic Rousseau and Luis Serrano (2005): "The FoldX web server: an online force field". Nucleic Acids Research Vol.33: W382–W388. doi:10.1093/nar/gki387</ref>,<ref name="foldx"> Raphael Guerois, Jens Erik Nielsen and Luis Serrano (2002): "Predicting Changes in the Stability of Proteins and Protein Complexes: A Study of More Than 1000 Mutations". J. Mol. Biol. Vol.320: 369–387. doi:10.1016/S0022-2836(02)00442-4</ref>. FoldX provides different run modes, but normally it takes a PDB file and calculates several energy terms (e.g. backbone H-bond, van der Waals, water bridge). The output lists all calculated energy terms; those for our mutations are shown in <xr id="foldx"/>.

<figtable id="foldx">

{| class="wikitable"
! Type !! Total energy !! Backbone Hbond !! Sidechain Hbond !! Van der Waals !! Electrostatics !! Solvation Polar !! Solvation Hydrophobic !! Van der Waals clashes !! Entropy sidechain !! Entropy mainchain !! Sloop entropy !! Mloop entropy !! Cis bond !! Torsional clash !! Backbone clash !! Helix dipole !! Water bridge !! Disulfide !! Electrostatic kon !! Partial covalent bonds !! Energy Ionisation !! Entropy Complex
|-
| WT || 14.00 || -196.05 || -55.77 || -379.28 || -19.47 || 492.46 || -495.68 || 34.69 || 194.42 || 454.16 || 0.00 || 0.00 || 0.00 || 11.69 || 227.66 || -15.13 || -13.50 || 0.00 || 0.00 || 0.00 || 1.45 || 0.00
|-
| Q172H || 13.84 || -196.19 || -56.45 || -378.50 || -19.95 || 492.48 || -495.07 || 33.39 || 193.60 || 453.85 || 0.00 || 0.00 || 0.00 || 11.68 || 227.36 || -14.78 || -11.79 || 0.00 || 0.00 || 0.00 || 1.55 || 0.00
|-
| A259V || 14.08 || -198.14 || -59.00 || -379.98 || -19.12 || 495.02 || -497.16 || 34.93 || 194.60 || 454.17 || 0.00 || 0.00 || 0.00 || 13.19 || 227.82 || -15.57 || -10.30 || 0.00 || 0.00 || 0.00 || 1.45 || 0.00
|-
| T266A || 15.08 || -196.56 || -56.37 || -378.45 || -19.86 || 493.23 || -494.94 || 32.68 || 193.42 || 453.13 || 0.00 || 0.00 || 0.00 || 11.94 || 227.53 || -15.13 || -9.46 || 0.00 || 0.00 || 0.00 || 1.45 || 0.00
|-
| F392S || 24.59 || -195.99 || -57.60 || -377.49 || -19.29 || 495.41 || -492.28 || 34.80 || 193.85 || 453.89 || 0.00 || 0.00 || 0.00 || 11.71 || 227.42 || -15.39 || -8.47 || 0.00 || 0.00 || 0.00 || 1.45 || 0.00
|-
| P416Q || 21.20 || -196.09 || -57.10 || -379.58 || -19.53 || 496.13 || -495.90 || 34.25 || 195.06 || 454.65 || 0.00 || 0.00 || 0.00 || 11.44 || 228.58 || -15.11 || -8.47 || 0.00 || 0.00 || 0.00 || 1.45 || 0.00
|}

All energy components reported in the FoldX output for the wildtype and the five mutations. </figtable> The energies of the wildtype and the five mutants are very similar. Only the total energies of the mutants F392S and P416Q are noticeably higher than that of the wildtype. Based on these energies, we would predict these two mutations to be non-neutral and the others to be neutral.
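To make the comparison of the total energies concrete, the difference of each mutant to the wildtype can be computed directly from the "Total energy" column. The snippet is a minimal sketch; the function name is made up, and no cutoff is applied because this page does not define one.

```python
# Total energies copied from the FoldX table (FoldX reports energies in kcal/mol).
total_energy = {
    "WT": 14.00,
    "Q172H": 13.84,
    "A259V": 14.08,
    "T266A": 15.08,
    "F392S": 24.59,
    "P416Q": 21.20,
}


def energy_differences(energies, reference="WT"):
    """Total energy of each mutant minus the total energy of the reference structure."""
    ref = energies[reference]
    return {
        name: round(value - ref, 2)
        for name, value in energies.items()
        if name != reference
    }


ddg = energy_differences(total_energy)
for name, diff in sorted(ddg.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {diff:+.2f}")
```

F392S and P416Q stand out with differences of roughly 10.6 and 7.2, while the other three mutants stay within about 1 of the wildtype.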

Comparison

We now compare the results of SCWRL and FoldX. To this end, we loaded the PDB structures into PyMOL and looked for differences between the two results. The 3D structures produced by the two programs differ in only one place, which can be seen in <xr id="comparison_all"/> and <xr id="comparison_part"/>.

<figure id="comparison_all">
Comparison of the structures predicted with SCWRL (purple) and FoldX (teal). The only difference is one beta strand.
</figure>
<figure id="comparison_part">
Comparison of the structures predicted with SCWRL (purple) and FoldX (teal). Zoom into the region of the only beta strand that differs.
</figure>

We also compared the two structures with the wildtype 1J8U structure and found some small changes, but the interesting part was the difference between SCWRL and FoldX. In the FoldX structure the beta strand is at the same position as in the wildtype structure, whereas in the SCWRL structure it has changed. Since this is only a small difference between the structures, we do not think that it has a large consequence for the protein.

It is also interesting to see which changes the SCWRL (purple) and FoldX (teal) outputs show for each mutation. Hence, we analyse every mutation on its own. We also added polar contacts to the mutated residues; these can overlap and are therefore sometimes not visible for both tools.

Overall, one can see that the SCWRL structures include hydrogen atoms, whereas the FoldX structures do not. Furthermore, all mutated residues are slightly twisted relative to each other. On the whole, however, there are only small differences between the two tools.

Energy comparisons

In this part the energies of the wildtype and the mutants are compared for both SCWRL and FoldX. <xr id="scwrl"/> contains all results as well as the predictions derived from these energies. <figtable id="scwrl">

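{| class="wikitable"
! rowspan="2" | Type !! colspan="3" | SCWRL results !! colspan="3" | FoldX results
|-
! Energy !! Energy Mutation / Energy Wildtype !! Prediction !! Energy !! Energy Mutation / Energy Wildtype !! Prediction
|-
| WT || 164.210 || 1.00 || - || 14.00 || 1.00 || -
|-
| Q172H || 169.699 || 1.03 || neutral || 13.84 || 0.99 || neutral
|-
| A259V || 197.235 || 1.20 || non-neutral || 14.08 || 1.01 || neutral
|-
| T266A || 167.116 || 1.02 || neutral || 15.08 || 1.08 || neutral
|-
| F392S || 171.409 || 1.04 || non-neutral || 24.59 || 1.76 || non-neutral
|-
| P416Q || 169.007 || 1.03 || neutral || 21.20 || 1.51 || non-neutral
|}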
Comparison of the SCWRL and FoldX energy results between the wildtype and the mutant structures.

</figtable> For the mutation Q172H we cannot be sure whether it is neutral or non-neutral. However, since the sequence-based prediction in last week's task gave the same result as obtained here, we assume that this mutation is indeed neutral. The SCWRL energy of A259V clearly indicates a disease-causing mutation, although the FoldX energy does not suggest this. Nevertheless, we know from the HGMD website that this mutation is indeed non-neutral. The mutation T266A is actually disease-causing, but with both tools we would predict a neutral mutation. The situation is different for the F392S mutation, for which we would predict a non-neutral effect with both tools. However, we are not sure whether the SCWRL prediction is meaningful enough. The last mutation, P416Q, is also non-neutral, but this time SCWRL does not predict a disease-causing mutation whereas FoldX does. With both tools it is difficult to predict disease-causing mutations, since there is no clear cutoff at which a mutation should be called non-neutral or neutral, and since the results of the two tools are not very consistent with each other.
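The ratio columns of the energy comparison table can be reproduced from the raw energies as follows. This is a small sketch with a made-up function name. It deliberately stops at the ratios and does not apply a cutoff, since no fixed threshold for neutral versus non-neutral is defined on this page.

```python
# Energies copied from the SCWRL and FoldX columns of the comparison table.
scwrl = {"WT": 164.210, "Q172H": 169.699, "A259V": 197.235,
         "T266A": 167.116, "F392S": 171.409, "P416Q": 169.007}
foldx = {"WT": 14.00, "Q172H": 13.84, "A259V": 14.08,
         "T266A": 15.08, "F392S": 24.59, "P416Q": 21.20}


def ratios(energies, reference="WT"):
    """Energy of each structure divided by the energy of the reference structure."""
    ref = energies[reference]
    return {name: round(value / ref, 2) for name, value in energies.items()}


scwrl_ratio = ratios(scwrl)
foldx_ratio = ratios(foldx)
for name in scwrl:
    print(name, scwrl_ratio[name], foldx_ratio[name])
```

Note that the SCWRL and FoldX energies are on different scales, so the same ratio does not carry the same meaning for both tools.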

Minimise

The Minimise tool is used to minimise the energy of a model. The table below lists the energies for all five minimise runs. Since the SCWRL output could not be minimised, we can only compare the wildtype (WT) with the five mutant structures constructed with FoldX. The results are shown in <xr id="minimise"/>. <figtable id="minimise">

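{| class="wikitable"
! rowspan="2" | Type !! colspan="5" | minimise run
|-
! 1 !! 2 !! 3 !! 4 !! 5
|-
| WT || -7516.27 || -7524.20 || -7291.36 || -7133.71 || -6996.34
|-
| Q172H || -7514.27 || -7504.92 || -7281.60 || -7131.31 || -7023.56
|-
| A259V || -7469.61 || -7462.48 || -7221.58 || -7065.94 || -6951.32
|-
| T266A || -7536.77 || -7523.38 || -7298.14 || -7165.29 || -7084.60
|-
| F392S || -7511.51 || -7528.61 || -7290.01 || -7132.75 || -7010.52
|-
| P416Q || -7556.57 || -7542.79 || -7299.39 || -7151.21 || -7040.37
|}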
Comparison of the energies calculated in the five minimise runs of the wildtype and the five mutations generated with FoldX. The minimise runs did not work on the SCWRL outputs.

</figtable> The energies of the wildtype and the mutated structures are very similar and increase slightly with each run. Only for the wildtype and the mutant F392S does the second run yield a slightly lower energy than the first.
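The trend described above can be checked directly against the table values. The following sketch, with made-up helper names, lists for every structure the runs in which the energy dropped compared to the previous run.

```python
# Energies of the five minimise runs, copied from the table.
runs = {
    "WT":    [-7516.27, -7524.20, -7291.36, -7133.71, -6996.34],
    "Q172H": [-7514.27, -7504.92, -7281.60, -7131.31, -7023.56],
    "A259V": [-7469.61, -7462.48, -7221.58, -7065.94, -6951.32],
    "T266A": [-7536.77, -7523.38, -7298.14, -7165.29, -7084.60],
    "F392S": [-7511.51, -7528.61, -7290.01, -7132.75, -7010.52],
    "P416Q": [-7556.57, -7542.79, -7299.39, -7151.21, -7040.37],
}


def drops(energies):
    """Return the 1-based run numbers whose energy is lower than in the previous run."""
    return [i + 1 for i in range(1, len(energies)) if energies[i] < energies[i - 1]]


decreasing = {name: drops(values) for name, values in runs.items() if drops(values)}
print(decreasing)
```

Only the wildtype and F392S show a drop, and only in the second run; in all other cases the energy rises from one run to the next.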

References

<references/>