Structure-based mutation analysis HEXA

From Bioinformatikpedia
Revision as of 18:08, 10 August 2011 by Link (talk | contribs) (Energy)

Colour code: active site: green; glycosylation: yellow; cysteine: cyan; mutation: red

Sequence Description

We needed a PDB file without missing residues and with a high-quality structure. We found only one PDB structure that was not bound to a ligand. Therefore, we could not select by quality, pH value, R-factor or coverage. Nevertheless, we list these values in the following table:

Experiment type | X-ray diffraction
Resolution | 2.8 Å
Temperature (Kelvin) | 100 K
Temperature (Celsius) | -173 °C
pH value | 5.5 (slightly acidic)
R-value | 0.270


We could not find a single file without missing residues. In every file there was a gap between residues 74 and 89, and the last amino acid was missing as well. Therefore, we decided to cut off the N-terminal part up to this gap and to use a PDB file containing the structure from residue 89 to 528. This file can be found [here].
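The trimming step can be sketched in Python. This is only a minimal sketch and not necessarily the procedure used for this page: the function name trim_pdb and the example records are hypothetical, and the code assumes standard fixed-column PDB ATOM records (chain identifier in column 22, residue number in columns 23-26).

```python
def trim_pdb(lines, first=89, last=528, chain="A"):
    """Return the ATOM records of one chain with residue numbers in [first, last]."""
    kept = []
    for line in lines:
        if not line.startswith("ATOM"):
            continue                    # skip HETATM, TER, remarks, ...
        if line[21] != chain:           # column 22: chain identifier
            continue
        res_num = int(line[22:26])      # columns 23-26: residue number
        if first <= res_num <= last:
            kept.append(line)
    return kept


# Hypothetical example records (coordinates omitted for brevity).
example = [
    "ATOM      1  N   LEU A  88",
    "ATOM      2  N   THR A  89",
    "ATOM      3  N   ILE A 528",
    "ATOM      4  N   THR B  89",
]
for record in trim_pdb(example):
    print(record)
```

Only the records of chain A with residue numbers 89 and 528 are kept here. A real file would be read with open(...) and the kept lines written to a new file.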

Mutations

Because the PDB file is shortened, we could not analyse the first two mutations at positions 29 and 39.

SNP ID | Codon number | Amino acid change | Codon change
rs4777505 | 29 | Asn -> Ser | AAC -> AGC
rs121907979 | 39 | Leu -> Arg | CTT -> CGT
rs61731240 | 179 | His -> Asp | CAT -> GAT
rs121907974 | 211 | Phe -> Ser | TTC -> TCC
rs61747114 | 248 | Leu -> Phe | CTT -> TTT
rs1054374 | 293 | Ser -> Ile | AGT -> ATT
rs121907967 | 329 | Trp -> TER | TGG -> TAG
rs1800430 | 399 | Asn -> Asp | AAC -> GAC
rs121907982 | 436 | Ile -> Val | ATA -> GTA
rs121907968 | 485 | Trp -> Arg | TGG -> CGG
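The codon changes in the table can be checked against the standard genetic code. The following is a minimal sketch: the names CODON_TABLE, MUTATIONS and describe are hypothetical, the codon table contains only the triplets that occur above, and the last triplet is assumed to read TGG.

```python
# Subset of the standard genetic code: only the codons used in the table.
CODON_TABLE = {
    "AAC": "Asn", "AGC": "Ser", "CTT": "Leu", "CGT": "Arg",
    "CAT": "His", "GAT": "Asp", "TTC": "Phe", "TCC": "Ser",
    "TTT": "Phe", "AGT": "Ser", "ATT": "Ile", "TGG": "Trp",
    "TAG": "TER", "GAC": "Asp", "ATA": "Ile", "GTA": "Val",
    "CGG": "Arg",
}

# (SNP ID, codon number, reference codon, mutated codon) from the table.
MUTATIONS = [
    ("rs4777505", 29, "AAC", "AGC"),
    ("rs121907979", 39, "CTT", "CGT"),
    ("rs61731240", 179, "CAT", "GAT"),
    ("rs121907974", 211, "TTC", "TCC"),
    ("rs61747114", 248, "CTT", "TTT"),
    ("rs1054374", 293, "AGT", "ATT"),
    ("rs121907967", 329, "TGG", "TAG"),
    ("rs1800430", 399, "AAC", "GAC"),
    ("rs121907982", 436, "ATA", "GTA"),
    ("rs121907968", 485, "TGG", "CGG"),
]


def describe(ref_codon, alt_codon):
    """Return (reference residue, mutated residue, changed codon positions)."""
    changed = [i for i in range(3) if ref_codon[i] != alt_codon[i]]
    return CODON_TABLE[ref_codon], CODON_TABLE[alt_codon], changed


for snp, position, ref, alt in MUTATIONS:
    ref_aa, alt_aa, changed = describe(ref, alt)
    print(snp, position, ref_aa, "->", alt_aa, "changed codon positions:", changed)
```

Codon positions are counted from 0, so a value of 1 means the second base of the triplet.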

Analysis of the mutations

We created a separate page for each mutation. A summary of the analysis can be found in the Summary section.


Protocol - Using the methods

Pymol

We visualized the local hydrogen-bonding network with the following commands:

distance hbonds, all, all, 3.2, mode=2
zoom resi <interval>
hide labels, all
color red, resi <mutation_position>

Furthermore, we used the polar contact mode in PyMOL to visualize the H-bonds.

The clashes were visualized with the following commands:

distance clash, pos_mutation, all, 2.0, 0
zoom clash
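The idea behind the clash command, listing every atom pair between the mutated residue and the rest of the structure that lies within 2.0 Å, can be sketched in plain Python. This is an illustration only and does not call PyMOL: the function name find_clashes, the atom labels and the coordinates are hypothetical.

```python
import math


def find_clashes(mutated_atoms, other_atoms, cutoff=2.0):
    """Return (label_a, label_b, distance) for all atom pairs within cutoff (in Å)."""
    clashes = []
    for label_a, xyz_a in mutated_atoms:
        for label_b, xyz_b in other_atoms:
            dist = math.dist(xyz_a, xyz_b)   # Euclidean distance, Python 3.8+
            if dist <= cutoff:
                clashes.append((label_a, label_b, round(dist, 2)))
    return clashes


# Made-up coordinates: two atoms of the mutated residue, two neighbouring atoms.
mutated = [("mut:NH1", (10.0, 10.0, 10.0)), ("mut:NH2", (13.0, 13.0, 10.0))]
others = [("nbr1:O", (11.5, 10.0, 10.0)), ("nbr2:C", (16.0, 10.0, 10.0))]
print(find_clashes(mutated, others))
```

With these example coordinates only the pair mut:NH1 / nbr1:O, at 1.5 Å, falls below the cutoff.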

FoldX

To use FoldX, we created a runfile, which can be found [here]. We set the temperature and pH value to the values we extracted from the PDB page. Furthermore, we analysed the mutations with a randomly chosen temperature and pH value to see how much influence these parameters have on the analysis.

We ran FoldX with the following command:

FoldX -runfile run.txt > foldx_output


minimise

Next we used minimise. For this program, it was not necessary to create any file for the run. Unfortunately, we could not find any documentation for minimise, so it was hard to figure out how it works and what the output means.

We ran minimise with the following command:

minimise <input> <output>

Gromacs

Before we could run Gromacs, we had to curate our PDB file. For this, we used the script repairPDB to extract chain A. Next, we ran SCWRL to make sure that every residue is present in the PDB file.

Then we used the commands which are listed in our task section.


In addition to the analysis of our mutated structures, we also chose different force fields and analysed our protein without any mutation with each of them. Here are the results of this analysis:

Analysed energies (in kJ/mol) per force field

Energy | Statistic | AMBER3 | AMBER99SB-ILDN | CHARMM27
Bond | Average | 852.968 | 1091.57 | 1796.6
Bond | Error estimate | - | 270 | 240
Bond | RMSD | 42.0241 | -nan | 2924.29
Bond | Drift | -74.0853 | -1622.75 | -1404.11
Angle | Average | 3438.47 | 3326.81 | 4764.7
Angle | Error estimate | - | 62 | 60
Angle | RMSD | 16.8864 | -nan | 466.82
Angle | Drift | -33.7041 | 404.076 | 368.45
Potential | Average | -50917.7 | -61304.1 | 166.582
Potential | Error estimate | - | 960 | 39
Potential | RMSD | 66.4149 | -nan | 79.7058
Potential | Drift | -132.636 | -6402.44 | 280.841

Furthermore, we used different values for nsteps. The run times of these analyses can be found in the following table and graph:

nsteps | Time (real) | Time (user) | Time (sys) | #steps
50 | 8.268s | 3.860s | 0.110s | 24
500 | 27.523s | 47.650s | 0.540s | 321
5000 | 25.281s | 17.710s | 0.210s | 114
50000 | 14.940s | 14.210s | 0.300s | 91

Results

Energy

Mutation | FoldX: energy value | FoldX: ratio difference (%) | Minimise: energy value | Minimise: ratio difference (%) | Gromacs: energy value | Gromacs: ratio difference (%)
wildtype | -154.17 | 0 | -9610.467157 | 0 | -61304.1 | 0
Rs61731240 | -151.61 | 1.56 | -9480.968602 | 1.35 | -48160.5 | 21.44
Rs121907974 | to do | to do | -9594.637506 | 0.16 | -46177.4 | 24.68
Rs61747114 | -153.78 | 0.15 | -9606.588566 | 0.04 | -48802.5 | 20.40
Rs1054374 | -152.15 | 1.21 | -6189.246312 | 35.60 | -48652.6 | 20.64
Rs121907967 | - | - | - | - | - | -
Rs1800430 | -155.11 | -0.72 | -9505.864181 | 1.09 | -48693.9 | 20.57
Rs121907982 | -154.36 | -0.23 | -9618.062763 | -0.08 | -48418.9 | 21.02
Rs121907968 | -149.08 | 3.20 | -9608.663976 | 0.02 | -49443.8 | 19.35
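The ratio difference is not defined explicitly on this page. A minimal sketch, assuming it is the energy difference between mutant and wildtype relative to the absolute wildtype energy, in percent, reproduces the Minimise and Gromacs columns of the table. The function name ratio_difference is hypothetical.

```python
def ratio_difference(e_mutant, e_wildtype):
    """Energy difference to the wildtype in percent of |wildtype energy|.

    Positive values mean the mutant has a higher (less negative) energy.
    """
    return (e_mutant - e_wildtype) / abs(e_wildtype) * 100.0


# Values taken from the table above.
minimise_wt = -9610.467157
gromacs_wt = -61304.1

print(round(ratio_difference(-6189.246312, minimise_wt), 2))  # Minimise, Rs1054374
print(round(ratio_difference(-9618.062763, minimise_wt), 2))  # Minimise, Rs121907982
print(round(ratio_difference(-48160.5, gromacs_wt), 2))       # Gromacs, Rs61731240
```

Dividing by the absolute wildtype energy keeps the sign convention of the table: a mutant with a lower energy than the wildtype gets a negative ratio.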

FoldX

The energies of the mutated structures differ from the energy of the wildtype structure, but in general these differences are small. The largest difference occurs for mutation Rs121907968, where the mutated structure has about 3.2% more energy than the wildtype. The smallest difference is between the wildtype and Rs61747114 with only 0.15%, which is nearly 0. Two structures have less energy than the wildtype (mutations Rs1800430 and Rs121907982), but here too the differences are very small (0.72% and 0.23%, respectively). Because the energy differences are so small, it is hard to explain from these values why these mutations damage the protein.

Minimise

Minimise yields lower energy values than FoldX, but the two sets of values are not comparable, because the methods are based on different calculation models. Here, Rs1054374 shows the largest energy difference with about 35%, which is very high. The smallest energy differences are seen for mutations Rs61747114 (0.04%) and Rs121907968 (0.02%). One structure, Rs121907982, has a lower energy than the wildtype, but the difference is very small (-0.08%). In general, except for mutation Rs1054374, the structures differ only slightly in energy from the wildtype.

Gromacs

Gromacs yields the lowest energy values, but as before, these values are not comparable with those of the other methods because of the different calculation models. Here, there is a strong difference between wildtype and mutants of about 20% on average. Furthermore, the values of the different mutations differ only slightly from each other. The smallest energy difference is found for mutation Rs121907968 with 19.35%, whereas the largest is between the wildtype and mutation Rs121907974 with 24.68%.

General

Interestingly, Gromacs does not calculate a lower energy than the wildtype for any structure. FoldX and Minimise almost always show the same trend, with two exceptions. Mutation Rs1054374 has a low ratio with FoldX but a very high ratio with Minimise. Furthermore, FoldX calculated a lower energy than the wildtype for Rs1800430, whereas Minimise does not predict this trend. To compare the predictions of the different programs, we plotted the values in a graph, which can be seen in the following picture.

Visualisation of the ratio of the different programs. (FoldX: orange, Minimise: purple, Gromacs: grey)
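The comparison of FoldX and Minimise can also be made directly on the ratio values from the energy table. The following is a minimal sketch with hypothetical names (RATIOS, sign_disagreements, largest_gap). Rs121907974 and Rs121907967 are left out because no FoldX value is available for them.

```python
# Ratio differences (%) from the energy table: (FoldX, Minimise).
RATIOS = {
    "Rs61731240": (1.56, 1.35),
    "Rs61747114": (0.15, 0.04),
    "Rs1054374": (1.21, 35.60),
    "Rs1800430": (-0.72, 1.09),
    "Rs121907982": (-0.23, -0.08),
    "Rs121907968": (3.20, 0.02),
}


def sign_disagreements(ratios):
    """Mutations for which the two programs disagree on the direction of the change."""
    return [snp for snp, (foldx, minimise) in ratios.items()
            if (foldx < 0) != (minimise < 0)]


def largest_gap(ratios):
    """Mutation with the largest absolute gap between the two ratio values."""
    return max(ratios, key=lambda snp: abs(ratios[snp][0] - ratios[snp][1]))


print(sign_disagreements(RATIOS))
print(largest_gap(RATIOS))
```

Under these assumptions the sign check singles out Rs1800430 and the gap check singles out Rs1054374, the two exceptions discussed above.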

Bonds

Mutation | Position | H-bonds original | H-bonds mutated
rs61731240 | 179 | no | no
rs121907974 | 211 | no | no
rs61747114 | 248 | no | no
rs1054374 | 293 | two | no
rs121907967 | 329 | - | -
rs1800430 | 399 | no | no
rs121907982 | 436 | no | no
rs121907968 | 485 | one | one (different location)