Structure based mutation analysis of GBA

From Bioinformatikpedia
Revision as of 19:31, 14 August 2011 by Braunt (talk | contribs) (Mutations)

Introduction

After the sequence-based mutation analysis described in Task 6, a structure-based analysis of ten mutations (listed in the table below) was carried out using several different tools. For reasons of clarity, the tools and the corresponding steps applied in this analysis are described in more detail in this workflow.

Nr. | SNP ID/Accession Number | Database | Position including SP | Position without SP | Amino Acid Change | Codon Change
1 | CM081634 | HGMD | 49 | 10 | Gly - Ser | cGGC-AGC
2 | rs74953658, CM050263 | dbSNP, HGMD | 63 | 24 | Asp - Asn | tGAC-AAC
3 | rs1141820 | dbSNP | 99 | 60 | His - Arg | CAC-CGC
4 | CM880035 | HGMD | 159 | 120 | Arg - Gln | CGG-CAG
5 | rs80205046, CM041347 | dbSNP, HGMD | 221 | 182 | Pro - Leu | CCC-CTC
6 | rs74731340, CM970620 | dbSNP, HGMD | 310 | 271 | Ser - Asn | AGT-AAT
7 | CM880036 | HGMD | 409 | 370 | Asn - Ser | AAC-AGC
8 | CM993703 | HGMD | 350 | 311 | His - Arg | CAT-CGT
9 | rs80020805, CM052245 | dbSNP, HGMD | 455 | 416 | Met - Val | cATG-GTG
10 | rs113825752 | dbSNP | 509 | 470 | Leu - Pro | CTT-CCT
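
The two position columns in the table differ by a constant offset of 39 residues, which corresponds to the signal peptide (SP). The following minimal sketch converts between the two numbering schemes; the constant and the function names are illustrative and not part of the original workflow.

```python
# Offset between the two numbering schemes, read off the table above
# (e.g. 49 - 10 = 39). Names are hypothetical, chosen for illustration.
SIGNAL_PEPTIDE_LENGTH = 39


def to_mature_numbering(position_with_sp: int) -> int:
    """Convert a position including the SP to numbering without the SP."""
    if position_with_sp <= SIGNAL_PEPTIDE_LENGTH:
        raise ValueError("position lies inside the signal peptide")
    return position_with_sp - SIGNAL_PEPTIDE_LENGTH


def to_precursor_numbering(position_without_sp: int) -> int:
    """Convert a position without the SP to numbering including the SP."""
    if position_without_sp < 1:
        raise ValueError("positions are 1-based")
    return position_without_sp + SIGNAL_PEPTIDE_LENGTH


# Positions including the SP, in the order of the table (mutations 1 to 10).
POSITIONS_WITH_SP = [49, 63, 99, 159, 221, 310, 409, 350, 455, 509]
POSITIONS_WITHOUT_SP = [to_mature_numbering(p) for p in POSITIONS_WITH_SP]
```

The structure-based sections of this page refer to the numbering without SP.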



Structure Selection

To carry out a structure-based analysis of the mutations chosen in Task 6, a crystal structure had to be selected. According to UniProt, 19 different crystal structures of glucocerebrosidase are available. The table below shows the six structures with a resolution of 2 Å or better. 2NT0 was chosen as template for the analysis carried out in this section, as no residues are missing, the R-value is quite low, and it has the best resolution among the structures without missing residues. Only incomplete structures have been resolved near the physiological pH (7.4), therefore a structure resolved at a more acidic pH had to be chosen. The structure can either be downloaded from the PDB website or by using the script fetchpdb, which validates the ID and downloads the corresponding structure.
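
The two steps attributed to fetchpdb (validating the ID, then downloading the structure) could look roughly like the sketch below. This is not the original script: the function names are hypothetical, and the RCSB download URL pattern is an assumption about where the file is fetched from.

```python
import re
import urllib.request

# Classic PDB IDs have four characters: one digit followed by three
# alphanumeric characters (e.g. 2NT0).
PDB_ID_PATTERN = re.compile(r"^[0-9][A-Za-z0-9]{3}$")


def is_valid_pdb_id(pdb_id: str) -> bool:
    """Return True if the string has the shape of a four-character PDB ID."""
    return bool(PDB_ID_PATTERN.match(pdb_id))


def pdb_download_url(pdb_id: str) -> str:
    """Build the download URL (assumed RCSB file-server layout)."""
    if not is_valid_pdb_id(pdb_id):
        raise ValueError(f"not a valid PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def fetch_pdb(pdb_id: str, target_path: str) -> str:
    """Download the structure to target_path (requires network access)."""
    urllib.request.urlretrieve(pdb_download_url(pdb_id), target_path)
    return target_path
```

A call such as `fetch_pdb("2NT0", "2NT0.pdb")` would then store the template locally.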


PDB ID | Resolution [Å] | R-factor | pH | # Missing Residues (A/B)
1OGS | 2.00 | 0.195 | 4.6 | 0
2NT0 | 1.79 | 0.181 | 4.5 | 0
2V3D | 1.96 | 0.157 | 6.5 | 9/8
2V3E | 2.00 | 0.163 | 7.5 | 7/7
2V3F | 1.95 | 0.154 | 6.5 | 8/14
3GXI | 1.84 | 0.193 | not reported | 0
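
The selection criteria described above (no missing residues, then best resolution) can be expressed in a few lines. The sketch below is only an illustration of that reasoning; the tuple layout and the function name are assumptions, and a single "0" in the table is read as zero missing residues in both chains.

```python
# (PDB ID, resolution in Å, R-factor, missing residues chain A, chain B),
# copied from the table above.
CANDIDATES = [
    ("1OGS", 2.00, 0.195, 0, 0),
    ("2NT0", 1.79, 0.181, 0, 0),
    ("2V3D", 1.96, 0.157, 9, 8),
    ("2V3E", 2.00, 0.163, 7, 7),
    ("2V3F", 1.95, 0.154, 8, 14),
    ("3GXI", 1.84, 0.193, 0, 0),
]


def select_template(candidates):
    """Pick the complete structure with the best (lowest) resolution."""
    complete = [c for c in candidates if c[3] == 0 and c[4] == 0]
    if not complete:
        raise ValueError("no structure without missing residues")
    return min(complete, key=lambda c: c[1])[0]


TEMPLATE = select_template(CANDIDATES)
```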


Mutation Mapping

Figure 1 shows the positions of the analyzed mutations in the original structure of 2NT0. As already mentioned in Tasks 5 and 6, two mutations lie next to the active site residues Glu235 and Glu340, namely the mutations at positions 120 and 311. The wildtype residues at these positions (Arg120 and His311) are known to form hydrogen bonds with the active site residues and should therefore be quite important for function and structure. <ref>Kim et al., Crystal Structure of the Salmonella enterica Serovar Typhimurium Virulence Factor SrfJ, a Glycoside Hydrolase Family Enzyme. Journal of Bacteriology, 2009, p. 6550-6554, Vol. 191, No. 21 </ref> The other eight mutation positions are distributed over the whole protein. The amino acid properties of the different mutant and wildtype residues have already been analysed in Task 6.


Figure 1: 2NT0 with highlighted mutation positions (red) and active site residues (blue).
Figure 2: Close-up of the active site of 2NT0 with highlighted mutation positions (red) and active site residues (blue).

SCWRL

SCWRL4 was applied ten times, once for each mutation. The resulting conformations of the mutants are visualized in Figure 3. For this representation, the hydrogens of the SCWRL mutant structures have been removed to simplify the comparison with the other two structures. Figure 3 additionally shows the wildtype amino acids and the mutants created with the mutagenesis method of PyMOL. The conformations created with SCWRL4 and PyMOL vary greatly; only for mutation 9 do they seem to be quite similar. Figure 4 shows a superposition of the wild type protein and the mutated proteins in cartoon representation. It shows that SCWRL changed not only the mutant residues but also some beta sheets at the bottom of the structure (shown in green). This may be due to the different polar interactions of the mutants.

Figure 3: Wildtype amino acids (red) and mutations created with SCWRL (green) and PyMOL mutagenesis (orange) highlighted on the structure of 2NT0.
Figure 4: Cartoon representation of 2NT0, chain A (gray) superimposed with the resulting structures of SCWRL (green).




Minimise

Figure 5 shows the positions of interest with highlighted mutant and wild type residues in the PDB files obtained with Minimise after applying the steps indicated in the workflow. The hydrogens of the structures have been removed as well.

Figure 5: Wildtype amino acids (red) and mutations (green) created with Minimise. Polar interactions are shown with dotted lines.

Gromacs

Figure 6 shows the positions of interest with highlighted mutant and wild type residues in the PDB files obtained with Gromacs after applying the steps indicated in the workflow. The hydrogens of the structures have been removed as well.


Figure 6: Wildtype amino acids (red) and mutations (cyan) created with Gromacs. Polar interactions are shown with dotted lines.

Polar Interactions

Polar interactions are crucial for the structure, and therefore the function, of a protein. A mutation of a single amino acid can thus alter the features and appearance of a protein. Analyzing polar interactions may help to determine whether a mutation is damaging or neutral. If the polar interactions that are present in the wild type structure are still formed after the mutation, the mutation may be tolerated; otherwise it is likely to be damaging.

The table below shows the residues forming polar interactions with the mutant/wild type amino acid. The polar interactions can be seen in Figure 3, Figure 5 and Figure 6 as well, but there one cannot clearly distinguish whether an interaction is formed only by, e.g., the wild type or by both wild type and mutant. To determine the interaction partners, PyMOL was used [actions -> find -> polar contacts -> to other atoms in object].


Mutation | PDB Wild Type | PyMOL Mutagenesis | SCWRL | Minimise Wild Type | Minimise Mutation | Gromacs Wild Type | Gromacs Mutation
1 | S8 | S8, F9 | S8 | S8 | S8 | S8 | S8
2 | K413, Y418 | Y22 | K413, Y418 | K413, Y418 | K413, Y418 | K413, Y418 | K413, Y418
3 | T471, G62 | - | - | T471, G62 | - | T417 | -
4 | G83, M85, S177, D282, N234, E340 | G83, M85 | G83, M85, S177, S339 | G83, M85, S177, D282, N234, E340 | G83, M85, S177, S339 | G83, M85, S177, D282, N234, E340 | G83, M85, S177
5 | S181, L185 | S181, L185 | S181, L185 | S181, L185 | S181, L185 | S181, L185 | S181, L185
6 | L268, H273, H274 | L268, H273, H274 | L268, H274 | L268, H273, H274 | L268, H274 | L268, H273, H274 | L268, H274, T267
7 | S366, Y373, V375 | S366, Y373, V375 | S366, Y373, V375 | S366, Y373, V375 | S366, Y373, V375 | S366, Y373, V375 | S366, Y373, V375
8 | D282, D283, R285, S339 | D283, R285 | D282, R285, Y313, E340 | D282, D283, R285 | D283, R285, Y313, E340 | N234, E235, D282, D283, R285, S339 | D283, R285, Y313, E340
9 | L420 | L420 | L420 | L420 | L420 | L420 | L420
10 | T482 | T482, V468 | T482 | T482 | T482 | T482 | T482


Mutations 1, 2, 5, 7, 9 and 10 show in most cases (SCWRL, Minimise, Gromacs) the same polar interactions as the wild type and should therefore not have a large influence. Mutations 3, 4, 6 and 8, however, form additional bonds or lack some interactions that are present in the wild type. Interestingly, the most common mutation found in Gaucher disease (Mutation 7, Asn370Ser) forms the same polar interactions as the wildtype.
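
The comparison rule used here (a mutation may be tolerated if all wild type interactions are still formed) amounts to a set comparison of interaction partners. The following sketch illustrates it with the Gromacs partner lists of mutations 6 and 9 as read from the table above; the function names are hypothetical, and the partner lists themselves were determined with PyMOL, not with this code.

```python
def compare_polar_contacts(wild_type, mutant):
    """Split interaction partners into kept, lost and gained contacts."""
    wt, mut = set(wild_type), set(mutant)
    return {
        "kept": sorted(wt & mut),
        "lost": sorted(wt - mut),
        "gained": sorted(mut - wt),
    }


def wild_type_contacts_retained(wild_type, mutant):
    """True if every wild type contact is still formed by the mutant."""
    return set(wild_type) <= set(mutant)


# Mutation 6 (Ser271Asn), Gromacs columns of the table.
MUTATION_6 = compare_polar_contacts(
    ["L268", "H273", "H274"], ["L268", "H274", "T267"]
)
# Mutation 9 (Met416Val), Gromacs columns of the table.
MUTATION_9_RETAINED = wild_type_contacts_retained(["L420"], ["L420"])
```

For mutation 6 the result lists H273 as lost and T267 as gained, which is why it is counted among the mutations with altered interactions.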

Clashes or Holes

Furthermore, it was analyzed whether the mutations cause clashes or holes in the protein structure. The following table shows whether the mutations created or minimized with PyMOL, SCWRL, Minimise and Gromacs lead to structural inconsistencies compared to the wildtype glucocerebrosidase (2NT0).


Mutation | PyMOL Mutagenesis | SCWRL | Minimise | Gromacs
1 | no, but different surface | no | no | no, but different surface
2 | no | no | no | no
3 | no | no, but different surface | no, but different surface | no
4 | no | no | no | no, but different surface
5 | no | no | no | no
6 | no, but different surface | no, but different surface | no, but different surface | no, but different surface
7 | no | no | no | no
8 | no | no, but different surface | no | no
9 | no | no, but different surface | no | no, but different surface
10 | no | no, but different surface | no, but different surface | no



None of the mutations leads to clashes or holes in the protein structure, but some of them cause a different surface. Mutation 6 leads to a different surface no matter with which tool the structure was created or minimized. This is quite interesting, as serine and asparagine are structurally and chemically very similar. The polar interactions of wild type and mutant, however, are not identical.

Overall, one would expect the surface of the resulting protein to change if a mutation leads to different polar interactions and if the mutation lies inside a secondary structure element. Mutation 1 leads to a different surface when built with PyMOL mutagenesis and Gromacs. The mutated residue is part of a secondary structure element, the mutated amino acid has a different polarity than the wildtype, and the PyMOL mutagenesis build forms an additional hydrogen bond. These facts could explain the different surface. Mutation 2 does not lead to a different surface. This is quite interesting with regard to the PyMOL mutagenesis build, as it forms completely different hydrogen bonds than the wildtype. The fact that the mutated residue is not part of a secondary structure element and that asparagine and aspartic acid have a similar molar mass could explain the unaltered surface. Mutation 3 leads to a different surface when modelled with SCWRL or minimised with Minimise. The polar interactions always differ from the ones formed by the wild type, no matter which tool was used. Furthermore, the mutated amino acid is structurally very different from the wildtype amino acid. Mutation 4 results in a different surface only when Gromacs is used. This is quite interesting, because the mutated amino acid forms different polar interactions than the wild type with every tool.

Energy Comparison

The energy of a protein is an indication of its stability: the lower the energy, the more stable the protein. A comparison of the energies of the mutants and the wildtype can therefore indicate whether a mutation is damaging: if the energy of the mutated protein is much higher than that of the wildtype protein, the mutation might be damaging.

In this section, the energies of the structures obtained with different minimization and modeling tools (SCWRL, FoldX, Minimise and Gromacs) are listed for each mutant and for the wildtype (if available).

SCWRL

The table below shows the minimal energies of the mutated structures obtained after the sidechain modelling with SCWRL. Most of the mutations have about the same energy. Only Mutation 5 shows a much higher energy, which suggests that the protein is much less stable with this mutation than with any of the others and that Mutation 5 is damaging.

Mutation 1 2 3 4 5 6 7 8 9 10
Minimal Energy 351.329 348.659 350.017 355.416 473.454 364.148 352.615 362.604 354.98 375.976
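
One simple way to make "much higher energy" explicit is to compare each value with the median of all ten. The sketch below does this for the SCWRL energies above; the threshold factor of 1.2 is an arbitrary assumption for illustration and was not used in the original analysis.

```python
from statistics import median

# Minimal energies after SCWRL sidechain modelling, keyed by mutation number.
SCWRL_ENERGY = {
    1: 351.329, 2: 348.659, 3: 350.017, 4: 355.416, 5: 473.454,
    6: 364.148, 7: 352.615, 8: 362.604, 9: 354.98, 10: 375.976,
}


def flag_high_energy(energies, factor=1.2):
    """Return the mutations whose energy exceeds factor * median energy."""
    threshold = factor * median(energies.values())
    return sorted(m for m, e in energies.items() if e > threshold)


HIGH_ENERGY_MUTATIONS = flag_high_energy(SCWRL_ENERGY)
```

With this threshold only Mutation 5 is flagged; Mutation 10, the second highest value, stays below it.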

FoldX

The total energies calculated with FoldX are shown in the table below. The differences between the wild type and the different mutant structures have been calculated and are listed in the table as well.


Mutation | Total Energy | Difference (wild type - mutant)
WT | -372.60 | 0
1 | -225.82 | -146.78
2 | -228.18 | -144.42
3 | -226.97 | -145.63
4 | -226.38 | -146.22
5 | -196.84 | -175.76
6 | -224.01 | -148.59
7 | -228.48 | -144.12
8 | -217.29 | -155.31
9 | -221.71 | -150.89
10 | -218.65 | -153.95

All mutated structures have a higher energy than the wildtype protein. Once again, Mutation 5 has the highest energy and is therefore the least stable structure. The other mutations show similar energies.
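
The comparison with the wildtype can be reproduced with a few lines of code. Note that the sketch below uses the convention mutant minus wild type, so that a positive value means a higher (less favourable) energy of the mutant; the FoldX table lists the same magnitudes with the opposite sign. The function and variable names are illustrative.

```python
FOLDX_WILD_TYPE = -372.60
# FoldX total energies of the mutants, keyed by mutation number.
FOLDX_MUTANT = {
    1: -225.82, 2: -228.18, 3: -226.97, 4: -226.38, 5: -196.84,
    6: -224.01, 7: -228.48, 8: -217.29, 9: -221.71, 10: -218.65,
}


def energy_differences(wild_type, mutants):
    """Return mutant - wild type for each mutation (positive = less stable)."""
    return {m: round(e - wild_type, 2) for m, e in mutants.items()}


FOLDX_DIFFERENCE = energy_differences(FOLDX_WILD_TYPE, FOLDX_MUTANT)
# Mutation with the largest energy increase relative to the wild type.
LEAST_STABLE = max(FOLDX_DIFFERENCE, key=FOLDX_DIFFERENCE.get)
```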

Minimise

The total energies calculated with Minimise are shown in the table below. The differences between the wild type and the different mutant structures have been calculated and are listed in the table as well.

Mutation | Total Energy | Difference (mutant - wild type)
WT | -12263.635255 | 0
1 | -10405.258800 | 1858.37645
2 | -10354.081889 | 1909.55337
3 | -10313.900667 | 1949.73459
4 | -10415.502839 | 1848.13242
5 | -8558.228610 | 3705.40664
6 | -10397.436817 | 1866.19844
7 | -10427.599844 | 1836.03541
8 | -3611.581531 | 8652.053719
9 | -10030.430566 | 2233.20469
10 | -10300.108046 | 1963.52721

The energy values of the mutated proteins are higher than the energy of the wildtype protein, which indicates that the mutated proteins are less stable. This time Mutation 8 shows the highest energy and the largest deviation from the wildtype protein. The difference for Mutation 5 is also clearly larger than for most other mutations, and Mutation 9 shows a slightly higher energy than the remaining ones. This might indicate that Mutations 5, 8 and 9 are more damaging than the other mutations.
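
The ordering described above can be checked by ranking the mutations by their energy difference to the wildtype. This is a minimal sketch using the differences from the Minimise table; the names are hypothetical.

```python
# Energy differences to the wild type from the Minimise table,
# keyed by mutation number.
MINIMISE_DIFFERENCE = {
    1: 1858.37645, 2: 1909.55337, 3: 1949.73459, 4: 1848.13242,
    5: 3705.40664, 6: 1866.19844, 7: 1836.03541, 8: 8652.053719,
    9: 2233.20469, 10: 1963.52721,
}


def rank_by_difference(differences):
    """Return mutation numbers sorted from largest to smallest difference."""
    return sorted(differences, key=differences.get, reverse=True)


RANKING = rank_by_difference(MINIMISE_DIFFERENCE)
```

The first three entries of the ranking are Mutations 8, 5 and 9; the remaining seven differences all lie within a narrow range of about 1836 to 1964.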

Gromacs

Wildtype

The wildtype glucocerebrosidase was simulated with three different force fields and different maximum number of steps to integrate or minimize (nsteps). The following force fields have been used:

  • AMBER03: Assisted Model Building and Energy Refinement
  • CHARMM27: Chemistry at HARvard Macromolecular Mechanics; all atom force field
  • OPLS/AA: Optimized Potentials for Liquid Simulations; all atom force field

Correlation between nsteps and runtime of mdrun

The following table shows the measured time of mdrun for different nsteps and force fields, together with the number of steps that were actually needed to reach the lowest energy. The parameter nsteps specifies the maximum number of steps to integrate or minimize during the simulation. The computation time should therefore increase with the number of steps performed, which can be observed in the table below. With AMBER03 the minimum was reached after 242 steps and with CHARMM27 after 307 steps, so the runs with nsteps of 500, 1000 and 5000 perform the same work and their user times are nearly identical. OPLS/AA needed 732 steps, so only its runs with nsteps of 1000 and 5000 are comparable. Figure 7 shows the correlation between nsteps and the runtime (real time) of mdrun.

Figure 7: Correlation between nsteps and the runtime of mdrun for different force fields
nsteps | Force field | real | user | sys | Steps
10 | AMBER03 | 0m4.615s | 0m2.910s | 0m0.230s | 10
10 | CHARMM27 | 0m10.760s | 0m2.760s | 0m0.150s | 10
10 | OPLS/AA | 0m3.801s | 0m2.280s | 0m0.320s | 10
50 | AMBER03 | error occurred
50 | CHARMM27 | 0m34.086s | 0m10.830s | 0m0.260s | 50
50 | OPLS/AA | 0m13.042s | 0m9.200s | 0m0.620s | 50
100 | AMBER03 | 0m29.400s | 0m22.060s | 0m0.640s | 100
100 | CHARMM27 | 0m23.616s | 0m19.410s | 0m0.590s | 100
100 | OPLS/AA | 0m25.184s | 0m18.330s | 0m0.740s | 100
500 | AMBER03 | 1m9.519s | 0m53.700s | 0m1.010s | 242
500 | CHARMM27 | 3m15.607s | 1m1.790s | 0m1.070s | 307
500 | OPLS/AA | 3m51.264s | 1m7.870s | 0m1.590s | 500
1000 | AMBER03 | 2m9.160s | 0m52.990s | 0m1.190s | 242
1000 | CHARMM27 | 3m22.812s | 1m2.120s | 0m1.170s | 307
1000 | OPLS/AA | 5m9.851s | 1m38.390s | 0m1.860s | 732
5000 | AMBER03 | 2m13.786s | 0m53.020s | 0m1.030s | 242
5000 | CHARMM27 | 3m15.591s | 1m2.730s | 0m1.560s | 307
5000 | OPLS/AA | 5m5.631s | 1m37.240s | 0m1.830s | 732
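
To compare the timings numerically, the strings reported by the shell's time command (e.g. 1m9.519s) have to be converted to seconds. The sketch below does this for the AMBER03 user times from the table; the function name and the choice of user time rather than real time are assumptions made for illustration.

```python
import re

# Matches strings such as "0m4.615s" or "3m15.607s".
TIME_PATTERN = re.compile(r"^(\d+)m(\d+(?:\.\d+)?)s$")


def to_seconds(stamp: str) -> float:
    """Convert a 'XmY.YYYs' string into seconds."""
    match = TIME_PATTERN.match(stamp.strip())
    if match is None:
        raise ValueError(f"unexpected time format: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# User times of mdrun with AMBER03, keyed by nsteps (nsteps = 50 failed).
AMBER03_USER = {
    10: "0m2.910s", 100: "0m22.060s", 500: "0m53.700s",
    1000: "0m52.990s", 5000: "0m53.020s",
}
AMBER03_USER_SECONDS = {n: to_seconds(t) for n, t in AMBER03_USER.items()}
```

The converted values show the plateau directly: once the minimum is reached after 242 steps, raising nsteps from 500 to 5000 changes the user time by less than one second.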


Results of the different Force Fields

The results for the different force fields and different nsteps are listed in the following file for reasons of clarity: File:Glucocerebdosidase gromacs forcefields.pdf. The results are additionally visualized in the figures below.


AMBER03

Figure 8: AMBER03 with 10 nsteps.
Figure 9: AMBER03 with 100 nsteps.
Figure 10: AMBER03 with 500 nsteps.
Figure 11: AMBER03 with 1000 nsteps.
Figure 12: AMBER03 with 5000 nsteps.

CHARMM27


Figure 13: CHARMM27 with 10 nsteps.
Figure 14: CHARMM27 with 50 nsteps.
Figure 15: CHARMM27 with 100 nsteps.
Figure 16: CHARMM27 with 500 nsteps.
Figure 17: CHARMM27 with 1000 nsteps.
Figure 18: CHARMM27 with 5000 nsteps.

OPLS/AA

Figure 19: OPLS/AA with 10 nsteps.
Figure 20: OPLS/AA with 50 nsteps.
Figure 21: OPLS/AA with 100 nsteps.
Figure 22: OPLS/AA with 500 nsteps.
Figure 23: OPLS/AA with 1000 nsteps.
Figure 24: OPLS/AA with 5000 nsteps.

Mutations

For the ten mutated structures, only a simulation with the AMBER03 force field (nsteps 500) was carried out. The table below shows the bond, angle and potential energies of the mutants and the wildtype in kJ/mol, together with the difference of each mutant to the wildtype (mutant minus wildtype).


Mutation | Total Energy Bond | Difference Bond | Total Energy Angle | Difference Angle | Total Energy Potential | Difference Potential
WT | 1037.65 | 0 | 4355.86 | 0 | -48425.9 | 0
1 | 1274.92 | 237.27 | 4255.25 | -100.61 | -47765.7 | 660.2
2 | 1456.65 | 419 | 4236.44 | -119.42 | -47077.9 | 1348
3 | 1231.56 | 193.91 | 4183.17 | -172.69 | -48600.7 | -174.8
4 | 1338.99 | 301.34 | 4240.62 | -115.24 | -47096.1 | 1329.8
5 | 1478.43 | 440.78 | 4311.75 | -44.11 | 77967 | 126392.9
6 | 1308.78 | 271.13 | 4244.41 | -111.45 | -47639.3 | 786.6
7 | 1201.46 | 163.81 | 4258.22 | -97.64 | -47883.1 | 542.8
8 | 1628.70 | 591.05 | 4182.29 | -173.57 | -40120 | 8305.9
9 | 1421.28 | 383.63 | 4230.01 | -125.85 | -47052.5 | 1373.4
10 | 1514.09 | 476.44 | 4283.65 | -72.21 | -46575.7 | 1850.2


Mutation 1: Gly - Ser (Pos. 10)
Energy | Average | Err. Est. | RMSD | Tot-Drift | Unit | Last energy frame
Bond | 1274.92 | 500 | 3967.13 | -3038.46 | kJ/mol | read 306, time 398.000
Angle | 4255.25 | 28 | 265.491 | 71.602 | kJ/mol | read 306, time 398.000
Potential | -47765.7 | 1500 | 6978.27 | -9464.84 | kJ/mol | read 306, time 398.000
Figure 25: GROMACS energy for the first mutation with AMBER03 force field and nsteps 500.

Mutation 2: Asp - Asn (Pos. 24)
Energy | Average | Err. Est. | RMSD | Tot-Drift | Unit | Last energy frame
Bond | 1456.65 | 680 | 4628.95 | -4212.43 | kJ/mol | read 223, time 292.000
Angle | 4236.44 | 39 | 306.459 | -37.8896 | kJ/mol | read 223, time 292.000
Potential | -47077.9 | 1800 | 8037.84 | -11650.4 | kJ/mol | read 223, time 292.000
Figure 26: GROMACS energy for the second mutation with AMBER03 force field and nsteps 500.

Mutation 3: His - Arg (Pos. 60)
Energy | Average | Err. Est. | RMSD | Tot-Drift | Unit | Last energy frame
Bond | 1231.56 | 460 | 3764.8 | -2712.35 | kJ/mol | read 340, time 440.000
Angle | 4183.17 | 25 | 249.669 | 93.2215 | kJ/mol | read 340, time 440.000
Potential | -48600.7 | 1400 | 6639.68 | -8746.39 | kJ/mol | read 340, time 440.000
Figure 27: GROMACS energy for the third mutation with AMBER03 force field and nsteps 500.

Mutation 4: Arg - Gln (Pos. 120)
Energy | Average | Err. Est. | RMSD | Tot-Drift | Unit | Last energy frame
Bond | 1338.99 | 580 | 4237.53 | -3506.39 | kJ/mol | read 267, time 350.000
Angle | 4240.62 | 32 | 282.449 | 33.6883 | kJ/mol | read 267, time 350.000
Potential | -47096.1 | 1600 | 7399.52 | -10330.7 | kJ/mol | read 267, time 350.000
Figure 28: GROMACS energy for the fourth mutation with AMBER03 force field and nsteps 500.

Mutation 5: Pro - Leu (Pos. 182)
Energy | Average | Err. Est. | RMSD | Tot-Drift | Unit | Last energy frame
Bond | 1478.43 | 700 | 4792.41 | -4089.05 | kJ/mol | read 351, time 449.000
Angle | 4311.75 | 35 | 287.097 | -52.7644 | kJ/mol | read 351, time 449.000
Potential | 77967 | 130000 | 2.2001e+06 | -760308 | kJ/mol | read 351, time 449.000
Figure 29: GROMACS energy for the fifth mutation with AMBER03 force field and nsteps 500.

Mutation 6: Ser - Asn (Pos. 271)
Energy | Average | Err. Est. | RMSD | Tot-Drift | Unit | Last energy frame
Bond | 1308.78 | 550 | 4136.28 | -3263.06 | kJ/mol | read 281, time 364.000
Angle | 4244.41 | 32 | 277.965 | 43.6715 | kJ/mol | read 281, time 364.000
Potential | -47639.3 | 1600 | 7314.5 | -10123.2 | kJ/mol | read 281, time 364.000
Figure 30: GROMACS energy for the sixth mutation with AMBER03 force field and nsteps 500.

Mutation 7: Asn - Ser (Pos. 370)
Energy | Average | Err. Est. | RMSD | Tot-Drift | Unit | Last energy frame
Bond | 1201.46 | 430 | 3661.19 | -2529.9 | kJ/mol | read 360, time 461.000
Angle | 4258.22 | 26 | 246.136 | 110.294 | kJ/mol | read 360, time 461.000
Potential | -47883.1 | 1300 | 6455.75 | -8370.77 | kJ/mol | read 360, time 461.000
Figure 31: GROMACS energy for the seventh mutation with AMBER03 force field and nsteps 500.

Mutation 8: His - Arg (Pos. 311)
Energy | Average | Err. Est. | RMSD | Tot-Drift | Unit | Last energy frame
Bond | 1628.7 | 850 | 5363.95 | -5160.93 | kJ/mol | read 269, time 351.000
Angle | 4182.29 | 42 | 309.833 | -56.0265 | kJ/mol | read 269, time 351.000
Potential | -40120 | 9100 | 96292.1 | -57418.1 | kJ/mol | read 269, time 351.000
Figure 32: GROMACS energy for the eighth mutation with AMBER03 force field and nsteps 500.

Mutation 9: Met - Val (Pos. 416)
Energy | Average | Err. Est. | RMSD | Tot-Drift | Unit | Last energy frame
Bond | 1421.28 | 650 | 4522.24 | -4029.74 | kJ/mol | read 234, time 307.000
Angle | 4230.01 | 38 | 300.113 | -29.8853 | kJ/mol | read 234, time 307.000
Potential | -47052.5 | 1700 | 7880.56 | -11371.6 | kJ/mol | read 234, time 307.000
Figure 33: GROMACS energy for the ninth mutation with AMBER03 force field and nsteps 500.

Mutation 10: Leu - Pro (Pos. 470)
Energy | Average | Err. Est. | RMSD | Tot-Drift | Unit | Last energy frame
Bond | 1514.09 | 740 | 4792.64 | -4608.11 | kJ/mol | read 208, time 275.000
Angle | 4283.65 | 49 | 321.438 | -149.588 | kJ/mol | read 208, time 275.000
Potential | -46575.7 | 1900 | 8664.13 | -12790.7 | kJ/mol | read 208, time 275.000
Figure 34: GROMACS energy for the tenth mutation with AMBER03 force field and nsteps 500.
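
The per-mutation values above come from the energy summary printed by GROMACS. Collecting them for the comparison table can be automated with a small parser. The sketch below assumes whitespace-separated summary lines of the form "term average err.est. rmsd tot-drift (unit)" with a single-word energy term; the function name and the dictionary keys are hypothetical.

```python
def parse_energy_line(line: str) -> dict:
    """Parse one summary line, e.g. 'Bond 1274.92 500 3967.13 -3038.46 (kJ/mol)'."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"unexpected energy line: {line!r}")
    # float() also handles scientific notation such as 2.2001e+06.
    average, err_est, rmsd, drift = (float(x) for x in fields[1:5])
    return {
        "term": fields[0],
        "average": average,
        "err_est": err_est,
        "rmsd": rmsd,
        "tot_drift": drift,
    }


# Potential energy averages of the wild type and of mutation 1, as listed above.
WILD_TYPE_POTENTIAL = -48425.9
MUTATION_1 = parse_energy_line(
    "Potential -47765.7 1500 6978.27 -9464.84 (kJ/mol)"
)
MUTATION_1_DIFFERENCE = round(MUTATION_1["average"] - WILD_TYPE_POTENTIAL, 1)
```

The resulting difference for Mutation 1 matches the value in the comparison table (660.2 kJ/mol).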



Discussion

References