Structure-based mutation analysis BCKDHA

Structure selection

The following table presents the PDB structures for BCKDHA to date:

PDB ID	Resolution [Å]	R-factor	pH
1DTW	2.70	0.224	7.5*
1OLS	1.85	0.172	5.5
1OLU	1.90	0.161	5.5
1OLX	2.25	0.161	5.5
1U5B	1.83	0.156	5.8
1V11	1.95	0.139*	5.5
1V16	1.90	0.132*	5.5
1V1M	2.00	0.130*	5.5
1V1R	1.80	0.158	5.5
1WCI	1.84	0.149	5.5
1X7W	1.73	0.148	5.8
1X7X	2.10	0.149	5.8
1X7Y	1.57	0.150	5.8
1X7Z	1.72	0.154	5.8
1X80	2.00	0.161	5.8
2BEU	1.89	0.171	5.5
2BEV	1.80	0.139	5.5
2BEW	1.79	0.147	5.5
2BFB	1.77	0.145	5.5
2BFC	1.64	0.144	5.5
2BFD	1.39*	0.150	5.5
2BFE	1.69	0.150	5.5
2BFF	1.46	0.150	5.5
2J9F	1.88	0.171	5.5

The values marked with an asterisk are the ones that meet the requested experimental quality for that criterion. As one can see, no single structure fulfills all conditions.

Furthermore, none of the PDB structures for BCKDHA could be used directly, because all of them have gaps in the structure, meaning that some residues are missing. We therefore chose the structure with the fewest gaps: 1U5B

resolution: 1.83 Å
R-factor: 0.156
pH: 5.8

This structure had to be modified with additional programs to close the gaps. In addition, the first residues of BCKDHA are missing in 1U5B, which is why the start position corresponds to position 6 of the BCKDHA PDB sequence.

As we can see, none of these values meets the requirements, since the task asked for a structure with a very small R-factor, a pH of 7.4 and a high resolution.
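The selection criteria can be checked against the table with a few lines of Python. This is only a sketch: the tuples are copied from the table above, and the names `STRUCTURES`, `TARGET_PH` and `best_by` are our own choices for illustration.

```python
# (PDB ID, resolution in Å, R-factor, pH), copied from the table above
STRUCTURES = [
    ("1DTW", 2.70, 0.224, 7.5), ("1OLS", 1.85, 0.172, 5.5),
    ("1OLU", 1.90, 0.161, 5.5), ("1OLX", 2.25, 0.161, 5.5),
    ("1U5B", 1.83, 0.156, 5.8), ("1V11", 1.95, 0.139, 5.5),
    ("1V16", 1.90, 0.132, 5.5), ("1V1M", 2.00, 0.130, 5.5),
    ("1V1R", 1.80, 0.158, 5.5), ("1WCI", 1.84, 0.149, 5.5),
    ("1X7W", 1.73, 0.148, 5.8), ("1X7X", 2.10, 0.149, 5.8),
    ("1X7Y", 1.57, 0.150, 5.8), ("1X7Z", 1.72, 0.154, 5.8),
    ("1X80", 2.00, 0.161, 5.8), ("2BEU", 1.89, 0.171, 5.5),
    ("2BEV", 1.80, 0.139, 5.5), ("2BEW", 1.79, 0.147, 5.5),
    ("2BFB", 1.77, 0.145, 5.5), ("2BFC", 1.64, 0.144, 5.5),
    ("2BFD", 1.39, 0.150, 5.5), ("2BFE", 1.69, 0.150, 5.5),
    ("2BFF", 1.46, 0.150, 5.5), ("2J9F", 1.88, 0.171, 5.5),
]

TARGET_PH = 7.4  # pH requested in the task


def best_by(structures, key):
    """Return the PDB ID of the structure that minimises `key`."""
    return min(structures, key=key)[0]


best_resolution = best_by(STRUCTURES, lambda s: s[1])
best_r_factor = best_by(STRUCTURES, lambda s: s[2])
best_ph = best_by(STRUCTURES, lambda s: abs(s[3] - TARGET_PH))

print(best_resolution, best_r_factor, best_ph)
```

Each criterion is won by a different entry, which is why the final choice was based on the number of gaps instead.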

Mapping of the mutations on the crystal structure

Structure of BCKDHA. Violet: mutations, orange: thiamine pyrophosphate binding sites, yellow: metal binding sites.

Hydrogen bonds are interactions between a hydrogen atom and an electronegative atom. Electronegative atoms which often take part in hydrogen bonds are oxygen, nitrogen and fluorine (not present in amino acid side chains). The electronegative atom that accepts the hydrogen is the hydrogen bond acceptor, whereas the hydrogen bond donor is an electronegative atom covalently bonded to the hydrogen atom. Hydrogen bonds are essential for the three-dimensional structures of proteins. They play an important role in the formation of helices and beta-sheets and cause proteins to fold into a specific structure.

Showing hydrogen bonds with PyMOL: A -> find -> polar contacts -> within selection. The respective amino acids were colored by element, so that oxygen is red, nitrogen is blue, hydrogen is white and sulfur is yellow.
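As a rough cross-check outside PyMOL, polar contacts can be approximated by listing nitrogen and oxygen atoms of other residues that lie within a distance cutoff of a residue's own N/O atoms. The sketch below is not PyMOL's algorithm: the 3.5 Å cutoff, the function names and the choice to ignore sulfur and bond angles are all assumptions made for illustration.

```python
import math


def parse_atoms(pdb_text):
    """Parse ATOM/HETATM records using the fixed PDB column layout."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms.append({
                "name": line[12:16].strip(),
                "resname": line[17:20].strip(),
                "chain": line[21],
                "resid": int(line[22:26]),
                "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            })
    return atoms


def polar_contacts(atoms, chain, resid, cutoff=3.5):
    """List N/O atoms of other residues within `cutoff` Å of the residue's N/O atoms."""
    def is_polar(atom):
        # Atom names of nitrogen and oxygen start with N or O (e.g. NE2, OD1).
        return atom["name"][:1] in ("N", "O")

    def in_residue(atom):
        return atom["chain"] == chain and atom["resid"] == resid

    own = [a for a in atoms if in_residue(a) and is_polar(a)]
    others = [a for a in atoms if not in_residue(a) and is_polar(a)]
    contacts = []
    for a in own:
        for b in others:
            distance = math.dist(a["xyz"], b["xyz"])
            if distance <= cutoff:
                contacts.append((a["name"], b["resname"], b["resid"], b["name"], round(distance, 2)))
    return contacts
```

Calling `polar_contacts(parse_atoms(open("bckdha.pdb").read()), "A", 125)` would list candidate partners for position 125, assuming the residue is in chain A. Because only distances are checked, the list can contain contacts that PyMOL would not draw.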

M82L

Hydrogen Bonds for methionine on pos 82 in the wild type structure

Hydrogen Bonds for leucine on pos 82 in the mutated type structure

Comparing the two figures for the wildtype and the mutated amino acid at position 82, no change in the hydrogen bonding network can be observed. This is due to the similar physicochemical properties of these two amino acids. No atom which could serve as an additional hydrogen bond donor or acceptor was introduced or removed.

Q125E

Hydrogen Bonds for glutamine on pos 125 in the wild type structure

Hydrogen Bonds for glutamic acid on pos 125 in the mutated type structure

The substitution of glutamine by glutamic acid changes the side chain properties considerably: the amide NH2 group is replaced by a negatively charged oxygen. The NH2 group, which served as a hydrogen bond donor in the wildtype structure, is no longer present, so the hydrogen bonding network changes with this substitution.

Y166N

Hydrogen Bonds for tyrosine on pos 166 in the wild type structure

Hydrogen Bonds for asparagine on pos 166 in the mutated type structure

Although tyrosine and asparagine both could play a role in the hydrogen bonding network, no hydrogen bond is formed for position 166. Therefore this substitution has no influence on the hydrogen bonding network of the protein.

G249S

Hydrogen Bonds for glycine 249 in the wild type structure

Hydrogen Bonds for serine on pos 249 in the mutated type structure

Introducing a serine on position 249 leads to the formation of several additional hydrogen bonds. Two of the newly established bonds are due to the new hydroxy group which is very likely to participate in hydrogen bonds. Another additional hydrogen bond is formed using the nitrogen atom as a hydrogen bond acceptor.

C264W

Hydrogen Bonds for cysteine on pos 264 in the wild type structure

Hydrogen Bonds for tryptophan on pos 264 in the mutated type structure

Although the amino acids cysteine and tryptophan have very different structures and chemical properties, no change in the hydrogen bonding network occurs.

R265W

Hydrogen Bonds for arginine on pos 265 in the wild type structure

Hydrogen Bonds for tryptophan on pos 265 in the mutated type structure

The mutation from arginine to tryptophan leads to a drastic change in the hydrogen bonding network. Arginine, which contains three nitrogen atoms in its side chain, is removed, and therefore three potential hydrogen bond donors are missing in the mutated protein.

I326T

Hydrogen Bonds for isoleucine on pos 326 in the wild type structure

Hydrogen Bonds for threonine on pos 326 in the mutated type structure

The mutation from isoleucine to threonine does not influence the hydrogen bonding network, although the hydroxyl group of threonine could serve as an additional hydrogen bond donor or acceptor.

F409C

Hydrogen Bonds for phenylalanine on pos 409 in the wild type structure

Hydrogen Bonds for cysteine on pos 409 in the mutated type structure

The phenylalanine side chain in the wildtype protein does not participate in any hydrogen bonds. The mutation to cysteine introduces only a thiol group, which is a weak hydrogen bond donor and acceptor and forms no new hydrogen bonds, so the mutation has no effect on the hydrogen bonding network.

Y438N

Hydrogen Bonds for tyrosine on pos 438 in the wild type structure

Hydrogen Bonds for asparagine on pos 438 in the mutated type structure

The hydrogen bond donor property of the amino acid on position 438 is maintained but the bond seems to be between different sidechains now. This substitution also disturbs the hydrogen bonding network of our protein.

Comparison energies

SCWRL

Before we could use SCWRL we first had to get the sequence of our model: repairPDB bckdha.pdb -seq >> bckdha.seq

Once we had the sequence, we created one file for each mutation. Into each file we copied the contents of bckdha.seq and changed the sequence to lower-case letters. Then we wrote the mutated residue as an upper-case letter.

To run SCWRL we used the command: scwrl -i bckdha.pdb -s mutation1.seq -o mutation1Model.pdb
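Preparing the per-mutation sequence files by hand is error-prone, so the lower-case/upper-case convention described above can be sketched in Python. The function name and the `offset` parameter are our own; `offset` is an assumption meant to cover the case where the model sequence starts later than the reference numbering.

```python
def scwrl_mutation_sequence(sequence, mutation, offset=0):
    """Lower-case the sequence and write the mutated residue in upper case.

    mutation: e.g. "M82L" (wildtype residue, 1-based position, new residue)
    offset:   number of leading residues missing from `sequence` (assumption)
    """
    wildtype, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - 1 - offset
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index].upper() != wildtype:
        raise ValueError(
            f"expected {wildtype} at position {position}, found {sequence[index].upper()}"
        )
    lowered = sequence.lower()
    return lowered[:index] + new.upper() + lowered[index + 1:]


# Toy sequence for illustration only, not the BCKDHA sequence.
print(scwrl_mutation_sequence("ACDMEF", "M4L"))
```

The check against the wildtype residue catches numbering mistakes before SCWRL is run on a wrongly mutated sequence.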

Total minimal energy of the graph

Position	Energy
M82L	642.213
Q125E	616.85
Y166N	616.293
G249S	633.378
C264W	805.257
R265W	710.647
I326T	619.424
F409C	617.305
Y438N	615.951

foldX

To use FoldX we first built a run script. It is important to set the values of <Temperature> and <pH> to those of the protein structure used. These values can be found on the PDB website. Additionally, we had to create a file listing all PDB IDs, one per line (list.txt). We ran the program with the command: Foldx -runfile run.txt > Stout.txt

structure	total energy	difference (wildtype - mutant)
wildtype	401.00	0
M82L	437.88	-36.88
Q125E	431.77	-30.77
Y166N	432.24	-31.24
G249S	432.22	-31.22
C264W	488.43	-87.43
R265W	460.43	-59.43
I326T	432.94	-31.94
F409C	433.33	-32.33
Y438N	431.56	-30.56

After running FoldX we have the total energy for the wildtype protein and for each mutant. The value for the wildtype protein is 401.00, which is already high and suggests that the protein is quite unstable. To find out which mutations have a strong influence on the protein, we look at the energies and especially at the difference between the energy of the mutated protein and that of the wildtype protein. All mutated proteins have a much higher energy than the unmutated protein, which means that they are less stable. The table shows that the mutants can be divided into two groups: the first group has an energy difference of about 31, and the second group has a much higher difference.
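The difference column and the grouping can be reproduced with a short Python sketch. The energies are copied from the FoldX table; the threshold of 50 used to separate the two groups is an assumption of ours and not a value suggested by FoldX.

```python
# Total energies copied from the FoldX table above
FOLDX_ENERGY = {
    "wildtype": 401.00, "M82L": 437.88, "Q125E": 431.77, "Y166N": 432.24,
    "G249S": 432.22, "C264W": 488.43, "R265W": 460.43, "I326T": 432.94,
    "F409C": 433.33, "Y438N": 431.56,
}


def energy_differences(energies, reference="wildtype"):
    """Return reference energy minus mutant energy for every mutant."""
    ref = energies[reference]
    return {
        name: round(ref - value, 2)
        for name, value in energies.items()
        if name != reference
    }


def split_by_threshold(differences, threshold):
    """Split mutants into those below and at or above an absolute difference."""
    small = sorted(m for m, d in differences.items() if abs(d) < threshold)
    large = sorted(m for m, d in differences.items() if abs(d) >= threshold)
    return small, large


differences = energy_differences(FOLDX_ENERGY)
small, large = split_by_threshold(differences, threshold=50.0)  # assumed cutoff
print(large)
```

With this cutoff M82L still falls into the first group, although its difference is somewhat larger than the value of about 31 shared by the other members.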

Minimise

It is important to remove the hydrogens and water before using the program. For this we used the new version of repairPDB from the VirtualBox image. The program can be started with the command: repairPDB bckdha.pdb -nosol out.pdb > Stout.txt
It is also possible to use the old version, but then the command is: repairPDB bckdha.pdb -nosol -noh out.pdb > Stout.txt
It is useful to save the output in a file because it includes the energy.

structure	total energy	difference (wildtype - mutant)
wildtype	-2485.452755	0
M82L	-4253.174790	1767.722015
Q125E	-4080.989512	1595.536757
Y166N	-4354.495238	1869.042483
G249S	-4280.043000	1794.590245
C264W	-3745.313620	1259.860865
R265W	-3989.790625	1504.33787
I326T	-4317.105618	1831.652863
F409C	-4358.528143	1873.075388
Y438N	-4339.778964	1854.326209

Minimise calculates the energy for a mutation by building a new model for each mutation and then calculating the energy of the whole mutated model. The aim of comparing the mutated models with the wildtype is to find out whether a mutation causes a structural change. We superposed each mutated protein with the wildtype and focused on the mutated position. Each picture shows the superposed structures. In the wildtype pictures the unmutated residue is drawn in bold, and in the mutated pictures the mutated residue is drawn in bold. Comparing the two pictures therefore shows whether the mutation changes the structure at this residue.

mutation	wildtype structure	mutated structure
M82L	wildtype	mutation M82L
Q125E	wildtype	mutation Q125E
Y166N	wildtype	mutation Y166N
G249S	wildtype	mutation G249S
C264W	wildtype	mutation C264W
R265W	wildtype	mutation R265W
I326T	wildtype	mutation I326T
F409C	wildtype	mutation F409C
Y438N	wildtype	mutation Y438N

Gromacs

The first part gives general background information on GROMACS and describes how to run the programs. The second part contains the description and analysis of the results.

General

1. fetchpdb

The fetch-pdb script first checks whether it was called with a valid PDB ID. If the entered PDB code has four characters, the script tries to download the PDB file from the server. The successfully downloaded archive is unzipped and everything except the actual PDB file is removed.
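The validation step can be sketched in Python as follows. This is not the fetch-pdb script itself: the pattern (a digit followed by three alphanumeric characters) is stricter than the four-character check described above, and the RCSB download URL is an assumption about which server one would use.

```python
import re

# Classic PDB IDs: one digit (1-9) followed by three letters or digits.
PDB_ID_PATTERN = re.compile(r"[1-9][A-Za-z0-9]{3}")


def is_valid_pdb_id(pdb_id):
    """Return True if `pdb_id` looks like a four-character PDB ID."""
    return PDB_ID_PATTERN.fullmatch(pdb_id) is not None


def download_url(pdb_id):
    """Build a download URL for the PDB file (assumed RCSB file server)."""
    if not is_valid_pdb_id(pdb_id):
        raise ValueError(f"not a valid PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


print(download_url("1u5b"))
```

Rejecting malformed IDs before any network access avoids confusing download errors later in the pipeline.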

2. repairPDB

For repairPDB the following options are available:

-offset value	offset the residue numbering
-chain char	change Chain ID
-ratom	renumber Atoms
-rres	renumber Residues
-noh	remove hydrogens
-het	no change of HETATM to ATOM for AA
-seq	returns protein sequence from AA in pdb file
-seqrs	protein sequence from SEQRES entries
-nosol	just Protein, no solvent OR
-ssw cutoff	print only waters with B-value below cutoff OR
-cleansol	remove overlapping solvent for GROMACS

We ran repairPDB using the following command:

repairPDB bckdha_mod.pdb -noh -nosol > bckdha_clean.pdb

Using this command we removed hydrogens and solvent from our pdb to get just the protein.
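For readers without access to repairPDB, the effect of -noh and -nosol can be approximated in Python. This sketch is not the repairPDB implementation: it only drops water residues (the names in `WATER_NAMES` are an assumption) and hydrogen atoms, and leaves all other records untouched.

```python
WATER_NAMES = {"HOH", "WAT", "H2O", "SOL"}  # assumed residue names for water


def is_hydrogen(line):
    """Decide from the element column, or the atom name if it is missing."""
    element = line[76:78].strip().upper()
    if element:
        return element == "H"
    # Fallback for files without element column: names like HE21 or 1HB.
    # Note that this would misclassify a mercury atom named HG.
    return line[12:16].strip().lstrip("0123456789").startswith("H")


def clean_pdb(pdb_text):
    """Remove water residues and hydrogen atoms from ATOM/HETATM records."""
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            if line[17:20].strip() in WATER_NAMES or is_hydrogen(line):
                continue
        kept.append(line)
    return "\n".join(kept)
```

A call such as `clean_pdb(open("bckdha_mod.pdb").read())` returns the filtered file content as a string, which can then be written to bckdha_clean.pdb.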

3. SCWRL

SCWRL was executed using the following command:

scwrl -i bckdha_mod.pdb -s extractedPDB.seq -o bckdha_scwrl.pdb

SCWRL returned a PDB file that still included HETATM records. These solvent atoms had to be removed before continuing.

4. pdb2gmx

Use the clean PDB file without HETATM records:

pdb2gmx -f bckdha_clean.pdb -o bckdha.gro -p bckdha.top -water tip3p -ff amber03

5. MDP

title = PBSA minimization in vacuum
cpp = /usr/bin/cpp
define = -DFLEXIBLE -DPOSRES
implicit_solvent = GBSA
integrator = steep
emtol = 1.0
nsteps = 500
nstenergy = 1
energygrps = System
ns_type = grid
coulombtype = cut-off
rcoulomb = 1.0
rvdw = 1.0
constraints = none
pbc = no

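Adjust nsteps for the time vs. steps analysis.

integrator	steep: a steepest descent algorithm for energy minimization
emtol	tolerance for the steep integrator: the minimization is converged when the maximum force is smaller than this value
nsteps	maximum number of steps to integrate or minimize, -1 is no maximum
nstenergy	frequency to write energies to the energy file (the last energies are always written)
energygrps	groups to write to the energy file
ns_type	method used for the neighbor search (grid or simple)
coulombtype	treatment of the electrostatic interactions (here a plain cut-off)
rcoulomb	distance for the Coulomb cut-off [nm]
rvdw	distance for the van der Waals (Lennard-Jones) cut-off [nm]
constraints	which bonds are converted to constraints (none: only those defined in the topology)
pbc	periodic boundary conditions (no: none are used)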
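How emtol and nsteps interact can be illustrated with a toy steepest descent in Python. This is a sketch under simplifying assumptions: it uses a fixed step size and a made-up quadratic energy, whereas GROMACS adapts its step size during the minimization. All names are our own.

```python
def steepest_descent(gradient, x0, emtol=1.0, nsteps=500, step=0.001):
    """Minimise until the largest force component is below emtol or nsteps is reached.

    Returns (coordinates, performed steps, converged flag).
    """
    x = list(x0)
    performed = 0
    while True:
        g = gradient(x)
        # The force is the negative gradient, so the magnitudes are identical.
        if max(abs(component) for component in g) < emtol:
            return x, performed, True
        if performed >= nsteps:
            return x, performed, False
        x = [xi - step * gi for xi, gi in zip(x, g)]
        performed += 1


def toy_gradient(x):
    """Gradient of the toy energy E = 50 * (x0^2 + x1^2)."""
    return [100.0 * xi for xi in x]


_, steps_long, converged_long = steepest_descent(toy_gradient, [10.0, -5.0], nsteps=500)
_, steps_short, converged_short = steepest_descent(toy_gradient, [10.0, -5.0], nsteps=50)
print(steps_long, converged_long, steps_short, converged_short)
```

With a generous nsteps the loop stops as soon as the force criterion is met, so fewer steps are performed than allowed; with a small nsteps the limit is reached first and the run ends unconverged.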

6. grompp

grompp -v -f bckdha.mdp -c bckdha.gro -p bckdha.top -o bckdha.tpr

7. System Minimization

mdrun -v -deffnm bckdha 2> mdrun_out.txt

8. Energy analysis

g_energy -f bckdha.edr -o energy_1.xvg
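The resulting .xvg file is plain text: lines starting with # or @ carry comments and plot settings, and the remaining lines hold whitespace-separated numbers. A minimal Python reader, with function names of our own choosing, could look like this:

```python
def read_xvg(text):
    """Return the numeric rows of an .xvg file, skipping # and @ lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        rows.append([float(value) for value in line.split()])
    return rows


def column_average(rows, column):
    """Average of one column (0 is the step/time column, 1 the first energy term)."""
    values = [row[column] for row in rows]
    return sum(values) / len(values)


# Made-up example content, only to show the layout of the file.
EXAMPLE = """# This file was created by g_energy
@    title "Gromacs Energies"
@ s0 legend "Bond"
0.0  9000.0
1.0  6000.0
2.0  3000.0
"""

rows = read_xvg(EXAMPLE)
print(column_average(rows, 1))
```

For a real run one would pass `open("energy_1.xvg").read()` instead of the example string.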

Analysis

Wildtype analysis: nsteps vs time

The table below shows the running time of mdrun for different values of nsteps. It also lists the number of steps that were actually carried out to calculate the energy.

steps	time (real) [s]	time (user) [s]	time (sys) [s]	performed steps
50	5.453	4.730	0.120	50
100	10.393	9.210	0.240	100
500	36.419	30.660	0.780	338
1000	5.261	4.390	0.130	47
2000	10.564	8.500	0.290	93
3000	10.661	8.840	0.230	96
4000	2.620	2.010	0.140	21
5000	3.693	3.300	0.100	35

The following plot shows the correlation between nsteps and the running time for mdrun

Interestingly, the running time does not depend on the value of nsteps, but only on the number of steps that were actually performed. The calculation time depends linearly on the number of performed steps. The number of performed steps, however, does not correlate with the value of nsteps. It is not obvious why the number of performed steps varies so strongly for the different values of nsteps.
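The linear dependency can be quantified with an ordinary least-squares fit of the real time against the number of performed steps. The values are copied from the table above; the helper name `linear_fit` is our own.

```python
import math

# (performed steps, real time in s) copied from the table above
RUNS = [(50, 5.453), (100, 10.393), (338, 36.419), (47, 5.261),
        (93, 10.564), (96, 10.661), (21, 2.620), (35, 3.693)]


def linear_fit(points):
    """Return slope, intercept and Pearson correlation of y against x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    correlation = sxy / math.sqrt(sxx * syy)
    return slope, intercept, correlation


slope, intercept, correlation = linear_fit(RUNS)
print(f"{slope:.3f} s per step, r = {correlation:.4f}")
```

The slope corresponds to roughly a tenth of a second per minimization step on the machine used, and the correlation is close to 1, which supports the linear relationship described above.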

Wildtype analysis: force fields

The different force fields chosen for this task were:

AMBER03

GROMACS Energy for the AMBER03 forcefield using the wildtype bckdha structure.

CHARMM27

GROMACS Energy for the CHARMM27 forcefield using the wildtype bckdha structure.

OPLS-AA

GROMACS Energy for the OPLS-AA forcefield using the wildtype bckdha structure.

Bond Analysis

Force Field	Average	Err. Est.	RMSD	Tot-Drift (kJ/mol)
AMBER03	3072.83	2200	-nan	-13100.2
CHARMM27	3180.46	1700	7382.72	-9958.05
OPLS	2780.55	2100	-nan	-11542.6

Angle Analysis

Force Field	Average	Err. Est.	RMSD	Tot-Drift (kJ/mol)
AMBER03	3616.97	230	-nan	-1295.57
CHARMM27	5018.38	490	1646.81	-2783.35
OPLS	3271.23	340	-nan	-1889.98

Potential Analysis

Force Field	Average	Err. Est.	RMSD	Tot-Drift (kJ/mol)
AMBER03	2.67001e+07	2.6e+07	-nan	-1.60382e+08
CHARMM27	487.479	97199.742	673.043
OPLS	2.38353e+07	2.4e+07	-nan	-1.39932e+08

Mutation analysis

M82L

Energy	Average	Err.Est	RMSD	Tot-Drift (kJ/mol)
Bond	2518.71	1700	6337.97	-10023.3
Angle	3642.41	270	638.624	-1479.34
Potential	5.16e+06	5.1e+06	7.47e+07	-3.13e+07

Q125E

Energy	Average	Err.Est	RMSD	Tot-Drift (kJ/mol)
Bond	2519.85	1700	6351.32	-10027.5
Angle	3626.21	260	618.433	-1418.24
Potential	5.23e+06	5.2e+06	7.5e+07	-3.17e+07

Y166N

Energy	Average	Err.Est	RMSD	Tot-Drift (kJ/mol)
Bond	3029.19	2200	-nan	-12529.5
Angle	3654.58	280	-nan	-1486.71
Potential	7.95e+06	7.8e+06	-nan	-4.67e+07

G249S

Energy	Average	Err.Est	RMSD	Tot-Drift (kJ/mol)
Bond	2775.97	2000	6761.45	-11375.2
Angle	3682.24	300	670.885	-1625.24
Potential	5.96e+06	5.0e+06	8.02e+07	-3.61e+07

C264W

Energy	Average	Err.Est	RMSD	Tot-Drift (kJ/mol)
Bond	3186.75	2300	-nan	-13603.2
Angle	3831.06	370	-nan	-2070.89
Potential	3.41e+07	3.3e+07	-nan	-2.03e+08

R265W

Energy	Average	Err.Est	RMSD	Tot-Drift (kJ/mol)
Bond	2473.43	1700	6385.14	-9741.04
Angle	3726.4	330	827.187	1803.54
Potential	5.36e+06	5.3e+06	7.68e+07	-3.26e+07

I326T

Energy	Average	Err.Est	RMSD	Tot-Drift (kJ/mol)
Bond	3214.03	2300	7364.47	-13490.1
Angle	3738.44	310	698.943	-1792.01
Potential	7.29e+06	6.9e+06	8.86e+07	-4.38e+07

F409C

Energy	Average	Err.Est	RMSD	Tot-Drift (kJ/mol)
Bond	2341.69	1600	6048.14	-9087.07
Angle	3597.89	240	594.267	-1309.54
Potential	4.68e+06	4.7e+06	7.12e+07	-2.85e+07

Y438N

Energy	Average	Err.Est	RMSD	Tot-Drift (kJ/mol)
Bond	3141.2	2300	-nan	-13216.1
Angle	3672.66	290	-nan	-1550.04
Potential	8.33e+06	8.1e+06	-nan	-4.94e+07
