Difference between revisions of "Structure-based mutation analysis BCKDHA"
(→3. SCWRL) |
(→Wildtype analysis: force fields) |
||
(118 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
== Structure selection == |
== Structure selection == |
||
+ | In order to perform a structure based mutation analysis an appropriate protein structure is needed. Therefore one has to check whether the resolution of the structure is high enough, that means the Å-value should be small. The R-factor (reliability factor or residual factor) is another measure to access the quality of a structure. The R-factor rates how well the crystallographic model predicts the observed data from experimental X-ray diffraction. The smaller the R-factor the better. The coverage is the percentage of UniProt sequence present in the sample structure. This value was obtained from [[http://www.ebi.ac.uk/pdbe-srv/view/entry/1u5b/summary.html|PDBe]]. The coverage should be quite high. The pH-value indicates the pH-value at which the structure was resolved. Ideally it should be resolved at physiological pH (7.4). |
||
− | The following table presents the PDB structures for BCKDHA to date: |
||
+ | |||
+ | The following table presents the PDB structures for BCKDHA to date. For each structure the resolution in [Å], the R-factor, coverage and pH-value are listed as well. |
||
+ | |||
{|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
{|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
!PDB id |
!PDB id |
||
!resolution [Å] |
!resolution [Å] |
||
!R-factor |
!R-factor |
||
+ | !% of UniProt sequence present in the sample |
||
− | !coverage |
||
+ | !%residues observed |
||
!ph-value |
!ph-value |
||
|- |
|- |
||
− | |1DTW||2.70||0.224||||7.5* |
+ | |1DTW||2.70||0.224||90%||97%||7.5* |
|- |
|- |
||
− | |1OLS||1.85||0.172||||5.5 |
+ | |1OLS||1.85||0.172||90%||96%||5.5 |
|- |
|- |
||
− | |1OLU||1.90||0.161||||5.5 |
+ | |1OLU||1.90||0.161||90%||91%||5.5 |
|- |
|- |
||
− | |1OLX||2.25||0.161||||5.5 |
+ | |1OLX||2.25||0.161||90%||97%||5.5 |
|- |
|- |
||
− | |1U5B||1.83||0.156||||5.8 |
+ | |1U5B||1.83||0.156||90%||97%||5.8 |
|- |
|- |
||
− | |1V11||1.95||0.139*||||5.5 |
+ | |1V11||1.95||0.139*||90%||92%||5.5 |
|- |
|- |
||
− | |1V16||1.90||0.132*||||5.5 |
+ | |1V16||1.90||0.132*||90%||92%||5.5 |
|- |
|- |
||
− | |1V1M||2.00||0.130*||||5.5 |
+ | |1V1M||2.00||0.130*||90%||93%||5.5 |
|- |
|- |
||
− | |1V1R||1.80||0.158||||5.5 |
+ | |1V1R||1.80||0.158||90%||88%||5.5 |
|- |
|- |
||
− | |1WCI||1.84||0.149||||5.5 |
+ | |1WCI||1.84||0.149||90%||97%||5.5 |
|- |
|- |
||
− | |1X7W||1.73||0.148||||5.8 |
+ | |1X7W||1.73||0.148||90%||92%||5.8 |
|- |
|- |
||
− | |1X7X||2.10||0.149||||5.8 |
+ | |1X7X||2.10||0.149||90%||92%||5.8 |
|- |
|- |
||
− | |1X7Y||1.57||0.150||||5.8 |
+ | |1X7Y||1.57||0.150||90%||92%||5.8 |
|- |
|- |
||
− | |1X7Z||1.72||0.154||||5.8 |
+ | |1X7Z||1.72||0.154||90%||97%||5.8 |
|- |
|- |
||
− | |1X80||2.00||0.161||||5.8 |
+ | |1X80||2.00||0.161||90%||92%||5.8 |
|- |
|- |
||
− | |2BEU||1.89||0.171||||5.5 |
+ | |2BEU||1.89||0.171||90%||97%||5.5 |
|- |
|- |
||
− | |2BEV||1.80||0.139||||5.5 |
+ | |2BEV||1.80||0.139||90%||97%||5.5 |
|- |
|- |
||
− | |2BEW||1.79||0.147||||5.5 |
+ | |2BEW||1.79||0.147||90%||97%||5.5 |
|- |
|- |
||
− | |2BFB||1.77||0.145||||5.5 |
+ | |2BFB||1.77||0.145||90%||92%||5.5 |
|- |
|- |
||
− | |2BFC||1.64||0.144||||5.5 |
+ | |2BFC||1.64||0.144||90%||92%||5.5 |
|- |
|- |
||
− | |2BFD||1.39*||0.150||||5.5 |
+ | |2BFD||1.39*||0.150||90%||93%||5.5 |
|- |
|- |
||
− | |2BFE||1.69||0.150||||5.5 |
+ | |2BFE||1.69||0.150||90%||93%||5.5 |
|- |
|- |
||
− | |2BFF||1.46||0.150||||5.5 |
+ | |2BFF||1.46||0.150||90%||98%||5.5 |
|- |
|- |
||
− | |2J9F||1.88||0.171||||5.5 |
+ | |2J9F||1.88||0.171||90%||95%||5.5 |
|} |
|} |
||
+ | The asteriks-marked values indicate that these structures were resolved with the asked experimental quality. As one can see, none of the structures fulfills all conditions. |
||
− | The following PDB Structure was chosen because of its good experimental resolution: <bold></bold> |
||
− | *resultion: |
||
− | *R-factor |
||
− | *ph-value |
||
+ | Furthermode, we could not use any of the PDB structures for BCKDHA because all of them had gaps in the secondary structure which means that some residues were missing. So we took the structure which has the less gaps: '''1U5B'''<br> |
||
− | == Comparison energies == |
||
+ | *resultion: 1.83<br> |
||
+ | *R-factor: 0.156<br> |
||
+ | *ph-value: 5.8 |
||
+ | This structure has to be modified with some programms to close the gaps. Additionally the first residues which are in BCKDHA misses in 1U5B thats why the start position corresponds to position 6 of the BCKDHA -PDB sequence. |
||
+ | |||
+ | As we can see none of the values corresponds to the demands because it was asked for a structure which has a very small R-factor, a pH of 7.4 and a high resolution. |
||
== Mapping of the mutations on the crystal structure == |
== Mapping of the mutations on the crystal structure == |
||
+ | Figure 1 and 2 shows the structure of BCKDHA with mapped mutations. |
||
+ | [[File:BCKDHA_mutations_activeSites.png|thumb|400px|center|Figure 1: Structure of BCKDHA. Violet: mutations, orange: thiamine pyrophosphate binding sites, yellow: metal binding sites.]] |
||
+ | [[File:BCKDHA_surface.png|thumb|400px|center|Figure 2: Surface of BCKDHA. Violet: mutations, orange: thiamine pyrophosphate binding sites, yellow: metal binding sites.]] |
||
+ | As shown in Figure 2 there are some mutations on the surface of the protein, which might have an effect on the stability of the protein. Furthermore it is possible to recognize a mutation which is quite near to the active center. These facts are studied in more detail in the mutation analysis section. |
||
+ | |||
+ | <!-- |
||
+ | ;M82L |
||
+ | {|class="centered" |
||
+ | |[[File:BCKDHA_Wt37.png|thumb|150px|Figure 2: Hydrogen Bonds for methionine on pos 82 in the wild type structure]] |
||
+ | |[[File:BCKDHA_Mut37.png|thumb|150px|Figure 3: Hydrogen Bonds for leucine on pos 82 in the mutated type structure]] |
||
+ | |} |
||
+ | Comparing the figures 2 and 3 for the wildtype and the mutated amino acid on position 82, no change in the hydrogen bonding network can be observed. This is due to the similar physiochemical properties of these two amino acids. No atom which could serve as additional hydrogen-bond donor or acceptor was introduced or removed. |
||
+ | ;Q125E |
||
+ | {|class="centered" |
||
+ | |[[File:BCKDHA_Wt80.png|thumb|150px|Figure 4: Hydrogen Bonds for glutamine on pos 125 in the wild type structure]] |
||
+ | |[[File:BCKDHA_Mut80.png|thumb|150px|Figure 5: Hydrogen Bonds for glutamic acid on pos 125 in the mutated type structure]] |
||
+ | |} |
||
+ | The substitution from glutamine to glutamic acid changes the side chain properties completely. A NH2 group is substituted by a negatively charged oxygen. The NH2 which served in the wildtype structure as a hydrogen bond acceptor (see Figure 4) is not present any more, so the hydrogen bonding network changed for this substitution (compare Figure 4 and 5). |
||
+ | ;Y166N |
||
+ | {|class="centered" |
||
+ | |[[File:BCKDHA_Wt121.png|thumb|150px|Figure 6: Hydrogen Bonds for tyrosine on pos 166 in the wild type structure]] |
||
+ | |[[File:BCKDHA_Mut121.png|thumb|150px|Figure 7: Hydrogen Bonds for asparagine on pos 166 in the mutated type structure]] |
||
+ | |} |
||
+ | Although tyrosine and asparagine both could play a role in the hydrogen bonding network, no hydrogen bond is formed for position 166 (see Figure 6 and 7). Therefore this substitution has no influence on the hydrogen bonding network of the protein. |
||
+ | ;G249S |
||
+ | {|class="centered" |
||
+ | |[[File:BCKDHA_Wt204.png|thumb|150px|Figure 8: Hydrogen Bonds for glycine 249 in the wild type structure]] |
||
+ | |[[File:BCKDHA_Mut204.png|thumb|150px|Figure 9: Hydrogen Bonds for serine on pos 249 in the mutated type structure]] |
||
+ | |} |
||
+ | Introducing a serine on position 249 leads to the formation of several additional hydrogen bonds (see Figure 9). Two of the newly established bonds are due to the new hydroxy group which is very likely to participate in hydrogen bonds. Another additional hydrogen bond is formed using the nitrogen atom as a hydrogen bond acceptor. |
||
+ | ;C264W |
||
+ | {|class="centered" |
||
+ | |[[File:BCKDHA_Wt219.png|thumb|150px|Figure 10: Hydrogen Bonds for cysteine on pos 264 in the wild type structure]] |
||
+ | |[[File:BCKDHA_Mut219.png|thumb|150px|Figure 11: Hydrogen Bonds for tryptophan on pos 264 in the mutated type structure]] |
||
+ | |} |
||
+ | Although the amino acids cysteine and tryptophan have very different structures and chemical properties, no change in the hydrogen bonding network occurs (compare Figure 10 and 11). |
||
+ | ;R265W |
||
+ | {|class="centered" |
||
+ | |[[File:BCKDHA_Wt220.png|thumb|150px|Figure 12: Hydrogen Bonds for arginine on pos 265 in the wild type structure]] |
||
+ | |[[File:BCKDHA_Mut220.png|thumb|150px|Figure 13: Hydrogen Bonds for tryptophan on pos 265 in the mutated type structure]] |
||
+ | |} |
||
+ | The mutation from arginine to tryptophan leads to a drastic change in the hydrogen bonding network. Arginine, which contains three nitrogen atoms in its side chain is removed and therefore three hydrogen bond acceptors (see Figure 12) are missing in the mutated protein (Figure 13). |
||
+ | ;I326T |
||
+ | {|class="centered" |
||
+ | |[[File:BCKDHA_Wt281.png|thumb|150px|Figure 14: Hydrogen Bonds for isoleucine on pos 326 in the wild type structure]] |
||
+ | |[[File:BCKDHA_Mut281.png|thumb|150px|Figure 15: Hydrogen Bonds for threonine on pos 326 in the mutated type structure]] |
||
+ | |} |
||
+ | The mutation from isoleucine to threonine doesn't have an influence on the hydrogen bonding network, although the oxygen atom of threonine could serve as an additional hydrogen bond donor (compare Figure 14 and 15). |
||
+ | ;F409C |
||
+ | {|class="centered" |
||
+ | |[[File:BCKDHA_Wt364.png|thumb|150px|Figure 16: Hydrogen Bonds for phenylalanine on pos 409 in the wild type structure]] |
||
+ | |[[File:BCKDHA_Mut364.png|thumb|150px|Figure 17: Hydrogen Bonds for cysteine on pos 409 in the mutated type structure]] |
||
+ | |} |
||
+ | The phenylalanine side chain in the wildtype protein does not participate in any hydrogen bonds (see Figure 16). The mutation to serine doesn't introduce new hydrogen bonding donors or acceptors, therefore the mutation has no effect on the hydrogen bonding network (see Figure 17). |
||
+ | ;Y438N |
||
+ | {|class="centered" |
||
+ | |[[File:BCKDHA_Wt393.png|thumb|150px|Figure 18: Hydrogen Bonds for tyrosine on pos 438 in the wild type structure]] |
||
+ | |[[File:BCKDHA_Mut393.png|thumb|150px|Figure 19: Hydrogen Bonds for asparagine on pos 438 in the mutated type structure]] |
||
+ | |} |
||
+ | The hydrogen bond donor property of the amino acid on position 438 is maintained but the bond seems to be between different sidechains now (Compare Figure 18 and 19). This substitution also disturbs the hydrogen bonding network of our protein. |
||
+ | --> |
||
+ | |||
+ | == Comparison energies == |
||
=== SCWRL === |
=== SCWRL === |
||
Before we could use SCWRL we first had to get the sequence of our model: |
Before we could use SCWRL we first had to get the sequence of our model: |
||
− | repairPDB bckdha.pdb -seq >> bckdha.seq |
+ | repairPDB bckdha.pdb -seq >> bckdha.seq |
When we have the sequence we have to make one file for each mutation. In these files we copied the bckdha.seq and changed the sequence to lower case letters. Then we add the mutation in an upper case letter. |
When we have the sequence we have to make one file for each mutation. In these files we copied the bckdha.seq and changed the sequence to lower case letters. Then we add the mutation in an upper case letter. |
||
To run SCWRL we used the command: |
To run SCWRL we used the command: |
||
− | scwrl -i bckdha.pdb -s mutation1.seq -o mutation1Model.pdb |
+ | scwrl -i bckdha.pdb -s mutation1.seq -o mutation1Model.pdb |
Line 107: | Line 177: | ||
=== foldX === |
=== foldX === |
||
+ | To use foldX we first build a [[runscript for foldx |runscript]]. It is important to change values of <Temperature> and <pH> to the values of the used protein. These values can be found on the [http://www.pdb.org pdb side] . |
||
− | To use foldX we first build a runscript. |
||
− | Additionally we had to create one file with all PDB Ids each in a new line (list.txt). |
+ | Additionally we had to create one file with all PDB Ids each in a new line (list.txt). We used the command <code> Foldx -runfile run.txt > Stout.txt </code> to run the programm.<br> |
− | |||
− | |||
− | <TITLE>FOLDX_runscript; |
||
− | <JOBSTART>#; |
||
− | <PDBS>#; |
||
− | <BATCH>foldx_protein.txt; |
||
− | <COMMANDS>FOLDX_commandfile; |
||
− | <Stability>list.txt; |
||
− | <END>#; |
||
− | <OPTIONS>FOLDX_optionfile; |
||
− | //<Temperature>298; |
||
− | <R>#; |
||
− | <pH>5.5; |
||
− | <IonStrength>0.050; |
||
− | <water>-CRYSTAL; |
||
− | <metal>-CRYSTAL; |
||
− | <VdWDesign>2; |
||
− | <OutPDB>false; |
||
− | <pdb_hydrogens>false; |
||
− | <END>#; |
||
− | <JOBEND>#; |
||
− | <ENDFILE>#; |
||
− | |||
Line 141: | Line 188: | ||
|wildtype||401.00||0 |
|wildtype||401.00||0 |
||
|- |
|- |
||
− | | |
+ | |M82L||437.88||-36.88 |
|- |
|- |
||
− | | |
+ | |Q125E||431.77||-30.77 |
|- |
|- |
||
− | | |
+ | |Y166N||432.24||-31.24 |
|- |
|- |
||
− | | |
+ | |G249S||432.22||-31.22 |
|- |
|- |
||
− | | |
+ | |C264W||488.43||-87.43 |
|- |
|- |
||
− | | |
+ | |R265W||460.43||-59.43 |
|- |
|- |
||
− | | |
+ | |I326T||432.94||-31.94 |
|- |
|- |
||
− | | |
+ | |F409C||433.33||-32.33 |
|- |
|- |
||
− | | |
+ | |Y438N||431.56||-30.56 |
|} |
|} |
||
+ | After using foldx we have the total energy for the wiltype protein and for each mutation. The value of the wildtype protein is 401.00 which is already a high value. This means that the protein is quite instabile. To find out which mutation has a high influence on the protein we look at the energies and especially on the difference between the energy of the mutated protein and the wildtype protein. All of the mutated proteins have a much higher energy than the unmutated protein which means that these proteins are less stable. We can see in the table that the proteins can be divided into two groups. The first group has an energy difference of about 31 and the other group has a much higher difference. |
||
− | === gromacs === |
||
+ | |||
+ | === Minimise === |
||
+ | |||
+ | It is important to remove the hydrogens and water before using the programm. For this we used the new version of repairPDB of the virtualbox. The programm can be started with the command: |
||
+ | <code>repairPDB bckdha.pdb -nosol out.pdb > Stout.txt </code> <br> |
||
+ | It is also possible to use the old version but then the command is: |
||
+ | <code>repairPDB bckdha.pdb -nosol -noh out.pdb > Stout.txt </code> <br> |
||
+ | It is useful to save the output in a file because it includes the energy. |
||
+ | |||
+ | |||
+ | {|border="1" |
||
+ | ! |
||
+ | !total energy |
||
+ | !difference |
||
+ | |- |
||
+ | |wildtype|| -2485.452755|| 0 |
||
+ | |- |
||
+ | |M82L|| -4253.174790||1767.722015 |
||
+ | |- |
||
+ | |Q125E|| -4080.989512|| 1595.536757 |
||
+ | |- |
||
+ | |Y166N|| -4354.495238|| 1869.042483 |
||
+ | |- |
||
+ | |G249S|| -4280.043000||1794.590245 |
||
+ | |- |
||
+ | |C264W|| -3745.313620||1259.860865 |
||
+ | |- |
||
+ | |R265W|| -3989.790625||1504.33787 |
||
+ | |- |
||
+ | |I326T|| -4317.105618||1831.652863 |
||
+ | |- |
||
+ | |F409C|| -4358.528143||1873.075388 |
||
+ | |- |
||
+ | |Y438N|| -4339.778964||1854.326209 |
||
+ | |} |
||
+ | |||
+ | |||
+ | Minimise calculates the energy for a mutation by building a new model for each mutation. Then it calculates the energy for the whole mutated model. The aim by comparing the mutated models with the wildtype is to find out if there is a structural change caused by a mutation. The calculated energies are discussed in more detail in the mutation analysis section (below). |
||
+ | |||
+ | <!-- |
||
+ | We superposed each mutated protein with the wildtype and focused on the mutated position. The figures 20-37 show the superimposed structures. The pictures showing the wildtype structure display the unmutated residue in bold and vice versa. So we can compare the two pictures to see if there is a change in the structure caused by the mutation of this residue. |
||
+ | |||
+ | |||
+ | {|border="1" |
||
+ | !mutation |
||
+ | !wildtype structure |
||
+ | !mutated structure |
||
+ | |- |
||
+ | |M82L||[[File:Wildtype1_BCKDHAMinimise.png|thumb|Figure 20: Methionine on pos 82]]||[[File:Mutation1_BCKDHAMinimise.png|thumb|Figure 21: Mutation M82L]] |
||
+ | |- |
||
+ | |Q125E||[[File:Wildtype2_BCKDHAMinimise.png|thumb|Figure 22: Glutamine on pos 125]]||[[File:Mutation2_BCKDHAMinimise.png|thumb|Figure 23: Mutation Q125E]] |
||
+ | |- |
||
+ | |Y166N||[[File:Wildtype3_BCKDHAMinimise.png|thumb|Figure 24: Tyrosine on pos 166]]||[[File:Mutation3_BCKDHAMinimise.png|thumb|Figure 25: Mutation Y166N]] |
||
+ | |- |
||
+ | |G249S||[[File:Wildtype4_BCKDHAMinimise.png|thumb|Figure 26: Glycine on pos 249]]||[[File:Mutation4_BCKDHAMinimise.png|thumb|Figure 27: Mutation G249S]] |
||
+ | |- |
||
+ | |C264W||[[File:Wildtype5_BCKDHAMinimise.png|thumb|Figure 28: Cysteine on pos 264]]||[[File:Mutation5_BCKDHAMinimise.png|thumb|Figure 29: Mutation C264W]] |
||
+ | |- |
||
+ | |R265W||[[File:Wildtype6_BCKDHAMinimise.png|thumb|Figure 30: Arginine on pos 265]]||[[File:Mutation6_BCKDHAMinimise.png|thumb|Figure 31: Mutation R265W]] |
||
+ | |- |
||
+ | |I326T||[[File:Wildtype7_BCKDHAMinimise.png|thumb|Figure 32: Isoleucine on pos 326]]||[[File:Mutation7_BCKDHAMinimise.png|thumb|Figure 33: Mutation I326T]] |
||
+ | |- |
||
+ | |F409C||[[File:Wildtype8_BCKDHAMinimise.png|thumb|Figure 34: Phenylalanine on pos 409]]||[[File:Mutation8_BCKDHAMinimise.png|thumb|Figure 35: Mutation F409C]] |
||
+ | |- |
||
+ | |Y438N||[[File:Wildtype9_BCKDHAMinimise.png|thumb|Figure 36: Tyrosine on pos 438]]||[[File:Mutation9_BCKDHAMinimise.png|thumb|Figure 37: Mutation Y438N]] |
||
+ | |} |
||
+ | --> |
||
== Gromacs == |
== Gromacs == |
||
+ | The first part describes general background information for gromacs as well as how to run those programs. |
||
− | ===1. fetchpdb=== |
||
+ | The second part contains the result description and analysis. |
||
+ | === General === |
||
+ | |||
+ | ====1. fetchpdb==== |
||
The fetch-pdb script first checks, if it was called with an valid PDB-id. If the entered PDB code has 4letters, the script tries to download the pdb-file from the server. The successfully downloaded folder gets unzipped and everything except the actual pdb file is removed. |
The fetch-pdb script first checks, if it was called with an valid PDB-id. If the entered PDB code has 4letters, the script tries to download the pdb-file from the server. The successfully downloaded folder gets unzipped and everything except the actual pdb file is removed. |
||
− | ===2. repairPDB=== |
+ | ====2. repairPDB==== |
+ | For repairPDB the following options are available: |
||
− | ===3. SCWRL === |
||
+ | {| |
||
− | scwrl -i bckdha_mod.pdb -s extractedPDB.seq -o bckdha_scwrl.pdb |
||
+ | | -offset value || offset the residue numbering |
||
+ | |- |
||
+ | | -chain char||change Chain ID |
||
+ | |- |
||
+ | | -ratom || renumber Atoms |
||
+ | |- |
||
+ | | -rres || renumber Residues |
||
+ | |- |
||
+ | | -noh ||remove hydrogens |
||
+ | |- |
||
+ | | -het ||no change of HETATM to ATOM for AA |
||
+ | |- |
||
+ | | -seq || returns protein sequence from AA in pdb file |
||
+ | |- |
||
+ | | -seqrs || protein sequence from SEQRES entries |
||
+ | |- |
||
+ | | -nosol || just Protein, no solvent OR |
||
+ | |- |
||
+ | | -ssw cutoff || print only waters with B-value below cutoff OR |
||
+ | |- |
||
+ | | -cleansol || remove overlapping solvent for GROMACS |
||
+ | |} |
||
− | pdb including HEATOMS |
||
+ | We run repairPDB using the following command: |
||
− | ===4.pdb2gmx=== |
||
− | <!-- pdb2gmx -f bckdha_mod.pdb -o bckdha.gro -p topol.top -water tip3p -ff amber03 |
||
− | --> |
||
+ | repairPDB bckdha_mod.pdb -noh -nosol > bckdha_clean.pdb |
||
− | Warning: Starting residue K456 in chain not identified as Protein/RNA/DNA. |
||
− | Warning: Starting residue MN458 in chain not identified as Protein/RNA/DNA. |
||
− | Warning: Starting residue TDP556 in chain not identified as Protein/RNA/DNA. |
||
+ | Using this command we removed hydrogens and solvent from our pdb to get just the protein. |
||
− | und später dann: |
||
− | Fatal error: |
||
− | Residue 'MN' not found in residue topology database |
||
+ | ====3. SCWRL ==== |
||
+ | SCWRL was executed using the following command: |
||
+ | scwrl -i bckdha_mod.pdb -s extractedPDB.seq -o bckdha_scwrl.pdb |
||
− | removed first HAtoms: |
||
+ | |||
− | K, MN and TDP |
||
+ | SCWRL returned a pdb including HETATOMS. |
||
− | -> worked |
||
+ | These solvent atoms needed to be removed before continuing. |
||
+ | |||
+ | ====4.pdb2gmx==== |
||
+ | |||
+ | (use clean pdb without HEATOMS!) |
||
+ | |||
+ | pdb2gmx -f bckdha_clean.pdb -o bckdha.gro -p bckdha.top -water tip3p -ff amber03 |
||
+ | |||
+ | ====5. MDP==== |
||
− | ===5. MDP=== |
||
− | <!-- |
||
<code> |
<code> |
||
− | title = PBSA minimization in vacuum |
+ | title = PBSA minimization in vacuum |
− | cpp = /usr/bin/cpp |
+ | cpp = /usr/bin/cpp |
− | define = -DFLEXIBLE -DPOSRES |
+ | define = -DFLEXIBLE -DPOSRES |
− | implicit_solvent = GBSA |
+ | implicit_solvent = GBSA |
− | integrator = steep |
+ | integrator = steep |
− | emtol = 1.0 |
+ | emtol = 1.0 |
− | nsteps = 500 |
+ | nsteps = 500 |
− | nstenergy = 1 |
+ | nstenergy = 1 |
− | energygrps = System |
+ | energygrps = System |
− | ns_type = grid |
+ | ns_type = grid |
− | coulombtype = cut-off |
+ | coulombtype = cut-off |
− | rcoulomb = 1.0 |
+ | rcoulomb = 1.0 |
− | rvdw = 1.0 |
+ | rvdw = 1.0 |
− | constraints = none |
+ | constraints = none |
− | pbc = no |
+ | pbc = no |
− | </code |
+ | </code> |
+ | |||
+ | adjust nsteps for the time vs steps analysis |
||
{| |
{| |
||
Line 224: | Line 368: | ||
|energygrps ||groups to write to energy files |
|energygrps ||groups to write to energy files |
||
|- |
|- |
||
+ | |ns_type|| Which atoms to check when construction a new neighbor list |
||
− | |ns_type|| |
||
|- |
|- |
||
+ | |coulombtype || which coulomb-type to use, e.g. Cut-off, PME, Ewald |
||
− | |coulombtype |
||
|- |
|- |
||
− | |rcoulomb |
+ | |rcoulomb ||distance for the Coulomb cut-off |
|- |
|- |
||
+ | |rvdw ||Van-der-Waals cut-of (distance for the LJ or Buckingham cut-off) |
||
− | |rvdw |
||
|- |
|- |
||
+ | |constraints || constraints for bonds, e.g bonds are presented by a potention, all bonds are converted to constraints,... |
||
− | |constraints |
||
|- |
|- |
||
+ | |pbc || Remove the periodicity (make molecule whole again) |
||
− | |pbc |
||
+ | |} |
||
− | --> |
||
− | ===6.grompp=== |
+ | ====6. grompp==== |
− | grompp -v -f |
+ | grompp -v -f bckdha.mdp -c bckdha.gro -p bckdha.top -o bckdha.tpr |
+ | ====7. System Minimization ==== |
||
+ | mdrun -v -deffnm bckdha 2> mdrun_out.txt |
||
+ | ====8. Analyzation==== |
||
+ | g_energy -f bckdha.edr -o energy_1.xvg |
||
− | processing coordinates... |
||
− | double-checking input for internal consistency... |
||
− | Reading position restraint coords from bckdha.gro |
||
− | renumbering atomtypes... |
||
+ | === Analysis === |
||
− | GB parameter(s) missing or negative for atom type 'OW' |
||
+ | ====Wildtype analysis: nsteps vs time ==== |
||
+ | The table below shoes the running time for mdrun depending on different values for nsteps. It also lists the real number of steps carried out to calculate the energy. |
||
− | GB parameter(s) missing or negative for atom type 'HW' |
||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
− | ------------------------------------------------------- |
||
+ | !steps |
||
− | Program grompp, VERSION 4.5.3 |
||
+ | !time (real) [s] |
||
− | Source code file: /build/buildd/gromacs-4.5.3/src/kernel/grompp.c, line: 1123 |
||
+ | !time (user) [s] |
||
+ | !time (sys) [s] |
||
+ | !performed steps |
||
+ | |- |
||
+ | |50|| 5.453|| 4.730|| 0.120 ||50 |
||
+ | |- |
||
+ | |100|| 10.393|| 9.210|| 0.240 ||100 |
||
+ | |- |
||
+ | |500|| 36.419|| 30.660|| 0.780|| 338 |
||
+ | |- |
||
+ | |1000|| 5.261|| 4.390|| 0.130|| 47 |
||
+ | |- |
||
+ | |2000|| 10.564|| 8.500|| 0.290|| 93 |
||
+ | |- |
||
+ | |3000|| 10.661|| 8.840|| 0.230|| 96 |
||
+ | |- |
||
+ | |4000|| 2.620|| 2.010|| 0.140|| 21 |
||
+ | |- |
||
+ | |5000|| 3.693|| 3.300|| 0.100|| 35 |
||
+ | |- |
||
+ | |} |
||
+ | Figure 3 shows the correlation between nsteps and the running time for mdrun |
||
− | Fatal error: |
||
+ | [[File:BCKDHA_nstepsVsTime.png|300px|thumb|center|Figure 3: nsteps vs running time for mdrun]] |
||
− | Can't do GB electrostatics; the implicit_genborn_params section of the forcefield is missing parameters for 2 atomtypes or they might be negative. |
||
+ | Interestingly, the running time is not dependent on the number of nsteps, but just on the number of really performed steps. There is a linear dependency between the calculation time and the number of performed steps. The number of performed steps however is not correlating with the value for nsteps. |
||
− | For more information and tips for troubleshooting, please check the GROMACS |
||
+ | It is not obvious why the number of performed steps varies so extremely given a certain value for nsteps. |
||
− | website at http://www.gromacs.org/Documentation/Errors |
||
− | ------------------------------------------------------- |
||
+ | ==== Wildtype analysis: force fields==== |
||
− | ===7.System Minimization === |
||
+ | The different force fields chosen for this task were: |
||
− | ===8.Analyzation=== |
||
+ | *AMBER03 |
||
+ | [[File:Energy_bckdha_amber03.png|thumb|100px|right|Figure 4: GROMACS Energy for the AMBER03 forcefield using the wildtype bckdha structure.]] |
||
+ | *CHARMM27 |
||
+ | [[File:Energy_bckdha_charmm27.png|thumb|100px|right|Figure 5: GROMACS Energy for the CHARMM27 forcefield using the wildtype bckdha structure.]] |
||
+ | *OPLS-AA |
||
+ | [[File:Energy_bckdha_opls.png|thumb|100px|right|Figure 6: GROMACS Energy for the OPLS-AA forcefield using the wildtype bckdha structure.]] |
||
+ | '''Bond Analysis''' |
||
+ | |||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
+ | !Force Field |
||
+ | !Average |
||
+ | !Err. Est. |
||
+ | !RMSD |
||
+ | !Tot-Drift (kJ/mol) |
||
+ | |- |
||
+ | |AMBER03 ||3072.83||2200||-nan||-13100.2 |
||
+ | |- |
||
+ | |CHARMM25 ||3180.46||1700||7382.72||-9958.05 |
||
+ | |- |
||
+ | |OPLS ||2780.55||2100||-nan||-11542.6 |
||
+ | |} |
||
+ | |||
+ | '''Angle Analysis''' |
||
+ | |||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
+ | !Force Field |
||
+ | !Average |
||
+ | !Err. Est. |
||
+ | !RMSD |
||
+ | !Tot-Drift (kJ/mol) |
||
+ | |- |
||
+ | |AMBER03 ||3616.97||230||-nan||-1295.57 |
||
+ | |- |
||
+ | |CHARMM25 ||5018.38||490||1646.81||-2783.35 |
||
+ | |- |
||
+ | |OPLS ||3271.23||340||-nan||-1889.98 |
||
+ | |} |
||
+ | |||
+ | '''Potential Analysis''' |
||
+ | |||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
+ | !Force Field |
||
+ | !Average |
||
+ | !Err. Est. |
||
+ | !RMSD |
||
+ | !Tot-Drift (kJ/mol) |
||
+ | |- |
||
+ | |AMBER03 ||2.67001e+07||2.6e+07||-nan||-1.60382e+08 |
||
+ | |- |
||
+ | |CHARMM25 ||487.479||97||199.742||673.043 |
||
+ | |- |
||
+ | |OPLS ||2.38353e+07||2.4e+07||-nan||-1.39932e+08 |
||
+ | |} |
||
+ | |||
+ | Comparing the results for the different force fields it is noticeable that only Charmm27 produces values for the RMSD. The values calculated by AMBER03 and OPLS are mostly similar, only the CHARMM27 values vary totally. |
||
+ | |||
+ | Figure 4 shows the energy calculated by Gromacs using the AMBER03 forcefield applied to the wildtype protein structure. Figure 5 and 6 are showing the energy for the Charmm25 and the OPLS-AA force fields, respectively. |
||
+ | |||
+ | === Mutation analysis === |
||
+ | |||
+ | In order to discuss the effect of different mutations more in detail, one page for each mutation was created: |
||
+ | *[[M82L]] |
||
+ | *[[Q125E]] |
||
+ | *[[Y166N]] |
||
+ | *[[G249S]] |
||
+ | *[[C264W]] |
||
+ | *[[R265W]] |
||
+ | *[[I326T]] |
||
+ | *[[F409C]] |
||
+ | *[[Y438N]] |
||
+ | |||
+ | <!-- |
||
+ | ==== M82L ==== |
||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
+ | !Energy |
||
+ | !Average |
||
+ | !Err.Est |
||
+ | !RMSD |
||
+ | !Tot-Drift (kJ/mol) |
||
+ | |- |
||
+ | |Bond || 2518.71 || 1700||6337.97||-10023.3 |
||
+ | |- |
||
+ | |Angle || 3642.41||270|| 638.624||-1479.34 |
||
+ | |- |
||
+ | |Potential || 5.16e+06 ||5.1e+06||7.47e+07||-3.13e+07 |
||
+ | |} |
||
+ | |||
+ | ==== Q125E ==== |
||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
+ | !Energy |
||
+ | !Average |
||
+ | !Err.Est |
||
+ | !RMSD |
||
+ | !Tot-Drift (kJ/mol) |
||
+ | |- |
||
+ | |Bond || 2519.85 ||1700||6351.32||-10027.5 |
||
+ | |- |
||
+ | |Angle || 3626.21 ||260|| 618.433||-1418.24 |
||
+ | |- |
||
+ | |Potential || 5.23e+06|| 5.2e+06||7.5e+07||-3.17e+07 |
||
+ | |} |
||
+ | |||
+ | ==== Y166N ==== |
||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
+ | !Energy |
||
+ | !Average |
||
+ | !Err.Est |
||
+ | !RMSD |
||
+ | !Tot-Drift (kJ/mol) |
||
+ | |- |
||
+ | |Bond || 3029.19||2200||-nan||-12529.5 |
||
+ | |- |
||
+ | |Angle || 3654.58||280||-nan||-1486.71 |
||
+ | |- |
||
+ | |Potential || 7.95e+06||7.8e+06||-nan||-4.67e+07 |
||
+ | |} |
||
+ | |||
+ | ==== G249S ==== |
||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
+ | !Energy |
||
+ | !Average |
||
+ | !Err.Est |
||
+ | !RMSD |
||
+ | !Tot-Drift (kJ/mol) |
||
+ | |- |
||
+ | |Bond || 2775.97||2000||6761.45||-11375.2 |
||
+ | |- |
||
+ | |Angle || 3682.24||300||670.885||-1625.24 |
||
+ | |- |
||
+ | |Potential || 5.96e+06||5.0e+06||8.02e+07||-3.61e+07 |
||
+ | |} |
||
+ | |||
+ | ==== C264W ==== |
||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
+ | !Energy |
||
+ | !Average |
||
+ | !Err.Est |
||
+ | !RMSD |
||
+ | !Tot-Drift (kJ/mol) |
||
+ | |- |
||
+ | |Bond || 3186.75||2300||-nan||-13603.2 |
||
+ | |- |
||
+ | |Angle || 3831.06||370||-nan||-2070.89 |
||
+ | |- |
||
+ | |Potential || 3.41e+07||3.3e+07||-nan||-2.03e+08 |
||
+ | |} |
||
+ | |||
+ | ==== R265W ==== |
||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
+ | !Energy |
||
+ | !Average |
||
+ | !Err.Est |
||
+ | !RMSD |
||
+ | !Tot-Drift (kJ/mol) |
||
+ | |- |
||
+ | |Bond || 2473.43 ||1700||6385.14 || -9741.04 |
||
+ | |- |
||
+ | |Angle || 3726.4 ||330||827.187 || 1803.54 |
||
+ | |- |
||
+ | |Potential || 5.36e+06 || 5.3e+06|| 7.68e+07||-3.26e+07 |
||
+ | |} |
||
+ | |||
+ | ==== I326T ==== |
||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
+ | !Energy |
||
+ | !Average |
||
+ | !Err.Est |
||
+ | !RMSD |
||
+ | !Tot-Drift (kJ/mol) |
||
+ | |- |
||
+ | |Bond || 3214.03||2300||7364.47||-13490.1 |
||
+ | |- |
||
+ | |Angle || 3738.44||310||698.943||-1792.01 |
||
+ | |- |
||
+ | |Potential || 7.29e+06||6.9e+06||8.86e+07||-4.38e+07 |
||
+ | |} |
||
+ | |||
+ | ==== F409C ==== |
||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
+ | !Energy |
||
+ | !Average |
||
+ | !Err.Est |
||
+ | !RMSD |
||
+ | !Tot-Drift (kJ/mol) |
||
+ | |- |
||
+ | |Bond || 2341.69||1600||6048.14||-9087.07 |
||
+ | |- |
||
+ | |Angle || 3597.89||240||594.267||-1309.54 |
||
+ | |- |
||
+ | |Potential || 4.68e+06||4.7e+06||7.12e+07||-2.85e+07 |
||
+ | |} |
||
+ | |||
+ | ==== Y438N ==== |
||
+ | {|border="1" style="border-spacing:0" align="center" cellpadding="3" cellspacing="3" |
||
+ | !Energy |
||
+ | !Average |
||
+ | !Err.Est |
||
+ | !RMSD |
||
+ | !Tot-Drift (kJ/mol) |
||
+ | |- |
||
+ | |Bond || 3141.2||2300||-nan||-13216.1 |
||
+ | |- |
||
+ | |Angle || 3672.66||290||-nan||-1550.04 |
||
+ | |- |
||
+ | |Potential || 8.33e+06||8.1e+06||-nan||-4.94e+07 |
||
+ | |} |
||
+ | --> |
||
+ | |||
+ | === Results === |
||
+ | The following table summarizes the calculated energies for the different mutations. |
||
+ | |||
+ | |||
+ | {| border="1" style="text-align:center; border-spacing:0;" |
||
+ | | |
||
+ | |colspan="2" | FoldX |
||
+ | |colspan="2" | Minimise |
||
+ | |colspan="2" | Gromacs |
||
+ | |- |
||
+ | |Mutation |
||
+ | |energy value |
||
+ | |difference |
||
+ | |energy value |
||
+ | |difference |
||
+ | |energy value |
||
+ | |difference |
||
+ | |- |
||
+ | |wildtype |
||
+ | | 401.00 |
||
+ | |0 |
||
+ | | -2485.452755 |
||
+ | |0 |
||
+ | | 2.67001e+07 |
||
+ | |0 |
||
+ | |- |
||
+ | |M82L |
||
+ | | 437.88 |
||
+ | |36.88 |
||
+ | | -4253.174790 |
||
+ | | -1767.722015 |
||
+ | | 5.16e+06 |
||
+ | | -21540100 |
||
+ | |- |
||
+ | |Q125E |
||
+ | | 431.77 |
||
+ | |30.77 |
||
+ | | -4080.989512 |
||
+ | | -1595.536757 |
||
+ | | 5.23e+06 |
||
+ | | -21470100 |
||
+ | |- |
||
+ | |Y166N |
||
+ | |432.24 |
||
+ | |31.24 |
||
+ | | -4354.495238 |
||
+ | | -1869.042483 |
||
+ | | 7.95e+06 |
||
+ | | -18750100 |
||
+ | |- |
||
+ | |G249S |
||
+ | | 432.22 |
||
+ | | 31.22 |
||
+ | | -4280.043000 |
||
+ | | -1794.590245 |
||
+ | | 5.96e+06 |
||
+ | | -20740100 |
||
+ | |- |
||
+ | |C264W |
||
+ | | 488.43 |
||
+ | | 87.43 |
||
+ | | -3745.313620 |
||
+ | | -1259.860865 |
||
+ | | 3.41e+07 |
||
+ | | 7399900 |
||
+ | |- |
||
+ | |R265W |
||
+ | |460.43 |
||
+ | |59.43 |
||
+ | | -3989.790625 |
||
+ | | -1504.33787 |
||
+ | | 5.36e+06 |
||
+ | | -21340100 |
||
+ | |- |
||
+ | |I326T |
||
+ | | 432.94 |
||
+ | |31.94 |
||
+ | | -4317.105618 |
||
+ | | -1831.652863 |
||
+ | | 7.29e+06 |
||
+ | | -19410100 |
||
+ | |- |
||
+ | |F409C |
||
+ | | 433.33 |
||
+ | | 32.33 |
||
+ | | -4358.528143 |
||
+ | | -1873.075388 |
||
+ | | 4.68e+06 |
||
+ | | -22020100 |
||
+ | |- |
||
+ | |Y438N |
||
+ | | 431.56 |
||
+ | | 30.56 |
||
+ | | -4339.778964 |
||
+ | | 1854.326209 |
||
+ | | 8.33e+06 |
||
+ | | -18370100 |
||
+ | |- |
||
+ | |} |
||
+ | |||
+ | Interestingly, foldX calculates for all mutations a higher energy than for the wildtype, whereas the energy of the mutated protein calculated by Minimise and Gromacs is (almost) always lower than the wildtype energy. The difference in energies is also very varying between these tools. While foldX returns usually very small energy differences between wildtype and mutated structures, the differences are bigger for minimise and even higher for Gromacs. |
||
+ | The most interesting observation from these energies is the mutation C264W, where foldX calculates a very different energy for the mutated protein and also Gromacs returns for the only time an energy that is higher for the mutated structure than for the wildtype structure. See detailed mutation analysis for more information. |
||
+ | |||
+ | == Links == |
||
go back to [[Maple syrup urine disease]] main page |
go back to [[Maple syrup urine disease]] main page |
||
go back to Task 6 [[Sequence-based mutation analysis_BCKDHA|Sequence based mutation analysis]] |
go back to Task 6 [[Sequence-based mutation analysis_BCKDHA|Sequence based mutation analysis]] |
||
+ | |||
+ | go to Task 8 [[Molecular_Dynamics_Simulations_BCKDHA|Molecular Dynamics Simulations]] |
||
go to [[Reference Sequence BCKDHA]] |
go to [[Reference Sequence BCKDHA]] |
Latest revision as of 21:27, 11 August 2011
Contents
Structure selection
In order to perform a structure based mutation analysis an appropriate protein structure is needed. Therefore one has to check whether the resolution of the structure is high enough, that means the Å-value should be small. The R-factor (reliability factor or residual factor) is another measure to access the quality of a structure. The R-factor rates how well the crystallographic model predicts the observed data from experimental X-ray diffraction. The smaller the R-factor the better. The coverage is the percentage of UniProt sequence present in the sample structure. This value was obtained from [[1]]. The coverage should be quite high. The pH-value indicates the pH-value at which the structure was resolved. Ideally it should be resolved at physiological pH (7.4).
The following table presents the PDB structures for BCKDHA to date. For each structure the resolution in [Å], the R-factor, coverage and pH-value are listed as well.
PDB id | resolution [Å] | R-factor | % of UniProt sequence present in the sample | %residues observed | ph-value |
---|---|---|---|---|---|
1DTW | 2.70 | 0.224 | 90% | 97% | 7.5* |
1OLS | 1.85 | 0.172 | 90% | 96% | 5.5 |
1OLU | 1.90 | 0.161 | 90% | 91% | 5.5 |
1OLX | 2.25 | 0.161 | 90% | 97% | 5.5 |
1U5B | 1.83 | 0.156 | 90% | 97% | 5.8 |
1V11 | 1.95 | 0.139* | 90% | 92% | 5.5 |
1V16 | 1.90 | 0.132* | 90% | 92% | 5.5 |
1V1M | 2.00 | 0.130* | 90% | 93% | 5.5 |
1V1R | 1.80 | 0.158 | 90% | 88% | 5.5 |
1WCI | 1.84 | 0.149 | 90% | 97% | 5.5 |
1X7W | 1.73 | 0.148 | 90% | 92% | 5.8 |
1X7X | 2.10 | 0.149 | 90% | 92% | 5.8 |
1X7Y | 1.57 | 0.150 | 90% | 92% | 5.8 |
1X7Z | 1.72 | 0.154 | 90% | 97% | 5.8 |
1X80 | 2.00 | 0.161 | 90% | 92% | 5.8 |
2BEU | 1.89 | 0.171 | 90% | 97% | 5.5 |
2BEV | 1.80 | 0.139 | 90% | 97% | 5.5 |
2BEW | 1.79 | 0.147 | 90% | 97% | 5.5 |
2BFB | 1.77 | 0.145 | 90% | 92% | 5.5 |
2BFC | 1.64 | 0.144 | 90% | 92% | 5.5 |
2BFD | 1.39* | 0.150 | 90% | 93% | 5.5 |
2BFE | 1.69 | 0.150 | 90% | 93% | 5.5 |
2BFF | 1.46 | 0.150 | 90% | 98% | 5.5 |
2J9F | 1.88 | 0.171 | 90% | 95% | 5.5 |
The asteriks-marked values indicate that these structures were resolved with the asked experimental quality. As one can see, none of the structures fulfills all conditions.
Furthermode, we could not use any of the PDB structures for BCKDHA because all of them had gaps in the secondary structure which means that some residues were missing. So we took the structure which has the less gaps: 1U5B
- resultion: 1.83
- R-factor: 0.156
- ph-value: 5.8
This structure has to be modified with some programms to close the gaps. Additionally the first residues which are in BCKDHA misses in 1U5B thats why the start position corresponds to position 6 of the BCKDHA -PDB sequence.
As we can see none of the values corresponds to the demands because it was asked for a structure which has a very small R-factor, a pH of 7.4 and a high resolution.
Mapping of the mutations on the crystal structure
Figure 1 and 2 shows the structure of BCKDHA with mapped mutations.
As shown in Figure 2 there are some mutations on the surface of the protein, which might have an effect on the stability of the protein. Furthermore it is possible to recognize a mutation which is quite near to the active center. These facts are studied in more detail in the mutation analysis section.
Comparison energies
SCWRL
Before we could use SCWRL we first had to get the sequence of our model:
repairPDB bckdha.pdb -seq >> bckdha.seq
When we have the sequence we have to make one file for each mutation. In these files we copied the bckdha.seq and changed the sequence to lower case letters. Then we add the mutation in an upper case letter.
To run SCWRL we used the command:
scwrl -i bckdha.pdb -s mutation1.seq -o mutation1Model.pdb
Total minimal energy of the graph
Position | Energy |
---|---|
M82L | 642.213 |
Q125E | 616.85 |
Y166N | 616.293 |
G249S | 633.378 |
C264W | 805.257 |
R265W | 710.647 |
I326T | 619.424 |
F409C | 617.305 |
Y438N | 615.951 |
foldX
To use foldX we first build a runscript. It is important to change values of <Temperature> and <pH> to the values of the used protein. These values can be found on the pdb side .
Additionally we had to create one file with all PDB Ids each in a new line (list.txt). We used the command Foldx -runfile run.txt > Stout.txt
to run the programm.
total energy | difference | |
---|---|---|
wildtype | 401.00 | 0 |
M82L | 437.88 | -36.88 |
Q125E | 431.77 | -30.77 |
Y166N | 432.24 | -31.24 |
G249S | 432.22 | -31.22 |
C264W | 488.43 | -87.43 |
R265W | 460.43 | -59.43 |
I326T | 432.94 | -31.94 |
F409C | 433.33 | -32.33 |
Y438N | 431.56 | -30.56 |
After using foldx we have the total energy for the wiltype protein and for each mutation. The value of the wildtype protein is 401.00 which is already a high value. This means that the protein is quite instabile. To find out which mutation has a high influence on the protein we look at the energies and especially on the difference between the energy of the mutated protein and the wildtype protein. All of the mutated proteins have a much higher energy than the unmutated protein which means that these proteins are less stable. We can see in the table that the proteins can be divided into two groups. The first group has an energy difference of about 31 and the other group has a much higher difference.
Minimise
It is important to remove the hydrogens and water before using the programm. For this we used the new version of repairPDB of the virtualbox. The programm can be started with the command:
repairPDB bckdha.pdb -nosol out.pdb > Stout.txt
It is also possible to use the old version but then the command is:
repairPDB bckdha.pdb -nosol -noh out.pdb > Stout.txt
It is useful to save the output in a file because it includes the energy.
total energy | difference | |
---|---|---|
wildtype | -2485.452755 | 0 |
M82L | -4253.174790 | 1767.722015 |
Q125E | -4080.989512 | 1595.536757 |
Y166N | -4354.495238 | 1869.042483 |
G249S | -4280.043000 | 1794.590245 |
C264W | -3745.313620 | 1259.860865 |
R265W | -3989.790625 | 1504.33787 |
I326T | -4317.105618 | 1831.652863 |
F409C | -4358.528143 | 1873.075388 |
Y438N | -4339.778964 | 1854.326209 |
Minimise calculates the energy for a mutation by building a new model for each mutation. Then it calculates the energy for the whole mutated model. The aim by comparing the mutated models with the wildtype is to find out if there is a structural change caused by a mutation. The calculated energies are discussed in more detail in the mutation analysis section (below).
Gromacs
The first part describes general background information for gromacs as well as how to run those programs. The second part contains the result description and analysis.
General
1. fetchpdb
The fetch-pdb script first checks, if it was called with an valid PDB-id. If the entered PDB code has 4letters, the script tries to download the pdb-file from the server. The successfully downloaded folder gets unzipped and everything except the actual pdb file is removed.
2. repairPDB
For repairPDB the following options are available:
-offset value | offset the residue numbering |
-chain char | change Chain ID |
-ratom | renumber Atoms |
-rres | renumber Residues |
-noh | remove hydrogens |
-het | no change of HETATM to ATOM for AA |
-seq | returns protein sequence from AA in pdb file |
-seqrs | protein sequence from SEQRES entries |
-nosol | just Protein, no solvent OR |
-ssw cutoff | print only waters with B-value below cutoff OR |
-cleansol | remove overlapping solvent for GROMACS |
We run repairPDB using the following command:
repairPDB bckdha_mod.pdb -noh -nosol > bckdha_clean.pdb
Using this command we removed hydrogens and solvent from our pdb to get just the protein.
3. SCWRL
SCWRL was executed using the following command:
scwrl -i bckdha_mod.pdb -s extractedPDB.seq -o bckdha_scwrl.pdb
SCWRL returned a pdb including HETATOMS. These solvent atoms needed to be removed before continuing.
4.pdb2gmx
(use clean pdb without HEATOMS!)
pdb2gmx -f bckdha_clean.pdb -o bckdha.gro -p bckdha.top -water tip3p -ff amber03
5. MDP
title = PBSA minimization in vacuum
cpp = /usr/bin/cpp
define = -DFLEXIBLE -DPOSRES
implicit_solvent = GBSA
integrator = steep
emtol = 1.0
nsteps = 500
nstenergy = 1
energygrps = System
ns_type = grid
coulombtype = cut-off
rcoulomb = 1.0
rvdw = 1.0
constraints = none
pbc = no
adjust nsteps for the time vs steps analysis
integrator | a steepest descent algorithm for energy minimization. |
emtol | tolerance for steep integrator:the minimization is converged when the maximum force is smaller than this value |
nsteps | maximum number of steps to integrate or minimize, -1 is no maximum |
nstenergy | frequency to write energies to energy file (last energies are always written) |
energygrps | groups to write to energy files |
ns_type | Which atoms to check when construction a new neighbor list |
coulombtype | which coulomb-type to use, e.g. Cut-off, PME, Ewald |
rcoulomb | distance for the Coulomb cut-off |
rvdw | Van-der-Waals cut-of (distance for the LJ or Buckingham cut-off) |
constraints | constraints for bonds, e.g bonds are presented by a potention, all bonds are converted to constraints,... |
pbc | Remove the periodicity (make molecule whole again) |
6. grompp
grompp -v -f bckdha.mdp -c bckdha.gro -p bckdha.top -o bckdha.tpr
7. System Minimization
mdrun -v -deffnm bckdha 2> mdrun_out.txt
8. Analyzation
g_energy -f bckdha.edr -o energy_1.xvg
Analysis
Wildtype analysis: nsteps vs time
The table below shoes the running time for mdrun depending on different values for nsteps. It also lists the real number of steps carried out to calculate the energy.
steps | time (real) [s] | time (user) [s] | time (sys) [s] | performed steps |
---|---|---|---|---|
50 | 5.453 | 4.730 | 0.120 | 50 |
100 | 10.393 | 9.210 | 0.240 | 100 |
500 | 36.419 | 30.660 | 0.780 | 338 |
1000 | 5.261 | 4.390 | 0.130 | 47 |
2000 | 10.564 | 8.500 | 0.290 | 93 |
3000 | 10.661 | 8.840 | 0.230 | 96 |
4000 | 2.620 | 2.010 | 0.140 | 21 |
5000 | 3.693 | 3.300 | 0.100 | 35 |
Figure 3 shows the correlation between nsteps and the running time for mdrun
Interestingly, the running time is not dependent on the number of nsteps, but just on the number of really performed steps. There is a linear dependency between the calculation time and the number of performed steps. The number of performed steps however is not correlating with the value for nsteps. It is not obvious why the number of performed steps varies so extremely given a certain value for nsteps.
Wildtype analysis: force fields
The different force fields chosen for this task were:
- AMBER03
- CHARMM27
- OPLS-AA
Bond Analysis
Force Field | Average | Err. Est. | RMSD | Tot-Drift (kJ/mol) |
---|---|---|---|---|
AMBER03 | 3072.83 | 2200 | -nan | -13100.2 |
CHARMM25 | 3180.46 | 1700 | 7382.72 | -9958.05 |
OPLS | 2780.55 | 2100 | -nan | -11542.6 |
Angle Analysis
Force Field | Average | Err. Est. | RMSD | Tot-Drift (kJ/mol) |
---|---|---|---|---|
AMBER03 | 3616.97 | 230 | -nan | -1295.57 |
CHARMM25 | 5018.38 | 490 | 1646.81 | -2783.35 |
OPLS | 3271.23 | 340 | -nan | -1889.98 |
Potential Analysis
Force Field | Average | Err. Est. | RMSD | Tot-Drift (kJ/mol) |
---|---|---|---|---|
AMBER03 | 2.67001e+07 | 2.6e+07 | -nan | -1.60382e+08 |
CHARMM25 | 487.479 | 97 | 199.742 | 673.043 |
OPLS | 2.38353e+07 | 2.4e+07 | -nan | -1.39932e+08 |
Comparing the results for the different force fields it is noticeable that only Charmm27 produces values for the RMSD. The values calculated by AMBER03 and OPLS are mostly similar, only the CHARMM27 values vary totally.
Figure 4 shows the energy calculated by Gromacs using the AMBER03 forcefield applied to the wildtype protein structure. Figure 5 and 6 are showing the energy for the Charmm25 and the OPLS-AA force fields, respectively.
Mutation analysis
In order to discuss the effect of different mutations more in detail, one page for each mutation was created:
Results
The following table summarizes the calculated energies for the different mutations.
FoldX | Minimise | Gromacs | ||||
Mutation | energy value | difference | energy value | difference | energy value | difference |
wildtype | 401.00 | 0 | -2485.452755 | 0 | 2.67001e+07 | 0 |
M82L | 437.88 | 36.88 | -4253.174790 | -1767.722015 | 5.16e+06 | -21540100 |
Q125E | 431.77 | 30.77 | -4080.989512 | -1595.536757 | 5.23e+06 | -21470100 |
Y166N | 432.24 | 31.24 | -4354.495238 | -1869.042483 | 7.95e+06 | -18750100 |
G249S | 432.22 | 31.22 | -4280.043000 | -1794.590245 | 5.96e+06 | -20740100 |
C264W | 488.43 | 87.43 | -3745.313620 | -1259.860865 | 3.41e+07 | 7399900 |
R265W | 460.43 | 59.43 | -3989.790625 | -1504.33787 | 5.36e+06 | -21340100 |
I326T | 432.94 | 31.94 | -4317.105618 | -1831.652863 | 7.29e+06 | -19410100 |
F409C | 433.33 | 32.33 | -4358.528143 | -1873.075388 | 4.68e+06 | -22020100 |
Y438N | 431.56 | 30.56 | -4339.778964 | 1854.326209 | 8.33e+06 | -18370100 |
Interestingly, foldX calculates for all mutations a higher energy than for the wildtype, whereas the energy of the mutated protein calculated by Minimise and Gromacs is (almost) always lower than the wildtype energy. The difference in energies is also very varying between these tools. While foldX returns usually very small energy differences between wildtype and mutated structures, the differences are bigger for minimise and even higher for Gromacs. The most interesting observation from these energies is the mutation C264W, where foldX calculates a very different energy for the mutated protein and also Gromacs returns for the only time an energy that is higher for the mutated structure than for the wildtype structure. See detailed mutation analysis for more information.
Links
go back to Maple syrup urine disease main page
go back to Task 6 Sequence based mutation analysis
go to Task 8 Molecular Dynamics Simulations