Difference between revisions of "Structure-based mutation analysis ARSA"
(→Gromacs) |
(→Gromacs) |
||
Line 402: | Line 402: | ||
Give a brief description of the different keywords used in this file - for this you should use the gromacs manual. |
Give a brief description of the different keywords used in this file - for this you should use the gromacs manual. |
||
− | DFLEXIBLE: water in the the topology is flexible |
+ | -> DFLEXIBLE: water in the the topology is flexible |
− | DPOSRES: position restraints, defined in posre.itp, are included in the topology |
+ | -> DPOSRES: position restraints, defined in posre.itp, are included in the topology |
+ | -> implicit_solvent = GBSA: defines the implicit solvent model, here the Generalized Born formalism is used |
||
− | implicit_solvent: |
||
+ | -> integrator = steep: defines the algorithm for energy minimization, here the steepest descent algorithm is used |
||
− | integrator: |
||
+ | -> emtol: defines, when convergence is assumed: minimization is stoppen when maximum force is smaller than this value |
||
− | emtol: |
||
+ | -> nsteps: defines the maximum number of steps in the energy minimization |
||
− | nsteps: |
||
+ | -> nstenergy: frequency in which energies are written to the energy file |
||
− | nstenergy: |
||
− | energygrps: |
+ | -> energygrps: group(s) to write to the energy file |
+ | -> ns_type: defines type of Neighbor searching, here a grid is built and only atoms in neighboring grid cells are checked when a new neighbor list is constructed |
||
− | ns_type: |
||
− | coulombtype: |
+ | -> coulombtype: |
− | rcoulomb: |
+ | -> rcoulomb: |
− | rvdw: |
+ | -> rvdw: |
+ | -> constraints: defines additional constraints besides the ones defined in the topology file |
||
− | constraints: |
||
− | pbc: |
+ | -> pbc: |
6. Use grompp to prepare the system for gromacs: grompp -v -f FILE.mdp -c FILE.gro -p FILE.top -o FILE.tpr FILE.tpr is the system file to create which we use in the next step |
6. Use grompp to prepare the system for gromacs: grompp -v -f FILE.mdp -c FILE.gro -p FILE.top -o FILE.tpr FILE.tpr is the system file to create which we use in the next step |
Revision as of 12:13, 4 July 2011
Contents
Preparation
Visualization with Pymol
The following image shows a pymol visualization of Arylsulfatase A, together with all known active and binding sites.
One can see, that the mutations are spread out through the protein. Some lie near functional sites, other are very distant from them. The table below shows again Pymol visualizations of all mutations, but each seperately. With this, we want to try to derive investigate the correlation of location of the mutation with respect to functional sites and its effect on the protein function.
Above visualizations indicate, that mutations near functional important sites of the protein are likely to cause a harmful effect. However, for distant mutations no trend can be observed.
SCRWL
First, we extracted the amino acid sequence from our pdb file and converted it to lower case.
repairPDB arsa_model.pdb -seq > arsa.model.seq
tr '[:upper:]' '[:lower:]' < arsa.model.seq > arsa.model.lower.seq
Next, we included the individual mutations as capital letters in seperate files and executed scwrl with the following command:
scwrl cmd
We also ran SCWRL on the wild type (wt) structure in order to make it comparable to energy predictions by other programs. The minimal energy of the graph for the wt is 415.134.
FoldX
wt: 510.88
Nr. | mutation | position | Minimal energy | Energy(mutant)/Energy(wt) | |
1 | Asp-Asn | 29 | 496.88 | 0.9725963 | |
2 | Pro - Ala | 136 | 493.81 | 0.966587 | |
3 | Gln-His | 153 | 493.90 | 0.9667632 | |
4 | Trp-Cys | 193 | 496.23 | 0.971324 | |
5 | Thr-Met | 274 | 503.34 | 0.9852412 | |
6 | Phe -Val | 356 | 495.39 | 0.9696798 | |
7 | Thr-Ile | 409 | 495.45 | 0.9697972 | |
8 | Asn-Ser | 440 | 496.79 | 0.9724201 | |
9 | Cys-Gly | 489 | 495.87 | 0.9706193 | |
10 | Arg-His | 496 | 498.75 | 0.9762567 |
Minimise
wt: -3839.492677
Nr. | mutation | position | Free energy | Energy(mutant)/Energy(wt) |
1 | Asp-Asn | 29 | -4174.487222 | 1.087250 |
2 | Pro - Ala | 136 | -4164.523707 | 1.084655 |
3 | Gln-His | 153 | -4109.12924 | 1.070227 |
4 | Trp-Cys | 193 | -4169.617285 | 1.085981 |
5 | Thr-Met | 274 | -4065.730562 | 1.058924 |
6 | Phe -Val | 356 | -4109.021939 | 1.070199 |
7 | Thr-Ile | 409 | -4121.130656 | 1.073353 |
8 | Asn-Ser | 440 | -4120.275589 | 1.073130 |
9 | Cys-Gly | 489 | -4127.969116 | 1.075134 |
10 | Arg-His | 496 | -4120.151911 | 1.073098 |
Gromacs
Gromacs is an abbreviation for 'GROningen Mixture of Alchemy and Childrens' Stories' and is a package to perform molecular dynamics. It is Free Software, available under the GNU General Public License.
Workflow
1. Use fetchpdb to get the pdb structure – look at the script, what does it do?
-> we did not use fetchpdb, as we already had the needed pdb-file. The script simply downloads a pdb-file for a give PDB-ID from the PDB-webpage.
2. Use repairPDB to clean the PDB and extract the protein only - describe what options you chose and what other options are available. Make sure you chose the right chain.
-> TODO BENE
3. Run SCWRL with the lowercase protein sequence to make sure there are no missing sidechains. The sequence can be extracted using repairPDB. After SCWRL you will have to remove the hydrogens again.
-> TODO BENE
4. Use the gromacs command “pdb2gmx” the option –f defines the input structure, -o the gromacs outputfile (.gro) and –p the topology (.top) output file. IMPORTANT the input must end with “pdb”. Choose a forcefield and for water the TIP3P model.
-> we chose the AMBER03 forcefield as it is optimized for the use with proteins
5. Create a MDP file with the following content
title = PBSA minimization in vacuum cpp = /usr/bin/cpp define = -DFLEXIBLE -DPOSRES implicit_solvent = GBSA integrator = steep emtol = 1.0 nsteps = 500 nstenergy = 1 energygrps = System ns_type = grid coulombtype = cut-off rcoulomb = 1.0 rvdw = 1.0 constraints = none pbc = no
Give a brief description of the different keywords used in this file - for this you should use the gromacs manual.
-> DFLEXIBLE: water in the the topology is flexible -> DPOSRES: position restraints, defined in posre.itp, are included in the topology -> implicit_solvent = GBSA: defines the implicit solvent model, here the Generalized Born formalism is used -> integrator = steep: defines the algorithm for energy minimization, here the steepest descent algorithm is used -> emtol: defines, when convergence is assumed: minimization is stoppen when maximum force is smaller than this value -> nsteps: defines the maximum number of steps in the energy minimization -> nstenergy: frequency in which energies are written to the energy file -> energygrps: group(s) to write to the energy file -> ns_type: defines type of Neighbor searching, here a grid is built and only atoms in neighboring grid cells are checked when a new neighbor list is constructed -> coulombtype: -> rcoulomb: -> rvdw: -> constraints: defines additional constraints besides the ones defined in the topology file -> pbc:
6. Use grompp to prepare the system for gromacs: grompp -v -f FILE.mdp -c FILE.gro -p FILE.top -o FILE.tpr FILE.tpr is the system file to create which we use in the next step
7. Now we minimize the system: mdrun -v -deffnm FILE
8. Analyze the minimization of the system with the following command: g_energy -f FILE.edr -o energy_1.xvg. Do the analysis for Bond, Angle and Potential. The xvg graphs can be viewed with xmgrace and in the print settings you can choose eps output, the print and convert to pdf.