Task 9 Structure-based mutation analysis

From Bioinformatikpedia

Latest revision as of 14:07, 2 July 2013

Intro

This time we make full use of the protein’s known crystal structures and try to predict energetic changes caused by mutations. We use different methods to generate structures and predict energies. For the analysis, we check how the predicted structures differ and how the predicted energy changes differ. Do any of the analyses lead to a plausible prediction of the effect of the mutation?

The slides to the talk can be found here: File:Structure based mutation analysis Karo.pdf

Where to run the analyses

  • You have to use the student computer pool: i12k-biolab0n.informatik.tu-muenchen.de, where n goes from 1 to 9 (or more?). The file server does not have blast etc. installed!
  • The software and scripts to use can be found in the directory /opt/SS12-Practical

Preparation

Choose a structure to work with

Before we start, you will have to choose one of the available structures (if several can be used). In the previous sections you already had to use structures as references, so you could stick to your choice. However, in this part of the practical there are some additional constraints to observe.

  • It is important that the structure has a high resolution (small Å value);
  • furthermore the R-factor should be as small as possible, and the higher the coverage the better.
  • Also, check at which pH value the structure was solved; ideally you want physiological pH (7.4).
  • Finally, before you decide on a structure, make sure it does not contain any gaps (missing residues); a gap shows up as two successive residues in the file whose residue numbers are not consecutive.
  • If there is no structure without missing residues, try to create a composite structure.
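
The gap check can be automated. The following is a minimal sketch, assuming a standard fixed-column PDB file; the function name find_numbering_gaps is hypothetical and not part of the practical's scripts. Residues with insertion codes are treated as a single residue number, so inspect such structures by hand.

```python
def find_numbering_gaps(pdb_lines, chain_id):
    """Return (previous, next) residue-number pairs that are not consecutive."""
    residue_numbers = []
    for line in pdb_lines:
        # Only C-alpha ATOM records of the requested chain are considered.
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA" or line[21] != chain_id:
            continue
        number = int(line[22:26])
        # Skip repeats of the same residue (e.g. alternate locations).
        if not residue_numbers or residue_numbers[-1] != number:
            residue_numbers.append(number)
    gaps = []
    for previous, current in zip(residue_numbers, residue_numbers[1:]):
        if current - previous != 1:
            gaps.append((previous, current))
    return gaps
```

Called with the lines of a PDB file and the chain you selected, an empty list means the numbering of that chain is continuous.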

Visualise the mutations you want to work with

Map 5 mutations of your choice from the previously selected mutations onto the crystal structure:

  • Color the mutated residues differently from the rest of the protein and create a snapshot for the wiki.
  • If applicable find out whether the mutations are close to the active site, a binding interface or other important functional sites. Visualize this and describe it properly.

Create mutated structures

Next we use SCWRL to create our mutations. Make sure you only change one side chain for each mutated structure. SCWRL can be given the mutated sequence: extract the sequence with repairPDB, change all letters to lower case, and then write the new amino acid (the mutation) as a capital letter at the mutated position. SCWRL reads this sequence file via the -s flag. Afterwards, check that only the mutated side chain has been changed.
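
Preparing the sequence file by hand is error-prone, so here is a minimal sketch of the lower case / capital letter convention described above. The function name scwrl_mutant_sequence is hypothetical, and the position counts along the extracted sequence (1-based), which need not match the residue numbering in the PDB file.

```python
def scwrl_mutant_sequence(wild_type_sequence, position, new_residue):
    """Lowercase the whole sequence and write the mutated residue in upper case.

    position is 1-based and refers to the extracted sequence,
    not to the residue numbers in the PDB file.
    """
    if not 1 <= position <= len(wild_type_sequence):
        raise ValueError("position lies outside the sequence")
    residues = list(wild_type_sequence.lower())
    residues[position - 1] = new_residue.upper()
    return "".join(residues)
```

The returned string is what goes into the sequence file; an all-lowercase sequence without any capital letter corresponds to the WT run used to fill in missing side chains.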

Energy comparisons

In the following, compare wild type (WT) and mutant structures.

Investigate the local hydrogen-bonding network using pymol[1] – also check for potential clashes (when sidechains are too close to each other). Are you introducing hydrophilics to the core or hydrophobics to the protein surface? Are there any holes introduced to the protein due to the mutations?

Now that you should have a clear idea of the WT and mutant proteins, we will try to calculate some energies. Always calculate the energy for the wild type and the mutants, then subtract/compare.
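
The subtraction itself can be sketched in a few lines of Python. The function name energy_differences is hypothetical; the energies must come from the same method and be in the same units for the difference to mean anything.

```python
def energy_differences(wild_type_energy, mutant_energies):
    """Return mutant energy minus wild-type energy for every mutant.

    mutant_energies maps a mutation label to the energy computed
    for that mutant structure with the same method as the wild type.
    """
    return {
        name: energy - wild_type_energy
        for name, energy in mutant_energies.items()
    }
```

With this sign convention a positive value means the method assigns the mutant a higher energy than the WT.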


Use the following approaches:

  • foldX
  • minimise
  • gromacs

Before running minimise and gromacs, the hydrogens and waters need to be removed using repairPDB (protein only).

foldX

Examples of how to use foldX can be found here: [2]. We want to use the approach "Multiple mutations using individual list".
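
As an illustration only, the lines of such an individual list could be generated as follows. This sketch assumes the format from the foldX examples [2], in which each mutation is written as wild-type residue, chain, residue number and new residue, and each line ends with a semicolon; check this against the examples before relying on it. The function name individual_list_line is hypothetical.

```python
def individual_list_line(mutations, chain_id):
    """Format one line of an individual list (assumed foldX format).

    mutations is a list of (wild_type_residue, residue_number, new_residue)
    tuples; all mutations on one line are applied to the same model.
    """
    entries = []
    for wild_type, position, mutant in mutations:
        entries.append(f"{wild_type.upper()}{chain_id}{position}{mutant.upper()}")
    return ",".join(entries) + ";"
```

Since each mutant structure should carry only one changed side chain, each line would normally contain a single mutation.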

To run foldx you will need to create a symbolic link to the file rotabase.txt in your working directory: ln -sf /opt/SS12-Practical/foldx/rotabase.txt .

A new structure will also be created for each of the mutations. Note down all of the energies, but also use these structures in the next steps.

Compare the scwrl and foldx structures in Pymol and superimpose them. What are the differences?

In the next step you will use both the scwrl and the foldX structure as input to minimise.

Minimise

Here we call minimise with the input structure filename as the first argument and the output filename as the second.

Apply this for all mutant structures (produced by scwrl and foldx) and the WT.

Also, use the output of one minimisation as the input for another run. Do this 5 times for each structure. What happens to the energy? For these repeated runs, please look only at the energy. The structures should only be compared for the second minimise run.
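
The repeated runs can be scripted. The sketch below only relies on the calling convention stated above (input filename first, output filename second); the output naming scheme and both function names are assumptions, and how minimise reports its energy is not covered here.

```python
import subprocess


def chained_minimise_commands(input_pdb, prefix, runs=5):
    """Build the command lines for successive minimise runs.

    The output of run i becomes the input of run i + 1.
    """
    commands = []
    current_input = input_pdb
    for run in range(1, runs + 1):
        output_pdb = f"{prefix}_min{run}.pdb"  # hypothetical naming scheme
        commands.append(["minimise", current_input, output_pdb])
        current_input = output_pdb
    return commands


def run_chained_minimise(input_pdb, prefix, runs=5):
    """Execute the runs one after another; stop if one of them fails."""
    for command in chained_minimise_commands(input_pdb, prefix, runs):
        subprocess.run(command, check=True)
```

Building the command list separately from running it makes it easy to inspect what will be executed before starting the five runs for every structure.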

Gromacs (optional task for those who love MD!)

Gromacs is a powerful molecular dynamics package that can be used to simulate nearly any atomic system at different levels of accuracy. Here we will get a basic introduction. First we will get all the necessary files and then we will minimize our protein in vacuum. Finally we will analyze the energies during the minimization.

The Gromacs manual can be found here [3]; tutorials are available at this site [4].

Input

  • For the mutations we will ONLY use the AMBER03 forcefield!
  • For the mutation input choose either the scwrl or foldx structures. Base your decisions on the previous results and explain.

Steps

  1. Use repairPDB to clean the PDB and extract the protein only - describe what options you chose and what other options are available. Make sure you choose the right chain. Run SCWRL with the lowercase protein sequence to make sure there are no missing sidechains. The sequence can be extracted using repairPDB. After SCWRL you will have to remove the hydrogens again.
  2. Use the gromacs command “pdb2gmx”: the option -f defines the input structure, -o the gromacs output file (.gro) and -p the topology (.top) output file. IMPORTANT: the input file name must end with “pdb”. Choose a forcefield and, for water, the TIP3P model.
  3. Create an MDP file with the following content (for a description of the different keywords used in this file see the gromacs manual)
    title = PBSA minimization in vacuum
    cpp = /usr/bin/cpp
    define = -DFLEXIBLE -DPOSRES
    implicit_solvent = GBSA
    integrator = steep
    emtol = 1.0
    nsteps = 500
    nstenergy = 1
    energygrps = System
    ns_type = grid
    coulombtype = cut-off
    rcoulomb = 1.0
    rvdw = 1.0
    constraints = none
    pbc = no
  4. Use grompp to prepare the system for gromacs (producing the file FILE.tpr, which we use in the next step): grompp -v -f FILE.mdp -c FILE.gro -p FILE.top -o FILE.tpr
  5. Now we minimize the system: mdrun -v -deffnm FILE
  6. Analyze the minimization of the system (Bond, Angle and Potential) with the following command: g_energy -f FILE.edr -o energy_1.xvg

The xvg graphs can be viewed with xmgrace; in its print settings you can choose eps output, then print and convert the result to pdf.
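
If you prefer to process the numbers yourself, an xvg file is plain text: lines starting with # are comments, lines starting with @ carry plot settings such as the series legends, and the remaining lines are whitespace-separated columns. The following is a minimal sketch of a reader under that assumption; the function name read_xvg is hypothetical.

```python
import re

# Series legends are written as e.g.:  @ s0 legend "Bond"
LEGEND_PATTERN = re.compile(r'@\s*s\d+\s+legend\s+"(.*)"')


def read_xvg(lines):
    """Return (legends, rows); each row is a list of floats, x value first."""
    legends = []
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("@"):
            match = LEGEND_PATTERN.match(line)
            if match:
                legends.append(match.group(1))
            continue  # other plot settings are ignored
        rows.append([float(value) for value in line.split()])
    return legends, rows
```

The first column of each row is the x value and the following columns correspond to the legends in order, so Bond, Angle and Potential can be compared between WT and mutants directly.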

Further experiments (only for the WT!)

  • Repeat steps 3 to 6 with different settings for “nsteps” and measure the time of mdrun; create a plot nsteps versus time.
  • Do the whole analysis for two other energy functions (forcefields) chosen in step 2 (AMBER03 should be included in any case).
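
The nsteps series can be scripted along the following lines. This is a sketch, not a prescribed workflow: the settings are copied from the MDP file in step 3, while the names MDP_SETTINGS, render_mdp and time_command are hypothetical.

```python
import subprocess
import time

# Settings copied from the MDP file in step 3.
MDP_SETTINGS = {
    "title": "PBSA minimization in vacuum",
    "cpp": "/usr/bin/cpp",
    "define": "-DFLEXIBLE -DPOSRES",
    "implicit_solvent": "GBSA",
    "integrator": "steep",
    "emtol": "1.0",
    "nsteps": "500",
    "nstenergy": "1",
    "energygrps": "System",
    "ns_type": "grid",
    "coulombtype": "cut-off",
    "rcoulomb": "1.0",
    "rvdw": "1.0",
    "constraints": "none",
    "pbc": "no",
}


def render_mdp(nsteps):
    """Return the MDP file content with nsteps replaced."""
    settings = dict(MDP_SETTINGS, nsteps=str(nsteps))
    return "\n".join(f"{key} = {value}" for key, value in settings.items()) + "\n"


def time_command(command):
    """Run a command (list of arguments) and return the wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start
```

For each nsteps value, write render_mdp(nsteps) to FILE.mdp, rerun grompp as in step 4, and time the mdrun call from step 5 with time_command(["mdrun", "-v", "-deffnm", "FILE"]); the collected pairs give the nsteps versus time plot.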