Task 7 - Structure-based mutation analysis

From Bioinformatikpedia

Intro

This time we make full use of the protein’s known crystal structures.

The software and scripts used are listed under "Where to run the analyses" below.

For FoldX, use the command Foldx and link the rotabase.txt file into the directory in which you want to run the calculations: ln -sf /apps/FoldX_30b5/rotabase.txt .

Talk

structure_based_mutation_analysis.pdf

Where to run the analyses

  • You have to use the student computer pool: i12k-biolab0n.informatik.tu-muenchen.de, where n goes from 1 to 9 (or more?). The file server does not have blast etc. installed!
  • The software and scripts to use can be found in the directory /opt/SS12-Practical

Preparation

Before we start, you will have to choose one of the available structures (if several can be used). The easiest way to do so is to look at the UniProt entry and check the PDB section there. The structure should have a high resolution (small Å value), the R-factor should be as small as possible, and the higher the coverage the better. Also check at which pH the structure was solved; ideally you want physiological pH (7.4). Finally, before you decide on a structure, make sure it does not contain any gaps (missing residues): a gap shows up as two consecutive residues in the file whose residue numbers are not consecutive. If there is no structure without missing residues, try to create a composite structure (contact me).
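The gap check described above can be automated. The following is a minimal sketch, assuming a standard fixed-width PDB file in which the chain ID is in column 22 and the residue number in columns 23 to 26; the function names are made up for this example, and insertion codes are ignored.

```python
from collections import defaultdict


def find_numbering_gaps(pdb_lines):
    """Return (chain, last_seen, next_seen) for every jump in residue numbering."""
    residues = defaultdict(list)
    for line in pdb_lines:
        # Only ATOM records are inspected; HETATM (waters, ligands) is skipped.
        if not line.startswith("ATOM"):
            continue
        chain = line[21]            # column 22
        resnum = int(line[22:26])   # columns 23-26
        # Store each residue number once, in file order.
        if not residues[chain] or residues[chain][-1] != resnum:
            residues[chain].append(resnum)
    gaps = []
    for chain, numbers in residues.items():
        for prev, curr in zip(numbers, numbers[1:]):
            if curr - prev > 1:
                gaps.append((chain, prev, curr))
    return gaps


def ca_line(chain, resnum):
    """Build a minimal fixed-width ATOM line for a C-alpha atom (demo only)."""
    return f"ATOM  {1:5d}  CA  ALA {chain}{resnum:4d}"


demo = [ca_line("A", n) for n in (1, 2, 3, 7, 8)]
print(find_numbering_gaps(demo))
```

For a real structure, pass the lines of the PDB file instead of the demo list, e.g. find_numbering_gaps(open("FILE.pdb")). An empty result only means the numbering is continuous; it does not replace a look at the REMARK 465 records or the structure itself.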

Map 5 mutations of your choice from the previously selected 10 mutations onto the crystal structure. Color the mutated residues differently from the rest of the protein and create a snapshot for the wiki. If applicable, find out whether the mutations are close to the active site, a binding interface or other important functional sites. Visualize this and describe it properly.

Next we use SCWRL to create our mutations. Make sure you only change the side chain of the mutated residue. SCWRL can be given the mutated sequence: extract the sequence with repairPDB, change all letters to lower case, and then write the new amino acid (the mutation) as a capital letter in the sequence file. SCWRL reads this sequence file via the -s flag. Afterwards, check that only the side chain of the mutated residue has changed.
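Editing the sequence by hand is error-prone for long proteins, so the lower case / capital letter convention can be produced with a few lines of Python. This is only a sketch: the function name and the example sequence are invented, and the position counts along the extracted sequence, which need not match the PDB residue numbering.

```python
def scwrl_sequence(wild_type, position, new_residue):
    """Lowercase the whole sequence and write the mutated residue in upper case.

    position is 1-based and refers to the extracted sequence string.
    """
    if not 1 <= position <= len(wild_type):
        raise ValueError("position lies outside the sequence")
    seq = list(wild_type.lower())
    seq[position - 1] = new_residue.upper()
    return "".join(seq)


# Hypothetical example: replace the 4th residue (A) by G.
print(scwrl_sequence("MKTAYIAK", 4, "G"))
```

Write the returned string to a file and pass that file to SCWRL with the -s flag as described above.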

Comparison energies

In the following, compare wild type (WT) and mutant structures.

Investigate the local hydrogen-bonding network using PyMOL[1] and check for potential clashes (side chains that are too close to each other). Are you introducing hydrophilic residues into the core or hydrophobic residues onto the protein surface? Do the mutations introduce any holes (cavities) into the protein?

Now that you should have a clear picture of the WT and mutant proteins, we will try to calculate some energies. Always calculate the energy for the wild type and for the mutants, then subtract and compare.
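The subtraction step can be kept consistent across all three tools with a small helper. The sketch below assumes the convention "mutant minus wild type", so a positive value means the mutant has the higher energy; the mutation names and numbers are invented placeholders, not results.

```python
def energy_differences(wt_energy, mutant_energies):
    """Return the mutant-minus-wild-type energy for each mutant."""
    return {name: energy - wt_energy for name, energy in mutant_energies.items()}


# Placeholder values only; units depend on the program that produced them.
wt = -100.0
mutants = {"A10G": -95.5, "L20P": -102.0}
for name, diff in energy_differences(wt, mutants).items():
    print(name, diff)
```

Only compare differences that come from the same program and the same settings; absolute energies from foldX, minimise and gromacs are not directly comparable with each other.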


Use the following approaches:

  • foldX
  • minimise
  • gromacs

Before running minimise and gromacs, the hydrogens and waters need to be removed using repairPDB (keep the protein only).

foldX

Examples of how to use foldX can be found here:[2] We want to use the approach "Multiple mutations using individual list".

To run foldX you will need to create a symbolic link to the rotabase.txt file in your working directory: ln -sf /opt/SS12-Practical/foldx/rotabase.txt .

For each mutation, a new structure is also created. Note down all of the energies, and keep these structures for the next steps.
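For the individual list approach, foldX reads the mutations from a text file. The sketch below writes such a list under the assumption that each mutation is encoded as wild-type residue, chain, residue number and mutant residue, that mutations belonging to one mutant are separated by commas, and that every line ends with a semicolon. Please check this against the examples in [2] before use; the helper names and the mutations shown are hypothetical.

```python
def foldx_mutation(wt_residue, chain, position, mutant_residue):
    """Encode one mutation, e.g. L on chain A at position 43 to G -> 'LA43G'."""
    return f"{wt_residue}{chain}{position}{mutant_residue}"


def individual_list(mutation_sets):
    """One line per mutant structure; mutations joined by ',' and closed by ';'."""
    return "".join(",".join(mutations) + ";\n" for mutations in mutation_sets)


# Hypothetical mutants: a single mutation, and the same change on two chains.
text = individual_list([
    [foldx_mutation("L", "A", 43, "G")],
    [foldx_mutation("F", "A", 39, "L"), foldx_mutation("F", "B", 39, "L")],
])
print(text, end="")
```

Each line of the list yields one mutant structure, which matches the requirement of one structure and one energy per mutation.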

Superimpose the SCWRL and foldX structures in PyMOL and compare them. What are the differences?

In the next step you will use both the SCWRL and the foldX structure as input to minimise.

Minimise

Here we call minimise with the input structure filename as the first argument and the output filename as the second.

Apply this to all 10 mutants and the WT, using the SCWRL and foldX structures.

Also, use the output of one minimisation as the input for another run. Do this 5 times for each structure. What happens to the energy? For these recursive runs, please look only at the energy. The structures should only be compared for the second minimise run.
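The recursive runs are easy to get wrong by hand, so it can help to generate the command lines first. This sketch relies only on the calling convention given above (input file first, output file second); the output naming scheme with the _min suffix is an assumption made for this example.

```python
def chained_minimise_commands(start_pdb, rounds=5):
    """Build the minimise calls where each run reads the previous output."""
    stem = start_pdb.rsplit(".", 1)[0]
    commands = []
    current = start_pdb
    for i in range(1, rounds + 1):
        output = f"{stem}_min{i}.pdb"   # hypothetical naming scheme
        commands.append(["minimise", current, output])
        current = output
    return commands


for command in chained_minimise_commands("wt_scwrl.pdb"):
    print(" ".join(command))
```

Each list can be executed with subprocess.run(command, check=True) on the pool machines; the energy reported by every run then has to be collected separately.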

Gromacs

Gromacs is a powerful molecular dynamics package that can be used to simulate nearly any atomic system at different levels of accuracy. Here we will get a basic introduction, which will also be useful in the MD task later on. First we will get all the necessary files and then we will minimize our protein in vacuum. Finally we will analyze the energies during the minimization.

The Gromacs manual can be found here [3]; tutorials are available at this site [4].


The following you only have to do for the WT: repeat steps 5 to 7 with different settings for “nsteps” and time the mdrun call. Create a plot of nsteps versus runtime. Do this for ONLY one forcefield. Then do the whole analysis for two other energy functions (forcefields) chosen in step 4 (AMBER03 should be included in any case).
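The nsteps scan can be scripted. The following is a sketch only: it reuses the MDP settings from step 5 and the grompp and mdrun calls from steps 6 and 7, while the function names and the driver scan_nsteps are made up for this example. The runner argument exists so that the code can be tried without gromacs installed.

```python
import subprocess
import time

# Settings copied from the MDP file in step 5; only nsteps is varied.
MDP_SETTINGS = {
    "title": "PBSA minimization in vacuum",
    "cpp": "/usr/bin/cpp",
    "define": "-DFLEXIBLE -DPOSRES",
    "implicit_solvent": "GBSA",
    "integrator": "steep",
    "emtol": 1.0,
    "nsteps": 500,
    "nstenergy": 1,
    "energygrps": "System",
    "ns_type": "grid",
    "coulombtype": "cut-off",
    "rcoulomb": 1.0,
    "rvdw": 1.0,
    "constraints": "none",
    "pbc": "no",
}


def render_mdp(nsteps):
    """Return the MDP file content with the given nsteps value."""
    settings = dict(MDP_SETTINGS, nsteps=nsteps)
    return "".join(f"{key} = {value}\n" for key, value in settings.items())


def gromacs_commands(name):
    """Return the grompp (step 6) and mdrun (step 7) calls for NAME.*"""
    grompp = ["grompp", "-v", "-f", f"{name}.mdp", "-c", f"{name}.gro",
              "-p", f"{name}.top", "-o", f"{name}.tpr"]
    mdrun = ["mdrun", "-v", "-deffnm", name]
    return [grompp, mdrun]


def timed_run(command, runner=subprocess.run):
    """Run one command and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    runner(command, check=True)
    return time.perf_counter() - start


def scan_nsteps(name, values, runner=subprocess.run):
    """Hypothetical driver: rewrite NAME.mdp, rerun steps 6 and 7, time mdrun."""
    grompp, mdrun = gromacs_commands(name)
    timings = {}
    for nsteps in values:
        with open(f"{name}.mdp", "w") as handle:
            handle.write(render_mdp(nsteps))
        runner(grompp, check=True)
        timings[nsteps] = timed_run(mdrun, runner)
    return timings
```

A call such as scan_nsteps("FILE", [50, 100, 500, 1000]) returns a dictionary of nsteps to seconds that can be plotted directly. Note that the minimization may stop before nsteps is reached once emtol is satisfied, so the runtime need not grow linearly.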


For the mutations we will ONLY use the AMBER03 forcefield!

For the mutation input choose either the scwrl or foldx structures. Base your decisions on the previous results and explain.

1. Use fetchpdb to get the pdb structure – look at the script, what does it do?


2. Use repairPDB to clean the PDB and extract the protein only - describe what options you chose and what other options are available. Make sure you choose the right chain.


3. Run SCWRL with the lowercase protein sequence to make sure there are no missing sidechains. The sequence can be extracted using repairPDB. After SCWRL you will have to remove the hydrogens again.


4. Use the gromacs command “pdb2gmx”: the option -f defines the input structure, -o the gromacs output file (.gro) and -p the topology output file (.top). IMPORTANT: the input filename must end with “pdb”. Choose a forcefield and, for water, the TIP3P model.


5. Create an MDP file with the following content:

title = PBSA minimization in vacuum
cpp = /usr/bin/cpp
define = -DFLEXIBLE -DPOSRES
implicit_solvent = GBSA
integrator = steep
emtol = 1.0
nsteps = 500
nstenergy = 1
energygrps = System
ns_type = grid
coulombtype = cut-off
rcoulomb = 1.0
rvdw = 1.0
constraints = none
pbc = no

Give a brief description of the different keywords used in this file - for this you should use the gromacs manual.


6. Use grompp to prepare the system for gromacs: grompp -v -f FILE.mdp -c FILE.gro -p FILE.top -o FILE.tpr (FILE.tpr is the system file that grompp creates; we use it in the next step).


7. Now we minimize the system: mdrun -v -deffnm FILE


8. Analyze the minimization of the system with the following command: g_energy -f FILE.edr -o energy_1.xvg. Do the analysis for Bond, Angle and Potential. The xvg graphs can be viewed with xmgrace; in its print settings you can choose EPS output, then print and convert the result to PDF.
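If you prefer to process the energies yourself rather than in xmgrace, the xvg files are plain text and simple to read. The sketch below assumes a single data set in which comment lines start with # and xmgrace directives start with @; the function name is made up, and files containing several data sets are not handled.

```python
def read_xvg(lines):
    """Return the numeric rows of an xvg file as lists of floats."""
    rows = []
    for line in lines:
        stripped = line.strip()
        # Skip blank lines, comments (#) and xmgrace directives (@).
        if not stripped or stripped[0] in "#@":
            continue
        rows.append([float(value) for value in stripped.split()])
    return rows


# Invented two-column example: step and energy.
demo = ["# comment", '@ title "Energies"', "0.0 -100.5", "1.0 -120.25"]
for step, energy in read_xvg(demo):
    print(step, energy)
```

With a real file, read_xvg(open("energy_1.xvg")) gives one row per minimization step, with the selected energy terms in the order in which they were chosen in g_energy.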