Lab Journal Hemochromatosis Task9

From Bioinformatikpedia

Latest revision as of 20:29, 1 September 2013

Structure Selection

For the calculation of the coverage, the signal peptide at the beginning was omitted:

coverage = (275 (length of sequence used in crystallisation) - 3 (missing residues)) / (348 (length of Q30201) - 22 (length of signal peptide))
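As a quick check of the arithmetic, the formula above can be evaluated directly. This is only a sketch; the variable names are chosen here for illustration and do not come from the original journal.

```python
# Values as stated in the coverage formula above.
crystallised_length = 275    # length of sequence used in crystallisation
missing_residues = 3         # residues missing in the structure
uniprot_length = 348         # length of Q30201
signal_peptide_length = 22   # signal peptide, omitted from the reference length

resolved = crystallised_length - missing_residues
reference = uniprot_length - signal_peptide_length

# float() keeps the division exact under Python 2 as well as Python 3
coverage = resolved / float(reference)
print("coverage = %d / %d = %.3f" % (resolved, reference, coverage))
```

With these numbers the structure covers 272 of 326 residues, i.e. roughly 83 % of the mature protein.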

3D mutations with SCWRL4

First, the wild-type sequence was extracted from the FASTA entry of 1A6Z on rcsb.org. Then all mutant sequences were generated with mut_seq.py (see below for the code). The mutated structures were then built with SCWRL4 using the following command:

Scwrl4 -i <1a6z chain A pdb file> -o <mutated pdb> -s <mutated sequence file>
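The command has to be repeated once per mutant. A minimal sketch of how this could be scripted is given here; it only uses the `-i`, `-o` and `-s` options shown above. The input file name `1a6z_A.pdb` and the output naming scheme are assumptions made for illustration, while the `.seq` names follow the pattern written by mut_seq.py.

```python
import subprocess


def scwrl4_command(input_pdb, mutant_name):
    """Build the Scwrl4 call for one mutant from the -i/-o/-s options."""
    return ["Scwrl4",
            "-i", input_pdb,
            "-o", mutant_name + ".pdb",   # hypothetical output name
            "-s", mutant_name + ".seq"]   # sequence file written by mut_seq.py


def run_all(input_pdb, mutant_names, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    commands = [scwrl4_command(input_pdb, name) for name in mutant_names]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.check_call(cmd)
    return commands


# "1a6z_A.pdb" is a placeholder for the chain A PDB file of 1A6Z.
commands = run_all("1a6z_A.pdb", ["53M", "63D", "97I", "217I", "282Y"])
```

With `dry_run=True` the commands are only printed, so the list can be inspected before SCWRL4 is actually started.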

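To check whether only the intended residue was mutated, the sequences of the PDB files containing the mutations were extracted and compared. The sequences below are, in order, the mutants C282Y, H63D, M97I, T217I and V53M (Q30201 numbering); each differs from the wild type at exactly one position, which is colored red on the original wiki page: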

RSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTYQVEHPGLDQPLIVIW
RSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDDESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIW
RSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHIFTVDFWTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIW
RSHSLHYLFMGASEQDLGLSLFEALGYVDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVISSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIW
RSHSLHYLFMGASEQDLGLSLFEALGYMDDQLFVFYDHESRRVEPRTPWVSSRISSQMWLQLSQSLKGWDHMFTVDFWTIMENHNHSKESHTLQVILGCEMQEDNSTEGYWKYGYDGQDHLEFCPDTLDWRAAEPRAWPTKLEWERHKIRARQNRAYLERDCPAQLQQLLELGRGVLDQQVPPLVKVTHHVTSSVTTLRCRALNYYPQNITMKWLKDKQPMDAKEFEPKDVLPNGDGTYQGWITLAVPPGEEQRYTCQVEHPGLDQPLIVIW
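The comparison described above can be automated. The following sketch is not the author's script; the function name is made up for illustration. It lists every position at which a mutant sequence differs from the wild type, so that a correct mutant yields exactly one entry.

```python
def differing_positions(wild_type, mutant):
    """Return (index, wild-type residue, mutant residue) for every mismatch."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences differ in length")
    pairs = zip(wild_type.upper(), mutant.upper())
    return [(i, w, m) for i, (w, m) in enumerate(pairs) if w != m]


# Short fragments around position 63, taken from the sequences listed above.
wild_type_fragment = "LFVFYDHESRRVEP"
mutant_fragment = "LFVFYDDESRRVEP"

mismatches = differing_positions(wild_type_fragment, mutant_fragment)
print(mismatches)  # [(6, 'H', 'D')]
```

For the full sequences listed above, a 0-based index i should correspond to position i + 26 in Q30201 numbering. This follows from the offsets used in mut_seq.py: 23 for the start of the FASTA sequence plus the three residues missing in the PDB structure.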

3D mutations with FoldX

FoldX was run with modified scripts from the tutorial page of FoldX.

Minimisation

The minimisation was carried out on the student machines with the following command:

/opt/SS12-Practical/minimise/minimise <inPDB> <outPDB> > <logfile>


1 kJ = 0.239 kcal
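The conversion factor can be applied as in the short sketch below. The helper names are illustrative only; the factor 0.239 is the rounded value given above (1 kcal = 4.184 kJ).

```python
KCAL_PER_KJ = 0.239  # rounded conversion factor stated above


def kj_to_kcal(energy_kj):
    """Convert an energy from kJ to kcal."""
    return energy_kj * KCAL_PER_KJ


def kcal_to_kj(energy_kcal):
    """Convert an energy from kcal to kJ."""
    return energy_kcal / KCAL_PER_KJ


# Example value chosen for illustration, not taken from the results.
print("%.1f kcal" % kj_to_kcal(100.0))
```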

mut_seq.py

<source lang="python">
import copy

# Wild-type sequence from the FASTA entry of 1A6Z, split over two literals for readability.
# Lower case matters: SCWRL4 keeps the input side chains of lower-case residues
# and only rebuilds the upper-case (mutated) ones.
seq = ("rllrshslhylfmgaseqdlglslfealgyvddqlfvfydhesrrveprtpwvssrissqmwlqlsqslkgwdhmftvdfwtimenhnhskeshtlqvilgcemqednstegywkygydgqdhlefcpdtldwraaeprawpt"
       "klewerhkirarqnraylerdcpaqlqqllelgrgvldqqvpplvkvthhvtssvttlrcralnyypqnitmkwlkdkqpmdakefepkdvlpngdgtyqgwitlavppgeeqrytcqvehpgldqpliviw")
seq = list(seq)

# position (Q30201 numbering) -> mutant residue
mut = {53: "M", 63: "D", 97: "I", 217: "I", 282: "Y"}

for key in mut:
    mut_seq = copy.deepcopy(seq)
    mut_seq[key - 23] = mut[key]  # -(22+1): 22 for PDB - seq offset and 1 for indexing starting at 0
    fName = str(key) + mut[key] + ".seq"
    with open(fName, "w+") as f:
        # leave out first three residues because they are missing in the PDB structure
        f.write("".join(mut_seq[3:]))
</source>


min_structures.sh

This script was used to execute the minimisation of all structures.

<source lang="bash">
fx=./foldx/*.pdb

for f in $fx
do
    base=${f%.*}
    mkdir $base

    # round 1 starts from the structure in ./foldx
    /opt/SS12-Practical/minimise/minimise $f $base/iter1.pdb > $base/iter1.out

    # rounds 2 to 5 each start from the output of the previous round
    for i in 2 3 4 5
    do
        p=$(($i-1))
        /opt/SS12-Practical/minimise/minimise $base/iter$p.pdb $base/iter$i.pdb > $base/iter$i.out
    done
done
</source>


get_res.py

This script was used to extract the results.

<source lang="python">
import sys
import os
from collections import defaultdict

folder = sys.argv[1]

mutNames = os.listdir(folder)
eList = defaultdict(list)
for mut in sorted(mutNames):
    if not mut.endswith(".pdb"):
        continue
    # each structure has a folder of the same name (without ".pdb") holding the logs
    mFolder = folder + "/" + mut[:-4]
    for i in range(1, 6):
        fName = mFolder + "/iter" + str(i) + ".out"
        with open(fName) as f:
            t = f.readlines()
        # the energy is read from the sixth-last line of the log
        energy = t[-6].strip()[7:].strip(")")
        energy = str(round(float(energy), 2))
        eList[mut[:-4]].append(energy)

# print one wiki table row per structure: name, then the energies of iterations 1 to 5
for mut in sorted(eList.keys()):
    eList[mut].insert(0, mut)
    print("| " + " || ".join(eList[mut]))
    print("|-")
</source>