Difference between revisions of "TSD Homology modelling protocol"
(→Visualisation with Pymol) |
m (→Modeller) |
||
(5 intermediate revisions by 2 users not shown) | |||
Line 108: | Line 108: | ||
mkdir -p ../prediction/3GH5_ED/ |
mkdir -p ../prediction/3GH5_ED/ |
||
./createModel.py ../P068653GH5_edited.pir 3GH5 ../prediction/3GH5_ED/ &> ../prediction/3GH5_ED/log |
./createModel.py ../P068653GH5_edited.pir 3GH5 ../prediction/3GH5_ED/ &> ../prediction/3GH5_ED/log |
||
+ | </source> |
||
+ | |||
+ | Creating MSAs: |
||
+ | <source lang="python"> |
||
+ | #!/usr/bin/env python |
||
+ | from modeller import * |
||
+ | import os |
||
+ | |||
+ | OUT='../prediction/' |
||
+ | IN='../input/' |
||
+ | |||
+ | P=OUT+'3GH5_3NSM/' |
||
+ | |||
+ | if not os.path.exists (P) : |
||
+ | os.mkdir(P) |
||
+ | |||
+ | env = environ() |
||
+ | aln = alignment(env) |
||
+ | |||
+ | for (code, chain) in (('3GH5', 'A'), ('3NSM', 'A')): |
||
+ | mdl = model(env, file=IN+code+'.pdb', model_segment=('FIRST:'+chain, 'LAST:'+chain)) |
||
+ | aln.append_model(mdl, atom_files=code, align_codes=code+chain) |
||
+ | aln.salign() |
||
+ | aln.write(file=P+'mymas.pap', alignment_format='PAP') |
||
+ | aln.write(file=P+'mymsa.ali', alignment_format='PIR') |
||
+ | |||
+ | aln = alignment(env) |
||
+ | aln.append(file=P+'mymsa.ali', align_codes='all') |
||
+ | aln_block = len(aln) |
||
+ | # Read aligned sequence(s): |
||
+ | aln.append(file='../input/P06865.fasta', align_codes='P06865', alignment_format='FASTA') |
||
+ | # Structure sensitive variable gap penalty sequence-sequence alignment: |
||
+ | aln.salign() |
||
+ | aln.write(file=P+'mymsa.ali', alignment_format='PIR') |
||
+ | aln.write(file=P+'mymsa.pap', alignment_format='PAP') |
||
+ | #And the same thing again for 3gh5 and 2gk1 |
||
+ | </source> |
||
+ | |||
+ | Script for creating the models from the MSAs: |
||
+ | <source lang="python"> |
||
+ | #!/usr/bin/env python |
||
+ | |||
+ | from modeller import * |
||
+ | from modeller.automodel import * |
||
+ | import os |
||
+ | |||
+ | OUT='../prediction' |
||
+ | |||
+ | os.chdir(OUT) |
||
+ | |||
+ | P='3GH5_3NSM/' |
||
+ | |||
+ | os.chdir(P) |
||
+ | |||
+ | env = environ() |
||
+ | a = automodel(env, alnfile='mymsa.ali', |
||
+ | knowns=('3GH5A', '3NSMA'), sequence='P06865') |
||
+ | a.starting_model = 1 |
||
+ | a.ending_model = 1 |
||
+ | a.make() |
||
+ | #And the same thing again for 3gh5 and 2gk1 |
||
+ | </source> |
||
+ | |||
+ | ==3D-Jigsaw== |
||
+ | PDB files consisting of multiple structures were created using repairPDB, for each sequence identity group 3 models were used. These pdb files were then used as input for [http://bmm.cancerresearchuk.org/~populus/populus_submit.html 3D-Jigsaw]. <br> |
||
+ | Example script |
||
+ | <source lang="bash"> |
||
+ | #!/bin/bash |
||
+ | |||
+ | IN=/mnt/home/student/meiera/4_HomoMod |
||
+ | |||
+ | for model in $IN/1_SWISSMODEL/2gk1/SwissModel_2gk1A.pdb $IN/2_MODELLER/prediction/2GK1/P06865.B99990001.pdb $IN/3_iTASSER/iTasser/2gk1/model1.pdb; do |
||
+ | perl repairPDB $model >> highseq.pdb |
||
+ | done |
||
+ | |||
</source> |
</source> |
||
==Evaluation== |
==Evaluation== |
||
+ | |||
+ | === SAP and TMScore === |
||
+ | The two scripts were called as follows, exemplified here by the Modeller data. |
||
+ | <source lang="bash"> |
||
+ | #!/bin/bash |
||
+ | |||
+ | |||
+ | PDBS=( 2GK1 1O7A 3GH5 ) |
||
+ | |||
+ | for pdb in "${PDBS[@]}" #Gotta love bash syntax |
||
+ | do |
||
+ | echo "$pdb" |
||
+ | /mnt/project/pracstrucfunc12/bin/sap ../prediction/$pdb/P06865.B99990001.pdb ../REFERENZchaina.pdb &> ../prediction/$pdb/sap.log |
||
+ | /mnt/project/pracstrucfunc12/bin/sap ../prediction/${pdb}_2D/P06865.B999900012d.pdb ../REFERENZchaina.pdb &> ../prediction/${pdb}_2D/sap.log |
||
+ | |||
+ | /mnt/project/pracstrucfunc12/bin/TMscore ../prediction/$pdb/P06865.B99990001.pdb ../REFERENZchaina.pdb &> ../prediction/$pdb/tmscore.log |
||
+ | /mnt/project/pracstrucfunc12/bin/TMscore ../prediction/${pdb}_2D/P06865.B999900012d.pdb ../REFERENZchaina.pdb &> ../prediction/${pdb}_2D/tmscore.log |
||
+ | done |
||
+ | |||
+ | /mnt/project/pracstrucfunc12/bin/TMscore ../prediction/3GH5_ED/P06865.B99990001ed.pdb ../REFERENZchaina.pdb &> ../prediction/3GH5_ED/tmscore.log |
||
+ | /mnt/project/pracstrucfunc12/bin/sap ../prediction/3GH5_ED/P06865.B99990001ed.pdb ../REFERENZchaina.pdb &> ../prediction/3GH5_ED/sap.log |
||
+ | </source> |
||
===6Å RMSD using Pymol=== |
===6Å RMSD using Pymol=== |
||
− | *Load |
+ | *Load reference and model and align them |
− | * |
+ | *Select active site from reference |
− | *Expand |
+ | *Expand selection by 6Å |
+ | modify -> expand -> by 6 A |
||
− | *Extract selections into objects |
||
+ | *Exclude the model sequence |
||
− | *Align objects |
||
+ | modify -> exclude -> object |
||
+ | *Extract selection into object |
||
+ | *Select according region from model |
||
+ | *Align selection to object |
Latest revision as of 22:42, 4 June 2012
Back to results.
Contents
Template selection
HHpred and COMA were employed in order to complement the 3 sequence identity classes.
SWISS-MODEL
The modelling was performed with the webserver of SWISS-MODEL in automated mode and alignment mode.
Visualisation with Pymol
SWISS-MODEL provides a model file containing the error coding in the B factor column. This error coding was adopted for the superimposed reference-model illustration. The command therefore is:
spectrum b, red_blue, <model-name>
iTasser
Modeller
Get all necessary files <source lang="bash"> cd ../input/
- HEXA sequence
wget http://www.uniprot.org/uniprot/P06865.fasta sed -ri 's/^>.+/>P06865/' P06865.fasta
- Reference structure
wget http://www.pdb.org/pdb/files/2GJX.pdb
- Templates
wget http://www.pdb.org/pdb/files/2GK1.pdb wget http://www.pdb.org/pdb/files/3GH5.pdb wget http://www.pdb.org/pdb/files/1O7A.pdb </source>
Create the pairwise alignments, using Modeller internal methods <source lang=python>
- !/usr/bin/env python
from modeller import *
OUT='../prediction/'
def makeAli(targetId, targetFile, templateId, templateFile, templateChain) :
env = environ() aln = alignment(env) mdl = model(env, file=templateFile, model_segment=('FIRST:'+templateChain, 'LAST:'+templateChain)) aln.append_model(mdl, align_codes=templateId, atom_files=templateFile) aln.append(file=targetFile, align_codes=targetId, alignment_format='FASTA') aln.align2d() aln.check() aln.write(file=OUT+targetId+templateId+'-2d.ali', alignment_format='PIR') aln.malign() aln.check() aln.write(file=OUT+targetId+templateId+'.ali', alignment_format='PIR') return
makeAli('P06865','../input/P06865.fasta','2GK1', '../input/2GK1.pdb','A' ) makeAli('P06865','../input/P06865.fasta','1O7A', '../input/1O7A.pdb','D' ) makeAli('P06865','../input/P06865.fasta','3GH5', '../input/3GH5.pdb','A' ) </source>
Call modeller on a single alignment file <source lang="python">
- !/usr/bin/env python
from modeller import * from modeller.automodel import * import os import sys
ali = sys.argv[1] strc = sys.argv[2] wd = sys.argv[3]
os.chdir(wd)
env = environ()
a = automodel(env,
alnfile = ali, knowns = strc, sequence = 'P06865', assess_methods=(assess.DOPE, assess.GA341))
a.starting_model= 1 a.ending_model = 1 a.make() </source>
Caller script for all modeller predictions
<source lang="bash">
- !/bin/bash
- Ugly hotfix so we can work in our own folders later on (who hardcodes filepaths AND doesn't allow output folders?!)
sed -i 's|X:../input|X:../../input|' ../prediction/*.ali
PDBS=( 2GK1 1O7A 3GH5 )
for pdb in "${PDBS[@]}" #Gotta love bash syntax do
echo "$pdb" mkdir -p ../prediction/$pdb/ mkdir -p ../prediction/${pdb}_2D/
./createModel.py ../P06865${pdb}.ali ${pdb} ../prediction/$pdb/ &> ../prediction/$pdb/log ./createModel.py ../P06865${pdb}-2d.ali ${pdb} ../prediction/${pdb}_2D/ &> ../prediction/${pdb}_2D/log
done
mkdir -p ../prediction/3GH5_ED/ ./createModel.py ../P068653GH5_edited.pir 3GH5 ../prediction/3GH5_ED/ &> ../prediction/3GH5_ED/log </source>
Creating MSAs: <source lang="python">
- !/usr/bin/env python
from modeller import * import os
OUT='../prediction/' IN='../input/'
P=OUT+'3GH5_3NSM/'
if not os.path.exists (P) :
os.mkdir(P)
env = environ() aln = alignment(env)
for (code, chain) in (('3GH5', 'A'), ('3NSM', 'A')):
mdl = model(env, file=IN+code+'.pdb', model_segment=('FIRST:'+chain, 'LAST:'+chain)) aln.append_model(mdl, atom_files=code, align_codes=code+chain)
aln.salign() aln.write(file=P+'mymas.pap', alignment_format='PAP') aln.write(file=P+'mymsa.ali', alignment_format='PIR')
aln = alignment(env) aln.append(file=P+'mymsa.ali', align_codes='all') aln_block = len(aln)
- Read aligned sequence(s):
aln.append(file='../input/P06865.fasta', align_codes='P06865', alignment_format='FASTA')
- Structure sensitive variable gap penalty sequence-sequence alignment:
aln.salign() aln.write(file=P+'mymsa.ali', alignment_format='PIR') aln.write(file=P+'mymsa.pap', alignment_format='PAP')
- And the same thing again for 3gh5 and 2gk1
</source>
Script for creating the models from the MSAs: <source lang="python">
- !/usr/bin/env python
from modeller import * from modeller.automodel import * import os
OUT='../prediction'
os.chdir(OUT)
P='3GH5_3NSM/'
os.chdir(P)
env = environ() a = automodel(env, alnfile='mymsa.ali',
knowns=('3GH5A', '3NSMA'), sequence='P06865')
a.starting_model = 1 a.ending_model = 1 a.make()
- And the same thing again for 3gh5 and 2gk1
</source>
3D-Jigsaw
PDB files consisting of multiple structures were created using repairPDB, for each sequence identity group 3 models were used. These pdb files were then used as input for 3D-Jigsaw.
Example script
<source lang="bash">
- !/bin/bash
IN=/mnt/home/student/meiera/4_HomoMod
for model in $IN/1_SWISSMODEL/2gk1/SwissModel_2gk1A.pdb $IN/2_MODELLER/prediction/2GK1/P06865.B99990001.pdb $IN/3_iTASSER/iTasser/2gk1/model1.pdb; do
perl repairPDB $model >> highseq.pdb
done
</source>
Evaluation
SAP and TMScore
The two scripts were called as follows, exemplified here by the Modeller data. <source lang="bash">
- !/bin/bash
PDBS=( 2GK1 1O7A 3GH5 )
for pdb in "${PDBS[@]}" #Gotta love bash syntax do
echo "$pdb" /mnt/project/pracstrucfunc12/bin/sap ../prediction/$pdb/P06865.B99990001.pdb ../REFERENZchaina.pdb &> ../prediction/$pdb/sap.log /mnt/project/pracstrucfunc12/bin/sap ../prediction/${pdb}_2D/P06865.B999900012d.pdb ../REFERENZchaina.pdb &> ../prediction/${pdb}_2D/sap.log
/mnt/project/pracstrucfunc12/bin/TMscore ../prediction/$pdb/P06865.B99990001.pdb ../REFERENZchaina.pdb &> ../prediction/$pdb/tmscore.log /mnt/project/pracstrucfunc12/bin/TMscore ../prediction/${pdb}_2D/P06865.B999900012d.pdb ../REFERENZchaina.pdb &> ../prediction/${pdb}_2D/tmscore.log
done
/mnt/project/pracstrucfunc12/bin/TMscore ../prediction/3GH5_ED/P06865.B99990001ed.pdb ../REFERENZchaina.pdb &> ../prediction/3GH5_ED/tmscore.log /mnt/project/pracstrucfunc12/bin/sap ../prediction/3GH5_ED/P06865.B99990001ed.pdb ../REFERENZchaina.pdb &> ../prediction/3GH5_ED/sap.log </source>
6Å RMSD using Pymol
- Load reference and model and align them
- Select active site from reference
- Expand selection by 6Å
modify -> expand -> by 6 A
- Exclude the model sequence
modify -> exclude -> object
- Extract selection into object
- Select according region from model
- Align selection to object