Difference between revisions of "TSD Homology modelling protocol"

From Bioinformatikpedia
(Evaluation)
m (Modeller)
 
(2 intermediate revisions by the same user not shown)
Line 108: Line 108:
 
mkdir -p ../prediction/3GH5_ED/
 
mkdir -p ../prediction/3GH5_ED/
 
./createModel.py ../P068653GH5_edited.pir 3GH5 ../prediction/3GH5_ED/ &> ../prediction/3GH5_ED/log
 
./createModel.py ../P068653GH5_edited.pir 3GH5 ../prediction/3GH5_ED/ &> ../prediction/3GH5_ED/log
  +
</source>
  +
  +
Creating MSAs:
  +
<source lang="python">
  +
#!/usr/bin/env python
  +
from modeller import *
  +
import os
  +
  +
OUT='../prediction/'
  +
IN='../input/'
  +
  +
P=OUT+'3GH5_3NSM/'
  +
  +
if not os.path.exists (P) :
  +
os.mkdir(P)
  +
  +
env = environ()
  +
aln = alignment(env)
  +
  +
for (code, chain) in (('3GH5', 'A'), ('3NSM', 'A')):
  +
mdl = model(env, file=IN+code+'.pdb', model_segment=('FIRST:'+chain, 'LAST:'+chain))
  +
aln.append_model(mdl, atom_files=code, align_codes=code+chain)
  +
aln.salign()
  +
aln.write(file=P+'mymas.pap', alignment_format='PAP')
  +
aln.write(file=P+'mymsa.ali', alignment_format='PIR')
  +
  +
aln = alignment(env)
  +
aln.append(file=P+'mymsa.ali', align_codes='all')
  +
aln_block = len(aln)
  +
# Read aligned sequence(s):
  +
aln.append(file='../input/P06865.fasta', align_codes='P06865', alignment_format='FASTA')
  +
# Structure sensitive variable gap penalty sequence-sequence alignment:
  +
aln.salign()
  +
aln.write(file=P+'mymsa.ali', alignment_format='PIR')
  +
aln.write(file=P+'mymsa.pap', alignment_format='PAP')
  +
#And the same thing again for 3gh5 and 2gk1
  +
</source>
  +
  +
Script for creating the models from the MSAs:
  +
<source lang="python">
  +
#!/usr/bin/env python
  +
  +
from modeller import *
  +
from modeller.automodel import *
  +
import os
  +
  +
OUT='../prediction'
  +
  +
os.chdir(OUT)
  +
  +
P='3GH5_3NSM/'
  +
  +
os.chdir(P)
  +
  +
env = environ()
  +
a = automodel(env, alnfile='mymsa.ali',
  +
knowns=('3GH5A', '3NSMA'), sequence='P06865')
  +
a.starting_model = 1
  +
a.ending_model = 1
  +
a.make()
  +
#And the same thing again for 3gh5 and 2gk1
 
</source>
 
</source>
   
Line 123: Line 184:
   
 
</source>
 
</source>
  +
 
==Evaluation==
 
==Evaluation==
   

Latest revision as of 22:42, 4 June 2012

Back to results.

Template selection

HHpred and COMA were employed in order to complement the 3 sequence identity classes.

SWISS-MODEL

The modelling was performed with the webserver of SWISS-MODEL in automated mode and alignment mode.

Visualisation with Pymol

SWISS-MODEL provides a model file containing the error coding in the B factor column. This error coding was adopted for the superimposed reference-model illustration. The command therefore is:

spectrum b, red_blue, <model-name>

iTasser

Modeller

Get all necessary files <source lang="bash"> cd ../input/

  1. HEXA sequence

wget http://www.uniprot.org/uniprot/P06865.fasta sed -ri 's/^>.+/>P06865/' P06865.fasta

  1. Reference structure

wget http://www.pdb.org/pdb/files/2GJX.pdb

  1. Templates

wget http://www.pdb.org/pdb/files/2GK1.pdb wget http://www.pdb.org/pdb/files/3GH5.pdb wget http://www.pdb.org/pdb/files/1O7A.pdb </source>

Create the pairwise alignments, using Modeller internal methods <source lang=python>

  1. !/usr/bin/env python

from modeller import *

OUT='../prediction/'

def makeAli(targetId, targetFile, templateId, templateFile, templateChain) :

       env = environ()
       aln = alignment(env)
       mdl = model(env, file=templateFile, model_segment=('FIRST:'+templateChain, 'LAST:'+templateChain))
       aln.append_model(mdl, align_codes=templateId, atom_files=templateFile)
       aln.append(file=targetFile, align_codes=targetId, alignment_format='FASTA')
       aln.align2d()
       aln.check()
       aln.write(file=OUT+targetId+templateId+'-2d.ali', alignment_format='PIR')
       aln.malign()
       aln.check()
       aln.write(file=OUT+targetId+templateId+'.ali', alignment_format='PIR')
       return

makeAli('P06865','../input/P06865.fasta','2GK1', '../input/2GK1.pdb','A' ) makeAli('P06865','../input/P06865.fasta','1O7A', '../input/1O7A.pdb','D' ) makeAli('P06865','../input/P06865.fasta','3GH5', '../input/3GH5.pdb','A' ) </source>

Call modeller on a single alignment file <source lang="python">

  1. !/usr/bin/env python

from modeller import * from modeller.automodel import * import os import sys

ali = sys.argv[1] strc = sys.argv[2] wd = sys.argv[3]

os.chdir(wd)

env = environ()

a = automodel(env,

            alnfile  = ali,
            knowns   = strc,
            sequence = 'P06865',
            assess_methods=(assess.DOPE, assess.GA341))

a.starting_model= 1 a.ending_model = 1 a.make() </source>

Caller script for all modeller predictions

<source lang="bash">

  1. !/bin/bash
  1. Ugly hotfix so we can work in our own folders later on (who hardcodes filepaths AND doesn't allow output folders?!)

sed -i 's|X:../input|X:../../input|' ../prediction/*.ali

PDBS=( 2GK1 1O7A 3GH5 )

for pdb in "${PDBS[@]}" #Gotta love bash syntax do

 echo "$pdb"
 mkdir -p ../prediction/$pdb/
 mkdir -p ../prediction/${pdb}_2D/
 ./createModel.py ../P06865${pdb}.ali ${pdb} ../prediction/$pdb/ &> ../prediction/$pdb/log
 ./createModel.py ../P06865${pdb}-2d.ali ${pdb} ../prediction/${pdb}_2D/ &> ../prediction/${pdb}_2D/log

done

mkdir -p ../prediction/3GH5_ED/ ./createModel.py ../P068653GH5_edited.pir 3GH5 ../prediction/3GH5_ED/ &> ../prediction/3GH5_ED/log </source>

Creating MSAs: <source lang="python">

  1. !/usr/bin/env python

from modeller import * import os

OUT='../prediction/' IN='../input/'

P=OUT+'3GH5_3NSM/'

if not os.path.exists (P) :

       os.mkdir(P)

env = environ() aln = alignment(env)

for (code, chain) in (('3GH5', 'A'), ('3NSM', 'A')):

  mdl = model(env, file=IN+code+'.pdb', model_segment=('FIRST:'+chain, 'LAST:'+chain))
  aln.append_model(mdl, atom_files=code, align_codes=code+chain)

aln.salign() aln.write(file=P+'mymas.pap', alignment_format='PAP') aln.write(file=P+'mymsa.ali', alignment_format='PIR')

aln = alignment(env) aln.append(file=P+'mymsa.ali', align_codes='all') aln_block = len(aln)

  1. Read aligned sequence(s):

aln.append(file='../input/P06865.fasta', align_codes='P06865', alignment_format='FASTA')

  1. Structure sensitive variable gap penalty sequence-sequence alignment:

aln.salign() aln.write(file=P+'mymsa.ali', alignment_format='PIR') aln.write(file=P+'mymsa.pap', alignment_format='PAP')

  1. And the same thing again for 3gh5 and 2gk1

</source>

Script for creating the models from the MSAs: <source lang="python">

  1. !/usr/bin/env python

from modeller import * from modeller.automodel import * import os

OUT='../prediction'

os.chdir(OUT)

P='3GH5_3NSM/'

os.chdir(P)

env = environ() a = automodel(env, alnfile='mymsa.ali',

             knowns=('3GH5A', '3NSMA'), sequence='P06865')

a.starting_model = 1 a.ending_model = 1 a.make()

  1. And the same thing again for 3gh5 and 2gk1

</source>

3D-Jigsaw

PDB files consisting of multiple structures were created using repairPDB, for each sequence identity group 3 models were used. These pdb files were then used as input for 3D-Jigsaw.
Example script <source lang="bash">

  1. !/bin/bash

IN=/mnt/home/student/meiera/4_HomoMod

for model in $IN/1_SWISSMODEL/2gk1/SwissModel_2gk1A.pdb $IN/2_MODELLER/prediction/2GK1/P06865.B99990001.pdb $IN/3_iTASSER/iTasser/2gk1/model1.pdb; do

        perl repairPDB $model >> highseq.pdb

done

</source>

Evaluation

SAP and TMScore

The two scripts were called as follows, exemplified here by the Modeller data. <source lang="bash">

  1. !/bin/bash


PDBS=( 2GK1 1O7A 3GH5 )

for pdb in "${PDBS[@]}" #Gotta love bash syntax do

 echo "$pdb"
 /mnt/project/pracstrucfunc12/bin/sap ../prediction/$pdb/P06865.B99990001.pdb ../REFERENZchaina.pdb &> ../prediction/$pdb/sap.log
 /mnt/project/pracstrucfunc12/bin/sap ../prediction/${pdb}_2D/P06865.B999900012d.pdb ../REFERENZchaina.pdb &> ../prediction/${pdb}_2D/sap.log
 /mnt/project/pracstrucfunc12/bin/TMscore ../prediction/$pdb/P06865.B99990001.pdb ../REFERENZchaina.pdb &> ../prediction/$pdb/tmscore.log
 /mnt/project/pracstrucfunc12/bin/TMscore ../prediction/${pdb}_2D/P06865.B999900012d.pdb ../REFERENZchaina.pdb &> ../prediction/${pdb}_2D/tmscore.log

done

/mnt/project/pracstrucfunc12/bin/TMscore ../prediction/3GH5_ED/P06865.B99990001ed.pdb ../REFERENZchaina.pdb &> ../prediction/3GH5_ED/tmscore.log /mnt/project/pracstrucfunc12/bin/sap ../prediction/3GH5_ED/P06865.B99990001ed.pdb ../REFERENZchaina.pdb &> ../prediction/3GH5_ED/sap.log </source>

6Å RMSD using Pymol

  • Load reference and model and align them
  • Select active site from reference
  • Expand selection by 6Å
modify -> expand -> by 6 A
  • Exclude the model sequence
modify -> exclude -> object
  • Extract selection into object
  • Select according region from model
  • Align selection to object