CD task4 protocol

From Bioinformatikpedia

Template identification

HHPred output

Modeller

We adapted the scripts provided in the tutorial to our needs.

Target

The sequence of our target, aspartoacylase, saved in .pir format:

>P1;p45381
sequence:p45381:::::::0.00: 0.00
MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKK
CTRYIDCDLNRIFDLENLGKKMSEDLPYEVRRAQEINHLFGPKDSEDSYDIIFDLHNTTS
NMGCTLILEDSRNNFLIQMFHYIKTSLAPLPCYVYLIEHPSLKYATTRSIAKYPVGIEVG
PQPQGVLRADILDQMRKMIKHALDFIHHFNEGKEFPPCAIEVYKIIEKVDYPRDENGEIA
AIIHPNLQDQDWKPLHPGDPMFLTLDGKTIPLGGDCTVYPVFVNEAAYYEKKEAFAKTTK
LTLNAKSIRCCLH*
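As a minimal sketch, an entry in this layout can be generated from a plain sequence string. The helper name `to_modeller_pir` and the line width of 60 are assumptions made for illustration; the header and description line follow the entry shown above.

```python
def to_modeller_pir(code, sequence, width=60):
    """Return a PIR entry for a target sequence without coordinates.

    Hypothetical helper, not part of Modeller.
    """
    # Remove whitespace and any existing terminator, then re-add the '*'
    seq = "".join(sequence.split()).upper().rstrip("*") + "*"
    header = ">P1;%s" % code
    # 'sequence' marks an entry without a structure; the remaining
    # colon-separated fields are left empty, as in the entry above
    description = "sequence:%s:::::::0.00: 0.00" % code
    chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([header, description] + chunks) + "\n"


entry = to_modeller_pir("p45381", "MTSCHIAEEHIQKVAIFGGTHGNELTGVFLVKHWLENGAEIQRTGLEVKPFITNPRAVKK")
print(entry)
```

The returned string can be written to a file such as `p45381.pir` and then read with `aln.append(...)`.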

Alignment

Example alignment script for creating the pairwise alignment of 2GU2 and Aspa.

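from modeller import *

env = environ()
# env.io.atom_files_directory = ["/apps/modeller9.9/bin/examples/commands/", "/path/to/your/files/"] ## only needed in the VirtualBox
aln = alignment(env)
# Read chain A of the template structure and add it to the alignment
mdl = model(env, file='2GU2', model_segment=('FIRST:A', 'END:A'))
aln.append_model(mdl, align_codes='2GU2', atom_files='2GU2')
# Add the target sequence
aln.append(file='../p45381.pir', align_codes='p45381')
# Pairwise alignment that takes the template structure into account
aln.align2d()
aln.check()
aln.write(file='aspa_2gu2-2d.ali', alignment_format='PIR')
# Sequence-only realignment of the same two entries, written to a separate file
aln.malign()
aln.check()
aln.write(file='aspa_2gu2.ali', alignment_format='PIR')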
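To check how similar target and template are in a written alignment, the PIR file can be parsed with a few lines of plain Python. The following is only a sketch: `read_pir` and `percent_identity` are hypothetical helpers, not Modeller functions, and identity is defined here as identical residues divided by the number of columns in which neither sequence has a gap (other definitions exist). The toy alignment is made up for illustration.

```python
def read_pir(text):
    """Parse PIR-formatted text into {align_code: aligned_sequence}."""
    entries = {}
    code = None
    skip_description = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">P1;"):
            code = line[4:]
            entries[code] = ""
            skip_description = True      # next line is the description line
        elif code is not None and skip_description:
            skip_description = False
        elif code is not None and line:
            entries[code] += line.rstrip("*")
            if line.endswith("*"):       # '*' terminates the sequence
                code = None
    return entries


def percent_identity(seq_a, seq_b):
    """Identity over columns in which both sequences have a residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


# Toy alignment (illustrative only, not the real 2GU2/p45381 alignment)
toy = """\
>P1;tmpl
structureX:tmpl:1:A:5:A::::
MK-LV*
>P1;targ
sequence:targ:::::::0.00: 0.00
MKALI*
"""
aligned = read_pir(toy)
identity = percent_identity(aligned["tmpl"], aligned["targ"])
print("%.1f%% identity" % identity)
```

For the alignment written above, the text of `aspa_2gu2-2d.ali` would be passed to `read_pir` and the entries looked up under the codes `2GU2` and `p45381`.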

Single template modelling

Example modelling script for creating a model of aspartoacylase from the single template 2GU2:A.

from modeller import *
from modeller.automodel import *

log.verbose()
env = environ()
# env.io.atom_files_directory = ["/apps/modeller9.9/bin/examples/commands/", "/path/to/your/files/"] ## only needed in the VirtualBox
a = automodel(env,
            alnfile  = 'aspa_2gu2-2d.ali',   # alignment written by the alignment script
            knowns   = '2GU2',               # template code; must match align_codes in the alignment file
            sequence = 'p45381',             # target code
            assess_methods=(assess.DOPE, assess.GA341))
# Build a single model (index 1)
a.starting_model= 1
a.ending_model  = 1
a.make()
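If more than one model is built (`ending_model` > 1), the models can be ranked by their DOPE score, where a lower (more negative) value is better. The sketch below assumes that `a.outputs` is a list of dictionaries with the keys `'name'`, `'failure'` and `'DOPE score'` after `a.make()`; `best_model` is a hypothetical helper, and the example entries are illustrative values, not results of this protocol.

```python
def best_model(outputs, key="DOPE score"):
    """Return the successfully built model with the lowest score."""
    # Keep only models that were built without an error
    ok = [m for m in outputs if m.get("failure") is None]
    if not ok:
        raise ValueError("no successfully built models")
    return min(ok, key=lambda m: m[key])


# Stand-in for a.outputs (illustrative values only)
example_outputs = [
    {"name": "model_1.pdb", "failure": None, "DOPE score": -36000.0},
    {"name": "model_2.pdb", "failure": None, "DOPE score": -37000.0},
    {"name": "model_3.pdb", "failure": "build error"},
]
top = best_model(example_outputs)
print("Top model: %s (DOPE score %.3f)" % (top["name"], top["DOPE score"]))
```

In a real run, `best_model(a.outputs)` would be called after `a.make()`.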

Alignments for different templates

2GU2 <figure id="2gu2_align">Modeller 2gu2 align.png</figure>
3NFZ <figure id="3nfz_align">Modeller 3nfz align.png</figure>

Scores

2GU2

Filename                          molpdf     DOPE score    GA341 score
----------------------------------------------------------------------
p45381.B99990001.pdb          1640.80847   -36987.52734        1.00000
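When scores for several templates are to be compared, it can help to read such summary tables into a data structure. This is a minimal sketch; `parse_model_summary` is a hypothetical helper that assumes the four-column layout shown above.

```python
def parse_model_summary(text):
    """Extract per-model scores from a summary table as shown above."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows: a .pdb filename followed by three numeric columns
        if len(fields) == 4 and fields[0].endswith(".pdb"):
            name, molpdf, dope, ga341 = fields
            rows.append({
                "name": name,
                "molpdf": float(molpdf),
                "DOPE score": float(dope),
                "GA341 score": float(ga341),
            })
    return rows


summary = """\
Filename                          molpdf     DOPE score    GA341 score
----------------------------------------------------------------------
p45381.B99990001.pdb          1640.80847   -36987.52734        1.00000
"""
rows = parse_model_summary(summary)
print(rows[0]["name"], rows[0]["DOPE score"])
```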

I-Tasser

C-score

The C-score is a confidence score with which I-TASSER estimates the quality of its predicted models. It is calculated from the significance of the threading template alignments and the convergence parameters of the structure assembly simulations. The C-score typically lies in the range [-5, 2]; a higher value signifies a model with higher confidence, a lower value a model with lower confidence.

Template > 80% Seq Id: 2GU2 (Rattus norvegicus)

For the template with > 80% sequence identity, we used the default values and simply provided the template identifier 2GU2 and chain A for modelling.