Homology Modeling of ARS A
HHpred
We used the webserver and
Modeller
Proteins used as templates
We identified the following proteins (see Alignment TASK) as potential templates for homology modeling:
SeqIdentifier | Seq Identity (from TASK 2) | source | Protein function | True homolog (HSSP) | Seq Identity (pairw. ali.) |
---|---|---|---|---|---|
1P49 | 39.0% | Homo Sapiens | Steryl-Sulfatase | yes | 31.9% |
1FSU | 28.0% | Homo Sapiens | Arylsulfatase B | yes | 26.5% |
2VQR | 20.0% | Rhizobium leguminosarum | Monoester Hydrolase | no | 20.3% |
3ED4 | 32.0% | Escherichia coli | Arylsulfatase | yes | 27.7% |
According to HSSP, the potential templates identified by the database searches include all homologs with known structure.
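The two identity columns in the table come from different alignments, and such values also depend on how identity is counted. The following is a minimal sketch of one common convention, assuming identity is the share of identical residues among the alignment columns without gaps. The function name `percent_identity` is hypothetical, and this is not necessarily the definition used by the tools that produced the numbers above.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns where both sequences have a residue.

    Assumption: gap columns are excluded from the denominator. Other
    conventions (e.g. dividing by the shorter sequence length) give
    different values for the same alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy alignment (not taken from the ARS A data)
print(percent_identity("ACD-EFG", "ACDKE-A"))
```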
Single template modelling
In order to predict the structure using a single template structure, Modeller needs a pairwise sequence alignment in PIR format. Modeller provides two different methods to calculate pairwise sequence alignments. alignment.malign() uses classical dynamic programming to align two sequences. alignment.align2d() also uses a dynamic programming approach, but includes structural information from the template to optimize the alignment (e.g. it tries to place gaps outside of secondary structure elements). We applied both alignment methods and created eight pairwise sequence alignments of the above templates with the target. The script used for this purpose is shown below:
from modeller import *
env = environ()
aln = alignment(env)
# Read the template structure
mdl = model(env, file='template_name', model_segment=('FIRST:@', 'END:'))
# Add the template structure and the target sequence to the alignment
aln.append_model(mdl, align_codes='template_name', atom_files='template_name')
aln.append(file='1AUK.pir', align_codes='target_name')
# Alignment using structural information from the template
aln.align2d()
aln.check()
aln.write(file='target-template-2d.ali', alignment_format='PIR')
# Alignment using classical dynamic programming
aln.malign()
aln.check()
aln.write(file='target-template.ali', alignment_format='PIR')
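To illustrate what "classical dynamic programming" means for two sequences, here is a minimal textbook sketch of global alignment (Needleman-Wunsch). It assumes a simple match/mismatch score and a linear gap penalty, all chosen arbitrarily for illustration. It is not Modeller's implementation, and the function name `global_alignment` is hypothetical.

```python
def global_alignment(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    n, m = len(seq_a), len(seq_b)

    def sub(i, j):
        # Substitution score for residue i of seq_a against residue j of seq_b
        return match if seq_a[i - 1] == seq_b[j - 1] else mismatch

    # score[i][j] = best score for aligning seq_a[:i] with seq_b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in seq_b
                score[i][j - 1] + gap,            # gap in seq_a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(global_alignment("ACGT", "AGT"))
```

A structure-aware method such as align2d() can be thought of as the same recursion with a gap penalty that depends on where in the template structure the gap would fall.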
From these alignments we built eight models, using the following script:
from modeller import *
from modeller.automodel import *
log.verbose()
env = environ()
a = automodel(env,
              alnfile='1AUK-1FSU-2d.ali',
              knowns='1FSU',
              sequence='1AUK',
              assess_methods=(assess.DOPE, assess.GA341))
a.starting_model = 1
a.ending_model = 1
a.make()
We modified the paths and filenames in the scripts so that they matched our proteins of interest.
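Instead of editing the filenames by hand for each run, the eight combinations of template and alignment method could be enumerated programmatically. The sketch below only builds the parameter sets and does not call Modeller. The filename pattern is an assumption extrapolated from '1AUK-1FSU-2d.ali' in the script above, and the helper `modelling_jobs` is hypothetical.

```python
TARGET = "1AUK"
TEMPLATES = ["1P49", "1FSU", "2VQR", "3ED4"]

# Assumed filename suffix per alignment method; only the "-2d" form
# appears in the modelling script, the other one is a guess.
SUFFIXES = {"align2d": "-2d", "malign": ""}


def modelling_jobs(target=TARGET, templates=TEMPLATES):
    """Return one parameter set per (template, alignment method) pair."""
    jobs = []
    for template in templates:
        for method, suffix in SUFFIXES.items():
            jobs.append({
                "method": method,
                "alnfile": f"{target}-{template}{suffix}.ali",
                "knowns": template,
                "sequence": target,
            })
    return jobs


for job in modelling_jobs():
    print(job)
```

Each dictionary holds the values that would be passed as alnfile, knowns and sequence to automodel.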
Next, we calculated RMSD and TM-scores of the models to get a first impression of how much the models deviate from the original structure. The results are shown in the table below.
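For reference, the RMSD between two sets of paired atom positions is the square root of the mean squared distance. A minimal sketch, assuming the two structures are already superimposed and that the coordinates are given as equally long lists of (x, y, z) tuples for corresponding atoms (the function name `rmsd` and the toy coordinates are illustrative only):

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D coordinates.

    Assumption: both lists are in the same order and the structures are
    already superimposed; no optimal superposition is computed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(coords_a, coords_b):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (not taken from the models)
print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 2)]))
```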
Furthermore, we visualised the models using PyMOL. We loaded both structures into the program and performed a structural alignment to superimpose and compare them visually. The PyMOL commands and the images are shown below:
align 1AUK, MODEL
hide all
show cartoon
# select color of modelled structure via graphical interface
ray
cmd.png("MODEL.png")
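Since the same commands are repeated for every model, they could be generated per model and saved as a PyMOL script. The following sketch only produces the command text. The PDB filenames, the explicit `load` and `color` lines, and the colour itself are assumptions added for illustration (the original workflow picked the colour in the graphical interface), and the helper `pymol_commands` is hypothetical.

```python
def pymol_commands(model_name, reference="1AUK", model_color="orange"):
    """Return PyMOL command lines that superimpose and render one model."""
    return [
        f"load {reference}.pdb, {reference}",    # assumed filename
        f"load {model_name}.pdb, {model_name}",  # assumed filename
        f"align {reference}, {model_name}",
        "hide all",
        "show cartoon",
        f"color {model_color}, {model_name}",    # replaces the manual GUI step
        "ray",
        f"png {model_name}.png",
    ]


print("\n".join(pymol_commands("MODEL")))
```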
[Images: superpositions of 1AUK and the models built from the templates 1P49, 2VQR, 1FSU and 3ED4, one row per alignment method (classical dynamic programming; dynamic programming with structural information from the template).]
3ED4
[Images: superpositions of 1AUK and the models built from 3ED4A, 3ED4B, 3ED4C and 3ED4D, using dynamic programming with structural information from the template.]
Modification of Alignments
TM-scores and RMSD of the single template models
We downloaded the TMscore FORTRAN source code from http://zhanglab.ccmb.med.umich.edu/TM-score/ and compiled it using
gfortran -static -O3 -ffast-math -lm -o TMscore TMscore.f
TM-scores were calculated as follows:
./TMscore MODEL.pdb REAL_STRUCTURE.pdb
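For orientation, the TM-score sums a distance-dependent term over the aligned residue pairs and normalises by the length of the reference structure, using the length-dependent scale d0 = 1.24 * (L - 15)^(1/3) - 1.8. The sketch below evaluates this formula for a given list of residue-pair distances. It is a simplification: the TMscore program additionally searches for the superposition that maximises the score, which is not done here, and the function names are hypothetical.

```python
def tm_d0(reference_length):
    """Distance scale d0 of the TM-score for a reference of given length."""
    if reference_length <= 15:
        raise ValueError("this sketch only handles reference_length > 15")
    d0 = 1.24 * (reference_length - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("reference too short for this simplified formula")
    return d0


def tm_score(distances, reference_length):
    """TM-score for fixed distances (in Angstrom) of aligned residue pairs.

    Assumption: the superposition is given; the real program optimises it.
    """
    d0 = tm_d0(reference_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / reference_length


# Toy input: 100-residue reference, 80 aligned pairs, each 2 Angstrom apart
print(tm_score([2.0] * 80, 100))
```

Unaligned residues contribute nothing to the sum but still count in the normalisation, so a model that covers only part of the reference cannot reach a score of 1.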
PDB Identifier | TM-score | RMSD | |
---|---|---|---|
Dynamic Programming with structural information | |||
1P49 | 0.7960 | - | |
2VQR | 0.4825 | - | |
1FSU | 0.7146 | - | |
3ED4 | 0.3881 | - | |
3ED4A | 0.7268 | - | |
3ED4B | 0.7251 | - | |
3ED4C | 0.6518 | - | |
3ED4D | 0.7303 | - | |
Classical Dynamic Programming | |||
1P49 | 0.7731 | - | |
2VQR | 0.3183 | - | |
1FSU | 0.7223 | - | |
3ED4 | 0.3122 | - |