Workflow homology modelling glucocerebrosidase

From Bioinformatikpedia
Revision as of 18:00, 12 June 2011 by Braunt (talk | contribs) (MODELLER)

Detailed workflow of the different homology modelling approaches for glucocerebrosidase.

MODELLER

Pairwise Sequence Alignments

1. Preparation of the Alignment File

  • Save target protein sequence in PIR-format: TARGET.pir
  • Save PDB-file of template sequence: TEMPLATE.pdb
    If the PDB-file consists of several chains, split it with the help of splitpdb. Note that splitpdb needs minor changes so that the coordinates are listed as ATOM records instead of HETATM records in the resulting PDB-file.
  • Run the following Python script with command 'mod9.9 align.py' to create a target-template alignment in PIR-format:

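from modeller import *

log.verbose()
env = environ()
aln = alignment(env)
# Read the template structure (TEMPLATE.pdb) and add it to the alignment
mdl = model(env, file='TEMPLATE')
aln.append_model(mdl, align_codes='TEMPLATE')
# Add the target sequence from the PIR file
aln.append(file='TARGET.pir', align_codes='TARGET')
# Align target and template; the tuple holds the gap opening and extension penalties
aln.align(gap_penalties_1d=(-600, -400))
# Write the alignment in PIR format (input for modelling) and PAP format (for reading)
aln.write(file='TARGET_TEMPLATE.ali', alignment_format='PIR')
aln.write(file='TARGET_TEMPLATE.pap', alignment_format='PAP')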
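
The two input files named in the preparation steps can also be produced with a few lines of plain Python. The following is only a minimal sketch: the helper names write_pir and extract_chain are hypothetical, the PIR header follows the layout used in the MODELLER tutorial files, and extract_chain simply copies the ATOM records of one chain instead of reproducing everything splitpdb does.

```python
import textwrap


def write_pir(code, sequence, path):
    """Write one sequence in the PIR layout that MODELLER reads."""
    with open(path, "w") as handle:
        handle.write(">P1;%s\n" % code)
        # Description line: entry type and code, remaining fields left empty
        handle.write("sequence:%s:::::::0.00: 0.00\n" % code)
        # Sequence wrapped at 75 characters and terminated by '*'
        handle.write("\n".join(textwrap.wrap(sequence + "*", 75)) + "\n")


def extract_chain(pdb_in, pdb_out, chain_id):
    """Copy only the ATOM records of one chain into a new PDB file."""
    kept = 0
    with open(pdb_in) as src, open(pdb_out, "w") as dst:
        for line in src:
            # Column 22 (index 21) of an ATOM record holds the chain identifier
            if line.startswith("ATOM") and len(line) > 21 and line[21] == chain_id:
                dst.write(line)
                kept += 1
        dst.write("END\n")
    return kept
```

For example, write_pir('TARGET', sequence, 'TARGET.pir') creates the target file, and extract_chain('full.pdb', 'TEMPLATE.pdb', 'A') keeps chain A only; the file name full.pdb is a placeholder.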

2. Modelling of the Target Structure

  • Run the following Python script with command 'mod9.9 model.py' to model the structure of the target sequence:
    Note that all files (the alignment file and the structure file) must be in the same folder.

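from modeller import *
from modeller.automodel import *

log.verbose()
env = environ()
# Directories searched for the template PDB file; the current folder is assumed
# here, since all files must be in the same folder
env.io.atom_files_directory = ['.']
a = automodel(env, alnfile='TARGET_TEMPLATE.ali', knowns='TEMPLATE', sequence='TARGET')
# Build a single model (index 1)
a.starting_model = 1
a.ending_model = 1
a.make()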
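
After the run, the model coordinates have to be located in the working folder. As far as known from the MODELLER tutorial output, automodel names each model after the target code, followed by B9999 and a four-digit model index. The sketch below lists the file names to look for under that assumption; the helper name expected_model_files is hypothetical and not part of MODELLER.

```python
def expected_model_files(sequence, starting_model, ending_model):
    """List the PDB file names expected for the requested model range."""
    return ["%s.B9999%04d.pdb" % (sequence, index)
            for index in range(starting_model, ending_model + 1)]


# Settings from the script above: one model of TARGET
print(expected_model_files("TARGET", 1, 1))
```

If the naming differs in another MODELLER version, the log written by the run shows the actual file names.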


Multiple Sequence Alignments

1. Preparation of the Alignment File

  • Save target protein sequence in PIR-format: TARGET.pir
  • Save PDB-files of template sequences: TEMPLATE_x.pdb
  • Run the following Python script with command 'mod9.9 align_templates.py' to create an alignment of the templates in PIR-format:

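from modeller import *

log.verbose()
env = environ()
aln = alignment(env)
# Replace the placeholders with the PDB codes and chain identifiers of the templates
for (code, chain) in (('TEMPLATE_1', 'CHAIN'), ('TEMPLATE_2', 'CHAIN'), ...):
    # Read only the selected chain of each template
    mdl = model(env, file=code, model_segment=('FIRST:'+chain, 'LAST:'+chain))
    # The alignment code is the template code followed by the chain identifier
    aln.append_model(mdl, atom_files=code, align_codes=code+chain)
# Align the templates with SALIGN (default settings)
aln.salign()
aln.write(file='msa.ali', alignment_format='PIR')
aln.write(file='msa.pap', alignment_format='PAP')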

  • Run the following Python script with command 'mod9.9 align_target.py' to add the TARGET sequence to the multiple sequence alignment:

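from modeller import *

log.verbose()
env = environ()
aln = alignment(env)
# Read the template alignment created in the previous step
aln.append(file='msa.ali', align_codes='all')
# Number of template sequences (not used by the default salign() call below)
aln_block = len(aln)
# Add the target sequence and realign
aln.append(file='TARGET.pir', align_codes='TARGET')
aln.salign()
# msa.ali is overwritten and now contains the templates and the target
aln.write(file='msa.ali', alignment_format='PIR')
aln.write(file='msa.pap', alignment_format='PAP')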

2. Modelling of the Target Structure

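  • Modify the following line of the Python script given for the pairwise sequence alignments:
    Original: a = automodel(env, alnfile='TARGET_TEMPLATE.ali', knowns='TEMPLATE', sequence='TARGET')
    Modification: a = automodel(env, alnfile='msa.ali', knowns=('TEMPLATE_1', 'TEMPLATE_2', ...), sequence='TARGET')
    Note that the entries of knowns must match the alignment codes stored in msa.ali; the template alignment script writes them as template code followed by chain identifier.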
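
Because the entries of knowns have to agree with the codes in the alignment file, it can help to read them directly from msa.ali instead of typing them by hand. The following is a minimal sketch under the assumption that every entry of the PIR file starts with a '>P1;' line, as in the files written by the scripts above; the helper names read_pir_codes and knowns_for are hypothetical.

```python
def read_pir_codes(path):
    """Return the alignment codes of all entries in a PIR file, in file order."""
    codes = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">P1;"):
                codes.append(line[4:].strip())
    return codes


def knowns_for(path, target="TARGET"):
    """Return every code except the target, as a tuple usable for knowns."""
    return tuple(code for code in read_pir_codes(path) if code != target)
```

The result of knowns_for('msa.ali') can then be passed as the knowns argument of automodel.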

I-TASSER

I-TASSER offers an option to exclude homologous template structures based on a sequence identity cut-off; this option was used in this analysis as well.

SWISS-MODEL

Automated Mode

The automated mode should only be used if target and template share more than 50% sequence identity.
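
Whether a target-template pair passes this threshold can be checked from an existing pairwise alignment. The sketch below is one possible way to do so and makes an assumption about the definition: identity is counted over the columns in which both sequences have a residue, whereas SWISS-MODEL may compute its own value differently. The function name sequence_identity is hypothetical.

```python
def sequence_identity(aligned_a, aligned_b):
    """Percent identity over columns where both aligned sequences have a residue."""
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a.isalpha() and b.isalpha()]  # skips gaps and chain breaks
    if not pairs:
        return 0.0
    identical = sum(1 for a, b in pairs if a.upper() == b.upper())
    return 100.0 * identical / len(pairs)


# Toy example: five columns are compared, four of them are identical
identity = sequence_identity("ACDE-F", "ACGEKF")
print(identity, identity > 50.0)
```

The two strings are the aligned rows of target and template, for example taken from TARGET_TEMPLATE.ali.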

Alignment Mode

To create the alignments needed as input, ClustalW2 was used with standard settings. Additionally, the alignment created with MODELLER was used for 2WNW.