Difference between revisions of "Workflow homology modelling glucocerebrosidase"

From Bioinformatikpedia
Revision as of 22:43, 13 June 2011

Detailed workflow of the different homology modelling approaches for glucocerebrosidase.

MODELLER

Pairwise Sequence Alignments

1. Preparation of the Alignment File

  • Save target protein sequence in PIR-format: TARGET.pir
  • Save PDB-file of template structure: TEMPLATE.pdb
    If the PDB-file contains several chains, split it with the help of splitpdb (note that minor changes to the tool are needed so that ATOM records, rather than HETATM records, are written to the resulting PDB-file).
  • Run the following Python script with command 'mod9.9 align.py' to create a target-template alignment in PIR-format:

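from modeller import *

log.verbose()
env = environ()
aln = alignment(env)
# Read the template structure (TEMPLATE.pdb) and add its sequence to the alignment
mdl = model(env, file='TEMPLATE')
aln.append_model(mdl, align_codes='TEMPLATE')
# Add the target sequence and align it to the template
aln.append(file='TARGET.pir', align_codes=('TARGET'))
aln.align(gap_penalties_1d=(-600, -400))
# Write the alignment in PIR format (input for modelling) and PAP format (for inspection)
aln.write(file='TARGET_TEMPLATE.ali', alignment_format='PIR')
aln.write(file='TARGET_TEMPLATE.pap', alignment_format='PAP')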
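
The chain-splitting step mentioned in the preparation list can be sketched in plain Python. This is not the splitpdb tool itself. It is a minimal sketch that assumes standard fixed-column PDB records (chain ID in column 22) and keeps only ATOM records, which is one reading of the note about ATOM versus HETATM output. The function names extract_chain and split_pdb_file are hypothetical.

```python
def extract_chain(pdb_lines, chain_id):
    """Return the ATOM records of one chain from PDB-format lines.

    Assumption: fixed-column PDB format, where the record name occupies
    columns 1-6 and the chain ID sits in column 22 (string index 21).
    HETATM records are dropped.
    """
    kept = []
    for line in pdb_lines:
        if not line.startswith("ATOM  "):
            continue  # skip HETATM, TER, REMARK, ...
        if len(line) < 22 or line[21] != chain_id:
            continue  # record belongs to another chain
        kept.append(line.rstrip("\n"))
    kept.append("END")
    return kept


def split_pdb_file(source_path, target_path, chain_id):
    """Write the ATOM records of one chain to a new PDB file."""
    with open(source_path) as handle:
        records = extract_chain(handle, chain_id)
    with open(target_path, "w") as handle:
        handle.write("\n".join(records) + "\n")
```

A call such as split_pdb_file('TEMPLATE_full.pdb', 'TEMPLATE.pdb', 'A') would then produce the single-chain file used by align.py; the input file name is only an illustration.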

2. Modelling of the Target Structure

  • Run the following Python script with command 'mod9.9 model.py' to model the structure of the target sequence:
    Note that all files (the alignment file and the structure file) must be in the same folder.

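from modeller import *
from modeller.automodel import *

log.verbose()
env = environ()
# Set this to the list of directories that hold the template PDB file
env.io.atom_files_directory = 
a = automodel (env, alnfile = 'TARGET_TEMPLATE.ali', knowns = 'TEMPLATE', sequence = 'TARGET')
# Build exactly one model
a.starting_model = 1
a.ending_model = 1
a.make()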


Multiple Sequence Alignments

1. Preparation of the Alignment File

  • Save target protein sequence in PIR-format: TARGET.pir
  • Save PDB-files of template sequences: TEMPLATE_x.pdb
  • Run the following Python script with command 'mod9.9 align_templates.py' to create an alignment of the templates in PIR-format:

from modeller import *

log.verbose()
env = environ()
aln = alignment(env)
# One (PDB code, chain ID) pair per template
for (code, chain) in (('TEMPLATE_1', 'CHAIN'), ('TEMPLATE_2', 'CHAIN'), ...):
  # Read only the given chain of each template structure
  mdl = model(env, file=code, model_segment=('FIRST:'+chain, 'LAST:'+chain))
  aln.append_model(mdl, atom_files=code, align_codes=code+chain)
# Align the templates with SALIGN
aln.salign()
aln.write(file='msa.ali', alignment_format='PIR')
aln.write(file='msa.pap', alignment_format='PAP')

  • Run the following Python script with command 'mod9.9 align_target.py' to add the TARGET sequence to the multiple sequence alignment:

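from modeller import *

log.verbose()
env = environ()
aln = alignment(env)
# Read the template alignment created by align_templates.py
aln.append(file='msa.ali', align_codes='all')
# Number of template sequences already in the alignment
aln_block = len(aln)
# Add the target sequence and realign
aln.append(file='TARGET.pir', align_codes='TARGET')
aln.salign()
# Note: this overwrites the template-only msa.ali
aln.write(file='msa.ali', alignment_format='PIR')
aln.write(file='msa.pap', alignment_format='PAP')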
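
Both workflows start from a target sequence saved in PIR format. The following minimal sketch writes such an entry in the layout MODELLER reads: a '>P1;code' line, a colon-separated description line starting with 'sequence', and the residues terminated by '*'. The helper name to_modeller_pir and the 75-character line width are assumptions made for illustration.

```python
import textwrap


def to_modeller_pir(code, sequence, width=75):
    """Format a one-letter amino acid sequence as a MODELLER PIR entry."""
    # Remove whitespace and line breaks, normalise to upper case.
    residues = "".join(sequence.split()).upper()
    header = ">P1;%s" % code
    # Ten colon-separated fields; only the first two are filled for a target.
    description = "sequence:%s:::::::0.00: 0.00" % code
    # The sequence must end with '*'.
    body = textwrap.wrap(residues + "*", width)
    return "\n".join([header, description] + body) + "\n"
```

Writing the result of to_modeller_pir('TARGET', seq) to TARGET.pir gives a file whose code matches the align_codes='TARGET' argument used in the scripts above.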

2. Modelling of the Target Structure

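  • Modify the following line of the Python script given for the pairwise sequence alignments so that it reads the multiple sequence alignment and lists all templates (the names given in knowns must match the align codes written to msa.ali):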
Original: a = automodel (env, alnfile = 'TARGET_TEMPLATE.ali', knowns = 'TEMPLATE', sequence = 'TARGET')
Modification: a = automodel(env, alnfile='msa.ali', knowns=('TEMPLATE_1', 'TEMPLATE_2', ...), sequence='TARGET')
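
Because automodel looks up the knowns and sequence names among the align codes of the alignment file, it can help to check the file before starting a run. The sketch below is a plain-Python check, not part of MODELLER. It assumes that every entry in the PIR file starts with a '>P1;' line, and the function names are hypothetical.

```python
def pir_align_codes(pir_text):
    """Return the align codes found in PIR-format text, in file order."""
    codes = []
    for line in pir_text.splitlines():
        if line.startswith(">P1;"):
            codes.append(line[4:].strip())
    return codes


def missing_codes(pir_text, knowns, sequence):
    """Return the requested codes that are absent from the alignment."""
    present = set(pir_align_codes(pir_text))
    wanted = list(knowns) + [sequence]
    return [code for code in wanted if code not in present]
```

An empty list from missing_codes(open('msa.ali').read(), knowns, 'TARGET') means that every name passed to automodel is present in the alignment.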

I-TASSER

I-TASSER can exclude homologous template structures above a given sequence identity cut-off; this option was used in this analysis as well.

SWISS-MODEL

Automated Mode

The automated mode should only be used if target and template share more than 50% sequence identity.
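
Whether a target-template pair clears the 50% threshold can be estimated from a pairwise alignment. The sketch below counts identical residues over the columns in which both sequences have a residue. This is only one of several common definitions of sequence identity and not necessarily the one SWISS-MODEL applies; the function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # column contains a gap, not counted
        aligned += 1
        if res_a.upper() == res_b.upper():
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned
```

The two aligned rows of TARGET_TEMPLATE.ali (with line breaks and the trailing '*' removed) can be passed in directly.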

Alignment Mode

To create the alignments needed as input, the tool ClustalW2 (http://www.ebi.ac.uk/Tools/msa/clustalw2/) was used with standard settings. Additionally, the alignment created with MODELLER was used for 2WNW.

Evaluation

All-Atom RMSD in Area of 6Å around Active Site

  1. Define the active site: residues E235 and E340 form the active site of glucocerebrosidase<ref>http://www.nature.com/embor/journal/v4/n7/full/embor873.html</ref>.
  2. Load the reference structure (1OGS) and the models into PyMOL and superimpose each model on the reference (action -> align -> to molecule -> 1OGS).
  3. Select the active site residues in the reference structure and expand the selection by 6 Å (action -> modify -> expand -> by 6 A, residues).
  4. Save the selected residues as separate selections, one for each structure.
  5. Align the selection of each model to the selection of the reference structure (action -> align -> to selection -> ref_active_site).
  6. PyMOL reports the resulting RMSD after the alignment.
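
The menu steps above can also be scripted through PyMOL's Python API. The following is a rough sketch rather than the procedure actually used. It assumes that the PyMOL Python module is installed, that the active site glutamates carry residue numbers 235 and 340 in the reference file, and that cmd.align is run with its default settings (which include outlier rejection). The function names, object names and file names are hypothetical. For a reference file with several chains, the selection may need an additional chain restriction.

```python
def site_selection(obj, ref_obj, site_resi=("235", "340"), radius=6.0):
    """Build a PyMOL selection string: all residues of `obj` that have an
    atom within `radius` Angstrom of the active site residues of `ref_obj`."""
    site = "(%s and resi %s)" % (ref_obj, "+".join(site_resi))
    return "byres (%s within %g of %s)" % (obj, radius, site)


def active_site_rmsd(ref_file, model_files, ref_obj="ref"):
    """Return {model file: RMSD} for the region around the active site."""
    from pymol import cmd  # imported here so the helper above works without PyMOL

    cmd.load(ref_file, ref_obj)
    cmd.select("ref_active_site", site_selection(ref_obj, ref_obj))
    results = {}
    for index, path in enumerate(model_files):
        obj = "model_%d" % index
        cmd.load(path, obj)
        # Step 2: superimpose the whole model on the reference.
        cmd.align(obj, ref_obj)
        # Steps 3-4: residues of the model near the reference active site.
        selection = obj + "_active_site"
        cmd.select(selection, site_selection(obj, ref_obj))
        # Steps 5-6: align the two selections; the first element of the
        # returned list is the RMSD after refinement.
        results[path] = cmd.align(selection, "ref_active_site")[0]
    return results
```

Run inside PyMOL or with a PyMOL-enabled interpreter, active_site_rmsd('1OGS.pdb', ['model_a.pdb', 'model_b.pdb']) would return one RMSD value per model.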

References

<references/>