Using Modeller for TASK 4

From Bioinformatikpedia

Latest revision as of 14:54, 10 June 2013

This tutorial is based on the official Modeller tutorial. Please read it for further information.

Pairwise Alignments

Modeller needs a pairwise sequence alignment of the target and the template in PIR format in order to predict the structure. Modeller provides two different methods to calculate pairwise sequence alignments. alignment.malign() uses classical dynamic programming to align two sequences. alignment.align2d() also uses a dynamic programming approach, but includes structural information from the template to optimize the alignment (e.g. it tries to place gaps outside of secondary structure elements). To create the alignment, you first need to store your target sequence in PIR format. Just copy-paste your protein identifier and sequence into the corresponding regions of the following example:

>P1;protein_id
sequence:protein_id:::::::0.00: 0.00
your_amino_acid_sequence_needs_to_be_here*
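
If you prefer not to edit the file by hand, the PIR entry can also be generated. The following is only a minimal sketch in plain Python (it does not use Modeller); the function name format_pir_sequence and the line width of 75 are our own choices for illustration, while the header and description lines follow the example above.

```python
def format_pir_sequence(protein_id, sequence, width=75):
    """Return a PIR entry (as text) for a target sequence without structure.

    format_pir_sequence is a hypothetical helper, not part of Modeller.
    """
    # Remove whitespace and line breaks, normalise case, and make sure the
    # sequence ends with exactly one '*' terminator.
    residues = "".join(sequence.split()).upper().rstrip("*")
    if not residues:
        raise ValueError("sequence is empty")
    residues += "*"
    # Break the sequence into lines of at most `width` characters.
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    header = ">P1;%s" % protein_id
    description = "sequence:%s:::::::0.00: 0.00" % protein_id
    return "\n".join([header, description] + lines) + "\n"


if __name__ == "__main__":
    # Write the entry to target.pir, the file name used in the scripts below.
    with open("target.pir", "w") as handle:
        handle.write(format_pir_sequence("target_name", "mkt ayi\nak"))
```

The identifier passed as protein_id has to be the same one you later use as align_codes and sequence in the Modeller scripts.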

Having done this, you should be able to calculate the pairwise alignment. An example script, which executes both alignment methods described above, is shown below. Just replace "target_name" (and the file name "target.pir") and "template_name" with your own target and template identifiers. Also note that we installed Modeller locally on our home computers. After installation, you need to get a license key from (http://salilab.org/modeller/registration.html). The program will not run until it is licensed. If you run the program in the VirtualBox, you need to set the paths in env.io.atom_files_directory correctly.


from modeller import *
env = environ()
# env.io.atom_files_directory = ["/apps/modeller9.9/bin/examples/commands/", "/path/to/your/files/"] ## only needed in the VirtualBox
aln = alignment(env)
# Read the template structure and add its sequence to the alignment
mdl = model(env, file='template_name', model_segment=('FIRST:@', 'END:'))
aln.append_model(mdl, align_codes='template_name', atom_files='template_name')
# Add the target sequence from the PIR file created above
aln.append(file='target.pir', align_codes='target_name')
# Alignment that uses structural information from the template
aln.align2d()
aln.check()
aln.write(file='target-template-2d.ali', alignment_format='PIR')
# Realign the same two sequences with classical dynamic programming
aln.malign()
aln.check()
aln.write(file='target-template.ali', alignment_format='PIR')
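
To compare the two resulting alignment files, it can help to compute the sequence identity between target and template. The sketch below is plain Python and independent of Modeller; read_pir and percent_identity are hypothetical helper names, and identity is defined here (as an assumption, other definitions exist) as the number of identical residues divided by the number of columns in which neither sequence has a gap.

```python
def read_pir(text):
    """Parse PIR-formatted text into a dict {align_code: aligned_sequence}."""
    entries = {}
    code = None
    expect_description = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(">P1;"):
            code = line[4:].strip()
            entries[code] = ""
            expect_description = True
        elif code is not None and expect_description:
            # The line directly after the header is the description line.
            expect_description = False
        elif code is not None and line:
            entries[code] += line.rstrip("*")
            if line.endswith("*"):
                code = None  # '*' terminates the entry
    return entries


def percent_identity(seq_a, seq_b):
    """Identity in percent over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    for name in ("target-template-2d.ali", "target-template.ali"):
        with open(name) as handle:
            aln = read_pir(handle.read())
        print(name, percent_identity(aln["template_name"], aln["target_name"]))
```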

Single template modeling

To run the actual modeling of your protein using default parameters you can use the following script:


from modeller import *
from modeller.automodel import *    
log.verbose()   
env = environ() 
# env.io.atom_files_directory = ["/apps/modeller9.9/bin/examples/commands/", "/path/to/your/files/"] ## only needed in the VirtualBox
a = automodel(env,
             alnfile  = 'my_alignment.ali',   
             knowns   = 'name_of_template',              
             sequence = 'name_of_target',
             assess_methods=(assess.DOPE, assess.GA341))
a.starting_model= 1                
a.ending_model  = 1                
a.make()                          
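
If you build more than one model (by increasing a.ending_model), you probably want to pick the best one by its DOPE score. The sketch below assumes that, after a.make(), a.outputs is a list of dictionaries with the keys 'name', 'failure' and 'DOPE score', as used in the official tutorial; please check these key names against your Modeller version. The helper name best_model_by_dope is our own.

```python
def best_model_by_dope(outputs):
    """Return the successfully built model with the lowest DOPE score.

    `outputs` is assumed to look like automodel's a.outputs: a list of
    dicts with at least the keys 'name', 'failure' and 'DOPE score'.
    """
    # Keep only models that were built without an error.
    ok_models = [m for m in outputs if m.get("failure") is None]
    if not ok_models:
        raise ValueError("no model was built successfully")
    # A lower (more negative) DOPE score indicates a better model.
    return min(ok_models, key=lambda m: m["DOPE score"])


if __name__ == "__main__":
    # Made-up example data, only to show the expected structure.
    example_outputs = [
        {"name": "model_1.pdb", "failure": None, "DOPE score": -100.0},
        {"name": "model_2.pdb", "failure": None, "DOPE score": -150.0},
    ]
    print(best_model_by_dope(example_outputs)["name"])
```

In the script above you would call best_model_by_dope(a.outputs) after a.make().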


Multiple Alignments

The Advanced Modeling tutorial for multiple template modeling suggests first aligning all templates and then adding the target sequence to this alignment. An example script for multiple template alignment using default parameters is given below:


from modeller import *
log.verbose()
env = environ()
# env.io.atom_files_directory = ["/apps/modeller9.9/bin/examples/commands/", "/path/to/your/files/"] ## only needed in the VirtualBox
aln = alignment(env)
# Replace PROTEIN with your template codes and CHAIN with the chain IDs you want to use for modeling.
# The ... stands for further (code, chain) pairs; remove it before running the script.
for (code, chain) in (('PROTEIN', 'CHAIN'), ('ANOTHER_PROTEIN', 'ANOTHER_CHAIN'), ...):
   mdl = model(env, file=code, model_segment=('FIRST:'+chain, 'LAST:'+chain))
   aln.append_model(mdl, atom_files=code, align_codes=code+chain)
aln.salign()
aln.write(file='mymsa.pap', alignment_format='PAP')
aln.write(file='mymsa.ali', alignment_format='PIR')

Then you are able to add your target sequence to the alignment:


from modeller import *
log.verbose()
env = environ()
# env.io.atom_files_directory = ["/apps/modeller9.9/bin/examples/commands/", "/path/to/your/files/"] ## only needed in the VirtualBox
env.libs.topology.read(file='$(LIB)/top_heav.lib')
# Read aligned structure(s):
aln = alignment(env)
aln.append(file='mymsa.ali', align_codes='all')
aln_block = len(aln)
# Read aligned sequence(s):
aln.append(file='target.pir', align_codes='target_name')
# Structure sensitive variable gap penalty sequence-sequence alignment:
aln.salign()
aln.write(file='mymsa.ali', alignment_format='PIR')
aln.write(file='mymsa.pap', alignment_format='PAP')

Multiple template modeling

Multiple template modeling is performed analogously to single template modeling. The script for modeling with default parameters is shown below:


from modeller import *
from modeller.automodel import *
env = environ()
# env.io.atom_files_directory = ["/apps/modeller9.9/bin/examples/commands/", "/path/to/your/files/"] ## only needed in the VirtualBox
# knowns must match the align codes in mymsa.ali (template code + chain, as written by the alignment script)
a = automodel(env, alnfile='mymsa.ali',
              knowns=('first_template', 'second_template', ...), sequence='target_name')
a.starting_model = 1
a.ending_model = 1
a.make()
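
A common reason for this script to fail is that the names given in knowns or sequence do not match the align codes stored in the alignment file (the alignment script above writes them as template code plus chain). The following sketch, in plain Python and independent of Modeller, lists the codes that are missing from a PIR file; the function names are hypothetical and the codes in the example are placeholders.

```python
def align_codes_in_pir(text):
    """Return the align codes of all '>P1;' header lines in PIR text."""
    codes = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(">P1;"):
            codes.append(line[4:].strip())
    return codes


def missing_align_codes(text, required):
    """Return the entries of `required` that do not occur in the PIR text."""
    present = set(align_codes_in_pir(text))
    return [code for code in required if code not in present]


if __name__ == "__main__":
    with open("mymsa.ali") as handle:
        pir_text = handle.read()
    # Placeholder names: use the same values as in knowns and sequence.
    wanted = ["first_template", "second_template", "target_name"]
    missing = missing_align_codes(pir_text, wanted)
    if missing:
        print("Not found in mymsa.ali:", ", ".join(missing))
```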