Lab Journal - Task 5 (PAH)
To use Modeller, we followed the tutorial written by the students of 2011.
- First, the target sequence must be provided in PIR format (2pah has one lysine at the end and is therefore one aa longer than P00439???):
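A minimal sketch of how a plain sequence could be wrapped into the PIR layout that Modeller reads (a `>P1;code` line, a description line with ten colon-separated fields, and the sequence terminated by `*`). The function name `to_pir` and the short placeholder sequence are assumptions for illustration only; the real input would be the full target sequence.

```python
def to_pir(code, sequence, width=75):
    """Return `sequence` as a Modeller-style PIR entry with alignment code `code`."""
    # Remove whitespace and line breaks, as found in pasted FASTA bodies.
    seq = "".join(sequence.split()).upper() + "*"  # '*' terminates a PIR sequence
    header = ">P1;%s" % code
    # Ten colon-separated fields; for a target only the type and code are filled in.
    description = "sequence:%s:::::::0.00: 0.00" % code
    body = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([header, description] + body) + "\n"


# Placeholder sequence, NOT the real PAH sequence.
print(to_pir("2pah", "ACDEFGHIKL"))
```

The result can be written to a file such as `target.pir` and passed to `aln.append(file=..., align_codes='2pah')`.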
An example Python script for template 1j8u:
Creating the alignment
<source lang=python>
from modeller import *

env = environ()
aln = alignment(env)
mdl = model(env, file='1j8u.pdb', model_segment=('FIRST:@', 'END:'))
aln.append_model(mdl, align_codes='1j8u', atom_files='1j8u')
aln.append(file='target.pir', align_codes='2pah')
aln.align2d()
aln.check()
aln.write(file='/mnt/home/student/waldraffs/masterpracitcal/Task5/alignments/1j8u-2pah-2d.ali', alignment_format='PIR')
aln.malign()
aln.check()
aln.write(file='/mnt/home/student/waldraffs/masterpracitcal/Task5/alignments/1j8u-2pah.ali', alignment_format='PIR')
</source>
Single template modeling
<source lang=python>
from modeller import *
from modeller.automodel import *

log.verbose()
env = environ()
a = automodel(env,
              alnfile='/mnt/home/student/waldraffs/masterpracitcal/Task5/alignments/1j8u-2pah.ali',
              knowns='1j8u',
              sequence='2pah',
              assess_methods=(assess.DOPE, assess.GA341))
a.starting_model = 1
a.ending_model = 1
a.make()
</source>
The same was done for the templates 2phm and 3luy.
We performed the following steps on the I-TASSER server for protein structure and function prediction:
- Paste query: We pasted the FASTA sequence of our protein PAH (P00439) as the query sequence into the large field at the top of the form.
- Paste template (optional): We were given three different templates (1J8U, 2PHM and 3LUY), so we had to run three separate I-TASSER jobs. For each job, we typed the PDB ID into the field "Specify template without alignment".
- Paste cutoff (optional): Since 1J8U has 100% sequence identity to our protein (2PAH), we did not set a cutoff value for this job. However, we set the cutoff to 90% for 2PHM and to 35% for 3LUY. To do so, one enters the cutoff in percent in Option II, in the field "Exclude homologous templates". Subsequently, the job is started by clicking the button "Run I-TASSER".
Since the server is overloaded with jobs, one run takes about 60 hours to complete, and each user can only submit one job!
The GDT-TS scores were calculated with the LGA server. The following options were used:
-4 -o2 -gdc -3 -sda
The submitted PDB entries were restricted to chain A by appending '_A' to the ID (for example, 2pah_A).
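For orientation, GDT-TS averages the percentage of Cα atoms that lie within 1, 2, 4 and 8 Å of their counterparts in the reference structure. The sketch below computes this from a list of per-residue distances under one fixed superposition. This is a simplification and not what LGA does internally, since LGA searches for a favourable superposition separately for each cutoff. The function name `gdt_ts` and the example distances are made up for illustration.

```python
def gdt_ts(distances, n_reference=None, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT-TS (0-100) from C-alpha/C-alpha distances in Angstrom.

    `distances` holds one value per aligned residue pair; `n_reference` is the
    number of residues in the reference (defaults to the number of pairs).
    """
    n = len(distances) if n_reference is None else n_reference
    if n <= 0:
        raise ValueError("no residues to compare")
    # Count the pairs within each cutoff, then average the four fractions.
    within = sum(sum(1 for d in distances if d <= c) for c in cutoffs)
    return 100.0 * within / (n * len(cutoffs))


# Made-up distances for five aligned residues.
print(gdt_ts([0.5, 1.5, 3.0, 6.0, 10.0]))
```

Passing `n_reference` matters when the model covers only part of the reference, because unaligned residues then count as misses.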
To calculate the Cα RMSD, we used the sap script, which is available on the biolab servers and is used as follows:
/mnt/project/pracstrucfunc13/bin/sap file1.pdb file2.pdb > out.txt
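To make the reported quantity concrete: the Cα RMSD over N paired atoms is the square root of the mean squared distance between them. The sketch below reads the Cα coordinates of one chain from PDB-formatted text and computes the RMSD for coordinates that are assumed to be already superposed and paired one-to-one. sap itself additionally performs the structural alignment and superposition, which this sketch does not attempt. The helper names (`ca_coords`, `rmsd`, `atom_line`) and the coordinates are made up for illustration.

```python
import math


def ca_coords(pdb_text, chain="A"):
    """Collect (x, y, z) of the C-alpha atoms of one chain from PDB ATOM records."""
    coords = []
    for line in pdb_text.splitlines():
        # Fixed PDB columns: atom name 13-16, chain ID 22, x/y/z 31-54.
        if (line.startswith("ATOM") and len(line) >= 54
                and line[12:16].strip() == "CA" and line[21] == chain):
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def rmsd(a, b):
    """RMSD between two equally long, already superposed coordinate lists."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(squared / len(a))


def atom_line(serial, name, chain, resnum, x, y, z):
    """Build a minimal ATOM record for the demo (hypothetical helper, residue ALA)."""
    return "ATOM  %5d %-4s %3s %s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, "ALA", chain, resnum, x, y, z)


# Two tiny made-up "structures" with two residues each in chain A.
model = "\n".join([atom_line(1, " CA ", "A", 1, 0.0, 0.0, 0.0),
                   atom_line(2, " CA ", "A", 2, 3.8, 0.0, 0.0)])
reference = "\n".join([atom_line(1, " CA ", "A", 1, 0.0, 0.0, 2.0),
                       atom_line(2, " CA ", "A", 2, 3.8, 2.0, 0.0)])
print(rmsd(ca_coords(model), ca_coords(reference)))
```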