Difference between revisions of "Lab Journal - Task 4 (PAH)"

From Bioinformatikpedia

Revision as of 11:39, 1 June 2013

Explore structural alignments

Evaluate sequence alignments

The Perl script hhmakemodel.pl can be found in /usr/share/hhsuite/scripts. It needs an input file in the form of an hhsearch result file with hit list and alignments (-i).


So first an hhr file is created with hhsearch from the FASTA file of our protein PAH and the pdb database. The output file is called hhsearch_PAH.hhr. Furthermore, we set the maximal number of reported lines in the summary (-Z) and in the alignments (-B) to 10000:
hhsearch -i /mnt/home/student/waldraffs/Masterpraktikum/PAH.fasta -d /mnt/project/pracstrucfunc13/data/hhblits/pdb70_current_hhm_db -o /mnt/home/student/waldraffs/Masterpraktikum/Task4/hhsearch_PAH.hhr -Z 10000 -B 10000
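A quick way to check how many hits such a result file contains is to count the rows of its summary table. The following is only a minimal sketch: the function name count_hits is hypothetical, and it assumes that the summary table starts with the header row beginning with "No Hit" and ends at the first blank line.

```shell
# count_hits FILE: print the number of rows in the hit summary table of an .hhr file.
# Assumption: the table starts after the "No Hit ..." header row and ends at the first blank line.
count_hits() {
  awk '
    /^ *No Hit/         { in_table = 1; next }  # header row of the summary table
    in_table && NF == 0 { exit }                # blank line closes the table
    in_table            { n++ }                 # one row per hit
    END                 { print n + 0 }         # prints 0 if no table was found
  ' "$1"
}
```

It could be called as count_hits /mnt/home/student/waldraffs/Masterpraktikum/Task4/hhsearch_PAH.hhr.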

26 entries were found. Two of them are also used in "Explore structural alignments": 1j8u and 3luy. For hhmakemodel.pl we chose eight entries, trying to cover the whole range of E-values and scores; by default it would only take the first hit:

No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
 1 1phz_A Protein (phenylalanine  100.0  7E-165  3E-169 1181.8   0.0  429    1-429     1-429 (429)
 2 1j8u_A Phenylalanine-4-hydroxy 100.0  3E-135  1E-139  951.9   0.0  325  103-427     1-325 (325)
 6 2v27_A Phenylalanine hydroxyla 100.0 3.6E-74 1.4E-78  528.8   0.0  231  172-406    13-248 (275)
 7 2qmx_A Prephenate dehydratase;  98.2 1.1E-09   4E-14   98.4   0.0   67   33-99    199-266 (283)
 9 3luy_A Probable chorismate mut  98.1 3.3E-09 1.2E-13   97.7   0.0   67   33-99    206-274 (329)
12 1qey_A MNT-C, protein (regulat  54.0     3.4 0.00013   28.3   0.0   12  189-200    18-29  (31)
19 1wyp_A Calponin 1; CH domain,   29.4      15 0.00057   29.7   0.0   68  201-269    47-115 (136)
26 1a6s_A GAG polyprotein; core p  20.6      29  0.0011   28.0   0.0   42  141-196    42-83  (87)

The following hhmakemodel.pl options are used to build the models from these hits:
-d   <pdbdirs>         directories containing the pdb files (for PDB, SCOP, or DALI
                       sequences) (default=/cluster/databases/pdb/all)
-m   <int> [<int> ...] pick hits with specified indices  (default='-m 1')
-ts  <file.pdb>        write the PDB-formatted models based on *pairwise*
                       alignments into file.pdb

perl /usr/share/hhsuite/scripts/hhmakemodel.pl -i /mnt/home/student/waldraffs/Masterpraktikum/Task4/hhsearch_PAH.hhr -d /mnt/project/pracstrucfunc13/data/pdb/20120401/entries/* -m 1 2 6 7 9 12 19 26 -ts /mnt/home/student/waldraffs/Masterpraktikum/Task4/model_PAH.pdb
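
Before running hhmakemodel.pl it can help to verify that the indices passed to -m really point to the intended hits. The sketch below prints the summary-table rows for a list of hit numbers. The function name show_hits is hypothetical, and the sketch assumes the same table layout as the excerpt above (header row beginning with "No Hit", hit number in the first column, blank line after the last hit).

```shell
# show_hits FILE INDEX...: print the header row and the summary-table rows
# whose hit number (first column) is one of the given indices.
show_hits() {
  file=$1
  shift
  awk -v want="$*" '
    BEGIN {
      # turn the space-separated index list into a lookup table
      n = split(want, idx, " ")
      for (i = 1; i <= n; i++) pick[idx[i]] = 1
    }
    /^ *No Hit/            { in_table = 1; print; next }  # keep the header row
    in_table && NF == 0    { exit }                       # blank line closes the table
    in_table && ($1 in pick) { print }                    # selected hit
  ' "$file"
}
```

For the selection used here, the call would be show_hits /mnt/home/student/waldraffs/Masterpraktikum/Task4/hhsearch_PAH.hhr 1 2 6 7 9 12 19 26.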