Difference between revisions of "Lab Journal - Task 4 (PAH)"
Revision as of 10:39, 1 June 2013
Explore structural alignments
Evaluate sequence alignments
The Perl script hhmakemodel.pl can be found in /usr/share/hhsuite/scripts. As input (-i) it needs an hhsearch result file containing the hit list and the alignments.
So first an .hhr file is created from the FASTA file of our protein PAH and the PDB database. The output file is called hhsearch_PAH.hhr. Furthermore, we set the maximum number of lines reported in the summary (-Z) and in the alignments (-B) to 10000:
hhsearch -i /mnt/home/student/waldraffs/Masterpraktikum/PAH.fasta -d /mnt/project/pracstrucfunc13/data/hhblits/pdb70_current_hhm_db -o /mnt/home/student/waldraffs/Masterpraktikum/Task4/hhsearch_PAH.hhr -Z 10000 -B 10000
26 entries were found. Two of them, 1j8u and 3luy, are also used in "Explore structural alignments". For hhmakemodel.pl we chose eight entries, trying to cover the whole range of E-values and scores; by default it would only take the first hit:
 No Hit                             Prob  E-value  P-value  Score   SS  Cols Query HMM  Template HMM
  1 1phz_A Protein (phenylalanine  100.0  7E-165   3E-169   1181.8  0.0  429   1-429      1-429 (429)
  2 1j8u_A Phenylalanine-4-hydroxy 100.0  3E-135   1E-139    951.9  0.0  325 103-427      1-325 (325)
  6 2v27_A Phenylalanine hydroxyla 100.0  3.6E-74  1.4E-78   528.8  0.0  231 172-406     13-248 (275)
  7 2qmx_A Prephenate dehydratase;  98.2  1.1E-09  4E-14      98.4  0.0   67  33-99     199-266 (283)
  9 3luy_A Probable chorismate mut  98.1  3.3E-09  1.2E-13    97.7  0.0   67  33-99     206-274 (329)
 12 1qey_A MNT-C, protein (regulat  54.0  3.4      0.00013    28.3  0.0   12 189-200     18-29  (31)
 19 1wyp_A Calponin 1; CH domain,   29.4  15       0.00057    29.7  0.0   68 201-269     47-115 (136)
 26 1a6s_A GAG polyprotein; core p  20.6  29       0.0011     28.0  0.0   42 141-196     42-83  (87)
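The chosen rows can be pulled straight from the result file to double-check the indices before they are passed to hhmakemodel.pl. The following is only a minimal sketch: the function name hhr_hits is made up for illustration, and it assumes that the hit list in the .hhr file starts after the header line beginning with "No Hit" and ends at the first blank line.

```shell
# hhr_hits FILE INDEX [INDEX ...]
# Print the summary rows of an hhsearch result file for the given hit indices.
# Assumption: the hit list follows the " No Hit ..." header line and is
# terminated by the first blank line after it.
hhr_hits() {
    hhr_file=$1
    shift
    awk -v wanted="$*" '
        BEGIN {
            # Turn the space-separated index list into a lookup table.
            n = split(wanted, w, " ")
            for (i = 1; i <= n; i++) keep[w[i]] = 1
        }
        /^ *No Hit/        { in_list = 1; next }  # header of the hit list
        in_list && NF == 0 { exit }               # blank line ends the hit list
        in_list && ($1 in keep) { print }         # first field is the hit number
    ' "$hhr_file"
}
```

For the selection used here the call would be `hhr_hits hhsearch_PAH.hhr 1 2 6 7 9 12 19 26`.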
The hhmakemodel.pl options used are:
 -d <pdbdirs>          directories containing the pdb files (for PDB, SCOP, or DALI
                       sequences) (default=/cluster/databases/pdb/all)
 -m <int> [<int> ...]  pick hits with specified indices (default='-m 1')
 -ts <file.pdb>        write the PDB-formatted models based on *pairwise*
                       alignments into file.pdb
perl /usr/share/hhsuite/scripts/hhmakemodel.pl -i /mnt/home/student/waldraffs/Masterpraktikum/Task4/hhsearch_PAH.hhr -d /mnt/project/pracstrucfunc13/data/pdb/20120401/entries/* -m 1 2 6 7 9 12 19 26 -ts /mnt/home/student/waldraffs/Masterpraktikum/Task4/model_PAH.pdb