Task 2 - Alignments with PAH Reference
Sequence Searches
BLAST
Running
<source lang="bash"> time sudo blastall -p blastp -d '/data/blast/nr/nr' -i ./reference.fasta -o './reference.blast' -b 500 </source>
real 11m30.762s
user 3m11.440s
sys 0m12.250s
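To compare the run times reported on this page, the `time` values can be converted to plain seconds. The following is a minimal sketch; the helper name `to_seconds` is hypothetical and not part of the original workflow, and it assumes the `XmY.YYYs` format that bash prints.

```shell
#!/usr/bin/env bash
# Convert a bash `time` value such as "11m30.762s" into seconds.
# to_seconds is a hypothetical helper, not part of the original workflow.
to_seconds() {
    local value="$1"
    # Split at "m", drop the trailing "s", then compute minutes*60 + seconds.
    awk -v t="$value" 'BEGIN {
        split(t, parts, "m")
        sub(/s$/, "", parts[2])
        printf "%.3f\n", parts[1] * 60 + parts[2]
    }'
}

# Wall-clock ("real") time of the blastall run above.
to_seconds "11m30.762s"
```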
Results
FASTA
Installation
We used VirtualBox with Linux.
Downloaded fasta3.tar.gz from ftp://ftp.ebi.ac.uk/pub/software/unix/fasta/
Unpacked to: /home/student/Download/fasta
Moved: sudo mv /home/student/Download/fasta /apps/fasta
Built with: make -f /apps/fasta/make/Makefile.linux64 all
Running
time ./fasta36 /home/student/reference.fasta /data/nr/nr
Interactive responses:
Enter filename for results []: /home/student/reference.fasta_search
How many scores do you want to see: 500
More scores? 0
Display alignments also? (y/n) [n] y
number of alignments [500]? 500
real 10m13.878s
user 7m26.270s
sys 0m20.230s
Results
PSI-BLAST
Running
Parameter set 1
time blastpgp -d '/data/nr/nr' -i reference.fasta -o '/home/student/workspace/reference_psi_e10E-6_i3.blast' -h 10E-6 -j 3 -C /home/student/workspace/reference_i3_e10E-6.chk
real 37m56.447s
user 14m27.620s
sys 0m54.620s
Parameter set 2
time blastpgp -d '/data/nr/nr' -i reference.fasta -o '/home/student/workspace/reference_psi_e005_i3.blast' -h 0.005 -j 3 -C /home/student/workspace/reference_i3_e005.chk
real 37m41.487s
user 14m42.850s
sys 0m52.370s
Parameter set 3
time blastpgp -d '/data/nr/nr' -i reference.fasta -o '/home/student/workspace/reference_psi_e005_i5.blast' -h 0.005 -j 5 -C /home/student/workspace/reference_i5_e005.chk
real 62m22.175s
user 26m25.410s
sys 1m20.700s
Parameter set 4
time blastpgp -d '/data/nr/nr' -i reference.fasta -o '/home/student/workspace/reference_psi_e10E-6_i5.blast' -h 10E-6 -j 5 -C /home/student/workspace/reference_i5_e10E-6.chk
real 61m59.284s
user 25m55.920s
sys 1m21.620s
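The four runs above differ only in the inclusion threshold (`-h`) and the number of iterations (`-j`), so they can be generated from a small loop. The sketch below only prints the commands instead of running them; the function name `psiblast_commands` and the `workdir` variable are hypothetical, while the flags, paths and file-name pattern are taken from the commands above. The combinations come out in a slightly different order than the numbered parameter sets.

```shell
#!/usr/bin/env bash
# Output directory used in the commands above.
workdir=/home/student/workspace

# Print (do not run) one blastpgp command per threshold/iteration pair.
# psiblast_commands is a hypothetical helper name.
psiblast_commands() {
    local h j tag
    for j in 3 5; do
        for h in 10E-6 0.005; do
            # File-name tag as used above: 0.005 -> 005, 10E-6 stays 10E-6.
            tag="${h#0.}"
            echo "blastpgp -d /data/nr/nr -i reference.fasta" \
                 "-o $workdir/reference_psi_e${tag}_i${j}.blast" \
                 "-h $h -j $j" \
                 "-C $workdir/reference_i${j}_e${tag}.chk"
        done
    done
}

psiblast_commands
```

Piping the output to `bash` (or prefixing each line with `time`) would execute the runs; printing first makes it easy to check the file names before starting jobs that take 40 to 60 minutes each.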
Results
HHSearch
Installation
Preparing the HHM-Database
Downloaded the archive pdb70_29May10.hhm.tar.gz from ftp://ftp.tuebingen.mpg.de/pub/protevo/HHsearch/databases/
Unpacked it to /data/hmm/pdb70/
Built a database with cat: cat *.hhm >> pdb70.db
Placed the database in an appropriate directory: sudo mv pdb70.db ../pdb70.db
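The concatenation step can be sketched as a small function. Note that `>>` appends, so running the original command twice would store every HMM twice; the sketch below uses `>` instead, which is a deliberate deviation from the command above. The function name `build_hhm_db` and the dummy files are hypothetical and only serve the demonstration.

```shell
#!/usr/bin/env bash
# Concatenate all *.hhm files of a directory into one database file.
# Uses ">" so that re-running replaces the file instead of appending to it.
# build_hhm_db is a hypothetical helper name.
build_hhm_db() {
    local hhm_dir="$1" db_file="$2"
    cat "$hhm_dir"/*.hhm > "$db_file"
}

# Demonstration with two dummy files in a temporary directory (not real HMMs).
demo_dir="$(mktemp -d)"
printf 'dummy model A\n' > "$demo_dir/a.hhm"
printf 'dummy model B\n' > "$demo_dir/b.hhm"
build_hhm_db "$demo_dir" "$demo_dir/demo.db"

# Number of lines in the combined file (one per dummy model).
grep -c '' "$demo_dir/demo.db"
```

For the real data the call would be `build_hhm_db /data/hmm/pdb70 /data/hmm/pdb70.db`, assuming write permission on /data/hmm.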
Configure HHSearch-Tools
Changes in /apps/bin/addpsipred:
my $psipreddir="/apps/psipred_2.5"; # Put the directory path with the PSIPRED executables
my $ncbidir="/apps/blast_old/bin"; # Put the directory path with the BLAST executables
my $perl="/apps/bin"; # Put the directory path where reformat.pl is lying
my $dummydb="/home/student/tmp"; # Put the name given to the dummy blast directory (or leave this name)
Copied /apps/bin/reformat to /apps/bin/reformat.pl, so that reformat.pl exists in the directory configured as $perl.
Running
Parameter set 1
time hhsearch -i reference.fasta -d /data/hmm/pdb70.db -b 500 -o reference_simple.hhsearch
real 8m33.171s
user 5m14.530s
sys 0m3.510s
Parameter set 2
alignblast reference_psi_e10E-6_i3.blast reference_psi_e10E-6_i3.a3m
addpsipred /home/student/workspace/reference_psi_e10E-6_i3.a3m
time hhsearch -i reference_psi_e10E-6_i3.a3m -d /data/hmm/pdb70.db -o reference_psi_e10E-6_i3.hhsearch
real 16m27.258s
user 7m47.220s
sys 0m6.290s
Parameter set 3
alignblast reference_psi_e005_i3.blast reference_psi_e005_i3.a3m
addpsipred /home/student/workspace/reference_psi_e005_i3.a3m
time hhsearch -i reference_psi_e005_i3.a3m -d /data/hmm/pdb70.db -o reference_psi_e005_i3.hhsearch
real 16m7.216s
user 7m41.840s
sys 0m5.570s
Parameter set 4
alignblast reference_psi_e005_i5.blast reference_psi_e005_i5.blast.a3m
addpsipred /home/student/workspace/reference_psi_e005_i5.blast.a3m
time hhsearch -i reference_psi_e005_i5.blast.a3m -d /data/hmm/pdb70.db -o reference_psi_e005_i5.blast.hhsearch
real 7m49.907s
user 7m15.310s
sys 0m4.320s
Parameter set 5
alignblast reference_psi_e10E-6_i5.blast reference_psi_e10E-6_i5.a3m
addpsipred /home/student/workspace/reference_psi_e10E-6_i5.a3m
time hhsearch -i reference_psi_e10E-6_i5.a3m -d /data/hmm/pdb70.db -o reference_psi_e10E-6_i5.hhsearch
real 8m10.730s
user 7m33.190s
sys 0m5.390s
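Parameter sets 2 to 5 repeat the same three steps (alignblast, addpsipred, hhsearch) for each PSI-BLAST output. The sketch below prints these steps for a given `.blast` file without executing them. The function name `hhsearch_pipeline` is hypothetical, it uses relative paths throughout, and it derives all names by replacing the `.blast` suffix, which differs from the `.blast.a3m` naming used in parameter set 4.

```shell
#!/usr/bin/env bash
# Print (do not run) the three HHsearch preparation steps for one
# PSI-BLAST output file. hhsearch_pipeline is a hypothetical helper name.
hhsearch_pipeline() {
    local blast_out="$1"
    local base="${blast_out%.blast}"   # strip the .blast suffix
    echo "alignblast $blast_out $base.a3m"
    echo "addpsipred $base.a3m"
    echo "hhsearch -i $base.a3m -d /data/hmm/pdb70.db -o $base.hhsearch"
}

# The four PSI-BLAST outputs produced in the PSI-BLAST section.
for f in reference_psi_e10E-6_i3.blast reference_psi_e005_i3.blast \
         reference_psi_e005_i5.blast reference_psi_e10E-6_i5.blast; do
    hhsearch_pipeline "$f"
done
```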
Comparing the Results
HSSP - Some Positives
The HSSP entry for PAH (2pah) was retrieved from http://mrs.cmbi.ru.nl/mrs-5/entry?db=hssp&id=2pah&q=phenylalanine%20hydroxylase
HSSP - More Positives
HHsearch was run against a PDB-based set, whereas BLAST was run against nr. nr contains entries from Swiss-Prot, RefSeq, PIR, PRF and PDB as well as GenBank CDS translations, while HSSP contains only Swiss-Prot entries. A mapping between the Swiss-Prot entries and the entries from the other databases is therefore necessary. For this purpose we created a Java tool: