Task 2 - Alignments with PAH Reference

From Bioinformatikpedia
Revision as of 10:45, 22 May 2011 by Meier (talk | contribs)

Sequence Searches

BLAST

Running

time sudo blastall -p blastp -d '/data/blast/nr/nr' -i ./reference.fasta -o './reference.blast' -b 500

real 11m30.762s
user 3m11.440s
sys 0m12.250s

Results

FASTA

Installation

Used VirtualBox with Linux.

Downloaded fasta3.tar.gz from ftp://ftp.ebi.ac.uk/pub/software/unix/fasta/
Unzipped to: /home/student/Download/fasta
Moved: sudo mv /home/student/Download/fasta /apps/fasta
Built with: make -f /apps/fasta/make/Makefile.linux64 all

Running

time ./fasta36 /home/student/reference.fasta /data/nr/nr

Interactive responses:

Enter filename for results []: /home/student/reference.fasta_search
How many scores do you want to see: 500
More scores? 0
Display alignments also? (y/n) [n] y
number of alignments [500]? 500

real 10m13.878s
user 7m26.270s
sys 0m20.230s
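The interactive answers above could also be supplied on standard input, so that the search can be repeated without typing. The following is only a sketch: it assumes that fasta36 reads its prompts from standard input in the order shown in the session, and `fasta_answers` is a hypothetical helper name, not part of the FASTA package.

```shell
#!/bin/sh
# fasta_answers (hypothetical helper) prints one answer per line, in the
# order the prompts appeared in the session above:
#   results file, number of scores, more scores, show alignments, number of alignments
# Assumption: fasta36 reads these answers from standard input.
fasta_answers() {
    printf '%s\n' "$1" 500 0 y 500
}

# Intended use (not executed here):
# fasta_answers /home/student/reference.fasta_search | ./fasta36 /home/student/reference.fasta /data/nr/nr
```

Only the result file name is a parameter; the remaining answers are the values used in the run above.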

Results

PSI-BLAST

Running

Parameterset 1

time blastpgp -d '/data/nr/nr' -i reference.fasta -o '/home/student/workspace/reference_psi_e10E-6_i3.blast' -h 10E-6 -j 3 -C /home/student/workspace/reference_i3_e10E-6.chk

real 37m56.447s
user 14m27.620s
sys 0m54.620s

Parameterset 2

time blastpgp -d '/data/nr/nr' -i reference.fasta -o '/home/student/workspace/reference_psi_e005_i3.blast' -h 0.005 -j 3 -C /home/student/workspace/reference_i3_e005.chk

real 37m41.487s
user 14m42.850s
sys 0m52.370s

Parameterset 3

time blastpgp -d '/data/nr/nr' -i reference.fasta -o '/home/student/workspace/reference_psi_e005_i5.blast' -h 0.005 -j 5 -C /home/student/workspace/reference_i5_e005.chk

real 62m22.175s
user 26m25.410s
sys 1m20.700s

Parameterset 4

time blastpgp -d '/data/nr/nr' -i reference.fasta -o '/home/student/workspace/reference_psi_e10E-6_i5.blast' -h 10E-6 -j 5 -C /home/student/workspace/reference_i5_e10E-6.chk

real 61m59.284s
user 25m55.920s
sys 1m21.620s
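The four parameter sets differ only in the value of -h (E-value threshold for including a hit in the profile) and -j (maximum number of iterations). As a sketch, the command lines can be generated in a loop. The helper `psiblast_cmd` is a hypothetical name, and the script only prints the commands (a dry run); it does not start blastpgp.

```shell
#!/bin/sh
# Dry run: print the blastpgp command line for each parameter set.
# psiblast_cmd is a hypothetical helper, not part of the BLAST package.
psiblast_cmd() {
    evalue=$1   # value passed to -h
    label=$2    # tag used in the file names, e.g. e005
    iter=$3     # value passed to -j
    ws=/home/student/workspace
    printf "blastpgp -d '/data/nr/nr' -i reference.fasta -o '%s/reference_psi_%s_i%s.blast' -h %s -j %s -C %s/reference_i%s_%s.chk\n" \
        "$ws" "$label" "$iter" "$evalue" "$iter" "$ws" "$iter" "$label"
}

# Same combinations as parameter sets 1 to 4 above.
for iter in 3 5; do
    psiblast_cmd 10E-6 e10E-6 "$iter"
    psiblast_cmd 0.005 e005 "$iter"
done
```

To actually run a search, a printed line can be prefixed with time and executed, as in the commands above.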

Results

HHSearch

Installation

Preparing the HHM-Database

Downloaded the archive pdb70_29May10.hhm.tar.gz from ftp://ftp.tuebingen.mpg.de/pub/protevo/HHsearch/databases/
Unzipped to: /data/hmm/pdb70/
Built a database with cat: cat *.hhm >> pdb70.db
Placed the database in the parent directory: sudo mv pdb70.db ../pdb70.db
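Because >> appends, repeating the cat step would store every HMM twice in pdb70.db. A small sketch of the same concatenation that overwrites the target instead; `build_hhm_db` is a hypothetical helper name, and the paths in the usage line are the ones from the steps above.

```shell
#!/bin/sh
# build_hhm_db (hypothetical helper): concatenate all .hhm files of one
# directory into a single database file.
#   $1 = directory containing the *.hhm files
#   $2 = database file to write (overwritten, not appended to)
build_hhm_db() {
    src_dir=$1
    db_file=$2
    # The redirection is opened before the cd, so a relative $2 is
    # resolved against the caller's working directory.
    (cd "$src_dir" && cat ./*.hhm) > "$db_file"
}
```

With the directories used here, the call would be build_hhm_db /data/hmm/pdb70 /data/hmm/pdb70.db, run with sufficient permissions for the target directory.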

Configure HHSearch-Tools

Changes in /apps/bin/addpsipred:

my $psipreddir="/apps/psipred_2.5"; # Put the directory path with the PSIPRED executables
my $ncbidir="/apps/blast_old/bin"; # Put the directory path with the BLAST executables
my $perl="/apps/bin"; # Put the directory path where reformat.pl is lying
my $dummydb="/home/student/tmp"; # Put the name given to the dummy blast directory (or leave this name)

Copied /apps/bin/reformat to /apps/bin/reformat.pl


Running

Parameterset 1

time hhsearch -i reference.fasta -d /data/hmm/pdb70.db -b 500 -o reference_simple.hhsearch

real 8m33.171s
user 5m14.530s
sys 0m3.510s

Parameterset 2

alignblast reference_psi_e10E-6_i3.blast reference_psi_e10E-6_i3.a3m
addpsipred /home/student/workspace/reference_psi_e10E-6_i3.a3m
time hhsearch -i reference_psi_e10E-6_i3.a3m -d /data/hmm/pdb70.db -o reference_psi_e10E-6_i3.hhsearch

real 16m27.258s
user 7m47.220s
sys 0m6.290s

Parameterset 3

alignblast reference_psi_e005_i3.blast reference_psi_e005_i3.a3m
addpsipred /home/student/workspace/reference_psi_e005_i3.a3m
time hhsearch -i reference_psi_e005_i3.a3m -d /data/hmm/pdb70.db -o reference_psi_e005_i3.hhsearch

real 16m7.216s
user 7m41.840s
sys 0m5.570s

Parameterset 4

alignblast reference_psi_e005_i5.blast reference_psi_e005_i5.blast.a3m
addpsipred /home/student/workspace/reference_psi_e005_i5.blast.a3m
time hhsearch -i reference_psi_e005_i5.blast.a3m -d /data/hmm/pdb70.db -o reference_psi_e005_i5.blast.hhsearch

real 7m49.907s
user 7m15.310s
sys 0m4.320s

Parameterset 5

alignblast reference_psi_e10E-6_i5.blast reference_psi_e10E-6_i5.a3m
addpsipred /home/student/workspace/reference_psi_e10E-6_i5.a3m
time hhsearch -i reference_psi_e10E-6_i5.a3m -d /data/hmm/pdb70.db -o reference_psi_e10E-6_i5.hhsearch

real 8m10.730s
user 7m33.190s
sys 0m5.390s
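Parameter sets 2 to 5 repeat the same three steps for each PSI-BLAST result: convert the BLAST output to an alignment (alignblast), add the secondary structure prediction (addpsipred), and search the HMM database (hhsearch). The sketch below prints these steps for each base name as a dry run. `hh_steps` is a hypothetical helper name, and the sketch uses a uniform .a3m naming, whereas parameter set 4 above used the suffix .blast.a3m.

```shell
#!/bin/sh
# Dry run: print the three HHsearch preparation/search steps for one
# PSI-BLAST result. hh_steps is a hypothetical helper, not an HHsearch tool.
#   $1 = base name of the PSI-BLAST output, without the .blast suffix
hh_steps() {
    base=$1
    printf 'alignblast %s.blast %s.a3m\n' "$base" "$base"
    printf 'addpsipred /home/student/workspace/%s.a3m\n' "$base"
    printf 'hhsearch -i %s.a3m -d /data/hmm/pdb70.db -o %s.hhsearch\n' "$base" "$base"
}

# Base names of the four PSI-BLAST runs, in the order of parameter sets 2 to 5.
for base in reference_psi_e10E-6_i3 reference_psi_e005_i3 \
            reference_psi_e005_i5 reference_psi_e10E-6_i5; do
    hh_steps "$base"
done
```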

Comparing the Results

HSSP - Some Positives

Getting the entry of PAH from HSSP http://mrs.cmbi.ru.nl/mrs-5/entry?db=hssp&id=2pah&q=phenylalanine%20hydroxylase

HSSP - More Positives

HHsearch was run against a PDB set, whereas BLAST was run against nr. nr contains Swiss-Prot, RefSeq, PIR, PRF, PDB and GenBank CDS translation entries, while HSSP contains only Swiss-Prot entries. A mapping between the Swiss-Prot entries and the entries of the other databases is therefore necessary. For this purpose we created a Java tool: