Lab journal task 2
Sequence Searches
All searches were run with an increased number of output lines (10000), both for the summary hit list and for the reported alignments, so that all hits found were displayed. All analyses were conducted on the full set of displayed hits. We set no specific E-value cutoff.
The HFE protein sequence has the UniProt ID Q30201, the NCBI ID 1890180 and the PDB ID 1A6Z_A.
All Blast, Psiblast and HHblits output files that were analysed were first parsed using the Perl script parse_output.pl. For example:
perl /mnt/home/student/kalemanovm/master_practical/Assignment2_Alignments/scripts/task1/parse_output.pl --out_p /mnt/home/student/betza/task2/blast/res_blast.txt
Blast
A Blast search in big_80 was executed using the standard parameter settings:
blastall -p blastp -i /mnt/home/student/betza/data/hfe.fasta -d /mnt/project/pracstrucfunc13/data/big/big_80 -o /mnt/home/student/betza/task2/blast/res_blast.txt -v 10000 -b 10000
Psiblast
Example call for Psiblast with 10 iterations and an E-value cutoff of 10E-10:
blastpgp -i /mnt/home/student/betza/data/hfe.fasta -d /mnt/project/pracstrucfunc13/data/big/big_80 -j 10 -h 10E-10 -v 10000 -b 10000 -o /mnt/home/student/betza/task2/psiblast/new/j1h1/j1hi.txt -Q /mnt/home/student/betza/task2/psiblast/new/j1h1/j1h1.pssm -C /mnt/home/student/betza/task2/psiblast/new/j1h1/j1h1.chk
The Perl script devide_psiblast_out.pl was then used to split the Psiblast output into its individual iterations, so that the results of the last iteration could be analysed on their own.
perl /mnt/home/student/kalemanovm/master_practical/Assignment2_Alignments/scripts/task1/devide_psiblast_out.pl /mnt/home/student/betza/task2/psiblast/new/j1h1/j1hi.txt
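The splitting step can be illustrated with a minimal Python sketch. This is not the devide_psiblast_out.pl script itself, whose internals are not shown here. The sketch assumes that each iteration in the blastpgp text output starts with a line beginning with "Results from round N"; the function names split_rounds and last_round are hypothetical.

```python
import re
from typing import Dict, List

# Assumption: every iteration in the text output is introduced by a line
# that starts with "Results from round <number>".
ROUND_HEADER = re.compile(r"^Results from round (\d+)")


def split_rounds(text: str) -> Dict[int, List[str]]:
    """Group the output lines by iteration number.

    Lines before the first round header (program banner, query info)
    are skipped.
    """
    rounds: Dict[int, List[str]] = {}
    current = None
    for line in text.splitlines():
        match = ROUND_HEADER.match(line)
        if match:
            current = int(match.group(1))
            rounds[current] = []
        elif current is not None:
            rounds[current].append(line)
    return rounds


def last_round(text: str) -> List[str]:
    """Return the lines of the highest-numbered iteration, or [] if none."""
    rounds = split_rounds(text)
    if not rounds:
        return []
    return rounds[max(rounds)]


if __name__ == "__main__":
    # Tiny made-up input, only to show the call.
    demo = "banner\nResults from round 1\nhitA\nResults from round 2\nhitB\n"
    print(last_round(demo))
```

Working on the last iteration only matters because Psiblast may converge before the requested number of iterations, so the highest round number present is used rather than the value passed to -j.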
The database /mnt/home/rost/kloppmann/data/blast_db/pdb_seqres was then searched for a single iteration, reloading the checkpoint files created earlier with the -R flag. Example for the checkpoint created with 2 iterations and an E-value cutoff of 2E-3:
blastpgp -i /mnt/home/student/betza/data/hfe.fasta -d /mnt/home/rost/kloppmann/data/blast_db/pdb_seqres -j 1 -h 0.002 -v 10000 -b 10000 -m $OF -o /mnt/home/student/betza/task2/psiblast/pdb/new/j2h2/j2h2.$FE -Q /mnt/home/student/betza/task2/psiblast/pdb/new/j2h2/j2h2.pssm -R /mnt/home/student/betza/task2/psiblast/new/j2h2/j2h2.chk
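Because these calls differ only in a few parameters and paths, the argument list can be assembled programmatically. The following is a minimal sketch under stated assumptions, not the procedure actually used here: it uses only the blastpgp flags that appear in the calls above, the function name build_blastpgp_cmd and the file names in the demo are hypothetical, and the output file extensions (.txt, .pssm, .chk) simply mirror the examples.

```python
import shlex
from typing import List, Optional


def build_blastpgp_cmd(query: str, db: str, iterations: int, evalue: str,
                       out_prefix: str, restart_chk: Optional[str] = None,
                       max_hits: int = 10000) -> List[str]:
    """Assemble a blastpgp argument list from the flags used in this journal.

    Without restart_chk a new checkpoint is written (-C); with restart_chk
    the search restarts from that existing checkpoint (-R).
    """
    cmd = [
        "blastpgp",
        "-i", query,
        "-d", db,
        "-j", str(iterations),
        "-h", evalue,            # kept as a string so "10E-10" is passed unchanged
        "-v", str(max_hits),
        "-b", str(max_hits),
        "-o", out_prefix + ".txt",
        "-Q", out_prefix + ".pssm",
    ]
    if restart_chk is None:
        cmd += ["-C", out_prefix + ".chk"]
    else:
        cmd += ["-R", restart_chk]
    # Guard against stray whitespace inside a path, which would split the
    # argument if the command were pasted into a shell unquoted.
    for arg in cmd:
        if any(ch.isspace() for ch in arg):
            raise ValueError("whitespace in argument: %r" % arg)
    return cmd


if __name__ == "__main__":
    # Hypothetical relative paths, for illustration only.
    cmd = build_blastpgp_cmd("hfe.fasta", "big_80", 10, "10E-10", "j10/run")
    print(" ".join(shlex.quote(a) for a in cmd))
```

The list form can be passed directly to subprocess.run, which avoids shell quoting problems altogether.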
HHblits
HHblits command-line call:
hhblits -i /mnt/home/student/betza/data/hfe.fasta -d /mnt/project/rost_db/data/hhblits/uniprot20_02Sep11 -o /mnt/home/student/betza/task2/hhblits/hfe.hhr -oa3m /mnt/home/student/betza/task2/hhblits/hfe.a3m -oalis /mnt/home/student/betza/task2/hhblits/hfe -ohhm /mnt/home/student/betza/task2/hfe.hhm -Z 10000 -B 10000
The output files were analysed using the script parse_output.pl:
perl /mnt/home/student/kalemanovm/master_practical/Assignment2_Alignments/scripts/task1/parse_output.pl --out_p /mnt/home/student/betza/task2/blast/res_blast.txt --query 1a6z_A --sot L30 --out_h /mnt/home/student/betza/task2/hhblits/
Evaluation
CATH
The Python script compareCath.py was written to check how far the CATH fold classes of the query protein overlap with those of the hits.
Example call:
python /mnt/home/student/betza/scripts/compareCath.py -i /mnt/home/student/betza/task2/blast/res_blast.txt_results -q 1a6zA > /mnt/home/student/betza/task2/blast/res_blast.txt/blast_cath
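The kind of comparison involved can be sketched as follows. This is not the code of compareCath.py, which is not reproduced here. It is a minimal illustration that assumes CATH classifications are given as dot-separated codes with four levels (Class, Architecture, Topology, Homologous superfamily) and counts how many leading levels two codes share. The function names and the codes in the demo are hypothetical.

```python
LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


def shared_cath_levels(code_a: str, code_b: str) -> int:
    """Count how many leading CATH levels (0 to 4) two codes have in common."""
    shared = 0
    parts_a = code_a.split(".")[:4]
    parts_b = code_b.split(".")[:4]
    for a, b in zip(parts_a, parts_b):
        if a != b:
            break  # levels are hierarchical, so stop at the first mismatch
        shared += 1
    return shared


def deepest_shared_level(code_a: str, code_b: str) -> str:
    """Name the deepest level on which both codes agree, or "none"."""
    n = shared_cath_levels(code_a, code_b)
    return LEVELS[n - 1] if n > 0 else "none"


if __name__ == "__main__":
    # Illustrative codes only, not taken from the actual results.
    query = "3.30.500.10"
    for hit in ("3.30.500.10", "3.30.70.20", "2.60.40.10"):
        print(hit, shared_cath_levels(query, hit), deepest_shared_level(query, hit))
```

Stopping at the first mismatch reflects the hierarchy: two domains with different architectures cannot share a topology, even if the third number happens to coincide.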
COPS
These analyses were also done using the script parse_output.pl:
perl /mnt/home/student/kalemanovm/master_practical/Assignment2_Alignments/scripts/task1/parse_output.pl --out_p /mnt/home/student/betza/task2/blast/res_blast.txt --query 1a6z_A --sot L30