Lab journal task 2

From Bioinformatikpedia
Revision as of 02:03, 30 August 2013 by Betza (talk | contribs) (Multiple Alignments)

Sequence Searches

All searches were run with an increased number of output lines (10000) for both the summary hit list and the reported alignments, so that all hits found were displayed. All analyses were conducted on the complete set of displayed hits. We set no specific evalue cutoff.

The HFE protein sequence has the Uniprot ID Q30201, NCBI ID 1890180 and PDB ID 1A6Z_A.

All Blast, Psiblast and hhblits output files that were analysed were first parsed using the perl script. For example:

perl /mnt/home/student/kalemanovm/master_practical/Assignment2_Alignments/scripts/task1/ --out_p /mnt/home/student/betza/task2/blast/res_blast.txt  
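The parsing step can be illustrated with a minimal Python sketch. It is not the perl script used above; it only assumes the plain-text report layout of legacy blastall, where the summary hit list follows the line "Sequences producing significant alignments" and each hit line ends with the bit score and the e-value. The function name parse_hit_summary and the sample identifiers are hypothetical.

```python
def parse_hit_summary(report_text):
    """Return (hit_id, bit_score, e_value) tuples from the summary hit list.

    Assumes the legacy BLAST text layout: one hit per line, with the bit
    score and the e-value as the last two whitespace-separated fields.
    """
    hits = []
    in_summary = False
    for line in report_text.splitlines():
        if line.startswith("Sequences producing significant alignments"):
            in_summary = True
            continue
        if not in_summary:
            continue
        if line.startswith(">"):
            # The first alignment block starts here, so the summary is over.
            break
        fields = line.split()
        if len(fields) < 3:
            continue  # blank line or separator
        e_value_text = fields[-1]
        # Legacy BLAST abbreviates 1e-100 as "e-100"; float() needs the leading 1.
        if e_value_text.startswith("e"):
            e_value_text = "1" + e_value_text
        try:
            hits.append((fields[0], float(fields[-2]), float(e_value_text)))
        except ValueError:
            continue  # not a hit line
    return hits


# Made-up report fragment, for illustration only (not real search results).
sample_report = """Sequences producing significant alignments:          (bits) Value

hypothetical_id_1 example description                       700   0.0
hypothetical_id_2 another description                       150   e-100
hypothetical_id_3 third description                        35.0   2e-04

>hypothetical_id_1 example description
"""

for hit_id, bit_score, e_value in parse_hit_summary(sample_report):
    print(hit_id, bit_score, e_value)
```

Reading the score and e-value from the end of each line avoids having to guess where the free-text description stops.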


A Blast search in big_80 was executed using the standard parameter settings:

blastall -p blastp -i /mnt/home/student/betza/data/hfe.fasta -d /mnt/project/pracstrucfunc13/data/big/big_80 -o /mnt/home/student/betza/task2/blast/res_blast.txt  
-v 10000 -b 10000


Example call for Psiblast with 10 iterations and an evalue cutoff of 10E-10:

blastpgp -i /mnt/home/student/betza/data/hfe.fasta -d /mnt/project/pracstrucfunc13/data/big/big_80 -j 10 -h 10E-10 -v 10000 -b 10000 
-o /mnt/home/student/betza/task2/psiblast/new/j1h1/j1hi.txt -Q /mnt/home/student/betza/task2/psiblast/new/j1h1/j1h1.pssm 
-C /mnt/home/student/betza/task2/psiblast/new/j1h1/j1h1.chk

The perl script was then used to split the output into the individual psiblast iterations, so that the results of the last iteration could be analysed on their own.

perl /mnt/home/student/kalemanovm/master_practical/Assignment2_Alignments/scripts/task1/ 
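As a rough illustration of this splitting step (not the perl script itself), the following Python sketch relies on the fact that blastpgp text output starts each iteration with a line of the form "Results from round N". The function names split_rounds and last_round are hypothetical.

```python
import re

# blastpgp marks the start of every iteration with this header line.
ROUND_HEADER = re.compile(r"^Results from round (\d+)", re.MULTILINE)


def split_rounds(report_text):
    """Map each round number to the report text belonging to that round."""
    matches = list(ROUND_HEADER.finditer(report_text))
    rounds = {}
    for i, match in enumerate(matches):
        # A round ends where the next header starts, or at the end of the file.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(report_text)
        rounds[int(match.group(1))] = report_text[match.start():end]
    return rounds


def last_round(report_text):
    """Return the text of the final round, or None if no round header is found."""
    rounds = split_rounds(report_text)
    if not rounds:
        return None
    return rounds[max(rounds)]


# Made-up miniature report, for illustration only.
sample_report = (
    "program header\n"
    "Results from round 1\nhit_a\n"
    "Results from round 2\nhit_a\nhit_b\n"
)
print(last_round(sample_report))
```

Psiblast may converge before the requested number of iterations is reached, so the sketch takes the highest round number present rather than assuming the value passed to -j.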

The /mnt/home/rost/kloppmann/data/blast_db/pdb_seqres database was then searched by reloading the checkpoint files created earlier with the -R flag. Example for the checkpoint from the run with 2 iterations and an e-value cutoff of 2E-3:

blastpgp -i /mnt/home/student/betza/data/hfe.fasta -d /mnt/home/rost/kloppmann/data/blast_db/pdb_seqres -j 1 -h 0.002 -v 10000 -b 10000 
-m $OF -o /mnt/home/student/betza/task2/psiblast/pdb/new/j2h2/j2h2.$FE -Q /mnt/home/student/betza/task2/psiblast/pdb/new/j2h2/j2h2.pssm 
-R /mnt/home/student/betza/task2/psiblast/new/j2h2/j2h2.chk


hhblits commandline call:

hhblits -i /mnt/home/student/betza/data/hfe.fasta -d /mnt/project/rost_db/data/hhblits/uniprot20_02Sep11 -o /mnt/home/student/betza/task2/hhblits/hfe.hhr -oa3m /mnt/home/student/betza/task2/hhblits/hfe.a3m -oalis /mnt/home/student/betza/task2/hhblits/hfe -ohhm /mnt/home/student/betza/task2/hfe.hhm -Z 10000 -B 10000

The output files were analysed using the script:

perl /mnt/home/student/kalemanovm/master_practical/Assignment2_Alignments/scripts/task1/ --out_p /mnt/home/student/betza/task2/blast/res_blast.txt --query 1a6z_A --sot L30 --out_h /mnt/home/student/betza/task2/hhblits/


A python script was written to check the overlap of the query protein's CATH fold classes with those of the hits. Example call:

python /mnt/home/student/betza/scripts/ -i /mnt/home/student/betza/task2/blast/res_blast.txt_results -q 1a6zA > /mnt/home/student/betza/task2/blast
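The kind of comparison such a script performs can be sketched as follows. This is an assumption about the approach, not the script used here: CATH codes are treated as dot-separated strings whose four levels are Class, Architecture, Topology (fold) and Homologous superfamily, and two proteins are said to share a fold when any pair of their domains agrees on the first three levels. The function names and the example codes are placeholders and not the actual classification of 1a6z.

```python
def shared_cath_depth(code_a, code_b):
    """Count how many leading CATH levels (C, A, T, H) two codes have in common."""
    depth = 0
    for level_a, level_b in zip(code_a.split("."), code_b.split(".")):
        if level_a != level_b:
            break
        depth += 1
    return depth


def best_overlap(query_codes, hit_codes):
    """Deepest shared level over all query/hit domain pairs (0 if there is none)."""
    return max(
        (shared_cath_depth(q, h) for q in query_codes for h in hit_codes),
        default=0,
    )


def same_fold(query_codes, hit_codes):
    """True if at least one domain pair agrees down to the topology (fold) level."""
    return best_overlap(query_codes, hit_codes) >= 3


# Placeholder domain codes, for illustration only.
query = ["1.10.8.10", "2.60.40.10"]
hit_same_fold = ["2.60.40.30"]
hit_other = ["3.40.50.300"]

print(same_fold(query, hit_same_fold))
print(same_fold(query, hit_other))
```

Comparing level by level, instead of testing whole codes for equality, makes it possible to report whether a hit matches only at the class or architecture level.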

These analyses were also done using the script:

perl /mnt/home/student/kalemanovm/master_practical/Assignment2_Alignments/scripts/task1/ --out_p /mnt/home/student/betza/task2/blast/res_blast.txt 
--query 1a6z_A --sot L30

Multiple Alignments

The sequences used were selected from the psiblast run with 2 iterations and an evalue cutoff of 10E-3.


  • ClustalW Version 2.1

ClustalW was executed on the student computers with standard parameters using:

clustalw -INFILE=<fastaFile>


  • MAFFT Version 7

For MAFFT, the web server was used with the following parameters:

Parameter                                  Value
Alignment strategy                         auto
Scoring Matrix                             Blosum62
Gap opening penalty                        1.53
Offset value                               0
Number of homologs for profile building    50
Evalue threshold                           1e-10

T-Coffee and Expresso

  • T-Coffee Version 8.99
  • Expresso Version 9.03

For T-Coffee and Expresso, the T-Coffee web server was used with standard parameters. Expresso automatically finds the PDB structures of the sequences in the alignment and thus does not need additional input.

Jalview was used for visualisation of the MSAs and to load the secondary structure assignments from Uniprot.