Lab journal task 2
Latest revision as of 02:03, 30 August 2013
Sequence Searches
All searches were run with an increased number of output lines (10000) for both the summary hit list and the reported alignments, so that all hits found were displayed. All analyses were conducted on the full set of displayed hits. No specific E-value cutoff was set.
The HFE protein sequence has the Uniprot ID Q30201, NCBI ID 1890180 and PDB ID 1A6Z_A.
All Blast, Psiblast and HHblits output files that were analysed were first parsed using the Perl script parse_output.pl. For example:
perl /mnt/home/student/kalemanovm/master_practical/Assignment2_Alignments/scripts/task1/parse_output.pl --out_p /mnt/home/student/betza/task2/blast/res_blast.txt
Blast
A Blast search in big_80 was executed using the standard parameter settings:
blastall -p blastp -i /mnt/home/student/betza/data/hfe.fasta -d /mnt/project/pracstrucfunc13/data/big/big_80 -o /mnt/home/student/betza/task2/blast/res_blast.txt -v 10000 -b 10000
Psiblast
Example call for Psiblast with 10 iterations and evalue cutoff of 10E-10:
blastpgp -i /mnt/home/student/betza/data/hfe.fasta -d /mnt/project/pracstrucfunc13/data/big/big_80 -j 10 -h 10E-10 -v 10000 -b 10000 -o /mnt/home/student/betza/task2/psiblast/new/j1h1/j1hi.txt -Q /mnt/home/student/betza/task2/psiblast/new/j1h1/j1h1.pssm -C /mnt/home/student/betza/task2/psiblast/new/j1h1/j1h1.chk
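Calls of this form differ only in the number of iterations, the E-value cutoff and the output location, so the argument list can be assembled programmatically. The following is a minimal sketch; the helper name `blastpgp_args` and the `out_prefix` naming convention are hypothetical and not part of the original scripts. Only flags that appear in the call above are used.

```python
def blastpgp_args(query, database, iterations, evalue, out_prefix, max_hits=10000):
    """Build a blastpgp argument list mirroring the example call.

    out_prefix is a hypothetical convention: the report, PSSM and
    checkpoint files are written as <out_prefix>.txt/.pssm/.chk.
    The E-value is passed through as a string so its notation is kept.
    """
    return [
        "blastpgp",
        "-i", query,             # query sequence (FASTA)
        "-d", database,          # database to search
        "-j", str(iterations),   # number of iterations
        "-h", str(evalue),       # E-value cutoff for profile inclusion
        "-v", str(max_hits),     # lines in the summary hit list
        "-b", str(max_hits),     # number of reported alignments
        "-o", out_prefix + ".txt",
        "-Q", out_prefix + ".pssm",
        "-C", out_prefix + ".chk",
    ]
```

The resulting list can be passed to `subprocess.run` directly, which avoids quoting problems with paths.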
The Perl script devide_psiblast_out.pl was then used to split the output into the individual Psiblast iterations, so that the results of the last iteration could be analysed on their own.
perl /mnt/home/student/kalemanovm/master_practical/Assignment2_Alignments/scripts/task1/devide_psiblast_out.pl /mnt/home/student/betza/task2/psiblast/new/j1h1/j1hi.txt
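The idea behind splitting the report can be sketched in a few lines of Python. This is not the Perl script itself, only an illustration under the assumption that each iteration in the text report begins with a line of the form "Results from round N"; the function names are hypothetical.

```python
import re

# Assumption: every iteration in the text report starts with a line
# such as "Results from round 2".
ROUND_HEADER = re.compile(r"^Results from round (\d+)")


def split_rounds(report_text):
    """Return a dict mapping round number -> text of that round."""
    rounds = {}
    current = None
    for line in report_text.splitlines(keepends=True):
        match = ROUND_HEADER.match(line)
        if match:
            current = int(match.group(1))
            rounds[current] = []
        if current is not None:
            # Lines before the first round header are skipped.
            rounds[current].append(line)
    return {number: "".join(lines) for number, lines in rounds.items()}


def last_round(report_text):
    """Return the text of the highest-numbered round, or '' if none found."""
    rounds = split_rounds(report_text)
    return rounds[max(rounds)] if rounds else ""
```

Only the text returned by `last_round` would then be passed on to the evaluation.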
The database /mnt/home/rost/kloppmann/data/blast_db/pdb_seqres was then searched by reloading the checkpoint files created earlier with the -R flag. Example for 2 iterations and an E-value cutoff of 2E-3:
blastpgp -i /mnt/home/student/betza/data/hfe.fasta -d /mnt/home/rost/kloppmann/data/blast_db/pdb_seqres -j 1 -h 0.002 -v 10000 -b 10000 -m $OF -o /mnt/home/student/betza/task2/psiblast/pdb/new/j2h2/j2h2.$FE -Q /mnt/home/student/betza/task2/psiblast/pdb/new/j2h2/j2h2.pssm -R /mnt/home/student/betza/task2/psiblast/new/j2h2/j2h2.chk
HHblits
HHblits command-line call:
hhblits -i /mnt/home/student/betza/data/hfe.fasta -d /mnt/project/rost_db/data/hhblits/uniprot20_02Sep11 -o /mnt/home/student/betza/task2/hhblits/hfe.hhr -oa3m /mnt/home/student/betza/task2/hhblits/hfe.a3m -oalis /mnt/home/student/betza/task2/hhblits/hfe -ohhm /mnt/home/student/betza/task2/hfe.hhm -Z 10000 -B 10000
The output files were analysed using the script parse_output.pl:
perl /mnt/home/student/kalemanovm/master_practical/Assignment2_Alignments/scripts/task1/parse_output.pl --out_p /mnt/home/student/betza/task2/blast/res_blast.txt --query 1a6z_A --sot L30 --out_h /mnt/home/student/betza/task2/hhblits/
Evaluation
CATH
The Python script compareCath.py was written to check the overlap of the query protein's CATH fold classes with those of the hits.
Example call:
python /mnt/home/student/betza/scripts/compareCath.py -i /mnt/home/student/betza/task2/blast/res_blast.txt_results -q 1a6zA > /mnt/home/student/betza/task2/blast/res_blast.txt/blast_cath
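The kind of comparison involved can be illustrated as follows. This is a sketch and not the contents of compareCath.py: it assumes CATH codes are given as dot-separated strings (class.architecture.topology.homologous superfamily) and treats two domains as sharing a fold when the first three levels agree. The function names and the codes in the usage note are illustrative only.

```python
def shared_cath_levels(query_code, hit_code):
    """Count how many leading CATH levels (C, A, T, H) two codes share."""
    shared = 0
    for q, h in zip(query_code.split(".")[:4], hit_code.split(".")[:4]):
        if q != h:
            break
        shared += 1
    return shared


def same_fold(query_codes, hit_codes):
    """True if any query domain and any hit domain agree on C, A and T."""
    return any(
        shared_cath_levels(q, h) >= 3
        for q in query_codes
        for h in hit_codes
    )
```

For example, `same_fold(["3.30.500.10"], ["3.30.500.20"])` is true because the two codes differ only at the superfamily level, whereas `same_fold(["3.30.500.10"], ["3.30.70.10"])` is false.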
COPS
These analyses were also done using the script parse_output.pl:
perl /mnt/home/student/kalemanovm/master_practical/Assignment2_Alignments/scripts/task1/parse_output.pl --out_p /mnt/home/student/betza/task2/blast/res_blast.txt --query 1a6z_A --sot L30
Multiple Alignments
The sequences used were selected from the psiblast run with 2 iterations and an evalue cutoff of 10E-3.
ClustalW
- Version 2.1
ClustalW was executed on the student computers with standard parameters using:
clustalw -INFILE=<fastaFile>
MAFFT
- Version 7
For MAFFT, the web server was used with the following parameters:
Parameter | Value |
---|---|
Alignment strategy | auto |
Scoring Matrix | Blosum62 |
Gap opening penalty | 1.53 |
Offset value | 0 |
Number of homologs for profile building | 50 |
Evalue threshold | 1e-10 |
T-Coffee and Expresso
- T-Coffee Version 8.99
- Expresso Version 9.03
For T-Coffee and Expresso, the T-Coffee web server was used with standard parameters. Expresso automatically finds the PDB structures of the sequences in the alignment and thus does not need additional input.
Jalview was used to visualise the MSAs and to load the secondary structure assignments from UniProt.