Fabry:Sequence alignments (sequence searches and multiple alignments)/Journal
Revision as of 07:42, 5 May 2012 by Rackersederj (talk | contribs)
Please see Task 2 Results for our results on this topic, and Task 2 Scripts for the scripts we used.
Reference sequence
The reference sequence of α-Galactosidase A used in the following tasks was obtained from Swiss-Prot entry P06280.
>gi|4504009|ref|NP_000160.1| alpha-galactosidase A precursor [Homo sapiens]
MQLRNPELHLGCALALRFLALVSWDIPGARALDNGLARTPTMGWLHWERFMCNLDCQEEPDSCISEKLFM
EMAELMVSEGWKDAGYEYLCIDDCWMAPQRDSEGRLQADPQRFPHGIRQLANYVHSKGLKLGIYADVGNK
TCAGFPGSFGYYDIDAQTFADWGVDLLKFDGCYCDSLENLADGYKHMSLALNRTGRSIVYSCEWPLYMWP
FQKPNYTEIRQYCNHWRNFADIDDSWKSIKSILDWTSFNQERIVDVAGPGGWNDPDMLVIGNFGLSWNQQ
VTQMALWAIMAAPLFMSNDLRHISPQAKALLQDKDVIAINQDPLGKQGYQLRQGDNFEVWERPLSGLAWA
VAMINRQEIGGPRSYTIAVASLGKGVACNPACFITQLLPVKRKLGFYEWTSRLRSHINPTGTVLLQLENT
MQMSLKDLL
Sequence searches
Blast
We searched the "big80" database with Blast and post-processed the hits with the following commands:
blastall -p blastp -d /mnt/project/pracstrucfunc12/data/big/big_80 -i P06280.fasta -m 0 -o blastsearch_default.out -v 700 -b 700
./extract_ids_blast.sh blastsearch_default.out
perl ../download-annotation.pl blastsearch_default_ids.txt
perl ../compare_GO_terms.pl P06280 blastsearch_default_ids_GOterms.tsv
perl parse_blast.pl blastsearch_default.out
The run took about 2 minutes (see section Time).
Psi-Blast
Iterations: 2 Evalue: 0.002
real 3m30.256s
user 2m58.070s
sys 0m13.360s

Iterations: 2 Evalue: 0.000000001
real 3m8.507s
user 3m5.180s
sys 0m2.400s

Iterations: 2 Evalue: 0.0000000001
real 3m10.271s
user 3m7.620s
sys 0m2.190s

Iterations: 10 Evalue: 0.002
real 15m29.218s
user 15m8.910s
sys 0m12.730s

Iterations: 10 Evalue: 0.000000001
real 16m33.748s
user 16m12.500s
sys 0m13.080s

Iterations: 10 Evalue: 0.0000000001
real 16m20.137s
user 15m55.910s
sys 0m13.190s
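The six timings above correspond to a sweep over two iteration counts and three E-value thresholds. The Psi-Blast command itself is not recorded on this page, so the following is only a sketch of how such a sweep might be scripted. It assumes the legacy blastpgp program, that "Iterations" maps to -j and "Evalue" to the inclusion threshold -h, and that the same big_80 database and query file as in the Blast run were used. The helper psiblast_cmd and the output file names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of a Psi-Blast parameter sweep (assumptions: legacy blastpgp,
# -j = number of iterations, -h = E-value inclusion threshold).
DB=/mnt/project/pracstrucfunc12/data/big/big_80
QUERY=P06280.fasta

psiblast_cmd() {
  # Build the blastpgp command line for one iteration count and E-value.
  # The output file name pattern is hypothetical.
  local iters=$1 evalue=$2
  echo "blastpgp -d $DB -i $QUERY -j $iters -h $evalue -o psiblast_j${iters}_h${evalue}.out"
}

for iters in 2 10; do
  for evalue in 0.002 0.000000001 0.0000000001; do
    cmd=$(psiblast_cmd "$iters" "$evalue")
    # Dry run: only print the command.
    echo "$cmd"
    # To execute and time the run instead, uncomment:
    # time $cmd
  done
done
```

Printing the commands first makes it easy to check all six parameter combinations before starting runs that take up to roughly 16 minutes each.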
HHblits / HHsearch
We ran HHblits against the uniprot20 database (the -d argument in the commands below), once with the default number of iterations and once with the maximum number of iterations (8, set with -n 8), using the following commands:
time hhblits -i ../P06280.fasta -d /mnt/project/pracstrucfunc12/data/hhblits/uniprot20_current -e 0.003 -o hhblits_default.out -E 0.003 -z 700
./extract_ids_hhblits.sh hhblits_default.out
perl ../download-annotation.pl hhblits_default_ids.txt
perl ../compare_GO_terms.pl P06280 hhblits_default_ids_GOterms.tsv
perl parse_hhblits.pl hhblits_default.out

time hhblits -i ../P06280.fasta -d /mnt/project/pracstrucfunc12/data/hhblits/uniprot20_current -e 0.003 -o hhblits_n8_neu.out -E 0.003 -n 8 -z 800 -b 800
./extract_ids_hhblits.sh hhblits_n8_neu.out
perl ../download-annotation.pl hhblits_n8_neu_ids.txt
perl ../compare_GO_terms.pl P06280 hhblits_n8_neu_ids_GOterms.tsv
perl parse_hhblits.pl hhblits_n8_neu.out
R CMD BATCH hist_hhblits.R
The first HHblits run took about 2.5 minutes, the second one about 16 minutes (see section Time).
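To compare run times across tools, the values printed by the shell's time keyword (for example 3m30.256s) can be converted to plain seconds. The following is a minimal sketch. The function name to_seconds is hypothetical, and it assumes the minutes-and-seconds format shown in the Psi-Blast timings on this page.

```shell
to_seconds() {
  # Convert a value such as 3m30.256s into seconds.
  # Splitting on "m" and "s" yields the minutes and seconds fields.
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Example: wall-clock time of the first Psi-Blast run listed on this page.
to_seconds 3m30.256s
```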
Comparison
Venn diagrams created with Oliveros, J.C. (2007) VENNY. An interactive tool for comparing lists with Venn Diagrams.
R CMD BATCH all_Evalues.R
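The numbers shown in a two-set Venn diagram can also be computed directly from the ID lists written by the extraction scripts. The sketch below is not the procedure used for the diagrams on this page (those were made with VENNY). It assumes each ID file holds one identifier per line, and the function name overlap_counts is hypothetical.

```shell
overlap_counts() {
  # Print three numbers for two ID files (one ID per line):
  # IDs only in the first file, IDs only in the second, IDs in both.
  local a b
  a=$(mktemp)
  b=$(mktemp)
  # comm needs sorted, duplicate-free input; LC_ALL=C keeps the
  # sort order consistent between sort and comm.
  LC_ALL=C sort -u "$1" > "$a"
  LC_ALL=C sort -u "$2" > "$b"
  echo "$(LC_ALL=C comm -23 "$a" "$b" | awk 'END { print NR }')" \
       "$(LC_ALL=C comm -13 "$a" "$b" | awk 'END { print NR }')" \
       "$(LC_ALL=C comm -12 "$a" "$b" | awk 'END { print NR }')"
  rm -f "$a" "$b"
}

# Usage with the ID files named in the commands on this page:
# overlap_counts blastsearch_default_ids.txt hhblits_default_ids.txt
```

For three or more methods the same idea can be applied pairwise, although a dedicated tool such as VENNY is more convenient for the full diagram.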