Fabry:Sequence alignments (sequence searches and multiple alignments)/Journal

From Bioinformatikpedia
Revision as of 19:09, 6 May 2012 by Rackersederj (Blast)

Our results for this topic are on the Task 2 Results page; the scripts we used are on the Task 2 Scripts page.


Sequence searches

Blast

We searched the "big80" database with BLAST (blastp) and post-processed the output with the following commands:

blastall -p blastp -d /mnt/project/pracstrucfunc12/data/big/big_80 -i P06280.fasta -m 0 -o blastsearch_default.out -v 700 -b 700
perl extract_ids_blast.pl blastsearch_default.out
perl ../download-annotation.pl blastsearch_default_ids.txt
perl ../compare_GO_terms.pl P06280 blastsearch_default_ids_GOterms.tsv
perl parse_blast.pl blastsearch_default.out
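
The helper scripts themselves are not reproduced in this journal. As a rough illustration of the ID-extraction step, the following Python sketch reads the one-line hit summary of a pairwise (-m 0) BLAST report. It is not the code of extract_ids_blast.pl or parse_blast.pl; the function name and the assumed line layout (hit ID first, bit score and E-value as the last two columns) are our own assumptions.

```python
def parse_blast_summary(text):
    """Return (hit_id, bit_score, e_value) tuples from the one-line hit summary.

    Assumes the pairwise report layout: a header line starting with
    'Sequences producing significant alignments', then one line per hit,
    then the full alignments, each introduced by a '>' line.
    """
    hits = []
    in_summary = False
    for line in text.splitlines():
        if line.startswith("Sequences producing significant alignments"):
            in_summary = True
            continue
        if not in_summary:
            continue
        if line.startswith(">"):
            break  # the first full alignment ends the summary block
        fields = line.split()
        if len(fields) < 3:
            continue  # blank separator lines
        evalue = fields[-1]
        if evalue.startswith("e"):
            evalue = "1" + evalue  # legacy BLAST abbreviates 1e-100 as "e-100"
        hits.append((fields[0], float(fields[-2]), float(evalue)))
    return hits
```

Called as parse_blast_summary(open("blastsearch_default.out").read()), this would give one tuple per reported hit, provided the report really follows the assumed layout.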

Psi-Blast

Run times (output of time) of the PSI-BLAST searches for each combination of iteration count and E-value setting:

Iterations  E-value  real        user        sys
2           0.002    3m30.256s   2m58.070s   0m13.360s
2           1e-9     3m8.507s    3m5.180s    0m2.400s
2           1e-10    3m10.271s   3m7.620s    0m2.190s
10          0.002    15m29.218s  15m8.910s   0m12.730s
10          1e-9     16m33.748s  16m12.500s  0m13.080s
10          1e-10    16m20.137s  15m55.910s  0m13.190s
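
To compare these run times more easily, the values can be converted to seconds. The Python sketch below does this for the "real" (wall-clock) times listed above. The per-iteration figure assumes that every requested iteration was actually carried out, which the journal does not confirm, since PSI-BLAST can stop early once the search converges.

```python
import re

def to_seconds(stamp):
    """Convert a bash `time` value such as '3m30.256s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

# "real" times copied from the runs above: (iterations, E-value) -> time
real_times = {
    (2, 0.002): "3m30.256s",
    (2, 1e-9): "3m8.507s",
    (2, 1e-10): "3m10.271s",
    (10, 0.002): "15m29.218s",
    (10, 1e-9): "16m33.748s",
    (10, 1e-10): "16m20.137s",
}

for (iterations, evalue), stamp in real_times.items():
    seconds = to_seconds(stamp)
    # Per-iteration value assumes all requested iterations were run.
    print(f"{iterations:>2} iterations, E-value {evalue:g}: "
          f"{seconds:8.3f} s total, {seconds / iterations:7.3f} s per iteration")
```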

HHblits / HHsearch

We searched the "uniprot20" database with HHblits, once with the default number of iterations and once with the maximum number of iterations (8), using the following commands:

time hhblits -i ../P06280.fasta -d /mnt/project/pracstrucfunc12/data/hhblits/uniprot20_current -e 0.003 -o hhblits_default.out -E 0.003  -z 700
./extract_ids_hhblits.sh hhblits_default.out
perl ../download-annotation.pl hhblits_default_ids.txt
perl ../compare_GO_terms.pl P06280 hhblits_default_ids_GOterms.tsv
perl parse_hhblits.pl hhblits_default.out

time hhblits -i ../P06280.fasta -d /mnt/project/pracstrucfunc12/data/hhblits/uniprot20_current -e 0.003 -o hhblits_n8_neu.out -E 0.003 -n 8 -z 800 -b 800
./extract_ids_hhblits.sh hhblits_n8_neu.out
perl ../download-annotation.pl hhblits_n8_neu_ids.txt
perl ../compare_GO_terms.pl P06280 hhblits_n8_neu_ids_GOterms.tsv
perl parse_hhblits.pl hhblits_n8_neu.out

R CMD BATCH hist_hhblits.R

Comparison

The Venn diagrams were created with VENNY: Oliveros, J.C. (2007) VENNY. An interactive tool for comparing lists with Venn Diagrams.

R CMD BATCH all_Evalues.R
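
The numbers shown in a Venn diagram can also be checked without the web tool. The following Python sketch computes pairwise overlaps between ID lists; it assumes that each *_ids.txt file holds one identifier per line, which is our assumption about the output of the extraction scripts. The function names are hypothetical.

```python
from itertools import combinations

def read_ids(path):
    """Read one identifier per line (assumed file layout), ignoring blank lines."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}

def pairwise_overlaps(id_sets):
    """Map each pair of list names to (only in first, shared, only in second)."""
    overlaps = {}
    for (name_a, ids_a), (name_b, ids_b) in combinations(id_sets.items(), 2):
        overlaps[(name_a, name_b)] = (
            len(ids_a - ids_b),
            len(ids_a & ids_b),
            len(ids_b - ids_a),
        )
    return overlaps
```

For example, pairwise_overlaps({"blast": read_ids("blastsearch_default_ids.txt"), "hhblits": read_ids("hhblits_default_ids.txt")}) would return the three region sizes of the corresponding two-set diagram.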


Multiple sequence alignments

Results

The following commands were used to generate the multiple sequence alignments. The pictures were produced with Jalview.

clustalw -infile=fabry_dataset_99.fasta -outfile=msa/clustalw_fabry_dataset_99.msa &
clustalw -infile=fabry_dataset_89.fasta -outfile=msa/clustalw_fabry_dataset_89.msa &
clustalw -infile=fabry_dataset_59.fasta -outfile=msa/clustalw_fabry_dataset_59.msa &
clustalw -infile=fabry_dataset_39.fasta -outfile=msa/clustalw_fabry_dataset_39.msa &

muscle -in fabry_dataset_99.fasta -out msa/muscle_fabry_dataset_99.msa &
muscle -in fabry_dataset_89.fasta -out msa/muscle_fabry_dataset_89.msa &
muscle -in fabry_dataset_59.fasta -out msa/muscle_fabry_dataset_59.msa &
muscle -in fabry_dataset_39.fasta -out msa/muscle_fabry_dataset_39.msa &

/mnt/opt/T-Coffee/bin/t_coffee -seq fabry_dataset_99.fasta  -outfile msa/tcoffee_fabry_dataset_99.msa  &
/mnt/opt/T-Coffee/bin/t_coffee -seq fabry_dataset_89.fasta  -outfile msa/tcoffee_fabry_dataset_89.msa  &
/mnt/opt/T-Coffee/bin/t_coffee -seq fabry_dataset_59.fasta  -outfile msa/tcoffee_fabry_dataset_59.msa  &
/mnt/opt/T-Coffee/bin/t_coffee -seq fabry_dataset_39.fasta  -outfile msa/tcoffee_fabry_dataset_39.msa  &
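
The twelve commands above follow one pattern per tool, so they can also be generated in a loop. The Python sketch below only builds and prints the command lines (a dry run), using the same program paths, options and file names as listed above. The function name alignment_commands is hypothetical.

```python
import shlex

DATASETS = ["99", "89", "59", "39"]

def alignment_commands(dataset):
    """Return the ClustalW, MUSCLE and T-Coffee command for one dataset."""
    fasta = f"fabry_dataset_{dataset}.fasta"
    return {
        "clustalw": ["clustalw", f"-infile={fasta}",
                     f"-outfile=msa/clustalw_fabry_dataset_{dataset}.msa"],
        "muscle": ["muscle", "-in", fasta,
                   "-out", f"msa/muscle_fabry_dataset_{dataset}.msa"],
        "tcoffee": ["/mnt/opt/T-Coffee/bin/t_coffee", "-seq", fasta,
                    "-outfile", f"msa/tcoffee_fabry_dataset_{dataset}.msa"],
    }

# Dry run: print each command instead of executing it.
for dataset in DATASETS:
    for tool, command in alignment_commands(dataset).items():
        print(shlex.join(command))
```

To actually run the alignments, each command list could be passed to subprocess.run(command, check=True) instead of being printed.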

We counted the gaps and the conserved columns in each alignment with the Perl script "countGaps.pl".

perl countGaps.pl msa/clustalw_fabry_dataset_0.msa  > msa/clustalw_fabry_dataset_0.counts
perl countGaps.pl msa/clustalw_fabry_dataset_40.msa > msa/clustalw_fabry_dataset_40.counts
perl countGaps.pl msa/clustalw_fabry_dataset_61.msa > msa/clustalw_fabry_dataset_61.counts
 
perl countGaps.pl msa/muscle_fabry_dataset_0.msa  > msa/muscle_fabry_dataset_0.counts
perl countGaps.pl msa/muscle_fabry_dataset_40.msa > msa/muscle_fabry_dataset_40.counts
perl countGaps.pl msa/muscle_fabry_dataset_61.msa > msa/muscle_fabry_dataset_61.counts

perl countGaps.pl msa/tcoffee_fabry_dataset_0.msa  > msa/tcoffee_fabry_dataset_0.counts
perl countGaps.pl msa/tcoffee_fabry_dataset_40.msa > msa/tcoffee_fabry_dataset_40.counts
perl countGaps.pl msa/tcoffee_fabry_dataset_61.msa > msa/tcoffee_fabry_dataset_61.counts

perl countGaps.pl msa/3Dcoffee_fabry_dataset_0.msa  > msa/3Dcoffee_fabry_dataset_0.counts
perl countGaps.pl msa/3Dcoffee_fabry_dataset_40.msa > msa/3Dcoffee_fabry_dataset_40.counts
perl countGaps.pl msa/3Dcoffee_fabry_dataset_61.msa > msa/3Dcoffee_fabry_dataset_61.counts
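
countGaps.pl itself is not shown here. The following Python sketch illustrates one possible way to obtain such counts and is not a port of that script. It assumes the alignment is available as aligned FASTA text, that "-" marks a gap, and that a column counts as conserved when every sequence carries the same residue and none has a gap. All three points are our assumptions and may differ from what countGaps.pl does.

```python
def read_aligned_fasta(text):
    """Parse aligned FASTA text into {header: aligned sequence}."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].strip()
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records

def alignment_stats(records):
    """Count gap characters and fully conserved, gap-free columns."""
    sequences = list(records.values())
    if len({len(seq) for seq in sequences}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    total_gaps = sum(seq.count("-") for seq in sequences)
    conserved = sum(
        1
        for column in zip(*sequences)
        if "-" not in column and len({residue.upper() for residue in column}) == 1
    )
    return {"gaps": total_gaps, "conserved_columns": conserved}
```

ClustalW and T-Coffee write Clustal-format files by default, so their output would first have to be converted to FASTA (or the reader adapted) before this sketch could be applied to it.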