Difference between revisions of "Fabry:Sequence alignments (sequence searches and multiple alignments):Results"

From Bioinformatikpedia

Revision as of 07:20, 5 May 2012

Please see Task 2 for our scripts and line of action on this topic.

Sequence searches

Blast

Comparison of the GO terms of P06280 (AGAL_HUMAN) with those of each BLAST hit (E-value <= 0.003). The upper picture shows the percentage of shared terms relative to the number of GO terms of P06280, the second picture relative to the number of GO terms of each hit.
Histogram of the logarithmic E-values of the BLAST hits for P06280
Histogram of the positive amino acids of the pairwise alignments of the BLAST hits for P06280
Histogram of the identical amino acids of the pairwise alignments of the BLAST hits for P06280
Histogram of the length of the BLAST hits for P06280

Number of hits with E-value < 0.003: 663
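
The two normalizations used in the GO term comparison can be illustrated with a minimal Python sketch. It is not the script we used (see Task 2 for that); the function name and the GO labels are placeholders chosen for illustration, and the GO annotations are assumed to be already available as plain sets of identifiers.

```python
def shared_go_percentages(query_terms, hit_terms):
    """Return the percentage of shared GO terms in two normalizations:
    relative to the query's terms and relative to the hit's terms."""
    query = set(query_terms)
    hit = set(hit_terms)
    shared = query & hit
    # Guard against empty annotation sets to avoid division by zero.
    pct_of_query = 100.0 * len(shared) / len(query) if query else 0.0
    pct_of_hit = 100.0 * len(shared) / len(hit) if hit else 0.0
    return pct_of_query, pct_of_hit


# Placeholder labels, NOT the real annotations of P06280 or of any hit.
query_terms = {"GO:A", "GO:B", "GO:C", "GO:D"}
hit_terms = {"GO:B", "GO:C", "GO:X"}

pct_of_query, pct_of_hit = shared_go_percentages(query_terms, hit_terms)
print(f"shared, relative to query: {pct_of_query:.1f}%")
print(f"shared, relative to hit:   {pct_of_hit:.1f}%")
```

The two values differ whenever the query and the hit carry a different number of GO terms, which is why both pictures are shown.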

Psi-Blast

HHblits

2 iterations - default

Comparison of the GO terms of P06280 (AGAL_HUMAN) with those of each HHblits hit (E-value < 0.003). The upper picture shows the percentage of shared terms relative to the number of GO terms of P06280, the second picture relative to the number of GO terms of each hit.
Histogram of the logarithmic E-values of the HHblits hits for P06280

Number of hits with E-value < 0.003: 326

Histogram of the similarity of the HHblits hits to P06280
Histogram of the identical amino acids of the pairwise alignments of the HHblits hits for P06280

8 iterations

Comparison of the GO terms of P06280 (AGAL_HUMAN) with those of each HHblits hit (E-value < 0.003, 8 iterations). The upper picture shows the percentage of shared terms relative to the number of GO terms of P06280, the second picture relative to the number of GO terms of each hit.
Histogram of the logarithmic E-values of the HHblits search with 8 iterations for P06280

Number of hits with E-value < 0.003: 729

Histogram of the similarity of the HHblits hits (search with 8 iterations) to P06280
Histogram of the identical amino acids of the pairwise alignments of the HHblits hits (search with 8 iterations) for P06280

Comparison

Comparing the hits

Venn diagram of the proteins found by BLAST, HHblits and HHblits with 8 iterations
Venn diagram of the proteins found by BLAST, Psi-BLAST (10 iterations and E-value cutoff 10e-10) and HHblits with 8 iterations
Venn diagram of the first 100 proteins found by BLAST, HHblits and HHblits with 8 iterations
Venn diagram of the first 100 proteins found by BLAST, Psi-BLAST and HHblits with 8 iterations

Venn diagrams created with Oliveros, J.C. (2007) VENNY. An interactive tool for comparing lists with Venn Diagrams.
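
The diagrams themselves were drawn with VENNY. The region counts of such a three-way diagram can also be checked with plain set operations, as in the following minimal Python sketch. The function names and the protein identifiers are hypothetical, and each hit list is assumed to be ranked with the best hit first.

```python
def top_n(ranked_hits, n=100):
    """Return the first n distinct identifiers of a ranked hit list."""
    unique = []
    seen = set()
    for hit in ranked_hits:
        if len(unique) >= n:
            break
        if hit not in seen:
            seen.add(hit)
            unique.append(hit)
    return unique


def venn3_regions(a, b, c):
    """Split three hit lists into the seven regions of a Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "only_a": a - b - c,
        "only_b": b - a - c,
        "only_c": c - a - b,
        "a_and_b_only": (a & b) - c,
        "a_and_c_only": (a & c) - b,
        "b_and_c_only": (b & c) - a,
        "all_three": a & b & c,
    }


# Made-up identifiers standing in for the hits of three searches.
blast_hits = ["p1", "p2", "p3"]
hhblits_hits = ["p2", "p3", "p4"]
hhblits8_hits = ["p3", "p5"]

regions = venn3_regions(top_n(blast_hits), top_n(hhblits_hits), top_n(hhblits8_hits))
for name, members in regions.items():
    print(name, len(members))
```

Restricting each list with `top_n` before the comparison corresponds to the diagrams of the first 100 proteins; omitting it corresponds to the diagrams over all hits.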

Comparing the E-values

Fabry animation.gif

Above you can see histograms of the distribution of the E-values for the searches performed with the different methods. The R script is based on Andrea's R script psiBlast.evalueHist.Rscript.
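
For readers without R, the idea behind such a histogram can be sketched in a few lines of Python. This is not a port of psiBlast.evalueHist.Rscript: the function names, the bin width of 10 and the floor used for E-values reported as 0 are assumptions made for illustration, and the E-values are made up.

```python
import math
from collections import Counter


def log10_evalues(evalues, floor=1e-200):
    """Convert E-values to log10; values reported as 0 are clamped to
    `floor` (an assumed value) so that the logarithm is defined."""
    return [math.log10(max(e, floor)) for e in evalues]


def histogram(values, bin_width=10):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def count_significant(evalues, cutoff=0.003):
    """Number of hits with an E-value strictly below the cutoff."""
    return sum(1 for e in evalues if e < cutoff)


# Made-up E-values, not taken from any of our searches.
evalues = [3e-80, 7e-78, 2e-42, 5e-4, 0.002, 0.5]

print(histogram(log10_evalues(evalues)))
print("significant hits:", count_significant(evalues))
```

Plotting the bin counts for each method side by side gives the kind of comparison shown in the animation.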

As one can clearly see, the number of significant hits in the Psi-BLAST search exceeds that of the other two searches by far. Its histogram also looks more like a normal distribution with a mean of about -80 on the logarithmic E-value scale, whereas the histograms of the BLAST and the HHblits searches do not, but rather tend towards zero. The fewest hits are generated by the "ordinary" BLAST search (663), while the Psi-BLAST search finds roughly ten times as many (6868). Thus, with respect to the E-values, I would prefer using Psi-BLAST.