Difference between revisions of "Sequence Alignments TSD"

Revision as of 20:40, 3 May 2012

Sequence searches

Several alignment methods, provided by various initiatives, tackle the problem of sequence searches. Here some of them are applied to the Hex A protein and analyzed. Non-redundant protein databases are used for the searches. The outputs are brought into a common format and compared in order to determine the best results. A protocol containing the basic steps taken is available.

Blast

The first sequence similarity search with the Hex A protein was run with Blast. The default settings, which output 250 alignments, cover just a small fraction of the similar proteins: even the last hit still has a very low e-value of 3e-48. This shows that the search can safely be extended to include more sequences. This is especially important because sequences with a comparatively low sequence identity of 20% are needed for the multiple sequence alignment. Sequence identity correlates with the Blast hit rank, so the e-value is generally expected to increase as sequence identity decreases. To balance the loss of quality that comes with higher e-values against the need for low-identity sequences, the output was limited to 1200 sequences. With this limit the e-value does not exceed 1e-4, so the alignment quality is still sufficient, while sequences with the required identity are also included.
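The selection rule described above (keep hits up to an e-value of 1e-4, capped at 1200 sequences) can be sketched in Python. This is only an illustration, not the procedure actually used for this page: it assumes the Blast results are available in the standard 12-column tabular format (query, subject, percent identity, ..., e-value, bit score), and the hit names and values in `EXAMPLE` are made up for the demonstration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    subject: str      # identifier of the database sequence
    identity: float   # percent identity of the alignment
    evalue: float     # e-value reported by Blast


def parse_tabular(text: str) -> list[Hit]:
    """Parse Blast tabular lines (12 standard tab-separated columns)."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        # column 2 = subject id, column 3 = % identity, column 11 = e-value
        hits.append(Hit(cols[1], float(cols[2]), float(cols[10])))
    return hits


def select_hits(hits: list[Hit], max_evalue: float = 1e-4,
                max_hits: int = 1200) -> list[Hit]:
    """Keep hits with e-value <= max_evalue, best first, at most max_hits."""
    kept = [h for h in hits if h.evalue <= max_evalue]
    kept.sort(key=lambda h: h.evalue)
    return kept[:max_hits]


# Hypothetical example data, NOT taken from the actual Hex A search.
EXAMPLE = "\n".join([
    "query\thitA\t98.5\t529\t8\t0\t1\t529\t1\t529\t0.0\t1090",
    "query\thitB\t35.2\t500\t300\t10\t5\t500\t3\t498\t3e-48\t180",
    "query\thitC\t21.0\t450\t330\t15\t20\t460\t18\t455\t1e-4\t45",
    "query\thitD\t19.0\t200\t150\t8\t30\t220\t25\t215\t0.5\t30",
])

selected = select_hits(parse_tabular(EXAMPLE))
for hit in selected:
    print(hit.subject, hit.identity, hit.evalue)
print("lowest identity kept:", min(h.identity for h in selected))
```

In this toy input the hit with an e-value of 0.5 is dropped, while a hit with roughly 20% identity still passes the cutoff, which mirrors the trade-off described in the paragraph.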

PSI-Blast

Iterations	2	2	10	10
E-value	0.002	10E-10	0.002	10E-10
BIG80	3m53	4m3	18m57	21m9
BIG	17m19	11m13	16m39	11m13

Table 1: PSI-Blast performance (run time in minutes and seconds) on BIG80 and BIG for different numbers of iterations and e-value cutoffs.
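To compare the entries of Table 1 numerically, the values can be converted to seconds. The following sketch assumes that an entry such as 3m53 means 3 minutes 53 seconds; the names `to_seconds`, `SETTINGS` and `RUNTIMES` are introduced here for illustration only.

```python
import re


def to_seconds(runtime: str) -> int:
    """Convert a string like '3m53' (assumed: minutes 'm' seconds) to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)", runtime)
    if match is None:
        raise ValueError(f"unexpected runtime format: {runtime!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


# Column settings of Table 1: (iterations, e-value cutoff as written there).
SETTINGS = [(2, "0.002"), (2, "10E-10"), (10, "0.002"), (10, "10E-10")]

# Table 1 rows, copied as given.
RUNTIMES = {
    "BIG80": ["3m53", "4m3", "18m57", "21m9"],
    "BIG": ["17m19", "11m13", "16m39", "11m13"],
}

seconds = {db: [to_seconds(t) for t in times] for db, times in RUNTIMES.items()}

for i, (iterations, evalue) in enumerate(SETTINGS):
    print(f"iterations={iterations}, e-value={evalue}: "
          f"BIG80 {seconds['BIG80'][i]} s, BIG {seconds['BIG'][i]} s")
```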

