|Author||David J. Lipman, William R. Pearson|
|Reference||Rapid and sensitive protein similarity searches|
|Short description||Alignment heuristic|
|Method||look-up table & agglomerative alignment|
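FASTA is a heuristic algorithm for sequence alignment. It trades the guarantee of an optimal alignment for speed when comparing a query sequence against a database.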
The algorithm works in four steps:
- Identify the regions of highest density of identities in each sequence comparison: search for identical tuples (words of length ktup, typically 2 for proteins) between the query sequence and a look-up table built from the database sequences. A match is declared when ktup consecutive residues are identical in the two sequences. The matches can be visualized as diagonals in a matrix comparing the two sequences. The 10 best local regions, selected across all diagonals, are then saved.
- Rescan the saved regions using a scoring matrix, trimming the ends of each region so that it includes only the residues contributing to the highest score.
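- Join the initial regions: determine the combination of compatible regions (regions that can be part of the same alignment) that gives the maximal score.
- Use a banded Smith-Waterman algorithm, restricted to a band around the best initial region, to calculate an optimal alignment score.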
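
The first and last steps above can be illustrated with a minimal Python sketch. It is not the original FASTA implementation: the function names (`build_lookup`, `diagonal_hits`, `best_diagonals`, `banded_sw_score`), the band width, and the match, mismatch, and gap scores are assumptions chosen for illustration. Diagonals are ranked here by a plain count of identical words, whereas FASTA itself scores runs of hits along each diagonal, and a simple match/mismatch score stands in for a real scoring matrix.

```python
from collections import defaultdict


def build_lookup(seq, ktup=2):
    """Map every word of length ktup in seq to the positions where it starts."""
    table = defaultdict(list)
    for j in range(len(seq) - ktup + 1):
        table[seq[j:j + ktup]].append(j)
    return table


def diagonal_hits(query, subject, ktup=2):
    """Count identical ktup-words per diagonal, where diagonal = i - j."""
    table = build_lookup(subject, ktup)
    hits = defaultdict(int)
    for i in range(len(query) - ktup + 1):
        for j in table.get(query[i:i + ktup], ()):
            hits[i - j] += 1
    return dict(hits)


def best_diagonals(query, subject, ktup=2, keep=10):
    """Return up to `keep` (diagonal, hit count) pairs, densest first."""
    hits = diagonal_hits(query, subject, ktup)
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))[:keep]


def banded_sw_score(query, subject, center, width=8,
                    match=2, mismatch=-1, gap=-2):
    """Smith-Waterman score using only cells with |(i - j) - center| <= width.

    Cells outside the band keep the value 0, which in a local alignment
    is the same as starting a new alignment there.
    """
    n, m = len(query), len(subject)
    prev = [0] * (m + 1)
    best = 0
    for i in range(1, n + 1):
        cur = [0] * (m + 1)
        lo = max(1, i - center - width)
        hi = min(m, i - center + width)
        for j in range(lo, hi + 1):
            s = match if query[i - 1] == subject[j - 1] else mismatch
            cur[j] = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    query, subject = "ACDEFGHIKL", "XXACDEFGHIKL"
    diag, count = best_diagonals(query, subject)[0]
    print(diag, count)
    print(banded_sw_score(query, subject, center=diag, width=1))
```

Because the look-up table is built once per sequence, each query word is located without scanning the whole sequence, and the band limits the dynamic programming to cells near the diagonal found in the first step.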