FASTA

From Bioinformatikpedia
Revision as of 16:30, 17 August 2011 by Meier (talk | contribs)

Basic Information

Author: David J. Lipman, William R. Pearson
Year: 1985
Reference: Rapid and sensitive protein similarity searches
Short description: Alignment heuristic
Method: look-up table & agglomerative alignment

FASTA is a heuristic method for sequence alignment.




Details

The algorithm works in four steps:

  • Identify the regions of highest match density in each sequence comparison: using a look-up table, search for identical tuples (words whose length is given by the ktup value, here 2) shared between the query sequence and the database sequences. Each shared tuple can be visualized as a point in a matrix comparing the two sequences, and tuples that match at the same offset between the sequences fall on the same diagonal. The 10 best local regions, selected across all diagonals, are then saved.
  • Rescan the saved regions using a scoring matrix, trimming the ends of each region so that it includes only the residues contributing to the highest score.
  • Join the initial regions: find the combination of compatible regions that gives the maximal score.
  • Use a banded Smith-Waterman algorithm to calculate the optimal alignment score within a band of the comparison matrix.
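
The first and last steps above can be illustrated with a minimal Python sketch. It is not the FASTA implementation: the function names ktup_diagonal_hits and banded_smith_waterman are hypothetical, the flat match/mismatch/gap scores are an assumption standing in for a real scoring matrix, and the rescanning and joining steps are omitted. For readability the banded routine allocates full rows, but it only evaluates cells inside the band.

```python
from collections import defaultdict


def ktup_diagonal_hits(query, subject, ktup=2):
    """Count identical words of length ktup on each diagonal (step 1 sketch)."""
    # Look-up table: word -> list of its start positions in the query.
    table = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        table[query[i:i + ktup]].append(i)

    # A word found at query position i and subject position j
    # lies on diagonal j - i of the comparison matrix.
    hits = defaultdict(int)
    for j in range(len(subject) - ktup + 1):
        for i in table.get(subject[j:j + ktup], ()):
            hits[j - i] += 1
    return dict(hits)


def banded_smith_waterman(a, b, diagonal, width, match=2, mismatch=-1, gap=-2):
    """Best local alignment score restricted to |(j - i) - diagonal| <= width.

    The scores are illustrative assumptions, not a substitution matrix.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)  # cells outside the band stay 0
        lo = max(1, i + diagonal - width)
        hi = min(len(b), i + diagonal + width)
        for j in range(lo, hi + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(0,                   # start a new local alignment
                          prev[j - 1] + s,     # extend along the diagonal
                          prev[j] + gap,       # gap in b
                          curr[j - 1] + gap)   # gap in a
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    query, subject = "ACDEFGHIK", "WWACDEFGHIK"
    hits = ktup_diagonal_hits(query, subject)
    best_diagonal = max(hits, key=hits.get)
    print(hits, banded_smith_waterman(query, subject, best_diagonal, width=1))
```

In this toy example all eight shared 2-tuples fall on diagonal 2, so the band for the final step is centred there. Restricting the dynamic programming to a narrow band is what keeps the last step cheap compared with a full Smith-Waterman comparison.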