Title: A linear time algorithm for finding all maximal scoring subsequences.
Publication Type: Journal Article
Year of Publication: 1999
Authors: Ruzzo WL, Tompa M
Journal: Proceedings / ... International Conference on Intelligent Systems for Molecular Biology ; ISMB. International Conference on Intelligent Systems for Molecular Biology
Pagination: 234-41
Date or Month Published: 1999
ISSN: 1553-0833
Keywords: Algorithms; Models, Statistical; Sequence Analysis, DNA; Software; Time Factors
Abstract: Given a sequence of real numbers ("scores"), we present a practical linear time algorithm to find those nonoverlapping, contiguous subsequences having greatest total scores. This improves on the best previously known algorithm, which requires quadratic time in the worst case. The problem arises in biological sequence analysis, where the high-scoring subsequences correspond to regions of unusual composition in a nucleic acid or protein sequence. For instance, Altschul, Karlin, and others have used this approach to identify transmembrane regions, DNA binding domains, and regions of high charge in proteins.
Downloads: http://www.ncbi.nlm.nih.gov/pubmed/10786306?dopt=Abstract
Alternate Journal: Proc Int Conf Intell Syst Mol Biol
Citation Key: 1895
PubMed ID: 10786306
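
Illustrative sketch (not part of the catalog record): the abstract describes a single left-to-right pass that finds all nonoverlapping, contiguous subsequences of greatest total score. The Python below is one possible rendering of that idea, written from the commonly described formulation of the algorithm and not copied from the paper. The function name, the tuple layout, and the sample input are assumptions made for illustration. Each candidate keeps the cumulative score just before it (`left`) and at its end (`right`), plus a back-pointer to the nearest earlier candidate with a strictly smaller `left` value. Following these back-pointers instead of rescanning the list is what is intended to keep the total work linear.

```python
def maximal_scoring_subsequences(scores):
    """Return a list of (start, end, total) triples, with end exclusive.

    Hypothetical helper for illustration; names and layout are assumptions.
    """
    # Each candidate is (start, end, left, right, prev), where
    #   left  = cumulative score just before the candidate,
    #   right = cumulative score at the end of the candidate,
    #   prev  = index of the nearest earlier candidate with smaller left, or -1.
    candidates = []
    total = 0
    for i, score in enumerate(scores):
        before = total
        total += score
        if score <= 0:
            continue  # nonpositive scores only change the running total
        start, left, right = i, before, total
        j = len(candidates) - 1
        while True:
            # Find the rightmost candidate whose left value is strictly smaller.
            while j >= 0 and candidates[j][2] >= left:
                j = candidates[j][4]
            if j < 0 or candidates[j][3] >= right:
                # No earlier candidate can be profitably merged with this one.
                candidates.append((start, i + 1, left, right, j))
                break
            # Otherwise extend leftward to the start of candidate j, drop
            # candidates j and later, and retry from j's back-pointer.
            start, left, prev = candidates[j][0], candidates[j][2], candidates[j][4]
            del candidates[j:]
            j = prev
    return [(s, e, r - l) for s, e, l, r, _ in candidates]


if __name__ == "__main__":
    sample = [4, -5, 3, -3, 1, 2, -2, 2, -2, 1, 5]
    for start, end, subtotal in maximal_scoring_subsequences(sample):
        print(sample[start:end], subtotal)
```

On the sample input this reports three subsequences with totals 4, 3, and 7. Note that the highest-scoring one spans several negative entries, which is why a simple "split at every negative score" rule would not give the same answer.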