NIST

string matching

(classic problem)

Definition: The problem of finding occurrence(s) of a pattern string within another string or body of text. There are many different algorithms for efficient searching.

Also known as exact string matching, string searching, text searching.

Generalization (I am a kind of ...)
search.

Specialization (... is a kind of me.)
brute force string search, Knuth-Morris-Pratt algorithm, Boyer-Moore, Zhu-Takaoka, quick search, deterministic finite automata string search, Karp-Rabin, Shift-Or, Aho-Corasick, Smith algorithm, strsrch.

See also string matching with errors, optimal mismatch, phonetic coding, string matching on ordered alphabets, suffix tree, inverted index.

Note: For large collections that are searched often, it may be far faster, though more complicated, to start with an inverted index. The name "exact string matching" is in contrast to string matching with errors.

Author: PEB

Implementation

Christian Charras' and Thierry Lecroq's Exact String Matching Algorithms (C), (C++ and Pascal), Strmat (C) - a collection of string matching and pattern discovery programs, (Fortran), (Fortran).
Go to the Dictionary of Algorithms and Data Structures home page.

If you have suggestions, corrections, or comments, please get in touch with Paul Black.

Entry modified 2 November 2020.
HTML page formatted Mon Nov 2 12:36:42 2020.

Cite this as:
Paul E. Black, "string matching", in Dictionary of Algorithms and Data Structures [online], Paul E. Black, ed. 2 November 2020. (accessed TODAY) Available from: https://www.nist.gov/dads/HTML/stringMatching.html