Definition: (1) The smallest number of insertions, deletions, and substitutions required to change one string or tree into another. (2) A Θ(m × n) algorithm to compute the distance between strings, where m and n are the lengths of the strings.
Also known as edit distance.
Generalization (I am a kind of ...)
string matching with errors.
Aggregate child (... is a part of or used in me.)
See also double metaphone, soundex, Jaro-Winkler, Hamming distance.
Note: (1) From Algorithms and Theory of Computation Handbook, page 14-35, Copyright © 1999 by CRC Press LLC. Appearing in the Dictionary of Computer Science, Engineering and Technology, Copyright © 2000 CRC Press LLC.
Wikipedia entry which has links to implementations. March 2003 pictures of Levenshtein at a reception.
Vladimir I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, Doklady Akademii Nauk SSSR, 163(4):845-848, 1965 (Russian). English translation in Soviet Physics Doklady, 10(8):707-710, 1966.
(Doklady is Russian for "Report". Sometimes transliterated in English as Doclady or Dokladi.)
If you have suggestions, corrections, or comments, please get in touch with Paul E. Black.
Entry modified 14 August 2008.
HTML page formatted Mon Jul 16 11:56:53 2012.
Cite this as:
Algorithms and Theory of Computation Handbook, CRC Press LLC, 1999, "Levenshtein distance", in Dictionary of Algorithms and Data Structures [online], Paul E. Black, ed., U.S. National Institute of Standards and Technology. 14 August 2008. (accessed TODAY) Available from: http://www.nist.gov/dads/HTML/Levenshtein.html