NIST

phonetic coding

(classic problem)

Definition: Code a string based on how it is pronounced.

Specialization (... is a kind of me.)
double metaphone, Jaro-Winkler, Caverphone, NYSIIS, soundex.

See also string matching with errors.

Note: Because spelling variants of names are common in English, algorithms have been developed to code names based on how they sound. Searching and matching are done by converting each name to a phonetic code and comparing the codes. For example, if I type "Hansen" into my electronic telephone book, it is useful for it to offer "Hanson" as a possible match.
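As an illustration of coding and comparing names, here is a minimal sketch of soundex, one of the specializations listed above. It assumes the commonly described American Soundex rules (keep the first letter, map the remaining consonants to digits, collapse adjacent equal digits, ignore H and W, pad to four characters). The function name `soundex` and the handling of non-letters are choices made for this example, not part of the entry.

```python
def soundex(name: str) -> str:
    """Return a four-character soundex-style code for a name (illustrative sketch)."""
    # Digit assigned to each consonant group; vowels, H, W and Y get no digit.
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit

    # Assumption: anything other than the letters A-Z is dropped.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = [letters[0]]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate consonants with the same digit
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result.append(digit)
        prev = digit  # a vowel resets prev, so a repeated digit after it is kept

    # Pad with zeros and truncate to one letter plus three digits.
    return ("".join(result) + "000")[:4]


print(soundex("Hansen"))  # H525
print(soundex("Hanson"))  # H525
```

Both spellings receive the same code, so a lookup keyed on the code would offer "Hanson" when "Hansen" is typed.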

Levenshtein distance and other measures or algorithms that allow for spelling errors usually compare the names directly with more sophisticated matching routines, rather than preprocessing each name into a code.
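For contrast, a minimal sketch of Levenshtein distance follows. It uses the standard dynamic-programming formulation with unit costs for insertion, deletion, and substitution. The function name `levenshtein` is chosen for this example. No name is preprocessed; all of the work happens when two names are compared.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


print(levenshtein("Hansen", "Hanson"))  # 1
```

A search might then accept any stored name within some small distance of the query. That threshold is an application choice, not something this entry specifies.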

Author: PEB




Entry modified 17 December 2004.
HTML page formatted Wed Mar 13 12:42:46 2019.

Cite this as:
Paul E. Black, "phonetic coding", in Dictionary of Algorithms and Data Structures [online], Paul E. Black, ed. 17 December 2004. (accessed TODAY) Available from: https://www.nist.gov/dads/HTML/phoneticCoding.html