(algorithm)

**Definition:**
A variable-length character coding based on the frequency of each character. The algorithm is similar to *Huffman coding*, but the trees are kept in the same order as the characters. Two *adjacent* trees with the least combined frequency are joined as subtrees of a new root. As with Huffman coding, that new tree is assigned the sum of the subtrees' frequencies. Repeat until all characters are in one tree.

**Aggregate child** (... is a part of or used in me.)

*full binary tree*, *priority queue*, *greedy algorithm*.

**See also**
*Huffman coding*.

*Note:
This may not produce a true Huffman code. Although encoded messages may be up to twice as long as true Huffman codes, order-preserving Huffman codes are "quite effective for most data." [Graef06, page 5] *

* There are algorithms to produce optimum order-preserving prefix codes, but they are more complicated.*

Author: PEB

Go to the Dictionary of Algorithms and Data Structures home page.

If you have suggestions, corrections, or comments, please get in touch with Paul Black.

Entry modified 5 December 2006.

HTML page formatted Fri Feb 23 10:06:08 2018.

Cite this as:

Paul E. Black, "order-preserving Huffman coding", in
*Dictionary of Algorithms and Data Structures* [online], Vreda Pieterse and Paul E. Black, eds. 5 December 2006. (accessed TODAY)
Available from: https://www.nist.gov/dads/HTML/orderPreservingHuffmanCoding.html