(algorithm)

**Definition:**
A minimal variable-length character coding based on the frequency of each character. First, each character becomes a one-node *binary tree*, with the character as the only *node*. The character's frequency is the tree's frequency. Two trees with the least frequencies are joined as the subtrees of a new root that is assigned the sum of their frequencies. Repeat until all characters are in one tree. One code bit represents each *level*. Thus more frequent characters are near the *root* and are coded with few bits, and rare characters are far from the root and are coded with many bits.

**Also known as** static Huffman coding.

**Generalization** (I am a kind of ...)

*greedy algorithm*.

**Specialization** (... is a kind of me.)

*adaptive Huffman coding*, *k-ary Huffman coding*.

**Aggregate child** (... is a part of or used in me.)

*coding tree*, *full binary tree*, *priority queue*.

**See also**
*order-preserving Huffman coding*, *arithmetic coding*, *optimal merge*, *Shannon-Fano coding*.

*Note:
The worst case for Huffman coding (or, equivalently, the longest Huffman coding for a set of characters) is when the distribution of frequencies follows the Fibonacci numbers. For this and other relations see Alex Vinokur's note on Fibonacci numbers, Lucas numbers and Huffman codes. *

Joining trees by frequency is the same as merging sequences by length in *optimal merge*. See the example there. Since a node with only one child is not optimal, any Huffman coding corresponds to a *full binary tree*.

Huffman coding is one of many lossless compression algorithms. This algorithm produces a *prefix code*.

* Shannon-Fano is a minimal prefix code. Huffman is optimal for character coding (one character-one code word) and simple to program. Arithmetic coding is even more compact, since it can allocate fractional bits, but is more complicated and patents cover some uses.*

Author: PEB

A survey on data compression, John Morris' explanation, example, and analysis. Wikipedia's encyclopedic article on Huffman coding.

**David A. Huffman**, *A Method for The Construction of Minimum Redundancy Codes*, Proceedings of IRE, 40(9):1098-1101, September 1952.

Go to the Dictionary of Algorithms and Data Structures home page.

If you have suggestions, corrections, or comments, please get in touch with Paul Black.

Entry modified 1 April 2019.

HTML page formatted Thu Apr 4 11:16:52 2019.

Cite this as:

Paul E. Black, "Huffman coding", in
*Dictionary of Algorithms and Data Structures* [online], Paul E. Black, ed. 1 April 2019. (accessed TODAY)
Available from: https://www.nist.gov/dads/HTML/huffmanCoding.html