directed acyclic word graph

(data structure)

Definition: (1) A directed acyclic graph representing the suffixes of a given string in which each edge is labeled with a character. The characters along a path from the root to a node are the substring which the node represents. (2) A finite state machine that recognizes a set of words.

Also known as DAWG.

See also compact DAWG, suffix tree, trie, Patricia tree, suffix automaton.

Note: A variant, the compact DAWG, labels the edges with strings, not just single characters. In the compact DAWG, any node which is an only child is merged with its parent and the edge labels are concatenated. Another variant is the morphic DAWG, where some substrings are coded with new characters, like a simplified Huffman coding.

Author: PEB


JohnPaul Adamovsky's Directed Acyclic Word Graph or DAWG page with introduction, explanation, C implementation, and notes on optimization.

More information

(1) Andrew W. Appel and Guy J. Jacobson, The World's Fastest Scrabble Program, CACM, 31(5):572-578, May 1988. Good description and comparison with a trie.

M. Crochemore and R. Verin, Direct Construction of Compact Directed Acyclic Word Graphs, 8th Annual Symposium, CPM 97, Aarhus, Denmark, 116-129, 1997.

(2) J. Aoe, K. Morimoto, M. Shishibori and K-H. Park, A Trie Compaction Algorithm for a Large Set of Keys, IEEE Trans. Knowledge and Data Engineering, 8(3):476-491, 1996.

D. Perrin, Finite Automata, in: J. van Leeuwen, ed., Handbook of Theoretical Computer Science, Elsevier, Amsterdam, Vol. A, 3-57, 1990.

Go to the Dictionary of Algorithms and Data Structures home page.

If you have suggestions, corrections, or comments, please get in touch with Paul Black.

Entry modified 30 December 2011.
HTML page formatted Tue Jan 16 10:34:44 2018.

Cite this as:
Paul E. Black, "directed acyclic word graph", in Dictionary of Algorithms and Data Structures [online], Vreda Pieterse and Paul E. Black, eds. 30 December 2011. (accessed TODAY) Available from: