GADDAG
Encyclopedia
A GADDAG is a data structure
presented by Steven Gordon in 1994, for use in generating moves for Scrabble
and other word-generation games where such moves require words that "hook into" existing words. It is often in contrast to move-generation algorithms using a Directed Acyclic Word Graph (DAWG)
, such as the one used by Maven
. It is generally twice as fast as the traditional DAWG algorithms, but take about 5 times as much space for regulation Scrabble dictionaries.
Quackle uses a GADDAG to generate moves.
, containing states and branches to other GADDAGs. It is distinct for its storage of every reversed prefix of every word in a dictionary. This means every word has as many representations as it does letters; since the average word in most Scrabble regulation dictionaries is 5 letters long, this makes the GADDAG about 5 times as big as a simple DAWG
.
For example, for the word "explain," a GADDAG will contain direct paths to the strings "e৳xplain," "xe৳plain," "pxe৳lain," "lpxe৳ain," "alpxe৳in," "ialpxe৳n," and "nialpxe."
The DAWG algorithms speed up and take advantage of the second and third constraint: the DAWG is build around the dictionary, and you only traverse it using tiles from your rack. It fails, however, to address the first constraint: supposing you want to 'hook into' the letter P in HARPY, and the board has 2 spaces before the P, you must search the dictionary for all words containing letters from your rack where the third letter is P. This is non-deterministic when searching through the DAWG, as many searches through the trie will be fruitless.
This is addressed by the GADDAG's storage of prefixes: by traversing the P branch of a GADDAG, you see all words that have a P somewhere in their composition, and can "travel up" the prefix to form the word with tiles in your rack. To use the example from the definition section, to hook onto the P, you will immediately see a path for pxe৳lain. You add tiles from your rack while appropriate, traveling backwards through the word until you encounter the ৳, meaning you've completed the prefix. You complete the move by adding to the front of the word with the suffix.
Data structure
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks...
presented by Steven Gordon in 1994, for use in generating moves for Scrabble
Scrabble
Scrabble is a word game in which two to four players score points by forming words from individual lettered tiles on a game board marked with a 15-by-15 grid. The words are formed across and down in crossword fashion and must appear in a standard dictionary. Official reference works provide a list...
and other word-generation games where such moves require words that "hook into" existing words. It is often in contrast to move-generation algorithms using a Directed Acyclic Word Graph (DAWG)
Dawg
Dawg may refer to:* an African American Vernacular English word used to apply to close friends* directed acyclic word graph, a computer science data structure* Dawg, the nickname of American mandolinist David Grisman...
, such as the one used by Maven
Maven (Scrabble)
Maven is the current best known artificial intelligence Scrabble player, created by Brian Sheppard. It has been used in official licensed Hasbro Scrabble games, and the downloadable Funkitron Scrabble.-Game Phases:...
. It is generally twice as fast as the traditional DAWG algorithms, but take about 5 times as much space for regulation Scrabble dictionaries.
Quackle uses a GADDAG to generate moves.
Description
A GADDAG is a specialization of a TrieTrie
In computer science, a trie, or prefix tree, is an ordered tree data structure that is used to store an associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the...
, containing states and branches to other GADDAGs. It is distinct for its storage of every reversed prefix of every word in a dictionary. This means every word has as many representations as it does letters; since the average word in most Scrabble regulation dictionaries is 5 letters long, this makes the GADDAG about 5 times as big as a simple DAWG
Dawg
Dawg may refer to:* an African American Vernacular English word used to apply to close friends* directed acyclic word graph, a computer science data structure* Dawg, the nickname of American mandolinist David Grisman...
.
Definition
For any word in a dictionary that is formed by a non-empty prefix x and a suffix y, a GADDAG contains a direct, deterministic path for any string REV(x)৳y, where ৳ is a concatenation operator.For example, for the word "explain," a GADDAG will contain direct paths to the strings "e৳xplain," "xe৳plain," "pxe৳lain," "lpxe৳ain," "alpxe৳in," "ialpxe৳n," and "nialpxe."
Use in Move Generation
Any move-generation algorithm must adhere to three types of constraints:- Board constraints: You may only build by 'hooking' onto existing letters of the board. Additionally, you may only place tiles on empty squares.
- Rack constraints: You may only place tiles with letters on your rack.
- Dictionary constraint: All words resulting from the placement of tiles exist in the game's dictionary.
The DAWG algorithms speed up and take advantage of the second and third constraint: the DAWG is build around the dictionary, and you only traverse it using tiles from your rack. It fails, however, to address the first constraint: supposing you want to 'hook into' the letter P in HARPY, and the board has 2 spaces before the P, you must search the dictionary for all words containing letters from your rack where the third letter is P. This is non-deterministic when searching through the DAWG, as many searches through the trie will be fruitless.
This is addressed by the GADDAG's storage of prefixes: by traversing the P branch of a GADDAG, you see all words that have a P somewhere in their composition, and can "travel up" the prefix to form the word with tiles in your rack. To use the example from the definition section, to hook onto the P, you will immediately see a path for pxe৳lain. You add tiles from your rack while appropriate, traveling backwards through the word until you encounter the ৳, meaning you've completed the prefix. You complete the move by adding to the front of the word with the suffix.
See Also
- Directed acyclic word graphDirected acyclic word graphIn computer science, a directed acyclic word graph is a data structure that represents a set of strings, and allows for a query operation that tests whether a given string belongs to the set in time proportional to its length...
(DAWG) - Suffix TreeSuffix treeIn computer science, a suffix tree is a data structure that presents the suffixes of a given string in a way that allows for a particularly fast implementation of many important string operations.The suffix tree for a string S is a tree whose edges are labeled with strings, such that each suffix...
- TrieTrieIn computer science, a trie, or prefix tree, is an ordered tree data structure that is used to store an associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the...
- ScrabbleScrabbleScrabble is a word game in which two to four players score points by forming words from individual lettered tiles on a game board marked with a 15-by-15 grid. The words are formed across and down in crossword fashion and must appear in a standard dictionary. Official reference works provide a list...
- Prefix Hash TreePrefix hash treeA prefix hash tree is a distributed data structure that enables more sophisticated queries over a distributed hash table . The prefix hash tree uses the lookup interface of a DHT to construct a trie-based data structure that is both efficient , and resilient A prefix hash tree (PHT) is a...