Partition refinement
In the design of algorithms, partition refinement is a technique for representing a partition of a set as a data structure that allows the partition to be refined by splitting its sets into a larger number of smaller sets. In that sense it is dual to the union-find data structure, which also maintains a partition into disjoint sets but whose operations merge pairs of sets together.
More specifically, a partition refinement algorithm maintains a family of disjoint sets S_i; at the start of the algorithm, this family is just a single set containing all the elements in the data structure. At each step of the algorithm, a set X is presented to the algorithm, and each set S_i that contains members of X is replaced by two sets, the intersection S_i ∩ X and the difference S_i \ X. Partition refinement forms a key component of several efficient algorithms on graphs and finite automata.
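To make the refinement step concrete, here is a minimal Python sketch of its semantics only. The function name `refine` and the use of built-in sets are assumptions made for illustration, and the sketch scans every set in the family, so it says nothing about how an efficient implementation is organised. Empty parts are dropped, so a set lying entirely inside or outside the refining set is left unchanged.

```python
def refine(partition, X):
    """Replace each set S of `partition` by S & X and S - X (dropping empty parts)."""
    X = set(X)
    result = []
    for S in partition:
        inside, outside = S & X, S - X
        # Keep only the non-empty parts; a set that X does not cut stays whole.
        result.extend(part for part in (inside, outside) if part)
    return result


partition = [{1, 2, 3, 4, 5, 6}]
partition = refine(partition, {2, 4, 6})  # [{2, 4, 6}, {1, 3, 5}]
partition = refine(partition, {1, 2})     # [{2}, {4, 6}, {1}, {3, 5}]
```

Each call can only increase the number of sets, which is the sense in which the structure is dual to union-find.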
Data structure
A partition refinement algorithm may be implemented by maintaining an object for each set that stores a collection of its elements, in a form such as a doubly linked list that allows for rapid deletion, and an object for each element that points to the set containing it. In addition, each set object should have an instance variable that may point to a second set into which it is being split.
To perform a refinement operation, loop through the elements of X. For each element x, find the set S_i containing x, and check whether a second set for S_i ∩ X has already been formed. If not, create the second set and add S_i to a list L of the sets that are split by the operation. Then, regardless of whether a new second set was formed, remove x from S_i and add it to the second set.
Finally, after all elements of X have been processed in this way, loop through L, separating each current set S_i from the second set that has been split from it, and report both of these sets as newly formed sets from the refinement operation.
The time to perform a single refinement operation in this way is O(|X|), independent of the number of elements or the total number of sets in the data structure.
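The procedure above can be sketched in Python as follows. The class and attribute names (`Block`, `twin`, `PartitionRefinement`, `block_of`) are hypothetical, and a hash-based `set` stands in for the doubly linked list, giving expected rather than worst-case constant-time deletion. The sketch also assumes that a set lying entirely inside the refining set should keep its identity and not be reported as split.

```python
class Block:
    """One set of the partition."""

    def __init__(self, elements=()):
        self.elements = set(elements)  # stand-in for the doubly linked list
        self.twin = None               # second set this block is being split into


class PartitionRefinement:
    def __init__(self, elements):
        first = Block(elements)
        self.blocks = [first]
        self.block_of = {x: first for x in first.elements}

    def refine(self, X):
        """Split by X; return (intersection, difference) pairs of blocks."""
        split = []  # the list L of blocks touched by X
        for x in dict.fromkeys(X):      # ignore repeated elements of X
            S = self.block_of.get(x)
            if S is None:               # x is not in the structure
                continue
            if S.twin is None:
                S.twin = Block()
                split.append(S)
            S.elements.remove(x)
            S.twin.elements.add(x)
            self.block_of[x] = S.twin
        pairs = []
        for S in split:
            T, S.twin = S.twin, None
            if S.elements:
                self.blocks.append(T)
                pairs.append((T, S))
            else:
                # X covered S entirely: hand the elements back, no real split.
                S.elements = T.elements
                for x in S.elements:
                    self.block_of[x] = S
        return pairs


p = PartitionRefinement([1, 2, 3, 4, 5, 6])
p.refine([2, 4, 6])
p.refine([1, 2])
print(sorted(sorted(b.elements) for b in p.blocks))  # [[1], [2], [3, 5], [4, 6]]
```

Both loops touch only elements of the refining set and the blocks they belong to, which is where the bound proportional to its size comes from.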
Applications
Possibly the first application of partition refinement was in an algorithm by Hopcroft for DFA minimization. In this problem, one is given as input a deterministic finite automaton, and must find an equivalent automaton with as few states as possible. The algorithm maintains a partition of the states of the input automaton into subsets, with the property that any two states in different subsets must be mapped to different states of the output automaton; initially, there are two subsets, one containing all the accepting states and one containing the remaining states.
At each step one of the subsets S_i and one of the input symbols x of the automaton are chosen, and the subsets of states are refined into states for which a transition labeled x would lead to S_i, and states for which an x-transition would lead somewhere else. When a set S_i that has already been chosen is split by a refinement, only one of the two resulting sets (the smaller of the two) needs to be chosen again; in this way, each state participates in the sets X for O(s log n) refinement steps and the overall algorithm takes time O(ns log n), where n is the number of initial states and s is the size of the alphabet.
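The refinement loop described above can be sketched in Python as below. The function name `minimize_partition`, the dictionary encoding of the transition function and the example automaton are all assumptions made for illustration. For brevity the sketch scans every block of the partition at each step, so it shows the logic, including the "smaller of the two" rule, but does not achieve the stated time bound.

```python
from collections import defaultdict


def minimize_partition(states, alphabet, delta, accepting):
    """Group equivalent states of a complete DFA.

    delta maps (state, symbol) -> state. Returns a set of frozensets.
    """
    # Inverse transitions: which states reach r on symbol a?
    inverse = defaultdict(set)
    for (q, a), r in delta.items():
        inverse[(r, a)].add(q)

    accepting = frozenset(accepting)
    rejecting = frozenset(states) - accepting
    partition = {block for block in (accepting, rejecting) if block}
    worklist = set(partition)  # blocks still to be used as splitters

    while worklist:
        A = worklist.pop()
        for a in alphabet:
            # X = states whose a-transition leads into A.
            X = set()
            for r in A:
                X |= inverse.get((r, a), set())
            for Y in list(partition):
                inside, outside = Y & X, Y - X
                if not inside or not outside:
                    continue  # X does not cut Y
                partition.remove(Y)
                partition.update((inside, outside))
                if Y in worklist:
                    worklist.remove(Y)
                    worklist.update((inside, outside))
                else:
                    # Only the smaller part needs to be chosen again.
                    worklist.add(min(inside, outside, key=len))
    return partition


# Hypothetical 5-state DFA for strings over {a, b} ending in "ab".
delta = {
    (0, "a"): 1, (0, "b"): 3,
    (1, "a"): 4, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
    (3, "a"): 4, (3, "b"): 0,
    (4, "a"): 1, (4, "b"): 2,
}
classes = minimize_partition(range(5), "ab", delta, accepting={2})
```

In this example states 0 and 3 end up in one class and states 1 and 4 in another, so the minimized automaton has three states.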
Partition refinement was applied by Sethi in an efficient implementation of the Coffman–Graham algorithm for parallel scheduling. Sethi showed that it could be used to construct a lexicographically ordered topological sort of a given directed acyclic graph in linear time; this lexicographic topological ordering is one of the key steps of the Coffman–Graham algorithm. In this application, the elements of the disjoint sets are vertices of the input graph and the sets X used to refine the partition are sets of neighbors of vertices. Since the total number of neighbors of all vertices is just the number of edges in the graph, the algorithm takes time linear in the number of edges, its input size.
Partition refinement also forms a key step in lexicographic breadth-first search, a graph search algorithm with applications in the recognition of chordal graphs and several other important classes of graphs. Again, the disjoint set elements are vertices and the sets X used for refinement are sets of neighbors, so the algorithm takes linear time.