Pseudoforest
Encyclopedia
In graph theory
, a pseudoforest is an undirected graph in which every connected component
has at most one cycle
. That is, it is a system of vertices
and edges connecting pairs of vertices, such that no two cycles of consecutive edges share any vertex with each other, nor can any two cycles be connected to each other by a path of consecutive edges. A pseudotree is a connected pseudoforest.
The names are justified by analogy to the more commonly studied trees
and forests. (A tree is a connected graph with no cycles; a forest is a disjoint union of trees.) Gabow and Tarjan attribute the naming of pseudoforests to Dantzig's 1963 book on linear programming
, in which pseudoforests arise in the solution of certain network flow
problems. Pseudoforests also form graph-theoretic models of functions and occur in several algorithm
ic problems. Pseudoforests are sparse graphs – they have very few edges relative to their number of vertices – and their matroid
structure allows several other families of sparse graphs to be decomposed as unions of forests and pseudoforests.
and edges such that each edge has two vertices (which may coincide) as endpoints. That is, we allow multiple edges (edges with the same pair of endpoints) and loops (edges whose two endpoints are the same vertex). A subgraph of a graph is the graph formed by any subsets of its vertices and edges such that each edge in the edge subset has both endpoints in the vertex subset.
A connected component
of an undirected graph is the subgraph consisting of the vertices and edges that can be reached by following edges from a single given starting vertex. A graph is connected if every vertex or edge is reachable from every other vertex or edge. A cycle
in an undirected graph is a connected subgraph in which each vertex is incident to exactly two edges, or is a loop.
A pseudoforest is an undirected graph in which each connected component contains at most one cycle. Equivalently, it is an undirected graph in which each connected component has no more edges than vertices. The components that have no cycles are just trees
, while the components that have a single cycle within them are called 1-trees or unicyclic graphs. That is, a 1-tree is a connected graph containing exactly one cycle. A pseudoforest with a single connected component (usually called a pseudotree, although some authors define a pseudotree to be a 1-tree) is either a tree or a 1-tree; in general a pseudoforest may have multiple connected components as long as all of them are trees or 1-trees.
If one removes from a 1-tree one of the edges in its cycle, the result is a tree. Reversing this process, if one augments a tree by connecting any two of its vertices by a new edge, the result is a 1-tree; the path in the tree connecting the two endpoints of the added edge, together with the added edge itself, form the 1-tree's unique cycle. If one augments a 1-tree by adding an edge that connects one of its vertices to a newly added vertex, the result is again a 1-tree, with one more vertex; an alternative method for constructing 1-trees is to start with a single cycle and then repeat this augmentation operation any number of times. The edges of any 1-tree can be partitioned in a unique way into two subgraphs, one of which is a cycle and the other of which is a forest, such that each tree of the forest contains exactly one vertex of the cycle.
Certain more specific types of pseudoforests have also been studied.
s. Like an undirected graph, a directed graph consists of vertices and edges, but each edge is directed from one of its endpoints to the other endpoint. A directed pseudoforest is a directed graph in which each vertex has at most one outgoing edge; that is, it has outdegree at most one. A directed 1-forest – most commonly called a functional graph (see below), sometimes maximal directed pseudoforest – is a directed graph in which each vertex has outdegree exactly one. If D is a directed pseudoforest, the undirected graph formed by removing the direction from each edge of D is an undirected pseudoforest.
Moving from individual graphs to graph families, if a family of graphs has the property that every subgraph of a graph in the family is also in the family, and every graph in the family has at most as many edges as vertices, then the family contains only pseudoforests. For instance, every subgraph of a thrackle (a graph drawn
so that every pair of edges has one point of intersection) is also a thrackle, so Conway's conjecture
that every thrackle has at most as many edges as vertices can be restated as saying that every thrackle is a pseudoforest. A more precise characterization is that, if the conjecture is true, then the thrackles are exactly the pseudoforests with no four-vertex cycle and at most one odd cycle.
Streinu and Theran generalize the sparsity conditions defining pseudoforests: they define a graph as being (k,l)-sparse if every nonempty subgraph with n vertices has at most kn − l edges, and (k,l)-tight if it is (k,l)-sparse and has exactly kn − l edges. Thus, the pseudoforests are the (1,0)-sparse graphs, and the maximal pseudoforests are the (1,0)-tight graphs. Several other important families of graphs may be defined from other values of k and l,
and when l ≤ k the (k,l)-sparse graphs may be characterized as the graphs formed as the edge-disjoint union of l forests and k − l pseudoforests.
Almost every sufficiently sparse random graph
is pseudoforest. That is, if c is a constant with 0 < c < 1/2, and Pc(n) is the probability that choosing uniformly at random among the n-vertex graphs with cn edges results in a pseudoforest, then Pc(n) tends to one in the limit for large n. However, for c > 1/2, almost every random graph with cn edges has a large component that is not unicyclic.
The values for n up to 18 can be found in sequence of the On-Line Encyclopedia of Integer Sequences
.
The number of maximal directed pseudoforests on n vertices, allowing self-loops, is nn, because for each vertex there are n possible endpoints for the outgoing edge. André Joyal
used this fact to provide a bijective proof
of Cayley's formula, that the number of undirected trees on n nodes is nn − 2, by finding a bijection between maximal directed pseudoforests and undirected trees with two distinguished nodes. If self-loops are not allowed, the number of maximal directed pseudoforests is instead (n − 1)n.
s are in some sense mathematically equivalent. Any function ƒ from a set X to itself (that is, an endomorphism
of X) can be interpreted as defining a directed pseudoforest which has an edge from x to y whenever ƒ(x) = y. The resulting directed pseudoforest is maximal, and may include self-loops
whenever some value x has ƒ(x) = x. Alternatively, omitting the self-loops produces a non-maximal pseudoforest. In the other direction, any maximal directed pseudoforest determines a function ƒ such that ƒ(x) is the target of the edge that goes out from x, and any non-maximal directed pseudoforest can be made maximal by adding self-loops and then converted into a function in the same way. For this reason, maximal directed pseudoforests are sometimes called functional graphs. Viewing a function as a functional graph provides a convenient language for describing properties that are not as easily described from the function-theoretic point of view; this technique is especially applicable to problems involving iterated function
s, which correspond to paths
in functional graphs.
Cycle detection, the problem of following a path in a functional graph to find a cycle in it, has applications in cryptography
and computational number theory
, as part of Pollard's rho algorithm
for integer factorization
and as a method for finding collisions in cryptographic hash function
s. In these applications, ƒ is expected to behave randomly; Flajolet
and Odlyzko
study the graph-theoretic properties of the functional graphs arising from randomly chosen mappings. In particular, a form of the birthday paradox
implies that, in a random functional graph with n vertices, the path starting from a randomly selected vertex will typically loop back on itself to form a cycle within O(√n) steps.
Martin, Odlyzko
, and Wolfram
investigate pseudoforests that model the dynamics of cellular automata
. These functional graphs, which they call state transition diagrams, have one vertex for each possible configuration that the ensemble of cells of the automaton can be in, and an edge connecting each configuration to the configuration that follows it according to the automaton's rule. One can infer properties of the automaton from the structure of these diagrams, such as the number of components, length of limiting cycles, depth of the trees connecting non-limiting states to these cycles, or symmetries of the diagram. For instance, any vertex with no incoming edge corresponds to a Garden of Eden pattern
and a vertex with a self-loop corresponds to a still life pattern.
Another early application of functional graphs is in the trains used to study Steiner triple system
s. The train of a triple system is a functional graph having a vertex for each possible triple of symbols; each triple pqr is mapped by ƒ to stu, where pqs, prt, and qru are the triples that belong to the triple system and contain the pairs pq, pr, and qr respectively. Trains have been shown to be a powerful invariant of triple systems although somewhat cumbersome to compute.
is a mathematical structure in which certain sets of elements are defined to be independent
, in such a way that the independent sets satisfy properties modeled after the properties of linear independence
in a vector space
. One of the standard examples of a matroid is the graphic matroid in which the independent sets are the sets of edges in forests of a graph; the matroid structure of forests is important in algorithms for computing the minimum spanning tree
of the graph. Analogously, we may define matroids from pseudoforests.
For any graph G = (V,E), we may define a matroid on the edges of G, in which a set of edges is independent if and only if it forms a pseudoforest; this matroid is known as the bicircular matroid
(or bicycle matroid) of G. The smallest dependent sets for this matroid are the minimal connected subgraphs of G that have more than one cycle, and these subgraphs are sometimes called bicycles. There are three possible types of bicycle: a theta graph has two vertices that are connected by three internally disjoint paths, a figure 8 graph consists of two cycles sharing a single vertex, and a handcuff graph is formed by two disjoint cycles connected by a path.
A graph is a pseudoforest if and only if it does not contain a bicycle as a subgraph.
of a pseudoforest by contracting some of its edges and deleting others produces another pseudoforest. Therefore, the family of pseudoforests is closed
under minors, and the Robertson–Seymour theorem
implies that pseudoforests can be characterized in terms of a finite set of forbidden minors, analogously to Wagner's theorem characterizing the planar graph
s as the graphs having neither the complete graph
K5 nor the complete bipartite graph
K3,3 as minors.
As discussed above, any non-pseudoforest graph contains as a subgraph a handcuff, figure 8, or theta graph; any handcuff or figure 8 graph may be contracted to form a butterfly graph
(five-vertex figure 8), and any theta graph may be contracted to form a diamond graph
(four-vertex theta graph), so any non-pseudoforest contains either a butterfly or a diamond as a minor, and these are the only minor-minimal non-pseudoforest graphs. Thus, a graph is a pseudoforest if and only if it does not have the butterfly or the diamond as a minor. If one forbids only the diamond but not the butterfly, the resulting larger graph family consists of the cactus graph
s and disjoint unions of multiple cactus graphs.
More simply, if multigraph
s with self-loops are considered, there is only one forbidden minor, a vertex with two loops.
of different types. In these problems, one is given as input a flow network
in which the vertices model each commodity and the edges model allowable conversions between one commodity and another. Each edge is marked with a capacity (how much of a commodity can be converted per unit time), a flow multiplier (the conversion rate between commodities), and a cost (how much loss or, if negative, profit is incurred per unit of conversion). The task is to determine how much of each commodity to convert via each edge of the flow network, in order to minimize cost or maximize profit, while obeying the capacity constraints and not allowing commodities of any type to accumulate unused. This type of problem can be formulated as a linear program, and solved using the simplex algorithm
. The intermediate solutions arising from this algorithm, as well as the eventual optimal solution, have a special structure: each edge in the input network is either unused or used to its full capacity, except for a subset of the edges, forming a spanning pseudoforest of the input network, for which the flow amounts may lie between zero and the full capacity. In this application, unicyclic graphs are also sometimes called augmented trees and maximal pseudoforests are also sometimes called augmented forests.
The minimum spanning pseudoforest problem involves finding a spanning pseudoforest of minimum weight in a larger edge-weighted graph G.
Due to the matroid structure of pseudoforests, minimum-weight maximal pseudoforests may be found by greedy algorithm
s similar to those for the minimum spanning tree
problem. However, Gabow and Tarjan found a more efficient linear-time approach in this case.
The pseudoarboricity of a graph G is defined by analogy to the arboricity
as the minimum number of pseudoforests into which its edges can be partitioned; equivalently, it is the minimum k such that G is (k,0)-sparse, or the minimum k such that the edges of G can be oriented to form a directed graph with outdegree at most k. Due to the matroid structure of pseudoforests, the pseudoarboricity may be computed in polynomial time.
A random
bipartite graph
with n vertices on each side of its bipartition, and with cn edges chosen independently at random from each of the n2 possible pairs of vertices, is a pseudoforest with high probability whenever c is a constant strictly less than one. This fact plays a key role in the analysis of cuckoo hashing
, a data structure for looking up key-value pairs by looking in one of two hash tables at locations determined from the key: one can form a graph, the "cuckoo graph", whose vertices correspond to hash table locations and whose edges link the two locations at which one of the keys might be found, and the cuckoo hashing algorithm succeeds in finding locations for all of its keys if and only if the cuckoo graph is a pseudoforest.
Pseudoforests also play a key role in parallel algorithm
s for graph coloring
and related problems.
Graph theory
In mathematics and computer science, graph theory is the study of graphs, mathematical structures used to model pairwise relations between objects from a certain collection. A "graph" in this context refers to a collection of vertices or 'nodes' and a collection of edges that connect pairs of...
, a pseudoforest is an undirected graph in which every connected component
Connected component (graph theory)
In graph theory, a connected component of an undirected graph is a subgraph in which any two vertices are connected to each other by paths, and which is connected to no additional vertices. For example, the graph shown in the illustration on the right has three connected components...
has at most one cycle
Cycle (graph theory)
In graph theory, the term cycle may refer to a closed path. If repeated vertices are allowed, it is more often called a closed walk. If the path is a simple path, with no repeated vertices or edges other than the starting and ending vertices, it may also be called a simple cycle, circuit, circle,...
. That is, it is a system of vertices
Vertex (graph theory)
In graph theory, a vertex or node is the fundamental unit out of which graphs are formed: an undirected graph consists of a set of vertices and a set of edges , while a directed graph consists of a set of vertices and a set of arcs...
and edges connecting pairs of vertices, such that no two cycles of consecutive edges share any vertex with each other, nor can any two cycles be connected to each other by a path of consecutive edges. A pseudotree is a connected pseudoforest.
The names are justified by analogy to the more commonly studied trees
Tree (graph theory)
In mathematics, more specifically graph theory, a tree is an undirected graph in which any two vertices are connected by exactly one simple path. In other words, any connected graph without cycles is a tree...
and forests. (A tree is a connected graph with no cycles; a forest is a disjoint union of trees.) Gabow and Tarjan attribute the naming of pseudoforests to Dantzig's 1963 book on linear programming
Linear programming
Linear programming is a mathematical method for determining a way to achieve the best outcome in a given mathematical model for some list of requirements represented as linear relationships...
, in which pseudoforests arise in the solution of certain network flow
Flow network
In graph theory, a flow network is a directed graph where each edge has a capacity and each edge receives a flow. The amount of flow on an edge cannot exceed the capacity of the edge. Often in Operations Research, a directed graph is called a network, the vertices are called nodes and the edges are...
problems. Pseudoforests also form graph-theoretic models of functions and occur in several algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...
ic problems. Pseudoforests are sparse graphs – they have very few edges relative to their number of vertices – and their matroid
Matroid
In combinatorics, a branch of mathematics, a matroid or independence structure is a structure that captures the essence of a notion of "independence" that generalizes linear independence in vector spaces....
structure allows several other families of sparse graphs to be decomposed as unions of forests and pseudoforests.
Definitions and structure
We define an undirected graph to be a set of verticesVertex (graph theory)
In graph theory, a vertex or node is the fundamental unit out of which graphs are formed: an undirected graph consists of a set of vertices and a set of edges , while a directed graph consists of a set of vertices and a set of arcs...
and edges such that each edge has two vertices (which may coincide) as endpoints. That is, we allow multiple edges (edges with the same pair of endpoints) and loops (edges whose two endpoints are the same vertex). A subgraph of a graph is the graph formed by any subsets of its vertices and edges such that each edge in the edge subset has both endpoints in the vertex subset.
A connected component
Connected component (graph theory)
In graph theory, a connected component of an undirected graph is a subgraph in which any two vertices are connected to each other by paths, and which is connected to no additional vertices. For example, the graph shown in the illustration on the right has three connected components...
of an undirected graph is the subgraph consisting of the vertices and edges that can be reached by following edges from a single given starting vertex. A graph is connected if every vertex or edge is reachable from every other vertex or edge. A cycle
Cycle (graph theory)
In graph theory, the term cycle may refer to a closed path. If repeated vertices are allowed, it is more often called a closed walk. If the path is a simple path, with no repeated vertices or edges other than the starting and ending vertices, it may also be called a simple cycle, circuit, circle,...
in an undirected graph is a connected subgraph in which each vertex is incident to exactly two edges, or is a loop.
A pseudoforest is an undirected graph in which each connected component contains at most one cycle. Equivalently, it is an undirected graph in which each connected component has no more edges than vertices. The components that have no cycles are just trees
Tree (graph theory)
In mathematics, more specifically graph theory, a tree is an undirected graph in which any two vertices are connected by exactly one simple path. In other words, any connected graph without cycles is a tree...
, while the components that have a single cycle within them are called 1-trees or unicyclic graphs. That is, a 1-tree is a connected graph containing exactly one cycle. A pseudoforest with a single connected component (usually called a pseudotree, although some authors define a pseudotree to be a 1-tree) is either a tree or a 1-tree; in general a pseudoforest may have multiple connected components as long as all of them are trees or 1-trees.
If one removes from a 1-tree one of the edges in its cycle, the result is a tree. Reversing this process, if one augments a tree by connecting any two of its vertices by a new edge, the result is a 1-tree; the path in the tree connecting the two endpoints of the added edge, together with the added edge itself, form the 1-tree's unique cycle. If one augments a 1-tree by adding an edge that connects one of its vertices to a newly added vertex, the result is again a 1-tree, with one more vertex; an alternative method for constructing 1-trees is to start with a single cycle and then repeat this augmentation operation any number of times. The edges of any 1-tree can be partitioned in a unique way into two subgraphs, one of which is a cycle and the other of which is a forest, such that each tree of the forest contains exactly one vertex of the cycle.
Certain more specific types of pseudoforests have also been studied.
- A 1-forest, sometimes called a maximal pseudoforest, is a pseudoforest to which no more edges can be added without causing some component of the graph to contain multiple cycles. If a pseudoforest contains a tree as one of its components, it cannot be a 1-forest, for one can add either an edge connecting two vertices within that tree, forming a single cycle, or an edge connecting that tree to some other component. Thus, the 1-forests are exactly the pseudoforests in which every component is a 1-tree.
- The spanning pseudoforests of an undirected graph G are the pseudoforest subgraphs of G that have all the vertices of G. Such a pseudoforest need not have any edges, since for example the subgraph that has all the vertices of G and no edges is a pseudoforest (whose components are trees consisting of a single vertex).
- The maximal pseudoforests of G are the pseudoforest subgraphs of G that are not contained within any larger pseudoforest of G. A maximal pseudoforest of G is always a spanning pseudoforest, but not conversely. If G has no connected components that are trees, then its maximal pseudoforests are 1-forests, but if G does have a tree component, its maximal pseudoforests are not 1-forests. Stated precisely, in any graph G its maximal pseudoforests consist of every tree component of G, together with one or more disjoint 1-trees covering the remaining vertices of G.
Directed pseudoforests
Versions of these definitions are also used for directed graphDirected graph
A directed graph or digraph is a pair G= of:* a set V, whose elements are called vertices or nodes,...
s. Like an undirected graph, a directed graph consists of vertices and edges, but each edge is directed from one of its endpoints to the other endpoint. A directed pseudoforest is a directed graph in which each vertex has at most one outgoing edge; that is, it has outdegree at most one. A directed 1-forest – most commonly called a functional graph (see below), sometimes maximal directed pseudoforest – is a directed graph in which each vertex has outdegree exactly one. If D is a directed pseudoforest, the undirected graph formed by removing the direction from each edge of D is an undirected pseudoforest.
Number of edges
Every pseudoforest on a set of n vertices has at most n edges, and every maximal pseudoforest on a set of n vertices has exactly n edges. Conversely, if a graph G has the property that, for every subset S of its vertices, the number of edges in the induced subgraph of S is at most the number of vertices in S, then G is a pseudoforest. 1-trees can be defined as connected graphs with equally many vertices and edges.Moving from individual graphs to graph families, if a family of graphs has the property that every subgraph of a graph in the family is also in the family, and every graph in the family has at most as many edges as vertices, then the family contains only pseudoforests. For instance, every subgraph of a thrackle (a graph drawn
Graph drawing
Graph drawing is an area of mathematics and computer science combining methods from geometric graph theory and information visualization to derive two-dimensional depictions of graphs arising from applications such as social network analysis, cartography, and bioinformatics...
so that every pair of edges has one point of intersection) is also a thrackle, so Conway's conjecture
Conjecture
A conjecture is a proposition that is unproven but is thought to be true and has not been disproven. Karl Popper pioneered the use of the term "conjecture" in scientific philosophy. Conjecture is contrasted by hypothesis , which is a testable statement based on accepted grounds...
that every thrackle has at most as many edges as vertices can be restated as saying that every thrackle is a pseudoforest. A more precise characterization is that, if the conjecture is true, then the thrackles are exactly the pseudoforests with no four-vertex cycle and at most one odd cycle.
Streinu and Theran generalize the sparsity conditions defining pseudoforests: they define a graph as being (k,l)-sparse if every nonempty subgraph with n vertices has at most kn − l edges, and (k,l)-tight if it is (k,l)-sparse and has exactly kn − l edges. Thus, the pseudoforests are the (1,0)-sparse graphs, and the maximal pseudoforests are the (1,0)-tight graphs. Several other important families of graphs may be defined from other values of k and l,
and when l ≤ k the (k,l)-sparse graphs may be characterized as the graphs formed as the edge-disjoint union of l forests and k − l pseudoforests.
Almost every sufficiently sparse random graph
Random graph
In mathematics, a random graph is a graph that is generated by some random process. The theory of random graphs lies at the intersection between graph theory and probability theory, and studies the properties of typical random graphs.-Random graph models:...
is pseudoforest. That is, if c is a constant with 0 < c < 1/2, and Pc(n) is the probability that choosing uniformly at random among the n-vertex graphs with cn edges results in a pseudoforest, then Pc(n) tends to one in the limit for large n. However, for c > 1/2, almost every random graph with cn edges has a large component that is not unicyclic.
Enumeration
A graph is simple if it has no self-loops and no multiple edges with the same endpoints. The number of simple 1-trees with n labelled vertices isThe values for n up to 18 can be found in sequence of the On-Line Encyclopedia of Integer Sequences
On-Line Encyclopedia of Integer Sequences
The On-Line Encyclopedia of Integer Sequences , also cited simply as Sloane's, is an online database of integer sequences, created and maintained by N. J. A. Sloane, a researcher at AT&T Labs...
.
The number of maximal directed pseudoforests on n vertices, allowing self-loops, is nn, because for each vertex there are n possible endpoints for the outgoing edge. André Joyal
André Joyal
André Joyal is a professor of mathematics at the Université du Québec à Montréal who works on category theory. Joyal was born in Drummondville . He has three children and lives in Montreal.- Main research :...
used this fact to provide a bijective proof
Bijective proof
In combinatorics, bijective proof is a proof technique that finds a bijective function f : A → B between two sets A and B, thus proving that they have the same number of elements, |A| = |B|. One place the technique is useful is where we wish to know the size of A, but can find no direct way of...
of Cayley's formula, that the number of undirected trees on n nodes is nn − 2, by finding a bijection between maximal directed pseudoforests and undirected trees with two distinguished nodes. If self-loops are not allowed, the number of maximal directed pseudoforests is instead (n − 1)n.
Graphs of functions
Directed pseudoforests and functionFunction (mathematics)
In mathematics, a function associates one quantity, the argument of the function, also known as the input, with another quantity, the value of the function, also known as the output. A function assigns exactly one output to each input. The argument and the value may be real numbers, but they can...
s are in some sense mathematically equivalent. Any function ƒ from a set X to itself (that is, an endomorphism
Endomorphism
In mathematics, an endomorphism is a morphism from a mathematical object to itself. For example, an endomorphism of a vector space V is a linear map ƒ: V → V, and an endomorphism of a group G is a group homomorphism ƒ: G → G. In general, we can talk about...
of X) can be interpreted as defining a directed pseudoforest which has an edge from x to y whenever ƒ(x) = y. The resulting directed pseudoforest is maximal, and may include self-loops
Loop (graph theory)
In graph theory, a loop is an edge that connects a vertex to itself. A simple graph contains no loops....
whenever some value x has ƒ(x) = x. Alternatively, omitting the self-loops produces a non-maximal pseudoforest. In the other direction, any maximal directed pseudoforest determines a function ƒ such that ƒ(x) is the target of the edge that goes out from x, and any non-maximal directed pseudoforest can be made maximal by adding self-loops and then converted into a function in the same way. For this reason, maximal directed pseudoforests are sometimes called functional graphs. Viewing a function as a functional graph provides a convenient language for describing properties that are not as easily described from the function-theoretic point of view; this technique is especially applicable to problems involving iterated function
Iterated function
In mathematics, an iterated function is a function which is composed with itself, possibly ad infinitum, in a process called iteration. In this process, starting from some initial value, the result of applying a given function is fed again in the function as input, and this process is repeated...
s, which correspond to paths
Path (graph theory)
In graph theory, a path in a graph is a sequence of vertices such that from each of its vertices there is an edge to the next vertex in the sequence. A path may be infinite, but a finite path always has a first vertex, called its start vertex, and a last vertex, called its end vertex. Both of them...
in functional graphs.
Cycle detection, the problem of following a path in a functional graph to find a cycle in it, has applications in cryptography
Cryptography
Cryptography is the practice and study of techniques for secure communication in the presence of third parties...
and computational number theory
Computational number theory
In mathematics, computational number theory, also known as algorithmic number theory, is the study of algorithms for performing number theoretic computations...
, as part of Pollard's rho algorithm
Pollard's rho algorithm
Pollard's rho algorithm is a special-purpose integer factorization algorithm. It was invented by John Pollard in 1975. It is particularly effective at splitting composite numbers with small factors.-Core ideas:...
for integer factorization
Integer factorization
In number theory, integer factorization or prime factorization is the decomposition of a composite number into smaller non-trivial divisors, which when multiplied together equal the original integer....
and as a method for finding collisions in cryptographic hash function
Cryptographic hash function
A cryptographic hash function is a deterministic procedure that takes an arbitrary block of data and returns a fixed-size bit string, the hash value, such that an accidental or intentional change to the data will change the hash value...
s. In these applications, ƒ is expected to behave randomly; Flajolet
Philippe Flajolet
Philippe Flajolet was a French computer scientist.A former student of École Polytechnique, Philippe Flajolet received his Ph.D. in computer science from University Paris Diderot in 1973 and state doctorate from Paris-Sud 11 University in 1979...
and Odlyzko
Andrew Odlyzko
Andrew Michael Odlyzko is a mathematician and a former head of the University of Minnesota's Digital Technology Center.In the field of mathematics he has published extensively on analytic number theory, computational number theory, cryptography, algorithms and computational complexity,...
study the graph-theoretic properties of the functional graphs arising from randomly chosen mappings. In particular, a form of the birthday paradox
Birthday paradox
In probability theory, the birthday problem or birthday paradox pertains to the probability that, in a set of n randomly chosen people, some pair of them will have the same birthday. By the pigeonhole principle, the probability reaches 100% when the number of people reaches 366. However, 99%...
implies that, in a random functional graph with n vertices, the path starting from a randomly selected vertex will typically loop back on itself to form a cycle within O(√n) steps.
Martin, Odlyzko
Andrew Odlyzko
Andrew Michael Odlyzko is a mathematician and a former head of the University of Minnesota's Digital Technology Center.In the field of mathematics he has published extensively on analytic number theory, computational number theory, cryptography, algorithms and computational complexity,...
, and Wolfram
Stephen Wolfram
Stephen Wolfram is a British scientist and the chief designer of the Mathematica software application and the Wolfram Alpha computational knowledge engine.- Biography :...
investigate pseudoforests that model the dynamics of cellular automata
Cellular automaton
A cellular automaton is a discrete model studied in computability theory, mathematics, physics, complexity science, theoretical biology and microstructure modeling. It consists of a regular grid of cells, each in one of a finite number of states, such as "On" and "Off"...
. These functional graphs, which they call state transition diagrams, have one vertex for each possible configuration that the ensemble of cells of the automaton can be in, and an edge connecting each configuration to the configuration that follows it according to the automaton's rule. One can infer properties of the automaton from the structure of these diagrams, such as the number of components, length of limiting cycles, depth of the trees connecting non-limiting states to these cycles, or symmetries of the diagram. For instance, any vertex with no incoming edge corresponds to a Garden of Eden pattern
Garden of Eden pattern
In a cellular automaton, a Garden of Eden configuration is a configuration that cannot appear on the lattice after one time step, no matter what the initial configuration. In other words, these are the configurations with no predecessors....
and a vertex with a self-loop corresponds to a still life pattern.
Another early application of functional graphs is in the trains used to study Steiner triple system
Steiner system
250px|right|thumbnail|The [[Fano plane]] is an S Steiner triple system. The blocks are the 7 lines, each containing 3 points. Every pair of points belongs to a unique line....
s. The train of a triple system is a functional graph having a vertex for each possible triple of symbols; each triple pqr is mapped by ƒ to stu, where pqs, prt, and qru are the triples that belong to the triple system and contain the pairs pq, pr, and qr respectively. Trains have been shown to be a powerful invariant of triple systems although somewhat cumbersome to compute.
Bicircular matroid
A matroidMatroid
In combinatorics, a branch of mathematics, a matroid or independence structure is a structure that captures the essence of a notion of "independence" that generalizes linear independence in vector spaces....
is a mathematical structure in which certain sets of elements are defined to be independent
Independence system
In combinatorial mathematics, an independence system S is a pair , where E is a finite set and I is a collection of subsets of E with the following properties:...
, in such a way that the independent sets satisfy properties modeled after the properties of linear independence
Linear independence
In linear algebra, a family of vectors is linearly independent if none of them can be written as a linear combination of finitely many other vectors in the collection. A family of vectors which is not linearly independent is called linearly dependent...
in a vector space
Vector space
A vector space is a mathematical structure formed by a collection of vectors: objects that may be added together and multiplied by numbers, called scalars in this context. Scalars are often taken to be real numbers, but one may also consider vector spaces with scalar multiplication by complex...
. One of the standard examples of a matroid is the graphic matroid in which the independent sets are the sets of edges in forests of a graph; the matroid structure of forests is important in algorithms for computing the minimum spanning tree
Minimum spanning tree
Given a connected, undirected graph, a spanning tree of that graph is a subgraph that is a tree and connects all the vertices together. A single graph can have many different spanning trees...
of the graph. Analogously, we may define matroids from pseudoforests.
For any graph G = (V,E), we may define a matroid on the edges of G, in which a set of edges is independent if and only if it forms a pseudoforest; this matroid is known as the bicircular matroid
Bicircular matroid
In the mathematical subject of matroid theory, the bicircular matroid of a graph G is the matroid B whose points are the edges of G and whose independent sets are the edge sets of pseudoforests of G, that is, the edge sets in which each connected component contains at most one cycle...
(or bicycle matroid) of G. The smallest dependent sets for this matroid are the minimal connected subgraphs of G that have more than one cycle, and these subgraphs are sometimes called bicycles. There are three possible types of bicycle: a theta graph has two vertices that are connected by three internally disjoint paths, a figure 8 graph consists of two cycles sharing a single vertex, and a handcuff graph is formed by two disjoint cycles connected by a path.
A graph is a pseudoforest if and only if it does not contain a bicycle as a subgraph.
Forbidden minors
Forming a minorMinor (graph theory)
In graph theory, an undirected graph H is called a minor of the graph G if H is isomorphic to a graph that can be obtained by zero or more edge contractions on a subgraph of G....
of a pseudoforest by contracting some of its edges and deleting others produces another pseudoforest. Therefore, the family of pseudoforests is closed
Closure (mathematics)
In mathematics, a set is said to be closed under some operation if performance of that operation on members of the set always produces a unique member of the same set. For example, the real numbers are closed under subtraction, but the natural numbers are not: 3 and 8 are both natural numbers, but...
under minors, and the Robertson–Seymour theorem
Robertson–Seymour theorem
In graph theory, the Robertson–Seymour theorem states that the undirected graphs, partially ordered by the graph minor relationship, form a well-quasi-ordering...
implies that pseudoforests can be characterized in terms of a finite set of forbidden minors, analogously to Wagner's theorem characterizing the planar graph
Planar graph
In graph theory, a planar graph is a graph that can be embedded in the plane, i.e., it can be drawn on the plane in such a way that its edges intersect only at their endpoints...
s as the graphs having neither the complete graph
Complete graph
In the mathematical field of graph theory, a complete graph is a simple undirected graph in which every pair of distinct vertices is connected by a unique edge.-Properties:...
K5 nor the complete bipartite graph
Complete bipartite graph
In the mathematical field of graph theory, a complete bipartite graph or biclique is a special kind of bipartite graph where every vertex of the first set is connected to every vertex of the second set.- Definition :...
K3,3 as minors.
As discussed above, any non-pseudoforest graph contains as a subgraph a handcuff, figure 8, or theta graph; any handcuff or figure 8 graph may be contracted to form a butterfly graph
Butterfly graph
In the mathematical field of graph theory, the butterfly graph is a planar undirected graph with 5 vertices and 6 edges...
(five-vertex figure 8), and any theta graph may be contracted to form a diamond graph
Diamond graph
In the mathematical field of graph theory, the diamond graph is a planar undirected graph with 4 vertices and 5 edges. It consists of a complete graph K_4 minus one edge....
(four-vertex theta graph), so any non-pseudoforest contains either a butterfly or a diamond as a minor, and these are the only minor-minimal non-pseudoforest graphs. Thus, a graph is a pseudoforest if and only if it does not have the butterfly or the diamond as a minor. If one forbids only the diamond but not the butterfly, the resulting larger graph family consists of the cactus graph
Cactus graph
In graph theory, a cactus is a connected graph in which any two simple cycles have at most one vertex in common. Equivalently, every edge in such a graph belongs to at most one simple cycle. Equivalently, every block is an edge or a cycle.-Properties:Cacti are outerplanar graphs...
s and disjoint unions of multiple cactus graphs.
More simply, if multigraph
Multigraph
In mathematics, a multigraph or pseudograph is a graph which is permitted to have multiple edges, , that is, edges that have the same end nodes. Thus two vertices may be connected by more than one edge....
s with self-loops are considered, there is only one forbidden minor, a vertex with two loops.
Algorithms
An early algorithmic use of pseudoforests involves the network simplex algorithm and its application to generalized flow problems modeling the conversion between commoditiesCommodity
In economics, a commodity is the generic term for any marketable item produced to satisfy wants or needs. Economic commodities comprise goods and services....
of different types. In these problems, one is given as input a flow network
Flow network
In graph theory, a flow network is a directed graph where each edge has a capacity and each edge receives a flow. The amount of flow on an edge cannot exceed the capacity of the edge. Often in Operations Research, a directed graph is called a network, the vertices are called nodes and the edges are...
in which the vertices model each commodity and the edges model allowable conversions between one commodity and another. Each edge is marked with a capacity (how much of a commodity can be converted per unit time), a flow multiplier (the conversion rate between commodities), and a cost (how much loss or, if negative, profit is incurred per unit of conversion). The task is to determine how much of each commodity to convert via each edge of the flow network, in order to minimize cost or maximize profit, while obeying the capacity constraints and not allowing commodities of any type to accumulate unused. This type of problem can be formulated as a linear program, and solved using the simplex algorithm
Simplex algorithm
In mathematical optimization, Dantzig's simplex algorithm is a popular algorithm for linear programming. The journal Computing in Science and Engineering listed it as one of the top 10 algorithms of the twentieth century....
. The intermediate solutions arising from this algorithm, as well as the eventual optimal solution, have a special structure: each edge in the input network is either unused or used to its full capacity, except for a subset of the edges, forming a spanning pseudoforest of the input network, for which the flow amounts may lie between zero and the full capacity. In this application, unicyclic graphs are also sometimes called augmented trees and maximal pseudoforests are also sometimes called augmented forests.
The minimum spanning pseudoforest problem involves finding a spanning pseudoforest of minimum weight in a larger edge-weighted graph G.
Due to the matroid structure of pseudoforests, minimum-weight maximal pseudoforests may be found by greedy algorithm
Greedy algorithm
A greedy algorithm is any algorithm that follows the problem solving heuristic of making the locally optimal choice at each stagewith the hope of finding the global optimum....
s similar to those for the minimum spanning tree
Minimum spanning tree
Given a connected, undirected graph, a spanning tree of that graph is a subgraph that is a tree and connects all the vertices together. A single graph can have many different spanning trees...
problem. However, Gabow and Tarjan found a more efficient linear-time approach in this case.
The pseudoarboricity of a graph G is defined by analogy to the arboricity
Arboricity
The arboricity of an undirected graph is the minimum number of forests into which its edges can be partitioned. Equivalently it is the minimum number of spanning forests needed to cover all the edges of the graph.-Example:...
as the minimum number of pseudoforests into which its edges can be partitioned; equivalently, it is the minimum k such that G is (k,0)-sparse, or the minimum k such that the edges of G can be oriented to form a directed graph with outdegree at most k. Due to the matroid structure of pseudoforests, the pseudoarboricity may be computed in polynomial time.
A random
Random graph
In mathematics, a random graph is a graph that is generated by some random process. The theory of random graphs lies at the intersection between graph theory and probability theory, and studies the properties of typical random graphs.-Random graph models:...
bipartite graph
Bipartite graph
In the mathematical field of graph theory, a bipartite graph is a graph whose vertices can be divided into two disjoint sets U and V such that every edge connects a vertex in U to one in V; that is, U and V are independent sets...
with n vertices on each side of its bipartition, and with cn edges chosen independently at random from each of the n2 possible pairs of vertices, is a pseudoforest with high probability whenever c is a constant strictly less than one. This fact plays a key role in the analysis of cuckoo hashing
Cuckoo hashing
Cuckoo hashing is a scheme in computer programming for resolving hash collisions of values of hash functions in a table. Cuckoo hashing was first described by Rasmus Pagh and Flemming Friche Rodler in 2001...
, a data structure for looking up key-value pairs by looking in one of two hash tables at locations determined from the key: one can form a graph, the "cuckoo graph", whose vertices correspond to hash table locations and whose edges link the two locations at which one of the keys might be found, and the cuckoo hashing algorithm succeeds in finding locations for all of its keys if and only if the cuckoo graph is a pseudoforest.
Pseudoforests also play a key role in parallel algorithm
Parallel algorithm
In computer science, a parallel algorithm or concurrent algorithm, as opposed to a traditional sequential algorithm, is an algorithm which can be executed a piece at a time on many different processing devices, and then put back together again at the end to get the correct result.Some algorithms...
s for graph coloring
Graph coloring
In graph theory, graph coloring is a special case of graph labeling; it is an assignment of labels traditionally called "colors" to elements of a graph subject to certain constraints. In its simplest form, it is a way of coloring the vertices of a graph such that no two adjacent vertices share the...
and related problems.