Set packing
Encyclopedia
Set packing is a classical NP-complete
problem in computational complexity theory
and combinatorics
, and was one of Karp's 21 NP-complete problems
.
Suppose we have a finite set S and a list of subsets of S. Then, the set packing problem asks if some k subsets in the list are pairwise disjoint (in other words, no two of them intersect).
More formally, given a universe and a family of subsets of ,
a packing is a subfamily of sets such that all sets in are pairwise disjoint. In the set packing decision problem
, the input is a pair and an integer ; the question is whether
there is a set packing of size or more. In the set packing optimization problem
, the input is a pair , and the task is to find a set packing that uses the most sets.
The problem is clearly in NP
since, given k subsets, we can easily verify that they are pairwise disjoint.
The optimization version
of the problem, maximum set packing, asks for the maximum number of pairwise disjoint sets in the list. It is a maximization problem that can be formulated naturally as an integer linear program, belongs to the class of packing problem
s, and its dual linear program is the set cover problem
.
In this case, the elements of the set are languages other than English, and the subsets are the sets of languages spoken by a particular ambassador. If two sets are disjoint, those two ambassadors share no languages other than English. A maximum set packing will choose the largest possible number of ambassadors under the desired constraint. Although this problem is hard to solve in general, in this example a good heuristic is to choose ambassadors who only speak unusual languages first, so that not too many others are disqualified.
. Here, we are also given a set S and a list of sets, but the goal is to determine whether we can choose k sets that together contain every element of S. These sets may overlap. The optimization version finds the minimum number of such sets. The maximum set packing need not cover every possible element.
One advantage of the set packing problem is that even if it's hard for some k, it's not hard to find a k for which it is easy on a particular input. For example, we can use a greedy algorithm
where we look for the set which intersects the smallest number of other sets, add it to our solution, and remove the sets it intersects. We continually do this until no sets are left, and we have a set packing of some size, although it may not be the maximum set packing. Although no algorithm can always produce results close to the maximum (see next section), on many practical inputs these heuristics do so.
The NP-complete exact cover problem, on the other hand, requires every element to be contained in exactly one of the subsets. Finding such an exact cover at all, regardless of size, is an NP-complete
problem. However, if we create a singleton set for each element of S and add these to the list, the resulting problem is about as easy as set packing.
Karp originally showed set packing NP-complete via a reduction from the clique problem
.
There is a weighted version of the set cover problem in which each subset is assigned a real weight and it is this weight we wish to maximize. In our example above, we might weight the ambassadors according to the populations of their countries, so that our announcement will reach the most people possible. This seems to make the problem harder, but as we explain below, most known results for the general problem apply to the weighted problem as well.
However, the problem does have a variant which is more tractable: if we assume no subset exceeds k≥3 elements, the answer can be approximated within a factor of k/2 + ε for any ε > 0; in particular, the problem with 3-element sets can be approximated within about 50%. In another more tractable variant, if no element occurs in more than k of the subsets, the answer can be approximated within a factor of k. This is also true for the weighted version.
NP-complete
In computational complexity theory, the complexity class NP-complete is a class of decision problems. A decision problem L is NP-complete if it is in the set of NP problems so that any given solution to the decision problem can be verified in polynomial time, and also in the set of NP-hard...
problem in computational complexity theory
Computational complexity theory
Computational complexity theory is a branch of the theory of computation in theoretical computer science and mathematics that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other...
and combinatorics
Combinatorics
Combinatorics is a branch of mathematics concerning the study of finite or countable discrete structures. Aspects of combinatorics include counting the structures of a given kind and size , deciding when certain criteria can be met, and constructing and analyzing objects meeting the criteria ,...
, and was one of Karp's 21 NP-complete problems
Karp's 21 NP-complete problems
One of the most important results in computational complexity theory was Stephen Cook's 1971 demonstration of the first NP-complete problem, the boolean satisfiability problem...
.
Suppose we have a finite set S and a list of subsets of S. Then, the set packing problem asks if some k subsets in the list are pairwise disjoint (in other words, no two of them intersect).
More formally, given a universe and a family of subsets of ,
a packing is a subfamily of sets such that all sets in are pairwise disjoint. In the set packing decision problem
Decision problem
In computability theory and computational complexity theory, a decision problem is a question in some formal system with a yes-or-no answer, depending on the values of some input parameters. For example, the problem "given two numbers x and y, does x evenly divide y?" is a decision problem...
, the input is a pair and an integer ; the question is whether
there is a set packing of size or more. In the set packing optimization problem
Optimization problem
In mathematics and computer science, an optimization problem is the problem of finding the best solution from all feasible solutions. Optimization problems can be divided into two categories depending on whether the variables are continuous or discrete. An optimization problem with discrete...
, the input is a pair , and the task is to find a set packing that uses the most sets.
The problem is clearly in NP
NP (complexity)
In computational complexity theory, NP is one of the most fundamental complexity classes.The abbreviation NP refers to "nondeterministic polynomial time."...
since, given k subsets, we can easily verify that they are pairwise disjoint.
The optimization version
Optimization problem
In mathematics and computer science, an optimization problem is the problem of finding the best solution from all feasible solutions. Optimization problems can be divided into two categories depending on whether the variables are continuous or discrete. An optimization problem with discrete...
of the problem, maximum set packing, asks for the maximum number of pairwise disjoint sets in the list. It is a maximization problem that can be formulated naturally as an integer linear program, belongs to the class of packing problem
Packing problem
Packing problems are a class of optimization problems in mathematics which involve attempting to pack objects together , as densely as possible. Many of these problems can be related to real life packaging, storage and transportation issues...
s, and its dual linear program is the set cover problem
Set cover problem
The set covering problem is a classical question in computer science and complexity theory.It is a problem "whose study has led to the development of fundamental techniques for the entire field" of approximation algorithms...
.
Integer linear program formulation
The maximum set packing problem can be formulated as the following integer linear program.maximize | (maximize the total value) | ||
subject to | for all | (selected sets have to be pairwise disjoint) | |
for all . | (every set is either in the set packing or not) |
Example
As a simple example, suppose you're at a convention of foreign ambassadors, each of which speaks English and also various other languages. You want to make an announcement to a group of them, but because you don't trust them, you don't want them to be able to speak among themselves without you being able to understand them. To ensure this, you will choose a group such that no two ambassadors speak the same language, other than English. On the other hand you also want to give your announcement to as many ambassadors as possible.In this case, the elements of the set are languages other than English, and the subsets are the sets of languages spoken by a particular ambassador. If two sets are disjoint, those two ambassadors share no languages other than English. A maximum set packing will choose the largest possible number of ambassadors under the desired constraint. Although this problem is hard to solve in general, in this example a good heuristic is to choose ambassadors who only speak unusual languages first, so that not too many others are disqualified.
Heuristics and related problems
Set packing is one among a family of problems related to covering or partitioning the elements of a set. One closely related problem is the set cover problemSet cover problem
The set covering problem is a classical question in computer science and complexity theory.It is a problem "whose study has led to the development of fundamental techniques for the entire field" of approximation algorithms...
. Here, we are also given a set S and a list of sets, but the goal is to determine whether we can choose k sets that together contain every element of S. These sets may overlap. The optimization version finds the minimum number of such sets. The maximum set packing need not cover every possible element.
One advantage of the set packing problem is that even if it's hard for some k, it's not hard to find a k for which it is easy on a particular input. For example, we can use a greedy algorithm
Greedy algorithm
A greedy algorithm is any algorithm that follows the problem solving heuristic of making the locally optimal choice at each stagewith the hope of finding the global optimum....
where we look for the set which intersects the smallest number of other sets, add it to our solution, and remove the sets it intersects. We continually do this until no sets are left, and we have a set packing of some size, although it may not be the maximum set packing. Although no algorithm can always produce results close to the maximum (see next section), on many practical inputs these heuristics do so.
The NP-complete exact cover problem, on the other hand, requires every element to be contained in exactly one of the subsets. Finding such an exact cover at all, regardless of size, is an NP-complete
NP-complete
In computational complexity theory, the complexity class NP-complete is a class of decision problems. A decision problem L is NP-complete if it is in the set of NP problems so that any given solution to the decision problem can be verified in polynomial time, and also in the set of NP-hard...
problem. However, if we create a singleton set for each element of S and add these to the list, the resulting problem is about as easy as set packing.
Karp originally showed set packing NP-complete via a reduction from the clique problem
Clique problem
In computer science, the clique problem refers to any of the problems related to finding particular complete subgraphs in a graph, i.e., sets of elements where each pair of elements is connected....
.
There is a weighted version of the set cover problem in which each subset is assigned a real weight and it is this weight we wish to maximize. In our example above, we might weight the ambassadors according to the populations of their countries, so that our announcement will reach the most people possible. This seems to make the problem harder, but as we explain below, most known results for the general problem apply to the weighted problem as well.
Complexity
The set packing problem is not only NP-complete, but its optimization version (general maximum set packing problem ) has been proven as difficult to approximate as the maximum clique problem; in particular, it cannot be approximated within any constant factor. The best known algorithm approximates it within a factor of . The weighted variant can also be approximated this well.However, the problem does have a variant which is more tractable: if we assume no subset exceeds k≥3 elements, the answer can be approximated within a factor of k/2 + ε for any ε > 0; in particular, the problem with 3-element sets can be approximated within about 50%. In another more tractable variant, if no element occurs in more than k of the subsets, the answer can be approximated within a factor of k. This is also true for the weighted version.
See also
- Matching, 3-dimensional matching3-dimensional matchingIn the mathematical discipline of graph theory, a 3-dimensional matching is a generalization of bipartite matching to 3-uniform hypergraphs...
, and independent setIndependent set (graph theory)In graph theory, an independent set or stable set is a set of vertices in a graph, no two of which are adjacent. That is, it is a set I of vertices such that for every two vertices in I, there is no edge connecting the two. Equivalently, each edge in the graph has at most one endpoint in I...
are special cases of set packing. A maximum-size matching can be found in polynomial time, but finding a largest 3-dimensional matching or a largest independent set is NP-hard. - Packing in a hypergraphPacking in a hypergraphIn mathematics, a packing in a hypergraph is a partition of the set of the hypergraph's edges into a number of disjoint subsets such that no pair of edges in each subset share any vertex. There are two famous algorithms to achieve asymptotically optimal packing in k-uniform hypergraphs. One of them...
.
External links
- http://www.cs.sunysb.edu/~algorith/implement/syslo/implement.shtml: A Pascal program for solving the problem. From Discrete Optimization Algorithms with Pascal Programs by MacIej M. Syslo, ISBN 0-13-215509-5.
- Benchmarks with Hidden Optimum Solutions for Set Covering, Set Packing and Winner Determination
- Solving packaging problem in PHP