Semidefinite programming
Encyclopedia
Semidefinite programming (SDP) is a subfield of convex optimization concerned with the optimization of a linear objective function
over the intersection of the cone of positive semidefinite matrices
with an affine space
, i.e., a spectrahedron
.
Semidefinite programming is a relatively new field of optimization which is of growing interest for several reasons. Many practical problems in operations research
and combinatorial optimization
can be modeled or approximated as semidefinite programming problems. In automatic control theory, SDP's are used in the context of linear matrix inequalities. SDPs are in fact a special case of cone programming
and can be efficiently solved by interior point methods.
All linear programs
can be expressed as SDPs, and via hierarchies of SDPs the solutions of polynomial optimization problems can be approximated. Finally, semidefinite programming has been used in the optimization of complex systems
.
The space is equipped with the inner product
(where denotes the trace
)
A symmetric matrix is positive semidefinite if all its eigenvalues
are nonnegative; we write . Similarly, and means that is positive
definite, negative semidefinite, and negative definite,
respectively. Denote by the convex cone
of positive
semidefinite matrices. This cone defines a partial
order for by whenever is
positive semidefinite, .
problems of the type
We refer to this problem as a primal semidefinite program (P-SDP). Analogously to linear programming, we introduce a dual semidefinite program (D-SDP)
For convenience, an SDP will often be specified in a slightly different, but equivalent form. For example, linear expressions involving nonnegative scalar
variables may be added to the program specification. This remains an SDP because each variable can be incorporated into the matrix as a diagonal entry ( for some ). To ensure that , constraints can be added for all . As another example, note that for any positive semidefinite matrix , there exists a set of vectors such that the , entry of is the scalar product
of and . Therefore, SDPs are often formulated in terms of linear expressions on scalar products of vectors. Given the solution to the SDP in the standard form, the vectors can be recovered in time (e.g., by using an incomplete Cholesky decomposition
of X).
where the last inequality is because both matrices are positive semidefinite.
, however, not every SDP satisfies strong duality; in general, the value of the dual SDP may lie strictly below the value of the primal.
(i) Suppose the primal problem (P-SDP) is bounded below and strictly
feasible (i.e., there exists
such that , ).
Then there is an optimal solution to (D-SDP) and
(ii) Suppose the dual problem (D-SDP) is bounded above and strictly
feasible (i.e.,
for some ).
Then there is an optimal solution to (P-SDP) and
the equality from (i) holds.
are valid if and only if
Suppose that we know from some prior knowledge (empirical results of an experiment, for example) that and . The problem of determining the smallest and largest values that can take is given by:
over the intersection of the cone of positive semidefinite matrices
Matrix (mathematics)
In mathematics, a matrix is a rectangular array of numbers, symbols, or expressions. The individual items in a matrix are called its elements or entries. An example of a matrix with six elements isMatrices of the same size can be added or subtracted element by element...
with an affine space
Affine space
In mathematics, an affine space is a geometric structure that generalizes the affine properties of Euclidean space. In an affine space, one can subtract points to get vectors, or add a vector to a point to get another point, but one cannot add points. In particular, there is no distinguished point...
, i.e., a spectrahedron
Spectrahedron
In convex geometry, a spectrahedron is a shape that can be represented as a linear matrix inequality. Alternatively, the set of positive semidefinite matrices forms a convex cone in , and a spectrahedron is a shape that can be formed by intersecting this cone with a linear affine...
.
Semidefinite programming is a relatively new field of optimization which is of growing interest for several reasons. Many practical problems in operations research
Operations research
Operations research is an interdisciplinary mathematical science that focuses on the effective use of technology by organizations...
and combinatorial optimization
Combinatorial optimization
In applied mathematics and theoretical computer science, combinatorial optimization is a topic that consists of finding an optimal object from a finite set of objects. In many such problems, exhaustive search is not feasible...
can be modeled or approximated as semidefinite programming problems. In automatic control theory, SDP's are used in the context of linear matrix inequalities. SDPs are in fact a special case of cone programming
Conic optimization
Conic optimization is a subfield of convex optimization that studies a class of structured convex optimization problems called conic optimization problems...
and can be efficiently solved by interior point methods.
All linear programs
Linear programming
Linear programming is a mathematical method for determining a way to achieve the best outcome in a given mathematical model for some list of requirements represented as linear relationships...
can be expressed as SDPs, and via hierarchies of SDPs the solutions of polynomial optimization problems can be approximated. Finally, semidefinite programming has been used in the optimization of complex systems
Optimization
Optimization or optimality may refer to:* Mathematical optimization, the theory and computation of extrema or stationary points of functionsEconomics and business* Optimality, in economics; see utility and economic efficiency...
.
Notation
Denote by the space of all real symmetric matrices.The space is equipped with the inner product
Inner product space
In mathematics, an inner product space is a vector space with an additional structure called an inner product. This additional structure associates each pair of vectors in the space with a scalar quantity known as the inner product of the vectors...
(where denotes the trace
Trace (linear algebra)
In linear algebra, the trace of an n-by-n square matrix A is defined to be the sum of the elements on the main diagonal of A, i.e.,...
)
A symmetric matrix is positive semidefinite if all its eigenvalues
are nonnegative; we write . Similarly, and means that is positive
definite, negative semidefinite, and negative definite,
respectively. Denote by the convex cone
Convex cone
In linear algebra, a convex cone is a subset of a vector space over an ordered field that is closed under linear combinations with positive coefficients.-Definition:...
of positive
semidefinite matrices. This cone defines a partial
order for by whenever is
positive semidefinite, .
Primal and dual form
Linear semidefinite programming (SDP) deals with optimizationproblems of the type
We refer to this problem as a primal semidefinite program (P-SDP). Analogously to linear programming, we introduce a dual semidefinite program (D-SDP)
For convenience, an SDP will often be specified in a slightly different, but equivalent form. For example, linear expressions involving nonnegative scalar
Scalar (mathematics)
In linear algebra, real numbers are called scalars and relate to vectors in a vector space through the operation of scalar multiplication, in which a vector can be multiplied by a number to produce another vector....
variables may be added to the program specification. This remains an SDP because each variable can be incorporated into the matrix as a diagonal entry ( for some ). To ensure that , constraints can be added for all . As another example, note that for any positive semidefinite matrix , there exists a set of vectors such that the , entry of is the scalar product
Dot product
In mathematics, the dot product or scalar product is an algebraic operation that takes two equal-length sequences of numbers and returns a single number obtained by multiplying corresponding entries and then summing those products...
of and . Therefore, SDPs are often formulated in terms of linear expressions on scalar products of vectors. Given the solution to the SDP in the standard form, the vectors can be recovered in time (e.g., by using an incomplete Cholesky decomposition
Cholesky decomposition
In linear algebra, the Cholesky decomposition or Cholesky triangle is a decomposition of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose. It was discovered by André-Louis Cholesky for real matrices...
of X).
Weak Duality
The weak duality theorem states that the value of the primal SDP is at least the value of the dual SDP. Therefore, any feasible solution to the dual SDP lower-bounds the primal SDP value, and conversely, and feasible solution to the primal SDP upper-bounds the dual SDP value. This is becausewhere the last inequality is because both matrices are positive semidefinite.
Strong Duality
Under a condition known as Slater's condition, the value of the primal and dual SDPs are equal. This is known as strong duality. Unlike for linear programsLinear programming
Linear programming is a mathematical method for determining a way to achieve the best outcome in a given mathematical model for some list of requirements represented as linear relationships...
, however, not every SDP satisfies strong duality; in general, the value of the dual SDP may lie strictly below the value of the primal.
(i) Suppose the primal problem (P-SDP) is bounded below and strictly
feasible (i.e., there exists
such that , ).
Then there is an optimal solution to (D-SDP) and
(ii) Suppose the dual problem (D-SDP) is bounded above and strictly
feasible (i.e.,
for some ).
Then there is an optimal solution to (P-SDP) and
the equality from (i) holds.
Example 1
Consider three random variables , , and . By definition, their correlation coefficientsCorrelation
In statistics, dependence refers to any statistical relationship between two random variables or two sets of data. Correlation refers to any of a broad class of statistical relationships involving dependence....
are valid if and only if
Suppose that we know from some prior knowledge (empirical results of an experiment, for example) that and . The problem of determining the smallest and largest values that can take is given by:
- minimize/maximize
- subject to
we set to obtain the answer. This can be formulated by an SDP. We handle the inequality constraints by augmenting the variable matrix and introducing slack variableSlack variableIn an optimization problem, a slack variable is a variable that is added to an inequality constraint to transform it to an equality. Introducing a slack variable replaces an inequality constraint with an equality constraint and a nonnegativity constraint....
s, for example
Solving this SDP gives the minimum and maximum values of as and respectively.
Example 2
Consider the problem
- minimize
- subject to
where we assume that whenever .
Introducing an auxiliary variable the problem can be reformulated:
- minimize
- subject to
In this formulation, the objective is a linear function of the variables .
The first restriction can be written as
where the matrix is the square matrix with values in the diagonal equal
to the elements of the vector .
The second restriction can be written as
or equivalently
- det
Thus .
The semidefinite program associated with this problem is
- minimize
- subject to
Example 3 (Goemans-Williamson MAX CUT approximation algorithm)
Semidefinite programs are important tools for developing approximation algorithms for NP-hard maximization problems. The first approximation algorithm based on an SDP is due to Goemans and Williamson (JACM, 1995). They studied the MAX CUT problem: Given a graphGraph (mathematics)In mathematics, a graph is an abstract representation of a set of objects where some pairs of the objects are connected by links. The interconnected objects are represented by mathematical abstractions called vertices, and the links that connect some pairs of vertices are called edges...
G = (V, E), output a partitionPartition of a setIn mathematics, a partition of a set X is a division of X into non-overlapping and non-empty "parts" or "blocks" or "cells" that cover all of X...
of the vertices V so as to maximize the number of edges crossing from one side to the other. This problem can be expressed as an integer quadratic program:- Maximize such that each .
Unless P = NP, we cannot solve this maximization problem efficiently. However, Goemans and Williamson observed a general three-step procedure for attacking this sort of problem:- Relax the integer quadratic program into an SDP.
- Solve the SDP (to within an arbitrarily small additive error ).
- Round the SDP solution to obtain an approximate solution to the original integer quadratic program.
For MAX CUT, the most natural relaxation is such that , where the maximization is over vectors instead of integer scalars.
This is an SDP because the objective function and constraints are all linear functions of vector inner products. Solving the SDP gives a set of unit vectors in ; since the vectors are not required to be collinear, the value of this relaxed program can only be higher than the value of the original quadratic integer program. Finally, a rounding procedure is needed to obtain a partition. Goemans and Williamson simply choose a uniformly random hyperplane through the origin and divide the vertices according to which side of the hyperplane the corresponding vectors lie. Straightforward analysis shows that this procedure achieves an expected approximation ratio (performance guarantee) of 0.87856 - ε. (The expected value of the cut is the sum over edges of the probability that the edge is cut, which is simply arccos of the angle between the vectors at the endpoints of the edge over . Comparing this probability to , in expectation the ratio is always at least 0.87856.) Assuming the Unique Games ConjectureUnique games conjectureIn computational complexity theory, the Unique Games Conjecture is a conjecture made by Subhash Khot in 2002. The conjecture postulates that the problem of determining the value of a certain type of game, known as a unique game, has NP-hard algorithmic complexity...
, it can be shown that this approximation ratio is essentially optimal.
Since the original paper of Goemans and Williamson, SDPs have been applied to develop numerous approximation algorithms. Recently, Prasad Raghavendra has developed a general framework for constraint satisfaction problems based on the Unique Games ConjectureUnique games conjectureIn computational complexity theory, the Unique Games Conjecture is a conjecture made by Subhash Khot in 2002. The conjecture postulates that the problem of determining the value of a certain type of game, known as a unique game, has NP-hard algorithmic complexity...
.
It has also been shown that the MAX CUT approximation is a kind of Locality Sensitive Hash and is related to the Simhash algorithm.
Algorithms
There are several types of algorithms for solving SDPs. These algorithms output the value of the SDP up to an additive error in time that is polynomial in the program description size and .
Interior point methods
Most codes are based on interior point methods (CSDP, SeDuMi, SDPT3, DSDP, SDPA). Robust and efficient for general linear SDP problems. Restricted by the fact that the algorithms are second-order methods and need to store and factorize a large (and often dense) matrix.
Bundle method
The code SBmethod formulates the SDP problem as a nonsmooth optimization problem and solves it by the Spectral Bundle method of nonsmooth optimization. This approach is very efficient for a special class of linear SDP problems.
Other
Algorithms based on augmented Lagrangian method (PENSDP) are similar in behavior to the interior point methods and can be specialized to some very large scale problems. Other algorithms use low-rank information and reformulation of the SDP as a nonlinear programmingNonlinear programmingIn mathematics, nonlinear programming is the process of solving a system of equalities and inequalities, collectively termed constraints, over a set of unknown real variables, along with an objective function to be maximized or minimized, where some of the constraints or the objective function are...
problem (SPDLR).
Software
The following codes are available for SDP:
SDPA, CSDP, SDPT3, SeDuMi, DSDP, PENSDP, SDPLR, SBmeth
SeDuMi runs on MATLAB and uses the Self-Dual method for solving general convex optimization problems.
Applications
Semidefinite programming has been applied to find approximate solutions to combinatorial optimization problems, such as the solution of the max cut problem with an approximation ratio of 0.87856. SDPs are also used in geometry to determine tensegrity graphs, and arise in control theory as LMIs.
External links
- Links to introductions and events in the field
- Lecture notes from Laszlo Lovasz on Semidefinite Programming
Software- Overview of existing software
- NEOS Solvers: A server that runs multiple algorithms for solving SDPs and other optimization problems
- SeDuMi: (Self-Dual-Minimization) solves optimization problems over symmetric cones (this includes semidefinite optimization).
- YALMIP: MATLAB optimization toolbox and frontend for many sdp solver
- cvx: a MATLAB frontend that uses either SeDuMi or SDPT3
- cvxopt: a python solver