P-complete
Encyclopedia
In complexity theory
, the notion of P-complete decision problem
s is useful in the analysis of both:
Formally, a decision problem
is P-complete (complete
for the complexity class P
) if it is in P and that every problem in P can be reduced to it by using an appropriate reduction.
The specific type of reduction used varies and may affect the exact set of problems. If we use NC
reductions, that is, reductions which can operate in polylogarithmic time on a parallel computer with a polynomial number of processors, then all P-complete problems lie outside NC and so cannot be effectively parallelized, under the unproven assumption that NC ≠ P. If we use the weaker log-space reduction
, this remains true, but additionally we learn that all P-complete problems lie outside L
under the weaker unproven assumption that L ≠ P. In this latter case the set P-complete may be smaller.
, which consists of those problems which can be efficiently solved on a parallel computer. This is because parallel computers can be simulated on a sequential machine.
It is not known whether NC = P. In other words, it is not known whether there are any tractable problems that are inherently sequential. Just as it is widely suspected that P does not equal NP, so it is widely suspected that NC does not equal P.
Similarly, the class L
contains all problems that can be solved by a sequential computer in logarithmic space. Such machines run in polynomial time because they can have a polynomial number of configurations. It is suspected that L ≠ P; that is, that some problems that can be solved in polynomial time also require more than logarithmic space.
Similarly to the use of NP-complete
problems to analyze the P = NP question, the P-complete problems, viewed as the "probably not parallelizable" or "probably inherently sequential" problems, serves in a similar manner to study the NC = P question. Finding an efficient way to parallelize the solution to some P-complete problem would show that NC = P. It can also be thought of as the "problems requiring superlogarithmic space"; a log-space solution to a P-complete problem (using the definition based on log-space reductions) would imply L = P.
The logic behind this is analogous to the logic that a polynomial-time solution to an NP-complete problem would prove P = NP: if we have a NC reduction from any problem in P to a problem A, and an NC solution for A, then NC = P. Similarly, if we have a log-space reduction from any problem in P to a problem A, and a log-space solution for A, then L = P.
, an input for that machine, and a number T (written in unary
), does that machine halt on that input within the first T steps? It is clear that this problem is P-complete: if we can parallelize a general simulation of a sequential computer, then we will be able to parallelize any program that runs on that computer. If this problem is in NC, then so is every other problem in P. If the number of steps is written in binary, the problem is EXPTIME-complete.
This problem illustrates a common trick in the theory of P-completeness. We aren't really interested in whether a problem can be solved quickly on a parallel machine. We're just interested in whether a parallel machine solves it much more quickly than a sequential machine. Therefore, we have to reword the problem so that the sequential version is in P. That is why this problem required T to be written in unary. If a number T is written as a binary
number (a string of n ones and zeros, where n = log T), then the obvious sequential algorithm can take time 2n. On the other hand, if T is written as a unary number (a string of n ones, where n = T), then it only takes time n. By writing T in unary rather than binary, we have reduced the obvious sequential algorithm from exponential time to linear time. That puts the sequential problem in P. Then, it will be in NC if and only if it is parallelizable.
Many other problems have been proved to be P-complete, and therefore are widely believed to be inherently sequential. These include the following problems, either as given, or in a decision-problem
form:
In order to prove that a given problem is P-complete, one typically tries to reduce a known P-complete problem to the given one.
In 1999, Jin-Yi Cai and D. Sivakumar, building on work by Ogihara, showed that if there exists a sparse language
that is P-complete, then L
= P
.
) are suspected to be difficult. Similarly there are problems that are not known to be either P-complete or NC, but are thought to be difficult to parallelize. Examples include the decision problem
forms of finding the greatest common divisor
of two numbers, and determining what answer the extended Euclidean algorithm
would return when given two numbers.
Computational complexity theory
Computational complexity theory is a branch of the theory of computation in theoretical computer science and mathematics that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other...
, the notion of P-complete decision problem
Decision problem
In computability theory and computational complexity theory, a decision problem is a question in some formal system with a yes-or-no answer, depending on the values of some input parameters. For example, the problem "given two numbers x and y, does x evenly divide y?" is a decision problem...
s is useful in the analysis of both:
- which problems are difficult to parallelize effectively, and;
- which problems are difficult to solve in limited space.
Formally, a decision problem
Decision problem
In computability theory and computational complexity theory, a decision problem is a question in some formal system with a yes-or-no answer, depending on the values of some input parameters. For example, the problem "given two numbers x and y, does x evenly divide y?" is a decision problem...
is P-complete (complete
Complete (complexity)
In computational complexity theory, a computational problem is complete for a complexity class if it is, in a formal sense, one of the "hardest" or "most expressive" problems in the complexity class...
for the complexity class P
P (complexity)
In computational complexity theory, P, also known as PTIME or DTIME, is one of the most fundamental complexity classes. It contains all decision problems which can be solved by a deterministic Turing machine using a polynomial amount of computation time, or polynomial time.Cobham's thesis holds...
) if it is in P and that every problem in P can be reduced to it by using an appropriate reduction.
The specific type of reduction used varies and may affect the exact set of problems. If we use NC
NC (complexity)
In complexity theory, the class NC is the set of decision problems decidable in polylogarithmic time on a parallel computer with a polynomial number of processors. In other words, a problem is in NC if there exist constants c and k such that it can be solved in time O using O parallel processors...
reductions, that is, reductions which can operate in polylogarithmic time on a parallel computer with a polynomial number of processors, then all P-complete problems lie outside NC and so cannot be effectively parallelized, under the unproven assumption that NC ≠ P. If we use the weaker log-space reduction
Log-space reduction
In computational complexity theory, a log-space reduction is a reduction computable by a deterministic Turing machine using logarithmic space. Conceptually, this means it can keep a constant number of pointers into the input, along with a logarithmic number of fixed-size integers...
, this remains true, but additionally we learn that all P-complete problems lie outside L
L (complexity)
In computational complexity theory, L is the complexity class containing decision problems which can be solved by a deterministic Turing machine using a logarithmic amount of memory space...
under the weaker unproven assumption that L ≠ P. In this latter case the set P-complete may be smaller.
Motivation
The class P, typically taken to consist of all the "tractable" problems for a sequential computer, contains the class NCNC (complexity)
In complexity theory, the class NC is the set of decision problems decidable in polylogarithmic time on a parallel computer with a polynomial number of processors. In other words, a problem is in NC if there exist constants c and k such that it can be solved in time O using O parallel processors...
, which consists of those problems which can be efficiently solved on a parallel computer. This is because parallel computers can be simulated on a sequential machine.
It is not known whether NC = P. In other words, it is not known whether there are any tractable problems that are inherently sequential. Just as it is widely suspected that P does not equal NP, so it is widely suspected that NC does not equal P.
Similarly, the class L
L (complexity)
In computational complexity theory, L is the complexity class containing decision problems which can be solved by a deterministic Turing machine using a logarithmic amount of memory space...
contains all problems that can be solved by a sequential computer in logarithmic space. Such machines run in polynomial time because they can have a polynomial number of configurations. It is suspected that L ≠ P; that is, that some problems that can be solved in polynomial time also require more than logarithmic space.
Similarly to the use of NP-complete
NP-complete
In computational complexity theory, the complexity class NP-complete is a class of decision problems. A decision problem L is NP-complete if it is in the set of NP problems so that any given solution to the decision problem can be verified in polynomial time, and also in the set of NP-hard...
problems to analyze the P = NP question, the P-complete problems, viewed as the "probably not parallelizable" or "probably inherently sequential" problems, serves in a similar manner to study the NC = P question. Finding an efficient way to parallelize the solution to some P-complete problem would show that NC = P. It can also be thought of as the "problems requiring superlogarithmic space"; a log-space solution to a P-complete problem (using the definition based on log-space reductions) would imply L = P.
The logic behind this is analogous to the logic that a polynomial-time solution to an NP-complete problem would prove P = NP: if we have a NC reduction from any problem in P to a problem A, and an NC solution for A, then NC = P. Similarly, if we have a log-space reduction from any problem in P to a problem A, and a log-space solution for A, then L = P.
P-complete problems
The most basic P-complete problem is this: given a Turing machineTuring machine
A Turing machine is a theoretical device that manipulates symbols on a strip of tape according to a table of rules. Despite its simplicity, a Turing machine can be adapted to simulate the logic of any computer algorithm, and is particularly useful in explaining the functions of a CPU inside a...
, an input for that machine, and a number T (written in unary
Unary numeral system
The unary numeral system is the bijective base-1 numeral system. It is the simplest numeral system to represent natural numbers: in order to represent a number N, an arbitrarily chosen symbol representing 1 is repeated N times. For example, using the symbol | , the number 6 is represented as ||||||...
), does that machine halt on that input within the first T steps? It is clear that this problem is P-complete: if we can parallelize a general simulation of a sequential computer, then we will be able to parallelize any program that runs on that computer. If this problem is in NC, then so is every other problem in P. If the number of steps is written in binary, the problem is EXPTIME-complete.
This problem illustrates a common trick in the theory of P-completeness. We aren't really interested in whether a problem can be solved quickly on a parallel machine. We're just interested in whether a parallel machine solves it much more quickly than a sequential machine. Therefore, we have to reword the problem so that the sequential version is in P. That is why this problem required T to be written in unary. If a number T is written as a binary
Binary numeral system
The binary numeral system, or base-2 number system, represents numeric values using two symbols, 0 and 1. More specifically, the usual base-2 system is a positional notation with a radix of 2...
number (a string of n ones and zeros, where n = log T), then the obvious sequential algorithm can take time 2n. On the other hand, if T is written as a unary number (a string of n ones, where n = T), then it only takes time n. By writing T in unary rather than binary, we have reduced the obvious sequential algorithm from exponential time to linear time. That puts the sequential problem in P. Then, it will be in NC if and only if it is parallelizable.
Many other problems have been proved to be P-complete, and therefore are widely believed to be inherently sequential. These include the following problems, either as given, or in a decision-problem
Decision problem
In computability theory and computational complexity theory, a decision problem is a question in some formal system with a yes-or-no answer, depending on the values of some input parameters. For example, the problem "given two numbers x and y, does x evenly divide y?" is a decision problem...
form:
- Circuit Value Problem (CVP) - Given a circuitBoolean circuitA Boolean circuit is a mathematical model of computation used in studying computational complexity theory. Boolean circuits are the main object of study in circuit complexity and are a special kind of circuits; a formal language can be decided by a family of Boolean circuits, one circuit for each...
, the inputs to the circuit, and one gate in the circuit, calculate the output of that gate - Restricted Case of CVP - Like CVP, except each gate has two inputs and two outputs (F and Not F), every other layer is just AND gates, the rest are OR gates (or, equivalently, all gates are NAND gates, or all gates are NOR gates), the inputs of a gate come from the immediately preceding layer
- Linear programmingLinear programmingLinear programming is a mathematical method for determining a way to achieve the best outcome in a given mathematical model for some list of requirements represented as linear relationships...
- Maximize a linear function subject to linear inequality constraints - Lexicographically First Depth First Search Ordering - Given a graphGraph theoryIn mathematics and computer science, graph theory is the study of graphs, mathematical structures used to model pairwise relations between objects from a certain collection. A "graph" in this context refers to a collection of vertices or 'nodes' and a collection of edges that connect pairs of...
with fixed ordered adjacency lists, and nodes u and v, is vertex u visited before vertex v in a depth-first search induced by the order of the adjacency lists? - Context Free Grammar Membership - Given a context-free grammarContext-free grammarIn formal language theory, a context-free grammar is a formal grammar in which every production rule is of the formwhere V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals ....
and a string, can that string be generated by that grammar? - Horn-satisfiabilityHorn-satisfiabilityIn formal logic, Horn-satisfiability, or HORNSAT, is the problem of deciding whether a given set of propositional Horn clauses is satisfiable....
: given a set of Horn clauseHorn clauseIn mathematical logic, a Horn clause is a clause with at most one positive literal. They are named after the logician Alfred Horn, who first pointed out the significance of such clauses in 1951...
s, is there a variable assignment which satisfies them? This is P's version of the boolean satisfiability problemBoolean satisfiability problemIn computer science, satisfiability is the problem of determining if the variables of a given Boolean formula can be assigned in such a way as to make the formula evaluate to TRUE...
. - Game of Life - Given an initial configuration of Conway's Game of LifeConway's Game of LifeThe Game of Life, also known simply as Life, is a cellular automaton devised by the British mathematician John Horton Conway in 1970....
, a particular cell, and a time T (in unary), is that cell alive after T steps? - LZW (algorithm) (1978 paradigm) Data Compression - given strings s and t, will compressing s with an LZ78 method add t to the dictionary? (Note that for LZ77 compression such as gzipGzipGzip is any of several software applications used for file compression and decompression. The term usually refers to the GNU Project's implementation, "gzip" standing for GNU zip. It is based on the DEFLATE algorithm, which is a combination of Lempel-Ziv and Huffman coding...
, this is much easier, as the problem reduces to "Is t in s?".) - Type inferenceType inferenceType inference refers to the automatic deduction of the type of an expression in a programming language. If some, but not all, type annotations are already present it is referred to as type reconstruction....
for partial types - Given an untypedType theoryIn mathematics, logic and computer science, type theory is any of several formal systems that can serve as alternatives to naive set theory, or the study of such formalisms in general...
term from the lambda calculusLambda calculusIn mathematical logic and computer science, lambda calculus, also written as λ-calculus, is a formal system for function definition, function application and recursion. The portion of lambda calculus relevant to computation is now called the untyped lambda calculus...
, determine whether this term has a partial type.
In order to prove that a given problem is P-complete, one typically tries to reduce a known P-complete problem to the given one.
In 1999, Jin-Yi Cai and D. Sivakumar, building on work by Ogihara, showed that if there exists a sparse language
Sparse language
In computational complexity theory, a sparse language is a formal language such that the number of strings of length n in the language is bounded by a polynomial function of n. They are used primarily in the study of the relationship of the complexity class NP with other classes...
that is P-complete, then L
L (complexity)
In computational complexity theory, L is the complexity class containing decision problems which can be solved by a deterministic Turing machine using a logarithmic amount of memory space...
= P
P (complexity)
In computational complexity theory, P, also known as PTIME or DTIME, is one of the most fundamental complexity classes. It contains all decision problems which can be solved by a deterministic Turing machine using a polynomial amount of computation time, or polynomial time.Cobham's thesis holds...
.
Problems not known to be P-complete
Some problems are not known to be either NP-hard or in P. These problems (e.g. factoringInteger factorization
In number theory, integer factorization or prime factorization is the decomposition of a composite number into smaller non-trivial divisors, which when multiplied together equal the original integer....
) are suspected to be difficult. Similarly there are problems that are not known to be either P-complete or NC, but are thought to be difficult to parallelize. Examples include the decision problem
Decision problem
In computability theory and computational complexity theory, a decision problem is a question in some formal system with a yes-or-no answer, depending on the values of some input parameters. For example, the problem "given two numbers x and y, does x evenly divide y?" is a decision problem...
forms of finding the greatest common divisor
Greatest common divisor
In mathematics, the greatest common divisor , also known as the greatest common factor , or highest common factor , of two or more non-zero integers, is the largest positive integer that divides the numbers without a remainder.For example, the GCD of 8 and 12 is 4.This notion can be extended to...
of two numbers, and determining what answer the extended Euclidean algorithm
Extended Euclidean algorithm
The extended Euclidean algorithm is an extension to the Euclidean algorithm. Besides finding the greatest common divisor of integers a and b, as the Euclidean algorithm does, it also finds integers x and y that satisfy Bézout's identityThe extended Euclidean algorithm is particularly useful when a...
would return when given two numbers.