24 NP-Completeness
Finally, we will look at an amazing method to show that some problems are “at least as hard” as others: NP-Completeness
Objectives: After learning this material, you should be able to:
- Define NP-Complete.
- Name some common NP-complete problems
- Prove that a problem is NP-complete using a reduction.
24.1 NP-Completeness
Amazingly, there are problems in NP that are “at least as hard” as any other problem in NP. Such problems are called “NP-complete”. To define NP-completeness, we will consider reductions, i.e. ways to transform some Problem A into another Problem B. If Problem B is easy to solve, this would imply that A is easy to solve too (by transforming it to B). We will find types of “Problem B” that every problem in NP can be reduced to.
Definition 24.1 (Mapping reduction) A mapping reduction from decision problem A to decision problem B is an algorithm that, given an instance \(I\) of problem A, produces an instance \(I'\) of problem B such that \(I\) is a “yes” if and only if \(I'\) is a yes. It is a polynomial-time mapping reduction if the algorithm runs in polynomial time in the size of its input \(I\).
We can now make an amazing definition: NP-Complete. A problem B is NP-Complete if every other NP problem reduces to it.
Definition 24.2 (NP-Complete) A decision problem B is NP-Complete if it is in NP and, for every decision problem A in NP, there exists a polynomial-time mapping reduction from A to B.
Here is one reason NP-Completeness is important: if we have a single NP-Complete problem, and we can solve it efficiently, then we can solve every problem in NP efficiently.
Proposition 24.1 If a decision problem B is NP-Complete and if we have a polynomial-time algorithm for B, then P=NP.
Proof. Let A be any problem in NP. Because B is NP-complete, there is a polynomial-time mapping reduction \(M\) from A to B. Here is a polynomial-time algorithm for A:
- Given an input \(I\) for A, use \(M\) to produce an input \(I'\) for B. Notice that, since M runs in polynomial time, the size of \(I'\) is at most \(|I'| \leq p(|I|)\) for some polynomial \(p\), where \(|I|\) is the size of input \(I\).
- Run the polynomial-time algorithm for B on \(I'\). This takes time \(q(|I'|)\), where \(q\) is some polynomial. Return the answer.
This algorithm is correct by definition of mapping reduction. It also runs in polynomial time: The running time of \(M\) is polynomial in \(|I|\), and the running time of the algorithm for B is at most \(q(p(|I|))\), where \(q\) and \(p\) are polynomials. The composition of two polynomials is still a polynomial.
For an example of the \(p\) and \(q\) used in the proof, imagine that our reduction takes an instance \(I\) of length \(n\) and produces an instance \(I'\) of length \(n^2\). And suppose that the algorithm for B runs in time \(O(m^3)\) on inputs of length \(m\). Then the total running time is \(O((n^2)^3) = O(n^6)\).
24.2 A first NP-Complete problem
It’s not obvious whether there are any NP-Complete problems at all, but there are. Here is our first one: Satisfiability, or SAT for short.
Definition 24.3 (SAT) The Satisfiability (SAT) problem is:
- Input: a boolean formula on \(n\) variables \(x_1,\dots,x_n\), i.e. a combination of logical AND, OR, NOT, and parenthetical groupings of variables.
- Output: “yes” if there exists an assignment of True or False to each variable such that the formula overall evaluates to True; “no” if no such assignment exists.
Proposition 24.2 SAT is in NP.
Proof. The witness string is the assignment of True/False to the variables \(x_1,\dots,x_n\). The verifier simply plugs the variables into the formula and evaluates it; if True, it accepts, otherwise it rejects.
Theorem 24.1 SAT is NP-Complete.
Proof (Sketch.). We already showed that SAT is in NP. We now must show that every problem A in NP has a polynomial-time mapping reduction to SAT.
Let A be a decision problem in NP with a verifier \(V\). Here is our mapping reduction. First, we will represent the memory and variables used by \(V\) in binary. On input \(I\) of size \(n\), \(V\) uses at most \(p(n)\) total bits of memory over the whole course of its operation, for some polynomial \(p\), including the input variables. Furthermore, it takes at most \(q(n)\) total steps. We will have a new variable \(x_{i,j}\) representing memory slot \(i\) at time step \(j\), where True represents that the bit is set to 1 at that time.
For each possible step \(i = 1,2,\dots\) of \(V\), we can write down a Boolean formula \(f_i\) that evaluates to True if and only if that step was completed correctly. For example, if step \(i\) is represented as \(z = z + 1\), then the formula would involve the bits of \(v\) at step \(i-1\) and step \(i\) and would evaluate to True if the addition was carried out correctly. (Checking this claim formally would take a lot more detail, but we will skip it here.) We also make a Boolean formula \(f_{accept}\) that evalutes to True if and only if the verifier accepts based on its memory state at the last step.
Now, our mapping reduction works as follows. Given an instance \(I\) of problem A, we construct the Boolean formula \(f_0 \text{ AND } f_1 \text{ AND } \cdots \text{ AND } f_{accept}\). Here \(f_0\) is a formula that says the memory of \(V\) is correctly initialized to the instance \(I\) and that the rest of the memory, except for \(W\), is correctly initialized. We let the witness input \(W\) remain unconstrained.
The mapping reduction runs in polynomial time because each \(f_i\) can be constructed in polynomial time, and there are polynomially many of them. Further, it is correct because it returns a satisfiable formula if and only if there exists a setting of the witness \(W\), and a setting of the state variables at each time step, such that the verifier runs as it is supposed to and accepts. If there is no such witness \(W\), the formula is not satisfiable because any initial setting of \(W\) and setting of the state variables will either not represent a valid operation of \(V\), or will result in \(V\) rejecting.
24.3 More NP-Complete problems
Knowing about NP-Completeness is very important in practice, because many problems turn out to be NP-Complete. If you run into an NP-Complete problem, you know that it has no known polynomial-time algorithm. Therefore, you will need to think about how to approach it: maybe you can try an approximately-correct algorithm, a heuristic, or (if your problem is not too large) turn to algorithms out there tailored for solving NP-Complete problems as fast as possible.
What other problems are NP-Complete besides SAT? It turns out that now that we have one NP-Complete problem, we can have many more by a more simple approach.
Proposition 24.3 If A is NP-Complete, and B is in NP, and A reduces to B with a polynomial-time mapping reduction, then B is NP-Complete too.
Proof. For any problem X in NP, we can first take a polynomial-time mapping reduction from an instance \(I\) to an instance \(I'\) of A, then another reduction to an instance \(I''\) of B. Since each reduction runs in polynomial time, this is a polynomial-time mapping reduction from X to B.
To prove another problem is NP-Complete, we just have to show that SAT can be reduced to it. Here are some of the many known NP-Complete problems:
- Hamiltonian Path (see Section 23.2)
- Clique problem: given an undirected graph \(G\) and integer \(k\), does there exist a subset of \(k\) vertices that are all pairwise neighbors with each other?
- Knapsack problem, decision problem version: given an instance and a threshold \(t\), does there exist a set of items that fit in the knapsack with total value at least \(t\)?
Knapsack is an interesting case. Recall that we have a dynamic programming algorithm for knapsack that is better than the brute-force approach. However, the algorithm runs in time \(O(nW)\) where \(n\) is the number of items and \(W\) is the weight limit of the knapsack. If we consider the number of bits in the input, it only takes \(\log(W)\) bits to write down \(W\), so the running time is actually exponential in the size of the input. However, this does show that NP-Complete problems can have better algorithms than pure brute-force search.
24.4 Proving a problem is NP-Complete
To prove a decision problem is NP-complete, we have to remember two steps:
- Prove the problem is in NP.
- Prove that an NP-Complete problem reduces to it.
Exercise 24.1 The Hamiltonian Cycle problem is, given a directed graph, to find a cycle that visits every vertex (except the start and end vertex) exactly once. Prove that Hamiltonian Cycle is NP-Complete.
Hint: reduce from Hamiltonian Path.
Example solution.
First, we prove the problem is in NP. The witness will be the cycle referenced in the problem. The verifier, given a graph and a list of vertices, checks that the list contains every vertex once except the start and end vertex, which equal each other, and that it is a valid cycle in the graph (i.e. follows existing edges).
Next, we reduce from Hamiltonian Path to Hamiltonian Cycle. Given an instance of Hamiltonian Path, we add a new vertex \(w\) to the graph and connect it both to and from every existing vertex. If the previous graph had a Hamiltonian Path, then the new graph has a Hamiltonian Cycle by beginning at \(w\), following the Path, and ending at \(w\). Conversely, suppose the new graph has a Hamiltonian Cycle. It must contain \(w\). By rearranging, we can put \(w\) first and last in the Hamiltonian Cycle, i.e. \(w, v_1,\dots,v_n,w\). Then the list \(v_1,\dots,v_n\) is a path that touches every vertex besides \(w\), so it is a Hamiltonian Path in the original graph.