19 Knapsack
This section looks specifically at variants of the knapsack problem and their dynamic programming solutions.
Objectives. After learning this material, you should be able to:
- Solve each knapsack variant using dynamic programming.
- Identify the DP components of the knapsack algorithms.
- Solve new DP problems involving 2d arrays.
19.1 Duplicates allowed
In the knapsack problem, we are given a set of items \(i=1,\dots,n\) each with a value \(v_i \in \mathbb{R}_+\) (a positive number) and a weight or size \(w_i \in \mathbb{N}\) (a nonnegative integer).
We are given a number \(W \in \mathbb{N}\) which is the maximum weight our knapsack can hold, also called the capacity or size of the knapsack. We must find the max-value subset of items that can fit in the knapsack.
In the duplicates allowed version, there are unlimited copies of each item available.
Exercise 19.1 Given this input instance, what is the optimal solution? Suppose \(W = 7\).
| Item | Value | Weight |
|---|---|---|
| 1 | 4 | 2 |
| 2 | 5 | 3 |
| 3 | 8 | 5 |
Solution.
The optimal solution is two copies of item 1 and one copy of item 2, for a value of 13. The total weight is \(2 \cdot 2 + 3 = 7\), which is feasible as it matches the weight limit. We can check that every other feasible solution has lower value. For example, these are feasible solutions: three copies of item 1; or two copies of item 2; or one copy of item 3 and one copy of item 1.
Let’s look for a DP solution. Recall the components of a DP solution: subproblem definition, computing the final value, the recurrence, and reconstructing the solution. Here a natural subproblem is to have a smaller-capacity knapsack. Let’s try it: our subproblem definition is to let $C[w] = $ the maximum value we can fit in a knapsack of size \(w\). With this subproblem, computing the final value is easy, as it is just \(C[W]\).
For the recurrence, the base case is where \(w = 0\), i.e. no items can fit, and the optimal value is zero. So we set \(C[0] = 0\). For the inductive case: For \(w \geq 1\), we set \[\begin{equation} C[w] = \max\begin{cases} C[w-1] \\ \max_{i : w_i \leq w} ~ v_i + C[w - w_i] \end{cases} . \end{equation}\]
In other words, we can write the inductive case as an algorithm: * Set \(C[w] = C[w-1]\), in other words, consider the optimal solution for a knapsack of size \(w-1\). * That solution can fit in this knapsack as well, since this one is only larger. * For each item \(i\): * Check if item \(i\) can fit in this knapsack, i.e. check if \(w_i \leq w\). * If not, skip this item and keep going. * Put item \(i\) in the knapsack. We now have a space of \(w - w_i\) remaining, and we have a value of \(v_i\). * Fill the remaining space optimally. Luckily, we already solved that subproblem: it gives a value of \(C[w - w_i]\). * Check if the resulting total value, \(v_i + C[w - w_i]\), is better than our current solution. If so, keep it.
Claim 1 The recurrence above is correct, i.e. $C[w] = $ the maximum value we can fit in a knapsack of size \(w\).
Proof. If \(w=0\), then \(C[w] = 0\) because no items can fit.
Now, given \(w \geq 1\), either no items fit, or at least one item fits. If no item fits, then \(C[w] = C[w-1] = 0\), which is correct.
So now suppose that at least one item fits. The optimal solution has at least one item, say \(i\). Now for the remaining space \(w-w_i\) (which is at least zero since \(i\) fits in the knapsack), it must be used optimally. So the total value from the remaining space is \(C[w - w_i]\), by inductive hypothesis. (If it were not used optimally, then we could get a better solution for the space and then add item \(i\) to it and obtain a better solution for \(C[w]\), which contradicts the assumption that this is optimal.)
So \(C[w] = v_i + C[w - w_i]\). So the recurrence is correct, since the optimal solution is the result of picking the best such item \(i\).
Combining these gives a dynamic programming algorithm:
Algorithm 19.1
Correctness: As usual in dynamic programming, correctness follows from correctness of the DP elements, which were argued above.
Efficiency: Space usage is dominated by \(C\), which uses \(O(W)\) space. For running time, we have nested loops, the outer one has \(W\) iterations and the inner one has \(n\) iterations, and the interior operations are constant time per iteration. So running time is \(O(nW)\).
Exercise 19.2 Modify the algorithm to reconstruct the actual list of items in the optimal knapsack.
Hint: Recall that for reconstruction, we should keep track of the choices our algorithm needed to make at each subproblem. At subproblem \(C[j]\), what were our choices? Then, how do we backtrack from the very end, i.e. \(C[W]\), to the beginning, to reconstruct the set?
Solution.
At each x, the choice was which item i to put in the knapsack. This took up w[i] space, and then we needed to add the rest of the solution, which was the optimal solution for C[x - w[i]].
knapsack_dups_2(v, w, W):
# v[i] = value, w[i] = weight, W = weight limit
let C[0] = 0
let Item[0] = null
for x = 1 to W:
C[x] = C[x-1]
Item[x] = null
for i = 1 to n, if w[i] <= x:
C[x] = max(C[x], v[i] + C[x - w[i]])
if C[x] == v[i] + C[x - w[i]]:
Item[x] = i
return kd_reconstruct(w, W, C, Item)
kd_reconstruct(w, W, C, Item):
let x = W
let solution = empty list
while x > 0:
if Item[x] == none:
set x = x - 1
else:
add Item[x] to solution
set x = x - w[Item[x]]
return solution Instead of a list, we could use an array where the \(i\)th entry counts how many copies of item \(i\) are in the solution. This would be more space-efficient for large instances.
19.2 No Duplicates
In this version of the knapsack problem, there is just one copy of each item, but the rest of the problem (including the format of the input) is the same.
We might hope to modify the previous solution while keeping the subproblem \(C[w]\) essentially the same. For example, by somehow remembering which items were used in \(C[w]\). This turns out to fail, in part because there could be multiple optimal subsets for \(C[w]\), and remembering all of them turns out to be prohibitive. Instead, we need a trick.
Subproblem definition. The trick is to introduce an extra dimension to our subproblems. Specifically, for our subproblem definition, let \(C[k,w]\) be the maximum value one can obtain from a knapsack of size \(w\) using only items from the subset \(\{1,\dots,k\}\).
Computing the final solution. We will simply return \(C[n,W]\) where \(n\) is the number of items and \(W\) is the knapsack capacity.
Recurrence. The base case is pretty straightforward: \(C[k,0] = 0\) for all item indexes \(k\) and \(C[0,w] = 0\) for all capacities \(w\).
For the inductive case, we set for \(k \geq 1, w \geq 1\): \[\begin{equation} C[k,w] = \max \begin{cases} C[k-1, w] \\ v_k + C[k-1, w - w_k] & \text{(if $w_k \leq w$)} \end{cases} . \end{equation}\]
Claim 2 The recurrence is correct, i.e. \(C[k,w]\) = the maximum value obtainable from a knapsack of size \(w\) using only items \(\{1,\dots,k\}\).
Proof. For the optimal solution with items \(1,\dots,k\) and capacity \(w\), there are two possibilities: we either include item \(k\), or we don’t. If we don’t, then the optimal solution uses only items \(1,\dots,k-1\), so its value is \(C[k-1,w]\).
If we do, then the remaining space is \(w - w_k\), and to fill it, we are only allowed to use items \(1,\dots,k-1\) because we just used item \(k\). So the optimal way to fill the remaining space is \(C[k-1, w - w_k]\), and our total value is \(v_k + C[k-1, w - w_k]\). Note this is only possible if \(w_k \leq w\), as otherwise item \(k\) cannot fit.
Since these are the only two possibilities (or only one possibility if \(w_k > w\)), and the recurrence chooses the best of both, it is optimal.
Our algorithm is therefore:
Algorithm 19.2
Correctness. As usual for dynamic programming, correctness follows almost immediately from the above arguments that the three components (subproblem, final solution, recurrence) are correct.
Efficiency. Initialization takes \(O(n + W)\) time, and returning the final result is constant time. There are nested loops of \(n\) and \(W\) iterations, with constant-time operations in the innermost loop, so runtime is \(O(n W)\). Space is dominated by \(C\), which uses \(O(n W)\) space.
Exercise 19.3 How do we modify the no-duplicates knapsack algorithm to return the optimal subset of items (i.e. reconstruct the solution), not just its value?
Solution.
We can create an array D[i,x] = False if C[i,x] == C[i-1,x], and otherwise D[i,x] = True, meaning that C[i,x] == v[i] + C[i-1,w-w[i]] and we used item i at this stage.
We then start with D[n,W]. If False, we go to D[n-1, W]. If True, we add item n to the knapsack and go to D[n-1, W - w[n]]. We continue in this way until we get to D[0,x] for any x.