4  Divide and Conquer Algorithms

Modified

February 25, 2026

This section introduces a powerful algorithmic paradigm, divide-and-conquer. This paradigm is closely related to recursion, and we introduce recursive algorithms as well.

Objectives. After learning this material, you should be able to:

4.1 Recursion and Divide-and-Conquer

You have probably seen recursive algorithms before, so we will recall them briefly. An algorithm is recursive if, in order to compute its output, it calls itself on a modified input. An example of a recursive algorithm is this naive implementation of the Fibonacci numbers:

Algorithm 4.1 (Fibonacci numbers)  

fib(n):
   if n == 0:
      return 0
   elseif n == 1:
      return 1
   else:
      return fib(n-1) + fib(n-2)

We will be looking at recursive algorithms that fall into the following algorithmic paradigm.

Definition 4.1 (Divide-and-conquer) A divide-and-conquer algorithm splits the input into parts, solves a useful subproblem on each part, then combines the solutions together to produce an answer to the original problem.

Often, divide-and-conquer algorithms are recursive in nature because the subproblems are of the same form as the original problem. This book will focus on recursive examples.

4.3 Mergesort

As with binary search, Mergesort is a recursive divide-and-conquer problem. It also applies to lists of elements, and as with binary search above, we will focus on the case where the elements are numbers, but the ideas can easily extend to other types of list elements.

The Sorting Problem:

  • Input: an array A of numbers.
  • Output: an array containing the same numbers in sorted order.

The idea of Mergesort is to split the list in half and sort the left and right halves separately. Then, we need to run the merge subroutine to combine the two sorted sublists into our final sorted list.

Algorithm 4.3 (Mergesort)  

mergesort(A):
   if len(A) <= 1:
      return A
   let i = int(len(A)/2)
   let B = mergesort(A[0:i])
   let C = mergesort(A[i+1:end])
   return merge(B, C)

merge(B, C):
   let D = []
   let i = 0
   let j = 0
   while i < len(B) or j < len(C):
      if j >= len(C) or (i < len(B) and B[i] < C[j]):
         append B[i] to D
         i += 1
      else:
         append C[j] to D
         j += 1
   return D

4.3.1 Correctness analysis

First, we need a lemma that the “merge” subroutine is correct.

Lemma 4.1 The subroutine merge(B,C), given two sorted lists, outputs a sorted list containing all of the elements in B and C.

Proof. All elements of B and C eventually get added to D because we only increment i when we add B[i] to D and similarly for j and C[j]. Now we just need to show that D is sorted. To do so, we claim that in the while loop of lines 5-11, each iteration, the smallest remaining element of B[i:end] and C[j:end] is added to D. This follows because B is sorted, so its minimum remaining element is located at B[i], and similarly for C[j], and we add the smaller of these to D. (If we have already added all elements of B, then \(i > len(B)\), so we add C[j]; and vice versa.) Because we always append the smallest remaining element to D, D is sorted.

Proposition 4.3 Mergesort correctly sorts the input.

Proof. By induction. For base cases, if len(A) is 0 or 1, mergsort(A) returns A, which is correct.

For the inductive case, let \(len(A) \geq 2\). The elements of A are partitioned into B and C in lines 7 and 8. Each of these is shorter than A, so by inductive hypothesis, B and C are correctly sorted by the recursive calls. By Lemma 4.1, merge(B,C) returns a sorted version of A, so mergesort is correct.

4.3.2 Runtime analysis

We again assume that subarrays can be referenced in constant time.

Lemma 4.2 merge(B,C) runs in time at most 10(len(B) + len(C)) + 4.

Proof. We first analyze the while loop, lines 5-11. Inside the loop, we take a constant number of steps, upper-bounded by about 10. How many iterations does the while loop take? Each iteration, we either increase i or j, but not both. We stop incrementing i when it reaches len(B) and we stop incrementing j when it reaches len(C). When both happen, the while loop completes. Therefore, there are len(B) + len(C) iterations of the while loop, so lines 5-11 take a total of at most 10(len(B) + len(C)) time. Adding 4 steps for lines 2-4 and line 12 gives a running time bound of 10(len(B) + len(C)) + 4.

Proposition 4.4 mergesort(A) runs in time \(O(n \log(n))\), where \(n = len(A)\).

Proof. We will use a similar strategy to binary search: we calculate how much work happens in each call to mergesort(), then we calculate the number of calls. However, we will see that this situation is a bit more complex.

It will be helpful to have an expression for the running time of mergesort on a given input size. Define T(n) to be the running time of mergesort on lists of length n.

First, how much work is done in mergesort(A) itself, ignoring recursive calls? The answer is about 6 steps plus the work done in merge(B,C), which gives a total (using the Lemma) of 10 + 10(len(B) + len(C)). We can observe that len(B) + len(C) = len(A), so this runtime is 10 + 10 len(A). To simplify analysis, let’s consider the case \(len(A) \geq 1\), so we can upper-bound this work by 10 + 10 len(A) \(\leq\) 20 len(A).

Now, we need to consider the recursive calls. To simplify the analysis, we will pretend that \(len(A) = 2^k\) for some integer \(k\), i.e. the length is a power of \(2\). That implies that, because we divide the array in half each time, every subarray we create throughout the algorithm has even length until the length reaches \(1\).

We recursively call mergesort twice, each on a list of length n/2. This gives the following recurrence relation:

\[\begin{align*} T(n) = 2T(n/2) + 20n \end{align*}\]

Here is how it relates to the code:

mergesort(A):                     # T(n) time total
   if len(A) <= 1:
      return A
   let i = int(len(A)/2)
   let B = mergesort(A[0:i])      # T(n/2)
   let C = mergesort(A[i+1:end])  # T(n/2)
   return merge(B, C)             # O(n)

Verbally, the running time of mergesort on a list of length \(n\) is equal to twice its running time on lists of length \(n/2\), plus an extra \(20n\) steps needed to merge the lists and complete the rest of the algorithm.

Solving the recurrence. Now the question is: what formula T(n) solves the recurrence? The method we will use is called the recursion tree method. We create a tree where each node represents a call to mergesort() and its two children represent the two recursive calls it makes. We can make the following key observations.

  • Each child node represents a call on an array of half the length of its parent. Therefore, a node at distance \(t\) from the root represents a call to mergesort() with an array of length \(len(A)/2^t = 2^{k-t}\).
  • Therefore, the height of the tree is \(k\), because after \(k\) halvings, we end up with an array of length \(1\) and make no more recursive calls.
  • The recursion tree is a complete binary tree, which implies that it has \(2^t\) nodes at distance exactly \(t\) from the root, for each \(t=0,\dots,k\).

We are now ready to add up the total time complexity by summing over the “layers” of the recursion tree. At each layer \(t=0,\dots,k\), there are \(2^t\) nodes. Each one does a total amount of work 10 + 10 len(input) = \(10 + 10 2^{k-t}\). Summing, the total time complexity is: \[\begin{align*} T(n) &= \sum_{t=0}^k 2^t \left( 20 \cdot 2^{k-t} \right) \\ &= 20 \sum_{t=0}^k 2^k \\ &= 20 k 2^k . \end{align*}\]

Recalling that \(n = len(A)\) and \(k = \log(n)\), we can rewrite this running time bound as \(T(n) = 20 n \log(n) = O(n \log(n))\).

As a note, we can check that \(T(n) = 20n \log(n)\) satisfies our original recurrence:

\[\begin{align*} 2T(n/2) + 20n &= 2 \left( 20(n/2)\log(n/2) \right) + 20n \\ &= 2 \left( 10n (\log(n) - 1) \right) + 20n \\ &= 2 \left( 10n \log(n) - 10n \right) + 20n \\ &= 20n \log(n) - 20n + 20n \\ &= 20n \log(n) . \end{align*}\]

4.4 Integer Multiplication

Let’s consider one more problem before we discuss how to analyze divide-and-conquer algorithms in general.

The Integer Multiplication Problem:

  • Input: two positive n-bit integers \(x,y\).
  • Output: the product \(xy\).

The “grade-school” algorithm is to multiply each bit of \(x\) by each bit of \(y\) and take the results, suitably shifted, and add them all up. We won’t go into to the analysis, but this takes \(O(n^2)\) time, as we must consider all pairs of bits.

Can we multiply faster than that? Famous mathematicians thought no. But in fact, we can, using a divide-and-conquer algorithm. First, we’ll look at an approach that is not ultimately faster, but paves the way.

Algorithm 4.4 (Divide-and-Conquer Multiplication)  

multiply(x, y):         # assume both have n bits
   let n = len(x)
   if n <= 1:
      return x*y
   let x0,x1 = split(x)
   let y0,y1 = split(y)
   a = multiply(x1,y1)
   b = multiply(x1,y0)
   c = multiply(x0,y1)
   d = multiply(x0,y0)
   return a*2^n + (b+c)*2^(n/2) + d

// Divide an integer into upper and lower halves
split(x):
   let x0 = x % 2^(n/2)     // % is the remainder operation
   let x1 = int(x/2^(n/2))  // int() rounds down
   return x0, x1

Let’s take correctness as given and analyze this algorithm using the recurrence-tree method. Let \(T(n)\) be the running time of multiply(). The split() subroutine is \(O(n)\) time using bit operations. Line 11 is actually \(O(n)\) as well, because adding is \(O(n)\) and multiplying by a power of two is just a bit shift, so also \(O(n)\). That leaves lines 7-10. Each calls multiply() on inputs of half the original size, so they take \(T(n/2)\) time each.

\[\begin{align*} T(n) &= 4 T(n/2) + O(n) . \end{align*}\]

Let \(n = 2^k\). The tree method gives that at each level \(t=0,1,\dots,\log(n)\), there are \(4^t\) nodes and each node does \(n/2^t = 2^{k-t}\) work.

\[\begin{align*} T(n) &= \sum_{t=0}^{k} 4^t (2^{k-t}) \\ &= \sum_{t=0}^{k} 2^{2t} 2^{k-t} \\ &= \sum_{t=0}^{k} 2^{t+k} \\ &= 2^k \sum_{t=0}^{k} 2^t \\ &= 2^k (2^{k+1} - 1) \\ &\approx (2^k)^2 \\ &= n^2 . \end{align*}\] A more rigorous proof wouldn’t hurt, but we conclude the algorithm is \(O(n^2)\). No faster! However, a clever observation allows us to speed it up asymptotically. We’ll see this next.

4.4.1 Karatsuba’s multiplication algorithm

We know that addition and subtraction of \(n\)-bit numbers are \(O(n)\) time operations. Asymptotically, these are much, much cheaper than multiplication, which is \(O(n^2)\) as far as we know. Therefore, we should be willing to do a few extra additions and subtractions if they can help us save even one multiplication. That is the idea behind the next algorithm.

Remember that \(x = 2^n (x_1) + x_0\) and \(y = 2^n (y_1) + y_0\), where \(x_1,x_0,y_1,y_0\) are each \(n/2\) bits. We want to compute \(xy = (2^n x_1 + x_0)(2^n y_1 + y_0) = 2^{2n} x_1y_1 + 2^{n}(x_1y_0 + x_0y_1) + x_0y_0\). Let’s take for granted that we’ll need to compute \(x_1y_1\) and \(x_0y_0\). Can we get the middle term any easier?

The key observation is that we can compute the following product of \(n/2\) bit numbers: \((x_1 + x_0)(y_1 + y_0) = x_1y_1 + x_0y_1 + x_1y_0 + x_0y_0\). Subtracting off \(x_1y_1\) and \(x_0y_0\) leaves us with the middle term we need, \(x_0y_1 + y_0x_1\).

Algorithm 4.5 (Karatsuba Multiplication)  

multiply(x, y):         # assume both have n bits
   let n = len(x)
   if n <= 1:
      return x*y
   let x0,x1 = split(x)
   let y0,y1 = split(y)
   a = multiply(x1,y1)
   d = multiply(x0,y0)
   x_sum = x1 + x0
   y_sum = y1 + y0
   product = multiply(x_sum, y_sum)
   b = a + d - product
   return a*2^n + b*2^(n/2) + d

We now show that Karatsuba’s algorithm runs faster than the \(O(n^2)\) algorithms above.

Proposition 4.5 Karatsuba multiplication runs in time \(O(n^c)\) where \(c = \log_2(3) \approx 1.58.\)

Proof. We first need to write the recurrence. We have:

multiply(x, y):                     # T(n) steps
   let n = len(x)
   if n <= 1:                       # O(1)
      return x*y                    # O(1)
   let x0,x1 = split(x)             # O(n)
   let y0,y1 = split(y)             # O(n)
   a = multiply(x1,y1)              # T(n/2)
   d = multiply(x0,y0)              # T(n/2)
   x_sum = x1 + x0                  # O(n)
   y_sum = y1 + y0                  # O(n)
   product = multiply(x_sum, y_sum) # T(n/2)
   b = a + d - product              # O(n)
   return a*2^n + b*2^(n/2) + d     # O(n)

We can write this recurrence as: \[\begin{align*} T(n) &= 3 T(n/2) + O(n) . \end{align*}\] It turns out that the constants on the \(O(n)\) don’t matter for the big-O solution to the recurrence. We can analyze it with the tree method. Just as with mergesort, there are \(k = \log_2(n)\) levels of the tree, because we cut the input size in half at each level. But now, there are \(3^t\) nodes at level \(t=0,1,\dots,\log_2(n)\). In each node at level \(t\), we have an input size of \(n/2^t\), so we do at most \(C n/2^t\) work at that node for some \(C\). The total is: \[\begin{align*} T(n) &= \sum_{t=0}^{k} 3^t C \frac{n}{2^t} \\ &= Cn \sum_{t=0}^k \left(\frac{3}{2}\right)^t \\ &= C' n \left(\frac{3}{2}\right)^{k} & \text{for some $C'$ (geometric series)} \\ &= C' 3^k \\ &= C' 2^{k \log_2(3)} \\ &= C' n^{\log_2(3)} . \end{align*}\] We used that the sum of an increasing geometric series, in this case \(\frac{3}{2} + \left(\frac{3}{2}\right)^2 + \cdots + \left(\frac{3}{2}\right)^k\), is bounded by a constant times its final element.

There are even faster algorithms than Karatsuba’s for integer multiplication. They are also based on the idea of divide and conquer.