Introduction to Divide and Conquer

You have two algorithms that both “split the array in half and recurse.” One sorts n numbers in O(n log n); the other finds a value in O(log n). Same split, same recursion — so why is one a log faster than the other? Until you can answer that in one sentence, you don’t actually understand divide and conquer.

Divide and conquer is recursion with a particular flavor: the recursive calls operate on independent, non-overlapping sub-instances of the original problem. There’s no shared work between the subcalls — each one solves a distinct piece, and the combine step stitches the pieces.

That non-overlapping property is what separates D&C from dynamic programming. DP exploits repeated subproblems; D&C exploits independent ones. Same recursive shape, opposite philosophical move.

The signature shape

Every D&C algorithm looks like this:

solve(problem):
    if problem is small enough:
        return brute_force(problem)               # base case
    a, b = split(problem)                          # divide
    answer_a = solve(a)                            # conquer
    answer_b = solve(b)                            # conquer
    return combine(answer_a, answer_b, problem)    # combine

Three knobs to tune problem-to-problem:

How do I split? (Usually in half. Sometimes by pivot value. Sometimes by digit position. Sometimes “all binary splits.”)
What’s the base case? (Usually n <= 1.)
How do I combine? (This is where the algorithm lives. Merge sort’s combine is “merge two sorted arrays.” Quicksort’s combine is “do nothing — the split already sorted everything across the boundary.”)

A worked example: merge sort viewed as D&C

Sort an array of n integers.

Split: cut the array in half.
Conquer: recursively sort each half.
Combine: merge the two sorted halves into one sorted result.

Merge sort recursion tree on n=8 — each node splits its work into two halves

■ recursive call   ■ base case — 15 total calls

The recurrence:

T(n) = 2 T(n / 2) + O(n)

Two recursive calls on n / 2 each, plus an O(n) merge step. By the Master Theorem (next page), this is O(n log n).

A different worked example: binary search

Find a target in a sorted array of length n.

Split: look at the middle element. Either it’s the target, or the target is strictly in the left half, or strictly in the right half.
Conquer: recurse on the correct half.
Combine: nothing — just return the answer from the conquer step.

Predict first

Merge sort's recurrence is T(n) = 2 T(n/2) + O(n). Binary search splits in half too. Before reading on — what's binary search's recurrence, and does it have a 1 or a 2 in front of T(n/2)?

The recurrence:

T(n) = T(n / 2) + O(1)

Only one recursive call (we discard the other half), plus a constant comparison. That gives O(log n).

Binary search and merge sort have the same “split in half” division, but wildly different runtimes because:

Merge sort recurses on both halves (the 2 in 2 T(n/2)).
Binary search recurses on one half (the 1 in T(n/2)).
Merge sort’s combine is O(n); binary search’s is O(1).

Same template, different parameters, different complexity. That’s the unifying lesson.

The cost-of-combine analysis

A useful mental shortcut for estimating runtime before you write any code:

Subcalls	Each subproblem size	Combine cost	Total
1	`n / 2`	`O(1)`	`O(log n)`
1	`n / 2`	`O(n)`	`O(n)`
2	`n / 2`	`O(1)`	`O(n)`
2	`n / 2`	`O(n)`	`O(n log n)`
2	`n / 2`	`O(n²)`	`O(n²)`
2	`n / 2`	`O(n³)`	`O(n³)`
8	`n / 2`	`O(n²)`	`O(n³)`
7	`n / 2`	`O(n²)`	`O(n^log₂ 7) ≈ O(n^2.807)` (Strassen)

The cost-of-combine is usually what you control as the algorithm designer. Merge sort’s combine is the entire reason it’s O(n log n) and not O(n²). Strassen beat the natural O(n³) for matrix multiplication by inventing a clever combine step that used 7 sub-multiplies instead of 8 — that single optimization changed the asymptotic exponent.

The relationship to recursion

D&C is recursion. Specifically:

Recursion = any function that calls itself. No constraints on subproblem structure.
D&C = recursion where the subproblems are independent (no shared work) and smaller versions of the original (same problem, smaller input).
DP = recursion where the subproblems overlap and you cache the answers.

If you write a recursive function and the subcalls repeat, memoize → it’s DP. If they don’t repeat → it’s D&C. If you can’t tell, it’s still a valid algorithm; the label is just a teaching aid.

The most important sentence in this chapter: D&C is what you get when you take recursion seriously about subproblem independence. Merge sort’s left-half-sort doesn’t depend on anything the right-half-sort does, so we can solve them in parallel (in fact, merge sort parallelizes embarrassingly well). DP’s subproblems share work, so they have to be ordered. D&C’s don’t — that’s the freedom you trade for not getting memoization for free.

When D&C is the right tool

The tell-tale signs:

The problem has a natural binary split — sorted array (midpoint), tree (left subtree / right subtree), interval (split point), grid (quadrants).
The combine step is doable in time bounded by f(n) where the recurrence still gives a reasonable bound — typically O(n) for O(n log n) algorithms, O(1) for O(log n) algorithms.
The subproblems don’t repeat (otherwise you’re really doing DP and should memoize).
The answer for the whole is constructible from the answers for the parts (some way).

When D&C is not the right tool

The combine step is too expensive — if the recurrence comes out worse than just doing it iteratively, skip the recursion overhead.
The subproblems massively overlap — you’d be re-computing the same work in different branches. Memoize (or use iterative DP).
The problem is inherently sequential — running the recursion adds nothing.

A worked example end-to-end: Maximum Subarray (D&C)

Find the contiguous subarray with the maximum sum.

The standard solution is Kadane’s algorithm — an O(n) 1D DP we covered on Day 14. But Maximum Subarray is also the textbook D&C exercise.

Three observations make D&C work:

The max subarray is either entirely in the left half, entirely in the right half, or crosses the midpoint.
The left-half and right-half cases are recursive subproblems.
The crossing case requires extra work: find the best suffix of the left half plus the best prefix of the right half, both anchored at the midpoint. That’s O(n).

def max_subarray(nums):
    def solve(lo, hi):
        if lo == hi: return nums[lo]
        mid = (lo + hi) // 2
        left  = solve(lo, mid)
        right = solve(mid + 1, hi)
        # crossing
        left_max = float('-inf'); s = 0
        for i in range(mid, lo - 1, -1):
            s += nums[i]; left_max = max(left_max, s)
        right_max = float('-inf'); s = 0
        for i in range(mid + 1, hi + 1):
            s += nums[i]; right_max = max(right_max, s)
        cross = left_max + right_max
        return max(left, right, cross)
    return solve(0, len(nums) - 1)

Recurrence: T(n) = 2 T(n / 2) + O(n) → O(n log n).

The recursion on the two halves is the easy part. The line that makes this a correct D&C — not just “best of left, best of right” — is the combine. Rebuild it from memory:

python · fill in the blanks0/1 hints

def max_subarray(nums):
  def solve(lo, hi):
      if lo == hi: return nums[lo]
      mid = (lo + hi) // 2
      left  = solve(lo, mid)
      right = solve(mid + 1, hi)
      # best suffix of left half + best prefix of right half, anchored at mid
      left_max, s = float('-inf'), 0
      for i in range(mid, lo - 1, -1):
          s += nums[i]; left_max = max(left_max, s)
      right_max, s = float('-inf'), 0
      for i in range(mid + 1, hi + 1):
          s += nums[i]; right_max = max(right_max, s)
      # ??? the optimal subarray is one of THREE things
  return solve(0, len(nums) - 1)

If you blanked it to return max(left, right) you’d silently drop every answer that straddles the midpoint — the single subtlest bug in D&C max subarray.

Worse than Kadane’s O(n). So why teach the D&C version? Because:

The framing — “left, right, or crossing” — generalizes to problems where there isn’t a Kadane-style trick.
It’s a classic interview question precisely because it tests whether you can think D&C-shaped.
The crossing-step idea (anchor at the midpoint, scan outward) reappears in Closest Pair of Points, in the merge step of merge-sort-based inversion counting, and elsewhere.

Quick check

What's the defining property that distinguishes D&C from DP?

Binary search uses the same 'split in half' division as merge sort but runs in O(log n) instead of O(n log n). Why?

In Maximum Subarray D&C, why is the 'crossing' case necessary?

This sharpens the recursion you already know (Day 14) into a named pattern: recursion with independent subproblems. The “1 vs 2 recursive calls, cheap vs expensive combine” knobs you just met are the only two dials that set every D&C runtime. Next: the template — the split/recurse/combine skeleton and the four recurrence shapes those dials produce, so you can read a runtime off a problem before writing any code.

Overview The Template

Finished this page?