Advanced DP

The seven classic patterns assume your state fits a simple index or a pair. But what if the state is “which subset have I picked”? Or “which interval am I solving”? Or “this node and all its descendants”? These don’t break DP — they just need a cleverer state. This page is the four state shapes (bitmask, interval, tree, and squeezed-memory) that unlock the hard half of DP problems.

The seven classic patterns will carry you through most interviews. The five techniques on this page are the next layer — they appear in the harder questions, the contest problems, and the “now do it under tight memory” follow-ups.

1. Space optimization

The single highest-leverage trick in DP. Most table cells are read once or twice after they’re written; almost none need to live for the whole computation. If you can identify the frontier — the set of cells your current row depends on — you can throw away everything else.

The “two-row” pattern

If dp[i][j] only depends on dp[i - 1][...] (the previous row), keep two rows: prev and cur. After each row is filled, swap them.

For LCS (dp[i][j] depends on dp[i-1][j-1], dp[i-1][j], dp[i][j-1]) the two-row form is:

O(n × m) time stays the same; O(n × m) space drops to O(m).

The “one-row” pattern (reverse iteration)

For knapsack-style problems where dp[i][w] depends only on dp[i - 1][w] and dp[i - 1][w - weight[i]], you can use a single row if you iterate the inner loop in reverse. The reverse order ensures that when you read dp[w - weight[i]], it still holds the value from “row i - 1” — not yet updated.

dp = [0] * (W + 1)
for i in range(n):
    for w in range(W, weight[i] - 1, -1):    # reverse
        dp[w] = max(dp[w], dp[w - weight[i]] + value[i])

O(n × W) time, O(W) space. Iterate forward instead and you’ve silently turned 0/1 knapsack into unbounded knapsack — same code, different problem.

When you can’t optimize

If your transition reaches more than the previous row — e.g. dp[i][j] depends on dp[i - 3][j] or dp[i - k][j] for variable k — you have to keep at least k + 1 rows around. Rule of thumb: the space you need is the maximum row-distance your transition reaches.

2. Bitmask DP

When the state needs to track which elements have been chosen from a small universe (n <= ~22), encode the subset as the bits of an integer. The state is then (mask, ...). There are 2^n masks, so this works only for small n.

Canonical: Traveling Salesman (Held–Karp)

n cities, distance d[i][j] between every pair. Find the shortest tour visiting every city once and returning to the start.

State: dp[mask][i] = min cost of a path that has visited exactly the cities in mask and currently sits at city i. Transition: dp[mask][i] = min(dp[mask ^ (1 << i)][j] + d[j][i]) over j in mask \ {i}.

Time: O(2^n × n²). Space: O(2^n × n). The classic Held–Karp algorithm — exponential, but a huge improvement over naive O(n!).

Same skin

Shortest Path Visiting All Nodes — BFS over (mask, node) states (or DP).
Number of Ways to Wear Different Hats — dp[mask] over people-mask, iterating hats.
Partition to K Equal Sum Subsets — dp[mask] over picked-elements.
Find the Shortest Superstring — dp[mask][i] where i is the last string in the supersequence.

3. Interval DP

State is dp[l][r] — the answer over the sub-array (or substring) from index l to index r. Transitions usually try every split point k in (l, r). Iteration order is by length of the interval, not by position.

Canonical: Matrix Chain Multiplication

You have matrices A1, A2, ..., An with dimensions p[0] x p[1], p[1] x p[2], etc. Find the cheapest parenthesization.

State: dp[l][r] = min multiplications to compute Al × ... × Ar. Transition: dp[l][r] = min over k in [l, r) of dp[l][k] + dp[k + 1][r] + p[l - 1] × p[k] × p[r].

Recall the heart of interval DP — the split that combines two shorter intervals:

python · fill in the blanks0/1 hints

for length in range(2, n + 1):          # fill by INTERVAL LENGTH, shortest first
  for l in range(1, n - length + 2):
      r = l + length - 1
      dp[l][r] = float('inf')
      for k in range(l, r):                       # try every split point k
          # ??? cost of splitting [l, r] at k
          dp[l][r] = min(dp[l][r], cost)

Time: O(n³). Space: O(n²).

Same skin

Burst Balloons — dp[l][r] = max coins from popping balloons in the open interval (l, r).
Palindrome Partitioning II — min cuts; uses interval-style preprocessing.
Stone Game / Predict the Winner — game-theory variant of interval DP.
Longest Palindromic Subsequence — dp[l][r] = LPS of s[l..r].

4. DP on Trees

State lives on tree nodes, transitions combine answers from a node’s children. Almost always solved with post-order DFS — compute children first, then combine.

Canonical: House Robber III

Same rules as House Robber, but the houses form a binary tree. You can’t rob a node and either of its children.

State: for each node, return a pair (rob_this, skip_this). Transition:

rob_this = node.val + skip_left + skip_right
skip_this = max(rob_left, skip_left) + max(rob_right, skip_right)

Predict first

House Robber III runs on a tree, and the DFS returns a PAIR (rob, skip) for each node — not a single 'best for this subtree.' Why does it need both numbers?

Time: O(n) — each node visited once. Space: O(h) recursion stack.

Same skin

Binary Tree Maximum Path Sum — at each node, return “best one-armed path through me”; track global best with both arms.
Diameter of a Tree — same shape, length instead of sum.
Longest Path with Different Adjacent Characters — node value is a constraint, not a sum.
Tree Distance Sum / Rerooting — two DFS passes — first computes “answer rooted at root,” second re-roots to every node.

5. Subsequence vs Substring — the eternal pitfall

These two words look almost identical and produce wildly different problems. Burn the distinction in.

Term	Definition	DP shape
Subsequence	Pick elements in order, but can skip any of them.	`dp[i][j]` carries forward through mismatches with `max`.
Substring	A contiguous slice — no skipping.	`dp[i][j]` resets to 0 on mismatch; answer is max over the whole table.

LCS finds the longest common subsequence: dp[i][j] = max(dp[i-1][j], dp[i][j-1]) on mismatch. Longest Common Substring finds the longest common contiguous slice: dp[i][j] = 0 on mismatch, and the answer is max(dp[i][j]) over the whole table, not dp[n][m].

Same for palindromic subsequence (O(n²) LPS via LCS of s and reverse(s)) vs palindromic substring (different DP — dp[l][r] = (s[l] == s[r]) and dp[l + 1][r - 1], expand-around-center, or Manacher’s).

When the problem says “subsequence” reach for the inclusion-exclusion-style DP. When it says “substring” or “subarray” reach for the contiguous-reset DP.

6. The DP optimization toolkit (preview)

Three named tricks that show up at the harder end of contests:

Knuth’s optimization — for interval DP with monotonic split points. Drops O(n³) to O(n²).
Divide and Conquer optimization — for DPs where the optimal split point is monotonic in the row index. Same O(n³) → O(n²) shape.
Convex Hull Trick — for DPs of the form dp[i] = min(dp[j] + b[j] × a[i] + ...) with linear-in-a[i] cost. Reduces an O(n²) DP to O(n log n) or O(n).

These are out of scope for most interviews — but worth knowing they exist when you hit a DP that’s TLE-ing at n = 10^5.

Quick check

In 0/1 knapsack space-optimized to one row, why must the inner loop iterate capacity in REVERSE?

You see n ≤ 20 and the problem involves choosing a subset of n elements. What DP shape should you try?

Why is the iteration order for interval DP 'by length, not by left endpoint'?

That’s the full DP toolkit — the seven shapes plus these four advanced state forms. Next: basic warm-up questions to drill the templates (climb stairs, min cost stairs, fib with memo, house robber), then the practice set. After DP comes Day 15 — Greedy, the other “make a choice at each step” technique — and the discipline of proving when a greedy choice is actually safe.

Classic Patterns Basic Questions

Finished this page?