Memoization
Naive recursion has a dirty secret: it often recomputes the same subproblem over and over. Memoization is the fix — and it’s the single most powerful trick in your recursion toolbox.
The naive Fibonacci disaster
Here’s the textbook recursive Fibonacci:
def fib(n):
if n <= 1:
return n
return fib(n - 1) + fib(n - 2)It’s beautifully short. It’s also catastrophically slow. Computing fib(40) takes seconds. fib(50) takes minutes. fib(60) would take hours.
Why? Look at the call tree.
Count how many times fib(2) appears. Now imagine fib(40) — there are more than 300 million redundant calls. We’re solving the same tiny subproblems trillions of times.
The recurrence T(n) = T(n-1) + T(n-2) + O(1) is itself the Fibonacci sequence — so the runtime grows like Fibonacci does: O(φ^n) ≈ O(1.618^n), which is essentially O(2^n). Exponential disaster for what should be a 60-cycle problem.
The fix in two lines
The insight: fib(2) always returns 1, no matter how many times we ask. So cache it.
def fib(n, memo=None):
if memo is None: memo = {}
if n <= 1:
return n
if n in memo: # already computed?
return memo[n]
memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
return memo[n]Two new lines. The runtime drops from O(2^n) to O(n). fib(60) now takes microseconds instead of millennia.
That’s memoization: cache the result of every subproblem the first time you compute it; reuse the cache on every later call.
What the call tree looks like now
With memoization, each unique subproblem is computed exactly once. The “tree” collapses to something closer to a linear chain — every call after the first hits the cache and returns immediately.
fib(5)
├── fib(4)
│ ├── fib(3)
│ │ ├── fib(2)
│ │ │ ├── fib(1) → 1 (base)
│ │ │ └── fib(0) → 0 (base)
│ │ └── fib(1) → 1 (cache hit, if fib(1) seen before)
│ └── fib(2) (CACHE HIT — return memo[2] instantly)
└── fib(3) (CACHE HIT — return memo[3] instantly)Total real computations: n+1 (one for each unique input from 0 to n). Each takes O(1) work outside the recursive calls. Total: O(n).
Memoized factorial and Fibonacci in all three languages
Python’s functools.lru_cache is a one-line decorator that turns any pure recursive function into a memoized one. It’s the cleanest way to memoize in any language. C++ and Java need a bit more boilerplate but the idea is identical.
When memoization helps (and when it doesn’t)
Memoization is a free 100x speedup when your recursion has overlapping subproblems — the same call happens many times with the same arguments.
It’s useless (or wasteful) when subproblems don’t overlap:
| Algorithm | Overlapping subproblems? | Memoize? |
|---|---|---|
| Naive Fibonacci | Yes — every call repeats | ✅ |
| Climbing stairs (Fib-shaped) | Yes | ✅ |
| Coin change (count ways) | Yes | ✅ |
| 0/1 knapsack | Yes | ✅ |
| Edit distance | Yes | ✅ |
| Factorial | No — factorial(3) only ever appears once | ❌ |
| Merge sort | No — every subarray distinct | ❌ |
| Binary search | No | ❌ |
| Permutations / Subsets generation | No | ❌ |
The rule of thumb: if you draw the recursion tree and see the same node label appearing in multiple places, memoize.
Two flavors of dynamic programming
Memoization is the top-down approach to dynamic programming:
- Top-down (memoized recursion): start with the original problem, recurse down, cache as you go. The function is the same shape as the recursive definition — just with a cache layer.
- Bottom-up (tabulation): start with the smallest subproblems, build up to the answer in a loop. No recursion, no stack.
For Fibonacci, the bottom-up version looks like this:
Same O(n) time, same O(n) space. No recursion at all — and you can even drop the space to O(1) by keeping only the last two values.
| Approach | Pros | Cons |
|---|---|---|
| Top-down memo | Mirrors the recursive definition. Easy to write. | Stack-frame overhead. Risk of overflow on deep recursion. |
| Bottom-up tab | No recursion, no stack overflow risk. | Need to know the right order to fill the table. Less natural for tree-shaped subproblems. |
For interviews, either is fine — leading with memoized recursion is usually clearer, and you can mention bottom-up as a follow-up optimization.
Memoization checklist
Before adding memoization, verify:
- The function is pure — same inputs always produce the same output, no side effects beyond the cache.
- The subproblem space is bounded — at most
O(n),O(n²), etc. unique inputs, not infinite. - The arguments are hashable — for dictionary-based caches.
- You actually see overlap in the call tree.
If any of those fail, memoization won’t help — or worse, it’ll cache garbage.
A bigger example: counting paths in a grid
How many distinct ways are there to walk from the top-left to the bottom-right of an m × n grid, moving only right or down?
The recurrence: paths(i, j) = paths(i - 1, j) + paths(i, j - 1). Base cases: paths(0, j) = paths(i, 0) = 1.
Without memoization, the recursion tree branches twice per call — exponential. With memoization, every (i, j) is computed once — O(m·n).
This is the doorway to dynamic programming, which we’ll cover properly later in the course. The takeaway for today:
If your recursion is slow because of repeated subproblems, the fix is almost always a cache.
Ready to apply this on real problems? Head to the Practice Questions.