Introduction to Backtracking
Backtracking is the algorithm pattern you use when there’s no clever recurrence, no greedy swap, no DP table — when the only way forward is to actually try the possibilities, but in a way that doesn’t take forever.
Concretely: you grow a partial solution one step at a time. At each step you have a small set of legal choices. You pick one, recurse, and either reach a complete solution (success!) or hit a dead end. When you hit a dead end, you undo the last choice and try the next legal one. That undo step — the “back” in backtracking — is the whole game.
The decision tree
Every backtracking problem implicitly defines a tree of partial solutions. The root is the empty solution. Each child is one more choice on top of its parent. The leaves are either complete solutions or dead ends.
Take generating all subsets of [1, 2, 3]. At every element, we choose to include it or skip it:
Eight leaves, eight subsets. The “tree” exists only in the call stack — you don’t materialize it, you walk it.
This is the universal shape. Backtracking is depth-first walking of a decision tree, with an undo between siblings.
Brute force vs backtracking — the difference
Both try every possibility. The difference is when they detect that a partial solution is doomed.
- Brute force builds a complete candidate, then checks if it’s valid. For an
8 × 8board, that’s64⁸ ≈ 280 trillioncomplete configurations to check for N-Queens. Hopeless. - Backtracking checks validity as it builds. The moment a partial solution can’t possibly extend to a valid full solution, it bails out and tries something else. For N-Queens, this trims the search from trillions to fewer than 2000 nodes.
The technical name for this trick is pruning, and we’ll spend an entire page on it. But the intuition is: fail fast, fail high in the tree, never explore a subtree you can prove is dead.
Without pruning, backtracking is just slow brute force with a fancy name. With pruning, it’s one of the most powerful general-purpose search techniques in your toolbox — solving puzzles in microseconds that brute force can’t finish in a millennium.
When to reach for backtracking
The tell-tale signs:
- “Find all …” — all subsets, all permutations, all valid combinations, all paths. The output is itself a list of solutions, so you have to enumerate them.
- “Does there exist …” — does there exist a valid Sudoku completion? A way to color the graph? A path through the maze? The state space is too irregular for DP, but the partial-state check is cheap.
- “Smallest / largest valid configuration …” — N-Queens with the minimum number of pieces, the shortest word ladder, the smallest Sudoku puzzle with a unique solution.
- Constraints are local — “no two queens on the same diagonal,” “no repeated character,” “every box of nine cells contains 1–9.” Local constraints make pruning cheap and effective.
nis small — usuallyn <= 20for unrestricted problems, sometimesn <= 100for problems with heavy pruning. Ifnis large, the only DP, greedy, or polynomial-time algorithm will work.
If two or three of these match, reach for the template.
Backtracking vs DP vs greedy — pick the right hammer
A common point of confusion. Here’s the disambiguation:
| Technique | When subproblems repeat | When optimal substructure | When you need ALL solutions |
|---|---|---|---|
| DP | Yes — caching pays off | Yes — combine subanswers | Awkward — DP returns ONE answer or a count |
| Greedy | Doesn’t matter — never reconsiders | Yes, plus exchange argument | No — commits early |
| Backtracking | No — every branch is distinct | Not required — just feasibility | Yes — its natural output |
DP and backtracking are not enemies. Memoized backtracking is just DP written top-down with extra branching — when subproblems repeat and you want one optimum, lean toward DP. When they don’t repeat or you need every valid solution, lean toward backtracking.
A worked example end-to-end: generate all subsets
Given a set of distinct integers
nums, return all possible subsets (the power set).
Five questions to set up any backtracking problem:
What’s the partial solution?
A list of the elements picked so far — call it path. Starts empty.
What’s a complete solution?
Once we’ve made a decision about every element of nums, path is one valid subset. We record a copy of it.
What are the choices at each step?
At index i, we have two choices: include nums[i] in path, or skip it.
What’s the “is this legal?” check?
Trivial — there are no constraints on subsets, every choice is legal. Pruning won’t help here.
What’s the un-do?
After recursing with nums[i] included, pop it off path to restore state before trying the skip branch.
The code transcribes the answers:
Time: O(2^n × n) — there are 2^n subsets and copying each one takes O(n).
Space: O(n) recursion stack plus O(2^n × n) for the output.
That’s the whole template. The next page formalizes it into the three-line skeleton you can drop into every problem on this page.
The two outputs: “all of them” vs “the best one”
Backtracking problems split into two flavors:
- Enumerate: return all valid solutions. Code reaches a leaf, records a copy of
path, returns up the tree to find more. Subsets, permutations, combinations, generate-parentheses. - Search: return one solution (or the optimum), then stop. Code returns
trueupward through the call stack as soon as a leaf works. N-Queens with a single solution, Sudoku, word-ladder reachability.
The template is the same for both — what changes is what happens at the leaf:
on success:
enumerate flavor → record path, keep searching
search flavor → return true up the stack, stop exploringKnowing which flavor a problem asks for tells you whether to return early or keep going. Read the problem statement carefully.
Quick check
Next: the template — the three-line skeleton, plus the four canonical shapes (subsets, permutations, combinations, partitions) built on top of it.