Word Ladder hard
Description
Given two words beginWord and endWord, and a wordList, return the length of the shortest transformation sequence from beginWord to endWord such that:
- Only one letter can be changed at a time.
- Every transformed word must exist in the
wordList.
Return 0 if no such sequence exists. (The length counts both endpoints — hit → hot → dot → dog → cog has length 5.)
Examples
> Case 1:
Input: beginWord = "hit"
endWord = "cog"
wordList = ["hot", "dot", "dog", "lot", "log", "cog"]
Output: 5
# hit → hot → dot → dog → cog (length 5)
> Case 2:
Input: beginWord = "hit"
endWord = "cog"
wordList = ["hot", "dot", "dog", "lot", "log"]
Output: 0
# "cog" not in wordList → no valid sequenceConstraints
1 <= beginWord.length <= 10endWord.length == beginWord.length1 <= wordList.length <= 5000- All words are lowercase English letters.
Recognizing the graph
The trick is to see the graph:
- Each word is a node.
- Two words are connected by an edge if they differ by exactly one letter.
- The problem asks for the shortest path from
beginWordtoendWord.
“Shortest path on an unweighted graph” — that’s BFS.
The naive way to build the graph
For every pair of words, check if they differ by one letter. That’s O(W² · L) where W = word count, L = word length. For W = 5000, L = 10, that’s 250 million operations. Slow.
The clever way: pattern keys
For each word, generate “patterns” by replacing each character with *:
"hot" → ["*ot", "h*t", "ho*"]
"dot" → ["*ot", "d*t", "do*"]
"dog" → ["*og", "d*g", "do*"]Two words are neighbours iff they share at least one pattern. Build a Map<pattern, [words]> in O(W · L²) time, then look up neighbours in O(L) per word.
For W = 5000, L = 10, this is 500,000 operations. 500Ă— faster.
Code (BFS with pattern keys)
Why clearing patterns[key] after use? Once we’ve processed every word for a given pattern, we never need that pattern again — the visited set will reject those words on future visits anyway. Clearing the list avoids re-scanning it, a small but worthwhile optimization.
The bigger trick: bidirectional BFS
For very large word lists, bidirectional BFS halves the search depth: explore simultaneously from beginWord and endWord, alternating the smaller frontier each step. The two searches meet somewhere in the middle.
If standard BFS explores O(b^d) nodes (branching factor b, distance d), bidirectional BFS does O(2 · b^(d/2)) — exponentially less for large d. It’s the same principle that makes meet-in-the-middle algorithms so powerful.
We won’t implement it here — but if your interviewer says “the input could be huge,” that’s the magic phrase.
Why this problem is the canonical BFS interview question
It’s the perfect test of recognizing a graph hidden in non-graph language:
- The problem says “transform words by changing letters.”
- The candidate has to recognize: that’s just edges. Words are nodes. This is a graph.
- And then: “shortest sequence” + each step costs 1 = BFS.
Spot both leaps and the implementation is mechanical.
Analysis
- Time: O(W · L²) to build patterns + O(W · L²) for BFS (each word is visited once, generating L patterns of length L). Net: O(W · L²).
- Space: O(W · L²) for the pattern map plus O(W) for visited.