Word Ladder hard
Description
A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence beginWord -> s_1 -> s_2 -> ... -> s_k such that:
- Every adjacent pair of words differs by a single letter.
- Every
s_ifor1 <= i <= kis inwordList(note:beginWorddoes NOT need to be inwordList). s_k == endWord.
Given two words beginWord and endWord, and a dictionary wordList, return the number of words in the shortest transformation sequence (including both endpoints), or 0 if none exists.
Examples
> Case 1:
beginWord = "hit", endWord = "cog"
wordList = ["hot","dot","dog","lot","log","cog"]
Output: 5
# hit → hot → dot → dog → cog (5 words)
> Case 2:
beginWord = "hit", endWord = "cog"
wordList = ["hot","dot","dog","lot","log"]
Output: 0 # endWord not in dictionaryConstraints
1 <= beginWord.length <= 10endWord.length == beginWord.length1 <= wordList.length <= 5000- All words consist of lowercase English letters.
- All words in
wordListare unique.
State design
The implicit graph:
- Vertices = words.
- Edges = between words that differ by exactly one letter.
- All edges have weight 1 → BFS finds shortest distance.
The hard part is edge generation. Building adjacency by comparing every pair of words is O(N² · L) where N = len(wordList) and L = word length — slow for the constraints.
The pattern trick. For each word, enumerate its L “wildcard patterns” by replacing each character with *. Words that share a pattern differ by one letter. Bucket words by pattern in a hash map — adjacency lookup is then O(L).
E.g., "hot" has patterns *ot, h*t, ho*. So does "hat" (*at, h*t, ha*) — they share h*t, so they’re neighbors.
Code
Analysis
- Time:
O(N · L²)—Nwords, each generatesLpatterns, each pattern lookup costsO(L)for substring construction. - Space:
O(N · L²)for the pattern map.
Bidirectional BFS — the speedup
For the largest inputs, bidirectional BFS is dramatically faster: run BFS from both beginWord and endWord simultaneously, alternating which side to expand (always expand the smaller frontier). When the two searches meet, you have the shortest path. The branching factor drops from b^d to 2 · b^(d/2) — a square-root speedup.
Standard interview implementation is the simple one-directional BFS above. Bidirectional is worth mentioning as a follow-up.
The pattern hash is the lesson, not the BFS. Word ladder is a graph problem in disguise; recognizing it as BFS is step one. But the efficient adjacency construction — wildcards bucketed by hash — is the technique you’ll actually carry to other implicit-graph problems (lock combinations, one-character mutations, etc.).
Same skin
- Word Ladder II — return all shortest transformation sequences, not just the length. BFS to find layered distances, then DFS backwards.
- Open the Lock — BFS on 4-digit codes where neighbors differ by
±1in one digit; same shape, smaller state space. - Minimum Genetic Mutation — identical problem, different cover.