🚀 Phases 1–5 are live — Days 1–17 cover the foundations and the algorithmic patterns. See the roadmap →

Word Ladder hard

Description

A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence beginWord -> s_1 -> s_2 -> ... -> s_k such that:

  • Every adjacent pair of words differs by a single letter.
  • Every s_i for 1 <= i <= k is in wordList (note: beginWord does NOT need to be in wordList).
  • s_k == endWord.

Given two words beginWord and endWord, and a dictionary wordList, return the number of words in the shortest transformation sequence (including both endpoints), or 0 if none exists.

Examples

> Case 1:
    beginWord = "hit", endWord = "cog"
    wordList = ["hot","dot","dog","lot","log","cog"]
    Output: 5
    # hit → hot → dot → dog → cog (5 words)
 
> Case 2:
    beginWord = "hit", endWord = "cog"
    wordList = ["hot","dot","dog","lot","log"]
    Output: 0   # endWord not in dictionary

Constraints

  • 1 <= beginWord.length <= 10
  • endWord.length == beginWord.length
  • 1 <= wordList.length <= 5000
  • All words consist of lowercase English letters.
  • All words in wordList are unique.

State design

The implicit graph:

  • Vertices = words.
  • Edges = between words that differ by exactly one letter.
  • All edges have weight 1 → BFS finds shortest distance.

The hard part is edge generation. Building adjacency by comparing every pair of words is O(N² · L) where N = len(wordList) and L = word length — slow for the constraints.

The pattern trick. For each word, enumerate its L “wildcard patterns” by replacing each character with *. Words that share a pattern differ by one letter. Bucket words by pattern in a hash map — adjacency lookup is then O(L).

E.g., "hot" has patterns *ot, h*t, ho*. So does "hat" (*at, h*t, ha*) — they share h*t, so they’re neighbors.

Code

Analysis

  • Time: O(N · L²)N words, each generates L patterns, each pattern lookup costs O(L) for substring construction.
  • Space: O(N · L²) for the pattern map.

Bidirectional BFS — the speedup

For the largest inputs, bidirectional BFS is dramatically faster: run BFS from both beginWord and endWord simultaneously, alternating which side to expand (always expand the smaller frontier). When the two searches meet, you have the shortest path. The branching factor drops from b^d to 2 · b^(d/2) — a square-root speedup.

Standard interview implementation is the simple one-directional BFS above. Bidirectional is worth mentioning as a follow-up.

The pattern hash is the lesson, not the BFS. Word ladder is a graph problem in disguise; recognizing it as BFS is step one. But the efficient adjacency construction — wildcards bucketed by hash — is the technique you’ll actually carry to other implicit-graph problems (lock combinations, one-character mutations, etc.).

Same skin

  • Word Ladder II — return all shortest transformation sequences, not just the length. BFS to find layered distances, then DFS backwards.
  • Open the Lock — BFS on 4-digit codes where neighbors differ by ±1 in one digit; same shape, smaller state space.
  • Minimum Genetic Mutation — identical problem, different cover.