🚀 Phases 1–4 are live — Days 1–13 + Day 17 are fully written. See the roadmap →

Word Ladder hard

Description

Given two words beginWord and endWord, and a wordList, return the length of the shortest transformation sequence from beginWord to endWord such that:

  1. Only one letter can be changed at a time.
  2. Every transformed word must exist in the wordList.

Return 0 if no such sequence exists. (The length counts both endpoints — hit → hot → dot → dog → cog has length 5.)

Examples

> Case 1:
    Input:  beginWord = "hit"
            endWord   = "cog"
            wordList  = ["hot", "dot", "dog", "lot", "log", "cog"]
    Output: 5
    # hit → hot → dot → dog → cog (length 5)
 
> Case 2:
    Input:  beginWord = "hit"
            endWord   = "cog"
            wordList  = ["hot", "dot", "dog", "lot", "log"]
    Output: 0
    # "cog" not in wordList → no valid sequence

Constraints

  • 1 <= beginWord.length <= 10
  • endWord.length == beginWord.length
  • 1 <= wordList.length <= 5000
  • All words are lowercase English letters.

Recognizing the graph

The trick is to see the graph:

  • Each word is a node.
  • Two words are connected by an edge if they differ by exactly one letter.
  • The problem asks for the shortest path from beginWord to endWord.

“Shortest path on an unweighted graph” — that’s BFS.

The naive way to build the graph

For every pair of words, check if they differ by one letter. That’s O(W² · L) where W = word count, L = word length. For W = 5000, L = 10, that’s 250 million operations. Slow.

The clever way: pattern keys

For each word, generate “patterns” by replacing each character with *:

"hot" → ["*ot", "h*t", "ho*"]
"dot" → ["*ot", "d*t", "do*"]
"dog" → ["*og", "d*g", "do*"]

Two words are neighbours iff they share at least one pattern. Build a Map<pattern, [words]> in O(W · L²) time, then look up neighbours in O(L) per word.

For W = 5000, L = 10, this is 500,000 operations. 500Ă— faster.

Code (BFS with pattern keys)

Why clearing patterns[key] after use? Once we’ve processed every word for a given pattern, we never need that pattern again — the visited set will reject those words on future visits anyway. Clearing the list avoids re-scanning it, a small but worthwhile optimization.

The bigger trick: bidirectional BFS

For very large word lists, bidirectional BFS halves the search depth: explore simultaneously from beginWord and endWord, alternating the smaller frontier each step. The two searches meet somewhere in the middle.

If standard BFS explores O(b^d) nodes (branching factor b, distance d), bidirectional BFS does O(2 · b^(d/2)) — exponentially less for large d. It’s the same principle that makes meet-in-the-middle algorithms so powerful.

We won’t implement it here — but if your interviewer says “the input could be huge,” that’s the magic phrase.

Why this problem is the canonical BFS interview question

It’s the perfect test of recognizing a graph hidden in non-graph language:

  • The problem says “transform words by changing letters.”
  • The candidate has to recognize: that’s just edges. Words are nodes. This is a graph.
  • And then: “shortest sequence” + each step costs 1 = BFS.

Spot both leaps and the implementation is mechanical.

Analysis

  • Time: O(W · L²) to build patterns + O(W · L²) for BFS (each word is visited once, generating L patterns of length L). Net: O(W · L²).
  • Space: O(W · L²) for the pattern map plus O(W) for visited.