Word Ladder — DSA-30

Word Ladder hard

Description

Given two words beginWord and endWord, and a wordList, return the length of the shortest transformation sequence from beginWord to endWord such that:

Only one letter can be changed at a time.
Every transformed word must exist in the wordList.

Return 0 if no such sequence exists. (The length counts both endpoints — hit → hot → dot → dog → cog has length 5.)

Examples

> Case 1:
    Input:  beginWord = "hit"
            endWord   = "cog"
            wordList  = ["hot", "dot", "dog", "lot", "log", "cog"]
    Output: 5
    # hit → hot → dot → dog → cog (length 5)
 
> Case 2:
    Input:  beginWord = "hit"
            endWord   = "cog"
            wordList  = ["hot", "dot", "dog", "lot", "log"]
    Output: 0
    # "cog" not in wordList → no valid sequence

Constraints

1 <= beginWord.length <= 10
endWord.length == beginWord.length
1 <= wordList.length <= 5000
All words are lowercase English letters.

Recognizing the graph

The trick is to see the graph:

Each word is a node.
Two words are connected by an edge if they differ by exactly one letter.
The problem asks for the shortest path from beginWord to endWord.

“Shortest path on an unweighted graph” — that’s BFS.

The naive way to build the graph

For every pair of words, check if they differ by one letter. That’s O(W² · L) where W = word count, L = word length. For W = 5000, L = 10, that’s 250 million operations. Slow.

The clever way: pattern keys

For each word, generate “patterns” by replacing each character with *:

"hot" → ["*ot", "h*t", "ho*"]
"dot" → ["*ot", "d*t", "do*"]
"dog" → ["*og", "d*g", "do*"]

Two words are neighbours iff they share at least one pattern. Build a Map<pattern, [words]> in O(W · L²) time, then look up neighbours in O(L) per word.

For W = 5000, L = 10, this is 500,000 operations. 500× faster.

Try it yourself

⌨ Word Ladder — return the shortest transformation length

def ladder_length(beginWord, endWord, wordList):
  # BFS: words are nodes, edges connect words that differ by one letter
  pass

Hint

It's an unweighted shortest-path, so BFS. Two words are neighbours if they differ by exactly one letter — use '*'-pattern keys to find neighbours fast. The length counts both endpoints; return 0 if endWord isn't reachable.

Reveal solution

Code (BFS with pattern keys)

Why clearing patterns[key] after use? Once we’ve processed every word for a given pattern, we never need that pattern again — the visited set will reject those words on future visits anyway. Clearing the list avoids re-scanning it, a small but worthwhile optimization.

The bigger trick: bidirectional BFS

For very large word lists, bidirectional BFS halves the search depth: explore simultaneously from beginWord and endWord, alternating the smaller frontier each step. The two searches meet somewhere in the middle.

If standard BFS explores O(b^d) nodes (branching factor b, distance d), bidirectional BFS does O(2 · b^(d/2)) — exponentially less for large d. It’s the same principle that makes meet-in-the-middle algorithms so powerful.

We won’t implement it here — but if your interviewer says “the input could be huge,” that’s the magic phrase.

Why this problem is the canonical BFS interview question

It’s the perfect test of recognizing a graph hidden in non-graph language:

The problem says “transform words by changing letters.”
The candidate has to recognize: that’s just edges. Words are nodes. This is a graph.
And then: “shortest sequence” + each step costs 1 = BFS.

Spot both leaps and the implementation is mechanical.

Analysis

Time: O(W · L²) to build patterns + O(W · L²) for BFS (each word is visited once, generating L patterns of length L). Net: O(W · L²).
Space: O(W · L²) for the pattern map plus O(W) for visited.

Shortest Path in Binary Matrix Overview

Finished this page?