Floyd-Warshall Algorithm
If Dijkstra is a Swiss Army knife and Bellman-Ford is a sledgehammer, Floyd-Warshall is a rolling pin. It’s the shortest, simplest, most direct way to compute all-pairs shortest paths, and it runs in O(V³) regardless of edge weights (negatives are fine, as long as no negative cycle exists).
It’s also one of the most elegant pieces of dynamic programming you’ll ever see.
The DP formulation
Let dist[k][i][j] = shortest distance from i to j using only vertices {0, 1, ..., k} as intermediates.
Two cases:
kis not on the shortest path fromitoj. Thendist[k][i][j] = dist[k-1][i][j].kis on the path. The path goesi → ... → k → ... → j. Both halves use only intermediates{0, ..., k-1}. Sodist[k][i][j] = dist[k-1][i][k] + dist[k-1][k][j].
Take the min:
dist[k][i][j] = min(dist[k-1][i][j],
dist[k-1][i][k] + dist[k-1][k][j])Base case: dist[-1][i][j] is the direct edge weight (or ∞ if no edge, 0 if i == j).
You can collapse the k dimension — at each k, the values you need on the right-hand side are exactly the ones currently in the table. Floyd-Warshall is the table compressed to 2D, with three nested loops.
The template
Time: O(V³). Space: O(V²) for the matrix.
That’s it. Five lines for the core algorithm. The hardest part is remembering that k is the outermost loop — switch the order and the DP breaks.
k must be outermost. The recurrence updates dist[i][j] using values that incorporate intermediates ≤ k. If i or j is outermost, you’ll be reading and writing the same dist cell within a single “iteration” inconsistently, and the answers will be wrong. Memorize the loop order: k, i, j.
Worked example
Three vertices, direct edges:
0 1 2
0 [ 0 5 ∞ ]
1 [ ∞ 0 3 ]
2 [ 2 ∞ 0 ]After k = 0 (allow 0 as intermediate): nothing changes since row 0 has no edges going inward.
After k = 1: check dist[i][1] + dist[1][j] for every (i, j). dist[0][1] + dist[1][2] = 5 + 3 = 8, less than ∞ — update dist[0][2] = 8.
0 1 2
0 [ 0 5 8 ]
1 [ ∞ 0 3 ]
2 [ 2 ∞ 0 ]After k = 2: dist[1][2] + dist[2][0] = 3 + 2 = 5, less than ∞ — update dist[1][0] = 5. dist[1][2] + dist[2][1] = 3 + ∞ = ∞, no change. Also dist[0][2] + dist[2][0] = 8 + 2 = 10, but dist[0][0] is already 0.
0 1 2
0 [ 0 5 8 ]
1 [ 5 0 3 ]
2 [ 2 7 0 ]Wait, dist[2][1]? Through k = 1 we had nothing, through k = 0 we get dist[2][0] + dist[0][1] = 2 + 5 = 7. Yes — but that needed k = 0 AND k = 1 ordering working together. Tracing the literal DP: dist[2][1] updates at k = 0 to 7, then k = 1 finds nothing more. Final matrix is the one above.
Negative-cycle detection
After running Floyd-Warshall, check the diagonal. If dist[i][i] < 0 for any i, the graph has a negative cycle reachable from and to i — going around that cycle made the “distance from i to itself” negative.
for i in range(n):
if dist[i][i] < 0:
return None # negative cycle existsUnlike Bellman-Ford (which only detects negative cycles reachable from the source), Floyd-Warshall detects them everywhere — useful for problems where the source isn’t fixed.
Path reconstruction
Store a next[i][j] matrix during the DP, recording the next vertex on the path from i to j:
def floyd_warshall_with_path(dist):
n = len(dist)
nxt = [[j if dist[i][j] != float('inf') else -1
for j in range(n)]
for i in range(n)]
for k in range(n):
for i in range(n):
for j in range(n):
if dist[i][k] + dist[k][j] < dist[i][j]:
dist[i][j] = dist[i][k] + dist[k][j]
nxt[i][j] = nxt[i][k]
def path(nxt, i, j):
if nxt[i][j] == -1:
return []
p = [i]
while i != j:
i = nxt[i][j]
p.append(i)
return pO(V²) extra space, O(path length) reconstruction.
When to use Floyd-Warshall
- You need all-pairs shortest distances.
- The graph is small — typically
V ≤ 400or so. AtV = 500, you’re looking at1.25 × 10^8operations, still fast in C++ but borderline in Python/Java. - The graph is dense — when
E ≈ V², the alternative (V runs of Dijkstra) isO(V · (V + E) log V) = O(V³ log V)— Floyd-Warshall wins. - You need negative-cycle detection over all sources/sinks.
- The graph changes and you want to support fast updates — Floyd-Warshall isn’t naturally dynamic, but the structure lends itself to incremental edge-addition variants.
When NOT to use Floyd-Warshall
- The graph is sparse and large.
V = 10000withE = O(V)makesO(V³) = 10^12— infeasible. Run Dijkstra from each source instead. - You only need a single source. Floyd-Warshall computes way more than you need.
The “transitive closure” trick
If you only care whether vertex i can reach vertex j (not the distance), use Floyd-Warshall with + replaced by OR and min replaced by OR:
for k in range(n):
for i in range(n):
for j in range(n):
reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])O(V³), same shape. Called Warshall’s algorithm in older textbooks. Useful for any problem framed as “given these direct relationships, who is connected to whom?”
Quick check
Next: a side-by-side decision guide for picking the right algorithm.