Cycle Detection
Topological sort and cycle detection are two sides of the same coin. A directed graph has a topological order if and only if it has no cycle. So every topo-sort algorithm is also a cycle detector, and every directed-cycle detector can be repurposed to attempt a topo sort.
This page covers the two standard approaches plus the technique for reporting the cycle itself, not just its existence.
Approach 1 — Kahn’s: count what got emitted
The shortest cycle detector in this chapter:
def has_cycle_kahn(n, edges):
adj = [[] for _ in range(n)]
indeg = [0] * n
for u, v in edges:
adj[u].append(v); indeg[v] += 1
from collections import deque
q = deque(i for i in range(n) if indeg[i] == 0)
emitted = 0
while q:
u = q.popleft(); emitted += 1
for v in adj[u]:
indeg[v] -= 1
if indeg[v] == 0:
q.append(v)
return emitted < n # if some nodes were never emitted, there's a cycleO(V + E) time, O(V + E) space. The same code as topo sort — we just return the truthy-cycle check instead of the order.
Limitations: doesn’t tell you which nodes are in the cycle. For “report the cycle,” use DFS.
Approach 2 — DFS with three colors
Mark each node WHITE (unvisited), GRAY (on the current DFS stack), or BLACK (fully processed). A back edge — an edge from a GRAY node to another GRAY node — is a cycle.
Time: O(V + E). Space: O(V) recursion + O(V + E) adjacency.
Why three colors and not two on directed graphs?
On an undirected graph, the only way DFS revisits a node is via the edge you came from (the “parent edge”). If you skip the parent and still find a visited node, that’s a cycle. Two states (visited/unvisited) suffice — plus the parent edge to ignore.
On a directed graph, two states fail. Consider:
A -> B -> C
A -> CDFS from A: visit B, then C (from B). Mark both visited. Return to A. Now see edge A → C. C is visited. Is that a cycle? No! It’s a forward edge to an already-finished descendant.
GRAY/BLACK separates the two cases: GRAY = on the current path; BLACK = finished. A back edge to GRAY is a cycle; a forward/cross edge to BLACK is not. Two states can’t tell them apart.
Reporting the actual cycle
Once you detect a cycle, the GRAY ancestor that triggered the detection plus the path from it to the current node forms a cycle. Track parents during DFS:
This is the technique behind real-world cycle reports in tools like npm’s “circular dependency” warnings or make’s circular X <- Y dependency dropped message.
Undirected graphs — a brief cousin
For an undirected graph, the two-color DFS is enough:
def has_cycle_undirected(n, edges):
adj = [[] for _ in range(n)]
for u, v in edges:
adj[u].append(v); adj[v].append(u)
visited = [False] * n
def dfs(u, parent):
visited[u] = True
for v in adj[u]:
if not visited[v]:
if dfs(v, u): return True
elif v != parent: # don't count the parent edge
return True
return False
return any(not visited[i] and dfs(i, -1) for i in range(n))Or, more elegantly, use Union-Find: for each edge (u, v), if find(u) == find(v) then this edge would form a cycle; otherwise union them. O(α(V) × E) time. We cover this on Day 20.
A subtle bug worth knowing
Resetting state after recursion is wrong. In DFS topo sort / cycle detection, once a node becomes BLACK (state = 2), it stays BLACK. Some beginners try to “reset state to 0 on the way back” thinking that’s how backtracking works — but this is graph traversal, not backtracking. Resetting causes the same node to be reprocessed exponentially many times. Never reset state.
Disconnected graphs need the outer loop. A graph can have multiple components. If you only DFS from node 0, you miss everything not reachable from it. Always loop for i in range(n): if state[i] == 0: dfs(i) so you cover every component.
Comparison
| Aspect | Kahn’s (count emitted) | DFS three-color |
|---|---|---|
| Detects cycle? | Yes (emitted < V) | Yes (GRAY descendant) |
| Reports nodes in cycle? | No, needs extra work | Yes — parent pointers reconstruct it |
| Iterative? | Yes | Recursive (or iterative with explicit stack) |
| Stack overflow risk? | No | Yes on deep graphs |
| Time / Space | O(V + E) / O(V + E) | O(V + E) / O(V + E) |
For “does a cycle exist?” problems, Kahn’s is simpler. For “report the cycle” or “find any topo order” problems, either works.
Quick check
Next: applications — build systems, package managers, spreadsheet recalc, instruction scheduling, and the lex-smallest / parallel-depth variations.