Advanced MST Patterns

Kruskal and Prim are the workhorses — but the MST shows up in disguises that look nothing like “connect these nodes.” Cluster a dataset, approximate the traveling salesman, supply water to houses through a “free” source: all MST underneath. This page is the advanced toolkit and the modeling tricks that reveal an MST hiding inside a problem.

The previous two pages covered the two canonical MST algorithms. This page covers the techniques that separate a solid MST foundation from competitive programming fluency: Borůvka’s (the parallel-friendly algorithm), the minimum bottleneck problem, finding the second-best MST, and two classic interview tricks (virtual nodes and the offline Kruskal reversal).

1. Borůvka’s Algorithm

Borůvka’s (1926 — older than both Kruskal’s and Prim’s!) runs in O(E log V) and is uniquely suited to parallel computation.

The idea: In each phase, every component simultaneously finds its cheapest outgoing edge and adds it. After each phase, the number of components at least halves. So after O(log V) phases, the MST is complete.

def boruvka(n, edges):
    parent = list(range(n))
    rank = [0] * n
 
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
 
    def unite(x, y):
        px, py = find(x), find(y)
        if px == py: return False
        if rank[px] < rank[py]: px, py = py, px
        parent[py] = px
        if rank[px] == rank[py]: rank[px] += 1
        return True
 
    total = 0
    num_components = n
 
    while num_components > 1:
        # For each component, find cheapest outgoing edge
        cheapest = [-1] * n
        for i, (u, v, w) in enumerate(edges):
            pu, pv = find(u), find(v)
            if pu == pv: continue
            if cheapest[pu] == -1 or edges[cheapest[pu]][2] > w:
                cheapest[pu] = i
            if cheapest[pv] == -1 or edges[cheapest[pv]][2] > w:
                cheapest[pv] = i
 
        for i in range(n):
            if cheapest[i] != -1:
                u, v, w = edges[cheapest[i]]
                if unite(u, v):
                    total += w
                    num_components -= 1
 
    return total

Borůvka’s is the algorithm behind GHK (Gabow-Galil-Spencer-Tarjan) O(E log log V) and the basis for many parallel/distributed MST algorithms where you want to process edges in chunks.

2. Minimum Bottleneck Spanning Tree

A minimum bottleneck spanning tree (MBST) minimizes the maximum edge weight in the tree (rather than the sum).

Find a spanning tree where the heaviest edge is as light as possible.

Key theorem: Every MST is also an MBST. (The proof: if some MST had maximum edge weight > the MBST’s maximum, you could replace that edge to get a cheaper MST — contradicting minimality.)

So running Kruskal’s or Prim’s gives you both the minimum sum spanning tree and the minimum bottleneck spanning tree.

This observation appears in problems about maximum flow in undirected graphs: the maximum flow from s to t equals the maximum “bottleneck path” weight, which is determined by the MST path between s and t.

3. The Second-Best MST

Find the spanning tree with the second-smallest total weight.

The second-best MST differs from the MST in exactly one edge swap: we remove one MST edge and add one non-MST edge. The best such swap gives the second-best MST.

def second_mst(n, edges, mst_edges):
    # For each non-MST edge (u, v, w):
    # The cycle created contains the path from u to v in the MST.
    # Find the maximum-weight MST edge on that path.
    # Swapping that edge for (u, v, w) gives the cost: mst_cost - max_edge + w.
    # Minimize over all non-MST edges.
    pass  # Requires LCA + sparse table for path-max queries — O(E log V) total

This is a hard interview problem (LeetCode 1489 is the “find critical and pseudo-critical edges” version). The key insight is that the optimal swap always involves exactly one non-MST edge.

4. Virtual node trick

Some problems have an implicit “free” connection from any node to a virtual source. For example:

You have n houses. You can connect house i to the water system at cost wells[i], or connect house i to house j at cost pipes[i][j]. Find the minimum total cost to connect all houses to water.

Trick: Add a virtual node 0 representing “the water system.” Add an edge from 0 to each house i with weight wells[i]. Now the problem is a standard MST on n+1 nodes.

Recall the one line that is the trick — the edges that turn “connect to the source” into ordinary graph edges:

python · fill in the blanks0/1 hints

def min_cost_supply(n, wells, pipes):
  # virtual node 0 = "the water source"
  # ??? the virtual-source edges
  for house, neighbor, cost in pipes:
      edges.append((cost, house, neighbor))          # house-to-house pipes
  edges.sort()
  # ... then run plain Kruskal's MST on n + 1 nodes

def min_cost_supply(n, wells, pipes):
    edges = [(wells[i], 0, i + 1) for i in range(n)]      # virtual connections
    for house, neighbor, cost in pipes:
        edges.append((cost, house, neighbor))
    edges.sort()
 
    parent = list(range(n + 1))
 
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
 
    total = added = 0
    for w, u, v in edges:
        pu, pv = find(u), find(v)
        if pu != pv:
            parent[pu] = pv
            total += w
            added += 1
            if added == n: break
    return total

This pattern appears in LeetCode 1168 (Optimize Water Distribution — premium), and related problems where one “free” connection is available.

5. Offline Kruskal reversal (minimum spanning forest with deletions)

Some problems add constraints like “must include edge X in the MST” or “edge X is forbidden.” The techniques:

Forced edge: Add the forced edge first (before sorting), then run Kruskal’s normally.
Forbidden edge: Remove it from the edge list and run Kruskal’s normally.
Critical edge detection: Run Kruskal’s with and without each edge to see if the MST weight changes.

For “critical and pseudo-critical edges” (LeetCode 1489):

An edge is critical if removing it increases the MST weight.
An edge is pseudo-critical if forcing it into the MST doesn’t increase the MST weight (but removing it would).

def find_critical_pseudo_critical(n, edges):
    edges = [(w, u, v, i) for i, (u, v, w) in enumerate(edges)]
    edges.sort()
    base_mst = kruskal_weight(n, edges, skip=-1, force=-1)
 
    critical, pseudo = [], []
    for idx in range(len(edges)):
        # Check critical: MST weight increases when edge idx is skipped
        if kruskal_weight(n, edges, skip=idx, force=-1) > base_mst:
            critical.append(edges[idx][3])
        # Check pseudo-critical: MST weight stays same when edge idx is forced
        elif kruskal_weight(n, edges, skip=-1, force=idx) == base_mst:
            pseudo.append(edges[idx][3])
    return [critical, pseudo]

6. MST applications

Problem domain	MST application
Network design	Minimum cost to wire all buildings
Clustering	Remove k−1 heaviest MST edges → k clusters
Image segmentation	Pixels = nodes, color difference = edge weight, MST partitions regions
Approximation algorithms	The MST is a 2-approximation for the Travelling Salesman Problem on Euclidean graphs
Bottleneck path	Maximum flow on an MST path = maximum bottleneck path in original graph
Circuit design	Minimum wire length to connect circuit components

Clustering via MST

Predict first

You've built the MST of a set of points and want to split them into k clusters that are as well-separated as possible. What do you remove from the MST — and why does it work?

Remove the k−1 most expensive edges from the MST → k clusters. This is equivalent to finding the k-clustering that maximizes the minimum inter-cluster distance (complete linkage clustering).

def cluster(n, edges, k):
    # Build MST via Kruskal, then drop k-1 heaviest edges
    mst_edges = kruskal_edges(n, edges)       # sorted ascending
    mst_edges.sort(key=lambda e: e[2])        # sort by weight
    keep = mst_edges[:n - k]                  # keep n-k edges → k components
    return connected_components(n, keep)

Quick check

Why is every MST also a minimum bottleneck spanning tree?

In the virtual node trick for water distribution, why can we run a standard MST algorithm after adding the virtual node?

That completes Day 22 — the two MST algorithms, the theory that proves them, and the modeling tricks that hide an MST inside a problem. Next: Basic Questions to cement them, then the practice set. After MSTs comes Day 23 — Topological Sort, the last graph chapter: ordering tasks with dependencies.

Prim's Algorithm Basic Questions

Finished this page?