Merge Sort

For decades the question was open: can you sort faster than O(n²)? Merge sort is the answer — and its whole power comes from one humble subroutine. If you already have two sorted halves, stitching them into one sorted whole takes only a single linear pass. Get that insight and the O(n log n) falls out almost for free.

The first sort that breaks the O(n²) barrier. Merge sort is divide and conquer in its purest form: split the array in half, sort each half recursively, then merge the two sorted halves.

It’s the algorithm that proved sorting could be done in O(n log n). It’s also the algorithm Python uses for sorting objects (as part of Timsort), and the go-to for sorting data that doesn’t fit in memory.

The idea in three sentences

Divide. Split the array in half.
Conquer. Recursively sort each half.
Combine. Merge the two sorted halves into one sorted whole.

The base case: an array of length 1 is already sorted.

Why merging is fast

Predict first

You have two already-SORTED arrays, total length n, and want one sorted array. What's the fastest you can merge them — and how?

The cleverness of merge sort lives in the merge step. Given two already-sorted arrays of total length n, you can merge them in O(n) time:

Left  = [1, 4, 7]
Right = [2, 3, 8]

Two pointers, one on each array. Always take the smaller of the two heads.

  1 < 2 → take 1.  out = [1]
  4 > 2 → take 2.  out = [1, 2]
  4 > 3 → take 3.  out = [1, 2, 3]
  4 < 8 → take 4.  out = [1, 2, 3, 4]
  7 < 8 → take 7.  out = [1, 2, 3, 4, 7]
  Right has 8 left → append. out = [1, 2, 3, 4, 7, 8]

That’s the whole trick — comparing only the heads, never the middles, gives linear-time merging.

Watch it run

merge_sort([5, 1, 4, 2, 8, 3])

Step 1 / 29 — Start merge sort.

comparingswappingsortedpivotcurrent min

Notice the bottom-up pattern: tiny windows get merged into bigger sorted windows, until the whole array is one big sorted window.

The code

⚠️

Note int mid = lo + (hi - lo) / 2 rather than (lo + hi) / 2. The first form avoids integer overflow when lo + hi exceeds INT_MAX — a real bug that lived in Java’s binary search for nine years before someone noticed.

Recall the merge’s core decision — and the one character that keeps it stable:

python · fill in the blanks0/1 hints

def merge(a, lo, mid, hi):
  left  = a[lo:mid + 1]
  right = a[mid + 1:hi + 1]
  i = j = 0
  k = lo
  while i < len(left) and j < len(right):
      # ??? take the smaller head (and keep equals stable)
          a[k] = left[i]; i += 1
      else:
          a[k] = right[j]; j += 1
      k += 1
  while i < len(left):  a[k] = left[i];  i += 1; k += 1
  while j < len(right): a[k] = right[j]; j += 1; k += 1

Why O(n log n)?

The recursion tree has log n levels (you halve the array each split). At every level, the total merge work across all sub-arrays is O(n) — each element is touched once per level. So total cost is O(n) × O(log n) = O(n log n).

This is the famous recurrence T(n) = 2·T(n/2) + O(n), which Master Theorem (or just induction) solves to Θ(n log n).

Properties

Time: O(n log n) in all cases — best, average, worst. No surprises.
Space: O(n) — we need a buffer of size n for the merge step (the left and right arrays). This is its main downside.
Stable: Yes — using <= in the merge keeps equal elements in their original order.
Adaptive: No, by default — runs the same regardless of input.

When to actually use merge sort

The space cost (O(n) extra memory) is real, so quicksort and heapsort often beat it in benchmarks. But merge sort has three killer features:

Stability. When you need it, you need it. Java’s Collections.sort and Python’s list.sort use merge sort variants for this reason.
Predictable worst case. Always O(n log n) — no pathological inputs. Real-time systems and security-sensitive code that must bound worst-case running time prefer merge sort to quicksort.
External sorting. When data doesn’t fit in memory, merge sort works on disk. Sort chunks in RAM, then merge them in passes. Most “sort 100 GB of data on a 2 GB machine” pipelines use external merge sort.
Easy parallelization. The two recursive halves are independent — perfect for multi-core. parallel_sort in C++ and array.sort with multi-threaded workers in some runtimes are merge-sort-based.

Wrapping up Day 11

Sort	Time (avg)	Space	Stable	Adaptive	Notes
Bubble	O(n²)	O(1)	✓	with flag	Teaching tool, rarely useful.
Selection	O(n²)	O(1)	✗	✗	Minimum swaps; niche.
Insertion	O(n²) / O(n)	O(1)	✓	✓	Best small-array sort.
Merge	O(n log n)	O(n)	✓	✗	Stable, predictable, parallelizable.

Quick check

In the merge step, the comparison is `left[i] <= right[j]` rather than `left[i] < right[j]`. Why does that single character matter?

Merge sort gives you O(n log n) guaranteed and stability — at the cost of O(n) memory. That wraps Day 11. Next: Day 12 — Sorting II, where quicksort trades that guarantee for speed and O(1)-ish space, heapsort gets both, and counting/radix sort break the comparison barrier entirely.

Insertion Sort Overview

Finished this page?