Merge Sort

The first sort that breaks the O(n²) barrier. Merge sort is divide and conquer in its purest form: split the array in half, sort each half recursively, then merge the two sorted halves.

It’s the algorithm that proved sorting could be done in O(n log n). It’s also the algorithm Python uses for sorting objects (as part of Timsort), and the go-to for sorting data that doesn’t fit in memory.

The idea in three sentences

  1. Divide. Split the array in half.
  2. Conquer. Recursively sort each half.
  3. Combine. Merge the two sorted halves into one sorted whole.

The base case: an array of length 1 is already sorted.

Why merging is fast

The cleverness of merge sort lives in the merge step. Given two already-sorted arrays of total length n, you can merge them in O(n) time:

Left  = [1, 4, 7]
Right = [2, 3, 8]

Two pointers, one on each array. Always take the smaller of the two heads.

  1 < 2 → take 1.  out = [1]
  4 > 2 → take 2.  out = [1, 2]
  4 > 3 → take 3.  out = [1, 2, 3]
  4 < 8 → take 4.  out = [1, 2, 3, 4]
  7 < 8 → take 7.  out = [1, 2, 3, 4, 7]
  Right has 8 left → append. out = [1, 2, 3, 4, 7, 8]

That’s the whole trick — comparing only the heads, never the middles, gives linear-time merging.

Watch it run

merge_sort([5, 1, 4, 2, 8, 3])
Step 1 / 29Start merge sort.
501142238435
comparingswappingsortedpivotcurrent min

Notice the bottom-up pattern: tiny windows get merged into bigger sorted windows, until the whole array is one big sorted window.

The code

⚠️

Note int mid = lo + (hi - lo) / 2 rather than (lo + hi) / 2. The first form avoids integer overflow when lo + hi exceeds INT_MAX — a real bug that lived in Java’s binary search for nine years before someone noticed.

Why O(n log n)?

The recursion tree has log n levels (you halve the array each split). At every level, the total merge work across all sub-arrays is O(n) — each element is touched once per level. So total cost is O(n) × O(log n) = O(n log n).

This is the famous recurrence T(n) = 2·T(n/2) + O(n), which Master Theorem (or just induction) solves to Θ(n log n).

Properties

  • Time: O(n log n) in all cases — best, average, worst. No surprises.
  • Space: O(n) — we need a buffer of size n for the merge step (the left and right arrays). This is its main downside.
  • Stable: Yes — using <= in the merge keeps equal elements in their original order.
  • Adaptive: No, by default — runs the same regardless of input.

When to actually use merge sort

The space cost (O(n) extra memory) is real, so quicksort and heapsort often beat it in benchmarks. But merge sort has three killer features:

  1. Stability. When you need it, you need it. Java’s Collections.sort and Python’s list.sort use merge sort variants for this reason.
  2. Predictable worst case. Always O(n log n) — no pathological inputs. Real-time systems and security-sensitive code that must bound worst-case running time prefer merge sort to quicksort.
  3. External sorting. When data doesn’t fit in memory, merge sort works on disk. Sort chunks in RAM, then merge them in passes. Most “sort 100 GB of data on a 2 GB machine” pipelines use external merge sort.
  4. Easy parallelization. The two recursive halves are independent — perfect for multi-core. parallel_sort in C++ and array.sort with multi-threaded workers in some runtimes are merge-sort-based.

Wrapping up Day 11

SortTime (avg)SpaceStableAdaptiveNotes
BubbleO(n²)O(1)with flagTeaching tool, rarely useful.
SelectionO(n²)O(1)Minimum swaps; niche.
InsertionO(n²) / O(n)O(1)Best small-array sort.
MergeO(n log n)O(n)Stable, predictable, parallelizable.

Tomorrow on Day 12 we’ll cover Quick Sort (the practical champion of comparison sorts), Heap Sort (O(1) space and O(n log n) worst case), and the linear-time non-comparison sorts.