🎉 All 30 days are live — the full DSA-30 course, from Big-O to System Design. See the roadmap →

Insertion Sort

Here’s a puzzle: insertion sort is O(n²), asymptotically worse than merge sort and quicksort — yet every major standard library (Python’s Timsort, C++‘s introsort, Rust’s PDQ) calls it internally. Why would the pros deliberately use the “slow” sort? Two reasons hide in this page: it gets faster the more sorted the input is, and at small sizes its constant factors crush the fancy sorts.

This is how you sort a hand of playing cards. Pick up cards one at a time and insert each into its correct spot among the cards you’ve already arranged. The “sorted prefix” grows by one with every card you draw.

It’s the same O(n²) family as bubble and selection — but in practice, insertion sort is much faster. So much faster that every standard library in the world keeps it around for small inputs.

The idea in three sentences

  1. Maintain a sorted prefix of the array. Index 0 is trivially a sorted prefix of length 1.
  2. Take the next element (the “key”) and shift the sorted prefix’s elements right one at a time until the key’s position is found.
  3. Drop the key into place.

Watch it run

insertion_sort([5, 1, 4, 2, 8, 3])
Step 1 / 18Index 0 is trivially sorted.
501142238435
comparingswappingsortedpivotcurrent min

Notice how the green sorted prefix grows by one each iteration. Each element migrates leftward only as far as it has to — that’s why insertion sort gets so cheap on nearly-sorted inputs.

The code

Note how the inner loop shifts rather than swaps. Each shift is one assignment, whereas a swap is three. That tiny constant-factor difference adds up across n² iterations.

Recall the shift that opens a gap for the key:

python · fill in the blanks0/1 hints
def insertion_sort(a):
for i in range(1, len(a)):
key = a[i]
j = i - 1
while j >= 0 and a[j] > key:
# ??? slide the bigger element one slot right
j -= 1
a[j + 1] = key # drop the key into the gap

Why it shines on nearly-sorted data

Predict first
Insertion sort is O(n²) worst case. What's its running time on an array that's ALREADY sorted — and why?

If the input is already sorted, the inner while loop never executes — every key is already in its correct place. The whole algorithm runs in O(n): one pass, no shifts.

More generally, if every element is at most k positions away from its sorted position, insertion sort runs in O(n·k). For k = 1 that’s linear time. This adaptive behavior is rare among sorts — it’s what makes insertion sort indispensable for nearly-sorted inputs.

Real-world example: sorting newly arriving data into an already-sorted list. Each new arrival shifts at most a few positions. Insertion sort handles this in O(n) per insertion; most other sorts would re-sort from scratch in O(n log n).

Properties

  • Time: O(n²) average and worst case. O(n) best case (already sorted).
  • Space: O(1) — in-place.
  • Stable: Yes. The a[j] > key comparison is strict, so equal elements never overtake each other.
  • Adaptive: Yes — runtime scales with how out-of-order the input is.

The “production sort” subroutine

Look inside any modern sort:

  • Timsort (Python, Java for objects): Insertion sort on runs of up to 32–64 elements.
  • Introsort (C++ std::sort): Insertion sort on slices smaller than ~16 elements.
  • PDQ sort (Rust): Insertion sort once partitions get small enough.

They all rely on this fact: for small n (say, n < 32), insertion sort beats quicksort and merge sort in real wall-clock time despite being asymptotically slower. The reason is constant factors — insertion sort has no recursion overhead, no extra memory, and excellent cache locality. The big-O notation hides constants, but at small n, constants are everything.

When you should reach for it

  • Arrays with fewer than ~50 elements.
  • Arrays you know are nearly sorted.
  • As the base case of a divide-and-conquer sort (cut the recursion when slices get small).
  • Online sorting — inserting new elements into an already-sorted collection.

Quick check

Production sorts (Timsort, introsort) switch to insertion sort once a slice is small (n < ~32), even though it's asymptotically O(n²). Why does that beat continuing with quicksort/merge sort there?

That’s the last of the O(n²) sorts — and the one worth keeping. Next: merge sort, the first sort to break the quadratic ceiling with a divide-and-conquer O(n log n) and a rock-solid worst-case guarantee.

Finished this page?