Insertion Sort
This is how you sort a hand of playing cards. Pick up cards one at a time and insert each into its correct spot among the cards you’ve already arranged. The “sorted prefix” grows by one with every card you draw.
It’s the same O(n²) family as bubble and selection — but in practice, insertion sort is much faster. So much faster that every standard library in the world keeps it around for small inputs.
The idea in three sentences
- Maintain a sorted prefix of the array. Index 0 is trivially a sorted prefix of length 1.
- Take the next element (the “key”) and shift the sorted prefix’s elements right one at a time until the key’s position is found.
- Drop the key into place.
Watch it run
Notice how the green sorted prefix grows by one each iteration. Each element migrates leftward only as far as it has to — that’s why insertion sort gets so cheap on nearly-sorted inputs.
The code
Note how the inner loop shifts rather than swaps. Each shift is one assignment, whereas a swap is three. That tiny constant-factor difference adds up across n² iterations.
Why it shines on nearly-sorted data
If the input is already sorted, the inner while loop never executes — every key is already in its correct place. The whole algorithm runs in O(n): one pass, no shifts.
More generally, if every element is at most k positions away from its sorted position, insertion sort runs in O(n·k). For k = 1 that’s linear time. This adaptive behavior is rare among sorts — it’s what makes insertion sort indispensable for nearly-sorted inputs.
Real-world example: sorting newly arriving data into an already-sorted list. Each new arrival shifts at most a few positions. Insertion sort handles this in O(n) per insertion; most other sorts would re-sort from scratch in O(n log n).
Properties
- Time: O(n²) average and worst case. O(n) best case (already sorted).
- Space: O(1) — in-place.
- Stable: Yes. The
a[j] > keycomparison is strict, so equal elements never overtake each other. - Adaptive: Yes — runtime scales with how out-of-order the input is.
The “production sort” subroutine
Look inside any modern sort:
- Timsort (Python, Java for objects): Insertion sort on runs of up to 32–64 elements.
- Introsort (C++ std::sort): Insertion sort on slices smaller than ~16 elements.
- PDQ sort (Rust): Insertion sort once partitions get small enough.
They all rely on this fact: for small n (say, n < 32), insertion sort beats quicksort and merge sort in real wall-clock time despite being asymptotically slower. The reason is constant factors — insertion sort has no recursion overhead, no extra memory, and excellent cache locality. The big-O notation hides constants, but at small n, constants are everything.
When you should reach for it
- Arrays with fewer than ~50 elements.
- Arrays you know are nearly sorted.
- As the base case of a divide-and-conquer sort (cut the recursion when slices get small).
- Online sorting — inserting new elements into an already-sorted collection.
Next up: Merge Sort — the first sort that beats the O(n²) ceiling.