🚀 Phases 1–4 are live — Days 1–13 + Day 17 are fully written. See the roadmap →

K Closest Points to Origin medium

Description

Given an array points where points[i] = [xᵢ, yᵢ] represents a point on the X-Y plane, and an integer k, return the k closest points to the origin (0, 0). The distance is the Euclidean distance: sqrt(x² + y²). The answer may be returned in any order.

Examples

> Case 1:
    Input:  points = [[1, 3], [-2, 2]], k = 1
    Output: [[-2, 2]]
    # sqrt(1+9) = √10 vs sqrt(4+4) = √8. √8 is smaller.
 
> Case 2:
    Input:  points = [[3, 3], [5, -1], [-2, 4]], k = 2
    Output: [[3, 3], [-2, 4]]

Constraints

  • 1 <= k <= points.length <= 10^4
  • -10^4 < xᵢ, yᵢ < 10^4

Intuition — and a key optimization

Same pattern as Kth Largest, but inverted. We want the K smallest distances, so we keep a max-heap of size K:

  • The root is the largest distance among the K closest points so far.
  • A new point with a smaller distance than the root deserves to be in the K-best, so push it and pop the old root.

The optimization that’s basically free: we don’t need the actual sqrt. Comparing √(x₁² + y₁²) vs √(x₂² + y₂²) gives the same answer as comparing x₁² + y₁² vs x₂² + y₂² because sqrt is monotonic for non-negative values. Skipping the square root saves a CPU instruction per comparison and avoids any floating-point fuzziness.

Code

Min-heap or max-heap? When you want the K smallest values, keep a max-heap of size K. The root is the largest of the K-smallest, which is exactly the threshold you compare new candidates against. This inversion catches people: “I want the smallest, why am I using a max-heap?” — because the heap holds winners, and the loser among the winners is what gets evicted.

The pattern generalizes

The exact same shape solves:

  • K Closest Strings (by Levenshtein distance) — same template, different metric.
  • K Cheapest Flights — same template, priority = total cost.
  • K Most Similar Items — anything where “closest” can be quantified.
  • Find the Median in a Stream — uses two heaps, but same idea per half.

You’re not memorizing problems. You’re memorizing one template.

Analysis

  • Time: O(n log k) — n points, each costing log k for the heap operation.
  • Space: O(k) for the heap.

A QuickSelect-based solution can do this in O(n) average without a heap, but it has worst-case O(n²) and is harder to write right under interview pressure.