Find the Duplicate Number medium
Description
Given an array of integers nums containing n + 1 integers where each integer is in the range [1, n] inclusive, there is exactly one repeated number in nums. Return this repeated number.
You must solve the problem without modifying the array nums and use only constant extra space.
Examples
> Case 1:
nums = [1, 3, 4, 2, 2]
Output: 2
> Case 2:
nums = [3, 1, 3, 4, 2]
Output: 3
> Case 3:
nums = [3, 3, 3, 3, 3]
Output: 3Constraints
1 <= n <= 10^5nums.length == n + 11 <= nums[i] <= n- All the integers in
numsappear only once except for precisely one integer which appears two or more times.
Why the obvious approaches don’t fit
- Sort then scan: modifies the array → violates the constraint.
- Hash set:
O(n)extra space → violates the constraint. - Mark visited by negating
nums[abs(x)]: modifies the array → violates the constraint.
We need something that’s O(n) time, O(1) space, and read-only. The trick: treat the array as a linked list and apply Floyd’s cycle detection.
The reduction
Define a function f(i) = nums[i]. Then iterating f starting from index 0 visits:
0 → nums[0] → nums[nums[0]] → nums[nums[nums[0]]] → ...Each value is in [1, n], so the iteration stays within indices [1, n] after the first step (and 0 is only the start). Since values come from n + 1 slots but map to only n distinct values (1 through n), by pigeonhole two different indices share the same next-value — that’s the duplicate, and it’s the entry point of a cycle in this implicit graph.
So the problem reduces to: find the entry of the cycle in this functional iteration. That’s exactly Linked List Cycle II.
Code
Use do { ... } while (or a while True with a break) for phase 1, not while (slow != fast).
Both slow and fast start at nums[0], so they’re equal at iteration 0. A plain while (slow != fast) would skip phase 1 entirely. The do-while (or while True + break-on-equal AFTER the move) guarantees at least one step before the equality check.
An alternative: binary search on the value space
There’s a slicker (and slower) O(n log n) approach: binary search on the answer. For each candidate value mid, count how many nums[i] <= mid. If that count is > mid, the duplicate is <= mid. Otherwise it’s > mid.
def find_duplicate_bs(nums):
lo, hi = 1, len(nums) - 1
while lo < hi:
mid = (lo + hi) // 2
if sum(1 for x in nums if x <= mid) > mid:
hi = mid
else:
lo = mid + 1
return loO(n log n) time, O(1) space. Slower than Floyd’s but easier to derive on the spot. Knowing both impresses interviewers.
Analysis
- Time:
O(n). Phase 1 and phase 2 each take at mostnsteps. - Space:
O(1).
Same skin
- Linked List Cycle II — same algorithm on actual linked lists.
- Circular Array Loop — variant where we follow
i + nums[i]instead ofnums[i]. - Happy Number — Floyd’s on the “sum of squared digits” function.
- Linked List Cycle (detection only) — phase 1 only.
This problem is a beautiful application of the “if you can frame iteration as a function with a fixed point or cycle, Floyd’s works in O(1) space” pattern. Anywhere you’d otherwise need a hash set to remember visited states, Floyd’s is the constant-space alternative.