Sort Characters by Frequency

Sort Characters by Frequency medium

Description

Given a string s, return a new string where its characters are sorted in decreasing order of frequency. Characters with the same frequency may appear in any order.

Examples

> Case 1:
    Input:  s = "tree"
    Output: "eert"     (or "eetr" — 'e' appears twice, others once)
 
> Case 2:
    Input:  s = "cccaaa"
    Output: "aaaccc"   (or "cccaaa")
 
> Case 3:
    Input:  s = "Aabb"
    Output: "bbAa"     (or "bbaA" — 'A' and 'a' are distinct characters)

Constraints

1 <= s.length <= 5 · 10^5
s consists of uppercase and lowercase English letters and digits.

Approach 1: Counter + sort

The straightforward solution:

Count frequencies.
Sort the (count, char) pairs by count descending.
Emit each character count times.

Time: O(n + k log k) where k is the number of distinct characters. Space: O(k).

Approach 2: Max-heap

Same idea, using a heap instead of sort. The heap version is strictly worse for this problem (no streaming, no top-K, no benefit from O(log k) per pop) — but it’s a useful exercise in the heap pattern.

Approach 3: Bucket sort — O(n)

Frequencies are bounded by n (no character can appear more than n times). Use buckets indexed by count.

def frequency_sort(s):
    counts = Counter(s)
    buckets = [[] for _ in range(len(s) + 1)]    # bucket[i] = chars with count i
    for ch, cnt in counts.items():
        buckets[cnt].append(ch)
    result = []
    for cnt in range(len(buckets) - 1, 0, -1):
        for ch in buckets[cnt]:
            result.append(ch * cnt)
    return "".join(result)

Time: O(n). Space: O(n). Optimal — but only beats Counter+sort when k is comparable to n (i.e., lots of distinct characters with similar counts).

Approach comparison

Approach	Time	Space	When to use
Counter + sort	O(n + k log k)	O(k)	Default — short, clear
Max-heap	O(n + k log k)	O(k)	Practicing heap patterns
Bucket sort	O(n)	O(n)	When you want optimal time

For most interview answers, Counter + sort is the cleanest. Mention bucket sort as the O(n) optimization if asked to do better.

The frequency-count + heap shape generalizes massively. Top K Frequent (day 7), Reorganize String, Task Scheduler, K Most Common Words, Find K Frequent in a Stream — all start with the same Counter(s) line. Once you’ve seen the template, you’ll recognize it at a glance.

Analysis (best approach)

Time: O(n) with bucket sort, O(n + k log k) with sort or heap.
Space: O(n) (bucket) or O(k) (sort / heap).

K Closest Points to Origin Task Scheduler

Finished this page?