πŸš€ Phases 1–4 are live β€” Days 1–13 + Day 17 are fully written. See the roadmap β†’

Sort Characters by Frequency medium

Description

Given a string s, return a new string where its characters are sorted in decreasing order of frequency. Characters with the same frequency may appear in any order.

Examples

> Case 1:
    Input:  s = "tree"
    Output: "eert"     (or "eetr" β€” 'e' appears twice, others once)
 
> Case 2:
    Input:  s = "cccaaa"
    Output: "aaaccc"   (or "cccaaa")
 
> Case 3:
    Input:  s = "Aabb"
    Output: "bbAa"     (or "bbaA" β€” 'A' and 'a' are distinct characters)

Constraints

  • 1 <= s.length <= 5 Β· 10^5
  • s consists of uppercase and lowercase English letters and digits.

Approach 1: Counter + sort

The straightforward solution:

  1. Count frequencies.
  2. Sort the (count, char) pairs by count descending.
  3. Emit each character count times.

Time: O(n + k log k) where k is the number of distinct characters. Space: O(k).

Approach 2: Max-heap

Same idea, using a heap instead of sort. The heap version is strictly worse for this problem (no streaming, no top-K, no benefit from O(log k) per pop) β€” but it’s a useful exercise in the heap pattern.

Approach 3: Bucket sort β€” O(n)

Frequencies are bounded by n (no character can appear more than n times). Use buckets indexed by count.

def frequency_sort(s):
    counts = Counter(s)
    buckets = [[] for _ in range(len(s) + 1)]    # bucket[i] = chars with count i
    for ch, cnt in counts.items():
        buckets[cnt].append(ch)
    result = []
    for cnt in range(len(buckets) - 1, 0, -1):
        for ch in buckets[cnt]:
            result.append(ch * cnt)
    return "".join(result)

Time: O(n). Space: O(n). Optimal β€” but only beats Counter+sort when k is comparable to n (i.e., lots of distinct characters with similar counts).

Approach comparison

ApproachTimeSpaceWhen to use
Counter + sortO(n + k log k)O(k)Default β€” short, clear
Max-heapO(n + k log k)O(k)Practicing heap patterns
Bucket sortO(n)O(n)When you want optimal time

For most interview answers, Counter + sort is the cleanest. Mention bucket sort as the O(n) optimization if asked to do better.

The frequency-count + heap shape generalizes massively. Top K Frequent (day 7), Reorganize String, Task Scheduler, K Most Common Words, Find K Frequent in a Stream β€” all start with the same Counter(s) line. Once you’ve seen the template, you’ll recognize it at a glance.

Analysis (best approach)

  • Time: O(n) with bucket sort, O(n + k log k) with sort or heap.
  • Space: O(n) (bucket) or O(k) (sort / heap).