Group Anagrams

Group Anagrams medium

Description

Given an array of strings strs, group the anagrams together. You can return the answer in any order. An anagram is a word formed by rearranging the letters of another — like "eat" and "tea".

Examples

> Case 1:
    Input:  strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
    Output: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]
 
> Case 2:
    Input:  strs = [""]
    Output: [[""]]
 
> Case 3:
    Input:  strs = ["a"]
    Output: [["a"]]

Constraints

1 <= strs.length <= 10^4
0 <= strs[i].length <= 100
strs[i] consists of lowercase English letters.

The bucketing pattern

Two strings are anagrams iff they have the same multiset of characters. So we need a signature that two anagrams share but non-anagrams don’t.

Two natural choices:

Signature	Cost to compute	Notes
Sort the characters	O(k log k)	`"eat"` → `"aet"`, `"tea"` → `"aet"`
Tuple of 26 character counts	O(k)	`"eat"` → `(1,0,0,0,1,...,1,...)`

Either works. The sorted-string signature is shorter to write; the count-tuple is asymptotically faster.

Once we have the signature, we use a HashMap<signature, List<string>> and bucket each input under its signature.

Code (sort signature)

Time: O(n · k log k) where n = number of strings, k = max string length. Space: O(n · k) for the grouping.

Code (count signature — faster)

For long strings, the character-count signature avoids the O(k log k) sort:

Time: O(n · k) — strictly better than the sort version when strings are long.

⚠️

Why the # separator in Java? Without it, [1, 11] and [11, 1] both become "111" — collisions! The separator #1#11 vs #11#1 keeps the signatures distinct. A subtle bug that’s easy to miss.

What about prime products?

A clever-but-fragile approach: assign each letter a distinct prime (a=2, b=3, c=5, …), then a string’s signature is the product of its letters’ primes. Anagrams have the same product because multiplication commutes.

It works mathematically — but overflows fast. The 26th prime is 101, and 101^20 is gigantic. Don’t use this on real inputs; it’s a fun trivia answer.

Analysis

Approach	Time	Space	Notes
Sort signature	O(n · k log k)	O(n · k)	Simplest
Count signature	O(n · k)	O(n · k)	Fastest for long strings
Prime product (don’t!)	O(n · k)	O(n · k)	Overflows; bad for production

This bucketing pattern generalizes to Group Shifted Strings, Group Words by Pattern, Find Duplicate Subtrees, and any problem of the form “group items that share a canonical signature.”

Happy Number Longest Substring w/o Repeating

Finished this page?