Group Anagrams medium
Description
Given an array of strings strs, group the anagrams together. You can return the answer in any order. An anagram is a word formed by rearranging the letters of another — like "eat" and "tea".
Examples
> Case 1:
Input: strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]
> Case 2:
Input: strs = [""]
Output: [[""]]
> Case 3:
Input: strs = ["a"]
Output: [["a"]]Constraints
1 <= strs.length <= 10^40 <= strs[i].length <= 100strs[i]consists of lowercase English letters.
The bucketing pattern
Two strings are anagrams iff they have the same multiset of characters. So we need a signature that two anagrams share but non-anagrams don’t.
Two natural choices:
| Signature | Cost to compute | Notes |
|---|---|---|
| Sort the characters | O(k log k) | "eat" → "aet", "tea" → "aet" |
| Tuple of 26 character counts | O(k) | "eat" → (1,0,0,0,1,...,1,...) |
Either works. The sorted-string signature is shorter to write; the count-tuple is asymptotically faster.
Once we have the signature, we use a HashMap<signature, List<string>> and bucket each input under its signature.
Code (sort signature)
Time: O(n · k log k) where n = number of strings, k = max string length. Space: O(n · k) for the grouping.
Code (count signature — faster)
For long strings, the character-count signature avoids the O(k log k) sort:
Time: O(n · k) — strictly better than the sort version when strings are long.
Why the # separator in Java? Without it, [1, 11] and [11, 1] both become "111" — collisions! The separator #1#11 vs #11#1 keeps the signatures distinct. A subtle bug that’s easy to miss.
What about prime products?
A clever-but-fragile approach: assign each letter a distinct prime (a=2, b=3, c=5, …), then a string’s signature is the product of its letters’ primes. Anagrams have the same product because multiplication commutes.
It works mathematically — but overflows fast. The 26th prime is 101, and 101^20 is gigantic. Don’t use this on real inputs; it’s a fun trivia answer.
Analysis
| Approach | Time | Space | Notes |
|---|---|---|---|
| Sort signature | O(n · k log k) | O(n · k) | Simplest |
| Count signature | O(n · k) | O(n · k) | Fastest for long strings |
| Prime product (don’t!) | O(n · k) | O(n · k) | Overflows; bad for production |
This bucketing pattern generalizes to Group Shifted Strings, Group Words by Pattern, Find Duplicate Subtrees, and any problem of the form “group items that share a canonical signature.”