Using Hash Maps in Real Code

Here’s a problem-solving cheat code: whenever you catch yourself about to write a nested loop that re-scans the data, stop and ask — “what if I could check ‘have I seen this before?’ in O(1)?” That magic spell is a hash map, and it collapses a startling number of O(n²) problems straight to O(n). This page is the four patterns that spell shows up as.

You almost never write a hash map from scratch. Every language ships one, and they’re battle-tested by millions of programs. This page covers the stdlib API in C++, Python, and Java, plus four patterns that solve 90% of interview problems.

The four core operations

Every language exposes the same fundamental API. Here’s the Rosetta stone:

Operation	C++	Python	Java
Create empty	`unordered_map<K, V> m;`	`m = {}`	`Map<K, V> m = new HashMap<>();`
Put / set	`m[k] = v;`	`m[k] = v`	`m.put(k, v);`
Get	`m[k]` (creates if missing!)	`m[k]` (KeyError if missing)	`m.get(k)`
Get with default	`m.count(k) ? m[k] : default`	`m.get(k, default)`	`m.getOrDefault(k, default)`
Contains key	`m.count(k) > 0` or `m.contains(k)`	`k in m`	`m.containsKey(k)`
Delete	`m.erase(k);`	`del m[k]`	`m.remove(k);`
Size	`m.size()`	`len(m)`	`m.size()`
Iterate keys + values	`for (auto& [k, v] : m)`	`for k, v in m.items():`	`for (var e : m.entrySet())`

⚠️

C++ trap: m[k] on a missing key silently inserts a default-constructed value (e.g. 0 for int). Use m.count(k) or m.find(k) first if you don’t want that. Python’s m[k] correctly throws KeyError; Java’s m.get(k) returns null.

Pattern 1: Counting frequencies

By far the most common use. Walk a sequence, build {element: count}. Fill in the update line — the “default to 0 if unseen, then add one” idiom:

python · fill in the blanks0/1 hints

counts = {}
for c in "abracadabra":
  # ??? bump the count, defaulting to 0 for a new key

This single pattern powers Top K Frequent, Sort Characters by Frequency, First Unique Character, Find All Anagrams, Group Anagrams, and many more.

Pattern 2: Seen-set for O(1) “have I seen this?”

A HashSet (or a HashMap you only use the keys of) lets you answer “have I encountered x before?” in O(1).

Set.add(x) returning false for duplicates (Java) and set.add(x) being a no-op (Python) are useful idioms — they fold “check + insert” into one operation.

Pattern 3: Value → original-index map (the Two Sum trick)

A hash map can also map values to indices. This is the trick behind Two Sum and most “find two elements that satisfy X” problems.

Predict first

You're scanning an array for two numbers that sum to a target. Brute force checks every pair — O(n²). What does a value→index hash map let you do instead?

def two_sum(nums, target):
    seen = {}                              # value → index
    for i, x in enumerate(nums):
        complement = target - x
        if complement in seen:
            return [seen[complement], i]
        seen[x] = i
    return []

The pattern generalizes: any time you need to know “is the partner of this element somewhere in the past?”, a value → index map answers in O(1).

Pattern 4: Grouping (bucket by a key signature)

When you need to group items that share a property — anagrams (same sorted letters), points on the same line (same slope from origin), etc. — a hash map keyed by the signature does it in one pass.

Choosing a good signature is the whole art of the bucket pattern. Sort the characters? Use a tuple of counts? Hash the chars in a particular way? Whatever makes “same group” easy to express as a hash key.

HashMap vs HashSet — when to use which

The difference is small but important:

HashSet<T> stores keys only. Use when you only care whether something is present.
HashMap<K, V> stores keys mapped to values. Use when you need additional info per key (counts, indices, lists of items).

In practice, Set is just Map<K, Boolean> with a friendlier API. Pick the one whose API reads cleanest in your problem.

Caveats and edge cases

A few real-world cracks to be aware of:

Iteration order is not guaranteed. unordered_map (C++) and HashMap (Java) iterate in implementation-defined order. Python 3.7+ guarantees insertion order for dict (and Java has LinkedHashMap for this). Don’t rely on hash-map iteration order unless you’ve explicitly checked the language spec.
Concurrent modification is unsafe. Modifying a hash map while iterating over it crashes in Java (ConcurrentModificationException), corrupts in C++, and raises RuntimeError in Python. Snapshot the keys first if you need to delete during iteration.
NaN as a key is weird. NaN != NaN, so you can insert it but never look it up. Avoid.
Custom keys need both hash and equals. Override one without the other and your map will silently misbehave. Java’s IDE-generated hashCode + equals is the safe path. Python uses __hash__ + __eq__.

The mental shortcut for problem-solving

When you’re stuck on a problem, ask:

“If I had a magic spell that could check ‘have I seen this value (or this signature) before?’ in O(1), would this be easier?”

If yes — and it almost always is — that magic spell is a hash map. Add it to your solution; you’ve usually just collapsed a quadratic problem to linear.

Quick check

In C++, you write `if (m[k] > 0)` to check a count for a key that might be missing. What's the subtle bug?

These four patterns cover most day-to-day hash-map work. Next: advanced topics — hashing custom objects, when iteration order or sorted access matters (and you reach past a hash map), and the security/consistency edges.

Collisions & Load Factor Applications

Finished this page?