🎉 All 30 days are live — the full DSA-30 course, from Big-O to System Design. See the roadmap →
Day 18 - TriesVariations & Tricks

Trie Variations & Tricks

The plain trie from the last page solves most string problems. But three variants unlock a much wider set — and one of them (the bitwise trie) is the single highest-leverage trick on this page because it turns a whole class of XOR problems from O(N²) into O(N · 32).

1. Storing a value, not just a flag

The smallest tweak: replace isEndOfWord (a boolean) with a value field. Now your trie is a map keyed by strings, not just a set — insert(key, value) and get(key). This is exactly what Map Sum Pairs and Replace Words need: walk to a node and read the value (or sum the values in a subtree).

TrieNode:
    children : char -> TrieNode
    value    : the payload stored at this key  (None if no word ends here)

2. Compressed trie (radix tree / Patricia trie)

A plain trie wastes a node per character even down long non-branching chains. If you insert "international", that’s 13 nodes in a single line — one child each. A compressed trie merges every non-branching chain into one edge labeled with the whole substring.

A plain trie for {test, team, toast} — note the single-child chains
teamstoast
root   end-of-word   internal
t→e is shared, then it branches (st / am). 'toast' branches at the top (t→o). A compressed trie would collapse 'st'→'t' and 'oast' into single edges, storing 'st', 'am', 'oast' as edge labels instead of one node per char.

The payoff: far fewer nodes (so less pointer overhead and better cache behavior) at the cost of more complex insert (you sometimes have to split an edge when a new word diverges mid-label). This is how IP routing tables and many production string indexes are actually built.

When to mention a radix tree in an interview: when the interviewer pushes on memory, or when keys are long with little branching (file paths, IP prefixes, URLs). You rarely need to implement one under time pressure — but knowing “I’d compress non-branching chains into a radix tree to cut node overhead” is a strong, specific answer to “how would you make this use less memory?“

3. The bitwise trie — the XOR superweapon

This is the variant worth memorizing. Treat each integer as a fixed-length binary string (say 32 bits) and insert it into a trie with an alphabet of just {0, 1}. Now the trie has depth 32 and at most two children per node.

Why is this magical? Because XOR is solved greedily, bit by bit, most-significant first. To find the number already in the set that maximizes x XOR (that number), walk down from the MSB and at each bit, try to go to the opposite bit of x — opposite bits XOR to 1, and a 1 in a higher position beats any combination of lower bits.

With this, Maximum XOR of Two Numbers in an Array drops from the brute-force O(N²) (every pair) to O(N · 32) — insert each number, then for each number greedily find its best partner. The same structure handles “max XOR with a query value,” “count pairs with XOR ≤ k,” and XOR-range problems.

⚠️

The two bitwise-trie gotchas: (1) always insert MSB-first — a higher bit dominates all lower bits, so the greedy choice only works top-down from the most significant bit. (2) Use a fixed bit width (e.g. 31 or 32) so every number is the same “length”; pad with leading zeros. Mixing widths silently misaligns the bits and breaks the greedy comparison.

4. Ternary search tree (TST)

A space-saving middle ground between a BST and a trie: each node stores one character and has three children — less (go left, same position), equal (go down, next position), and greater (go right, same position). It uses far less memory than an array-children trie for large alphabets, at a small speed cost (you binary-search among siblings instead of indexing). Good to name as a memory-conscious alternative; rarely needed in full.

Which variant for which problem

You see…Reach for
”is word / prefix in a set of strings?“plain trie
”map string keys to values / sums”trie with a value field
”long keys, little branching, memory-tight”compressed (radix) trie
maximize / minimize XOR”, “XOR pairs”bitwise trie
”large alphabet, want trie-like queries, less memory”ternary search tree

Quick check

In a bitwise trie, why do you insert and query starting from the MOST significant bit?
Maximum XOR of two numbers in an array is O(N²) brute force. How does a bitwise trie make it O(N·32)?
When is a compressed (radix) trie clearly preferable to a plain trie?

Next: Basic Questions — warm-ups to cement insert/search/prefix, then the practice set.