Case Study — Designing a URL Shortener

Time to run the whole framework end-to-end on one problem. The URL shortener (TinyURL, bit.ly) is the Hello World of system design — small enough to finish in 45 minutes, rich enough to touch caching, sharding, unique-ID generation, and read/write asymmetry. Master this one and you have a template for the rest.

We’ll narrate it exactly as you would on a whiteboard.

Step 1 — Requirements

Functional:

Given a long URL, return a short URL (e.g. https://tiny.co/aZ3kP9q).
Visiting the short URL redirects to the original.
(Optional) custom aliases, expiration, click analytics.

Non-functional:

Read-heavy — far more people click short links than create them.
Low latency — a redirect must feel instant (sub-100 ms).
High availability — a dead link service is useless; we lean AP (a redirect that occasionally serves a slightly stale mapping is fine; being down is not).
Short codes must be unique and not easily guessable in sequence.

Out of scope (stated aloud): user accounts, dashboards, spam detection.

Step 2 — Estimation

Assume 100 M new URLs/day and a read:write ratio of 100:1.

Write QPS: 100 M ÷ 10⁵ s ≈ 1,160/s → peak (×3) ≈ ~3,500/s.
Read QPS: 100 × write ≈ 116,000/s → peak ≈ ~350,000/s. Reads dominate by two orders of magnitude — the entire design optimizes the read path.
Storage: ~500 bytes/record × 100 M/day = 50 GB/day → over 5 years ≈ ~90 TB. Won’t fit on one machine → we’ll shard.
Short codes needed: 100 M/day × 365 × 5 ≈ ~180 B codes over 5 years.

That last number sizes the key space. How long must a code be?

Sizing the short code. Using base62 (a–z A–Z 0–9 = 62 characters):

62⁶ ≈ 56 billion — not enough for 180 B.
62⁷ ≈ 3.5 trillion — comfortably more than 180 B, with headroom.

So 7 characters is the answer. This little calculation — “how many characters do I need?” — is exactly the kind of concrete reasoning interviewers want to see, instead of hand-waving “we’ll generate a random string.”

Step 3 — API

One detail worth a sentence: 301 vs 302. A 301 (permanent) lets browsers cache the redirect — fewer hits on your server, but you lose per-click analytics (the browser skips you next time). A 302 (temporary) routes every click through you (you can count clicks) at the cost of more traffic. “I’d use 302 if analytics matter, 301 if raw redirect speed matters.” — naming the trade-off scores.

Step 4 — Data model

The only read is point lookup by short_code. No joins, no range scans, no transactions. That access pattern points straight at a key-value / NoSQL store that shards on the key.

Step 5 — High-level design

URL shortener — read path optimized, writes go through a key-generation service

Writes get a unique code from the Key Generation Service and store the mapping. Reads (the 100× majority) hit Redis first; on a miss they fall back to the sharded KV store and repopulate the cache. The read path is what we optimized for.

The read path (the hot path):

GET /aZ3kP9q → load balancer → read service.
Read service checks Redis. Cache hit (the common case for popular links) → redirect in ~1 ms.
Cache miss → fetch from the sharded KV store (sharded on short_code, so the lookup goes straight to one node), populate Redis with a TTL, redirect.

The write path:

POST /shorten → write service.
Get a guaranteed-unique 7-char code from the Key Generation Service (more below).
Store short_code → long_url in the KV store. Return the short URL.

Step 6 — Deep dives

This is where the 45-minute interview is won. Three meaty dives:

Deep dive A — generating unique short codes

The hard part isn’t shortening — it’s guaranteeing uniqueness across many write servers without them colliding. Three approaches, with trade-offs:

Option 1 — Hash the long URL (e.g. MD5 → base62, take 7 chars)

Simple and stateless. Problem: hashing can collide (two URLs → same 7 chars), so you must check-and-retry on collision, and the same URL always maps to the same code (sometimes good, sometimes a privacy leak).

Option 2 — Auto-increment counter → base62 encode

A global counter (1, 2, 3, …) base62-encoded gives guaranteed-unique, ever-growing codes. Problem: a single counter is a bottleneck and an SPOF, and sequential codes are guessable (aaaaaab follows aaaaaaa — competitors can scrape your whole namespace).

Option 3 — Key Generation Service (KGS) — the strong answer

A dedicated service pre-generates random unique 7-char keys offline and stores them in a “keys” table split into used and unused. Write servers grab a batch of unused keys, mark them used. Uniqueness is guaranteed at generation time (no runtime collision checks), it’s not sequential (not guessable), and batching keeps the KGS from being a per-request bottleneck.

The KGS makes uniqueness a pre-computation problem instead of a runtime-coordination problem. That reframing — “move the hard guarantee offline so the hot path stays simple” — is a pattern worth carrying to other designs (e.g. pre-allocating ID ranges to servers, à la Twitter’s Snowflake).

Deep dive B — scaling the read path to 350k/s

Cache the hot links. Link popularity follows a power law — a tiny fraction of links get the vast majority of clicks. An 80–90%+ cache hit rate is realistic, so Redis absorbs most of the 350k/s and the DB sees a fraction.
Cache eviction: LRU — keep the currently-popular links, drop the cold ones.
Read replicas behind the cache for misses, and shard the KV store on short_code so each shard owns ~1/N of the keyspace. Use consistent hashing so adding a shard doesn’t reshuffle everything.

Deep dive C — single points of failure & cleanup

No single LB, cache, or DB node should be fatal — run the LB active-passive, replicate Redis, replicate each shard.
Expiration / cleanup: a lazy approach (check expires_at on read, delete if past) plus a background sweeper for cold expired entries avoids a giant scheduled purge.

The finished picture

You started from a vague “design TinyURL” and, by following the script, arrived at: a read-optimized, cache-fronted, sharded key-value design with a key-generation service for collision-free codes — and you justified every box against a requirement or an estimate. That’s a passing interview.

Quick check

Why does the URL shortener design put so much effort into the READ path and comparatively little into writes?

An interviewer pushes: 'Why not just use an auto-incrementing counter, base62-encoded, for the short codes?' What's the best critique?

Why is a key-value / NoSQL store a better fit here than a relational database with joins?

Next: Basic Questions — concept checks to lock in the vocabulary, then the practice set of eight classic design prompts.

Scaling & Trade-offs Low-Level Design (LLD)

Finished this page?