🎉 All 30 days are live — the full DSA-30 course, from Big-O to System Design. See the roadmap →

Design a URL Shortener (TinyURL) easy

This problem gets the full end-to-end treatment in the URL Shortener case study — including the 7-character key-space math and three approaches to unique-code generation. This page is the condensed, interview-pace version you’d actually narrate in 20 minutes. Read both: the case study for depth, this for the rhythm.

The prompt

Turn a long URL into a short one (https://tiny.co/aZ3kP9q) and redirect on visit.

Requirements

  • Functional: shorten a URL; redirect a short URL to the original. (Optional: custom alias, expiry, analytics.)
  • Non-functional: very read-heavy (clicks ≫ creates), low-latency redirect (sub-100 ms), highly available (lean AP), short codes unique and non-sequential.

Estimation

100 M new URLs/day, read:write = 100:1:

  • Writes ≈ 1.2k/s (peak ~3.5k/s); reads ≈ 116k/s (peak ~350k/s) → optimize the read path.
  • ~500 B/record × 100 M/day × 5 yr ≈ ~90 TBshard.
  • base62, 7 chars = 62⁷ ≈ 3.5 trillion codes — plenty.

API

Data model

Only access pattern is get(short_code) -> long_url: no joins, no scans. → key-value store, sharded on short_code.

urls:  short_code (PK) | long_url | created_at | expires_at

High-level design

Read-optimized: cache in front, sharded KV behind, KGS for codes
codewritecheckmissClientLoad BalancerWrite SvcRead SvcKey Gen SvcRedisSharded KV
Read path: LB → read service → Redis (hit ≈1 ms) → on miss, sharded KV + repopulate. Write path: write service → Key Generation Service for a unique code → store.

Deep dives

  • Unique codes: a Key Generation Service pre-generates random unique 7-char keys offline (no runtime collisions, not guessable). Beats a sequential counter (bottleneck + enumerable) and beats hashing (collisions). See the case study for all three.
  • Read scale (350k/s): link popularity follows a power law → 80–90%+ cache hit rate. LRU eviction; read replicas behind the cache; shard on short_code with consistent hashing.
  • 301 vs 302: 301 (permanent) is browser-cached → fewer hits but no analytics; 302 (temporary) routes every click through you → analytics at the cost of traffic.

Analysis

  • Redirect latency: ~1 ms on a cache hit, ~10 ms on a miss.
  • Storage: ~90 TB over 5 years, spread across shards.

Same skin

  • Pastebin — same shape, value is text instead of a URL; store the blob in object storage, metadata in the KV store.
  • Distributed unique ID generation (Snowflake) — the KGS idea generalizes to any “give me a globally unique ID without coordination” problem.
  • Any read-heavy point-lookup service — image/CDN key resolution, feature-flag lookup, session store.