The Framework
The blank whiteboard is the enemy. Beat it with a script — the same six steps, in the same order, every single time. The prompt changes; the process doesn’t. Internalize this and “design YouTube” stops being terrifying and starts being fill-in-the-blanks.
Step 1 — Requirements (don’t skip this, ever)
The fastest way to fail is to start drawing. Spend the first five minutes asking questions. You’re doing two things: scoping the problem to something buildable in 45 minutes, and surfacing the non-functional targets that will drive every later choice.
Clarify functional requirements
What are the 2–3 core features? For “design Twitter”: post a tweet, view a home timeline, follow a user. Explicitly defer the rest out loud — “I’ll leave search, DMs, and ads out of scope unless you’d like me to cover them.” This shows judgment and buys you time.
Pin down non-functional requirements
Ask directly:
- Scale — “How many daily active users? Read-heavy or write-heavy?”
- Latency — “What’s an acceptable p99 — is this interactive (sub-200 ms) or can it be async?”
- Availability vs consistency — “If there’s a network partition, is it worse to be down or to show slightly stale data?” (This is the CAP question in disguise.)
- Durability — “Can we ever lose data?”
State your assumptions
The interviewer often won’t give exact numbers. Make reasonable ones out loud and let them correct you: “I’ll assume 100 M DAU, read:write of 100:1, and that stale reads of a few seconds are acceptable. Sound right?”
The single biggest mistake junior candidates make: drawing boxes in minute one. You’ll design the wrong system fast. A senior candidate is comfortable spending 5+ minutes apparently “doing nothing” but actually defining the problem. The requirements you extract here are the rubric you’ll be graded against — get them wrong and even a beautiful design scores low.
Step 2 — Back-of-the-envelope estimation
Now turn the requirements into numbers. You’re not aiming for precision — you’re aiming for the right order of magnitude, because that’s what decides architecture. 100 QPS fits on one laptop. 1 M QPS needs hundreds of machines, sharding, and caching. The number tells you which world you’re in.
The only three formulas you need
- QPS = (daily active users × actions per user per day) ÷ 86,400 s. Approximate a day as 10⁵ seconds and the division is trivial.
- Storage = objects/day × size/object × retention (days). Multiply out, convert to TB/PB.
- Bandwidth = QPS × size/object. Split read bandwidth from write bandwidth.
And always compute peak, not just average — real traffic spikes 2–10× above the mean. Size for the peak or you fall over at the worst possible moment.
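The three formulas and the peak adjustment can be sketched as a few small helpers. This is a minimal illustration, not a sizing tool: the function names are hypothetical, and the default peak factor of 3 is an assumption picked from the 2–10× range mentioned above.

```python
SECONDS_PER_DAY = 86_400


def average_qps(daily_active_users: float, actions_per_user_per_day: float) -> float:
    """Average requests per second over a whole day."""
    return daily_active_users * actions_per_user_per_day / SECONDS_PER_DAY


def peak_qps(avg_qps: float, peak_factor: float = 3.0) -> float:
    """Scale the average up to a peak; peak_factor is an assumed multiplier."""
    return avg_qps * peak_factor


def storage_bytes(objects_per_day: float, bytes_per_object: float, retention_days: float) -> float:
    """Total bytes stored over the retention window."""
    return objects_per_day * bytes_per_object * retention_days


def bandwidth_bytes_per_s(qps: float, bytes_per_object: float) -> float:
    """Bytes per second on the wire; call once for reads and once for writes."""
    return qps * bytes_per_object


# Sanity check against the rule of thumb "1 million writes/day is about 12 writes/s".
print(round(average_qps(1_000_000, 1), 1))  # 11.6
```

Calling `bandwidth_bytes_per_s` separately with read QPS and write QPS keeps the two paths distinct, which matters when one side dominates.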
Numbers worth memorizing so you can do this in your head:
- 1 day ≈ 86,400 s ≈ 10⁵ s
- 1 million writes/day ≈ 12 writes/sec average
- An ASCII character = 1 byte, a typical short DB row ≈ hundreds of bytes to a few KB, a photo ≈ hundreds of KB to a few MB, a minute of video ≈ tens of MB
- 2¹⁰ ≈ 1 thousand (KB), 2²⁰ ≈ 1 million (MB), 2³⁰ ≈ 1 billion (GB)
- A commodity server: ~10–100 K simple QPS, tens of GB RAM, low-single-digit TB disk
A worked estimate: “design a URL shortener (TinyURL)”
Assume 100 M new URLs/day, read:write = 100:1, 500 bytes stored per URL, kept 5 years.
- Write QPS: 100 M ÷ 10⁵ ≈ 1,000 writes/s (peak ×3 ≈ 3,000/s).
- Read QPS: 100× that ≈ 100,000 reads/s (peak ≈ 300,000/s). → reads dominate; this screams “cache + read replicas.”
- Storage: 100 M × 500 B = 50 GB/day → × 365 × 5 ≈ 90 TB over 5 years. → too big for one machine’s RAM; fits comfortably on a sharded disk store.
Three numbers, and the architecture is already taking shape: heavy reads → cache; 90 TB → shard the database; 100:1 ratio → optimize the read path above all else.
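The same arithmetic, written out so each step can be checked. The inputs are the assumptions stated in the worked estimate; the bandwidth lines apply the bandwidth formula to those inputs and are an addition, not a figure from the estimate itself. The exact-day division is included only to show how little the 10⁵ shortcut costs.

```python
# Inputs taken from the worked estimate.
new_urls_per_day = 100_000_000
read_write_ratio = 100
bytes_per_url = 500
retention_days = 365 * 5
peak_factor = 3  # assumed, as in the worked estimate

# QPS, using the "1 day is about 10^5 s" shortcut, then the exact day length.
approx_write_qps = new_urls_per_day / 10**5
exact_write_qps = new_urls_per_day / 86_400
read_qps = approx_write_qps * read_write_ratio

# Storage, in decimal units (1 GB = 10^9 bytes, 1 TB = 1000 GB).
storage_per_day_gb = new_urls_per_day * bytes_per_url / 10**9
storage_total_tb = storage_per_day_gb * retention_days / 1000

# Bandwidth = QPS x size/object, split by direction.
read_bandwidth_mb_s = read_qps * bytes_per_url / 10**6
write_bandwidth_mb_s = approx_write_qps * bytes_per_url / 10**6

print(f"write QPS: ~{approx_write_qps:,.0f} (exact {exact_write_qps:,.0f}), peak {approx_write_qps * peak_factor:,.0f}")
print(f"read QPS:  ~{read_qps:,.0f}, peak {read_qps * peak_factor:,.0f}")
print(f"storage:   {storage_per_day_gb:.0f} GB/day, {storage_total_tb:.2f} TB over 5 years")
print(f"bandwidth: {read_bandwidth_mb_s:.1f} MB/s read, {write_bandwidth_mb_s:.1f} MB/s write")
```

The shortcut understates write QPS by roughly 15%, which does not change the order of magnitude and therefore does not change the architecture.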
Step 3 — API design
Define the contract between client and server. Keep it small — just the core features. This is where functional requirements become concrete, and it forces you to decide what data flows in and out.
Things interviewers listen for here: sensible HTTP verbs, where pagination goes (always paginate list endpoints — ?cursor=...&limit=20), how you handle auth (an API key or token in the header), and whether reads and writes are cleanly separated.
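For the URL-shortener example, the contract might look like the sketch below. The endpoint paths, field names, and the `Page` and `list_page` names are all hypothetical, chosen only to illustrate verbs, auth, and cursor pagination. The pagination helper assumes the codes are kept sorted and uses the last returned code as the cursor.

```python
import bisect
from dataclasses import dataclass
from typing import Optional

# Hypothetical endpoints (illustrative names, not a real service):
#   POST /v1/urls               body {"long_url": "..."}  -> 201 {"short_code": "..."}
#   GET  /v1/urls/{short_code}                            -> 301 redirect to long_url
#   GET  /v1/urls?cursor=...&limit=20                     -> one page of the caller's URLs
# Every request carries a token in an Authorization header.


@dataclass
class Page:
    items: list[str]
    next_cursor: Optional[str]  # None means there are no more pages


def list_page(sorted_codes: list[str], cursor: Optional[str] = None, limit: int = 20) -> Page:
    """Return up to `limit` codes that sort strictly after `cursor`."""
    start = 0 if cursor is None else bisect.bisect_right(sorted_codes, cursor)
    items = sorted_codes[start:start + limit]
    has_more = start + limit < len(sorted_codes)
    return Page(items=items, next_cursor=items[-1] if has_more and items else None)


codes = ["a", "b", "c", "d", "e"]
first = list_page(codes, limit=2)
second = list_page(codes, cursor=first.next_cursor, limit=2)
print(first.items, second.items)  # ['a', 'b'] ['c', 'd']
```

A key-based cursor like this stays stable when new rows are inserted, which is one reason to prefer it over an offset.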
Step 4 — Data model
What are the entities, and — crucially — what are the access patterns? The access pattern decides SQL vs NoSQL far more than the data shape does.
The line that earns points: “Because the only read is a point lookup by short_code and there are no joins, a key-value/NoSQL store fits better than a relational one — and it shards trivially on the key.” You connected the data model to a storage choice via the access pattern. That’s the move.
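A minimal sketch of that data model follows, with in-memory dicts standing in for the key-value nodes. The `UrlRecord` fields other than `short_code`, and the `ShardedStore` and `shard_for` names, are assumptions made for illustration. Sharding here is plain hash-modulo, the simplest scheme; it moves most keys whenever the shard count changes.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UrlRecord:
    short_code: str  # primary key and the only lookup key
    long_url: str
    created_at: int  # Unix seconds; hypothetical field


def shard_for(short_code: str, num_shards: int) -> int:
    """Map a key to a shard with a stable hash (not Python's per-process hash())."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Each dict stands in for one key-value node."""

    def __init__(self, num_shards: int) -> None:
        self.shards: list[dict[str, UrlRecord]] = [{} for _ in range(num_shards)]

    def put(self, record: UrlRecord) -> None:
        shard = self.shards[shard_for(record.short_code, len(self.shards))]
        shard[record.short_code] = record

    def get(self, short_code: str) -> Optional[UrlRecord]:
        # A point lookup touches exactly one shard; no joins, no scatter-gather.
        return self.shards[shard_for(short_code, len(self.shards))].get(short_code)


store = ShardedStore(num_shards=4)
store.put(UrlRecord("abc123", "https://example.com/a/very/long/path", 0))
print(store.get("abc123").long_url)
```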
Step 5 — High-level design
Now you draw boxes. Start with the simplest thing that satisfies the functional requirements — client → load balancer → app servers → database — then layer in the components that satisfy the non-functional ones you found in step 1.
The narration matters as much as the diagram: “I’m adding the load balancer so I can scale app servers horizontally and survive a server dying. I’m adding the cache because reads outnumber writes 100:1, and a cache hit at ~1 ms beats a DB read at ~10 ms.” Tie. Every. Box. To. A. Requirement.
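One common way to wire a cache into the read path is the cache-aside pattern: check the cache, fall back to the database on a miss, then populate the cache. The text above does not name a specific pattern, so treat this as one reasonable choice rather than the prescribed design. Plain dicts stand in for the cache and the database, and the `ReadPath` name is hypothetical.

```python
from typing import Optional


class ReadPath:
    """Cache-aside lookup with a counter to show how many reads reach the DB."""

    def __init__(self, db: dict[str, str]) -> None:
        self.db = db                      # stand-in for the database
        self.cache: dict[str, str] = {}   # stand-in for a cache tier
        self.db_reads = 0

    def get(self, short_code: str) -> Optional[str]:
        if short_code in self.cache:      # hit: served without touching the DB
            return self.cache[short_code]
        self.db_reads += 1                # miss: pay for one DB read
        value = self.db.get(short_code)
        if value is not None:
            self.cache[short_code] = value
        return value


path = ReadPath({"abc123": "https://example.com"})
for _ in range(100):
    path.get("abc123")
print(path.db_reads)  # 1
```

A real cache would also need eviction and a TTL; both are omitted here to keep the read path itself visible.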
Step 6 — Deep dives & bottlenecks
The first five steps get you a working design. The last 15 minutes are where you earn the offer: find your own bottleneck and address it. This is the difference between a passing and a strong interview.
Good deep-dive directions to volunteer:
- “The hottest URL could overwhelm one cache node — I’d handle that with consistent hashing plus replication of hot keys.”
- “Generating unique short codes across many servers is a coordination problem — here’s how a key-generation service or a counter-with-base62 avoids collisions.”
- “For the 100k/s read path, a single DB won’t keep up, so I’d add read replicas and shard by short_code.”
- Single points of failure: “the load balancer itself is an SPOF; in production I’d run an active-passive pair.”
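The counter-with-base62 idea from the list above can be sketched as follows. It assumes some coordinated source hands out unique integers (a single counter, or disjoint ranges per server); that source is not shown. The alphabet order is an arbitrary choice, and the function names are hypothetical.

```python
import string

# 62 symbols: 0-9, a-z, A-Z. The ordering is an arbitrary choice for this sketch.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Turn a non-negative counter value into a short code."""
    if n < 0:
        raise ValueError("counter values must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(encode_base62(62))                    # 10
print(decode_base62(encode_base62(12345)))  # 12345
```

Because each integer maps to exactly one code, uniqueness of the counter guarantees uniqueness of the short codes. At 100 M new URLs/day for 5 years (about 1.8 × 10¹¹ codes), 6 characters are not enough and 7 are, since 62⁶ ≈ 5.7 × 10¹⁰ and 62⁷ ≈ 3.5 × 10¹².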
Drive your own deep dives. Don’t wait to be asked “what’s wrong with this?” — say it first. “The obvious bottleneck here is X; let me address it.” Naming your design’s weakness before the interviewer does is the strongest possible signal: it shows you actually understand the trade-offs, not just the happy path.
The script on one line
Requirements (what + how-well) → Estimate (QPS, storage, bandwidth) → API (the contract) → Data model (entities + access patterns) → High-level design (boxes justified by requirements) → Deep dives (find and fix your own bottleneck).
Next: Building Blocks — the Lego bricks (load balancers, caches, CDNs, queues, databases) you’ll snap together in step 5, with the “when and why” for each.