The Framework

The blank whiteboard is the enemy. Beat it with a script — the same six steps, in the same order, every single time. The prompt changes; the process doesn’t. Internalize this and “design YouTube” stops being terrifying and starts being fill-in-the-blanks.

The 6-step interview script — spend your 45 minutes roughly like this

Time-box ruthlessly. The two most-skipped steps — requirements and estimation — are the two that interviewers grade hardest. Don't draw a single box before you've done them.

Step 1 — Requirements (don’t skip this, ever)

The fastest way to fail is to start drawing. Spend the first five minutes asking questions. You’re doing two things: scoping the problem to something buildable in 45 minutes, and surfacing the non-functional targets that will drive every later choice.

Clarify functional requirements

What are the 2–3 core features? For “design Twitter”: post a tweet, view a home timeline, follow a user. Explicitly defer the rest out loud — “I’ll leave search, DMs, and ads out of scope unless you’d like me to cover them.” This shows judgment and buys you time.

Pin down non-functional requirements

Ask directly:

Scale — “How many daily active users? Read-heavy or write-heavy?”
Latency — “What’s an acceptable p99 — is this interactive (sub-200 ms) or can it be async?”
Availability vs consistency — “If there’s a network partition, is it worse to be down or to show slightly stale data?” (This is the CAP question in disguise.)
Durability — “Can we ever lose data?”

State your assumptions

The interviewer often won’t give exact numbers. Make reasonable ones out loud and let them correct you: “I’ll assume 100 M DAU, read:write of 100:1, and that stale reads of a few seconds are acceptable. Sound right?”

⚠️

The single biggest mistake junior candidates make: drawing boxes in minute one. You’ll design the wrong system fast. A senior candidate is comfortable spending 5+ minutes apparently “doing nothing” but actually defining the problem. The requirements you extract here are the rubric you’ll be graded against — get them wrong and even a beautiful design scores low.

Step 2 — Back-of-the-envelope estimation

Now turn the requirements into numbers. You’re not aiming for precision — you’re aiming for the right order of magnitude, because that’s what decides architecture. 100 QPS fits on one laptop. 1 M QPS needs hundreds of machines, sharding, and caching. The number tells you which world you’re in.

The only three formulas you need

QPS = (daily active users × actions per user per day) ÷ 86,400 s. Approximate a day as 10⁵ seconds and the division is trivial.
Storage = objects/day × size/object × retention (days). Multiply out, convert to TB/PB.
Bandwidth = QPS × size/object. Split read bandwidth from write bandwidth.

And always compute peak, not just average — real traffic spikes 2–10× above the mean. Size for the peak or you fall over at the worst possible moment.
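The three formulas and the peak adjustment can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool: the function name `estimate` and its parameter names are made up for this example, the default peak factor of 3 and five-year retention are assumptions, and it assumes every read or write moves one object of the same size.

```python
SECONDS_PER_DAY = 86_400


def estimate(dau, writes_per_user, read_write_ratio, object_bytes,
             peak_factor=3, retention_days=5 * 365):
    """Back-of-the-envelope capacity numbers (hypothetical helper)."""
    writes_per_day = dau * writes_per_user
    reads_per_day = writes_per_day * read_write_ratio

    # QPS = actions per day / seconds per day
    avg_write_qps = writes_per_day / SECONDS_PER_DAY
    avg_read_qps = reads_per_day / SECONDS_PER_DAY

    # Storage = objects/day x size/object x retention
    storage_bytes_per_day = writes_per_day * object_bytes

    return {
        "avg_write_qps": avg_write_qps,
        "avg_read_qps": avg_read_qps,
        # Size for the peak, not the mean.
        "peak_write_qps": avg_write_qps * peak_factor,
        "peak_read_qps": avg_read_qps * peak_factor,
        "storage_bytes_per_day": storage_bytes_per_day,
        "storage_bytes_total": storage_bytes_per_day * retention_days,
        # Bandwidth = QPS x size/object, split by direction.
        "write_bytes_per_sec": avg_write_qps * object_bytes,
        "read_bytes_per_sec": avg_read_qps * object_bytes,
    }


numbers = estimate(dau=10_000_000, writes_per_user=5,
                   read_write_ratio=100, object_bytes=1024)
print(round(numbers["avg_write_qps"]))  # 579
```

Keeping read and write figures as separate keys mirrors the advice to split read bandwidth from write bandwidth, since the two usually differ by orders of magnitude.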

A sample calculation

Inputs:

Daily active users: 10 M
Writes per user / day: 5
Read : write ratio: 100:1
Size per object: 1 KB
Peak factor (× average): 3×

Results (with 1 KB = 1,024 bytes):

Writes / day: 50 M
Reads / day: 5 B
Avg write QPS: 579 /s
Avg read QPS: 57.9 K /s
Peak write QPS: 1.7 K /s
Peak read QPS: 173.6 K /s
New storage / day: 47.68 GB
Storage @ 5 years: 84.98 TB

Rule of thumb: 1 day ≈ 86,400 s ≈ 10⁵ s. So "X per day" ÷ 10⁵ ≈ QPS. Reads usually dwarf writes — that read:write ratio is why caches and read-replicas exist.

Numbers worth memorizing so you can do this in your head:

1 day ≈ 86,400 s ≈ 10⁵ s
1 million writes/day ≈ 12 writes/sec on average
An ASCII character = 1 byte, a typical short DB row ≈ hundreds of bytes to a few KB, a photo ≈ hundreds of KB to a few MB, a minute of video ≈ tens of MB
2¹⁰ ≈ 1 thousand (KB), 2²⁰ ≈ 1 million (MB), 2³⁰ ≈ 1 billion (GB)
A commodity server: ~10–100 K simple QPS, tens of GB RAM, low-single-digit TB disk

A worked estimate: “design a URL shortener (TinyURL)”

Assume 100 M new URLs/day, read:write = 100:1, 500 bytes stored per URL, kept 5 years.

Write QPS: 100 M ÷ 10⁵ ≈ 1,000 writes/s (peak ×3 ≈ 3,000/s).
Read QPS: 100× that ≈ 100,000 reads/s (peak ≈ 300,000/s). → reads dominate; this screams “cache + read replicas.”
Storage: 100 M × 500 B = 50 GB/day → × 365 × 5 ≈ 91 TB, call it 90 TB over 5 years. → too big for one machine’s RAM or disk; fits comfortably on a sharded disk store.

Three numbers, and the architecture is already taking shape: heavy reads → cache; 90 TB → shard the database; 100:1 ratio → optimize the read path above all else.
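The worked estimate can be checked with a few lines of arithmetic. This sketch only replays the assumptions stated above (100 M new URLs/day, 100:1, 500 bytes, 5 years) and compares the 10⁵-second shortcut with the exact 86,400-second day; the variable names are illustrative.

```python
writes_per_day = 100_000_000
read_write_ratio = 100
bytes_per_url = 500
retention_days = 5 * 365

# Shortcut used in the text: treat a day as 10**5 seconds.
approx_write_qps = writes_per_day / 10**5
# Exact figure, for comparison.
exact_write_qps = writes_per_day / 86_400

approx_read_qps = approx_write_qps * read_write_ratio

# Decimal terabytes: 1 TB = 10**12 bytes.
storage_tb = writes_per_day * bytes_per_url * retention_days / 10**12

print(f"write QPS: ~{approx_write_qps:,.0f} (exact {exact_write_qps:,.0f})")
print(f"read QPS:  ~{approx_read_qps:,.0f}")
print(f"storage:   ~{storage_tb:.0f} TB over 5 years")
```

The shortcut lands about 14% below the exact write QPS, which does not change the order of magnitude and so does not change the architecture.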

Step 3 — API design

Define the contract between client and server. Keep it small — just the core features. This is where functional requirements become concrete, and it forces you to decide what data flows in and out.

Things interviewers listen for here: sensible HTTP verbs, where pagination goes (always paginate list endpoints — ?cursor=...&limit=20), how you handle auth (an API key or token in the header), and whether reads and writes are cleanly separated.
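For the URL-shortener example, the contract might look roughly like the sketch below. Everything here is hypothetical: the routes in the comments, the `UrlApi` class, and the bearer-token header are placeholders chosen for illustration, and an in-memory dict stands in for the real service. The caller supplies the short code only to keep the sketch short; a real service would generate it.

```python
from dataclasses import dataclass, field


@dataclass
class UrlApi:
    """In-memory stand-in for a URL-shortener API (all names hypothetical)."""
    urls: dict = field(default_factory=dict)  # short_code -> long_url

    # POST /v1/urls   header: Authorization: Bearer <token>
    def create(self, token, short_code, long_url):
        self._check_auth(token)
        self.urls[short_code] = long_url
        return {"short_code": short_code}

    # GET /v1/urls/{short_code}   (public read path, no auth)
    def resolve(self, short_code):
        return self.urls.get(short_code)  # None stands in for a 404

    # GET /v1/urls?cursor=...&limit=20
    def list_urls(self, token, cursor=None, limit=20):
        self._check_auth(token)
        codes = sorted(self.urls)
        if cursor is not None:
            # Resume strictly after the last code the client has seen.
            codes = [c for c in codes if c > cursor]
        page = codes[:limit]
        next_cursor = page[-1] if len(codes) > limit else None
        return {"items": page, "next_cursor": next_cursor}

    def _check_auth(self, token):
        if not token:
            raise PermissionError("missing API token")


api = UrlApi()
for i in range(5):
    api.create("secret-token", f"code{i}", f"https://example.com/{i}")

first = api.list_urls("secret-token", limit=2)
second = api.list_urls("secret-token", cursor=first["next_cursor"], limit=2)
```

A cursor (the last key seen) is used rather than a page offset so that each page is a cheap range scan and inserts between requests do not shift later pages.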

Step 4 — Data model

What are the entities, and — crucially — what are the access patterns? The access pattern decides SQL vs NoSQL far more than the data shape does.

The line that earns points: “Because the only read is a point lookup by short_code and there are no joins, a key-value/NoSQL store fits better than a relational one — and it shards trivially on the key.” You connected the data model to a storage choice via the access pattern. That’s the move.
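A minimal sketch of that argument, assuming a single entity keyed by `short_code` and a fixed shard count: the record fields other than `short_code`, the `NUM_SHARDS` value, and the helper names are illustrative, and plain dicts stand in for the key-value store's shards.

```python
import hashlib
from dataclasses import dataclass

NUM_SHARDS = 16  # assumed shard count, purely illustrative


@dataclass(frozen=True)
class UrlRecord:
    short_code: str  # primary key and the only lookup key
    long_url: str
    created_at: int  # Unix seconds


def shard_for(short_code: str) -> int:
    """Map a key to a shard with a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# One dict per shard stands in for one storage node.
shards = [dict() for _ in range(NUM_SHARDS)]


def put(record: UrlRecord) -> None:
    shards[shard_for(record.short_code)][record.short_code] = record


def get(short_code: str):
    # A point lookup touches exactly one shard; no joins, no scatter-gather.
    return shards[shard_for(short_code)].get(short_code)


put(UrlRecord("abc123", "https://example.com/long/path", 1_700_000_000))
```

Hash-modulo placement is the simplest option but remaps most keys when the shard count changes, which is one reason consistent hashing tends to come up in the deep dive.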

Step 5 — High-level design

Now you draw boxes. Start with the simplest thing that satisfies the functional requirements — client → load balancer → app servers → database — then layer in the components that satisfy the non-functional ones you found in step 1.

Start simple, then justify each addition against a requirement

The load balancer is there because we need to scale app servers horizontally. The cache is there because reads dominate writes 100:1. Each box exists to satisfy a specific requirement — never add one 'just because.'

The narration matters as much as the diagram: “I’m adding the load balancer so I can scale app servers horizontally and survive a server dying. I’m adding the cache because reads outnumber writes 100:1, and a cache hit at ~1 ms beats a DB read at ~10 ms.” Tie. Every. Box. To. A. Requirement.
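The cache box in that narration is often implemented with the cache-aside pattern: check the cache, fall back to the database on a miss, then populate the cache. The sketch below is one simple way to show it, with plain dicts standing in for both the cache and the database; the `CachedReader` name is hypothetical, and eviction, TTLs, and invalidation are deliberately left out.

```python
class CachedReader:
    """Cache-aside reads over a slower backing store (both are dicts here)."""

    def __init__(self, database):
        self.database = database
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]

        # Miss: read from the database, then populate the cache
        # so the next read for this key is served from memory.
        self.misses += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value


reader = CachedReader({"abc123": "https://example.com/long/path"})
for _ in range(101):
    reader.get("abc123")

print(reader.hits, reader.misses)  # 100 1
```

With a 100:1 read:write ratio, a hot key behaves much like this loop: one database read, then a long run of cache hits.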

Step 6 — Deep dives & bottlenecks

The first five steps get you a working design. The last 15 minutes are where you earn the offer: find your own bottleneck and address it. This is the difference between a passing and a strong interview.

Good deep-dive directions to volunteer:

“The hottest URL could overwhelm one cache node — I’d handle that with consistent hashing plus replication of hot keys.”
“Generating unique short codes across many servers is a coordination problem — here’s how a key-generation service or a counter-with-base62 avoids collisions.”
“For the 100k/s read path, a single DB won’t keep up — I’d add read replicas and shard by short_code.”
“The load balancer itself is a single point of failure; in production I’d run an active-passive pair.”
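The counter-with-base62 idea from the list above can be sketched as follows. This assumes each server is handed unique integers from some counter (how that counter is distributed across servers is the coordination problem itself and is not shown); the alphabet order and function names are arbitrary choices for illustration.

```python
import string

# 0-9, a-z, A-Z: 62 URL-safe symbols.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n: int) -> str:
    """Turn a unique counter value into a short code."""
    if n < 0:
        raise ValueError("counter values must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    # Digits come out least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


print(encode_base62(61), encode_base62(62))  # Z 10
```

Because distinct integers always encode to distinct strings, collisions are impossible as long as the counter never repeats. Seven characters cover 62⁷ ≈ 3.5 × 10¹² codes, comfortably above the roughly 1.8 × 10¹¹ URLs implied by 100 M/day over five years.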

Drive your own deep dives. Don’t wait to be asked “what’s wrong with this?” — say it first. “The obvious bottleneck here is X; let me address it.” Naming your design’s weakness before the interviewer does is the strongest possible signal: it shows you actually understand the trade-offs, not just the happy path.

The script on one line

Requirements (what + how-well) → Estimate (QPS, storage, bandwidth) → API (the contract) → Data model (entities + access patterns) → High-level design (boxes justified by requirements) → Deep dives (find and fix your own bottleneck).

Quick check

A prompt says 'design a system handling 50 million writes per day.' Roughly what's the average write QPS, and why does the number matter?

Why estimate for PEAK QPS rather than average QPS?

In the framework, why design the data model's ACCESS PATTERNS before choosing SQL vs NoSQL?

Next: Building Blocks — the Lego bricks (load balancers, caches, CDNs, queues, databases) you’ll snap together in step 5, with the “when and why” for each.
