Representing Graphs

Before you run a single graph algorithm, you make one quiet decision — how to store the graph — and it silently sets the memory and speed of everything that follows. For a 10,000-node graph, one choice uses ~100 MB and the other uses ~1000× less. Pick wrong and your perfectly correct algorithm runs out of memory. Pick right and it flies.

A graph is a concept. To run algorithms on it, you need to store it in memory. There are two universal ways — adjacency list and adjacency matrix — and choosing between them is one of the most important decisions in any graph problem.

Let’s use this graph as our running example throughout:

Our running example: 5 nodes, 6 edges

5 nodes · 6 edges

Representation 1: Adjacency list

The most common choice. For each node, store a list of its neighbours.

0: [1, 2]
1: [0, 2, 3]
2: [0, 1, 4]
3: [1, 4]
4: [2, 3]

In code, this is usually a Map<Node, List<Node>> (or simply a list-of-lists indexed by node number).

Recall the build — what’s the one extra line that makes a graph undirected?

python · fill in the blanks0/1 hints

def build_graph(n, edges, directed=False):
  adj = [[] for _ in range(n)]
  for u, v in edges:
      adj[u].append(v)            # u -> v
      if not directed:
          # ??? undirected edges go both ways
  return adj

What an adjacency list costs

Operation	Cost
Add an edge	O(1)
Look up a node’s neighbours	O(degree)
Check if edge (u, v) exists	O(degree of u)
Iterate every edge in the graph	O(V + E)
Total memory	O(V + E)

The “O(V + E) memory” is the key win. For sparse graphs (most real-world graphs are sparse — every node connected to only a few others), this is dramatically smaller than the matrix.

Representation 2: Adjacency matrix

A V × V boolean (or integer) matrix where M[u][v] = 1 if there’s an edge from u to v, and 0 otherwise. For weighted graphs, M[u][v] stores the weight directly (often using infinity for “no edge”).

For our example (undirected):

    0  1  2  3  4
0 [ 0  1  1  0  0 ]
1 [ 1  0  1  1  0 ]
2 [ 1  1  0  0  1 ]
3 [ 0  1  0  0  1 ]
4 [ 0  0  1  1  0 ]

The matrix is symmetric for undirected graphs (because (u, v) and (v, u) are the same edge).

What an adjacency matrix costs

Operation	Cost
Add an edge	O(1)
Check if edge (u, v) exists	O(1)
Look up a node’s neighbours	O(V)
Iterate every edge in the graph	O(V²)
Total memory	O(V²)

The win: edge existence is O(1). The loss: memory is O(V²) regardless of how few edges you have.

Predict first

Your algorithm constantly asks 'is there an edge directly between u and v?' Which representation answers that in O(1) — and what do you pay for it?

When to use which

The trade-off is sharp. Pick based on how dense your graph is:

Sparse graph (m ≈ V or 2V): use an adjacency list. Memory is O(V + E) ≈ O(V), and every algorithm is fast.
Dense graph (m ≈ V²): use an adjacency matrix. The O(1) edge check pays off, and you’re stuck with O(V²) memory anyway.

Concrete rules of thumb:

Use adjacency list when…	Use adjacency matrix when…
Most algorithms (BFS, DFS, Dijkstra, Bellman-Ford)	The graph is dense (`E ≈ V²`)
Memory-constrained or large `V` with sparse edges	You need O(1) “is (u, v) an edge?”
Iterating neighbours is the bottleneck	Algorithms over matrix algebra (Floyd-Warshall)
Default — over 90% of problems	Small graphs (V ≤ 1000) where memory isn’t an issue

Quick math: for V = 10⁴ vertices, an adjacency matrix uses 10⁴ × 10⁴ = 10⁸ cells — about 100 MB even with one byte each. Adjacency list with E = 10⁵ edges uses 10⁵ list entries. 1000× less memory. That’s why adjacency list is the default.

Bonus: edge list

Sometimes the simplest representation is the right one — just a list of edges. Used in some algorithms (Bellman-Ford, Kruskal’s MST, Union-Find based algorithms) that iterate over edges anyway.

[(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4)]

No fast neighbour lookup — but if your algorithm only ever loops over edges, this is the cleanest data structure.

Weighted graphs

For weighted graphs, just attach the weight to each entry:

We’ll use this exact pattern for Dijkstra’s algorithm on Day 21.

A common gotcha — node IDs

A subtle question: what is a node? In most LeetCode problems they’re integers 0..n-1, perfect for arrays. But in real problems they might be strings, custom objects, coordinates, or anything else.

Two common approaches:

Map IDs to integers first. Build a Map<NodeID, int> indexing each node from 0 to n-1, then use plain arrays for everything. Best performance, slight extra code.
Use a Map<NodeID, List<NodeID>> directly. Slower (hash lookups), more memory, but reads cleanly when node IDs are strings or tuples.

Pick (1) for tight competitive code, (2) for clarity. Most interview problems are fine with either.

Summary

Representation	Memory	Edge check	Iterate neighbours	Best for
Adjacency list	O(V + E)	O(deg)	O(deg)	Sparse graphs (default)
Adjacency matrix	O(V²)	O(1)	O(V)	Dense graphs
Edge list	O(E)	O(E)	—	Algorithms over edges (Kruskal)

Default choice for almost everything: adjacency list.

Quick check

A graph has V = 10,000 nodes but only E = 30,000 edges (sparse). Which representation, and why?

You can now store a graph the right way for the job. Next: graph types — the special shapes (DAGs, bipartite, trees, complete) whose structure unlocks faster, specialized algorithms.

Introduction Types of Graphs

Finished this page?