Deep Learning · Graphs

Graphs from Scratch: The Foundation Every GNN Stands On

Before message passing, before adjacency matrices — there are nodes, edges, and a handful of properties that govern everything.

June 202610 min readGraphs, Fundamentals, GNN

Most GNN tutorials start in the middle. They hand you an adjacency matrix or a message-passing equation and expect you to already know what a graph is — not the bar-chart kind, the mathematical kind.

This post goes back to the beginning. Not because it's simple, but because the concepts here — degree, neighborhood, paths, connectivity — are the exact vocabulary that every GNN paper uses without defining. Get these right and everything downstream clicks.

A graph is just things and connections. But that simple structure is expressive enough to model molecules, social networks, the internet, and the geometry of spacetime.

What is a graph?

A graph G = (V, E) is two sets:

V — a set of vertices (or nodes): the objects
E — a set of edges: pairs of vertices that are connected

That's the entire definition. No coordinates, no layout, no colors. A graph is pure relationship — who is connected to whom.

Graph definition

G = (V, E), where E ⊆ V × V

The power of this abstraction is its universality. A water molecule is a graph (atoms = nodes, bonds = edges). Your friend network is a graph. A sentence parsed into a dependency tree is a graph. A road map is a graph. The same mathematical machinery works on all of them.

See it, don't just read it

The explorer below lets you switch between four classic graph shapes. Hover any node to see its neighborhood — the set of nodes directly connected to it — and its degree — the count of those connections.

Graph Explorer

Pick a shape, then hover any node to reveal its neighborhood

Graph stats

Nodes

Edges

Hover a node to see its degree and neighbors.

A few things to notice:

Triangle: every node has degree 2, every node is a neighbor of every other node. This is a complete graph on 3 vertices (K₃).
Star: the center node (A) has degree 5; every leaf has degree 1. One node holds all the connectivity. In GNN terms, A would dominate the message flow.
Cycle: every node has exactly degree 2. It's regular — all nodes are structurally identical. A GNN would struggle to distinguish them without positional encodings.
Tree: no cycles. There's exactly one path between any two nodes. Remove any edge and the graph splits in two.

Degree: the first property that matters

The degree of a node is the number of edges touching it. It sounds trivial, but degree is the single most informative local property of a node — and it shows up everywhere in GNN design.

Degree

deg(v) = |N(v)| = number of neighbors of v

Why does degree matter for GNNs? Because the aggregation step in message passing collects information from N(v) — the neighborhood. A node with degree 1 hears from one source. A node with degree 100 hears from a hundred. Without normalization, high-degree nodes overwhelm low-degree ones.

That's exactly why GCN uses the normalized adjacency D⁻½AD⁻½ — the degree matrix D scales messages so that a node with many connections doesn't dominate just because it's popular.

The heatmap below colors each node by its degree. Switch between graph shapes and watch how the degree distribution changes — a star has extreme variance, a cycle has none.

Degree Heatmap

Warmer = more connections. Spot the hub instantly.

Degree distribution

low → high degree

The handshaking lemma

Here's a small theorem that feels obvious once you see it, but shows up constantly in graph theory proofs:

Handshaking lemma

Σ deg(v) = 2|E|

Every edge has two endpoints. So every edge adds exactly 2 to the total degree count. It doesn't matter how the graph is wired — the sum of all degrees is always twice the number of edges.

This means, among other things, that the number of nodes with odd degree is always even. You can't have a graph where exactly one node has an odd number of connections.

Build a graph below and watch the lemma hold in real time:

Graph Builder

Build your own graph and watch the properties change in real time

Live properties

|V| = 3

Nodes

|E| = 1

Edges

Sum deg

Handshaking lemma

Sum of degrees = 2 x |E| = 2 x 1 = 2

Every edge contributes exactly 2 to the total degree count — one for each endpoint. Add edges and watch this hold.

Adeg = 1

Bdeg = 1

Cdeg = 0

Try these experiments:

Add a node with no edges. It contributes degree 0 — the sum stays the same.
Add one edge. The sum jumps by exactly 2.
Try to make the sum odd. You can't. It's always 2|E|.
Build a star. The center gets degree n-1; each leaf gets degree 1. Sum = 2(n-1) = 2|E|. It checks out.

Paths and distance

A path between two nodes is a sequence of edges connecting them. The shortest path (also called the geodesic) is the path with the fewest edges — and its length defines the distance between the two nodes.

Distance

d(u, v) = length of shortest path from u to v

This concept maps directly to GNN depth. After one round of message passing, a node has information from its 1-hop neighbors. After two rounds, 2-hop neighbors. After k rounds, everything within distance k.

The tool below finds shortest paths using breadth-first search (BFS) — the same algorithm that GNNs implicitly follow when stacking layers. Click two nodes and watch the path light up:

Path Finder

Click two nodes to find the shortest path between them (BFS)

Shortest path

Start

—

End

—

Click a node to set the start point.

A few things worth exploring:

In the tree, there's exactly one path between any pair. Trees have no redundancy — lose one edge and you lose reachability.
In the cycle, there are two paths between any pair (going each direction around the ring). The shortest is the one that takes fewer hops.
In the star, the maximum distance between any two non-center nodes is 2 — they all route through A. This makes A a bottleneck in GNN terms: all information between leaves has to funnel through one vector.

Types of graphs

Not all graphs are created equal. Here's the taxonomy that matters for GNNs:

Undirected

Edges go both ways. Friendships, chemical bonds, spatial proximity. Most GNN papers assume this.

Directed

Edges have direction. Citations, web links, dependency graphs. Message passing follows the arrow.

Weighted

Edges carry a number — distance, strength, cost. The GNN can use weights as attention scores or scaling factors.

Attributed

Nodes (and sometimes edges) carry feature vectors — atom type, word embedding, pixel patch. This is the X in H = σ(ÂXW).

Real-world graphs are usually several of these at once. A molecular graph is undirected, attributed (atom type, charge), and sometimes weighted (bond order). A citation network is directed and attributed. The abstractions compose.

Connectivity

A graph is connected if there's a path between every pair of nodes. If not, it breaks into connected components — separate subgraphs that have no edges between them.

This matters for GNNs because message passing can't send information across disconnected components. If your graph has two separate clusters, no amount of layers will let one cluster influence the other.

In practice, real graphs are often mostly connected but have a few isolated nodes or small disconnected subgraphs. Graph datasets handle this by either processing components independently or adding virtual edges.

Why this matters for GNNs

Every concept in this post maps to a GNN design decision:

Degree → why GCN normalizes by D⁻½AD⁻½ instead of raw A
Neighborhood → the receptive field of one message-passing layer
Distance → the number of layers needed to propagate information between two nodes
Connectivity → what a GNN can and cannot learn about — information can't cross gaps
Graph type → whether message passing is symmetric or directional, whether edges carry weights

In the adjacency matrix post, we saw how these relationships get encoded into a matrix. In the message passing post, we saw how a GNN reads that matrix and learns from it. This post is the layer underneath both — the raw graph structure that everything builds on.

Graphs are deceptively simple: just nodes and edges. But from that simplicity comes the power to model nearly any relational structure in the world — and from these basic properties comes the design language of every graph neural network.

Share this post

Back to blog

Deep Learning · Graphs

Graphs from Scratch: The Foundation Every GNN Stands On

Before message passing, before adjacency matrices — there are nodes, edges, and a handful of properties that govern everything.

June 202610 min readGraphs, Fundamentals, GNN

A graph is just things and connections. But that simple structure is expressive enough to model molecules, social networks, the internet, and the geometry of spacetime.

What is a graph?

A graph G = (V, E) is two sets:

V — a set of vertices (or nodes): the objects
E — a set of edges: pairs of vertices that are connected

That's the entire definition. No coordinates, no layout, no colors. A graph is pure relationship — who is connected to whom.

Graph definition

G = (V, E), where E ⊆ V × V

See it, don't just read it

Graph Explorer

Pick a shape, then hover any node to reveal its neighborhood

Graph stats

Nodes

Edges

Hover a node to see its degree and neighbors.

A few things to notice:

Triangle: every node has degree 2, every node is a neighbor of every other node. This is a complete graph on 3 vertices (K₃).
Star: the center node (A) has degree 5; every leaf has degree 1. One node holds all the connectivity. In GNN terms, A would dominate the message flow.
Cycle: every node has exactly degree 2. It's regular — all nodes are structurally identical. A GNN would struggle to distinguish them without positional encodings.
Tree: no cycles. There's exactly one path between any two nodes. Remove any edge and the graph splits in two.

Degree: the first property that matters

The degree of a node is the number of edges touching it. It sounds trivial, but degree is the single most informative local property of a node — and it shows up everywhere in GNN design.

Degree

deg(v) = |N(v)| = number of neighbors of v

That's exactly why GCN uses the normalized adjacency D⁻½AD⁻½ — the degree matrix D scales messages so that a node with many connections doesn't dominate just because it's popular.

The heatmap below colors each node by its degree. Switch between graph shapes and watch how the degree distribution changes — a star has extreme variance, a cycle has none.

Degree Heatmap

Warmer = more connections. Spot the hub instantly.

Degree distribution

low → high degree

The handshaking lemma

Here's a small theorem that feels obvious once you see it, but shows up constantly in graph theory proofs:

Handshaking lemma

Σ deg(v) = 2|E|

Every edge has two endpoints. So every edge adds exactly 2 to the total degree count. It doesn't matter how the graph is wired — the sum of all degrees is always twice the number of edges.

This means, among other things, that the number of nodes with odd degree is always even. You can't have a graph where exactly one node has an odd number of connections.

Build a graph below and watch the lemma hold in real time:

Graph Builder

Build your own graph and watch the properties change in real time

Live properties

|V| = 3

Nodes

|E| = 1

Edges

Sum deg

Handshaking lemma

Sum of degrees = 2 x |E| = 2 x 1 = 2

Every edge contributes exactly 2 to the total degree count — one for each endpoint. Add edges and watch this hold.

Adeg = 1

Bdeg = 1

Cdeg = 0

Try these experiments:

Add a node with no edges. It contributes degree 0 — the sum stays the same.
Add one edge. The sum jumps by exactly 2.
Try to make the sum odd. You can't. It's always 2|E|.
Build a star. The center gets degree n-1; each leaf gets degree 1. Sum = 2(n-1) = 2|E|. It checks out.

Paths and distance

Distance

d(u, v) = length of shortest path from u to v

The tool below finds shortest paths using breadth-first search (BFS) — the same algorithm that GNNs implicitly follow when stacking layers. Click two nodes and watch the path light up:

Path Finder

Click two nodes to find the shortest path between them (BFS)

Shortest path

Start

—

End

—

Click a node to set the start point.

A few things worth exploring:

In the tree, there's exactly one path between any pair. Trees have no redundancy — lose one edge and you lose reachability.
In the cycle, there are two paths between any pair (going each direction around the ring). The shortest is the one that takes fewer hops.
In the star, the maximum distance between any two non-center nodes is 2 — they all route through A. This makes A a bottleneck in GNN terms: all information between leaves has to funnel through one vector.

Types of graphs

Not all graphs are created equal. Here's the taxonomy that matters for GNNs:

Undirected

Edges go both ways. Friendships, chemical bonds, spatial proximity. Most GNN papers assume this.

Directed

Edges have direction. Citations, web links, dependency graphs. Message passing follows the arrow.

Weighted

Edges carry a number — distance, strength, cost. The GNN can use weights as attention scores or scaling factors.

Attributed

Nodes (and sometimes edges) carry feature vectors — atom type, word embedding, pixel patch. This is the X in H = σ(ÂXW).

Connectivity

A graph is connected if there's a path between every pair of nodes. If not, it breaks into connected components — separate subgraphs that have no edges between them.

Why this matters for GNNs

Every concept in this post maps to a GNN design decision:

Degree → why GCN normalizes by D⁻½AD⁻½ instead of raw A
Neighborhood → the receptive field of one message-passing layer
Distance → the number of layers needed to propagate information between two nodes
Connectivity → what a GNN can and cannot learn about — information can't cross gaps
Graph type → whether message passing is symmetric or directional, whether edges carry weights