Paths, Cycles and Connectivity

A graph is a set of dots (vertices) joined by lines (edges). On its own that is just a static picture. This page is about the one thing that brings a graph to life: moving along its edges — hopping from vertex to vertex, always stepping across an edge that is really there.

Almost every question you can ask of a network is secretly a movement question. Can this message reach that server? How many handshakes separate two strangers? Is there a tour of the city that crosses every bridge exactly once? Each is answered by tracing routes across edges. So we need precise words for the different kinds of route — because "a route" turns out to mean four subtly different things.

Four words for four kinds of route

Imagine walking through a network of friendships: each step takes you from one person to a friend of theirs. Depending on the rules you impose on that stroll, it gets a different name.

Every path is a trail, and every trail is a walk — the words get stricter as you go down the list. The length of any of these is simply the number of edges you step across (not the number of vertices). A path visiting five vertices, for instance, A\!-\!B\!-\!C\!-\!D\!-\!E, uses four edges, so it has length four.

Watch a path being traced

Below is a little friendship graph. Step through to trace the route A\!-\!B\!-\!E\!-\!F. Notice how each step lights up exactly one edge, and no vertex is ever used twice — that is what makes it a path. Count the highlighted edges at the end: three of them, so the path has length 3.

Compare that with the walk A\!-\!B\!-\!E\!-\!B\!-\!C. It is a perfectly legal walk (every step crosses a real edge), but it is not a path: the vertex B appears twice. It is not even a trail, because the edge B\!-\!E is used, then re-used on the way back. Same graph, same idea of "moving" — but the rules you obey decide which of the four names your route earns.

Closing the loop: a cycle

A cycle is what you get when a path bites its own tail. Trace A\!-\!B\!-\!E\!-\!F\!-\!A: it is a path all the way around, and the final step returns to the start. No vertex is repeated except the start–end vertex, which is exactly what "closed path" allows.

Here is a fact worth memorising, because it is the single most confused count in graph theory: a path on n vertices has n-1 edges, but a cycle on n vertices has n edges. The extra edge is the one that swings you back home. Our cycle above touches four vertices (A,B,E,F) and, sure enough, uses four edges.

Connected graphs and their components

Now the payoff. A graph is connected if there is a path between every pair of vertices — you can get from anywhere to anywhere by walking along edges. If some pair has no route between them, the graph is disconnected, and it falls apart into connected components: the separate islands, each connected inside itself but with no edges bridging to the others.

The graph above has two connected components. You can walk freely within the left triangle, and freely within the right cluster, but no walk — of any length, however winding — crosses the gap between them. A connected graph is exactly a graph with one component.

Once you can move, you can measure. The distance between two vertices is the length of the shortest path connecting them — the fewest edges you must cross to get from one to the other. A vertex is at distance 0 from itself; neighbours are at distance 1. And if two vertices lie in different components, no path exists at all, so their distance is taken to be \infty.

The famous claim of "six degrees of separation" says that any two people on the planet are joined by a chain of at most six acquaintances. In our language: build the giant graph whose vertices are all humans and whose edges are "knows personally", and the distance between almost any two vertices is surprisingly small — around six. Huge networks with a few long-range links have startlingly short paths; mathematicians call these small-world graphs.

Mathematicians play their own version. Your Erdős number is your distance, in the "co-authored a paper with" graph, to the wildly prolific Paul Erdős. Erdős himself is 0; his hundreds of direct collaborators are 1; their collaborators are 2; and almost every working mathematician sits within a distance of five or six. It is shortest-path distance, dressed up as a badge of honour.

The classic slip is treating walk, trail and path as synonyms. They are not — they are a ladder of increasing strictness. A walk may repeat anything. A trail bans repeated edges. A path bans repeated vertices. Keep the test in mind: repeated a street? Then it is at most a walk. Repeated a place (but no street)? A trail, not a path.

The other trap: connected does not mean complete. "Connected" only asks that some path links each pair of vertices — a single long chain of vertices is connected even though most pairs are not directly joined. "Complete" is the far stronger demand that every pair share a direct edge. Every complete graph is connected, but the reverse is wildly false.

A worked count

Suppose a message is relayed along the route A\!-\!C\!-\!F\!-\!C\!-\!D in some network. What kind of route is it, and how long?

But this walk does prove something useful: since a walk exists from A to D, they lie in the same component, so a path between them exists too (throw away the wasteful loop C\!-\!F\!-\!C and you are left with the path A\!-\!C\!-\!D of length 2). In fact this is a general truth: whenever a walk joins two vertices, a path joins them as well — just cut out every loop.