IQ - Measuring Graph Intelligence
The Problem of Relative Complexity
“Cities are too complex” is not a statement of fact. It’s a refusal to specify what kind of complexity we mean. And it’s every statistician’s excuse for ignoring urban informatics.
ML researchers are only slightly better, but still routinely train models across cities as if they were exchangeable samples from a single data-generating process. This assumption is almost never stated, but infects our analyses: in graph embeddings trained jointly on Paris and Detroit, in navigation heuristics evaluated on Manhattan and Istanbul, in retrieval systems that treat “a city” as a homogeneous object.
This is indefensible.
Urban road networks are not generated by a single stochastic process. Paris is the partial output of centralized intervention (Haussmann). Istanbul is an accretion process constrained by topography, empire, and path dependence. Lagos, Prague, Manhattan—each reflects a distinct historical prior. Treating them as IID graph samples is not simplification; it is model misspecification.
The obvious objection follows: of course cities differ. The non-obvious failure is that we lack a principled way to measure how they differ structurally, independent of geography, scale, or visual intuition.
So we ask a sharper question:
If we strip a city of geometry and retain only adjacency—only topology—can we quantify how surprising its local structure is relative to its own global baseline?
This is a hypothesis, not a metaphor. And hypotheses deserve estimators.
Cities as Messages, Not Maps
Let a city’s road network be an undirected graph G = (V, E), with intersections as nodes and road segments as edges.
An agent traversing this network does not experience the city “all at once.” It encounters induced subgraphs—local neighborhoods revealed incrementally. For any subset S \subset V, let G[S] denote the induced subgraph.
We treat each induced subgraph G[S] as a finite message describing local structure.
The question becomes informational:
How many bits are required to describe this subgraph, and how does that compare to what we would expect if the city were structurally uniform?
This reframes network information as a coding problem. And this is something that we can test.
The Null: Erdős–Rényi as a (Simple) Universal Baseline
We need a reference model: something that defines what counts as “random” under our null hypothesis. Put simply: if we want to tell whether a fuzzy image depicts a cat, we need to compare it against images we know to be cats. Given a null model that says a cat has certain features (ears, fur, usually four legs), how well does the fuzzy picture align with those expectations?
For a subgraph with n nodes and m edges, we assume an Erdős–Rényi model G(n, p), where each of the N = \binom{n}{2} possible edges occurs independently with probability p.
Under this model, the probability of observing the subgraph’s m edges among the N possible pairs is P(G[S] \mid p) = p^{m}(1 - p)^{N - m}.
The corresponding Shannon code length (negative log-likelihood) is L(G[S] \mid p) = -m \log_2 p - (N - m) \log_2(1 - p) bits.
This is not an aesthetic choice. ER is the maximum entropy model subject to a single constraint (edge density). Any deviation from it must be explained by structure.
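As a concrete anchor, here is a minimal sketch of that code length in Python. The function name er_code_length is ours, introduced for illustration; it simply evaluates the negative log-likelihood above for a subgraph with a given node count, edge count, and reference density.

```python
import math

def er_code_length(n_nodes: int, n_edges: int, p: float) -> float:
    """Shannon code length, in bits, of a simple undirected graph with
    n_nodes vertices and n_edges edges under an Erdős–Rényi null G(n, p)."""
    n_pairs = n_nodes * (n_nodes - 1) // 2        # candidate edges N = C(n, 2)
    if p <= 0.0 or p >= 1.0:
        # Degenerate density: the observed edge count is either certain or impossible.
        return 0.0 if n_edges in (0, n_pairs) else float("inf")
    absent = n_pairs - n_edges
    return -(n_edges * math.log2(p) + absent * math.log2(1.0 - p))
```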
Absolute vs. Relative Description Length
A dense subgraph is not necessarily complex. Density is not structure.
So we compute two descriptions:
- Local model: fit \hat{p}_{local} = m / N to the subgraph itself.
- Global model: use the city-wide density \hat{p}_{global} = |E| / \binom{|V|}{2}.
Naively, the local model always fits better—it is optimized for that subgraph. MDL demands that we pay for this flexibility.
Using the standard stochastic complexity / BIC penalty (one parameter, sample size N = \binom{n}{2}), the cost of the local model becomes L(G[S] \mid \hat{p}_{local}) + \tfrac{1}{2}\log_2 N.
The penalized MDL divergence is then: D(S) = L(G[S] \mid \hat{p}_{global}) - \left[L(G[S] \mid \hat{p}_{local}) + \tfrac{1}{2}\log_2 N\right].
Interpretation:
Is the local structure informative enough to justify building a special model for it?
Asymptotically, D(S) \approx N \cdot \mathrm{KL}(\hat{p}_{local} \,\|\, \hat{p}_{global}), up to the finite-sample correction from the penalty term: the unpenalized code-length gap is the divergence between the local and global edge distributions, scaled by the number of candidate edges.
This is structural surprise, not geometry, not aesthetics.
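A hedged sketch of that divergence, reusing the hypothetical er_code_length helper from the previous snippet; the sign convention follows the definition above, so positive values flag subgraphs surprising enough to deserve their own model.

```python
import math

def mdl_divergence(n_nodes: int, n_edges: int, p_global: float) -> float:
    """Penalized MDL divergence D(S) of a subgraph against the city-wide density p_global."""
    n_pairs = n_nodes * (n_nodes - 1) // 2
    if n_pairs == 0:
        return 0.0                                 # a single node carries no edge information
    p_local = n_edges / n_pairs                    # MLE of the local edge density
    bits_global = er_code_length(n_nodes, n_edges, p_global)
    bits_local = er_code_length(n_nodes, n_edges, p_local)
    penalty = 0.5 * math.log2(n_pairs)             # one free parameter, N = C(n, 2) samples
    return bits_global - (bits_local + penalty)
```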
Traversal, Not Snapshots: The IQ Profile
Complexity is scale-dependent. A cul-de-sac looks simple until it opens into a maze.
For a root node v, define the radius-r ego graph B_r(v) = G[\{u \in V : d(u, v) \le r\}], where d is the shortest-path (hop) distance.
We define the Local Complexity Profile (IQ) as IQ(r) = \mathbb{E}_v[D(B_r(v))], where the expectation is taken over randomly sampled seed nodes v.
As r grows to cover the entire graph, this profile must converge to zero: the local view becomes the city itself.
The rate of convergence is the signal.
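A minimal estimation sketch, assuming networkx and the mdl_divergence helper above; iq_profile, the seed count, and the radii are our own illustrative choices, not a fixed API.

```python
import random
import networkx as nx

def iq_profile(G: nx.Graph, radii, n_seeds: int = 200, seed: int = 0) -> dict:
    """Estimate IQ(r) = E_v[D(B_r(v))] by averaging the penalized MDL divergence
    of radius-r ego graphs over randomly sampled root nodes."""
    rng = random.Random(seed)
    p_global = nx.density(G)                       # city-wide edge density
    roots = rng.sample(list(G.nodes), min(n_seeds, G.number_of_nodes()))
    profile = {}
    for r in radii:
        scores = []
        for v in roots:
            ego = nx.ego_graph(G, v, radius=r)     # B_r(v): induced subgraph within r hops of v
            scores.append(mdl_divergence(ego.number_of_nodes(), ego.number_of_edges(), p_global))
        profile[r] = sum(scores) / len(scores)
    return profile
```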
How This Shows Up on Actual Maps
We validate the estimator against synthetic families:
- Grids: slow convergence—loops must be “learned.”
- Scale-free networks: rapid convergence—hubs expose global structure quickly.
- Erdős–Rényi graphs: no convergence to zero—residual randomness persists.
These behaviors are not cosmetic. They correspond exactly to the traversal experience of an agent with limited horizon. The Erdős–Rényi test was our strongest validation yet: by holding the amount of randomness constant, we intentionally create noise that our measure cannot compress. This shows up in the MDL curves as a profile that hovers above zero and never quite reaches it.
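The same three families can be reproduced with stock networkx generators; the sizes and densities below are illustrative, not the settings used in our experiments.

```python
import networkx as nx

families = {
    "grid": nx.grid_2d_graph(30, 30),                          # planar grid: loops are learned slowly
    "scale-free": nx.barabasi_albert_graph(900, m=2),          # hubs expose global structure quickly
    "erdos-renyi": nx.gnp_random_graph(900, p=0.01, seed=42),  # residual randomness persists
}

for name, g in families.items():
    print(name, iq_profile(g, radii=range(1, 6), n_seeds=100))
```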
Even applied to real cities, the ordering is stable.
London’s IQ dominates Detroit’s and San Francisco’s at nearly all radii. Manhattan collapses quickly. Lagos exhibits persistent divergence. These results survive node subsampling and resampling (à la Lunde), seed variation, and changes in traversal depth.
Crucially, the measure ignores geometry entirely.
Downtown Detroit looks visually chaotic. Topologically, it is compressible. London looks medieval—and remains structurally surprising at every scale. This is not a bug. It’s the point of the measure.
What the measure captures is epistemic effort: the number of bits an agent must expend to reconcile what it sees locally with what it expects globally.
For humans, this manifests as disorientation. For algorithms—navigation, retrieval, Graph-of-Thought, RAG—it manifests as cost.
Complexity, in this sense, is not subjective. But our intuitions about it often are.
Why This Matters
Most ML systems on cities quietly assume structural stationarity. Our results demonstrate—quantitatively—that this assumption fails within cities, not just between them.
One critique worth surfacing explicitly: ER is a blunt null. Degree-corrected or configuration-model baselines would isolate higher-order motifs more cleanly. We’re working on expanding the metric for other reference models to describe complexity at a finer grain.
But as a first-principles estimator of relative structural surprise under traversal, the construction is sound, falsifiable, and—most importantly—computable.
We object to the characterization of cities as “too complex.” They are heterogeneously informative.
And information, at least, we know how to measure.