Math Behind IQ

Nov 6, 2025

By Kennon Stewart

Entropy as a Supermartingale, Stopping Times, and Query Complexity

Setting and Notation

We answer a query $q$ about a latent correct answer $Y \in \mathcal Y$. At time $t$, the agent has an evidence graph

$$G_t = \Phi(r_{1:t}, q),$$

and its uncertainty is

$$H_t = H(Y \mid q, G_t).$$

Let the filtration (the information the agent has seen so far) be

$$\mathcal F_t = \sigma\big(q, a_{1:t}, r_{1:t}\big).$$

Because $G_t$ is a deterministic function of $(q, r_{1:t})$, both $G_t$ and $H_t$ are $\mathcal F_t$-measurable. (For background on stopping times and measurability of stopped/optional objects, see the standard definitions and results in the general theory of stochastic processes.)
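To make these objects concrete, here is a minimal sketch (my construction, not part of the post's formalism) in which the evidence graph is collapsed to the multiset of observed results and $H_t$ is the Shannon entropy of the posterior over a finite answer set $\mathcal Y$. The function names and the toy result model are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits, with the convention 0 * log 0 = 0."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def posterior(prior, lik, results):
    """Posterior P(Y | q, G_t), treating G_t as the list of results r_1..r_t.

    prior:   shape (|Y|,), the prior P(Y | q)
    lik:     shape (|Y|, n_results), lik[y, r] = P(r | Y = y)
    results: indices of the observed results
    """
    post = prior.copy()
    for r in results:
        post = post * lik[:, r]
    return post / post.sum()

# Toy query: 4 candidate answers, uniform prior, so H_0 = 2 bits.
prior = np.full(4, 0.25)
lik = np.array([[0.9, 0.1],
                [0.8, 0.2],
                [0.1, 0.9],
                [0.2, 0.8]])               # hypothetical result model P(r | Y)
print(entropy(prior))                       # H_0 = 2.0
print(entropy(posterior(prior, lik, [0])))  # H_1 ~ 1.6 after observing r = 0
```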

Assumptions

  • A (no systematic misinformation): Conditional on $\mathcal F_t$, the expected entropy after incorporating the next result does not increase: $\mathbb{E}\big[H(Y \mid q, G_t \cup \{r_{t+1}\}) \,\big|\, \mathcal F_t\big] \le H(Y \mid q, G_t)$.
  • B (causality): The policy chooses $a_t$ using only current information, i.e., $a_t$ is $\mathcal F_t$-measurable.

Theorem 1 — Entropy is a Supermartingale

Statement. Assume A–B and $0 \le H_t < \infty$. Then $(H_t)_{t \ge 0}$ is a supermartingale with respect to $(\mathcal F_t)$:

$$\mathbb{E}[H_{t+1} \mid \mathcal F_t] \le H_t.$$

Proof of Theorem 1

Because $G_t$ is $\mathcal F_t$-measurable, so is $H_t = H(Y \mid q, G_t)$. By Assumption A,

$$\mathbb{E}\big[H(Y \mid q, G_t \cup \{r_{t+1}\}) \mid \mathcal F_t\big] \le H(Y \mid q, G_t) = H_t,$$

and by definition $H_{t+1} = H(Y \mid q, G_{t+1}) = H(Y \mid q, G_t \cup \{r_{t+1}\})$. Therefore $\mathbb{E}[H_{t+1} \mid \mathcal F_t] \le H_t$. ■
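Assumption A is exactly the statement that Bayesian conditioning on a well-specified result channel cannot raise expected entropy ("information never hurts"). A quick numeric check of the supermartingale step, under the assumption that results are drawn from the modeled likelihood; all names here are illustrative:

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

rng = np.random.default_rng(0)
n_answers, n_results = 5, 3
cur = rng.dirichlet(np.ones(n_answers))                   # posterior P(Y | q, G_t)
lik = rng.dirichlet(np.ones(n_results), size=n_answers)   # channel P(r_{t+1} | Y)

# E[H_{t+1} | F_t]: average the Bayes-updated entropy over P(r_{t+1} | F_t).
marginal = cur @ lik                                      # P(r_{t+1} | F_t)
exp_H_next = sum(
    marginal[r] * entropy(cur * lik[:, r] / marginal[r])
    for r in range(n_results)
)
print(exp_H_next, entropy(cur))   # the first is always <= the second
```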

Stopping at Sufficient Confidence

Fix $\varepsilon > 0$ and define the hitting time

$$\tau_\varepsilon := \inf\{t \ge 0 : H_t \le \varepsilon\}.$$

Because $H_t$ is $\mathcal F_t$-measurable, the event $\{H_t \le \varepsilon\} \in \mathcal F_t$; hence $\tau_\varepsilon$ is a stopping time. (Stopped/optional objects at stopping times are measurable in the expected $\sigma$-fields.)
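In code, the stopping rule is just a while-loop on the posterior entropy. A hedged sketch (the helper names and the result model are mine; the post does not specify an implementation):

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def run_until_confident(prior, lik, draw_result, eps, t_max=10_000):
    """Iterate queries; stop at tau_eps, the first t with H_t <= eps."""
    post, t = prior.copy(), 0
    while entropy(post) > eps and t < t_max:   # t_max guards non-termination
        r = draw_result()                      # environment emits r_{t+1}
        post = post * lik[:, r]
        post = post / post.sum()
        t += 1
    return t, post

rng = np.random.default_rng(1)
truth = 2                                      # latent correct answer Y
lik = np.full((4, 4), 0.15) + np.eye(4) * 0.40   # P(r | Y): 0.55 mass on r = Y
tau, post = run_until_confident(
    np.full(4, 0.25), lik, lambda: rng.choice(4, p=lik[truth]), eps=0.1)
print(tau, entropy(post))                      # tau_eps and H at stopping
```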

Under standard integrability/regularity conditions (boundedness or bounded increments suffice in our discrete setup), optional stopping for supermartingales yields

$$\mathbb{E}[H_{\tau_\varepsilon}] \le \mathbb{E}[H_0], \qquad \text{and since } H_{\tau_\varepsilon} \le \varepsilon, \quad \mathbb{E}[H_0] - \varepsilon \le \mathbb{E}[H_0] - \mathbb{E}[H_{\tau_\varepsilon}].$$

(For the discrete-time optional sampling inequality, see Doob’s theorem in Pascucci’s Chapter 6; it is stated for sub/martingales and used analogously for supermartingales with signs adjusted.)

From Information Gain to Cost

Define the conditional expected entropy drop at step tt:

$$\Delta_t := H_{t-1} - \mathbb{E}[H_t \mid \mathcal F_{t-1}] \ge 0 \quad \text{(by Theorem 1)}.$$

Summing up to $\tau_\varepsilon$ and taking expectations,

$$\mathbb{E}\big[H_0 - H_{\tau_\varepsilon}\big] = \mathbb{E}\left[\sum_{t=1}^{\tau_\varepsilon} \Delta_t\right] \ge \mathbb{E}[H_0] - \varepsilon. \tag{$*$}$$

Let each action have cost $c(a_t) \ge 0$. Suppose the per-step “bits-per-cost” ratio is bounded by a channel constant $\Gamma$ in expectation:

$$\mathbb{E}[\Delta_t \mid \mathcal F_{t-1}] \le \Gamma\, \mathbb{E}[c(a_t) \mid \mathcal F_{t-1}].$$

Then

$$\mathbb{E}\left[\sum_{t=1}^{\tau_\varepsilon} \Delta_t\right] \le \Gamma\, \mathbb{E}\left[\sum_{t=1}^{\tau_\varepsilon} c(a_t)\right].$$

Combining with $(*)$ gives the effort lower bound

$$\boxed{\; \mathbb{E}\left[\sum_{t=1}^{\tau_\varepsilon} c(a_t)\right] \ge \frac{\mathbb{E}[H_0] - \varepsilon}{\Gamma} \;} \tag{$\dagger$}$$
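For a feel of the numbers (hypothetical values, not from the post): if the initial uncertainty is 20 bits, the tolerance is 0.1 bits, and no action yields more than 2 bits per unit cost in expectation, then no admissible policy can spend less than 9.95 in expectation:

```python
H0, eps, Gamma = 20.0, 0.1, 2.0    # bits, bits, bits per unit cost (made up)
print((H0 - eps) / Gamma)          # 9.95: floor on expected total cost
```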

Proof notes (why optional stopping is legitimate)

  • We’re in discrete time, $(H_t)$ is bounded below, and $\tau_\varepsilon$ is (by design) a bounded/finite stopping time under any admissible policy that either hits $\varepsilon$ or halts with abstention. Under these mild conditions, the optional sampling/optional stopping statements we used are standard.

Query Complexity

Definition (Query complexity at tolerance $\varepsilon$).

$$C(q;\varepsilon) := \inf_{\pi\ \text{admissible}} \mathbb{E}_\pi\left[\sum_{t=1}^{\tau_\varepsilon} c(a_t)\right].$$

Taking the infimum over policies in $(\dagger)$ yields a first-principles lower bound on $C(q;\varepsilon)$.
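Since the infimum runs over all admissible policies, any empirical estimate from a finite policy class only upper-bounds $C(q;\varepsilon)$, while $(\dagger)$ lower-bounds it. A sketch that brackets $C(q;\varepsilon)$ this way for a toy query; the two channels, their costs, and all names are invented for illustration:

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mean_cost(lik, unit_cost, eps, prior, rng, runs=200):
    """Monte Carlo estimate of E_pi[sum c(a_t)] for one single-channel policy."""
    total = 0.0
    for _ in range(runs):
        truth = rng.choice(len(prior), p=prior)
        post = prior.copy()
        while entropy(post) > eps:
            r = rng.choice(lik.shape[1], p=lik[truth])
            post = post * lik[:, r]
            post = post / post.sum()
            total += unit_cost
    return total / runs

def peaked(p_hit):                    # P(r | Y): p_hit on r = Y, rest uniform
    off = (1 - p_hit) / 3
    return np.full((4, 4), off) + np.eye(4) * (p_hit - off)

rng = np.random.default_rng(2)
prior = np.full(4, 0.25)
# Two candidate policies: a cheap noisy channel vs. a pricey sharp one.
upper = min(mean_cost(peaked(0.55), 1.0, 0.1, prior, rng),
            mean_cost(peaked(0.90), 3.0, 0.1, prior, rng))
print(upper)   # empirical upper bound on C(q; 0.1); (dagger) gives the floor
```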

Sigma-Algebras & Filtrations (quick refresher)

  • A sigma-algebra lists the events you can measure; a filtration $(\mathcal F_t)$ is a growing family of such sigma-algebras capturing information over time. (Formal definitions and the role of optional/predictable $\sigma$-fields are standard; see, e.g., the development of optional processes and their measurability at stopping times.)
  • Stopping times are random times whose occurrence is knowable from the past: $\{\tau \le t\} \in \mathcal F_t$. (Definition and basic properties; also how stopped processes inherit measurability.)
  • Optional sampling/Doob: for (super/sub)martingales, conditional expectations at stopping times behave as expected; see the discrete statement used repeatedly in practice.

Proof of “$\tau_\varepsilon$ is a stopping time”

Because $H_t$ is $\mathcal F_t$-measurable, $\{H_t \le \varepsilon\} \in \mathcal F_t$. The event $\{\tau_\varepsilon \le t\}$ equals $\bigcup_{s \le t} \{H_s \le \varepsilon\} \in \mathcal F_t$, so $\tau_\varepsilon$ is a stopping time (and standard section/projection results ensure the usual measurability behaviors at $\tau$).


Takeaways

  • Supermartingale spine: $(H_t)$ drops in conditional expectation under non-deceptive retrieval.
  • Stopping-time policy: “Stop when $H_t \le \varepsilon$” is well-posed and analyzable.
  • Effort lower bound: Expected cost $\ge (\mathbb{E}[H_0] - \varepsilon)/\Gamma$.
  • Composable with graphs: Combine the bits-per-cost bound with graph-theoretic lower bounds (e.g., Steiner-style) to capture structural hardness as well as informational hardness.


The Optional Sampling Theorem we rely on

(Discrete) Optional Sampling Theorem. If $M$ is a (sub/super)martingale and $\sigma \le \tau$ are stopping times satisfying standard integrability/boundedness conditions, then

$$\mathbb{E}[M_\tau \mid \mathcal F_\sigma]\ \begin{cases} = M_\sigma & \text{(martingale)} \\ \ge M_\sigma & \text{(submartingale)} \\ \le M_\sigma & \text{(supermartingale)}. \end{cases}$$

See Pascucci, Ch. 6 (discrete case); analogous statements reappear in continuous time under the usual conditions.
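A quick Monte Carlo illustration of the supermartingale case, on a toy process rather than the entropy process itself (the drift, noise level, and threshold are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, T = 100_000, 60

# Toy supermartingale: M_0 = 10, i.i.d. increments with mean -0.2.
M = np.full(n_paths, 10.0)
M_tau = np.full(n_paths, np.nan)             # value at tau = min(hitting time, T)
for t in range(T):
    M += rng.normal(-0.2, 0.5, n_paths)
    hit = (M <= 1.0) & np.isnan(M_tau)       # tau: first time M drops to <= 1
    M_tau[hit] = M[hit]
M_tau[np.isnan(M_tau)] = M[np.isnan(M_tau)]  # cap tau at T (bounded stopping)

print(M_tau.mean())                          # <= E[M_0] = 10, as Doob predicts
```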


