Mo' Memory, Mo' Problems: Stream-Native Machine Unlearning

Abstract

We introduce the Memory Pair framework: the first algorithm to learn and unlearn *continuously* on streaming data. It achieves logarithmic regret with deletions, constant memory via online L-BFGS, and $(\varepsilon,\delta)$-certified guarantees—enabling GDPR compliance without expensive retrains.

Published: 1/27/2025
Authors: Kennon Stewart

Cite this work

APA
Stewart, K. (2025). Mo' Memory, Mo' Problems: Stream-Native Machine Unlearning.

BibTeX
@inproceedings{memory_pairs_2025,
  title={Mo' Memory, Mo' Problems: Stream-Native Machine Unlearning},
  author={Kennon Stewart},
  year={2025}
}

  • ML models today must adapt on live data streams and support the right to be forgotten.
  • The Memory Pair framework couples a learner and unlearner into a single stream-native algorithm.
  • Achieves $\mathcal{O}(\ln T)$ regret with certified deletions, using constant-memory online L-BFGS.
  • Extends model lifespan before retraining, lowering cost and enabling GDPR-compliant unlearning.

Data Moves Quickly. Models Are Catching Up.

People have gotten used to ML models retraining on a giant frozen dataset. You collect a pile of data, you train the model, you ship it, you repeat this next quarter or next year.

That entire worldview is dying.

Modern ML is moving to streams. Phones, cars, watches, sensors, AR glasses, point-of-sale terminals. They emit new data constantly. And the model isn’t supposed to retrain from scratch every night. It is supposed to update itself continuously.

AI’s Biggest Threat Yet

So here’s the real stress fracture:

Continuous ML is colliding with privacy regulation.

GDPR includes a “right to be forgotten.” Deleting the raw data isn’t enough: the ML model itself must remove that data’s effect.

That sounds philosophical, but it’s not. It’s math. And today almost no commercial system can do this efficiently.

Most approaches say: if one user deletes their data → start over and retrain a fresh model. This is a fantasy at scale. It’s too slow. It’s too expensive. It’s too carbon-intensive. And in streaming settings it’s literally impossible, because there is no frozen master dataset anymore.

An algorithm with privacy built-in.

Our work introduces the Memory Pair, a new design where the system learns and unlearns as two sides of the same coin.

Every time the model learns from a new sample, it stores just enough “curvature memory” to later invert that update if that sample must be removed.

This is the crucial philosophical shift.

Learning and unlearning must be symmetric.

Instead of treating unlearning as a special fix-up step, we treat it as a native operation — a negative update that runs through the exact same mathematical machinery.

This symmetry gives three properties at once:

  1. accuracy improves over time (regret grows only like log T, which is minimal)
  2. the model footprint stays constant (via online L-BFGS)
  3. unlearning is mathematically certifiable (ε,δ-style guarantees)
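To make the symmetry concrete, here is a minimal Python sketch of the idea, not the paper's actual algorithm: the class name, the squared loss, the step-size schedule, and the Gaussian deletion noise are all illustrative placeholders. The point is only that insert and delete flow through one shared quasi-Newton (L-BFGS-style) update, with deletion as the sign-flipped twin of insertion.

```python
import numpy as np
from collections import deque

class MemoryPairSketch:
    """Illustrative memory pair: learn and unlearn share one update path.

    Not the paper's algorithm; loss, step sizes, and noise are placeholders.
    """

    def __init__(self, dim, memory=10, lam=0.1, noise_scale=0.01, seed=0):
        self.w = np.zeros(dim)
        self.pairs = deque(maxlen=memory)   # bounded (s, y) curvature memory
        self.lam = lam                      # strong-convexity parameter
        self.noise_scale = noise_scale      # stands in for (eps, delta) calibration
        self.t = 0
        self.rng = np.random.default_rng(seed)

    def _grad(self, x, y):
        # Gradient of 0.5*(w.x - y)^2 + 0.5*lam*||w||^2 at the current w.
        return (self.w @ x - y) * x + self.lam * self.w

    def _two_loop(self, g):
        # Classic L-BFGS two-loop recursion: approximates H^{-1} g
        # using only the stored constant-size curvature memory.
        q = g.copy()
        stack = []
        for s, y in reversed(self.pairs):
            rho = 1.0 / (y @ s)
            a = rho * (s @ q)
            stack.append((a, rho, s, y))
            q -= a * y
        if self.pairs:
            s, y = self.pairs[-1]
            q *= (s @ y) / (y @ y)              # initial Hessian scaling
        for a, rho, s, y in reversed(stack):
            b = rho * (y @ q)
            q += (a - b) * s
        return q

    def _step(self, x, y, sign):
        # Single shared update path: sign=+1 learns, sign=-1 unlearns.
        self.t += 1
        eta = 1.0 / (self.lam * self.t)         # standard strongly-convex schedule
        g_old = self._grad(x, y)
        w_old = self.w.copy()
        self.w -= sign * eta * self._two_loop(g_old)
        s, dy = self.w - w_old, self._grad(x, y) - g_old
        if s @ dy > 1e-12:                      # keep only curvature-positive pairs
            self.pairs.append((s, dy))

    def insert(self, x, y):
        self._step(x, y, sign=+1.0)

    def delete(self, x, y):
        # The "negative update", plus placeholder Gaussian noise standing in
        # for the certification mechanism.
        self._step(x, y, sign=-1.0)
        self.w += self.rng.normal(scale=self.noise_scale, size=self.w.shape)
```

A deletion here is just an insertion run in reverse through the same curvature memory, which is exactly the symmetry the framework formalizes.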

You get GDPR-grade privacy without throwing out your model and starting over.

You also get a model that can live longer before a full retrain is necessary, which in practice means dramatically lower cost.

Case Study: Apple’s Bet on Private AI

Look at Apple. When Europe tightened privacy law, Apple didn’t just slap a consent pop-up on Safari. They quietly redesigned the architecture of how models are trained. They invested in federated learning and differential privacy, and now the majority of learning happens on the phone itself.

Everybody else is going to have to follow that path.

The US will regulate privacy. It’s already happening piecemeal (California, Virginia). When enforcement becomes federal, companies running huge cloud ML stacks will have to unlearn individual users on demand.

If your ML stack still assumes batch retraining, this will break you.

Streaming is the only stable architecture in the long term. And streaming without symmetric unlearning is a dead end.

What we actually show

We prove that if you pair the learner and unlearner tightly, you can do:

  • accurate online learning on live streams
  • instant certified deletion
  • without blowing out memory or compute
  • without retraining from scratch

The deletion capacity (how many unlearn requests you can absorb) becomes a live odometer. If the odometer says “stop,” the model can halt updates and trigger a minimal retrain.

This creates a safety valve that makes the system certifiably compliant.
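As a rough sketch of what such a safety valve could look like in code (the class name, capacity rule, and error handling below are hypothetical, not the paper's odometer), the pattern is simply a budget that every certified deletion draws down:

```python
class DeletionOdometer:
    """Hypothetical deletion-capacity odometer (illustrative, not the paper's rule)."""

    def __init__(self, capacity):
        self.capacity = capacity    # e.g. derived from the (eps, delta) budget
        self.used = 0

    def can_delete(self):
        # True while certified deletions can still be absorbed.
        return self.used < self.capacity

    def record_delete(self):
        if not self.can_delete():
            # Safety valve: halt updates and trigger a minimal retrain instead.
            raise RuntimeError("Deletion capacity exhausted; retrain required.")
        self.used += 1
```

In a deployment, the capacity would be set from the certification parameters and the noise scale, and exceeding it would route the request to a (rare) retrain instead of silently weakening the guarantee.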

Pragmatic translation

This is a mechanism for future-proofing ML systems.

We are not promising magic. We are showing a path for making the systems we deploy next year still tenable five years from now.

The model becomes a living stream organism: learning, forgetting, and maintaining integrity in real-time.

Privacy is no longer an afterthought. It becomes a first-class constraint.

Where this goes next

There is a wide frontier here. Better capacity formulas. Better odometer designs. Extensions to federated systems and non-Euclidean models (graphs, manifolds, etc.).

But the core point is now clear:

Privacy and streaming are not enemies.

There is a mathematically coherent way to do ML that is adaptive, efficient, and compliant — simultaneously.

The correct future architecture is not batch → it is live, paired, symmetric, stream-native.

Our key idea: treat learning and unlearning as paired, symmetric operations. Insertions and deletions both run through the same lightweight quasi-Newton machinery. The result is fast, memory-efficient, and certifiable.


Stream-Native Learning

The Memory Pair is an ordered pair of algorithms: a learner $A$ that processes insertions and an unlearner $\bar{A}$ that processes certified deletions.

Why Pairing Matters

Because the learner and unlearner share the same state and update machinery, a deletion request can be served as a negative update rather than a full retrain; that shared machinery is what makes the certification, the constant memory footprint, and the extended model lifespan possible.


Contributions

  • First online algorithm coupling a learner $A$ and an unlearner $\bar{A}$ into a unified memory pair.
  • Tighter regret guarantees under strong convexity: $R_T = \mathcal{O}\!\left(\tfrac{G^2}{\lambda}\ln T\right)$.
  • Adaptive bounds on sample complexity and deletion capacity via an AdaGrad-style statistic $S_t$.
  • $(\varepsilon,\delta)$-certified unlearning built into a live odometer that halts when guarantees would break.


Theoretical Highlights

  • Static regret: Tightens to $O(\ln T)$ under strong convexity.
  • Dynamic regret: Adds a drift term $G P_T$, adapting linearly to stream nonstationarity.
  • Capacity bounds: Deletion capacity and sample complexity scale with $\sqrt{T}$ in the worst case, but adapt on smoother streams.
Theorem (sketch): Logarithmic regret with deletions

With $\lambda$-strong convexity and online L-BFGS updates, cumulative regret with $m$ certified deletions satisfies:

$$R_T(m) = O\!\left(\tfrac{G^2}{\lambda}\ln T \;+\; m\,G\,\sigma_{\text{step}}\right)$$
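One informal way to read this bound (our gloss, not the paper's formal proof): the first term is the standard regret rate for strongly convex online updates, and if each certified deletion perturbs the iterate by at most $\sigma_{\text{step}}$, the Lipschitz constant $G$ caps its extra regret cost, so $m$ deletions add at most $m\,G\,\sigma_{\text{step}}$:

$$R_T(m) \;\lesssim\; \underbrace{\frac{G^2}{\lambda}\ln T}_{\text{learning on the stream}} \;+\; \underbrace{\sum_{j=1}^{m} G\,\sigma_{\text{step}}}_{m\ \text{certified deletions}} \;=\; \frac{G^2}{\lambda}\ln T + m\,G\,\sigma_{\text{step}}.$$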


Experiments

We simulate drifting data streams with interleaved insert and delete events.

  • Cumulative regret matches the $\ln T$ theoretical curve and then stabilizes.
  • The learner adapts gracefully to stream drift while keeping static regret logarithmic.
  • Deletions maintain accuracy and certified indistinguishability from retraining.
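For readers who want the flavor of this protocol, here is a hypothetical simulation loop (our reconstruction, not the paper's benchmark) reusing the MemoryPairSketch and DeletionOdometer sketches from above: a slowly drifting linear target, a stream of inserts, and interleaved deletion requests.

```python
import numpy as np

# Hypothetical drifting-stream simulation with interleaved insert/delete events.
rng = np.random.default_rng(1)
dim, T, delete_every = 5, 2000, 50

model = MemoryPairSketch(dim)
odometer = DeletionOdometer(capacity=100)
w_true = rng.normal(size=dim)
history, cumulative_loss = [], 0.0

for t in range(T):
    w_true += 0.001 * rng.normal(size=dim)       # slow drift in the ground truth
    x = rng.normal(size=dim)
    y = w_true @ x + 0.1 * rng.normal()

    cumulative_loss += 0.5 * (model.w @ x - y) ** 2   # loss before the update
    model.insert(x, y)
    history.append((x, y))

    if t % delete_every == 0 and odometer.can_delete():
        # A user requests deletion of a previously inserted sample.
        xd, yd = history.pop(rng.integers(len(history)))
        model.delete(xd, yd)
        odometer.record_delete()

print(f"cumulative loss after {T} rounds: {cumulative_loss:.2f}")
```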


Conclusion

The Memory Pair shows that privacy and efficiency can coexist in online learning. By tightly coupling a learner and unlearner:

  • We achieve certified, instant unlearning.
  • We extend model lifespan before retraining.
  • We open paths toward federated, privacy-preserving systems.