Robust Streaming Against Low-Memory Adversaries

Source: arXiv:2511.01769 · Published 2025-11-03 · By Omri Ben-Eliezer, Krzysztof Onak, Sandeep Silwal

TL;DR

This paper studies robust streaming algorithms under a novel adversarial model where the adversary is adaptive but memory constrained—either memoryless (no persistent memory) or low-memory (a small bounded amount of persistent memory). Classic robust streaming adversaries memorize the entire interaction history, resulting in exponential blowups in space complexity for problems like turnstile Fp-moment estimation relative to oblivious streaming. The authors investigate if limiting the adversary’s memory power reduces this gap. They establish fundamental properties of these bounded-memory adversaries, showing that even memoryless or low-memory adversaries can generate streams with high flip number (many significant changes) and high density (many nonzero coordinates), making many existing robust streaming algorithms ineffective. Against these bounded-memory adversaries, the authors design new robust streaming algorithms for a broad class of order-invariant problems that achieve polylogarithmic space complexity, closing the gap significantly compared to prior work. Their results include both deterministic algorithms for deterministic adversaries and probabilistic algorithms for randomized adversaries with bounded persistent memory. This work opens a new perspective on adversarial robustness where the adversary’s computational power is better matched to the algorithm’s. Experimentally and theoretically, their algorithms substantially reduce complexity compared to prior best-known robust algorithms for F2-estimation, achieving space logarithmic in stream length and universe size versus polynomial.

Key findings

A deterministic streaming algorithm tracks any function f with range {0} ∪ [1, α] up to (1+ε) multiplicative approximation against deterministic memoryless adversaries using O((log α)/ε · log(mn)) bits of space (Theorem 1.1 / 4.2).
If the adversary has k bits of persistent memory, the space complexity scales to O(2^k · (log α)/ε · log(mn)) (Theorem 1.1).
Memoryless randomized adversaries can generate streams with flip number Ω(m^{1-c}) and density Ω(m^c) for any 0 < c < 1, showing these adversaries can produce highly challenging streams for robust estimation (Theorem 1.2 / 5.1).
Existing robust streaming algorithms for F2-estimation require roughly Õ(m^{0.4}) space, exponentially more than oblivious algorithms; the authors’ approach achieves space polylogarithmic in m and n for bounded-memory adversaries.
For randomized memoryless adversaries, any order-independent oblivious streaming algorithm using M(ε) space can be robustified with space M(ε/3)·O((log α)/ε · log m) to achieve high-probability multiplicative approximation (Theorem 1.3 / 6.4).
When the randomized adversary has k bits of persistent memory, the space requirement increases by a factor O(2^k) in the robust algorithm, i.e., M(ε/3)·O(2^k (log α)/ε log m).
A new τ-Stream adversary model introduced to reduce memory-bounded randomized adversaries to switching between τ pre-generated streams, allowing union bounding over streams and efficient robustification.
The paper’s computation-paths-like technique yields simple constructions of robust algorithms against bounded-memory adversaries for a wide class of order-invariant problems beyond F2-estimation.

Threat model

The adversary is fully adaptive but constrained to having at most k bits of persistent memory maintaining state across the stream. It observes the streaming algorithm’s last output estimate at each round and uses unlimited working memory erased every step plus unbounded fresh randomness. It cannot store entire interaction history (contrast standard robust streaming adversaries that have unbounded memory). The adversary aims to adaptively generate stream updates (insertions and deletions) to cause the streaming algorithm to fail its approximation guarantees.

Methodology — deep read

Threat model & assumptions: The adversary is adaptive but constrained in internal memory. Specifically, it always sees the previous output estimate from the streaming algorithm (estimate memory), has unbounded working memory wiped every round, and limited (k-bit) persistent memory retained across rounds. Fresh randomness is available per round but not stored. Deterministic adversaries correspond to no persistent memory and no randomness. This addresses the question: can robust streaming be efficient if adversaries are computationally constrained in memory?
Data: The data model is a streaming setting with m updates to a frequency vector over universe size n. Updates are insertions or deletions (+1/-1). The adversary constructs the stream adaptively based on the latest estimate of the algorithm and its internal memory state. The goal is to approximate functions f over this frequency vector at every timestep.
Architecture / algorithm: For deterministic memoryless adversaries, the authors maintain the exact frequency vector in sparse form for coordinates seen, compute f exactly, then round output to values from an ε-net covering [1, α]. The size of the ε-net is O(ε^{-1} log α), bounding the number of distinct outputs seen by adversary, which in turn limits the unique stream items introduced. This yields low space usage. For randomized adversaries, the authors introduce the τ-Stream adversary model, where the adversary switches between τ predetermined streams. Then, starting from an oblivious order-invariant streaming algorithm using space M(ε), they robustify through a construction inspired by computational paths, increasing space by roughly a factor of τ (linear in 2^k). This uses a median-of-estimates approach and union bounds over choice of streams.
Training regime: N/A as this is a theoretical algorithms paper. Complexity bounds are analyzed mathematically and theorems proved.
Evaluation protocol: Results are theoretical, proven rigorously with worst-case guarantees. The authors provide space complexity bounds, approximation factors, and adversarial power characterizations. They analyze adversarial streams’ flip number and density to argue existing approaches fail and motivate their algorithms. They compare space complexities with prior works [BEO22, WZ24]. No empirical experiments or distribution shift evaluations are included.
Reproducibility: The paper provides formal definitions and proofs but no accompanying code or datasets. The results are fully theoretical. The constructions are described with sufficient detail for replication by researchers with background in streaming algorithms. The authors rely on well-established subroutines like ε-nets and union bounds. The underlying oblivious algorithms and definitions used are standard in the streaming literature.

Concrete example: Consider a deterministic memoryless adversary with zero persistent memory trying to foil F2-estimation. The algorithm tracks the frequency vector exactly in sparse form. By rounding outputs to an ε-net with O(ε^{-1} log α) points, the adversary only sees a small range of outputs and can only induce a limited number of unique stream updates. Thus the streaming algorithm maintains only a small set of coordinates over time, using O((log α)/ε · log(mn)) bits, achieving deterministic robust estimates continuously. This contrasts with previous approaches where adversaries memory of entire history forced huge state and space blowups.

Technical innovations

Introduction of a bounded-memory adversary model for robust streaming limiting adversary persistent memory to k bits, balancing power between adversary and algorithm.
Demonstration that memoryless and low-memory adversaries can still generate streams with high flip number and density, ruling out many previous robust streaming techniques.
A deterministic streaming algorithm exactly maintaining sparse frequency vectors and rounding to an ε-net to robustly approximate any function against deterministic memoryless adversaries with polylog space.
Development of a τ-Stream Adversary Model to reduce randomized bounded-memory adversaries to switching among τ predetermined streams, enabling robustification of oblivious streaming algorithms with controlled space blowup.
Construction of robust streaming algorithms against memoryless and low-memory adversaries for broad classes of order-invariant problems, extending beyond F2-estimation.

Baselines vs proposed

Ben-Eliezer, Eden, Onak (BEO22): space complexity for robust F2-estimation = Õ(m^{0.4}) vs proposed deterministic memoryless adversary algorithm: O((log α)/ε · log(mn)) bits (exponentially better in m).
Woodruff and Zhou (WZ24): space Õ(m^{0.4}) for robust F2-estimation vs authors’ algorithm: polylogarithmic in m and n for bounded-memory adversaries.
Order-independent oblivious streaming algorithm baseline uses M(ε) space and robustified algorithm uses M(ε/3) · O(2^k (log α)/ε log m) space against randomized adversary with k bits persistent memory.

Limitations

The model restricts adversary memory but still allows unbounded working memory wiped every round and unlimited fresh randomness, which may not capture all real adversaries.
Results are primarily theoretical upper bounds; no empirical validation or experiments demonstrating practical performance or constants.
The space complexity for adversaries with persistent memory k grows exponentially in k, so results do not scale well for large adversary memory.
The work does not fully close the question of attaining polylogarithmic space robust Fp-moment estimation against arbitrary adaptive adversaries, especially for Fp with p ≠ 2.
Open whether advanced existing algorithms such as Woodruff-Zhou can be optimized specifically to the bounded-memory adversary model to achieve comparable or better space.
Adversarial model assumes memory bounded only on persistent memory, not computational time or adversarial knowledge beyond outputs.

Open questions / follow-ons

Can the Woodruff-Zhou or related robust algorithms be adapted or optimized to achieve polylogarithmic space complexity against memoryless and low-memory randomized adversaries?
Is it possible to close the remaining exponential gap between oblivious and robust streaming for Fp-moment estimation for larger p or full adaptive adversaries with bounded persistent memory?
How do other resource bounds (e.g., limited computational power or time) on the adversary interact with bounded memory constraints to affect robust streaming?
Can these bounded-memory adversary models be combined with or extended to meanfully model practical bot or CAPTCHA adversaries in applied bot-defense systems?

Why it matters for bot defense

For practitioners engaged in bot defense or CAPTCHAs, this work highlights that adversaries with limited memory capacity—even when adaptive—may be less powerful than traditionally assumed. This suggests designing defense mechanisms (like streaming anomaly detectors) that rely on outputs with limited leakage may achieve robust approximations efficiently if the adversaries cannot maintain extensive internal state. The bounded-memory adversary model formalizes how adversaries constrained in their memory make adaptive attacks more tractable to defend against with polylogarithmic space. However, the adversary still can adaptively respond to algorithm outputs with fresh randomness and limited storage, so defenses must carefully consider what information is leaked at each step. This research implies that defending against stateless or low-memory adaptive adversaries is fundamentally easier than defending against fully stateful ones, which could guide CAPTCHA challenge design and monitoring techniques focusing on memory-restricted attacker models.

Cite

bibtex

@article{arxiv2511_01769,
  title={ Robust Streaming Against Low-Memory Adversaries },
  author={ Omri Ben-Eliezer and Krzysztof Onak and Sandeep Silwal },
  journal={arXiv preprint arXiv:2511.01769},
  year={ 2025 },
  url={https://arxiv.org/abs/2511.01769}
}

Robust Streaming Against Low-Memory Adversaries ​

TL;DR ​

Key findings ​

Threat model ​

Methodology — deep read ​

Technical innovations ​

Baselines vs proposed ​

Limitations ​

Open questions / follow-ons ​

Why it matters for bot defense ​

Cite ​

Read the full paper ​