Quantifying Trade-Offs Between Stability and Goal-Obfuscation

Source: arXiv:2605.06630 · Published 2026-05-07 · By Yixuan Wang, Dan Guralnik, Warren Dixon

TL;DR

This paper attacks a fundamental but underappreciated vulnerability of Lyapunov-stable autonomous agents: any agent converging reliably toward a goal is intrinsically legible to a passive Bayesian observer, because the contractive geometry of the basin of attraction concentrates posterior belief over the agent's latent intent parameters (goal location, radius, arrival time). Prior work by the same group demonstrated empirically that a Rao-Blackwellized particle filter (RBPF) adversary can rapidly recover an agent's true intent from noisy position measurements under nominal Lyapunov control. This paper asks the companion synthesis question: how should an agent choose its control input to actively frustrate that inference while still completing its mission?

The paper formalizes intent privacy as a joint control problem over the agent's physical state and the adversary's RBPF information state z = (particle intents θ, state estimates x̂, weights ω). Rather than treating privacy as a penalty term on physical trajectories, the authors treat the RBPF weight vector as a controlled Markov process and apply the probabilistic control barrier function (PCBF) framework of Mestres et al. (2025) to the belief-state dynamics. The central technical challenge is that the RBPF update decomposes into two stochastically independent steps — a Bayesian weight update and a resampling step — each requiring a separate probabilistic barrier analysis before they can be composed into a single finite-horizon privacy guarantee.

The main result is a composite PCBF theorem (Theorem 2) showing that, for every information state z with barrier value above a computable threshold β, there exists a control input achieving the PCBF multiplicative decay condition with failure probability δ_f = 1 − (1−δ_1)(1−δ_2), where δ_1 and δ_2 are the per-step failure tolerances for the Bayesian update and resampling steps respectively. A joint feasibility analysis then characterizes when the privacy constraint and a tracking-error envelope constraint can be simultaneously satisfied, revealing that tight tracking requirements can render the joint problem infeasible, while sufficiently relaxed envelopes recover feasibility. The paper is entirely theoretical; no numerical experiments or hardware validation are presented.

Key findings

Lyapunov stability is structurally incompatible with intent privacy: the contractive NHIM (normally hyperbolic invariant manifold) geometry of any Lyapunov basin concentrates an RBPF observer's posterior over the agent's latent intent triple θ* = (x*, r*, t*) at a rate that increases as the agent converges, making task-optimal trajectories the most legible.
A KL-divergence lower bound H(z) ≥ C − Σ_ν log S_ν(z) is computable online from the RBPF information state alone (Theorem 1, inherited from Wang et al. 2025b), where S_ν(z) is a weighted exponential kernel sum over particles; this time-independent form makes it directly usable as a control barrier function.
The Bayesian update barrier change is bounded as |b(z♯) − b(z)| ≤ 3B(z,y), where B(z,y) is controlled by the Lipschitz constant L(z) = ‖Δ⁻¹‖ D(z), the particle cloud diameter D(z) scaled by the norm of the inverse observation noise covariance. The agent steers observations toward the Chebyshev center ȳ(z) of the particle cloud to minimize leakage (Lemma 1).
The resampling barrier change is bounded using Hoeffding's inequality applied to the N' ≤ N_0 reinitialized particles: P[b(z') ≥ b(z♯) − Δ_r | z♯] ≥ 1 − δ_2, where ε(δ_2) = √(1/(2N_0) log(3/δ_2)) and Δ_r is computable from the prior π over Θ. Crucially, this bound is entirely independent of the agent's control input and dynamics (Lemma 2).
The composite PCBF (Theorem 2) sums the two additive bounds and converts the total into a multiplicative decay condition: for z ∈ C^β_I (b(z) ≥ β > Δ_tot = Δ_b + Δ_r), there exists u such that P[b(F_I(z,u,η)) ≥ α·b(z)] ≥ 1 − δ_f, with α = 1 − Δ_tot/β and δ_f = 1 − (1−δ_1)(1−δ_2), using the independence of observation noise ξ and resampling noise ζ.
Joint feasibility between the privacy PCBF constraint and the tracking-error envelope ϱ(t) is governed by the interpolation parameter μ ∈ [0,1] between the Chebyshev-center privacy controller u_p and the reference-tracking controller; feasibility requires β > min{A_1, B_1} + Δ_r (equation 19), and is recovered whenever ϱ grows fast enough to avoid conflicting lower bounds on μ.
The resampling bound's independence from agent dynamics (noted explicitly in the paper) is identified as a key structural property motivating extensions of the framework to more complex, non-fully-actuated agent dynamics.
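The quantities quoted in the findings above can be evaluated directly. The following is a minimal sketch, not code from the paper: the function names are hypothetical, the numeric inputs in the usage note are illustrative placeholders, and ε(δ_2) is read as √(log(3/δ_2) / (2N_0)).

```python
import math


def hoeffding_eps(delta2: float, n0: int) -> float:
    """Hoeffding deviation eps(delta_2) = sqrt(log(3/delta_2) / (2*N_0)).

    Assumption: this is how the summary's formula for eps is parsed.
    """
    if not 0.0 < delta2 < 1.0:
        raise ValueError("delta2 must lie in (0, 1)")
    if n0 <= 0:
        raise ValueError("n0 must be positive")
    return math.sqrt(math.log(3.0 / delta2) / (2.0 * n0))


def composite_failure(delta1: float, delta2: float) -> float:
    """delta_f = 1 - (1 - delta_1)(1 - delta_2) for two independent steps."""
    return 1.0 - (1.0 - delta1) * (1.0 - delta2)


def decay_rate(delta_b: float, delta_r: float, beta: float) -> float:
    """alpha = 1 - Delta_tot / beta, defined only when beta > Delta_tot."""
    delta_tot = delta_b + delta_r
    if beta <= delta_tot:
        raise ValueError("beta must exceed delta_b + delta_r")
    return 1.0 - delta_tot / beta


if __name__ == "__main__":
    # Illustrative placeholder values, not taken from the paper.
    print(composite_failure(0.05, 0.05))
    print(decay_rate(0.2, 0.1, 1.0))
    print(hoeffding_eps(0.05, 100))
```

With per-step tolerances of 0.05 each, the composite failure probability comes out slightly below the sum 0.10, which is the benefit of using independence instead of adding the tolerances.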

Threat model

The adversary is a computationally capable but strictly passive observer operating in R² or R³. The adversary knows the functional form of the agent's Lyapunov-stabilizing closed-loop dynamics (parameterized by the unknown intent θ) and the disturbance bound d̄, and maintains N ≫ 0 weighted particles updated via an RBPF with Gaussian observation likelihood N(x̂^(i), Δ) at each discrete observation time t_k. The adversary does not know θ* a priori and seeks to infer it from the sequence of noisy position measurements {y_k}. The adversary cannot influence the agent's trajectory, cannot change its own update rate or particle count adaptively, cannot communicate with other observers, and cannot perform side-channel attacks. The agent models the adversary's filter state internally as a proxy for inference risk but cannot directly observe the adversary's actual belief state or verify whether the adversary has converged. The analysis does not address adversaries using inverse reinforcement learning, maximum entropy IRL (Ziebart et al. 2008), or inverse planning (Ramírez & Geffner 2009), though these are cited as related threat models in the introduction.

Methodology — deep read

THREAT MODEL & ASSUMPTIONS. The adversary is a passive observer who receives noisy position measurements y_k of the agent at discrete times t_k and runs an RBPF to infer the agent's intent triple θ* = (x*, r*, t*) — goal center, goal radius, and desired arrival time. The adversary knows the agent's closed-loop motion model (Lyapunov-stabilizing dynamics up to a bounded disturbance ‖d‖ ≤ d̄), the observation noise distribution N(0, Δ), and the prior π over Θ. The adversary does not act; the analysis is one-sided. The agent knows θ* and models the adversary's RBPF internally as a proxy for inference risk. Critically, the agent does not have access to the adversary's actual belief state; it maintains an internal simulation of the RBPF. The agent's dynamics are simplified to a fully actuated differential inclusion ẋ ∈ u + d̄B (bounded unknown disturbance on control input), making the physical control design trivial so that the paper can concentrate all technical effort on the belief-state analysis.

DATA & SETUP. There is no empirical dataset. The theoretical setup posits: agent state space Rⁿ (n ∈ {2,3}); intent domain Θ = RB × [r_min, r_max] × [T_min, T_max]; N ≫ 0 particles initialized with uniform weights ω_0^(i) = 1/N, positions sampled uniformly from workspace Ω, radii from [r_min, r_max], times from [T_min, T_max]. Observations are generated as y_k = x(t_k) + ξ_k, ξ_k ~ N(0, Δ). The agent's control is piecewise constant between RBPF update times, so the continuous-time dynamics reduce to a discrete-time problem synchronized with the RBPF update cadence Δt.
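The initialization and observation model described above could be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the workspace Ω is taken to be an axis-aligned rectangle in R², the sampled "positions" are read as goal-location hypotheses, the noise covariance is taken to be isotropic (Δ = σ²I), and the names `Particle`, `init_particles`, and `observe` are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Particle:
    goal: tuple      # hypothesized goal center x*
    radius: float    # hypothesized goal radius r*
    arrival: float   # hypothesized arrival time t*
    weight: float    # particle weight, uniform at initialization


def init_particles(n, workspace, r_range, t_range, rng):
    """Draw n intent hypotheses uniformly and give each weight 1/n.

    Assumption: workspace is ((xmin, xmax), (ymin, ymax)).
    """
    (xmin, xmax), (ymin, ymax) = workspace
    return [
        Particle(
            goal=(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)),
            radius=rng.uniform(*r_range),
            arrival=rng.uniform(*t_range),
            weight=1.0 / n,
        )
        for _ in range(n)
    ]


def observe(x, sigma, rng):
    """Noisy position measurement y = x + xi, with xi ~ N(0, sigma^2 I)."""
    return tuple(xi + rng.gauss(0.0, sigma) for xi in x)


if __name__ == "__main__":
    rng = random.Random(0)
    cloud = init_particles(1000, ((0.0, 10.0), (0.0, 10.0)), (0.5, 2.0), (5.0, 30.0), rng)
    print(len(cloud), sum(p.weight for p in cloud))
    print(observe((1.0, 2.0), 0.1, rng))
```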

ARCHITECTURE / ALGORITHM. The RBPF maintains N particles {(θ^(i)_k, x̂^(i)_k, ω^(i)_k)}. Each update cycle has three stages: (1) Propagation: Euler-integrate each particle's predicted state x̂^(i−)_k = x̂^(i) + Δt·f(x̂^(i)) + ξ_k with a Kalman-style covariance update. (2) Bayesian weight update: ω^♯(i)_k ∝ ω^(i) · p(y_k | x̂^♯(i)_k), where likelihoods are Gaussian N(x̂^♯(i)_k, Δ). (3) Resampling (if N_eff ≤ N_0): retain the N_eff highest-weight particles, replicate each proportionally, and reinitialize the remaining N' ≤ N_0 particles from the prior π. The information state z = (θ, x̂, ω) is treated as a controlled Markov process with transition F_I(z, u, η) = R(F♯(z, u, ξ), ζ). The barrier function is b(z) = H(z) − γ, where H(z) = C − Σ_ν log S_ν(z) is the KL-divergence lower bound from Theorem 1. The novel module is the privacy controller u_p: it steers the agent toward ȳ(z), the Chebyshev center of the particle cloud {x̂^(i)}, minimizing B(z,y) and thereby minimally informing the observer. The actual control is a convex interpolation u_b = μ·u_p + (1−μ)·(x_ref(t_{k+1}) − x_k)/Δt between the privacy controller u_p = (ȳ(z) − x_k)/Δt and the reference-tracking controller, with μ ∈ [0, μ_max(k)], where μ_max enforces the tracking envelope ϱ(t_{k+1}).
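Stage (2) and the resampling trigger can be illustrated with a short sketch. It is not the paper's implementation: it assumes an isotropic covariance Δ = σ²I, and it uses the common definition N_eff = 1/Σ ω², which the summary does not spell out. The function names are hypothetical.

```python
import math


def bayes_weight_update(weights, estimates, y, sigma):
    """Reweight particles by a Gaussian likelihood N(x_hat, sigma^2 I) of y.

    Works in log space for numerical stability; the Gaussian normalizing
    constant is the same for every particle and cancels on normalization.
    """
    log_post = []
    for w, x_hat in zip(weights, estimates):
        sq_dist = sum((yi - xi) ** 2 for yi, xi in zip(y, x_hat))
        log_w = math.log(w) if w > 0.0 else -math.inf
        log_post.append(log_w - sq_dist / (2.0 * sigma ** 2))
    peak = max(log_post)
    if peak == -math.inf:
        raise ValueError("all weights are zero")
    unnorm = [math.exp(lp - peak) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def effective_sample_size(weights):
    """N_eff = 1 / sum(w_i^2); assumed definition, standard for particle filters."""
    return 1.0 / sum(w * w for w in weights)


if __name__ == "__main__":
    # An observation equidistant from both estimates leaves the weights unchanged.
    print(bayes_weight_update([0.5, 0.5], [(-1.0, 0.0), (1.0, 0.0)], (0.0, 0.0), 1.0))
    # An observation on top of one estimate collapses the weights onto it.
    w = bayes_weight_update([0.5, 0.5], [(0.0, 0.0), (10.0, 0.0)], (0.0, 0.0), 1.0)
    print(w, effective_sample_size(w))
```

The first call shows the intuition behind the privacy controller: an observation that is equally consistent with all particles carries no information about which one is correct.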

TRAINING REGIME. Not applicable — this is a control-theoretic paper with no learned parameters. All results are analytical. Key parameters are: δ_1 (failure tolerance for Bayesian update step), δ_2 (failure tolerance for resampling step), β (barrier threshold, must exceed Δ_tot = Δ_b + Δ_r), α = 1 − Δ_tot/β (PCBF decay rate), N_0 (resampling threshold), and σ_x, σ_r, σ_t (adversary localization/timing precision parameters in the kernel γ_ν).

EVALUATION PROTOCOL. The paper is entirely theoretical. The evaluation consists of: (a) Lemma 1 — a one-step additive bound on b(z♯) − b(z) for the Bayesian update, proven via Lipschitz analysis of the log-likelihood ratio and a chi-squared tail bound (Laurent & Massart 2000) for the observation noise norm; (b) Lemma 2 — a one-step additive bound on b(z') − b(z♯) for the resampling step, proven via Hoeffding's inequality applied to the N' reinitialized particles' kernel contributions; (c) Theorem 2 — composition of (a) and (b) into a PCBF condition using the independence of ξ and ζ and the sublevel set restriction to C^β_I; (d) Joint feasibility analysis (Section 5) — algebraic analysis of when μ satisfying both the PCBF condition and the tracking envelope constraint (eq. 16) simultaneously exists, expressed as inequality constraints on μ. No baselines, no ablations, no numerical experiments, no statistical tests, no cross-validation.
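The chi-squared tail bound cited in (a) has a simple closed form that can be evaluated numerically. The sketch below uses the standard Laurent and Massart inequality, P[X ≥ n + 2√(n·t) + 2t] ≤ e^(−t) for X ~ χ²_n, under the assumption of isotropic noise ξ ~ N(0, σ²I_n). The paper works with a general covariance Δ, so its constants may differ; the function name is hypothetical.

```python
import math


def noise_norm_bound(n: int, sigma: float, delta: float) -> float:
    """Radius rho such that ||xi|| <= rho with probability >= 1 - delta.

    Assumes xi ~ N(0, sigma^2 I_n), so ||xi||^2 / sigma^2 is chi-squared
    with n degrees of freedom, and applies the Laurent-Massart upper tail
    bound with t = log(1/delta).
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    t = math.log(1.0 / delta)
    return sigma * math.sqrt(n + 2.0 * math.sqrt(n * t) + 2.0 * t)


if __name__ == "__main__":
    # Planar observation noise with unit variance and a 5% failure tolerance.
    print(noise_norm_bound(2, 1.0, 0.05))
```

A radius of this kind is what lets a Lipschitz bound on the log-likelihood ratio be turned into a high-probability bound on the barrier change.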

CONCRETE END-TO-END EXAMPLE. Consider a planar agent (n=2) with N=1000 particles, Δt=0.1s. At time t_k, the information state z holds particle positions x̂^(i) clustered within a diameter D(z). The agent computes ȳ(z) (Chebyshev center of the cloud) and sets u_p = (ȳ(z) − x_k)/Δt. If the tracking envelope allows (μ_max > 0), the agent applies u_b = μ·u_p + (1−μ)·(x_ref(t_{k+1}) − x_k)/Δt. The observation y_{k+1} lands near ȳ(z) with high probability, so the likelihood ratios r^(j)(y) are nearly equal across particles, keeping S_ν(z♯) ≈ S_ν(z) and b(z♯) ≈ b(z) (privacy maintained). If N_eff drops below N_0, resampling fires: N' new particles are drawn from π, contributing random kernel values Ȳ_ν bounded above by E_π(Ȳ_ν) + ε(δ_2) with probability at least 1 − δ_2/3 per ν, so a union bound over the three kernels ν gives b(z') ≥ b(z♯) − Δ_r with probability at least 1 − δ_2.
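The control step in this example can be sketched in a few lines. Two assumptions are made here that are not in the source: the Chebyshev center is read as the center of the smallest ball enclosing the particle estimates, and it is approximated with the Bădoiu-Clarkson farthest-point iteration, a swapped-in technique chosen for brevity and not the paper's method. The function names are hypothetical.

```python
import math


def approx_chebyshev_center(points, iters=200):
    """Approximate the center of the minimum enclosing ball of a point set.

    Badoiu-Clarkson iteration: repeatedly step toward the current farthest
    point with a shrinking step size 1/(i+1).
    """
    c = tuple(points[0])
    for i in range(1, iters + 1):
        far = max(points, key=lambda p: math.dist(p, c))
        c = tuple(ci + (fi - ci) / (i + 1) for ci, fi in zip(c, far))
    return c


def blended_control(x_k, y_bar, x_ref_next, dt, mu):
    """u_b = mu * u_p + (1 - mu) * u_ref, with both terms as velocities."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    u_p = [(yb - xk) / dt for yb, xk in zip(y_bar, x_k)]
    u_ref = [(xr - xk) / dt for xr, xk in zip(x_ref_next, x_k)]
    return tuple(mu * p + (1.0 - mu) * r for p, r in zip(u_p, u_ref))


if __name__ == "__main__":
    cloud = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5)]  # illustrative estimates
    y_bar = approx_chebyshev_center(cloud)
    print(y_bar)
    print(blended_control((0.0, 0.0), y_bar, (0.0, 1.0), 0.1, 0.5))
```

Setting `mu=0.0` recovers pure reference tracking and `mu=1.0` recovers the pure privacy controller, which makes the trade-off governed by μ easy to probe numerically.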

REPRODUCIBILITY. No code, no simulation, no data released. All results are mathematical theorems and lemmas. The paper explicitly states that 'further work developing numerical and hardware experiments is necessary for validation.'

Technical innovations

Intent privacy is reformulated as a control problem on the RBPF information state z = (θ, x̂, ω) rather than as an auxiliary penalty on physical trajectories, treating the KL-divergence lower bound H(z) directly as a control barrier function — prior work (Wang et al. 2025b) only measured leakage without synthesizing privacy-preserving controllers.
Separate PCBF results are derived for the Bayesian update step (Lemma 1) and the resampling step (Lemma 2) of the RBPF. The stochastic independence of observation noise ξ and resampling noise ζ is then used to compose them into a single composite PCBF (Theorem 2) by multiplying the per-step success probabilities, giving δ_f = 1 − (1−δ_1)(1−δ_2). The resampling barrier bound (Lemma 2) is notable for being completely independent of the agent's control input, a structural property enabling future extension to richer agent dynamics.
The Chebyshev center of the particle cloud ȳ(z) is identified as the optimal observation target for the privacy controller: steering the next observation toward ȳ(z) minimizes the Lipschitz-bounded change in log-likelihood ratios across particles, thereby minimizing information leakage increment per step.
A joint feasibility characterization (Section 5) exposes the trade-off between privacy and tracking via the interpolation parameter μ, reducing the question of simultaneous constraint satisfaction to a set of explicit scalar inequalities (eq. 19 and eq. 16), quantifying precisely when tight tracking envelopes render goal obfuscation infeasible.
The additive-to-multiplicative PCBF conversion (eq. 18) via sublevel set restriction C^β_I provides a general technique applicable to barrier functions analyzed additively, extending the PCBF framework of Mestres et al. (2025) to the composed RBPF update setting.
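The additive-to-multiplicative step can be written out in two lines. The derivation below is a reconstruction from the quantities quoted in this summary, not a transcription of eq. 18. It takes Δ_b to be the additive bound from Lemma 1 and Δ_r the one from Lemma 2, and it uses only b(z) ≥ β > Δ_tot.

```latex
% On the event that both one-step bounds hold:
\begin{align*}
b(z') &\ge b(z^{\sharp}) - \Delta_r
       \ge b(z) - \Delta_b - \Delta_r
       = b(z) - \Delta_{\mathrm{tot}} \\
      &\ge b(z) - \Delta_{\mathrm{tot}}\,\frac{b(z)}{\beta}
       = \Bigl(1 - \frac{\Delta_{\mathrm{tot}}}{\beta}\Bigr)\, b(z)
       = \alpha\, b(z),
\end{align*}
% where the second line uses b(z)/\beta \ge 1 on C^{\beta}_I.
% With \xi and \zeta independent, both bounds hold jointly with probability
% at least (1-\delta_1)(1-\delta_2) = 1 - \delta_f.
```

Because β > Δ_tot > 0, the resulting rate satisfies 0 < α < 1, so the condition is a genuine multiplicative decay bound.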

Limitations

No numerical simulations or hardware experiments are presented; the paper explicitly acknowledges this gap and defers validation to future work, making all guarantees purely theoretical with no empirical grounding.
The agent dynamics are intentionally oversimplified to a fully actuated differential inclusion ẋ ∈ u + d̄B; real autonomous agents (underactuated robots, vehicles with nonholonomic constraints, UAVs) would invalidate the direct applicability of Lemma 1, and the authors acknowledge that extending to complex dynamics requires further RBPF convergence results.
The adversary is modeled as strictly passive — it observes trajectories but does not adapt its sampling strategy, change its update rate, or respond to detected obfuscation behavior. An active adversary that increases N, adjusts N_0, or changes Δt could likely defeat the privacy guarantees.
The analysis assumes that the agent's internal simulation reproduces the adversary's RBPF information state z (particle positions, weights) exactly, so that ȳ(z) and S_ν(z) can be computed; this is unrealistic in deployment since the agent cannot directly observe the adversary's filter state, and no estimation mechanism for z is provided.
No RBPF convergence rates are derived, which the paper itself identifies as the critical missing piece for establishing whether obfuscation and practical (asymptotic) stability of tracking errors can ever be jointly achieved — the current feasibility result only applies over finite horizons with relaxed envelopes.
The KL lower bound H(z) used as the barrier is a conservative proxy for the true information leakage, that is, the KL divergence it bounds from below; tightness of this bound is not characterized, so the privacy guarantee may be significantly looser than the true KL divergence warrants, potentially over-constraining the controller unnecessarily.
The threshold β must satisfy β > Δ_tot as an initialization condition, but Δ_tot depends on A_1 and B_1 which are suprema over all z ∈ Z — these may be very large or even infinite in practice depending on particle cloud geometry, making the theory difficult to instantiate with concrete numbers.

Open questions / follow-ons

Can RBPF convergence rates be characterized analytically — specifically, at what rate does the particle distribution concentrate around θ* under Lyapunov control, and how does this rate interact with the minimum tracking envelope growth rate needed to keep the joint feasibility constraints satisfiable?
How should the agent estimate the adversary's information state z = (θ, x̂, ω) in practice, given that the agent cannot observe the adversary's filter directly, and how do estimation errors in z propagate into the PCBF guarantees?
Can the framework be extended to active adversaries who adapt their RBPF parameters (N, N_0, Δt, Δ) in response to detected obfuscation behavior, or to adversaries using ensemble methods or deep learning-based intent inference rather than RBPF?
Does the Chebyshev-center privacy controller remain optimal (in the sense of minimizing information leakage per step) for more complex agent dynamics with nonholonomic constraints, or does the optimal obfuscation target shift in ways that require a different geometric characterization?

Why it matters for bot defense

The paper's core insight — that any agent whose behavior is contractively goal-directed is intrinsically legible to a Bayesian observer, and that this legibility is sharpest precisely when behavior is most task-optimal — is structurally relevant to bot detection. Bots completing web tasks (solving CAPTCHAs, navigating flows, filling forms) tend to exhibit goal-directed, low-variance trajectories that are statistically distinguishable from human behavior. The RBPF inference framework described here is analogous to behavioral biometric models that maintain a weighted hypothesis set over user intent and update posteriors from interaction observations (mouse movements, keystroke timing, scroll patterns). The paper's formalism quantifies exactly why 'efficient' bot behavior is also 'legible' behavior, and provides a theoretical lens for understanding why adversarial bots that attempt to mimic human trajectory variance face a fundamental stability-versus-legibility trade-off.

For bot-defense practitioners, the most actionable implication is the dual-use nature of the framework: the RBPF intent-inference architecture described in Wang et al. (2025b) and formalized here could in principle be deployed as a real-time detector, maintaining particle posteriors over user intent and flagging sessions where belief concentrates too rapidly (indicating bot-like goal-directedness). Conversely, the paper characterizes exactly what a sophisticated bot would need to do to evade such a detector — inject trajectory variance (equivalent to increasing μ toward the obfuscation controller) — and shows that doing so necessarily degrades task completion efficiency (tight tracking envelopes become infeasible). This trade-off is directly useful for hardening CAPTCHA and behavioral challenge design: challenges that require precise, efficient completion create legible bots, while challenges tolerant of noisy completion paths may be easier for obfuscating bots to pass. The lack of numerical experiments and the simplified agent dynamics model limit immediate engineering application, but the theoretical structure is sound and worth monitoring as the authors produce the promised simulation and hardware follow-up work.
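The idea of flagging sessions whose belief concentrates too rapidly can be made concrete with a toy metric. The sketch below is purely hypothetical and does not come from the paper: it measures concentration by the Shannon entropy of the particle weights and flags any single-step entropy drop above a threshold. Both the metric and the threshold value are assumptions chosen for illustration.

```python
import math


def weight_entropy(weights):
    """Shannon entropy (in nats) of a normalized weight vector."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)


def flags_rapid_concentration(entropy_trace, max_drop_per_step):
    """True if entropy falls by more than max_drop_per_step in any one step.

    Assumption: entropy_trace holds one entropy value per filter update.
    """
    return any(
        prev - cur > max_drop_per_step
        for prev, cur in zip(entropy_trace, entropy_trace[1:])
    )


if __name__ == "__main__":
    uniform = [0.25] * 4
    peaked = [0.97, 0.01, 0.01, 0.01]
    trace = [weight_entropy(uniform), weight_entropy(peaked)]
    print(trace, flags_rapid_concentration(trace, 0.5))
```

In practice the threshold would have to be calibrated against real session data, which neither the paper nor this summary provides.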

Cite

@article{arxiv2605_06630,
  title={Quantifying Trade-Offs Between Stability and Goal-Obfuscation},
  author={Yixuan Wang and Dan Guralnik and Warren Dixon},
  journal={arXiv preprint arXiv:2605.06630},
  year={2026},
  url={https://arxiv.org/abs/2605.06630}
}
