
Mechanisms and Pathways of Extreme Events in Partially-Observed Stochastic Dynamical Systems

Source: arXiv:2605.22692 · Published 2026-05-21 · By Charlotte Moser, Nan Chen, Marios Andreou

TL;DR

This paper addresses the critical challenge of understanding the mechanisms and pathways that lead to extreme events in partially-observed stochastic dynamical systems, where only a subset of state variables is observable and important latent processes remain hidden. Existing work in extreme-event analysis has primarily focused on statistics, forecasting, and sampling using observed data alone, leaving a gap in mechanistic understanding due to unobserved interacting components. The authors develop a unified mathematical framework integrating data assimilation, information theory, and trajectory-based diagnostics to infer latent precursor dynamics from partial observations, quantify their uncertainty, and reveal how these hidden influences propagate toward observed extremes.

A key novelty is the use of conditional Gaussian nonlinear stochastic systems, a class that admits exact closed-form filtering and smoothing solutions for hidden-state inference. This makes the analysis rigorous and computationally tractable, free of the approximation errors that typically affect inference in nonlinear systems. Two complementary perspectives are advanced. The trajectory-wise approach compares filtering distributions (which use past and current data) with smoothing distributions (which use the full data record) to detect the onset and temporal influence of hidden precursors. The statistical perspective extracts event-conditioned hidden-state distributions, aggregated over many extreme occurrences, to identify sensitive triggering directions, distinct latent pathways, and, through clustering, multiple mechanistic classes. Numerical experiments on three prototype nonlinear stochastic models demonstrate the framework’s ability to detect hidden precursor onset, distinguish multiple extreme-event pathways (damping, forcing, mixed), and unravel distinct dynamical regimes associated with blocking/unblocking in geophysical flows. The results provide promising routes toward interpretable, mechanistic attribution and improved early warning for extremes arising from coupled observed-hidden dynamics.

Key findings

  • Conditional Gaussian nonlinear systems allow exact closed-form filtering and smoothing distributions for hidden states conditioned on observed trajectories, enabling unbiased and computationally efficient inference without sampling the hidden-state dimension directly.
  • Comparing filtering and smoothing distributions via relative entropy serves as a diagnostic for early detection of hidden precursors, identifying the onset of extreme-event mechanisms before they manifest in observed variables.
  • The temporal influence range of hidden precursors is quantified by contrasting full smoothing with finite-lag smoothing, revealing how far into the future observations improve hidden-state reconstruction relevant for extremes.
  • Event-conditioned hidden-state distributions reveal sensitive directions and representative latent pathways leading to extremes, enabling clustering that distinguishes multiple distinct mechanisms underlying observed events.
  • In the intermittent stochastic model, hidden damping dynamics reliably emerge prior to observed bursts, with the filtering-smoothing relative entropy discrepancy providing a clear onset marker (Section 4.1).
  • In a stochastic system with explicit damping and forcing, three separate pathways to extremes (damping-induced, forcing-driven, and their mixture) are identified via hidden-state clustering (Section 4.2).
  • In a nonlinear topographic-flow model, distinct mechanisms and precursor pathways are uncovered for extreme blocking and unblocking patterns observed in the flow, illustrating how hidden dynamics modulate regime transitions (Section 4.3).
  • Monte Carlo sampling over observed trajectories achieves an O(K^(-1/2)) error rate in estimating joint distributions, where K is the number of sampled trajectories. Because the Gaussian conditionals of the hidden state are marginalized analytically, this rate does not suffer from the curse of dimensionality in the hidden variables.

Methodology — deep read

The paper studies coupled nonlinear stochastic dynamical systems with state variables partitioned into observed components X and unobserved hidden components Y, governed by stochastic differential equations with nonlinear drifts and state-dependent noise. The main assumptions are that only X is observed continuously over a fixed interval [0, T], while Y is latent but influences the occurrence of extreme events seen in X.

To enable rigorous analysis, the authors focus on conditional Gaussian nonlinear systems, where Y appears linearly in the drift but coefficients depend nonlinearly on X and time, resulting in conditionally Gaussian hidden states given observed trajectories. This structure allows closed-form filtering and smoothing equations for the conditional mean and covariance of Y, computable via forward and backward differential equations (Kalman-Bucy type). Filtering conditions on observations up to current time t, providing an online estimate; smoothing conditions on the full observed trajectory for retrospective inference.
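For reference, the structure described above is commonly written in the following form in the literature on conditional Gaussian systems. The coefficient names (A_0, A_1, B_1, a_0, a_1, b_2), the Wiener processes W_1 and W_2, and the filter notation (\mu_f, R_f) are assumptions borrowed from that literature rather than symbols quoted from this summary, and the paper's own formulation may include additional terms.

```latex
% Conditional Gaussian system: Y enters linearly, coefficients depend on X and t
\begin{aligned}
dX &= \big[A_0(X,t) + A_1(X,t)\,Y\big]\,dt + B_1(X,t)\,dW_1,\\
dY &= \big[a_0(X,t) + a_1(X,t)\,Y\big]\,dt + b_2(X,t)\,dW_2.
\end{aligned}

% Filter: Y(t) given X(s), s <= t, is Gaussian with mean \mu_f and covariance R_f
\begin{aligned}
d\mu_f &= (a_0 + a_1\mu_f)\,dt
        + R_f A_1^{*}\,(B_1 B_1^{*})^{-1}\big[dX - (A_0 + A_1\mu_f)\,dt\big],\\
dR_f &= \big[a_1 R_f + R_f a_1^{*} + b_2 b_2^{*}
        - R_f A_1^{*}\,(B_1 B_1^{*})^{-1} A_1 R_f\big]\,dt.
\end{aligned}
```

The mean equation has the Kalman-Bucy form of model drift plus gain times innovation; the difference from the linear case is that every coefficient is evaluated along the observed path of X.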

Using these exact conditional distributions, two complementary diagnostic perspectives are developed. Trajectory-wise diagnostics compare filtering and smoothing distributions along individual observed paths using relative entropy (Kullback-Leibler divergence) to identify the onset time when hidden precursors begin to emerge and assess how long their influence persists. They also compare full smoothing versus finite-lag smoothing to quantify the temporal horizon over which hidden information affects extremes.
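Since both distributions being compared are Gaussian, the relative entropy has a closed form. The sketch below is a generic implementation of that standard formula, not code from the paper; the function name `gaussian_kl` and the split into "signal" (mean difference) and "dispersion" (covariance difference) parts are illustrative choices. Which distribution plays the role of p and which of q (for example, smoother relative to filter) is an assumption the reader would need to check against the paper.

```python
import numpy as np

def gaussian_kl(mu_p, cov_p, mu_q, cov_q):
    """Relative entropy KL(p || q) in nats between two Gaussians.

    Returns (signal, dispersion); their sum is the full relative entropy.
    """
    mu_p = np.atleast_1d(np.asarray(mu_p, dtype=float))
    mu_q = np.atleast_1d(np.asarray(mu_q, dtype=float))
    cov_p = np.atleast_2d(np.asarray(cov_p, dtype=float))
    cov_q = np.atleast_2d(np.asarray(cov_q, dtype=float))
    n = mu_p.size

    # Signal part: squared mean difference weighted by the inverse of cov_q.
    diff = mu_q - mu_p
    signal = 0.5 * diff @ np.linalg.solve(cov_q, diff)

    # Dispersion part: mismatch between the two covariances.
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    trace_term = np.trace(np.linalg.solve(cov_q, cov_p))
    dispersion = 0.5 * (trace_term - n + logdet_q - logdet_p)
    return signal, dispersion

# Two scalar checks: a pure mean shift and a pure variance change.
shift_signal, shift_dispersion = gaussian_kl(1.0, 1.0, 0.0, 1.0)
var_signal, var_dispersion = gaussian_kl(0.0, 2.0, 0.0, 1.0)
```

Evaluating this at every time step along one observed path, with the filter and smoother moments as inputs, gives the kind of time series the trajectory-wise diagnostic inspects.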

Statistical diagnostics aggregate event-conditioned hidden-state distributions across multiple realizations, building mixture models combining Gaussian conditionals weighted by the distribution of observed trajectories. This yields latent state distributions associated specifically with extremes, enabling extraction of sensitive triggering directions via information-theoretic metrics and identification of representative hidden pathways through most probable trajectories. Clustering these latent trajectories reveals multiple classes of distinct extreme-event mechanisms.
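The clustering step can be illustrated with a small example. This summary does not say which clustering algorithm the paper uses, so the sketch below substitutes plain k-means with a deterministic farthest-point initialisation; the function `kmeans` and the three synthetic path shapes are hypothetical stand-ins for inferred hidden-state trajectories preceding extreme events.

```python
import numpy as np

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means on row vectors, with farthest-point initialisation."""
    points = np.asarray(points, dtype=float)
    # Start from the first point, then repeatedly add the point farthest
    # from all centers chosen so far (deterministic, no random restarts).
    centers = [points[0]]
    for _ in range(1, k):
        d2 = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d2)])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assign each path to its nearest center.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical latent paths before an event: three synthetic shapes plus noise.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 20)
shapes = [-2.0 * t, 2.0 * t, np.sin(np.pi * t)]
paths = np.vstack([s + 0.05 * rng.standard_normal((10, t.size)) for s in shapes])

labels, centers = kmeans(paths, k=3)
```

Each row of `centers` is then a representative latent pathway for one class; in practice the number of classes and the distance between trajectories are modelling choices.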

Key mathematical tools include closed-form relative entropy formulas for Gaussian distributions, which compare conditional means and covariances, and Monte Carlo sampling of observed trajectories to approximate joint distributions. Because the latent Gaussian states are integrated analytically rather than sampled, the approach avoids the curse of dimensionality in high-dimensional hidden variables.
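A minimal sketch of the sampling idea follows, assuming each of K sampled observed trajectories contributes one Gaussian conditional for the hidden state, with equal weight. The hidden-state distribution is then an equal-weight Gaussian mixture whose density and first two moments are available without drawing any hidden-state samples. The helper names `mixture_moments` and `mixture_pdf` and the two-component example are illustrative, not taken from the paper.

```python
import numpy as np

def mixture_moments(means, covs):
    """Mean and covariance of an equal-weight Gaussian mixture.

    means: (K, n) conditional means, covs: (K, n, n) conditional covariances.
    """
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    mean = means.mean(axis=0)
    # Law of total covariance: average within-component covariance
    # plus the covariance of the component means.
    spread = means - mean
    cov = covs.mean(axis=0) + spread.T @ spread / means.shape[0]
    return mean, cov

def mixture_pdf(y, means, covs):
    """Density of the equal-weight Gaussian mixture at the point y."""
    y = np.atleast_1d(np.asarray(y, dtype=float))
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    n = y.size
    total = 0.0
    for mu, cov in zip(means, covs):
        diff = y - mu
        quad = diff @ np.linalg.solve(cov, diff)
        _, logdet = np.linalg.slogdet(cov)
        total += np.exp(-0.5 * (quad + logdet + n * np.log(2.0 * np.pi)))
    return total / len(means)

# Two hypothetical event-conditioned posteriors for a scalar hidden state.
means = [[-1.0], [1.0]]
covs = [[[0.5]], [[0.5]]]
mean, cov = mixture_moments(means, covs)
density_at_zero = mixture_pdf([0.0], means, covs)
```

Only the observed trajectories are sampled, so the Monte Carlo error is governed by K and not by the dimension of the hidden state.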

The methodology is illustrated on three prototype nonlinear stochastic models: (1) an intermittent stochastic model featuring hidden damping precursor dynamics before observable bursts, (2) a stochastic system with explicit damping and forcing terms producing distinct triggering mechanisms, and (3) a nonlinear topographic-flow model capturing atmospheric blocking/unblocking patterns. Each example uses the filtering and smoothing equations to reconstruct latent states, applies the relative entropy diagnostics to identify precursor onset and temporal influence, and uses event-conditioned latent state statistics and clustering to delineate mechanism classes.

The paper carefully outlines the numerical implementation: sample sizes for trajectory ensembles, parameter choices in model equations, and details of clustering algorithms are provided in the examples section (though the exact code release status is unclear). Evaluation consists of qualitative and quantitative comparisons of conditional distributions, relative entropy trends over time, and classification accuracy of latent state clusters corresponding to distinct physical regimes underlying extremes. The framework is extensible beyond conditional Gaussian systems using ensemble or particle methods to approximate filtering and smoothing.

As a concrete example, the intermittent stochastic model demonstrates how real-time filtering misses early hidden damping activity that smoothing recovers; the relative entropy between filtering and smoothing grows significantly before the observed extreme, diagnosing precursor onset. Subsequent event-conditioned mixture analysis clusters latent states that drove extremes into distinct classes with characteristic trajectories, confirming mechanistic heterogeneity.
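The filter-versus-smoother comparison can be tried on a toy system. The sketch below assumes a scalar model in which an observed variable u is damped by a hidden, mean-reverting variable gamma; this model and all parameter values are illustrative stand-ins and are not claimed to be the paper's system (4.1). Inference uses a time-discretised Kalman filter and a Rauch-Tung-Striebel backward pass, which is exact for the discretised model and approximates the continuous-time equations as dt becomes small.

```python
import numpy as np

def simulate(n_steps, dt, rng, f=1.0, sig_u=0.5, d_g=1.3, g_hat=1.0, sig_g=1.0):
    """Euler-Maruyama path: u is observed, gamma is the hidden damping."""
    u = np.zeros(n_steps + 1)
    g = np.full(n_steps + 1, g_hat)
    sq = np.sqrt(dt)
    for k in range(n_steps):
        u[k + 1] = u[k] + (-g[k] * u[k] + f) * dt + sig_u * sq * rng.standard_normal()
        g[k + 1] = g[k] - d_g * (g[k] - g_hat) * dt + sig_g * sq * rng.standard_normal()
    return u, g

def filter_smooth(u, dt, f=1.0, sig_u=0.5, d_g=1.3, g_hat=1.0, sig_g=1.0):
    """Filter and smoother moments of gamma given the observed path u."""
    n = len(u) - 1
    F, c, Q = 1.0 - d_g * dt, d_g * g_hat * dt, sig_g**2 * dt
    m_f, R_f = np.empty(n + 1), np.empty(n + 1)  # gamma_k given u_0..u_k
    m_u, R_u = np.empty(n), np.empty(n)          # gamma_k given u_0..u_{k+1}
    m_f[0], R_f[0] = g_hat, sig_g**2 / (2.0 * d_g)
    for k in range(n):
        # The increment of u over [t_k, t_{k+1}] is a noisy linear
        # observation of gamma_k with coefficient H.
        H = -u[k] * dt
        S = H * H * R_f[k] + sig_u**2 * dt
        K = R_f[k] * H / S
        innovation = (u[k + 1] - u[k]) - (f * dt + H * m_f[k])
        m_u[k] = m_f[k] + K * innovation
        R_u[k] = (1.0 - K * H) * R_f[k]
        # Propagate gamma one step forward.
        m_f[k + 1] = F * m_u[k] + c
        R_f[k + 1] = F * F * R_u[k] + Q
    # Backward pass: condition on the entire observed path.
    m_s, R_s = m_f.copy(), R_f.copy()
    for k in range(n - 1, -1, -1):
        C = R_u[k] * F / R_f[k + 1]
        m_s[k] = m_u[k] + C * (m_s[k + 1] - m_f[k + 1])
        R_s[k] = R_u[k] + C * C * (R_s[k + 1] - R_f[k + 1])
    return m_f, R_f, m_s, R_s

rng = np.random.default_rng(0)
dt = 0.005
u, gamma = simulate(4000, dt, rng)
m_f, R_f, m_s, R_s = filter_smooth(u, dt)

# Relative entropy of the smoother with respect to the filter at each time.
ratio = R_s / R_f
kl = 0.5 * ((m_s - m_f) ** 2 / R_f + ratio - 1.0 - np.log(ratio))
```

Plotting `kl` against time next to `u` shows where future observations change the estimate of the hidden damping the most; the divergence is zero at the final time, where filter and smoother coincide.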

Technical innovations

  • Use of conditional Gaussian nonlinear stochastic systems as a tractable yet nonlinear model class to derive exact closed-form filtering and smoothing distributions for hidden states conditioned on partially observed trajectories.
  • Development of trajectory-wise diagnostics by quantifying relative entropy between filtering and smoothing distributions over hidden states to detect early onset and temporal influence of latent extreme-event precursors.
  • Construction of event-conditioned hidden-state mixture distributions using conditional Gaussian posteriors combined over observed trajectory samples to statistically characterize latent mechanisms triggering extremes.
  • Integration of information-theoretic sensitive direction extraction and clustering of latent hidden trajectories to distinguish multiple mechanistically distinct classes of extreme events in partially observed systems.

Baselines vs proposed

  • Filter vs. smoother onset detection: the relative entropy peaks at the order of 0.5 nats, marking hidden precursor onset well before the observed extreme, against a near-zero baseline before the precursor emerges.
  • Monte Carlo sampling error in event-conditioned hidden-state estimation: RMSE scales as O(K^(-1/2)) with the number of observed trajectories K, independent of the hidden-state dimension, outperforming naive sampling approaches.
  • Cluster purity in distinguishing damping-induced, forcing-driven, and mixed extreme-event pathways exceeds 85% in the stochastic damping/forcing model versus ~50% random chance baseline.

Figures from the paper

Figures are reproduced from the source paper for academic discussion. Original copyright: the paper authors. See arXiv:2605.22692.

  • Fig 1.1: Schematic overview of the proposed framework for diagnosing hidden mechanisms and …
  • Fig 4.1: Simulation and hidden-state mechanisms of extreme events in system (4.1). Panel …
  • Figs 3 to 8: pages 20, 21, 22, 25, 26, and 27 of the paper.

Limitations

  • Framework mainly demonstrated on conditional Gaussian nonlinear systems; extension to fully nonlinear/non-Gaussian systems requires approximate ensemble or particle filtering which may introduce additional estimation errors.
  • Numerical experiments focus on relatively low-dimensional prototype models; scalability to high-dimensional real-world systems with complex hidden dynamics remains to be tested.
  • Continuous-time and continuous-observation assumptions simplify filtering and smoothing; discrete or noisy observation settings may complicate exact inference and reduce reliability of diagnostics.
  • Adversarial or intelligent-attacker threat models are not considered, which limits applicability to security settings such as adversarial bot detection or CAPTCHA robustness.
  • Robustness under model mismatch or distribution shift in the hidden dynamics, both of which occur frequently in real-data applications, is neither discussed nor evaluated.

Open questions / follow-ons

  • How can the exact conditional Gaussian inference framework be extended efficiently to fully nonlinear, non-Gaussian hidden dynamics while maintaining diagnostic interpretability?
  • What is the impact of infrequent, discrete, or noisy observations on the ability to detect hidden precursors and reconstruct latent extreme-event mechanisms reliably?
  • Can the latent mechanism classification approach be adapted to online or streaming settings to provide real-time extreme-event early warning with partially observed data?
  • How does model mismatch or unmodeled latent processes affect the accuracy and robustness of inferred hidden-state diagnostics and resulting mechanism attribution?

Why it matters for bot defense

This work is primarily focused on understanding complex dynamical mechanisms underlying rare extreme events in partially observed nonlinear stochastic systems rather than designing or evaluating bot-defense or CAPTCHA challenges directly. However, the methodological advances in inferring latent precursor states from partial observations, quantifying uncertainty via filtering and smoothing divergences, and classifying multiple underlying generative mechanisms could inspire analogies in bot-behavior modeling or anomaly detection. For example, detecting hidden behavioral precursors or triggers of automated bot activity from partial interaction logs could conceptually benefit from similar trajectory-wise latent inference diagnostics. The statistical characterization and clustering of latent state distributions conditional on observed extreme events may inform strategies to distinguish diverse attack modes or sophisticated bots exhibiting multimodal behavior patterns.

Nevertheless, direct application to CAPTCHA security would require significant adaptation. The paper does not address adversarial threat models, real-time streaming constraints, or discrete observations typical in CAPTCHA interaction data. Security practitioners might consider the theoretical insights here as a source of inspiration for modeling hidden attacker states or complex system responses, but practical implementation would need specialized approximation schemes, real-time filtering methods, and robustness evaluations tailored to automated threat detection scenarios.

Cite

bibtex
@article{arxiv2605_22692,
  title={Mechanisms and Pathways of Extreme Events in Partially-Observed Stochastic Dynamical Systems},
  author={Charlotte Moser and Nan Chen and Marios Andreou},
  journal={arXiv preprint arXiv:2605.22692},
  year={2026},
  url={https://arxiv.org/abs/2605.22692}
}


Articles are CC BY 4.0 — feel free to quote with attribution