Existence Precedes Value: Joint Modeling of Observational Existence and Evolving States in Time Series Forecasting

Source: arXiv:2606.13571 · Published 2026-06-11 · By Yifan Hu, Hongzhou Chen, Peiyuan Liu, Yiding Liu, Zewei Dong, Jiang-Ming Yang

TL;DR

This paper addresses a limitation of time series forecasting on highly incomplete and irregular real-world data: the common but unrealistic assumption that the timestamps of future valid observations are known in advance at inference time. Existing methods model irregular historical data well but do not predict whether a future observation will actually be present, which limits their practical utility. The authors propose a forecasting paradigm that jointly models observational existence (whether a valid measurement will occur) and evolving state values (the measurement itself) in a unified framework called Timeflies. Timeflies has two streams, an observation stream that models missingness patterns and a value stream that models state evolution, and three modules that encode patch reliability, apply observation-guided attention, and jointly predict existence probabilities and values. The authors also introduce Shadow, a benchmark combining public and real-world industrial datasets with natural missingness, and a new metric, Observation-Value Joint Entropy (OVJE), that captures the joint predictability of existence and value. Experiments show that Timeflies consistently outperforms state-of-the-art methods across missingness regimes and forecasting horizons, particularly in medium-to-high missingness scenarios. The work argues that explicitly modeling future observability is crucial for forecasting real-world irregular time series.

The approach moves beyond imputation or continuous latent modeling by recasting forecasting as a two-level hierarchical task: first infer whether an observation will exist, then predict its value. The proposed architecture is evaluated on datasets with natural missingness rather than synthetic masking. Ablations confirm the importance of modeling patch reliability, observation-conditioned attention, and the training objectives. The study treats missingness not as noise to remove but as an informative temporal signal that guides both existence inference and state prediction.

Key findings

  • Timeflies achieves a 22.4% relative reduction in the Observation-Value Joint Entropy (OVJE) metric compared to the strongest baseline OLinear across missingness regimes on the Shadow benchmark (Table 1).
  • Against leading Transformer and Linear baselines, Timeflies reports an MAE of 0.303 in the high-missingness regime versus 0.328 for the second-best model (OLinear), on data with missing ratios reaching about 89%.
  • Even when trained without the observation existence classification head, Timeflies achieves 7.8% lower MAE than the baselines on pure value regression under medium missingness, which suggests that modeling missingness patterns is a useful inductive bias on its own (Table 2).
  • Ablation removing mask-aware normalization increases MSE from 1.362 to 1.517 in low missingness, highlighting the critical role of excluding missing values during normalization (Table 3).
  • Removing the observation-conditioned attention module drops AUC for observability prediction from 0.805 to 0.732 and OVJE from 0.611 to 0.724 in high missingness, showing the importance of cross-stream interaction in attention (Table 3).
  • The reliability-aware patch embedding allows weighted fusion of observation and value streams, ensuring attention prioritizes reliable patches and suppresses noisy sparse regions.
  • Timeflies correctly assigns high probabilities to future missing events in case studies, confirming accurate prediction of observational existence complements value forecasting (Figure 3).
  • The Shadow benchmark covers 31 datasets across 6 domains and 16 global regions with missing ratios ranging from <0.1% to >89%, enabling comprehensive evaluation across realistic sparsity and time horizons.
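
The mask-aware normalization mentioned in the ablation above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the convention that a mask value of 1 means "observed", and the choice to write 0.0 into missing slots are all assumptions made for illustration. The point it shows is that statistics are computed from observed entries only, so placeholder values do not drag the mean and variance.

```python
import math

def mask_aware_normalize(values, mask, eps=1e-5):
    """Standardize a series using statistics from observed entries only.

    values: list of floats; missing slots may hold any placeholder (e.g. 0.0)
    mask:   list of 0/1 flags, 1 = observed, 0 = missing (assumed convention)
    """
    observed = [v for v, m in zip(values, mask) if m]
    if not observed:
        # Nothing observed: return zeros and neutral statistics.
        return [0.0] * len(values), 0.0, 1.0
    mean = sum(observed) / len(observed)
    var = sum((v - mean) ** 2 for v in observed) / len(observed)
    std = math.sqrt(var + eps)
    # Missing slots are set to 0.0 after normalization (an assumption).
    normed = [(v - mean) / std if m else 0.0 for v, m in zip(values, mask)]
    return normed, mean, std

values = [10.0, 0.0, 12.0, 0.0, 14.0]   # zeros are placeholders, not data
mask = [1, 0, 1, 0, 1]
normed, mean, std = mask_aware_normalize(values, mask)
naive_mean = sum(values) / len(values)  # distorted by the placeholders
```

In this toy series the observed-only mean is 12.0, while the naive mean over all slots is 7.2, which is the kind of distortion the ablation attributes to removing the module.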

Threat model

n/a — This paper does not target adversaries or security threats but rather addresses a fundamental modeling issue in forecasting irregular and incomplete time series where the system or data-generating process controls missingness patterns. The assumption is that the future observation timestamps are unknown rather than adversarially manipulated.

Methodology — deep read

The paper frames time series forecasting as a joint problem of future observability inference and value prediction. The central assumption it challenges is that the timestamps of future valid observations are known a priori; in the proposed setting they are not, so the model must first predict whether data will exist at all.

  1. Threat model and assumptions: The adversary is not explicitly defined since this is a modeling paper, but the problem setting assumes irregular time series with missingness driven by dormancy, delays, or event-driven sampling. The model does not assume oracle knowledge of future observation timestamps and focuses on natural missingness.

  2. Data: The authors build the Shadow benchmark from 31 datasets: 15 public datasets from GIFT-Eval and 16 proprietary industrial e-commerce datasets sampled hourly. Missingness occurs naturally (it is non-random) and ranges from <0.1% to >89%, reflecting sensor cycles, holidays, and outages. Historical input length varies per dataset; splits and preprocessing are described in the paper's appendix.

  3. Architecture and algorithm: Timeflies employs a two-stream Transformer-based architecture: an observation stream and a value stream. Inputs are a triplet (X, M, I) where X is historical observed values, M is the missingness mask, and I is the log-distance to last missing event. Inputs are divided into non-overlapping patches which are embedded separately into value tokens and observation tokens. Core modules are:

  • Reliability-aware patch embedding: Computes patch-level reliability scoring from missing ratios and intervals and modulates the injection of observation embeddings into value embeddings accordingly.
  • Observation-guided value attention: Self-attention is performed independently on value and observation tokens, producing semantic and structural attention maps respectively (A_val, A_obs). These are fused with a reliability-based attention map (A_rel) that weights the influence of observation attention. This observation-aware attention controls how missingness patterns bias value dependencies.
  • Dual prediction head: Flattens streams to the prediction horizon and jointly predicts future observation probability Ô and values Ŷ. The final value prediction is gated by the observation probability via a sigmoid on Ô.
  4. Training: The loss combines mean squared error (MSE) for value regression, computed only at observed future points (weighted by O), with a Focal Binary Cross-Entropy loss on Ô to handle class imbalance in existence prediction. The total loss is L_val + η · L_obs. Mask-aware normalization is applied so that missing values do not distort the normalization statistics.

  5. Evaluation protocol: Experiments span multiple forecast horizons and four missingness regimes (none, low, medium, high). Performance metrics include MSE and MAE (value), AUC (observation prediction), and a novel Observation-Value Joint Entropy (OVJE) that jointly measures accuracy of value prediction weighted by existence prediction. Comparisons are against strong baselines representing Transformer, CNN, and Linear architectures adapted for missingness.

  6. Reproducibility: Code and the Shadow benchmark are released at https://github.com/ant-intl/Timeflies, although the proprietary datasets remain private (see Limitations). The architecture, losses, and input processing are documented in detail.
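
The dual prediction head and training objective described in the steps above can be sketched in a few lines. This is an illustrative reading, not the released code: the multiplicative form of the gate, the focal-loss hyperparameters `gamma` and `alpha`, and all function names are assumptions; only the overall shape (a masked MSE plus η times a focal binary cross-entropy) comes from the description.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def focal_bce(p, target, gamma=2.0, alpha=0.5):
    """Standard binary focal loss for one probability p and a 0/1 target.

    gamma and alpha are placeholder values, not taken from the paper.
    """
    p = min(max(p, 1e-7), 1.0 - 1e-7)            # avoid log(0)
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def joint_loss(obs_logits, raw_values, obs_target, value_target, eta=1.0):
    """Return (total, l_val, l_obs, gated_values) for one forecast window."""
    probs = [sigmoid(z) for z in obs_logits]
    # Assumed gate: value prediction scaled by the existence probability.
    gated = [p * v for p, v in zip(probs, raw_values)]
    # MSE only where a future observation actually exists (obs_target == 1).
    n_obs = sum(obs_target)
    sq_err = sum(o * (g - y) ** 2
                 for o, g, y in zip(obs_target, gated, value_target))
    l_val = sq_err / max(n_obs, 1)
    l_obs = sum(focal_bce(p, o) for p, o in zip(probs, obs_target)) / len(probs)
    return l_val + eta * l_obs, l_val, l_obs, gated

obs_logits = [4.0, -4.0, 3.0]     # existence logits for three future steps
raw_values = [1.0, 5.0, 2.0]      # ungated value predictions
obs_target = [1, 0, 1]            # step 2 has no valid observation
value_target = [1.0, 0.0, 2.0]    # the entry at the missing step is ignored
total, l_val, l_obs, gated = joint_loss(obs_logits, raw_values,
                                        obs_target, value_target, eta=0.5)
```

Because the squared error is weighted by the existence target, whatever placeholder sits in `value_target` at a missing step has no effect on the value loss; only the existence term penalizes that step.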

A concrete example: For an input historical sequence with missing intervals, the model embeds patches with reliability scores, applies cross-stream attention fusing value and missingness patterns weighted by patch reliability, then jointly outputs future observation existence probabilities and predicted values gated by existence likelihood. This eliminates naive assumptions of known future observation times and grounds predictions in realistic observability dynamics, improving robustness under irregular missing patterns.
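
To make the first stage of that example concrete, the following sketch builds the input triplet (X, M, I) and splits it into non-overlapping patches. It rests on several assumptions that the summary does not settle: missing values are marked with `None`, the mask uses 1 for observed, missing slots in X are zero-filled, and I is taken to be log(1 + number of steps since the last missing step). The helper names are hypothetical.

```python
import math

def build_triplet(series):
    """series: list of floats, with None marking a missing observation.

    Returns (x, m, i):
      x: values with missing slots zero-filled (assumed placeholder)
      m: mask, 1 = observed, 0 = missing (assumed convention)
      i: log1p of steps since the last missing step (assumed definition);
         before any missing step, the count starts from the series start
    """
    x, m, i = [], [], []
    steps_since_missing = 0
    for v in series:
        if v is None:
            x.append(0.0)
            m.append(0)
            steps_since_missing = 0
        else:
            x.append(float(v))
            m.append(1)
            steps_since_missing += 1
        i.append(math.log1p(steps_since_missing))
    return x, m, i

def to_patches(seq, patch_len):
    """Split a sequence into non-overlapping patches, dropping a ragged tail."""
    n = len(seq) // patch_len
    return [seq[k * patch_len:(k + 1) * patch_len] for k in range(n)]

series = [1.0, None, None, 2.0, 3.0, None, 4.0, 5.0]
x, m, i = build_triplet(series)
mask_patches = to_patches(m, patch_len=4)
value_patches = to_patches(x, patch_len=4)
```

Each of the three sequences would be patched the same way, so that a value token and its observation token describe the same time span.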

Technical innovations

  • Formulating time series forecasting as a joint inference problem of future observational existence and state value prediction rather than assuming known future observation timestamps.
  • Designing a dual-stream Transformer architecture with an observation stream modeling missingness dynamics and a value stream modeling state evolution, coupled via observation-aware attention.
  • Introducing a reliability-aware patch embedding mechanism that weighs the injection of observation signals into value tokens based on the local missingness ratio and interval patterns, filtering noisy sparse data.
  • Proposing the Observation-Value Joint Entropy (OVJE) metric to jointly quantify the coupled predictability of existence inference and value estimation.
  • Curating Shadow, a large-scale benchmark combining multiple public and proprietary datasets with natural and non-random missing patterns to rigorously evaluate irregular time series forecasting.
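
One way to picture the reliability-aware patch embedding listed above is the toy sketch below. The summary only says that reliability is derived from missing ratios and interval patterns and that it modulates how observation embeddings are injected into value embeddings; the specific score (an average of the missing ratio and the longest missing run) and the convex blend used here are assumptions chosen for simplicity, not the paper's formulas.

```python
def longest_missing_run(mask_patch):
    """Length of the longest consecutive run of missing steps (mask == 0)."""
    longest = run = 0
    for m in mask_patch:
        run = 0 if m else run + 1
        longest = max(longest, run)
    return longest

def patch_reliability(mask_patch):
    """Hypothetical reliability score in [0, 1] for one patch.

    1.0 for a fully observed patch, 0.0 for a fully missing one. Scattered
    gaps are penalized less than one long gap with the same missing ratio.
    """
    n = len(mask_patch)
    missing_ratio = 1.0 - sum(mask_patch) / n
    gap_ratio = longest_missing_run(mask_patch) / n
    return 1.0 - 0.5 * (missing_ratio + gap_ratio)

def fuse(value_emb, obs_emb, r):
    """Assumed fusion: lean on the value embedding when the patch is
    reliable and on the observation embedding when it is sparse."""
    return [r * v + (1.0 - r) * o for v, o in zip(value_emb, obs_emb)]

r_dense = patch_reliability([1, 1, 1, 1])
r_block = patch_reliability([1, 0, 0, 1])     # one contiguous gap
r_scatter = patch_reliability([1, 0, 1, 0])   # same ratio, scattered gaps
token = fuse([1.0, 2.0], [0.5, 0.5], r_block)
```

The same per-patch score could in principle be reused to build the reliability-based attention map, but how the paper does that is not specified in this summary.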

Datasets

  • Shadow Benchmark — 31 datasets combining 15 public datasets from the GIFT-Eval benchmark and 16 proprietary industrial e-commerce datasets — released via the project repository, except for the proprietary subsets, which are not public
  • GIFT-Eval datasets — 15 datasets from public source GIFT-Eval [44] — included in Shadow
  • Proprietary e-commerce datasets — 16 datasets with hourly transaction volumes across global regions — source: production industrial platform (not public)

Baselines vs proposed

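  • OLinear: OVJE = 0.762 vs Timeflies: OVJE = 0.611 under high missing, which is roughly a 19.8% relative reduction for this pair; the 22.4% figure is the reduction reported across missingness regimes (Table 1)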
  • OLinear: MAE = 0.328 vs Timeflies: MAE = 0.303 under high missing (Table 1)
  • FEDformer: OVJE = 1.307 vs Timeflies: 0.611 under high missing, significant improvement (Table 1)
  • Timeflies (no observation head): MAE = 0.294 vs OLinear = 0.311 under high missing (Table 2)
  • PatchTST: MSE = 1.687 vs Timeflies: MSE = 1.546 under high missing (Table 1)
  • Crossformer: AUC = 0.654 vs Timeflies: 0.805 under high missing (Table 1)
  • Ablation w/o mask-aware normalization: MSE increases from 1.362 to 1.517 (Table 3)
  • Ablation w/o observation-conditioned attention: AUC drops from 0.805 to 0.732 (Table 3)

Figures from the paper

Figures are reproduced from the source paper for academic discussion. Original copyright: the paper authors. See arXiv:2606.13571.

Fig 1: Evolution of Forecasting Paradigms for Time Series. (a) Grid-Restricted Paradigm:

Fig 2: The overall architecture of Timeflies. (1) Reliability-Aware Patch Embedding refines

Fig 3: Visualization of Timeflies on a medium-irregular sample from ecom_BM. (a) Forecast-

Fig 4: (a) Sensitivity of Timeflies to the missingness loss weight η across datasets spanning

Fig 5 (page 8).

Fig 6 (page 8).

Fig 7 (page 9).

Fig 8 (page 9).

Limitations

  • While the framework and code are released, several proprietary datasets remain private, limiting full reproducibility on all Shadow data.
  • The evaluation focuses on naturally missing data but does not include controlled adversarial missingness or perturbations to test model robustness against adversarial attacks.
  • Timeflies relies on patch-based embedding and Transformer attention; computational cost and scalability for very high-frequency or extremely long sequences require further investigation.
  • The model takes the missingness mask and missing intervals as inputs; settings where missingness depends on external latent factors or complex event triggers may require extensions.
  • OVJE is a novel metric combining existence and value but may need broader community validation and testing on other time series domains for general applicability.

Open questions / follow-ons

  • How well does joint observability and value modeling generalize to time series with exogenous covariates or complex causal dependencies?
  • Can the observation existence modeling be extended to multi-variate and multi-modal time series with asynchronous observations?
  • How can the Timeflies architecture be scaled to ultra-long-horizon forecasting with very sparse data while remaining computationally efficient?
  • Would adversarial training against synthetic missingness or perturbations improve robustness and generalization of existence-value joint forecasting?

Why it matters for bot defense

For bot-defense and CAPTCHA practitioners, Timeflies offers a paradigm that is relevant whenever irregular sequential data must be forecast without knowing whether future points will exist, for example user behavioral logs, intermittent signals, or event-driven access patterns. Forecasting models traditionally assume that observations are available or uniformly sampled; Timeflies explicitly predicts whether a future data point will appear before forecasting its value. This maps naturally onto bot detection from irregular telemetry, where one must infer whether a valid user action occurs before assessing its properties. The reliability-aware attention and joint prediction strategy also provide a blueprint for treating missingness as signal rather than noise, which may improve robustness and reduce false positives caused by noisy or sparse observations. The OVJE metric offers a joint evaluation that could inform CAPTCHA challenge scheduling or adaptive sampling strategies that balance observability against predictive confidence. Practitioners should still weigh scalability and domain-dependent input construction before deployment. Overall, the work encourages moving beyond naive imputation toward directly modeling the mechanisms of observational existence.

Cite

@article{arxiv2606_13571,
  title={Existence Precedes Value: Joint Modeling of Observational Existence and Evolving States in Time Series Forecasting},
  author={Yifan Hu and Hongzhou Chen and Peiyuan Liu and Yiding Liu and Zewei Dong and Jiang-Ming Yang},
  journal={arXiv preprint arXiv:2606.13571},
  year={2026},
  url={https://arxiv.org/abs/2606.13571}
}

Articles are CC BY 4.0 — feel free to quote with attribution