A Scalable Nonparametric Continuous-Time Survival Model through Numerical Quadrature

Source: arXiv:2605.16208 · Published 2026-05-15 · By Chaeyeon Lee, Sehwan Kim, Hyungrok Do

TL;DR

This paper addresses the key challenge of modeling flexible continuous-time survival data in high-dimensional settings without relying on parametric hazard assumptions or time discretization. The authors propose QSurv, a deep learning framework that directly parameterizes the instantaneous hazard function as a continuous function of time and covariates and estimates the cumulative hazard integral using Gauss-Legendre numerical quadrature. This achieves high-order accurate integral approximations with only a modest number of quadrature nodes, allowing efficient end-to-end training with standard backpropagation and stochastic optimization. Additionally, they introduce a time-conditioned low-rank adaptation (Time-LoRA) technique that dynamically modulates the penultimate layer weights of a backbone network via low-rank updates conditioned on time, enabling rich temporal hazard dynamics without repeated expensive forward passes. Theoretical results provide approximation error bounds for the quadrature-based cumulative-hazard estimates directly relating to training objective approximation error. Empirical experiments on synthetic benchmarks, real-world tabular clinical datasets, and high-dimensional medical imaging datasets show that QSurv delivers competitive predictive accuracy and improved instantaneous hazard estimation compared to eight state-of-the-art baselines representing Cox-based, discrete-time, parametric, and ODE-based continuous-time survival models.

Key findings

QSurv achieves lowest or near-lowest L1 error on instantaneous hazard function estimation across six parametric synthetic survival distributions, e.g., 0.0258 ± 0.0180 for Gompertz compared to 0.0262 ± 0.0174 for SODEN and higher errors for others (Table 1).
QSurv trains approximately 10x faster than SODEN in simulation (6 seconds vs 60 seconds) while achieving comparable hazard estimation accuracy.
With K=16 Gauss-Legendre quadrature nodes, the cumulative hazard integral is approximated to high order; the approximation error is bounded in terms of the hazard's temporal smoothness and factorial terms in K (Thm 3.1).
Time-LoRA reduces training overhead by caching the backbone computation and applying time-dependent low-rank adaptations only at the penultimate layer, avoiding K full forward passes per example.
QSurv obtains top or near-top performance on nine real-world datasets spanning tabular clinical and medical imaging data on time-dependent concordance index (up to 0.7456 ± 0.1046), Integrated Brier Score, and Integrated Binomial Log-Likelihood (Table 2).
The discrete-time D-calibration goodness-of-fit test p-values indicate no significant calibration violation for QSurv across datasets, unlike some baselines.
The quadrature-based negative log-likelihood objective forms a stable and differentiable training proxy with direct approximation error bounds, enhancing theoretical reliability.
QSurv’s hazard-based modeling enables interpretable time-local risk function estimation capturing complex non-proportional hazard effects seen in clinical applications.

Methodology — deep read

The work models the conditional hazard function λθ(t|x) over continuous time t given a covariate vector x, under the right-censoring typical of survival analysis: each observation is a pair (O, δ), where O = min(T, C) is the observed time and δ is the event indicator. In this setting, the objective is the negative log-likelihood, which combines the log of the instantaneous hazard at observed event times with the cumulative hazard Λθ(t|x) = ∫_0^t λθ(s|x) ds.

Threat model & assumptions: The model assumes conditional independence of event and censoring times given covariates (T ⊥ C | X). No adversary is modeled, since this is not a security paper; the challenge is handling censored data and complex time-varying hazard functions.
Data: Experiments use synthetic data generated from several parametric hazard distributions (Exponential, Weibull, Gamma, Gompertz, Log-normal, Log-logistic) with nonlinear covariate effects, plus nine real-world datasets: six tabular (FLCHAIN, FRAMINGHAM, GBSG, METABRIC, NWTCO, SUPPORT2) and three high-dimensional medical imaging datasets (COVID-19-NY, C4KC-KiTS, BraTS). Dataset sizes and splits follow domain benchmarks. Preprocessing follows standard survival-analysis practice; specifics are in the appendix.
Architecture / algorithm: QSurv models the log-hazard function fθ(x, t) with a neural network and sets the hazard to λθ(t|x) = exp(fθ(x, t)), which ensures positivity. The cumulative hazard Λθ(t|x) is approximated by Gauss-Legendre quadrature with K fixed nodes: Λθ(t|x) ≈ Λ̂θ(t|x) = (t/2) ∑_{k=1}^{K} w_k λθ(t τ_k | x), where the nodes τ_k are the roots of the degree-K Legendre polynomial shifted to [0,1] and the w_k are the corresponding quadrature weights. This yields a differentiable estimator for use in minibatch gradient descent. Time-conditioned LoRA modulates the weights of the penultimate network layer with low-rank, time-dependent perturbations, capturing non-stationary hazard dynamics for expressive backbones (ResNets, DenseNets, Transformers) without repeated full forward passes.
Training regime: Models are trained by minimizing the negative log-likelihood approximated using the quadrature scheme, employing minibatch stochastic gradient descent with random hyperparameter search and multiple random seeds for stability. The typical node count K=16 balances accuracy with computational cost. Implementation supports parallel evaluation of the network at quadrature nodes.
Evaluation protocol: Performance is assessed using time-dependent concordance index (Ctd), Integrated Brier Score (IBS), Integrated Binomial Log-Likelihood (IBLL) over full and quantile-based follow-up horizons to measure discrimination, calibration, and likelihood. Calibration is further checked by discrete-time D-calibration tests. Comparisons are made against CoxCC, CoxTime, DeepHit, NnetSurv, MDN, DeSurv, SODEN baselines representing classical and state-of-the-art deep survival methods, including ODE- and quadrature-based approaches. Ablations on K and Time-LoRA presence are reported.
Reproducibility: Code and detailed model implementations are publicly released at the authors' GitHub repository (https://github.com/hyungrok-do/qsurv). The datasets are publicly available or from prior benchmarks. Hyperparameter details and additional experiments are provided in the appendix.
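
The quadrature step described under "Architecture / algorithm" can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the helper names (gauss_legendre, cumulative_hazard, gompertz_hazard) and the Gompertz parameters are assumptions chosen so the result can be checked against a closed form. Nodes are computed on [-1, 1] by Newton iteration on the Legendre polynomial and then mapped to [0, t].

```python
import math

def gauss_legendre(K):
    """Return nodes and weights of the K-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, K + 1):
        # Standard initial guess for the i-th root of the Legendre polynomial P_K.
        x = math.cos(math.pi * (i - 0.25) / (K + 0.5))
        for _ in range(100):
            # Evaluate P_K(x) and P_{K-1}(x) with the three-term recurrence.
            p_prev, p = 1.0, x
            for n in range(2, K + 1):
                p_prev, p = p, ((2 * n - 1) * x * p - (n - 1) * p_prev) / n
            dp = K * (x * p - p_prev) / (x * x - 1.0)  # derivative P_K'(x)
            dx = p / dp
            x -= dx  # Newton update
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def cumulative_hazard(hazard, t, K=16):
    """Approximate the integral of hazard(s) over [0, t] with K quadrature nodes."""
    xi, w = gauss_legendre(K)
    # Map each node from [-1, 1] to [0, 1], then scale by t to land in [0, t].
    return 0.5 * t * sum(wk * hazard(t * 0.5 * (x + 1.0)) for x, wk in zip(xi, w))

# Assumed test case: Gompertz hazard a * exp(b * s), whose cumulative hazard
# has the closed form (a / b) * (exp(b * t) - 1).
A_PARAM, B_PARAM = 0.1, 0.5

def gompertz_hazard(s):
    return A_PARAM * math.exp(B_PARAM * s)

t = 4.0
exact = (A_PARAM / B_PARAM) * (math.exp(B_PARAM * t) - 1.0)
for K in (2, 4, 8, 16):
    approx = cumulative_hazard(gompertz_hazard, t, K)
    print(K, abs(approx - exact))

# The survival function follows from the cumulative hazard: S(t) = exp(-Lambda(t)).
survival = math.exp(-cumulative_hazard(gompertz_hazard, t))
```

For a smooth hazard like this one, the printed error should shrink quickly as K grows. In a real model the hazard callable would be the network output exp(fθ(x, ·)), evaluated at all K nodes in one batched call.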

A concrete example: To train QSurv on a batch of B samples, each instance contains covariates x_i, observed time o_i, and event indicator δ_i. For each sample, the model evaluates fθ(x_i, o_i) to get the log-hazard at the observed time, then evaluates fθ(x_i, o_i τ_k) at the K quadrature nodes to approximate the cumulative hazard as Λ̂_i = (o_i/2) ∑_{k=1}^{K} w_k exp(fθ(x_i, o_i τ_k)), using vectorized Gauss-Legendre weights. The negative log-likelihood is L = −(1/B) ∑_{i=1}^{B} [δ_i fθ(x_i, o_i) − Λ̂_i], which is backpropagated to update θ with standard minibatch SGD. Time-LoRA modulates the penultimate layer weights efficiently during these quadrature node evaluations.
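
The batch loss in the example above can be sketched as follows. This is an illustration under stated assumptions: log_hazard is a hypothetical linear stand-in for the network fθ, the batch values are made up, and a closed-form 3-point Gauss-Legendre rule replaces the K=16 rule mentioned in the summary to keep the code short. A real implementation would use an autodiff framework to obtain gradients.

```python
import math

# 3-point Gauss-Legendre rule on [-1, 1] (closed-form nodes and weights).
# The summary reports K=16 as typical; K=3 keeps this sketch short.
XI = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
W = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)

def log_hazard(x, t, theta):
    """Hypothetical stand-in for f_theta(x, t): linear in x and t."""
    a, b, c = theta
    return a + b * x + c * t

def quadrature_nll(batch, theta):
    """Mean negative log-likelihood with the cumulative hazard from quadrature."""
    total = 0.0
    for x, o, delta in batch:
        # Cumulative hazard over [0, o]: (o/2) * sum_k w_k * exp(f(x, o * tau_k)),
        # with tau_k = (xi_k + 1) / 2 the node shifted to [0, 1].
        cum = 0.5 * o * sum(
            w * math.exp(log_hazard(x, o * 0.5 * (xi + 1.0), theta))
            for xi, w in zip(XI, W)
        )
        # Events add the log-hazard at o; every sample pays the cumulative hazard.
        total += delta * log_hazard(x, o, theta) - cum
    return -total / len(batch)

# Made-up batch of (x_i, o_i, delta_i) triples.
batch = [(0.2, 1.5, 1), (-0.4, 3.0, 0), (1.1, 0.7, 1)]
loss = quadrature_nll(batch, theta=(-1.0, 0.5, 0.0))
print(loss)
```

Censored samples (δ_i = 0) contribute only through the cumulative hazard term, which is why the brackets in the loss matter: the sum runs over both terms for every sample.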

Technical innovations

Application of Gauss-Legendre numerical quadrature to approximate the cumulative hazard integral in continuous-time survival modeling, enabling high-order accurate and scalable end-to-end training without time discretization or ODE solvers.
Introduction of time-conditioned low-rank adaptation (Time-LoRA) to efficiently modulate neural network weights at the penultimate layer, capturing non-stationary time dynamics with low computational overhead.
Theoretical derivation of explicit approximation error bounds for the quadrature estimates of cumulative hazard and resulting likelihood, relating smoothness and node count K to training objective accuracy.
Scalable integration of deep continuous-time hazard modeling with high-dimensional data backbones such as ResNets and Transformers through modular temporal adaptation.
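
The caching idea behind Time-LoRA can be sketched as below. The summary does not give the exact parameterization, so everything here is an assumption for illustration: the update form W(t) = W0 + B diag(g(t)) A, the sinusoidal gate g, the tiny fixed weights, and the toy backbone. The point is only that the backbone runs once per example while the cheap low-rank head runs once per time point.

```python
import math

calls = {"backbone": 0}

def backbone(x):
    """Hypothetical expensive feature extractor; counts how often it runs."""
    calls["backbone"] += 1
    return [math.tanh(x[0] + x[1]), math.tanh(x[0] - x[1]), math.tanh(x[1])]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

# Illustrative fixed parameters (hidden size 2, feature size 3, rank 1).
W0 = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.6]]  # base penultimate weights
A = [[0.2, 0.1, -0.3]]                     # rank x feature
B = [[0.7], [-0.5]]                        # hidden x rank
v = [1.0, -1.0]                            # output head

def gate(t):
    """Assumed time embedding: one coefficient per rank component."""
    return [math.sin(t)]

def log_hazards(x, times):
    """Log-hazard at each time point, reusing one backbone pass."""
    h = backbone(x)
    # Time-independent pieces are computed once and reused for every time.
    base, low = matvec(W0, h), matvec(A, h)
    out = []
    for t in times:
        # (W0 + B diag(g(t)) A) h = W0 h + B (g(t) * (A h))
        delta = matvec(B, [g * l for g, l in zip(gate(t), low)])
        z = [max(0.0, b + d) for b, d in zip(base, delta)]  # ReLU
        out.append(sum(vi * zi for vi, zi in zip(v, z)))
    return out

times = [0.1 * k for k in range(1, 17)]  # e.g. 16 quadrature time points
values = log_hazards([0.3, -0.8], times)
print(calls["backbone"], len(values))
```

Under this assumed form, the per-time cost is a rank-sized product rather than a full forward pass, which matches the efficiency argument made in the summary.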

Datasets

Synthetic parametric survival data — size varies per distribution — generated per simulation protocols
FLCHAIN — ~7,544 samples — public clinical tabular dataset
FRAMINGHAM — ~4,698 samples — public clinical tabular dataset
GBSG — ~2,000 samples — breast cancer clinical data, public
METABRIC — ~1,981 samples — breast cancer genomic and clinical features, public
NWTCO — ~3,349 samples — public clinical dataset
SUPPORT2 — ~9,105 samples — clinical mortality dataset, public
COVID-19-NY — medical imaging and clinical — size not explicitly stated
C4KC-KiTS — medical imaging renal cancer dataset
BraTS — brain tumor imaging dataset

Baselines vs proposed

CoxCC: instantaneous hazard L1 error for Gamma distribution = 0.9012 ± 0.0921 vs QSurv = 0.1718 ± 0.0859
SODEN: instantaneous hazard L1 error for Gompertz = 0.0262 ± 0.0174 vs QSurv = 0.0258 ± 0.0180
DeSurv: time-dependent concordance (COVID-19-NY full horizon) = 0.7420 ± 0.0923 vs QSurv = 0.7456 ± 0.1046
DeepHit: Integrated Brier Score (COVID-19-NY full horizon) = 0.0708 ± 0.0209 vs QSurv = 0.0716 ± 0.0155
NnetSurv: time-dependent concordance (COVID-19-NY median horizon) = 0.6793 ± 0.1175 vs QSurv = 0.7951 ± 0.0908
SODEN: training time on simulations ~60s vs QSurv ~6s
QSurv shows superior calibration p-values (D-cal) compared to DeepHit and NnetSurv on several datasets

Figures from the paper

Figures are reproduced from the source paper for academic discussion. Original copyright: the paper authors. See arXiv:2605.16208.

Fig 1: Predicted instantaneous hazard functions on COVID-19-NY across representative risk

Fig 2: Convergence of approximation error and training efficiency on two simulation scenarios.

Fig 3: Instantaneous hazard, cumulative hazard, and survival functions for simulation scenario 1.

Fig 4: Instantaneous hazard, cumulative hazard, and survival functions for simulation scenario 2.

Fig 5: ResNet-18 backbone architecture used for medical imaging survival modeling. The original

Fig 6: True vs. predicted survival, cumulative hazard, and instantaneous hazard functions for

Fig 7: True vs. predicted survival, cumulative hazard, and instantaneous hazard functions for

Fig 8: True vs. predicted survival, cumulative hazard, and instantaneous hazard functions for

Limitations

Quadrature approximation accuracy depends on smoothness of the hazard function; rapidly varying hazards may require more nodes, increasing computation.
Time-LoRA modulates only penultimate layer, potentially limiting temporal adaptation expressivity compared to full time-entangled architectures.
Experiments mostly focus on right-censoring and standard survival settings; performance under competing risks or other censoring types untested.
No explicit adversarial robustness or distribution shift evaluation; generalization in highly heterogeneous clinical populations remains to be validated.
High-dimensional imaging experiments use pre-specified backbones; end-to-end training with time conditioning may require architectural tuning not fully explored.
Limited interpretability analysis beyond instantaneous hazard curves; deeper clinical validation and causal inference remain open.

Open questions / follow-ons

Can the Gauss-Legendre quadrature-based approach be extended to incorporate competing risks or multi-state survival processes within the same framework?
How sensitive is QSurv to extreme hazard variability or abrupt hazard changes that challenge the smoothness assumption underlying quadrature error bounds?
Could Time-LoRA be generalized to modulate deeper backbone layers or combined with other time-conditioning schemes for improved temporal expressivity?
What are the implications of using QSurv in longitudinal or dynamic prediction settings where covariates themselves evolve over time?

Why it matters for bot defense

QSurv’s core contribution is scalable modeling of complex continuous-time hazard functions via efficient numerical quadrature in neural survival models. For bot-defense and CAPTCHA practitioners, it demonstrates a way to estimate instantaneous event risk without discretizing time or imposing strong parametric hazard assumptions. Such techniques can inform risk-scoring models that characterize temporal behavior or failure probabilities in security applications. The Time-LoRA mechanism offers an efficient way to condition neural models on a continuous parameter, which may be useful for adapting defenses to evolving threat timelines at low cost. The approximation error bounds could add confidence when deploying continuous-time risk models in production environments that require explainability and stable optimization. However, survival analysis differs from classical CAPTCHA challenges, so applying these ideas would mean re-framing time-dependent user or bot behavior as a continuous hazard estimation problem. Overall, QSurv highlights numerical and architectural strategies for incorporating continuous-time information at scale that might inspire more adaptive bot detection tools.

Cite

bibtex

@article{arxiv2605_16208,
  title={A Scalable Nonparametric Continuous-Time Survival Model through Numerical Quadrature},
  author={Chaeyeon Lee and Sehwan Kim and Hyungrok Do},
  journal={arXiv preprint arXiv:2605.16208},
  year={2026},
  url={https://arxiv.org/abs/2605.16208}
}
