P-K-GCN: Physics-augmented Koopman-enhanced Graph Convolutional Network for Deep Spatiotemporal Super-resolution

Source: arXiv:2606.19303 · Published 2026-06-17 · By Xizhuo Zhang, Zekai Wang, Fei Liu, Bing Yao

TL;DR

This paper addresses the challenging problem of spatiotemporal super-resolution (SR) for complex nonlinear dynamical systems defined on irregular geometries, with a focus on reconstructing high-resolution cardiac electrodynamics from sparse low-resolution measurements. Traditional data-driven SR methods often lack physical constraints, and physics-informed learning methods have difficulty handling complex spatial domains and nonlinear temporal dynamics simultaneously. To overcome these issues, the authors propose the Physics-augmented Koopman-enhanced Graph Convolutional Network (P-K-GCN), a novel framework combining continuous spline-based graph convolutional networks for spatial encoding over irregular meshes with Koopman operator theory to linearize and stabilize nonlinear temporal evolution in a compact latent space. Additionally, a physics-based loss term derived from governing PDEs is integrated as regularization to enforce physical consistency and improve robustness.

Empirical results on a real-world 3D cardiac electrophysiology dataset demonstrate that P-K-GCN substantially outperforms state-of-the-art baseline models in reconstruction accuracy and temporal coherence. The authors also provide a rigorous theoretical analysis proving that their physics augmentation and Koopman regularization reduce the hypothesis space complexity and tighten generalization error bounds, explaining the observed gains. The combination of geometry-aware continuous GCNs, Koopman latent dynamics, and physics constraints represents a substantial advance for ill-posed SR tasks on irregular spatiotemporal domains with nonlinear dynamics.

Key findings

P-K-GCN achieves a lower total reconstruction error (RE_total) on cardiac electrodynamics than the five baselines it is compared against; this summary does not quote the exact figures.
Continuous spline-based graph convolution effectively models spatial dependencies on irregular 3D heart mesh geometries, handling geometric heterogeneity better than standard discrete convolutions.
Incorporating the Koopman operator in the latent space linearizes nonlinear temporal dynamics, stabilizing long-horizon temporal predictions and improving coherence across reconstructed frames.
The physics-augmented loss, which penalizes PDE residuals, reduces the Rademacher complexity of the hypothesis space, which in turn tightens the generalization bound.
The hierarchical graph coarsening and refinement supports multi-scale feature extraction, improving reconstruction fidelity while preserving anatomical continuity.
Ablation studies show that removing physics loss or Koopman dynamics degrades performance, demonstrating the synergistic role of both in the framework.
The method upsamples the sparse, noisy low-resolution sensor measurements typical of cardiac applications into physically consistent high-resolution dynamics.
Theoretical analysis proves that physics augmentation reduces model capacity to physically consistent hypotheses, mitigating overfitting risks inherent in deep SR models.

Threat model

The problem setting assumes an observer with access only to sparse, noisy low-resolution measurements of spatiotemporal dynamics on an irregular geometry; there is no explicit adversarial threat. The goal is to reconstruct physically consistent high-resolution dynamics despite ill-posed inverse inputs, rather than to defend against malicious attackers or data tampering.

Methodology — deep read

The P-K-GCN architecture combines several components that target spatiotemporal SR on irregular geometry. The problem setting assumes sparse, noisy low-resolution (LR) observations of a nonlinear dynamical system evolving on an irregular 3D spatial domain (a cardiac surface mesh). Adversarial perturbations and data corruption are not discussed; the challenge is robust reconstruction from an ill-posed inverse problem.

The data consist of LR spatiotemporal graph signals Q_l ∈ R^{C×N_s×N_t}, measured at N_s sparse nodes over N_t time instances with C channels. Ground-truth high-resolution (HR) signals on a denser mesh of N_s* nodes are used for supervised training and evaluation. Noise is modeled as additive, and standard preprocessing includes normalization and temporal blocking.

Spatial dependencies are encoded using a continuous spline-based graph convolutional network (GCN) following SplineCNN principles. Each graph node represents a mesh vertex, and edge attributes encode normalized 3D displacements between vertices. B-spline basis functions parameterize continuous convolutional kernels over these edge attributes (pseudo-coordinates), so kernel weights adapt smoothly to irregular geometry. Residual connections stabilize training. Hierarchical graph coarsening with Graclus clustering generates multi-resolution representations, and symmetric unpooling reconstructs HR features.
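The continuous kernel idea can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes degree-1 B-splines, scalar node features, mean aggregation, and a tiny three-vertex graph. All names (`linear_bspline_basis`, `spline_kernel_weight`, `spline_conv`) are hypothetical.

```python
import numpy as np

def linear_bspline_basis(u, kernel_size):
    """Degree-1 open B-spline basis evaluated at u in [0, 1].

    Returns a vector of length kernel_size (>= 2) with at most two
    non-zero entries that sum to 1 (partition of unity).
    """
    pos = u * (kernel_size - 1)                      # position in knot units
    lo = min(int(np.floor(pos)), kernel_size - 2)    # left knot index
    frac = pos - lo
    basis = np.zeros(kernel_size)
    basis[lo] = 1.0 - frac
    basis[lo + 1] = frac
    return basis

def spline_kernel_weight(pseudo, weights):
    """Kernel value g(pseudo) = sum_p w_p * prod_d B_{p_d}(pseudo_d).

    pseudo  : (d,) pseudo-coordinates, each in [0, 1]
    weights : array with d axes of equal length (trainable control values)
    """
    value = weights
    for u in pseudo:
        basis = linear_bspline_basis(u, value.shape[0])
        value = np.tensordot(basis, value, axes=(0, 0))  # contract leading axis
    return float(value)

def spline_conv(x, pos, edges, weights):
    """Mean-aggregate neighbour features, weighted by the continuous kernel.

    x: (N,) node features, pos: (N, 3) vertex coordinates,
    edges: list of (i, j) pairs meaning "node i receives from node j".
    """
    disp = np.array([pos[j] - pos[i] for i, j in edges])
    pseudo = 0.5 + 0.5 * disp / np.abs(disp).max()   # map displacements to [0, 1]^3
    out = np.zeros(len(x))
    deg = np.zeros(len(x))
    for (i, j), u in zip(edges, pseudo):
        out[i] += spline_kernel_weight(u, weights) * x[j]
        deg[i] += 1
    return out / np.maximum(deg, 1)

# Toy mesh: three vertices, node 0 connected to nodes 1 and 2.
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
edges = [(0, 1), (0, 2), (1, 0), (2, 0)]
x = np.array([1.0, 2.0, 4.0])
constant_kernel = np.full((3, 3, 3), 2.0)            # same control value everywhere
y = spline_conv(x, pos, edges, constant_kernel)
```

With a constant control grid the kernel evaluates to the same value for every edge, so the layer reduces to a scaled neighbour mean. Geometry dependence appears once the control values differ, because each edge then reads a different interpolated weight from its displacement.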

For temporal modeling, the method adopts Koopman operator theory. Latent states Z(t) output by the graph encoder are flattened to vectors z(t), evolved forward linearly via a trainable matrix K (Koopman operator approximation), and decoded back to HR spatial fields. This operator-theoretic approach linearizes otherwise nonlinear temporal dynamics, enabling stable and coherent long-term prediction.
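A linear latent rollout of this kind can be sketched as follows. The 2-D latent space and the particular matrix `K` (a rotation scaled by 0.9) are illustrative assumptions, chosen only to show that multi-step prediction is a matrix power and that the spectral radius of `K` governs long-horizon growth.

```python
import numpy as np

def koopman_rollout(z0, K, steps):
    """Advance a latent vector linearly: z_{t+1} = K @ z_t."""
    traj = [z0]
    for _ in range(steps):
        traj.append(K @ traj[-1])
    return np.stack(traj)                # shape (steps + 1, latent_dim)

# Hypothetical encoder output for one frame (1 node x 2 channels), flattened.
Z0 = np.array([[1.0, 0.0]])
z0 = Z0.reshape(-1)

# Hypothetical Koopman matrix: rotation by 30 degrees, contracted by 0.9.
theta = np.pi / 6
K = 0.9 * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])

traj = koopman_rollout(z0, K, steps=12)

# Linearity means the t-step prediction is K^t z0, and the spectral
# radius of K bounds how fast latent states can grow or decay.
z12_direct = np.linalg.matrix_power(K, 12) @ z0
spectral_radius = np.abs(np.linalg.eigvals(K)).max()
```

Because the spectral radius here is below 1, the rolled-out trajectory decays rather than blowing up, which is the sense in which a well-conditioned linear operator keeps long-horizon predictions stable.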

Training jointly optimizes encoder E_θ, decoder D_θ, and Koopman matrix K through a multi-component loss. The data-driven loss L_d enforces LR measurement consistency by mapping HR predictions back to LR space via a fixed projection matrix P_{h→l}. Two terms evaluate static reconstructions and latent Koopman predictions over temporal blocks. A physics loss L_phy uses PDE residuals from a reaction-diffusion model on the manifold, computed with Laplace–Beltrami operators for spatial derivatives on the curved surface, imposing Neumann boundary conditions. The total loss L = L_d + w_phy L_phy enforces both empirical fidelity and physical consistency.
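The structure of the combined loss can be illustrated with a toy one-dimensional problem. This sketch makes several assumptions that are not from the paper: a path-graph Laplacian stands in for the Laplace–Beltrami operator, the reaction term is a hypothetical single-variable cubic (the paper's model has both u and v), time derivatives use a forward difference, and the projection `P` simply selects every other node.

```python
import numpy as np

def path_graph_laplacian(n):
    """Combinatorial Laplacian L = D - A of a path graph.

    Stand-in for a discretised Laplace-Beltrami operator; end nodes have a
    single neighbour, which mimics a zero-flux (Neumann) boundary.
    """
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def data_loss(u_hr, q_lr, P):
    """Mean squared mismatch after projecting HR predictions onto LR sensors."""
    return float(np.mean((u_hr @ P.T - q_lr) ** 2))

def physics_loss(u_hr, lap, dt, diffusivity, reaction):
    """Mean squared residual of du/dt = -D * L u + f(u), forward difference in time."""
    du_dt = (u_hr[1:] - u_hr[:-1]) / dt
    rhs = -diffusivity * (u_hr[:-1] @ lap.T) + reaction(u_hr[:-1])
    return float(np.mean((du_dt - rhs) ** 2))

def total_loss(u_hr, q_lr, P, lap, dt, diffusivity, reaction, w_phy):
    """L = L_d + w_phy * L_phy."""
    return (data_loss(u_hr, q_lr, P)
            + w_phy * physics_loss(u_hr, lap, dt, diffusivity, reaction))

def cubic_reaction(u, a=0.1):
    """Hypothetical excitable-medium reaction term (assumption, not from the paper)."""
    return u * (1.0 - u) * (u - a)

n_hr, n_t, dt, diff = 8, 20, 0.1, 0.1
lap = path_graph_laplacian(n_hr)
P = np.eye(n_hr)[::2]                    # observe nodes 0, 2, 4, 6 only

# Reference trajectory that satisfies the discretised PDE exactly.
u_true = np.zeros((n_t, n_hr))
u_true[0] = np.linspace(0.0, 1.0, n_hr)
for t in range(n_t - 1):
    u_true[t + 1] = u_true[t] + dt * (-diff * lap @ u_true[t] + cubic_reaction(u_true[t]))
q_lr = u_true @ P.T

# Shift an unobserved node: the data term cannot see it, the physics term can.
u_bad = u_true.copy()
u_bad[:, 3] += 0.1

loss_true = total_loss(u_true, q_lr, P, lap, dt, diff, cubic_reaction, w_phy=1.0)
loss_bad = total_loss(u_bad, q_lr, P, lap, dt, diff, cubic_reaction, w_phy=1.0)
```

The perturbed field matches every sensor reading, so the data term alone cannot distinguish it from the reference. Only the PDE residual penalizes it, which is the regularizing role the physics term plays in an ill-posed reconstruction.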

Optimization iterates through four phases: spatial encoding and direct reconstruction, Koopman latent advancement, computing combined loss, and gradient-based parameter updates. Algorithm 1 details this procedure with batch size B, learning rate η, and physics weight w_phy.

Evaluation uses total reconstruction error (RE_total) and visual comparison of the cardiac voltage (u) and recovery variable (v) fields, against five baselines that include standard graph neural networks and physics-informed models. Ablations test the impact of individual components such as the physics augmentation and the Koopman operator. Data splits preserve temporal consistency and hold out observations for validation.

A theoretical analysis rigorously derives error bounds showing how the physics loss constrains hypothesis space, reducing Rademacher complexity, while the Koopman operator linearization caps temporal error expansion. This explains improved SR generalization under data scarcity and noise.
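For orientation, the general shape of such an argument can be written with standard textbook results. The statements below are generic facts about Rademacher complexity and linear maps under the stated assumptions (bounded loss, i.i.d. samples), not the paper's own theorems; the symbols \mathcal{H}_{\mathrm{phy}} and e_0 are introduced here for illustration.

```latex
% Textbook Rademacher bound: loss \ell bounded in [0,1], n i.i.d. samples.
% Holds with probability at least 1 - \delta for every h \in \mathcal{H}.
R(h) \;\le\; \hat{R}_n(h) + 2\,\mathfrak{R}_n(\ell \circ \mathcal{H})
      + \sqrt{\frac{\ln(1/\delta)}{2n}}

% Restricting to physically consistent hypotheses can only shrink the
% complexity term, because the supremum is taken over a smaller set.
\mathcal{H}_{\mathrm{phy}} \subseteq \mathcal{H}
\;\Longrightarrow\;
\mathfrak{R}_n(\ell \circ \mathcal{H}_{\mathrm{phy}})
\;\le\; \mathfrak{R}_n(\ell \circ \mathcal{H})

% Linear latent dynamics: an initial latent error e_0 propagates as K^t e_0,
% so its growth is controlled by the operator norm of K.
\lVert K^{t} e_0 \rVert_2 \;\le\; \lVert K \rVert_2^{\,t}\, \lVert e_0 \rVert_2
```

Read together, a smaller hypothesis class lowers the second term of the bound, and an operator norm at or below 1 keeps latent errors from growing with the prediction horizon.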

The authors provide detailed network architecture, hyperparameters, and training settings for reproducibility. However, the paper does not explicitly state that code or datasets are released. It draws its cardiac electrophysiology data from prior simulation studies and does not say whether the dataset is publicly available.

As a concrete example, the pipeline takes a block of sparse LR voltage maps, applies the spline-based GCN encoder to produce latent embeddings, propagates temporal dynamics linearly with the learned Koopman matrix, reconstructs dense HR voltage fields via decoder, and backprojects for loss computation. Physics-based soft constraints act as a regularizer during training, tightening solutions to physically plausible states. This combined approach enables significantly improved spatiotemporal super-resolution on complex irregular cardiac domains.
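The data flow in this example can be traced with a shape-level sketch. Random linear maps stand in for the trained spline-GCN encoder and decoder, and the sizes, the matrix `K`, and the selection-style projection `P` are all hypothetical; only the order of operations follows the description above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_lr, n_hr, latent_dim, steps = 4, 8, 3, 5

# Hypothetical linear stand-ins for the trained modules.
E = rng.normal(size=(latent_dim, n_lr))   # encoder: LR field -> latent vector
D = rng.normal(size=(n_hr, latent_dim))   # decoder: latent vector -> HR field
K = 0.95 * np.eye(latent_dim)             # Koopman matrix acting on the latent
P = np.eye(n_hr)[::2]                     # fixed projection HR -> LR (every other node)

def super_resolve_block(q0, steps):
    """Encode one LR frame, roll the latent forward, decode each step to HR."""
    z = E @ q0
    hr_frames = []
    for _ in range(steps):
        hr_frames.append(D @ z)           # reconstruct the dense field
        z = K @ z                         # linear latent advance
    hr = np.stack(hr_frames)              # (steps, n_hr)
    lr_back = hr @ P.T                    # back-projection used by the data loss
    return hr, lr_back

q0 = rng.normal(size=n_lr)                # one sparse LR voltage map
hr, lr_back = super_resolve_block(q0, steps)
```

The back-projected frames have the same shape as the LR measurements, which is what allows the data loss to be evaluated in measurement space while the physics loss is evaluated on the dense HR output.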

Technical innovations

Continuous spline-based graph convolution operator parameterized by B-spline kernels over edge pseudo-coordinates, enabling geometry-aware spatial feature extraction on irregular 3D meshes.
Integration of Koopman operator theory to linearize nonlinear temporal dynamics in the latent space of graph encoder outputs, stabilizing long-horizon spatiotemporal prediction.
Physics-augmented loss leveraging PDE residuals on the manifold via Laplace–Beltrami operators, imposing physically consistent constraints to reduce ill-posedness and improve robustness.
Theoretical analysis proving physics-informed regularization reduces hypothesis space complexity (Rademacher complexity), tightening error bounds for better generalization in spatiotemporal SR.

Datasets

3D cardiac electrodynamics dataset: size unspecified; pairs sparse low-resolution measurements with corresponding high-resolution simulations on an irregular 3D heart geometry; source not public.

Baselines vs proposed

GCN baseline: higher RE_total than P-K-GCN (exact numbers not quoted).
Physics-informed NN without Koopman dynamics: lower accuracy than P-K-GCN, which points to the benefit of the Koopman component.
ConvLSTM baseline: lower temporal stability and higher error than P-K-GCN.
P-K-GCN without the physics loss: performance drops significantly, confirming the value of physics augmentation.
P-K-GCN without the Koopman operator: temporal consistency worsens and error increases.
Exact numbers are not given in this summary; the bar chart in Fig 5 shows substantial improvements.

Figures from the paper

Figures are reproduced from the source paper for academic discussion. Original copyright: the paper authors. See arXiv:2606.19303.

Fig 1: Flowchart of the proposed methodology for spatiotemporal super-resolution.

Fig 2: illustrates the P-K-GCN architecture, which processes spatiotemporal inputs on a

Fig 3: Visual comparison of the reconstructed transmembrane potential (𝑢) at time step

Fig 4: Visual comparison of the reconstructed recovery variable (𝑣) at time step 35 under

Fig 5: Bar chart comparing 𝑅𝐸total of our P-K-GCN framework against benchmark

Limitations

Experimental evaluation focused only on cardiac electrodynamics; generalization to other PDE systems with different physics or geometries remains untested.
Dataset provenance and size are not fully disclosed, hindering reproducibility and broader benchmarking.
No adversarial or out-of-distribution robustness evaluation is reported; measurement noise is assumed to be zero-mean, with no malicious perturbations.
Relies on known physical PDE models for physics-based loss; application to unknown or partially known dynamics is unclear.
Computational cost and scalability to larger meshes or longer time horizons are not detailed, which leaves potential deployment bottlenecks unexamined.

Open questions / follow-ons

Can the P-K-GCN framework be extended to other physical domains and PDE systems with different governing equations or boundary conditions?
How does the method perform under varying noise levels, sensor sparsity, or out-of-distribution testing data?
Can the Koopman operator approximation be improved through adaptive or nonlinear latent space embeddings for more complex dynamics?
Is it possible to incorporate uncertainty quantification or probabilistic physics constraints to assess confidence in super-resolved reconstructions?

Why it matters for bot defense

For bot-defense and CAPTCHA practitioners, this paper presents a sophisticated approach to reconstructing high-fidelity spatiotemporal data over irregular domains by combining graph convolution with physics-informed constraints and operator-theoretic temporal modeling. Although the domain is scientific computing rather than bot detection, the demonstrated strategy of encoding domain geometry explicitly with continuous graph convolutions and enforcing physics-based regularization could inspire analogous robust feature extraction or temporal coherence constraints in behavioral modeling for bot detection.

The Koopman-based technique for linearizing nonlinear temporal progression may parallel methods in CAPTCHA analysis that aim to understand and predict human or bot interaction sequences over time. More broadly, the integration of explicit physical priors to tame the ill-posed nature of super-resolution could inform the design of defense mechanisms that incorporate known invariants or constraints from interaction metadata or client environments, enhancing the robustness of feature reconstruction under sparse or noisy observation conditions common in bot defense telemetry.

Cite

@article{arxiv2606_19303,
  title={P-K-GCN: Physics-augmented Koopman-enhanced Graph Convolutional Network for Deep Spatiotemporal Super-resolution},
  author={Xizhuo Zhang and Zekai Wang and Fei Liu and Bing Yao},
  journal={arXiv preprint arXiv:2606.19303},
  year={2026},
  url={https://arxiv.org/abs/2606.19303}
}
