Econstellar: An Open-Source AI-Augmented Research Engine for Computational Financial Econometrics
Source: arXiv:2606.05705 · Published 2026-06-04 · By Avishek Bhandari
TL;DR
Econstellar targets a persistent problem in computational financial econometrics: making empirical research accessible, credible, and reproducible. High-quality econometric analysis is computationally heavy and is seldom delivered in a form that outsiders can verify or extend without extensive setup. Econstellar is an open-source, publicly accessible research engine that runs seventeen publication-grade econometric methods from a standard web browser. Computation is placed on CPUs rather than GPUs so that irregular workloads such as nearest-neighbor transfer entropy estimation run efficiently. An AI assistant interprets numerical results but never generates them, so every quantitative claim is grounded in a reproducible computation. The engine runs the same code that produced the author's published research figures and supports live reproduction with parameter variation and provenance tracking.
Key findings
- The engine exposes seventeen econometric methods, all applied to return series rather than price levels because price levels are non-stationary.
- Directed wavelet-quantile contagion profiles computed live for USA→India returns match the published per-scale transfer-entropy gains exactly, demonstrating live reproducibility.
- A systemic-risk index derived from directed-flow primitives achieves an AUC of 0.915 for COVID-19 crisis early warning with a one-day lead on US equity data, below but comparable to the contemporaneous VIX (0.947).
- On trade-policy-induced stress in Indian equities, an augmented index improves crisis classification AUC from 0.531 (India VIX) to 0.581, a statistically significant gain (p = 0.030).
- The sandboxed compute engine enforces network isolation, timed ephemeral R subprocesses, and a finite method registry to prevent arbitrary code execution and data exfiltration.
- Rate-limiting on public endpoints caps usage to prevent resource exhaustion, including per-IP and global concurrency limits.
- The AI analyst chooses registered analyses, runs them live, and then interprets the results; it does not hallucinate quantitative values.
- The architecture leverages CPU rather than GPU due to irregular, memory-latency-bound k-d-tree searches in transfer entropy estimation that resist SIMD acceleration.
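The per-IP and global concurrency limits mentioned in the findings can be pictured with a small sketch. This is an illustration only: the class name, the sliding-window approach, and all numeric limits are assumptions, since the summary does not state how the engine implements throttling.

```python
import threading
import time
from collections import defaultdict, deque


class RateLimiter:
    """Sliding-window per-IP limit plus a global cap on in-flight jobs."""

    def __init__(self, per_ip_max=5, window_s=60.0, global_max=4):
        # All three defaults are illustrative, not the engine's settings.
        self.per_ip_max = per_ip_max
        self.window_s = window_s
        self.global_max = global_max
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted calls
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self, ip, now=None):
        """Return True and reserve a slot if the call is allowed."""
        now = time.monotonic() if now is None else now
        with self._lock:
            hits = self._hits[ip]
            # Drop timestamps that have aged out of the window.
            while hits and now - hits[0] >= self.window_s:
                hits.popleft()
            if len(hits) >= self.per_ip_max:
                return False  # this caller is over its own budget
            if self._in_flight >= self.global_max:
                return False  # the engine as a whole is saturated
            hits.append(now)
            self._in_flight += 1
            return True

    def release(self):
        """Free a global slot once the computation finishes."""
        with self._lock:
            if self._in_flight > 0:
                self._in_flight -= 1
```

A request handler would call `try_acquire` before starting a computation and `release` in a `finally` block afterwards, so a failed job does not leak a global slot.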
Threat model
The adversary is an anonymous public user who can call the compute API with any method name and parameters, but only calls that match a pre-approved registry are executed. They cannot upload code or execute commands beyond the finite set of reviewed analyses. Network egress from compute subprocesses is disabled, subprocesses are ephemeral with restricted permissions, and rate limits prevent resource exhaustion attacks. The system aims to prevent code injection, data exfiltration, and privilege escalation, and it tolerates denial-of-service attempts by throttling them.
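A finite method registry of this kind amounts to allow-list validation: anything not explicitly registered is rejected before any computation starts. The sketch below is a minimal illustration. The method names `unit_root` and `soch_profile` appear in this summary, but the parameter schemas, the `RejectedCall` exception, and `validate_call` are hypothetical.

```python
# Hypothetical registry: method name -> required parameters and their types.
# The real engine registers 17 methods; their schemas are not given here.
REGISTRY = {
    "unit_root": {"series": str},
    "soch_profile": {"source": str, "target": str},
}


class RejectedCall(ValueError):
    """Raised when a request does not match a registered method exactly."""


def validate_call(method, params):
    """Return a clean copy of params, or raise RejectedCall."""
    if not isinstance(method, str) or method not in REGISTRY:
        raise RejectedCall(f"unknown method: {method!r}")
    if not isinstance(params, dict):
        raise RejectedCall("params must be an object")
    schema = REGISTRY[method]
    unexpected = set(params) - set(schema)
    missing = set(schema) - set(params)
    if unexpected or missing:
        raise RejectedCall(
            f"unexpected={sorted(unexpected)} missing={sorted(missing)}"
        )
    for name, expected_type in schema.items():
        if not isinstance(params[name], expected_type):
            raise RejectedCall(f"{name} must be {expected_type.__name__}")
    # Only schema-listed keys are passed on to the compute layer.
    return {name: params[name] for name in schema}
```

Because the caller's input only ever selects a key and fills typed slots, no user-supplied string is evaluated as code.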
Methodology — deep read
Threat model: The adversary is an anonymous public caller who can send HTTP requests to invoke computations. The system prevents code injection or execution beyond a carefully reviewed finite set of registered econometric methods via a parameterized API. Network egress is blocked, subprocesses are ephemeral with no persistent write access beyond scratch space, and wall-clock timeouts and rate limits mitigate denial-of-service attacks.
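The ephemeral, timed subprocess pattern can be sketched as follows. This is an assumption-laden illustration rather than the engine's code: the function name `run_isolated` and the 30-second default are hypothetical, the demo launches the Python interpreter instead of R so that it runs anywhere, and network isolation or permission restrictions would have to come from the operating system or container layer, which this sketch does not attempt.

```python
import subprocess
import sys
import tempfile


def run_isolated(argv, timeout_s=30.0):
    """Run argv in a throwaway scratch directory with a wall-clock timeout.

    Returns (ok, stdout). The child is killed if it exceeds timeout_s,
    and the scratch directory is deleted when the call returns.
    """
    with tempfile.TemporaryDirectory() as scratch:
        try:
            done = subprocess.run(
                argv,                 # a list, so no shell parsing occurs
                cwd=scratch,          # the only writable working area
                capture_output=True,
                text=True,
                timeout=timeout_s,
                check=False,
            )
        except subprocess.TimeoutExpired:
            return False, ""
        return done.returncode == 0, done.stdout


if __name__ == "__main__":
    # Stand-in for an R invocation; prints the (ok, stdout) pair.
    print(run_isolated([sys.executable, "-c", "print(2 + 2)"]))
```

Passing the command as an argument list rather than a shell string is what keeps request parameters from being interpreted as shell syntax.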
Data: The engine operates primarily on a stored panel of daily equity return series from 18 G20 markets spanning 2006–2026. As a stationarity discipline, every method is applied to log-returns, because price levels are integrated of order one, I(1).
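For reference, a log-return is the first difference of log prices, r_t = ln(P_t / P_{t-1}). The sketch below shows that transformation on a toy price list; the function name and the example values are illustrative and are not taken from the engine or its data.

```python
import math


def log_returns(prices):
    """Convert a price series into log-returns r_t = ln(P_t / P_{t-1})."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    # Pair each price with its successor; the result is one element shorter.
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]
```

One convenient property is that log-returns add up across periods: the sum of the series equals the log of the last price over the first.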
Architecture: The public-facing surface is a static single-page JavaScript app served from GitHub Pages. It dynamically reads the catalog of 17 supported R methods, which a sandboxed Node.js compute engine exposes via HTTP endpoints. Each method corresponds to a rigorously implemented econometric primitive or a composite of primitives (e.g., transfer entropy, wavelet variance, community detection), implemented in R and run in isolated subprocesses without network access. Reproducibility is a central design goal: code versions, timestamps, parameters, and data vintages are all recorded with results. An AI assistant layer uses a large language model to select which method to call and to generate human-readable explanations, but it never fabricates data; results displayed to the user are always outputs of verified computations.
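A provenance record of the kind described (code version, timestamp, parameters, data vintage) might look like the sketch below. The field names, the `run_id` hash, and the example values are all hypothetical; the summary does not specify the engine's actual record format.

```python
import hashlib
import json
from datetime import datetime, timezone


def provenance_stamp(method, params, engine_version, data_vintage, now=None):
    """Build a provenance record to attach to a computed result."""
    now = now or datetime.now(timezone.utc)
    record = {
        "method": method,
        "params": params,
        "engine_version": engine_version,
        "data_vintage": data_vintage,
        "computed_at": now.isoformat(),
    }
    # Hash only the inputs that determine the result, so two identical
    # requests share a run_id even when they are computed at different times.
    key = {k: record[k] for k in
           ("method", "params", "engine_version", "data_vintage")}
    canonical = json.dumps(key, sort_keys=True, separators=(",", ":"))
    record["run_id"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return record
```

Serializing with sorted keys before hashing makes the identifier independent of the order in which parameters were supplied.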
Training regimen: Not applicable as this is a deployed research-engine system rather than a trainable ML model. The AI assistant uses pretrained models (Gemini 2.5) called via Google Vertex AI but only to interpret and generate text, not quantitative results.
Evaluation: The engine's statistical outputs were compared against published values from the author's prior peer-reviewed research, and reproducibility was verified live at the public endpoint. Key results included exact numerical replication of a multi-scale contagion profile (the soch_profile method) and systemic-risk index classification statistics versus standard VIX benchmarks, using metrics such as AUC and p-values reported in the referenced publications. The system discloses per-method provenance with every computation for audit.
Reproducibility: The system code (compute engine, workbench, AI analyst, and the R packages sochcontagion, contagionchannels, and ManyIVsNets) is fully open-source under permissive licenses on GitHub, and live demos are available. The compute API accepts only validated calls to a reviewed registry of 17 methods, and all computations run in sandboxed subprocesses that allow neither arbitrary code injection nor network leakage. Results include permalinks for exact replication. The underlying G20 equity data is proprietary, but its provenance is documented.
Example workflow: A user opens the public portal (a static SPA) and selects the "unit_root" stationarity method for India returns. The SPA issues a POST request to the sandboxed compute engine API with method="unit_root" and series="India". The engine spawns an ephemeral R subprocess to compute Augmented Dickey-Fuller and KPSS statistics on the return series and returns the results with a provenance stamp that includes the engine version, data timestamp, and parameters, among other fields. The SPA renders the verified numeric outputs, and the AI analyst may generate a natural-language explanation interpreting the statistics. The user can export results or modify parameters for further runs. This single call corresponds exactly to the calculation published for India in Table 2. By design, the AI never produces or guesses numbers; the engine runs the computation exactly, in isolation and reproducibly.
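The request in this workflow could be constructed along the following lines. Only the method name `unit_root` and the series `India` come from the description above; the endpoint URL, the JSON wire format, and the helper name are assumptions made for illustration. The sketch builds the request object without sending it.

```python
import json
import urllib.request

# Hypothetical endpoint: the real URL and path are not given in this summary.
ENDPOINT = "https://compute.example.invalid/run"


def build_request(method, **params):
    """Build (but do not send) a POST request for a registered method."""
    body = json.dumps({"method": method, **params}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The call from the workflow above: a unit-root test on India returns.
request = build_request("unit_root", series="India")
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` against the real endpoint and decoding the JSON response, which would carry both the statistics and the provenance stamp.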
Technical innovations
- An architecture that places irregular, latency-bound computations (like nearest-neighbor k-d-tree transfer entropy) on CPUs rather than GPUs, matching hardware to workload for efficient public serving.
- A sandboxed compute engine exposing a finite registry of validated econometric analyses with no code upload or network egress, enforcing strong security for an open public API.
- An AI-augmented interpretation layer that selects live analyses to run and explains results, but never fabricates quantitative data, ensuring full reproducibility and auditability.
- A reproducibility-first design exposing live, parameterized econometric computations as permanent web-service endpoints combined with a dependency-free JS SPA workbench and provenance metadata.
- Integration of a live financial news intelligence pipeline (NEURICX) with geocoded, classified feeds stored longitudinally in BigQuery to contextualize econometric outcomes.
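To see why nearest-neighbor search is described as irregular and latency-bound, it helps to look at a k-d-tree query: each step follows a pointer chosen by a data-dependent comparison, and whether a subtree is visited at all depends on the best distance found so far. The sketch below is a generic textbook k-d tree with Euclidean distance, not the engine's estimator; transfer entropy estimators often use a different norm and k > 1 neighbors.

```python
import math
import random


def build(points, depth=0):
    """Build a k-d tree as nested tuples: (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])  # cycle through the coordinates
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1))


def nearest(node, query, best=None):
    """Return (distance, point) for the stored point closest to query."""
    if node is None:
        return best
    point, axis, left, right = node
    d = math.dist(point, query)
    if best is None or d < best[0]:
        best = (d, point)
    gap = query[axis] - point[axis]
    near, far = (left, right) if gap < 0 else (right, left)
    best = nearest(near, query, best)
    # Data-dependent branch: the far side is searched only if the
    # splitting plane is closer than the best match found so far.
    if abs(gap) < best[0]:
        best = nearest(far, query, best)
    return best


if __name__ == "__main__":
    random.seed(0)
    cloud = [(random.random(), random.random()) for _ in range(1000)]
    print(nearest(build(cloud), (0.5, 0.5)))
```

Different queries take different paths of different lengths through the tree, which is the property that makes the workload hard to vectorize across queries.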
Datasets
- Stored G20 equity daily log-returns panel — 18 markets, 2006–2026 — proprietary panel hosted internally
- GDELT global news index articles — live feed from external source, cached internally
Baselines vs proposed
- COVID-19 crisis early warning AUC: VIX = 0.947 vs systemic-risk index = 0.915
- India trade-policy stress AUC: India VIX = 0.531 vs augmented systemic-risk index = 0.581 (p = 0.030)
- Reproduced sochcontagion profile USA→India at 4 scales: engine value 0.039 vs published 0.039
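The AUC figures above measure how well an index ranks crisis days above non-crisis days. As a reminder of what the number means, the sketch below computes AUC as the fraction of (crisis, non-crisis) pairs in which the crisis day receives the higher score, counting ties as half. The function name and the toy inputs are illustrative; none of the paper's data is reproduced here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels are 1 for the positive class (e.g. a crisis day) and 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
toy_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 0.75
```

On this scale 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the context for reading values such as 0.531 and 0.581.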
Limitations
- The equity panel dataset is proprietary and not openly shared, limiting outsider data replication.
- The system currently supports 17 econometric methods; the full 18-market nearest-neighbor transfer entropy workload is planned but not yet implemented.
- Evaluation focuses on replicating previously published results; adversarial robustness or stress testing of compute endpoints is not discussed.
- The AI interpretation relies on a single pretrained large language model without ablation or alternative model comparisons.
- Quality and currency of live news intelligence depend on upstream GDELT availability, and the feed may be stale while rate limits are in effect.
- The system assumes returns are stationary and prices are non-stationary without addressing regime changes or structural breaks explicitly.
Open questions / follow-ons
- How can the system be extended to support full exact nearest-neighbor transfer entropy across the entire 18-market panel on the planned dedicated HPC nodes?
- Can the AI-augmented interpretation be improved with multi-modal grounding, incorporating live news trends or external macroeconomic variables?
- What robustness measures or anomaly detection could be introduced to handle adversarial or malformed input sequences?
- How generalizable is this architecture to other high-compute econometric or scientific reproducibility tasks outside financial contagion?
Why it matters for bot defense
Econstellar illustrates a rigorous way to combine heavy backend statistical computation with interactive AI-driven interpretation in a publicly accessible and reproducible service. For bot-defense or CAPTCHA practitioners, several architectural themes are relevant: a sandboxed execution environment with a fixed registry of calls mitigates the risk of arbitrary input injection, rate limiting enforces resource control, and isolating AI-generated text from numerical claims ensures auditability and prevents hallucinated or manipulated responses. Although the domain is financial econometrics rather than bot detection, these design principles (strong sandboxing, provenance tracking, parameterized APIs, and grounding AI outputs in reproducible computation) can inspire secure bot-defense architectures that need to combine automated analysis with explainability. The pattern of placing irregular, pointer-chasing workloads on CPUs rather than GPUs may also inform the design of real-time verification or anomaly detection systems that process complex, branching computational graphs.
Cite
@article{arxiv2606_05705,
title={Econstellar: An Open-Source AI-Augmented Research Engine for Computational Financial Econometrics},
author={Avishek Bhandari},
journal={arXiv preprint arXiv:2606.05705},
year={2026},
url={https://arxiv.org/abs/2606.05705}
}