AESTRA II: Generative Spectral Modeling of the Sun as a Star for Precise Radial Velocities

Source: arXiv:2606.13574 · Published 2026-06-11 · By Yan Liang, Joshua N. Winn, Peter Melchior, Sicong Lu, Quang H. Tran

TL;DR

This paper addresses a central obstacle to detecting Earth-analog exoplanets with extreme-precision radial velocity (EPRV) measurements: stellar spectra are contaminated by spectral variability from stellar activity, micro-telluric absorption, and instrumental systematics. The authors apply AESTRA, a generative spectral modeling framework, to real Sun-as-a-star observations collected by the NEID spectrograph. AESTRA empirically decomposes observed spectra into three physically motivated components (stellar line-shape variability, micro-telluric absorption, and continuum fluctuations) without relying on external templates or precomputed atmospheric models. After removing the telluric and continuum contributions, AESTRA extracts a low-dimensional latent representation of stellar activity signatures, which is then jointly optimized with candidate planetary Doppler signals to separate planetary RV variations from activity-driven apparent RV shifts.

The paper evaluates AESTRA on multi-year, high-resolution NEID solar spectra. In 500 blind, period-agnostic injection-recovery tests of single-planet signals with semi-amplitudes from 0.1 to 0.7 m/s and orbital periods from 2.5 to 400 days, AESTRA recovers 238 planets with zero false alarms, including 13 planets with semi-amplitudes below 0.3 m/s. Traditional detrending against cross-correlation function (CCF)-based activity indicators recovers 9 planets, none below 0.5 m/s. The results indicate that the generative spectral decomposition and the learned stellar activity latent space substantially improve sensitivity to Earth-analog Doppler signals while simultaneously modeling and suppressing telluric and instrumental spectral variations.

Key findings

AESTRA decomposes NEID solar spectra into three components—stellar line-shape variability, micro-telluric absorption, and continuum variability—without external templates, successfully separating Earth-frame tellurics and stellar-frame activity signatures.
The learned telluric component correlates extremely tightly (correlation coefficient > 0.999) with precipitable water vapor (PWV) estimates from the NEID pipeline's line-by-line atmospheric modeling, validating the atmospheric interpretation.
After removing telluric and continuum components, AESTRA merges 42 echelle orders into a cleaned spectrum preserving stellar activity variability for radial velocity inference.
In 500 single-planet injection-recovery tests (period range 2.5–400 days, semi-amplitudes 0.1–0.7 m/s) calibrated to zero false positives, AESTRA recovers 238 planets including 13 with K < 0.3 m/s, versus 9 planets recovered by traditional CCF-based activity-indicator detrending and none below 0.5 m/s.
NEID achieves a long-term solar RV stability of roughly 0.37 m/s, comparable to the sub-m/s Doppler amplitudes that AESTRA targets.
AESTRA models and corrects for airmass-dependent telluric contamination, seasonal and instrumental wavelength shifts (e.g., due to a wildfire-induced instrument change), and continuum shape variations simultaneously in a joint framework.
The use of a low-dimensional latent stellar activity representation preserves differential line-profile distortions across different wavelengths and orders better than single activity indicators or Gaussian process regression on 1D RV time series.
The empirical wavelength alignment and spectral decomposition model accounts for instrument state transitions and enables unified treatment of multi-year heterogeneous solar spectroscopic data.

Threat model

The adversary is the mixture of time-variable spectral contaminants present in high-precision radial velocity data from Sun-like stars: quasi-static and time-variable telluric absorption features fixed in the Earth's frame, smooth instrumental continuum variability, and intrinsic stellar activity-driven line-shape changes that induce spurious apparent Doppler shifts. These effects mask and confound the far subtler Doppler signals imposed by Earth-analog exoplanets. The adversary cannot be assumed to be perfectly known or captured by external templates; instead, it is learned empirically from the data. The approach assumes access to multi-epoch, high-resolution spectra with millions of pixels and sufficient wavelength coverage. It cannot remove systematic errors that are spectrally indistinguishable from, and therefore degenerate with, planetary Doppler shifts.

Methodology — deep read

The authors frame the problem as one in which the primary limitations are spectral variability from stellar magnetic activity, micro-telluric absorption by Earth's atmosphere, and instrumental systematics, all of which mix with true planetary Doppler shifts in the observed NEID spectra. The challenge is to separate these overlapping effects without perfect external templates.

The data consist of 233,211 NEID solar spectra (Sun-as-a-star) from 2020–2024, reduced via the NEID pipeline (v1.4.2) providing 122 echelle orders per exposure with wavelength calibration, flux uncertainties, and quality metadata. After applying extensive quality cuts—removing times of instrument anomalies, solar eclipses, high airmass (> 2.5), cloud passages indicated by irradiance drops, low-irradiance or low-data-count days, and days with large RV scatter or offsets—the sample is reduced to 72,449 high-quality spectra spanning 521 clear observing days. A subset of ~30,000 spectra is selected randomly for computational feasibility.
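The selection step above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the `Exposure` record and its `flagged` field are hypothetical stand-ins for the quality metadata, and only the airmass limit of 2.5 comes from the text.

```python
import random
from dataclasses import dataclass

@dataclass
class Exposure:
    day: str        # observing date
    airmass: float
    flagged: bool   # hypothetical flag: instrument anomaly, eclipse, or cloud passage

MAX_AIRMASS = 2.5   # airmass limit quoted in the text

def apply_quality_cuts(exposures):
    """Keep exposures that pass the airmass cut and carry no quality flag."""
    return [e for e in exposures if e.airmass <= MAX_AIRMASS and not e.flagged]

def random_subset(exposures, n, seed=0):
    """Draw a reproducible random subset of at most n exposures."""
    if len(exposures) <= n:
        return list(exposures)
    return random.Random(seed).sample(exposures, n)

exposures = [
    Exposure("2021-03-04", 1.2, False),
    Exposure("2021-03-04", 2.8, False),  # rejected: airmass above the limit
    Exposure("2021-03-05", 1.5, True),   # rejected: flagged
    Exposure("2021-03-06", 1.1, False),
    Exposure("2021-03-06", 2.5, False),
]
kept = apply_quality_cuts(exposures)
subset = random_subset(kept, 2)
```

Fixing the seed keeps the training subset reproducible across runs, which matters when only part of the high-quality sample is used.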

Preprocessing involves blaze function division, normalization by median flux, masking of problematic pixels, and barycentric correction to place the spectra on a common stellar rest-frame grid. To handle instrumental changes (notably caused by the 2022 Kitt Peak wildfire), an empirical low-order cubic spline wavelength correction is learned and applied to the pre-fire spectra to align them with the post-fire data.
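As a rough illustration of the first preprocessing steps, the sketch below divides one order by its blaze function and normalizes by the median of the unmasked pixels. The function name and the convention of marking masked pixels as NaN are assumptions made for this example; barycentric correction and the spline wavelength alignment are not shown.

```python
import math
from statistics import median

def normalize_order(flux, blaze, good):
    """Divide by the blaze function, then scale by the median of good pixels.

    flux, blaze: per-pixel values for one echelle order.
    good: per-pixel booleans, False for masked (problematic) pixels.
    Masked pixels are returned as NaN.
    """
    deblazed = [
        f / b if (ok and b > 0) else math.nan
        for f, b, ok in zip(flux, blaze, good)
    ]
    valid = [v for v in deblazed if not math.isnan(v)]
    scale = median(valid)
    return [v / scale for v in deblazed]

flux = [50.0, 100.0, 150.0, 90.0, 0.0]
blaze = [0.5, 1.0, 1.5, 1.0, 0.5]
good = [True, True, True, True, False]
normalized = normalize_order(flux, blaze, good)
```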

For each individual order, the observed spectrum y_obs,i is modeled as y_model,i = (1 - y_t,i) * (1 + y_c,i) * (y_star_bar + delta_y_star,i) + b_i, decomposing into telluric absorption y_t,i, continuum variability y_c,i, a trainable, time-independent stellar template y_star_bar, time-varying stellar line-shape distortions delta_y_star,i, and a scalar offset b_i. The model encodes the observed spectra with a shared neural encoder into a low-dimensional latent space and uses separate decoder branches to reconstruct telluric, continuum, and stellar components.
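The forward model above is simple enough to write out directly. The sketch below evaluates it pixel by pixel for a single exposure; the argument names mirror the symbols in the text, while the numeric values are made up for illustration. In the paper the components come from neural decoders, which are not reproduced here.

```python
def forward_model(y_t, y_c, y_star_bar, delta_y_star, b):
    """Evaluate y_model = (1 - y_t) * (1 + y_c) * (y_star_bar + delta_y_star) + b.

    y_t: telluric absorption depth per pixel
    y_c: multiplicative continuum variation per pixel
    y_star_bar: time-independent stellar template per pixel
    delta_y_star: time-varying stellar line-shape distortion per pixel
    b: scalar offset for this exposure
    """
    return [
        (1.0 - t) * (1.0 + c) * (s + d) + b
        for t, c, s, d in zip(y_t, y_c, y_star_bar, delta_y_star)
    ]

# Illustrative values only: pixel 0 has a 10% telluric line, a 2% continuum
# tilt, and a small line-shape perturbation; pixel 1 has no contamination.
model = forward_model(
    y_t=[0.10, 0.0],
    y_c=[0.02, 0.0],
    y_star_bar=[0.80, 1.0],
    delta_y_star=[0.01, 0.0],
    b=0.0,
)
```

With all contaminating terms set to zero the model reduces to the stellar template, which is why removing only the telluric and continuum factors leaves the activity signal in place.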

Structural constraints enforce that tellurics are Earth-frame absorption features with narrow morphologies and variable depths, continuum is smooth multiplicative low-frequency variations, and stellar line-shape variability is broader and localized around absorption lines in the stellar frame. The telluric decoder includes a learned broadening kernel describing line profiles. The entire decomposition and wavelength correction are jointly optimized using reconstruction loss with regularization terms that prevent degenerate solutions.
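To make the role of a broadening kernel concrete, the sketch below builds a telluric absorption profile by convolving narrow line "sticks" with a normalized kernel. A fixed Gaussian is assumed here purely for illustration; in the paper the kernel is learned, and the line positions and depths below are invented.

```python
import math

def gaussian_kernel(sigma_pix, half_width):
    """Normalized Gaussian kernel sampled on integer pixel offsets."""
    k = [math.exp(-0.5 * (i / sigma_pix) ** 2)
         for i in range(-half_width, half_width + 1)]
    total = sum(k)
    return [v / total for v in k]

def convolve_same(signal, kernel):
    """Convolve with a symmetric kernel, keeping the input length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, kv in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += signal[idx] * kv
        out.append(acc)
    return out

def telluric_absorption(n_pix, line_pixels, depths, kernel):
    """Place narrow lines at the given pixels and broaden them with the kernel."""
    sticks = [0.0] * n_pix
    for p, d in zip(line_pixels, depths):
        sticks[p] += d
    return convolve_same(sticks, kernel)

kernel = gaussian_kernel(sigma_pix=1.0, half_width=3)
y_t = telluric_absorption(21, line_pixels=[5, 15], depths=[0.02, 0.01], kernel=kernel)
```

Because the kernel is normalized, changing the line depths rescales the absorbed flux without changing the profile shape, which matches the "narrow morphologies and variable depths" constraint.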

After training, only the telluric and continuum components are removed from the spectra to produce cleaned, activity-preserving per-order spectra. Orders with unreliable telluric corrections (determined by broadening kernel morphology) or strong residuals are excluded. The cleaned, telluric- and continuum-corrected spectra from 42 orders covering 4300–6230 Å are merged onto a common wavelength grid for downstream analysis.
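One plausible way to merge overlapping orders is to interpolate each onto the common grid and combine them with inverse-variance weights. The summary does not specify the authors' merging scheme, so the weighting and the helper names below are assumptions.

```python
import math
from bisect import bisect_right

def interp(x, xp, fp):
    """Linear interpolation on increasing xp; None outside the covered range."""
    if x < xp[0] or x > xp[-1]:
        return None
    i = min(bisect_right(xp, x), len(xp) - 1)
    x0, x1 = xp[i - 1], xp[i]
    w = (x - x0) / (x1 - x0)
    return fp[i - 1] + w * (fp[i] - fp[i - 1])

def merge_orders(orders, grid):
    """Inverse-variance weighted merge of (wave, flux, ivar) orders onto grid.

    Grid points that no order covers are returned as NaN.
    """
    merged = []
    for x in grid:
        num = 0.0
        den = 0.0
        for wave, flux, ivar in orders:
            f = interp(x, wave, flux)
            w = interp(x, wave, ivar)
            if f is None or w is None or w <= 0:
                continue
            num += w * f
            den += w
        merged.append(num / den if den > 0 else math.nan)
    return merged

# Two toy orders overlapping at 5001-5002 Angstrom; values are illustrative.
order_a = ([5000.0, 5001.0, 5002.0], [1.0, 0.9, 1.0], [4.0, 4.0, 4.0])
order_b = ([5001.0, 5002.0, 5003.0], [0.8, 1.0, 1.0], [1.0, 1.0, 1.0])
grid = [4999.0, 5000.0, 5001.0, 5002.0, 5003.0]
merged = merge_orders([order_a, order_b], grid)
```

In the overlap at 5001 Å the higher-weight order dominates, so the merged value sits closer to 0.9 than to 0.8.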

Next, the cleaned spectra are encoded into a low-dimensional “stellar activity latent vector” that captures correlated line-profile variations. An activity-driven apparent radial velocity (RV) is inferred jointly with candidate planetary Doppler signals using a supervised model linking the latent space representation to RV perturbations.
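The idea of fitting activity and planet terms together can be illustrated with a deliberately simplified stand-in: a linear least-squares fit of the RV series against one latent coordinate plus a sinusoid at a trial period. The paper's model is a learned mapping from a multi-dimensional latent space, so the linear form, the single latent, and the synthetic data below are all assumptions.

```python
import math

def solve_linear(mat, rhs):
    """Solve mat @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(mat)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def joint_fit(times, rv, latent, period):
    """Fit rv ~ a*latent + s*sin(wt) + c*cos(wt); return (a, semi-amplitude)."""
    w = 2.0 * math.pi / period
    cols = [
        latent,
        [math.sin(w * t) for t in times],
        [math.cos(w * t) for t in times],
    ]
    ata = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    atb = [sum(p * y for p, y in zip(ci, rv)) for ci in cols]
    a, s, c = solve_linear(ata, atb)
    return a, math.hypot(s, c)

# Synthetic, noise-free data: activity follows a 27-day latent with
# coefficient 2.0; the injected planet has K = 0.3 m/s and P = 50 days.
times = [float(t) for t in range(200)]
latent = [math.sin(2.0 * math.pi * t / 27.0) for t in times]
rv = [2.0 * z + 0.3 * math.sin(2.0 * math.pi * t / 50.0 + 0.4)
      for z, t in zip(latent, times)]
activity_coeff, k_planet = joint_fit(times, rv, latent, period=50.0)
```

Fitting both terms at once avoids the bias that arises when activity is removed first and part of the planetary signal is absorbed into the activity model.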

For validation, 500 blind single-planet injection–recovery tests are performed using the real solar dataset. Synthetic planetary signals with randomized orbital periods (2.5 to 400 days) and semi-amplitudes (0.1 to 0.7 m/s) are injected into the spectra before analysis. Detection thresholds are calibrated to yield zero false positive discoveries. Candidate detections are compared against the injection parameters to count recoveries only when the detected signals match the injected planets. Traditional CCF-based activity-indicator detrending methods are also run for comparison.
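The injection and scoring logic can be sketched as follows, assuming circular orbits, a non-relativistic Doppler shift, and a threshold set to the highest score seen in signal-free trials. The 5% period-matching tolerance and the trial tuples are hypothetical; the summary does not state the authors' matching criterion or detection statistic.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light

def planet_rv(t, k, period, phase=0.0):
    """Circular-orbit radial velocity in m/s at time t (days)."""
    return k * math.sin(2.0 * math.pi * t / period + phase)

def doppler_shift(wave, v):
    """Shift wavelengths by velocity v (m/s), non-relativistic approximation."""
    return [w * (1.0 + v / C_M_PER_S) for w in wave]

def zero_fp_threshold(null_scores):
    """Lowest threshold that rejects every signal-free trial."""
    return max(null_scores)

def count_recoveries(trials, threshold, period_tol=0.05):
    """Count trials above threshold whose detected period matches the injection.

    trials: (score, injected_period, detected_period) tuples.
    period_tol: assumed fractional tolerance on the period match.
    """
    recovered = 0
    for score, p_in, p_out in trials:
        if score > threshold and abs(p_out - p_in) <= period_tol * p_in:
            recovered += 1
    return recovered

v_quarter = planet_rv(25.0, k=0.5, period=100.0)   # quarter phase: RV equals K
shifted = doppler_shift([5000.0], 0.1)             # 0.1 m/s shift at 5000 Angstrom
threshold = zero_fp_threshold([0.2, 0.5, 0.9])
trials = [
    (1.5, 100.0, 101.0),  # above threshold, period matches
    (0.8, 50.0, 50.0),    # below threshold
    (2.0, 10.0, 20.0),    # above threshold, wrong period
]
n_recovered = count_recoveries(trials, threshold)
```

A 0.1 m/s signal shifts a line at 5000 Å by under two millionths of an ångström, which shows why the signal has to be accumulated over many lines and exposures.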

Training and optimization details (architectures, loss terms, optimizers, hyperparameters) are described in the appendices, though the number of epochs and the hardware used are not explicitly stated. The paper does not directly mention a code release for AESTRA; the dataset is partially publicly available via the NEID Solar Feed archive.

A concrete example is shown where an individual NEID exposure is decomposed into its telluric, continuum, and stellar components; the telluric component depth correlates with water vapor measurements; and removing telluric and continuum features enables cleaner detection of stellar activity-driven spectral variations that affect RV measurements.

Technical innovations

Joint generative spectral decomposition model separating telluric absorption, continuum variability, and stellar line-shape variability without external atmospheric or stellar templates, using shared encoder and component-specific decoders.
Empirical modeling of both time-variable micro-telluric line profiles and instrumental continuum fluctuations alongside stellar variability within a unified framework, enabling disentanglement of overlapping spectral sources.
Low-dimensional latent representation learned directly from cleaned, merged multi-order spectra to capture differential stellar activity spectral variability beyond traditional activity indicators or Gaussian processes applied to RV time series.
Blind, period-agnostic injection-recovery methodology calibrated for zero false positives to rigorously evaluate sensitivity to Earth-analog Doppler signals down to 0.1 m/s semi-amplitude in realistic solar data.

Datasets

NEID Solar Feed: 233,211 pipeline-reduced spectra from 2020–2024 (72,449 after quality cuts); publicly available at https://neid.ipac.caltech.edu/search_solar.php

Baselines vs proposed

Traditional CCF-based activity-indicator detrending: 9 planets recovered in injection-recovery test vs AESTRA: 238 planets recovered at zero false positive rate
Traditional CCF-based detrending: no planets recovered below K=0.5 m/s vs AESTRA: 13 planets recovered with K < 0.3 m/s
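Expressed as recovery rates over the 500 injections, the counts above work out as follows. This is plain arithmetic on the reported numbers, with no uncertainty estimate attached.

```python
N_TRIALS = 500
AESTRA_RECOVERED = 238
CCF_RECOVERED = 9

aestra_rate = AESTRA_RECOVERED / N_TRIALS
ccf_rate = CCF_RECOVERED / N_TRIALS
ratio = AESTRA_RECOVERED / CCF_RECOVERED

print(f"AESTRA: {aestra_rate:.1%}")   # AESTRA: 47.6%
print(f"CCF:    {ccf_rate:.1%}")      # CCF:    1.8%
print(f"ratio:  {ratio:.1f}x")        # ratio:  26.4x
```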

Figures from the paper

Figures are reproduced from the source paper for academic discussion. Original copyright: the paper authors. See arXiv:2606.13574.

Fig 1: Raw spectral variability in NEID solar observations. Top: CCF radial velocities from a representative spectral order

Fig 2: Overview of the Æstra workflow. (a) Each echelle order is decomposed into telluric absorption, continuum variability,

Fig 3: Example telluric components extracted by the spectral decomposition model. Gray curves show observed NEID solar

Fig 4: Examples of the spectral decomposition into three components capturing distinct sources of variability. Top: The

Fig 5: Correlation between the leading stellar la-

Fig 6: Detection quality score for candidate signals in

Fig 7: Empirical calibration of the FAP threshold for

Limitations

Computational expense limits training to a random subset (~30,000) of the ~72,000 high-quality spectra, so the model does not use all available data.
Instrumental and environmental changes after the Kitt Peak wildfire require empirical wavelength corrections; this ad hoc approach may not generalize to all instrumental systematics.
No explicit adversarial test on worst-case stellar activity scenarios or non-solar-type stars, limiting assessment of generalizability beyond Sun-as-a-star data.
Reduced spectral range (4300–6230 Å) excludes redder and bluer orders contaminated by tellurics or low S/N, potentially losing Doppler information available in full NEID spectra.
The paper describes no public release of code or trained weights, which limits immediate reproducibility and independent verification of the results.

Open questions / follow-ons

How well does AESTRA generalize to other stellar types, including M dwarfs and more active stars with different line-profile variability?
Can the method be extended to jointly model multiple planets and disentangle overlapping Doppler signals in multi-planet systems?
How robust is the model to longer-term instrumental drifts or unmodeled systematics that the current wavelength corrections do not capture?
How does AESTRA performance compare with upcoming machine-learning or physics-based RV extraction methods on other extreme-precision spectrographs like ESPRESSO or EXPRES?

Why it matters for bot defense

Although this work targets extreme-precision radial velocity measurements for exoplanet detection, its core problem resembles one faced in bot defense: separating subtle signals of interest (here, planetary Doppler shifts) from structured, quasi-stationary confounders (here, stellar activity, atmospheric absorption, and instrumental distortions). For CAPTCHA practitioners, AESTRA illustrates how generative, physically informed latent variable models can jointly model several overlapping sources of variability without relying on external templates. It also shows the benefit of working with high-dimensional raw data rather than compressed summary statistics, which retains more information for signal detection. The injection-recovery tests, calibrated to zero false positives, offer an instructive template for evaluating detection algorithms under realistic noise. Finally, the use of domain-knowledge constraints to separate Earth-frame from stellar-frame variability could inspire analogous ways to factor out unwanted signals in other modalities, such as behavioral biometrics or device fingerprints. Overall, the paper underscores the value of interpretable, physically constrained latent variable models for separating subtle signals from structured noise in complex observational data.

Cite

bibtex

@article{arxiv2606_13574,
  title={AESTRA II: Generative Spectral Modeling of the Sun as a Star for Precise Radial Velocities},
  author={Yan Liang and Joshua N. Winn and Peter Melchior and Sicong Lu and Quang H. Tran},
  journal={arXiv preprint arXiv:2606.13574},
  year={2026},
  url={https://arxiv.org/abs/2606.13574}
}
