What You Don't Know Won't Hurt You: Self-Consistent Hierarchical Inference with Unknown Follow-up Selection Strategies

Source: arXiv:2605.06636 · Published 2026-05-07 · By Reed Essick, Amanda M. Farah

TL;DR

This paper addresses a key practical challenge in astrophysical population inference: how to incorporate follow-up observations when the process deciding which initial survey detections are selected for follow-up is unknown or difficult to model. Instead of requiring explicit models of these follow-up selection strategies, the authors show that, within a hierarchical Bayesian inference framework, one can obtain unbiased and self-consistent inferences of the intrinsic population by conditioning on all observed data and the initial detection indicators. They demonstrate that the unknown or complex follow-up selection process cancels out analytically, as long as the inference conditions on the original catalog data, and follow-up data when available. Multiple astrophysical scenarios illustrate the applicability and benefits of this approach, including Gaussian toy models, joint gravitational wave and electromagnetic “bright siren” observations, and catalogs with contaminant populations. Results show that follow-up observations can meaningfully improve constraints without explicit follow-up modeling, and that the method can correctly recover intrinsic population parameters and rare subpopulation fractions. This finding can significantly simplify and improve population analyses for large surveys with heterogeneous and uncoordinated follow-up, such as LSST, Gaia, and gravitational wave interferometers.

Key findings

The hierarchical posterior for the intrinsic population can be written so that it does not depend on the follow-up selection probability P(F|D,x), which removes the need to model unknown follow-up selection processes (Eq. 5).
Coverage tests with simulated catalogs (e.g., Gaussian toy model with 500 detected events and 56 followed-up) confirm unbiased posterior recovery independent of follow-up strategy (Appendix B).
In a 50-event bright siren gravitational wave cosmology simulation, adding follow-up EM data tightens the Hubble constant posterior by up to a factor of ~1.9, regardless of follow-up selection (Fig. 3).
Modeling a population mixture with contaminants, where follow-up is correlated with subpopulation type, recovers an intrinsic fraction of rare dark companions that is consistent with the truth (posterior λ = 3.9(+5.4/-2.4)% vs. true 7.4%), even when the followed-up subset is strongly biased toward dark companions (Fig. 4).
Follow-up strategies that selectively follow events based on catalog data can yield more precise constraints than random or exhaustive follow-up, showing the value of informative follow-up choices without requiring explicit modeling.
The method accommodates uncoordinated follow-up from multiple observers adopting different selection heuristics, enabling combination of heterogeneous datasets without joint follow-up modeling.
Ignoring the follow-up process within the hierarchical posterior does not bias inference if the model conditions on all catalog and follow-up data, and properly models contaminants when present.
Precision gains depend on the number and informativeness of follow-up observations but plateau as follow-up increases, indicating diminishing returns.

Methodology — deep read

The authors begin with a hierarchical Bayesian inference (HBI) framework that models the intrinsic astrophysical population as an inhomogeneous Poisson process with rate density dN/dθ = N_E p(θ|Λ), where θ are single-event latent parameters, Λ are population hyperparameters, and N_E is the expected total number of events.

They assume detected events form an initial catalog with observed data x and detection indicators D, but only a subset N_F ≤ N_D of detected events receive follow-up, which produces additional data f and is recorded by a follow-up indicator F. The key difficulty is that the follow-up selection probability P(F|D,x) is unknown or intractable.

They derive a joint PDF (Eq. 2) over all latent parameters and observed data including detection and follow-up indicators, assuming conditional independence structure encoded in a directed acyclic graph (DAG, Fig. 2). By carefully marginalizing and conditioning, they show that the follow-up selection probability cancels out exactly from the final hierarchical posterior (Eq. 5). This means the follow-up process does not have to be modeled explicitly as long as inference conditions on all original catalog data x, detection D, and any follow-up data f.
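The cancellation can be sketched for a single detected event. This is a reconstruction from the description above, not the paper's Eq. 2 or Eq. 5 verbatim. It assumes that detection depends on the event only through the catalog data x, that the follow-up decision depends only on (D, x), and that the follow-up data f depend only on θ; the binary exponent F_i ∈ {0, 1} is notation introduced here.

```latex
% Per-event joint for a detected event i (D_i = 1), with F_i \in \{0, 1\}
p(\theta_i, x_i, D_i, F_i, f_i \mid \Lambda)
  = p(\theta_i \mid \Lambda)\, p(x_i \mid \theta_i)\, P(D_i \mid x_i)\,
    P(F_i \mid D_i, x_i)\, p(f_i \mid \theta_i)^{F_i}

% Marginalizing over \theta_i: the selection factors come out of the integral
p(x_i, D_i, F_i, f_i \mid \Lambda)
  = \underbrace{P(D_i \mid x_i)\, P(F_i \mid D_i, x_i)}_{\text{independent of } \Lambda}
    \int d\theta_i\, p(\theta_i \mid \Lambda)\, p(x_i \mid \theta_i)\,
    p(f_i \mid \theta_i)^{F_i}

% Catalog-level posterior for the Poisson process, up to \Lambda-independent constants
p(\Lambda, N_E \mid \{x_i, f_i, D_i, F_i\})
  \propto p(\Lambda, N_E)\, N_E^{N_D}\, e^{-N_E P(D \mid \Lambda)}
    \prod_{i=1}^{N_D} \int d\theta_i\, p(\theta_i \mid \Lambda)\,
    p(x_i \mid \theta_i)\, p(f_i \mid \theta_i)^{F_i}

% Detection normalization; note that it involves P(D \mid x) but never P(F \mid D, x)
P(D \mid \Lambda) = \int d\theta\, dx\, p(\theta \mid \Lambda)\, p(x \mid \theta)\, P(D \mid x)
```

Because P(F_i|D_i, x_i) multiplies the likelihood as a factor that does not involve Λ or θ_i, it is absorbed into the proportionality constant. The detection probability still has to be modeled, since it enters through P(D|Λ).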

To demonstrate this concretely, they use a Gaussian toy model (Sec. 3.1) in which the intrinsic population is Gaussian, θ ~ N(μ,σ), the original catalog data are x ~ N(θ,σ_x), and the follow-up data are f ~ N(θ,σ_f) with smaller measurement errors. Detection and follow-up probabilities are sigmoidal functions of the catalog data x. Using simulated catalogs, they infer (μ,σ) by sampling the posterior of Eq. 5 with MCMC or nested sampling.
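For this toy model the per-event integrals over θ have closed forms. The expressions below are a worked sketch and assume that the second argument of N(·,·) in the description above is a standard deviation, so variances add as squares.

```latex
% Catalog-only event: marginalize \theta out of p(\theta \mid \mu, \sigma)\, p(x \mid \theta)
x_i \mid \mu, \sigma \;\sim\; \mathcal{N}\!\left(\mu,\; \sigma^2 + \sigma_x^2\right)

% Followed-up event: x and f share the same \theta, so they are correlated
\begin{pmatrix} x_i \\ f_i \end{pmatrix} \Bigm| \mu, \sigma
  \;\sim\; \mathcal{N}\!\left(
    \begin{pmatrix} \mu \\ \mu \end{pmatrix},\;
    \begin{pmatrix}
      \sigma^2 + \sigma_x^2 & \sigma^2 \\
      \sigma^2 & \sigma^2 + \sigma_f^2
    \end{pmatrix}
  \right)
```

The off-diagonal term σ² is what lets a precise follow-up measurement f sharpen the constraint from a noisy catalog measurement x of the same event.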

They extend this to a multi-messenger astrophysics example (Sec. 3.2) where gravitational wave observations constrain luminosity distance (catalog data x), and electromagnetic follow-up provides redshift (follow-up data f). Multiple follow-up strategies selecting events with extreme masses are tested, showing consistent recovery of the Hubble parameter without explicitly modeling follow-up selection.

In Sec. 3.3, they model catalogs with two subpopulations: rare dark companions and contaminants, each with distinct distributions over catalog parameters. Follow-up provides additional classification data correlated with source type. Again, hierarchical inference recovers the intrinsic mixing fraction and population parameters despite biased follow-up selection correlated with subpopulation.
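The mixture case can be illustrated with a deliberately simplified sketch. Everything numerical here is an assumption made for illustration, not the paper's setup: the two subpopulation shapes are taken as known Gaussians, every candidate is in the catalog (no detection selection), the follow-up statistic f depends only on source type, and the follow-up rule is a sigmoid in x that the inference never sees. Only the catalog size (1396) and the true fraction (7.4%) echo the numbers quoted in this summary.

```python
import numpy as np

def norm_pdf(v, mean, sd):
    """Gaussian density, vectorized over v."""
    return np.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

rng = np.random.default_rng(1)
n, lam_true = 1396, 0.074

# Simulate source types and catalog data (assumed shapes: dark N(1.5, 1), contaminant N(0, 1)).
is_dark = rng.random(n) < lam_true
x = rng.normal(np.where(is_dark, 1.5, 0.0), 1.0)

# Follow-up rule favoring large x; used only to generate data, never in the inference.
fol = rng.random(n) < 1.0 / (1.0 + np.exp(-(x - 2.2) / 0.3))

# Follow-up classification statistic (assumed: dark N(1, 0.5), contaminant N(0, 0.5)).
f = rng.normal(np.where(is_dark[fol], 1.0, 0.0), 0.5)

# Per-event likelihood under each subpopulation; f enters only for followed-up events.
like_dark = norm_pdf(x, 1.5, 1.0)
like_cont = norm_pdf(x, 0.0, 1.0)
like_dark[fol] *= norm_pdf(f, 1.0, 0.5)
like_cont[fol] *= norm_pdf(f, 0.0, 0.5)

# Grid posterior on the mixing fraction with a flat prior; P(F|x) appears nowhere.
lams = np.linspace(0.001, 0.5, 500)
logp = np.array([np.sum(np.log(l * like_dark + (1 - l) * like_cont)) for l in lams])
post = np.exp(logp - logp.max())
post /= post.sum()

lam_mean = np.sum(post * lams)
apparent = is_dark[fol].mean()  # dark fraction among followed-up events

print(f"followed up: {fol.sum()} of {n}")
print(f"apparent dark fraction in follow-up subset: {apparent:.2f}")
print(f"posterior mean mixing fraction: {lam_mean:.3f} (true {lam_true})")
```

The followed-up subset is heavily enriched in dark companions because the rule prefers large x, yet the inferred fraction refers to the full population because the likelihood conditions on every candidate's x, followed up or not.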

Evaluation consists of multiple posterior coverage tests, comparisons of posterior widths under different follow-up strategies, and demonstration of unbiased recovery of intrinsic parameters in the presence of unknown follow-up processes. Appendices detail mathematical derivations, simulation implementation, and coverage validation.

The framework is general: it relies on standard assumptions of hierarchical models, Poisson processes, and the conditional independence encoded in the DAG of Fig. 2. The key novelty is the analytic proof that unknown follow-up probabilities factor out when the inference conditions on all catalog and follow-up data, so they can be ignored during inference.

No code or data release is explicitly mentioned; simulations use synthetic catalogs modeled after plausible astrophysical scenarios. The method applies broadly to surveys with complex or human-driven follow-up decision rules, common in real-world astronomical campaigns.

A concrete end-to-end example: simulate 500 events from a Gaussian population, generate catalog data x with measurement noise, and select events for follow-up with a logistic function of x that is hidden from the inference. Then condition on x, the detection indicators, and the follow-up data f where available, and run MCMC to infer μ and σ. Finally, validate that the posterior covers the truth and is centered correctly even though the follow-up selection is never modeled.
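The steps above can be sketched in a few dozen lines. This is a minimal illustration, not the authors' code: the helper names (`simulate`, `log_likelihood`), the measurement errors, the sigmoid thresholds, and the random seed are all assumptions, and a grid posterior with a flat prior stands in for MCMC. The rate N_E is dropped by using the shape-only likelihood, which corresponds to marginalizing N_E with a 1/N_E prior (also an assumption of this sketch).

```python
import numpy as np

SIGMA_X, SIGMA_F = 1.0, 0.2   # assumed known measurement errors
X_DET, W_DET = -0.5, 0.3      # assumed known detection sigmoid P(D|x)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def simulate(rng, n_events=800, mu=0.0, sigma=1.0):
    """Draw a population, apply detection, then a hidden follow-up rule."""
    theta = rng.normal(mu, sigma, n_events)
    x = rng.normal(theta, SIGMA_X)
    det = rng.random(n_events) < sigmoid((x - X_DET) / W_DET)
    theta, x = theta[det], x[det]
    # Follow-up selection depends on x only; the inference never sees this rule.
    fol = rng.random(x.size) < sigmoid((x - 2.0) / 0.3)
    f = rng.normal(theta[fol], SIGMA_F)
    return x, fol, f

def log_likelihood(mu, sigma, x, fol, f):
    """Shape-only log-likelihood with theta marginalized analytically."""
    a = sigma**2 + SIGMA_X**2   # var(x)
    b = sigma**2                # cov(x, f)
    c = sigma**2 + SIGMA_F**2   # var(f)
    dx = x - mu
    # Events without follow-up: x ~ N(mu, a).
    ll = np.sum(-0.5 * np.log(2 * np.pi * a) - 0.5 * dx[~fol] ** 2 / a)
    # Events with follow-up: (x, f) is bivariate normal with covariance [[a, b], [b, c]].
    det_cov = a * c - b**2
    dxf, df = dx[fol], f - mu
    quad = (c * dxf**2 - 2 * b * dxf * df + a * df**2) / det_cov
    ll += np.sum(-np.log(2 * np.pi) - 0.5 * np.log(det_cov) - 0.5 * quad)
    # Detection normalization P(D|mu, sigma), evaluated with a Riemann sum over x.
    grid = np.linspace(mu - 8 * np.sqrt(a), mu + 8 * np.sqrt(a), 2001)
    pdf = np.exp(-0.5 * (grid - mu) ** 2 / a) / np.sqrt(2 * np.pi * a)
    p_det = np.sum(pdf * sigmoid((grid - X_DET) / W_DET)) * (grid[1] - grid[0])
    return ll - x.size * np.log(p_det)

rng = np.random.default_rng(0)
x, fol, f = simulate(rng)

# Grid posterior over (mu, sigma) with a flat prior.
mus = np.linspace(-1.5, 1.5, 61)
sigmas = np.linspace(0.3, 2.0, 69)
logp = np.array([[log_likelihood(m, s, x, fol, f) for s in sigmas] for m in mus])
post = np.exp(logp - logp.max())
post /= post.sum()

mu_mean = np.sum(post.sum(axis=1) * mus)
sigma_mean = np.sum(post.sum(axis=0) * sigmas)
print(f"detected: {x.size}, followed up: {fol.sum()}")
print(f"posterior mean mu = {mu_mean:.2f}, sigma = {sigma_mean:.2f} (truth: 0, 1)")
```

The follow-up sigmoid appears only inside `simulate`. `log_likelihood` needs the detection model for the normalization term but has no argument through which the follow-up rule could enter, which mirrors the cancellation described in the derivation.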

Technical innovations

Proof that hierarchical Bayesian inference can produce unbiased intrinsic population posteriors without modeling unknown or complex follow-up selection functions P(F|D,x) when conditioned on full catalog and follow-up data.
Formulation of a joint hierarchical model that includes detection and follow-up as indicator variables, with a conditional independence structure that enables analytic cancellation of the follow-up selection.
Extension of standard population inference methods to implicitly marginalize over follow-up decisions, enabling use of heterogeneous and uncoordinated follow-up datasets without explicit coordination or modeling.
Application of mixture modeling to jointly infer rare subpopulation fractions and contaminant populations under unknown follow-up biases, demonstrating robustness of inference.

Datasets

Synthetic Gaussian toy catalogs: 500 detected events, 56 followed up; simulated for coverage tests (no public release mentioned)
Synthetic bright siren mock catalogs: 50 detected gravitational wave events with variable follow-up strategies; simulated
Synthetic dark companion search catalogs: 1396 candidates, 100 followed up, mixing dark companions and contaminants; simulated

Baselines vs proposed

No follow-up (spectral sirens only): H0 posterior std dev = 2.340 × injected H0
Follow-up lowest 10 mass events: H0 posterior std dev = 0.104 × injected H0
Follow-up highest 10 mass events: H0 posterior std dev = 0.068 × injected H0
Follow-up all 50 events: H0 posterior std dev = 0.035 × injected H0 (Fig. 3)
Proposed hierarchical model recovers intrinsic population mixing fraction λ = 3.9(+5.4/-2.4)% consistent with true 7.4%, despite follow-up bias increasing apparent fraction to 74% in follow-up subset (Fig. 4)

Figures from the paper

Figures are reproduced from the source paper for academic discussion. Original copyright: the paper authors. See arXiv:2605.06636.

Fig 1: (left) Distributions of true event parameters and data for an example mock catalog with 500 detected events of …

Limitations

The approach assumes all relevant initial catalog data x used in follow-up decisions are included and conditioned on; violations may bias inference.
Model misspecification of contaminant populations could bias inference of rare subpopulations when contaminants are present.
The method does not address optimization of follow-up selection strategies for maximal precision; follow-up strategy still affects constraint tightness.
No explicit adversarial or robustness tests against pathological follow-up schemes or measurement errors are presented.
No real survey data applications or software releases are provided; effectiveness on complex heterogeneous real data remains to be tested.
The analytic integrals rely on simplified Gaussian measurement error models; non-Gaussian likelihoods would require the per-event marginalization to be done numerically.

Open questions / follow-ons

How robust is the approach to incomplete or missing conditioning on all relevant catalog data used in follow-up decisions?
How does mis-specifying contaminant subpopulation models impact inferred properties of rare populations in practice?
Can one develop computationally efficient criteria to optimize follow-up selection strategies under finite observational resources within this framework?
How well does this hierarchical approach scale and perform on large real survey data with complex measurement uncertainties and heterogeneous follow-up data?

Why it matters for bot defense

Although this work concerns astrophysical population inference, its core insight has analogues in bot defense: when a follow-up or secondary selection process depends only on observed data, hierarchical Bayesian inference that conditions on all available initial and follow-up data does not need an explicit model of the follow-up probability. For CAPTCHA or bot-detection engineers, this suggests that if a multi-stage detection or filtering pipeline triggers additional verification steps (the analog of follow-up), and those triggers are difficult or impossible to model explicitly, self-consistent probabilistic inference on the underlying population (e.g., genuine users vs. bots) is still achievable. This could simplify complex infrastructures where follow-up verification decisions come from heterogeneous or adaptive heuristics.

As in astrophysics, the inference must condition on all observable data used in the selection. Modeling possible contaminants (e.g., false positives or benign anomalies) jointly with the population of interest may also be critical where initial selections are imperfect. The practical challenge of optimizing follow-up or verification effort under constrained resources parallels algorithms for adaptive CAPTCHA challenges. The framework therefore points to principled uncertainty quantification and bias avoidance in systems where follow-up decisions are only partially known or not known at all.

Cite

```bibtex
@article{arxiv2605_06636,
  title={What You Don't Know Won't Hurt You: Self-Consistent Hierarchical Inference with Unknown Follow-up Selection Strategies},
  author={Reed Essick and Amanda M. Farah},
  journal={arXiv preprint arXiv:2605.06636},
  year={2026},
  url={https://arxiv.org/abs/2605.06636}
}
```
