Selection Integrity for LLM Graph Memory: An Accumulability Criterion for Information-Flow-Blind Retrieval
Source: arXiv:2606.12290 · Published 2026-06-10 · By Zeming Fei, Hongming Fei, Xiaoyang Wang, Yang yang, Prosanta Gope, Biplab Sikdar et al.
TL;DR
This paper addresses a fundamental blind spot in provenance-based defenses for long-term graph memories used by LLM agents. Existing defenses verify the authenticated provenance of the retrieved facts or passages, but fail to capture manipulation of the global graph structure that determines which facts are selected for answering or action. The authors show that manipulations of the graph’s structure—even when done by an untrusted principal and without altering any visible passages—can fully hijack which authenticated facts are chosen, enabling silent, irreversible harms despite faithful provenance control. They introduce a formal accumulability criterion characterizing when selectors are vulnerable, proving an impossibility result that provenance verification alone cannot close the selection-manipulation channel. Their defense, AUTHSELECT, enforces selection integrity by replaying the selector on a provenance-filtered authenticated subgraph and rejecting any structural-write-induced changes. This eliminates all structural-write attacks across multiple real graph-memory systems and datasets with near-zero overhead (2–3% latency) and no overblocking. The work highlights that threats arise from reallocation of selection membership by unauthenticated structure, not from structural reliance per se. Experimental analysis confirms that popular graph selectors like Personalized PageRank are vulnerable, while others like Graphiti's node-distance reranking are provably immune. Overall, the paper illuminates a previously unrecognized attack vector that can significantly undermine LLM agent safety and presents a practical, efficient defense with provable guarantees.
Key findings
- A single no-source structural write silently misdirected 28 irreversible ledger transfers out of 499 live actions; faithful information-flow control (IFC) permitted all, while AUTHSELECT prevented all (Section 6.4, Figure 4).
- Selectors admit the selection integrity attack channel exactly when their structural term can reallocate an Omega(1) share of top-k membership past a selected fact’s margin (Criterion 1).
- Personalized PageRank always satisfies membership reallocatability and opens the channel for low-floor targets, while rerankers with fixed candidate pools like Graphiti's node-distance are immune (Table 1, Figure 2).
- AUTHSELECT eliminates all 18 blind rows of selection divergence (Q ∈ D) tested across four datasets and three graph-memory systems at zero overblocking and only 2–3% latency overhead (Section 6.3).
- Every existing published agent-memory provenance defense tested—including A-MemGuard and a production memory firewall—is blind to the structural-write channel (Section 6.3).
- The key threat model requires untrusted structural writes outside the justification closure (WU), a capability distinct from inject-only attackers addressed in prior poisoning work (Section 3.3).
- Removing unauthenticated structural writes and re-running selection closes the channel completely and is necessary; no item-level filtering or reranking can do this alone (Corollary 1, Section 4.2).
- Structural reliance as Jaccard overlap of top-k sets does not predict channel openness, but measured adversarial evictability (selection divergence) does (Figure 2, Table 1).
Threat model
The adversary is an untrusted principal with write access to the graph structure (edges, merges, relations) of the LLM agent’s long-term memory but cannot forge source-channel provenance labels, modify the trusted planner, or corrupt authenticated content passages. Their capability is to perform no-source structural writes invisible in retrieved passages or justifications, opening a covert channel that re-routes authenticated selections. Content-only injectors without structural write ability cannot open this channel. The defender assumes the integrity of labels on content and that replaying selection on the authenticated subgraph removes unauthorized influence.
Methodology — deep read
The work begins with a formal threat model: an adversary controls untrusted write access to graph structure—edges, merges, relation assertions—that are not authenticated, but cannot forge labels or compromise trusted content sources. Such untrusted structural writes (WU) may not appear in the justification (the cited evidence) but can alter how a global graph selector chooses facts.
Data comes from multiple real-world long-term memory graph substrates, including HippoRAG2 (which uses Personalized PageRank selection), Graphiti (supporting node-distance and Leiden community selectors), and various knowledge graph datasets. The queries induce agent memory actions ranging from question answering to tool calls. Data splits incorporate multi-session traces, with careful provenance labeling of all edges and nodes by source channels.
The core architecture includes a native selector Sel(G, Q) operating over a memory graph G for a query Q, producing a top-k set of facts. Each graph object receives an integrity label ℓ from the source channel, defining the authenticated subgraph Gauth = G \ WU with unauthenticated structural writes removed. The reader uses Read(Sel(G, Q), Q) to produce an answer aG with justification J(aG).
The key algorithmic contribution is AUTHSELECT (Algorithm 1): on query Q, compute native answer aG from G, then replay the selector on Gauth without unauthenticated structure, yielding aGauth. If results differ, AUTHSELECT rejects aG and uses aGauth or escalates. This guarantees that answers influenced by untrusted structural writes are never accepted.
Training and tuning primarily involve hyperparameters for graph selectors like Personalized PageRank damping and k-top selection sizes. No stochastic training is used. Evaluations run deterministic planners at temperature 0 to approximate deterministic readers.
Evaluation measures harmful divergences D = {Q : J(aG) all authenticated but aG ≠ aGauth }, capturing the blind channel. Tests cover three graph-memory systems, four datasets, and multiple selectors. They measure attack impact (e.g., misdirected ledger transfers), defense coverage (blindness rates), overhead latency, and overblocking. Baselines include published provenance defenses and faithful reproductions thereof.
The paper includes formal proofs (in Appendix C) for the criterion, impossibility, and immune-vs-open selector classes. Empirically, they demonstrate the attack with a concrete multi-agent misdirected transfer (Figure 4) and run statistical tests for adversarial evictability rates. Most experiments use open datasets; some code and implementations are not fully released but re-implementations are described.
End-to-end, an example run has a compromised writer insert no-source structural edges that cause Personalized PageRank selection to favor incorrect but authenticated nodes, causing a victim agent to execute a misdirected transfer. AUTHSELECT removes those edges and re-computes selection, restoring correct transfers with zero availability loss. This concretely demonstrates the necessity and effectiveness of the defense.
Technical innovations
- A formal accumulability criterion characterizing exactly when selectors admit a blind structural write channel on graph memory selection integrity.
- Proof that faithful provenance checking over retrieved records alone cannot detect structural write attacks; closing this requires replaying selection on an authenticated subgraph.
- AUTHSELECT, a principled defense that removes all unauthenticated structural writes before selection and accepts only stable answers, achieving zero harm and near-zero overhead.
- Demonstration that structural reliance on graph structure does not predict vulnerability; reallocatability of top-k membership is the key insight driving the channel’s presence.
- A capability taxonomy separating inject-only (content) attackers from structural-write attackers, clarifying threat models in agent graph memory security.
Datasets
- HotpotQA — size not specified — public question-answering dataset
- MuSiQue — size not specified — multi-agent shared-memory dataset constructed for multi-session agent memory trace
- Graphiti Knowledge Graph datasets — various sizes — underlying graph-based agent memory systems
Baselines vs proposed
- A-MemGuard (original published defense): blind to structural write channel with 18/18 blind rows vs AUTHSELECT: 0/18 blind rows
- Production memory firewall: catches content injection attacks but blind to structural write channel vs AUTHSELECT: closes channel completely
- Native Personalized PageRank selector: adversarial evictability Sadv=high opening channel vs Graphiti node-distance reranker Sadv≈0 immune
- AUTHSELECT latency overhead: 2–3% vs 0% baseline selector
- AUTHSELECT over-block: zero vs 0% baseline
Limitations
- Scope is limited to structural-write attacks where untrusted writes alter graph edges but not corruption of trusted planners or label forging.
- Measured harms reflect QA-answer and sandboxed tool-call decisions, not irreversible real-world side effects or policy-rule creations outside tested scope.
- Evaluated implementation uses insertion-order tail snapshot for write provenance; general min-inheritance taint-propagation is formal but instantiated less fully.
- Evaluation depends on deterministic reader assumption (planner at temperature 0); stochastic generation or alternate planning may reduce reproducibility of divergence set.
- Code and datasets partly closed or re-implemented rather than fully open-source, limiting full external replication.
- Criterion and proofs assume selectors decompose into increasing functions of content and structural terms; selectors outside this model remain untested.
Open questions / follow-ons
- How to extend AUTHSELECT or related defenses to dynamic or probabilistic selectors that do not have deterministic selection outputs?
- Can general min-inheritance provenance tracking and taint-propagation engines realize the formal integrity policies more fully in complex production graph memories?
- What are the performance and usability tradeoffs of selection integrity enforcement in large-scale, low-latency agent memory search deployments with deep multi-hop queries?
- How do structural write attacks manifest and can they be defended against in multi-modal or multi-agent memory graphs where modalities beyond text are integrated?
Why it matters for bot defense
For bot-defense and CAPTCHA practitioners, this work exposes a subtle but impactful attack vector on LLM agents that rely on long-term graph-based memory. Provenance-based defenses that only verify retrieved fact authenticity miss structural manipulations in the retrieval graph that can silently alter the agent’s chosen outputs. This may allow compromised agents or insider attackers to bypass judgement or action checks despite appearing fully verified. Practitioners building bot detection or authentication systems should consider graph-memory selection integrity when LLMs use large knowledge graphs or multi-agent shared memories with writable structure. The accumulability criterion helps predict which selectors are vulnerable, guiding safer selector choice or motivating defenses like AUTHSELECT that replay selection on authenticated subgraphs to close the blind channel. Such selection-integrity enforcement advances trustworthy LLM deployments where kernel memory is mutable but must be resilient to subtle insider or poisoning manipulations beyond simple content injections.
Cite
@article{arxiv2606_12290,
title={ Selection Integrity for LLM Graph Memory: An Accumulability Criterion for Information-Flow-Blind Retrieval },
author={ Zeming Fei and Hongming Fei and Xiaoyang Wang and Yang yang and Prosanta Gope and Biplab Sikdar and Ying Zhang },
journal={arXiv preprint arXiv:2606.12290},
year={ 2026 },
url={https://arxiv.org/abs/2606.12290}
}