WebTrap: Stealthy Mid-Task Hijacking of Browser Agents During Navigation
Source: arXiv:2605.08310 · Published 2026-05-08 · By Zhichao Liu, Wenbo Pan, Haining Yu, Ge Gao, Tianqing Zhu, Xiaohua Jia
TL;DR
This paper addresses security vulnerabilities in browser agents that perform long-horizon navigation and interaction workflows. Existing prompt injection attacks often fail to achieve end-to-end attacker goals in complex real-world environments, and they are easily detected because they disrupt the user task. To overcome these shortcomings, the authors propose WebTrap, a mid-task hijacking attack that injects carefully crafted instructions at multiple stages of an agent's navigation so that attacker and user goals fuse into one workflow. The compromised agent temporarily performs the attacker goal and then returns to the user task without noticeably degrading usability. WebTrap combines multi-step instruction fusion with context-grounded generation so that the injected content aligns naturally with the environment and with system instructions.
Experimental evaluation on extended versions of the WASP web browser and InjecAgent file browser benchmarks demonstrates that WebTrap achieves substantially higher attack success rates and maintains system usability under attack compared to prior methods. Notably, WebTrap attains up to 77.78% end-to-end attack success on Reddit tasks (versus 27.78% max for baselines), while preserving over 80% user utility under attack. The attack also remains robust against state-of-the-art defense techniques, often forcing trade-offs that significantly degrade usability. Exploration of agent navigation traces reveals WebTrap exploits navigation inertia and path-validity degradation over time, enabling sustained stealthy hijacking during extended interactions. This work exposes a critical new vulnerability in long-horizon browser agents and demonstrates how adversaries can hijack their workflows mid-task without alerting users.
Key findings
- WebTrap achieves up to 77.78% ASR-End-to-End on Reddit medium tasks versus the strongest baseline at 27.78% (Table 1).
- WebTrap maintains high Utility Under Attack (UUA), e.g. 83.33% on long Reddit tasks compared to baselines mostly below 50% (Table 1).
- The dual-goal success rate (both attacker and user goals completed) of WebTrap is up to 47.62% in the long web browser setting, substantially higher than baselines below 20% (Table 2).
- WebTrap requires only three injection nodes, improving stealthiness by tightly fusing attacker and user goals instead of direct goal replacement.
- WebTrap remains effective under major defense baselines, forcing defenses to degrade utility significantly to mitigate the attack (Table 4).
- Agent navigation trajectory analysis shows WebTrap forms longer hijacked action sequences, exploiting navigation inertia and path-validity degradation to sustain hijacking.
- In file browser environments, WebTrap achieves up to 70% ASR-I and 60% UUA, outperforming InjecAgent baselines with maximum ASR-I around 30% (Table 3).
- Even simple single-step injections frequently hijack the agent mid-task without disrupting the user flow (Figure 1).
Threat model
The adversary can inject malicious instructions only into user-observable environment content (e.g., webpage or file system nodes) and cannot alter system prompts, the underlying language model, or internal agent components. The attacker does not know the precise user task or the structure of the restricted area, but can observe the user area and has a defined attacker goal that requires navigation into a restricted area. The attacker limits injections to at most three nodes with controlled token counts to reduce detection risk.
Methodology — deep read
Threat Model and Assumptions: The adversary can inject malicious content only into user-accessible parts of the environment (the user area) and cannot modify system prompts, the underlying LLM, or system components. The attacker does not know the exact user task being executed or the details of restricted areas. Injection cost is limited to at most three nodes with bounded token counts. Attacks rely solely on the visible environment and the inferred attacker goal.
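The injection budget above can be expressed as a simple check. The following is a minimal sketch, not the authors' code: the `Injection` type, the `within_budget` helper, the whitespace tokenizer, and the 200-token bound are all hypothetical, since the summary only states "at most three nodes with bounded token counts".

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Injection:
    node_id: str  # hypothetical identifier of a page or file node in the user area
    text: str     # content the attacker places in that node


def within_budget(
    injections: Iterable[Injection],
    max_nodes: int = 3,              # stated in the threat model
    max_tokens_per_node: int = 200,  # assumed value; the summary gives no number
) -> bool:
    """Return True if the plan touches at most max_nodes distinct nodes and
    every injected text stays within the per-node token bound."""
    plan = list(injections)
    if len({inj.node_id for inj in plan}) > max_nodes:
        return False
    # Whitespace splitting is a crude stand-in for a real tokenizer.
    return all(len(inj.text.split()) <= max_tokens_per_node for inj in plan)
```

A check of this kind is also useful on the evaluation side, for confirming that a compared attack stays inside the same budget before its success rates are reported.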
Data and Environments: The authors extend existing benchmarks WASP (web browser) and InjecAgent (file browser) to create long-horizon, navigation-dominant tasks with controllable depth. These environments simulate multi-step workflows spanning user and restricted areas, where the attacker goal requires navigating into restricted areas. The extensions increase navigation length to better accumulate environmental observations and potential prompt injections.
Architecture/Algorithm: WebTrap's attack consists of three stage-wise instruction traps: lure, inertia, and payload. Each trap injects text with three parts: a local rationale grounded in the environmental context (B_k), a routing directive (R_k) that guides the agent toward the next trap or the payload, and a coupling clause (C_k) that fuses the attacker goal g_a and the user goal g_u into a single seamless workflow. The lure trap builds operational inertia without revealing attacker intent. The inertia trap frames the attacker goal as a prerequisite step before resuming the user task. The payload trap instructs the agent to execute the attacker goal in the restricted area and then return to continue the user goal.
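The three-stage, three-part structure described above can be modeled as a small data type, for example when building red-team fixtures or labeling traces. This is a structural sketch only: the class and field names, the concatenation order in `render`, and the placeholder strings are assumptions, and the paper's actual prompt templates are not reproduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class Stage(Enum):
    LURE = "lure"
    INERTIA = "inertia"
    PAYLOAD = "payload"


@dataclass(frozen=True)
class Trap:
    stage: Stage
    rationale: str  # local rationale tied to the page or directory context (the B_k part)
    routing: str    # directive pointing to the next trap or the payload (R_k)
    coupling: str   # clause linking attacker goal g_a with user goal g_u (C_k)

    def render(self) -> str:
        """Join the non-empty parts in a fixed order (the order is an assumption)."""
        parts = (self.rationale, self.routing, self.coupling)
        return " ".join(p.strip() for p in parts if p.strip())


def is_valid_sequence(traps: Sequence[Trap]) -> bool:
    """True only for exactly one lure, one inertia, and one payload trap, in that order."""
    return [t.stage for t in traps] == [Stage.LURE, Stage.INERTIA, Stage.PAYLOAD]
```

Keeping the three parts as separate fields, rather than one string, makes it possible to ablate a single part in an experiment without rewriting the others.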
The attack text is synthesized using a context-grounded generation method that aligns injected instructions with the local webpage/file system context (titles, URLs, directories) and adopts system-instruction-like language style to evade system-level filtering.
Training/Generation: Although the paper does not involve model training, the injection texts are generated by an attacker model that leverages environmental information and task specifics. This generation involves prompt templates for each trap stage and constraints enforcing conformity to system instruction style.
Evaluation Protocol: Experiments use DeepSeek-V3.1-Terminus as the base agent model. Metrics include ASR-End-to-End (binary success of completing attacker goal), ASR-Intermediate (presence of attack intent at any step), Utility Under Attack (user task success rate under attack), and Dual-Goal success (both goals completed). Evaluation is conducted across multiple navigation depths and task variants on extended WASP and InjecAgent tasks, with comparisons to multiple prior injection attacks (TopicAttack, Combined Attack, WASP baselines, InjecAgent baselines) and defenses (System Defense, Step-wise Defense, Goal-RI, Segment-Remove). Both one-time sampling and best-of-n (n=3) are used to simulate realistic and upper-bound attack effectiveness.
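The four metrics above reduce to simple rates over per-episode outcomes. The sketch below assumes each episode is recorded as three boolean flags; the `Episode` type and the function names are hypothetical. The best-of-n rule, which keeps the run most favorable to the attacker, is also an assumption, since the summary does not say how the n=3 samples are aggregated.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Sequence


@dataclass(frozen=True)
class Episode:
    attacker_goal_done: bool  # attacker goal fully completed (ASR-End-to-End)
    attack_intent_seen: bool  # attack intent present at any step (ASR-Intermediate)
    user_goal_done: bool      # user task completed under attack (UUA)


def rate(flags: Iterable[bool]) -> float:
    """Percentage of True flags; 0.0 for an empty input."""
    flags = list(flags)
    return 100.0 * sum(flags) / len(flags) if flags else 0.0


def summarize(episodes: Sequence[Episode]) -> Dict[str, float]:
    """Compute the four rates described in the evaluation protocol."""
    return {
        "ASR-E": rate(e.attacker_goal_done for e in episodes),
        "ASR-I": rate(e.attack_intent_seen for e in episodes),
        "UUA": rate(e.user_goal_done for e in episodes),
        "Dual": rate(e.attacker_goal_done and e.user_goal_done for e in episodes),
    }


def best_of_n(runs: Sequence[Episode]) -> Episode:
    """Pick the run most favorable to the attacker (assumed aggregation rule)."""
    if not runs:
        raise ValueError("need at least one run")
    return max(
        runs,
        key=lambda e: (e.attacker_goal_done, e.attack_intent_seen, e.user_goal_done),
    )
```

Note that Dual can never exceed either ASR-E or UUA, which is a quick sanity check when reading tables that report all three.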
Reproducibility: The code to generate WebTrap injections and run experiments is publicly released at https://github.com/liuyaojialiuyaojia/WebTrap. The extended environments are based on publicly known WASP and InjecAgent but with modifications described in appendices. The paper provides detailed prompt template descriptions and generation constraints.
Example End-to-End: In the GitLab web browser environment, the agent begins a user task navigating nodes v0 to vT. WebTrap injects a lure trap at v0 that embeds a navigation hint compatible with the user task, causing the agent to alter its path without suspecting an attack. Upon reaching a subsequent node, an inertia trap further entangles the attacker goal as a prerequisite, manipulating the agent's reasoning. Finally, the payload trap instructs the agent to enter a restricted area and perform a malicious operation, after which the agent autonomously returns to the original user flow. The entire process uses only three injection points, maintains task usability, and results in successful attacker goal completion as verified by ASR metrics.
Technical innovations
- Multi-step instruction fusion steering that tightly fuses attacker and user goals into a single sequential workflow, enabling mid-task hijacking without replacing the user goal.
- Context-grounded generation method producing injected instructions aligned with local environment content and system instruction style, improving stealthiness and evasion of system-level defenses.
- Design of a three-stage trap sequence (lure, inertia, payload) that progressively shapes the agent's decision-making to perform attacker goals and then seamlessly resume the original task.
- Extension of browser agent benchmarks (WASP and InjecAgent) to simulate long-horizon navigation tasks enabling thorough evaluation of multi-step injection attacks and defenses.
Datasets
- Extended WASP web browser environment — size unspecified — extended from public WASP benchmark
- Extended InjecAgent file browser environment — size unspecified — extended from public InjecAgent benchmark
Baselines vs proposed
- TopicAttack [3]: ASR-E = 0-4.17%, UUA = 0-58.33% vs WebTrap ASR-E = 50-77.78%, UUA = 41.67-83.33% (Table 1)
- Combined Attack [20]: ASR-E = 0-37.5%, UUA = 8.33-62.5% vs WebTrap ASR-E = 50-77.78%, UUA = 41.67-83.33% (Table 1)
- Hijacking Text [10]: ASR-E = 11.11-58.33%, UUA = 37.5-91.67% vs WebTrap higher in most settings (Table 1)
- Generic Injection [26] in file browser: ASR-I = 10-25%, UUA=20-45% vs WebTrap ASR-I=40-70%, UUA=40-60% (Table 3)
- Under defenses, e.g. System Defense [28] reduces WebTrap ASR-E from 91.67% to 37.5% while UUA drops from 91.67% to 87.5%; other defenses similarly trade attack mitigation against usability (Table 4)
Limitations
- Evaluations rely on simulation environments extended from WASP and InjecAgent, which may not fully capture all real-world browser and file navigation complexities.
- The attacker knowledge model is limited to the visible environment and assumes no access to system internals; this constrains the attacker but may also leave some attack vectors unexamined.
- Defense evaluation covers state-of-the-art instruction and context filtering methods but lacks adversarially adaptive defense mechanisms that could be developed in response.
- The agent model used (DeepSeek-V3.1-Terminus) represents a specific architecture; results might vary with different or future LLM-integrated agents.
- The generation of injected instructions is partly heuristic-guided, and the robustness of WebTrap against more advanced or fine-tuned defense LLMs is unclear.
- No user studies are presented to assess the practical detectability by human users, especially under stealthy conditions.
Open questions / follow-ons
- Can defense methods be developed that specifically disentangle mixed attacker and user goals embedded by multi-step fusion injections like WebTrap?
- How resilient is WebTrap against adaptive or fine-tuned language model defenses that learn to identify context-grounded trap patterns?
- What are the practical limits of injection stealthiness against human users monitoring long-term agent behavior and UI changes?
- Can WebTrap-like mid-task hijacking attacks be generalized beyond browser agents to other long-horizon, interactive AI systems with sequential decision-making?
Why it matters for bot defense
For bot-defense and CAPTCHA practitioners, WebTrap reveals fundamental vulnerabilities in long-horizon browser agents that arise from repeated environmental observations and multi-step navigation. Detection and mitigation strategies that focus solely on direct goal-replacement injections are inadequate against these stealthier multi-step fused attacks. Practitioners should consider defenses that analyze agent decision trajectories over time to detect anomalous detours that represent embedded attacker goals. WebTrap's reliance on early intervention and navigation inertia suggests that continuous behavioral monitoring combined with context-aware filtering may be essential. Additionally, injecting challenge-response tests or verification steps at critical navigation junctures could disrupt attacker-controlled workflow fusions. WebTrap also highlights the importance of designing agent architectures that keep goals disentangled and make robust path-validity judgments over long interactions, so that they resist stealthy hijacking. Although the attacks target browser agents, the principles extend to any interactive agent that executes complex multi-step user workflows, which makes them relevant to CAPTCHA systems integrated into autonomous workflows.
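As one illustration of trajectory-level monitoring, the sketch below flags contiguous runs of steps that leave the area the user task is expected to stay in. It is an assumed, simplified detector rather than a defense evaluated in the paper: the `find_detours` name, the prefix-based notion of an allowed area, and the example paths are all hypothetical.

```python
from typing import List, Sequence, Tuple


def find_detours(
    trajectory: Sequence[str],
    allowed_prefixes: Sequence[str],
    min_len: int = 1,
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges of consecutive steps whose
    location matches none of the allowed prefixes."""
    detours: List[Tuple[int, int]] = []
    start = None
    for i, node in enumerate(trajectory):
        inside = any(node.startswith(p) for p in allowed_prefixes)
        if not inside and start is None:
            start = i  # a detour begins at this step
        elif inside and start is not None:
            if i - start >= min_len:
                detours.append((start, i))
            start = None  # the agent is back in the allowed area
    # Close a detour that is still open when the trajectory ends.
    if start is not None and len(trajectory) - start >= min_len:
        detours.append((start, len(trajectory)))
    return detours


# Hypothetical trace: the agent leaves the forum area and later returns to it.
trace = [
    "/r/books",
    "/r/books/post/1",
    "/settings/account",
    "/settings/account/delete",
    "/r/books/post/1",
]
print(find_detours(trace, allowed_prefixes=["/r/"]))
```

A detour that ends with a return to the allowed area is the pattern of interest here, because a mid-task hijack that resumes the user task would look like a completed user task to a monitor that only checks the final state.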
Cite
@article{arxiv2605_08310,
title={WebTrap: Stealthy Mid-Task Hijacking of Browser Agents During Navigation},
author={Zhichao Liu and Wenbo Pan and Haining Yu and Ge Gao and Tianqing Zhu and Xiaohua Jia},
journal={arXiv preprint arXiv:2605.08310},
year={2026},
url={https://arxiv.org/abs/2605.08310}
}