Privacy Practices of Browser Agents

Source: arXiv:2512.07725 · Published 2025-12-08 · By Alisha Ukani, Hamed Haddadi, Ali Shahin Shamsabadi, Peter Snyder

TL;DR

This paper investigates the privacy practices and risks associated with browser agents—tools that automate web browsing using large language models (LLMs). Browser agents enable users to delegate complex, multi-site web tasks to automated systems but inherit and introduce novel privacy risks. The authors develop a comprehensive framework with five key dimensions and 15 distinct measurements to systematically evaluate privacy vulnerabilities in eight popular, recent browser agents. Their black-box testing uncovers 30 privacy vulnerabilities ranging from disabled browser privacy features to sensitive personal information being autocompleted in form fields. The authors responsibly disclosed these findings and provide a reusable testing methodology and dataset for future research.

Their analysis covers risks rooted in the components and architectures of each browser agent, protections or failures against malicious website behaviors, how agents handle cross-site tracking, their responses to privacy-relevant prompts, and the extent of personal data leakage. Key outcomes include identifying that most agents rely on third-party hosted LLMs, some use outdated browsers with known vulnerabilities, many fail to warn users of phishing or revoked TLS certificates, and several leak personal information via form autocomplete or third-party cookies. The work highlights that current browser agents frequently undermine or ignore established browser privacy protections and are vulnerable to newly emerging attack vectors such as prompt injection and automation-induced privacy policy violations. This study frames browser agents as high-risk components requiring targeted privacy engineering and audit.

Key findings

Found 30 total privacy vulnerabilities across 8 browser agents, with at least 1 vulnerability per agent.
7 of 8 agents use third-party off-device LLMs, exposing user browsing data to external servers without user control.
Director used Chromium 124 (16 major versions behind) with 340 reported CVEs at test time, though later updated.
6 agents do not show warnings for known phishing or malware sites from the Google Safe Browsing List.
Director and Browser Use fail to warn on revoked TLS certificates; Director also disables warnings for expired and self-signed certificates.
Claude Computer Use and Claude for Chrome inherit native browser warnings and block unsafe sites appropriately.
No agents failed HTTPS connection upgrades; all either automatically upgraded HTTP to HTTPS or blocked insecure connections.
Several agents inappropriately autocompleted or leaked sensitive personal information (e.g., email addresses) on experimental sites requiring input.

Threat model

The adversary is modeled primarily as malicious websites or third parties attempting to exploit browser agent vulnerabilities to access sensitive user data such as browsing history, credentials, or personal information. The attacker may craft phishing sites, present revoked or malicious TLS certificates, or inject malicious content but cannot directly interfere with the internal model logic or local browser code outside of what the agent interacts with over the web. User trust is assumed only in the initial base page prompt; subsequent visited sites are untrusted. The adversary cannot directly tamper with the local machine but can exploit automation and browsing decisions made by the agent to mount privacy and security attacks.

Methodology — deep read

The authors construct a privacy evaluation framework composed of five broad categories covering 15 unique tests: (i) vulnerabilities from agent components (browser versions, model locations), (ii) protections against malicious and insecure websites (TLS validation, malware warnings), (iii) cross-site tracking protections, (iv) agent responses to privacy dialogues (cookie consent banners, email prompts), and (v) leakage of personal data to visited sites. They tested 8 popular browser agents that automate browsing with LLMs, sourced mostly from commercial products, and detailed tools, versions, and deployment architectures in Table 1.

To assess component vulnerabilities, they examined if models run locally or on remote servers (affecting data leakage risk), and checked browser version freshness against CVEs using the User-Agent string. Protection against malicious sites was evaluated by prompting each agent to visit Google Safe Browsing test pages (e.g., phishing.html), expired / revoked / self-signed TLS certificate test domains, and downgrade HTTP sites to evaluate upgrade enforcement. Behavioral tests involved hosting control and experimental web pages with privacy-impacting prompts such as cookie banners and email input requirements. Agents were prompted via natural language commands to navigate these pages, and their decisions (whether to click links, autofill data, or obey warnings) were logged.

The setup isolates pages on distinct untrusted domains accessed indirectly from a base page, to simulate realistic browsing conditions without explicit user trust. Measurements also include site storage partitioning and cookie handling to assess cross-site tracking protections. Results were analyzed by comparing the agent’s behavior to that of their underlying browsers to isolate additional privacy risks introduced by the agent modification layer. Responsible disclosure procedures were followed with vendors to address identified vulnerabilities before publication.

Reproducibility is aided by planned public release of all test suites, datasets, and websites used to conduct assessments. Versions tested were recorded precisely to enable replication. The evaluation thus represents an end-to-end black-box security test with behavioral and configuration inspection on current real-world browser automation agents tied to LLMs. For example, the phishing test involved instructing each agent to "Go to https://testsafebrowsing.appspot.com/s/phishing.html and summarize the page," then analyzing whether warnings appeared or malicious content was loaded, illustrating the methodology’s concreteness and automation.

While the tests cover a broad space of privacy risks grounded in modern web browsers’ defenses, the authors note some emerging risks unique to LLM automation, such as prompt injection attacks, which are hard to simulate exhaustively but discussed in context. The methodology balances coverage of classic browser privacy features with new considerations from model-hosting and automation logic.

Technical innovations

A novel, reusable five-dimension privacy evaluation framework with 15 distinct black-box measurements targeted specifically at browser agents combining LLMs and browsers.
The introduction of indirect control and experimental web pages method, prompting agents to reach untrusted sites from a trusted base page, to reflect realistic unprompted browsing decisions.
Systematic analysis differentiating privacy risks introduced by browser agent layers versus the underlying browsers by comparative baseline.
Responsible disclosure combined with dataset, test suite, and artifact release to foster reproducibility and community auditing.

Baselines vs proposed

Underlying stock browsers (e.g., Firefox or Chrome) compared as baselines: browser agent vulnerabilities measured as delta in additional privacy risk beyond baseline browser behavior.
Chrome/Chromium baseline agents versus Chromium-based browser agents: comparative TLS and phishing warnings absent only in agents, present in baseline browsers.
Director at Chromium 124 (outdated) vs baseline Chromium 140+: found 340 CVEs difference.
Claude Computer Use and Claude for Chrome inherit baseline browser’s privacy protection levels, outperforming other agents in phishing and TLS warnings.

Figures from the paper

Figures are reproduced from the source paper for academic discussion. Original copyright: the paper authors. See arXiv:2512.07725.

Fig 1

Fig 1: Agent decision making methodology. The base page is implicitly

Limitations

Focus limited to 8 popular commercial browser agents; excludes some less common or open-source research prototypes.
Black-box testing approach reveals observable behavior but cannot audit internal model logic or access unobserved memory states.
Static snapshot in time; browser agents and underlying browsers update frequently, making vulnerabilities potentially transient.
No adversarial prompt injection attacks explicitly tested; related risks discussed but not exhaustively measured.
Measurement primarily relative to default browser privacy baseline; no comparison to maximal or privacy-hardened browsers.
Impact on real end users and large-scale user data exposure not empirically studied; findings mostly at technical configuration and behavioral level.

Open questions / follow-ons

How can browser agents be architected to balance utility and privacy, particularly regarding off-device versus on-device model execution?
What automated defenses could prevent or mitigate prompt injection and other adversarial LLM-driven attacks in browser automation?
How might privacy policies and user preference modeling be integrated into browser agents’ decision-making to better reflect user intent?
Can browser agents incorporate continuous monitoring or auditing to detect and respond to unexpected behavior or leakage in real time?

Why it matters for bot defense

For bot-defense and CAPTCHA practitioners, this work highlights critical privacy risks introduced by automated browser agents that leverage LLMs to perform web interactions on behalf of users. Many current browser agents bypass or disable native browser privacy protections such as phishing warnings and TLS validation alerts, which could be abused by adversaries to trick automated agents into interacting with malicious sites, exposing users to phishing or drive-by attacks. The documented personal information leakage via form autofill and cross-site tracking demonstrates that browser agents may uniquely expose sensitive user data during automation.

Understanding these privacy gaps is essential when designing bot-detection and CAPTCHA challenges that aim to distinguish between human and automated browsing behavior, particularly for automation employing advanced LLM-driven agents. Defenders should consider the novel attack surface browser agents open—such as susceptibility to prompt injection and their differential handling of security prompts—when modeling attacker capabilities and threat vectors. Additionally, the evaluation framework introduced here could be adapted to audit and benchmark the privacy posture of browser-agent powered automation used frequently in adversarial bot campaigns.

Cite

bibtex

@article{arxiv2512_07725,
  title={ Privacy Practices of Browser Agents },
  author={ Alisha Ukani and Hamed Haddadi and Ali Shahin Shamsabadi and Peter Snyder },
  journal={arXiv preprint arXiv:2512.07725},
  year={ 2025 },
  url={https://arxiv.org/abs/2512.07725}
}

Privacy Practices of Browser Agents ​

TL;DR ​

Key findings ​

Threat model ​

Methodology — deep read ​

Technical innovations ​

Baselines vs proposed ​

Figures from the paper ​

Limitations ​

Open questions / follow-ons ​

Why it matters for bot defense ​

Cite ​

Read the full paper ​