SoK: Taxonomizing the Low-Level Attack Surface of Modern Web Browsers

Source: arXiv:2606.16646 · Published 2026-06-15 · By Han Zheng, Qinying Wang, Qiang Liu, Mathias Payer

TL;DR

This paper provides a comprehensive systematization of the low-level attack surface in modern web browsers, focusing on memory corruption vulnerabilities as the dominant exploit vector. The authors unify the architectures of Chrome, Firefox, and Safari into a detailed Input × Component × Privilege taxonomy, which maps attacker-controlled input classes to the browser components and their privilege tiers. They then classify 2,233 memory corruption bugs disclosed between 2016 and 2025 against this taxonomy, exposing patterns in where vulnerabilities concentrate across the browser stack.

Overlaying a decade of academic fuzzing campaigns onto this bug map reveals significant gaps: while testing efforts cluster on well-studied script and document inputs, many high-impact surfaces remain under-tested, including WebAPI bindings, GPU/WebGL backends, IPC handlers, and UI event processing. The authors identify three recurring fuzzer deployment gaps — limited configuration, oversimplified harnesses, and missing harnesses — that further limit coverage of these attack-dense components. The work offers a structured foundation and prioritization guide for future browser security testing research.

Key findings

The authors develop an Input × Component × Privilege taxonomy capturing 5 attacker input classes, 9 browser components, and 4 OS privilege tiers across Chrome, Firefox, and Safari.
They classify 2,233 memory corruption vulnerabilities from 2016–2025 by component and input, finding that WebAPI bindings, PDFium SDK, WebGL backend, UI input path, and IPC receivers combine high bug density with insufficient fuzzing coverage.
Binary blob inputs mainly threaten low-privilege parsers: 79% of the 170 binary-blob bugs sit at low privilege, where fuzzing coverage is well aligned (50%–90%). Privileged binary components show uneven coverage (e.g., device brokers at 1%–12%).
Document inputs are the largest bug class (1,051 bugs). WebAPI subcomponents account for 485 of these bugs but have only 17.4% fuzzing coverage, while document bodies receive 41.5%, a sharp misalignment between bug density and coverage.
Script inputs (371 bugs) do not produce high-privilege bugs. The JavaScript engine sees 64.7% coverage and WebGPU 73.6%, but the WebGL backend sees only 34%; vendor shader compilers are closed-source, so their coverage is unmeasured despite many known bugs.
UI gestures account for 283 bugs, concentrated almost exclusively in the high-privilege browser process, yet no dedicated fuzzers target them; whole-browser harnesses provide only 14–34% coverage.
IPC input accounts for 358 bugs with 61% coverage overall, but key high-privilege IPC handlers reach only 15%–28% coverage.
The study identifies three key deployment gaps limiting fuzzing efficacy: limited configuration, oversimplified harnesses, and missing harnesses.
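
As a quick sanity check on the figures above, the per-input bug counts can be summed and converted to shares of the total. This is a minimal sketch: the dictionary keys are illustrative labels chosen here, not identifiers from the paper, and it assumes each bug is assigned to exactly one input class.

```python
# Bug counts per input class, as quoted in the key findings.
# The key names are illustrative labels, not the paper's identifiers.
BUGS_BY_INPUT = {
    "binary_blob": 170,
    "document": 1051,
    "script": 371,
    "ui_gesture": 283,
    "ipc": 358,
}

TOTAL_REPORTED = 2233  # memory corruption bugs classified in the study


def input_shares(counts):
    """Return each input class's share of all bugs, in percent (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total = sum(BUGS_BY_INPUT.values())
shares = input_shares(BUGS_BY_INPUT)

for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {BUGS_BY_INPUT[name]:5d} bugs  {pct:5.1f}%")
print("sum matches reported total:", total == TOTAL_REPORTED)
```

The five counts add up to the reported 2,233, which is consistent with the single-class assumption, and document inputs alone account for close to half of all classified bugs.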

Threat model

A remote adversary controls web-facing inputs such as page content, scripts, and binary blobs, and may trick a user into performing malicious UI gestures. Once the renderer is exploited, the attacker fully controls renderer process memory and then attempts to escalate privileges by sending crafted IPC messages to higher-privilege browser components or by invoking vulnerable high-privilege processes. Kernel-level code execution is in scope as a final target, but the adversary cannot directly corrupt kernel memory without first escalating through browser components.

Methodology — deep read

The authors first define the threat model as a remote attacker controlling web inputs and able to fully control the renderer process memory, seeking to escalate privileges up to the kernel level. They assume an attacker can send crafted IPC messages to higher-privilege browser components once inside the renderer sandbox.

They derive the attack surface taxonomy from detailed study of the architectures of Chrome, Firefox, and Safari using public design documents. This Input × Component × Privilege taxonomy organizes browser inputs into five classes (binary blobs, documents, scripts, UI gestures, IPC calls) processed by nine components (HTML engine, PDF engine, JS engine, media decoder, third-party parsers, GPU, network, utility processes, browser process) at four privilege tiers from low to kernel.
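
The three axes of the taxonomy lend themselves to a small data model. The sketch below is an illustration only: the class and member names are invented here, and the two middle privilege tier names (`MEDIUM`, `HIGH`) are placeholders, since this summary only states that the four tiers run from low to kernel.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class InputClass(Enum):
    BINARY_BLOB = "binary blob"
    DOCUMENT = "document"
    SCRIPT = "script"
    UI_GESTURE = "UI gesture"
    IPC_CALL = "IPC call"


class Component(Enum):
    HTML_ENGINE = "HTML engine"
    PDF_ENGINE = "PDF engine"
    JS_ENGINE = "JS engine"
    MEDIA_DECODER = "media decoder"
    THIRD_PARTY_PARSER = "third-party parser"
    GPU = "GPU"
    NETWORK = "network"
    UTILITY_PROCESS = "utility process"
    BROWSER_PROCESS = "browser process"


class Privilege(IntEnum):
    # Ordered so tiers can be compared; MEDIUM and HIGH are placeholder names.
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    KERNEL = 3


@dataclass(frozen=True)
class SurfaceNode:
    """One cell of the Input x Component x Privilege taxonomy."""
    input_class: InputClass
    component: Component
    privilege: Privilege


# Example: UI gestures handled by the browser process at high privilege.
node = SurfaceNode(InputClass.UI_GESTURE, Component.BROWSER_PROCESS, Privilege.HIGH)
print(node)
```

Making the node frozen (and therefore hashable) lets it serve directly as a dictionary key when counting bugs per taxonomy cell.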

For data, they aggregate 4,099 vulnerability reports spanning 2016–2025 from the Chromium and Firefox bug trackers and security advisories, then filter these down to 2,233 memory corruption bugs. Each bug is labeled with its input class, affected component, and privilege level by combining automated string matching with LLM extraction, with manual validation on a sample.
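
The string-matching half of such a labeling pipeline might look roughly like the following. Everything specific here is an assumption: the regular expressions, the keyword table, and the sample report titles are hypothetical and are not the authors' actual rules or data.

```python
import re

# Hypothetical patterns for spotting memory corruption reports.
MEMORY_CORRUPTION_PATTERNS = [
    r"use[- ]after[- ]free",
    r"heap[- ]buffer[- ]overflow",
    r"out[- ]of[- ]bounds",
    r"type confusion",
]

# Hypothetical keyword table mapping report text to a taxonomy component.
COMPONENT_KEYWORDS = {
    "JS engine": ["v8", "spidermonkey", "jit"],
    "PDF engine": ["pdfium"],
    "GPU": ["webgl", "webgpu", "angle"],
}


def is_memory_corruption(text):
    """True if any memory corruption pattern appears in the report text."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in MEMORY_CORRUPTION_PATTERNS)


def label_component(text):
    """Return the first component whose keyword matches, else None."""
    lowered = text.lower()
    for component, keywords in COMPONENT_KEYWORDS.items():
        if any(re.search(rf"\b{re.escape(k)}\b", lowered) for k in keywords):
            return component
    return None  # unmatched reports would fall through to LLM or manual labeling


# Made-up report titles, for illustration only.
reports = [
    "Heap-buffer-overflow in PDFium",
    "Use after free in V8",
    "Incorrect security UI in Omnibox",
]
for title in reports:
    print(title, "->", is_memory_corruption(title), label_component(title))
```

Returning `None` for unmatched reports keeps the rule-based step conservative, leaving ambiguous cases to the LLM extraction and manual validation stages the summary describes.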

They complement the bug data with vendor fuzzing coverage statistics from the Chromium coverage dashboard and, for third-party parsers, from OSS-Fuzz. Coverage is reported as line coverage aggregated across multiple fuzzing engines (libFuzzer, Centipede, Fuzzilli).
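
One plausible way to combine line coverage from several engines is to take the union of the lines each engine reaches, so that a line hit by two engines is counted once. The sketch below assumes that reading; the summary does not specify the exact aggregation, and the function name and sample line sets are hypothetical.

```python
def merged_line_coverage(covered_by_engine, total_lines):
    """Percent of lines reached by at least one engine.

    covered_by_engine maps an engine name to the set of line numbers it covered.
    """
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    covered = set().union(*covered_by_engine.values())
    return 100 * len(covered) / total_lines


# Made-up coverage sets for a 20-line component.
engines = {
    "libFuzzer": {1, 2, 3},
    "Centipede": {3, 4},
    "Fuzzilli": {10},
}
print(merged_line_coverage(engines, total_lines=20))
```

A plain sum of per-engine percentages would double-count line 3 in this example, which is why the union is used here.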

Analytically, bugs are mapped to taxonomy nodes to produce bug density heatmaps. Coverage percentages per component are compared to bug density to locate under-fuzzed surfaces. They also overlay academic browser fuzzers by input class across the bug-density map.
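
A simple way to surface under-fuzzed components from such a comparison is to weight each component's bug count by its uncovered fraction. The score below is an illustrative heuristic introduced here, not a metric from the paper; the bug counts and coverage figures are the ones quoted in this summary's findings.

```python
# (bugs, line coverage in percent), as quoted in this summary's findings.
SURFACES = {
    "WebAPI bindings": (485, 17.4),
    "JavaScript engine": (312, 64.7),
    "WebGPU backend": (13, 73.6),
    "WebGL backend": (11, 34.0),
}


def under_test_score(bugs, coverage_pct):
    """Heuristic: bug count weighted by the fraction of lines left uncovered."""
    return bugs * (1 - coverage_pct / 100)


ranked = sorted(
    ((name, under_test_score(bugs, cov)) for name, (bugs, cov) in SURFACES.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name:18s} {score:7.1f}")
```

Under this heuristic WebAPI bindings rank first by a wide margin, matching the summary's observation that the largest bug cluster is also among the least covered.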

Three deployment gaps (limited fuzz config, oversimplified harnesses, missing harnesses) are identified qualitatively from examining fuzzing harness designs and bug-triage discussions.

One concrete example: WebAPI bindings produce 485 bugs yet receive only 17.4% coverage, and the WebGL backend receives only 34%; testing thus fails to exercise a bug-dense, high-privilege segment of the attack surface.

No full code or weights are published due to the empirical nature of the work and proprietary browser codebases.

Technical innovations

Introduction of an Input × Component × Privilege taxonomy unifying browser low-level attack surfaces across three major browsers based on architectural principles of least privilege.
Large-scale empirical mapping of 2,233 memory corruption vulnerabilities (2016–2025) to the taxonomy, quantifying bug density per attack surface component and privilege.
Overlay of decade-long academic fuzzer targets onto the taxonomy, revealing clustering on script/document inputs and neglect of important components like WebAPI bindings, GPU backend, UI input, and IPC handlers.
Identification and classification of three orthogonal deployment gaps limiting fuzzing efficacy: limited configuration, oversimplified harnesses, and missing harnesses.

Datasets

Chromium issue tracker bugs — ~2,770 bugs total, 2016–2025 — public
Firefox security advisories — ~1,329 bugs total, 2016–2025 — public
Combined labeled memory corruption subset — 2,233 bugs — derived from above

Baselines vs proposed

Vendor fuzzing coverage vs bug density on binary blob parsers: coverage of 50–90% matches bug density, but on privileged binary components coverage drops as low as 1%–12%, a significant misalignment.
Coverage vs bugs on document sub-surfaces: Document body/Document API coverage 42–57% with 400 bugs collectively; WebAPI coverage only 17.4% with 485 bugs (largest bug cluster) indicating under-testing.
Script input coverage: JavaScript engine coverage 64.7% with 312 bugs; WebGPU backend 73.6% with 13 bugs; WebGL backend only 34% coverage with 11 bugs, showing weaker coverage on a GPU component.
UI gesture coverage is extremely low (14–34%) despite 283 bugs, mostly at high privilege; no dedicated UI fuzzers exist.
IPC coverage averages 61%, but key high-privilege IPC handlers have only 15%–28% coverage.

Limitations

The study focuses on memory corruption bugs only; logic bugs and other vulnerability classes are not covered.
Only publicly disclosed bugs from Chromium and Firefox are included; Safari bugs are omitted due to lack of trackers, limiting generalizability.
Coverage measurements are limited to Chromium in-tree fuzzers and OSS-Fuzz; open-source data may not fully represent vendor fuzzing in Firefox or Safari.
Analysis relies partly on automated and LLM-assisted bug labeling, which may carry classification errors despite manual validation.
Vendor-provided shader compilers are closed source, so coverage and bug surface assessments there are incomplete.
The study does not execute new fuzzing experiments; deployment gaps identified are qualitative and require future validation and fixes.

Open questions / follow-ons

How can fuzzers be designed or reconfigured to close the identified deployment gaps, especially for under-covered components like WebAPI bindings and IPC handlers?
What mitigation strategies can effectively reduce memory corruption risks in high-privilege components identified as bug-dense but under-tested?
Can these taxonomy insights be extended to other vulnerability classes beyond memory corruption, such as logic or side-channel bugs?
What is the impact of integrating closed-source vendor shader compilers and third-party libraries more fully into fuzzing and testing pipelines?

Why it matters for bot defense

Bot-defense and CAPTCHA engineers can leverage this systematization to understand which parts of the browser's native attack surface remain under-tested and hence potentially exploitable by attackers automating browser-based attacks. The taxonomy clarifies how attacker inputs propagate through browser components at different privilege levels and highlights weakly tested interfaces where exploit code might focus. Awareness of IPC and GPU/WebGL components as under-fuzzed surfaces could guide defensive monitoring and customized test harnesses in anti-bot pipelines.

Moreover, recognizing recurrent fuzzing deployment gaps can motivate practitioners to audit their own fuzzing and defensive instrumentation coverage, targeting complex multi-component attack paths that span privilege boundaries. Overall, the paper underlines the need to move beyond classical input fuzzing towards integrated testing of IPC handlers, UI event processing, and high-privilege WebAPI bindings—areas critical for preventing sophisticated automated browser exploit attempts that might defeat CAPTCHA or bot-detection systems.

Cite

@article{arxiv2606_16646,
  title={SoK: Taxonomizing the Low-Level Attack Surface of Modern Web Browsers},
  author={Han Zheng and Qinying Wang and Qiang Liu and Mathias Payer},
  journal={arXiv preprint arXiv:2606.16646},
  year={2026},
  url={https://arxiv.org/abs/2606.16646}
}
