Fingerprinting Browsers in Encrypted Communications
Source: arXiv:2410.21101 · Published 2024-10-28 · By Sandhya Aneja, Nagender Aneja
TL;DR
This paper asks whether browsers can still be distinguished when all traffic is encrypted with HTTPS over TLS 1.3, using only observable packet/message lengths rather than plaintext HTTP headers or JavaScript-exposed browser features. The authors’ core claim is that different browsers produce measurably different TLS handshake/data-message length patterns when fetching the same pages, because their cipher-suite lists and related TLS behavior shape the sequence and size of exchanged messages.
What is new here is not a sophisticated classifier, but a lightweight vector-comparison pipeline: capture TLS 1.3 traffic for a small set of browsers and pages, convert message lengths into vectors, interpolate them to equal length, and compute cosine similarity/dissimilarity between browser pairs. In their setup on a UTM-based virtual network, they report mean dissimilarities of 30.94% for Chrome-Edge, 33.57% for Chrome-Firefox, and 32.77% for Edge-Firefox, with per-page dissimilarity ranging from 0.01% up to 52%. The result is a proof-of-concept showing that encrypted traffic still leaks browser-specific structure, but the evaluation is very small and descriptive rather than a rigorous classification benchmark.
Key findings
- Across six test URLs, Chrome-Edge mean cosine dissimilarity was 30.94%, Chrome-Firefox was 33.57%, and Edge-Firefox was 32.77%.
- Per-URL cosine dissimilarity ranged from 0.001 (Chrome-Edge on url-5) to 0.573 (Chrome-Edge on url-4).
- The authors report that different browsers used a different number of TLS 1.3 messages and different message lengths for the same page.
- Table I shows high cosine similarity for some pairs/pages, e.g. Chrome-Edge similarity of 0.999 on url-5, indicating fingerprint stability is page-dependent rather than uniformly strong.
- The paper states the probability of low dissimilarity is 0.05, but does not define the statistical procedure behind that estimate.
- The authors attribute the length differences to the cipher-suite list used by TLS, rather than to plaintext content or HTTP headers.
- The experimental environment used the UTM hypervisor on an Apple M1 Mac host, with Windows 11, Kali Linux, and Ubuntu VMs; browser traffic was captured with Wireshark and parsed with TShark/Python.
Threat model
The implied adversary is a passive network observer who can capture encrypted HTTPS/TLS 1.3 traffic between a client browser and a server and wants to identify which browser is in use. The paper assumes the adversary can see packet/message lengths and sequence structure but cannot decrypt content, alter TLS negotiation, or inject traffic. It does not model active evasion, traffic morphing, or adversarial browser randomization.
Methodology — deep read
The threat model and its assumptions are implicit and only weakly specified. The paper is not trying to defend against an active attacker manipulating traffic; it assumes a passive observer can capture HTTPS/TLS 1.3 traffic between a browser VM and a web-server VM and wants to infer which browser generated it. The authors also assume browser-specific TLS behavior is stable enough that the same page, fetched by different browsers, yields distinguishable message-length sequences. They do not discuss adversarial adaptation, traffic shaping, replay, or the effect of browser updates and extensions, so the practical threat model is closer to passive fingerprinting than to robust identification in a hostile environment.
The data collection setup is synthetic and small. The authors built a virtual LAN using UTM on an Apple M1 Mac, with three VMs (Windows 11, Kali Linux, Ubuntu). Apache ran on the Kali VM and served six web pages. They installed three browsers on Windows 11 and two on Ubuntu, then accessed those pages while capturing packets in Wireshark. For TLS, they configured OpenSSL on the server side with a 4096-bit key and certificate and used TLS 1.3 for the browser-server communication. They then installed Python and TShark on Ubuntu to extract TLS 1.3 fields into CSV and parse the traffic traces. The paper does not give the exact browser list beyond the Chrome, Edge, and Firefox entries in the result tables, does not specify multiple runs per condition, and does not report dataset size in packets or messages beyond the six URLs and the example vectors.
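The extraction scripts are not released; as a sketch, the TShark-to-vector step could look like the following, assuming TShark exported one row per frame with the standard `tls.record.length` field (the authors' exact export format is not specified):

```python
import csv
import io

def lengths_from_tshark_csv(csv_text, column="tls.record.length"):
    """Parse a TShark field export (one row per captured frame) into a
    flat list of integer TLS record lengths, skipping empty cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lengths = []
    for row in reader:
        cell = (row.get(column) or "").strip()
        if cell:
            # A frame can carry several coalesced TLS records; with a
            # comma aggregator, TShark joins their lengths in one cell.
            lengths.extend(int(x) for x in cell.split(","))
    return lengths

# Hypothetical export resembling the paper's example lengths.
sample = 'tls.record.length\n327\n"1514,70"\n\n84\n'
print(lengths_from_tshark_csv(sample))  # → [327, 1514, 70, 84]
```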
The algorithm is simple but worth unpacking carefully. For each browser-page pair, the authors extract the lengths of all TLS messages observed during communication and represent that page’s traffic as a vector. Because browsers produced different numbers of messages, the vectors had unequal lengths. To compare them, they use interpolation to standardize vector lengths, then compute cosine similarity between browser vectors. In effect, the pipeline is: capture TLS handshake/application traffic, isolate message lengths, convert each browser’s page trace into a numeric sequence, interpolate sequences to a common length, and compute pairwise cosine similarity/dissimilarity. The paper’s novelty is in using length-only sequences from encrypted TLS 1.3 traffic, rather than the more common TLS handshake parameter extraction or HTTP-based browser fingerprinting.
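The paper does not name its interpolation method; a minimal linear-interpolation sketch with NumPy (the choice of `np.interp` here is an assumption) shows how a shorter trace can be stretched onto a common grid before comparison:

```python
import numpy as np

def resample(lengths, n):
    """Linearly interpolate a message-length sequence onto n evenly
    spaced points so unequal-length traces become comparable."""
    v = np.asarray(lengths, dtype=float)
    old_x = np.linspace(0.0, 1.0, len(v))
    return np.interp(np.linspace(0.0, 1.0, n), old_x, v)

# A 3-point trace stretched to 5 points keeps its endpoints and shape.
print(resample([100, 200, 100], 5))  # → [100. 150. 200. 150. 100.]
```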
A concrete end-to-end example is given in the form of two vectors, v1 and v2, for different browsers on the same webpage (Equations 2 and 3). Each vector contains a sequence of message lengths such as 327, 1514, 70, 84, and so on; if the two vectors have different lengths, interpolation is applied before the similarity calculation. The cosine similarity is then computed as A·B/(||A|| ||B||), and cosine dissimilarity as 1 minus that value. The paper interprets higher dissimilarity as stronger browser distinction. In the reported tables, the same browser pair can be very close on one URL and far apart on another, which suggests that page content and browser implementation interact in the observable traffic pattern.
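Under those definitions, the similarity computation itself is a few lines. The vectors below are hypothetical stand-ins echoing the quoted lengths, not the paper's actual Equation 2/3 vectors:

```python
import math

def cosine_dissimilarity(a, b):
    """1 - A·B / (||A|| ||B||) for two equal-length message-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Hypothetical traces echoing the paper's example lengths (327, 1514,
# 70, 84); these are NOT the actual Equation 2/3 vectors.
v1 = [327, 1514, 70, 84]
v2 = [331, 1514, 70, 90]
print(cosine_dissimilarity(v1, v2) < 0.01)  # near-identical traces → True
```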
Evaluation is descriptive and limited to pairwise browser comparisons over six URLs, not a full classification experiment. The main reported metrics are cosine similarity and cosine dissimilarity in Tables I and II, plus average dissimilarity percentages across browser pairs in Section V. There are no machine-learning baselines, no ROC/AUC, no confusion matrix, no held-out attacker, and no statistical significance testing beyond the informal statement that low dissimilarity occurs with probability 0.05. Reproducibility is partial at best: the paper discloses the VM stack, the use of Wireshark/TShark/Python, and the fact that OpenSSL/TLS 1.3 was used, but it does not provide code, traces, browser versions, or a downloadable dataset. Because of that, the results are best read as a proof-of-concept that encrypted traffic length patterns can separate browsers in a controlled lab, not as a validated operational fingerprinting system.
Technical innovations
- Uses TLS 1.3 message-length sequences, rather than plaintext HTTP or TLS handshake field decoding, as the fingerprinting signal.
- Introduces interpolation to normalize unequal-length browser traffic traces before cosine-similarity comparison.
- Frames browser fingerprinting as a vector-similarity problem on encrypted traffic, with dissimilarity derived directly from message-length sequences.
- Attributes browser differences to the cipher-suite list’s effect on message lengths, instead of relying on explicit browser headers or JavaScript APIs.
Datasets
- 6 web pages served by Apache in a UTM virtualized lab — size not specified — author-built, not public
- Browser traffic traces from Chrome, Edge, and Firefox over TLS 1.3 — size not specified — captured with Wireshark/TShark in the authors’ UTM VM setup
Limitations
- Extremely small and controlled experiment: only six pages and a handful of browsers in a lab VM environment.
- No code, traces, browser versions, or exact preprocessing parameters are released, so reproducibility is limited.
- Evaluation is pairwise similarity analysis only; there is no end-to-end browser classifier, no confusion matrix, and no held-out test set.
- The paper does not test robustness to browser updates, extensions, network jitter, CDN variation, or real-world site diversity.
- The claim that differences are due to cipher-suite lists is asserted, but the paper does not isolate cipher suites experimentally with ablations.
- The reported probability of low dissimilarity (0.05) is not methodologically explained, so the statistic is hard to interpret.
Open questions / follow-ons
- How well do these length-based fingerprints survive real-world variability such as CDNs, caching, browser updates, extensions, and background network noise?
- Can an explicit classifier outperform cosine similarity while remaining robust across unseen sites and unseen browser versions?
- Which parts of the TLS handshake or record-layer behavior actually drive the separability: cipher-suite ordering, extension lists, certificate size, or application-data framing?
- Would the signal remain useful under common privacy defenses such as padding, traffic shaping, or TLS fingerprint randomization?
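As a toy illustration of the padding question, rounding every record length up to a fixed bucket (512 bytes here is an arbitrary choice, and both traces are invented) collapses the length signal the paper relies on:

```python
def pad_to_bucket(lengths, bucket=512):
    """Round each record length up to the next multiple of `bucket`,
    mimicking a simple record-padding defense."""
    return [((n + bucket - 1) // bucket) * bucket for n in lengths]

# Two hypothetical browser traces that differ slightly in raw lengths.
chrome_like = [327, 1514, 70, 84]
firefox_like = [331, 1517, 64, 90]

# After padding, the traces become indistinguishable by length alone.
print(pad_to_bucket(chrome_like) == pad_to_bucket(firefox_like))  # → True
```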
Why it matters for bot defense
For bot-defense engineers, the main takeaway is that encrypted transport does not eliminate browser-identifying side channels. Even without HTTP headers or JavaScript APIs, a passive observer may extract browser-specific structure from TLS message lengths, which could complement existing fingerprint stacks. In practice, this is more relevant as a risk signal than as a standalone classifier: the study suggests another feature family to consider when modeling clients, but the evidence here is too small to justify production use without much broader validation.
For CAPTCHA and bot mitigation, the immediate reaction should be caution. Browser-traffic fingerprints derived from TLS can be brittle across sites and environments, and the paper does not show how they behave under adversarial adaptation or distribution shift. Still, it points to a broader lesson: if a bot framework tries to hide behind a “real browser,” encrypted transport can still leak implementation details. A practitioner would likely treat such signals as one input among many, and would want far larger, version-diverse data before relying on them for enforcement.
Cite
@article{arxiv2410_21101,
  title={Fingerprinting Browsers in Encrypted Communications},
  author={Aneja, Sandhya and Aneja, Nagender},
  journal={arXiv preprint arXiv:2410.21101},
  year={2024},
  url={https://arxiv.org/abs/2410.21101}
}