A First Measurement Study on Authentication Security in Real-World Remote MCP Servers

Source: arXiv:2605.22333 · Published 2026-05-21 · By Huijun Zhou, Xiaohan Zhang, Haozhe Zhang, Haoyang Zhang, Mi Zhang, Min Yang

TL;DR

This work presents the first comprehensive, large-scale empirical analysis of authentication security in real-world remote Model Context Protocol (MCP) servers, an emerging infrastructure that connects large language models (LLMs) with external online services. The authors identify 7,973 live MCP servers and find that over 40% expose tool interfaces without any authentication. Among authenticated servers, OAuth-based authorization is the dominant framework, but it diverges from traditional OAuth deployments because of three MCP-specific characteristics: open client environments, dynamic client registration, and delegated authorization. To investigate the vulnerabilities these characteristics introduce, the paper proposes an OAuth flaw taxonomy tailored to MCP deployments. Using a semi-automated detection framework that combines passive traffic observation with active probing, the authors evaluate a testable subset of 119 OAuth-enabled MCP servers and find 325 security flaws, with every tested server exhibiting at least one; 96.6% are affected by dynamic client registration issues. Many flaws enable sensitive data leakage or account takeover, and responsible disclosure yielded 9 CVE assignments. Overall, the study exposes systemic weaknesses in the authentication layer of the MCP ecosystem and urges hardened OAuth deployments for remote MCP servers.

Key findings

7,973 live remote MCP servers identified; 40.55% expose tools with no authentication
Among the 7,973 servers, 30.45% use OAuth-based authentication and 29% use static tokens or API keys
Of 2,428 OAuth-enabled servers, 46% support dynamic client registration (DCR)
In a testable subset of 119 OAuth-enabled servers, 100% show open client environments and DCR, 68.07% use delegated authorization
All 119 tested OAuth servers exhibit at least one security flaw; total 325 flaws found
Dynamic client registration flaws affect 96.6% of tested servers
Open client environment flaws affect 85.7% of tested servers
Nine concrete flaw types identified, split into four categories: dynamic client registration flaws, delegated authorization flaws, open client environment flaws, and common OAuth misconfigurations
Responsible disclosure led to 9 CVE IDs assigned for discovered vulnerabilities
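
The percentages above can be cross-checked against the reported totals with a little arithmetic. The sketch below is only a consistency check: the totals (7,973 servers, 2,428 OAuth-enabled, 119 tested, 325 flaws) come from the summary, while the per-category server counts are derived here by rounding and are not figures quoted from the paper.

```python
# Consistency check on the reported numbers. Totals are taken from the summary;
# the implied counts are derived by rounding and are NOT quoted from the paper.
TOTAL_SERVERS = 7973
OAUTH_SERVERS = 2428
TESTED = 119
TOTAL_FLAWS = 325


def pct(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


def implied_count(percent: float, whole: int) -> int:
    """Nearest whole number of servers implied by a reported percentage."""
    return round(percent / 100 * whole)


oauth_share = pct(OAUTH_SERVERS, TOTAL_SERVERS)        # reported as 30.45%
no_auth_count = implied_count(40.55, TOTAL_SERVERS)    # servers with no authentication
dcr_flawed = implied_count(96.6, TESTED)               # tested servers with DCR flaws
open_client_flawed = implied_count(85.7, TESTED)       # tested servers with open-client flaws
delegated = implied_count(68.07, TESTED)               # tested servers using delegated authorization
flaws_per_server = round(TOTAL_FLAWS / TESTED, 2)      # average flaws per tested server

print(oauth_share, no_auth_count, dcr_flawed, open_client_flawed, delegated, flaws_per_server)
```

Under these assumptions, the 30.45% OAuth share matches 2,428 of 7,973 servers, and the tested subset averages a little under three flaws per server.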

Threat model

The adversary is a remote attacker who can send arbitrary HTTP(S) requests to publicly exposed remote MCP server endpoints and host malicious web content to lure victims. They can intercept and analyze network traffic on domains they control and craft deceptive OAuth authorization requests. Their goals are to bypass authentication, capture OAuth authorization codes or tokens, or bind victim service accounts to attacker-controlled identities. The attacker does not control the victim's devices, browsers, or TLS endpoints, and has not compromised the MCP servers or authorization servers themselves. They cannot break the underlying cryptography or reach internal non-routable networks; the focus is solely on protocol and implementation weaknesses in remote MCP server authentication flows.

Methodology — deep read

The authors first defined a threat model focused on remote attackers who can send arbitrary HTTP requests and lure victims into crafted OAuth flows, but who cannot compromise TLS, victim devices, browsers, authorization servers, or the MCP servers themselves. The protected assets are users' MCP sessions, OAuth codes and tokens, and linked service accounts.

To gather data, they used two cybersecurity search engines, FOFA and Shodan, to query for candidate remote MCP servers using protocol-specific fingerprints (e.g., JSON-RPC signatures and MCP session headers) and endpoint patterns. They retrieved 28,715 unique endpoint candidates, which they then actively probed with MCP handshake initialize requests to validate true MCP servers, resulting in 7,973 live validated remote MCP servers.
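
The validation step can be illustrated with a minimal sketch of an MCP `initialize` probe. The JSON-RPC message shape follows the public MCP specification; the client name, the protocol version string, and the acceptance rule in `looks_like_mcp_initialize_result` are assumptions made for illustration, not the authors' actual fingerprints. The sketch covers only the parsing logic, performs no network I/O, and handles only the case where a server answers the handshake directly.

```python
import json


def build_initialize_request(request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `initialize` request as described in the MCP spec."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-06-18",  # assumed spec revision for the probe
            "capabilities": {},
            # Hypothetical client identity used by the probe.
            "clientInfo": {"name": "measurement-probe", "version": "0.1"},
        },
    }


def looks_like_mcp_initialize_result(body: str, request_id: int = 1) -> bool:
    """Heuristic check that a response body is a successful MCP handshake reply."""
    try:
        msg = json.loads(body)
    except json.JSONDecodeError:
        return False  # HTML error pages, empty bodies, and similar
    if not isinstance(msg, dict):
        return False
    if msg.get("jsonrpc") != "2.0" or msg.get("id") != request_id:
        return False
    result = msg.get("result")
    return (
        isinstance(result, dict)
        and isinstance(result.get("protocolVersion"), str)
        and isinstance(result.get("serverInfo"), dict)
    )
```

A candidate endpoint that returns an HTML page or a JSON-RPC error would be rejected by this check. A server that demands credentials before the handshake would need separate handling, which this sketch does not attempt.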

They categorized authentication mechanisms among these: no authentication, OAuth-based flows, static token or API keys. OAuth-enabled servers were further characterized by probing their OAuth metadata endpoints to detect dynamic client registration (DCR) support.
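
One way to picture this categorization is the sketch below. It assumes a probe has already recorded the HTTP status of an unauthenticated request and any OAuth authorization server metadata it could fetch. The decision rule and the labels are hypothetical; the only standard-defined detail is that `registration_endpoint` is the metadata field (RFC 8414) that advertises dynamic client registration.

```python
from typing import Mapping, Optional


def classify_auth(status: int, oauth_metadata: Optional[Mapping[str, object]]) -> str:
    """Assign a coarse authentication category (hypothetical heuristic).

    `status` is the HTTP status of an unauthenticated MCP request;
    `oauth_metadata` is the parsed authorization server metadata, if any was found.
    """
    if status == 200:
        return "none"  # tools reachable without credentials
    if status in (401, 403):
        if oauth_metadata and "authorization_endpoint" in oauth_metadata:
            return "oauth"
        return "static_token_or_api_key"
    return "unknown"


def supports_dcr(oauth_metadata: Mapping[str, object]) -> bool:
    """DCR is advertised through the `registration_endpoint` metadata field."""
    endpoint = oauth_metadata.get("registration_endpoint")
    return isinstance(endpoint, str) and endpoint.startswith("https://")
```

A real classifier would need more signals than a status code, for example to tell an API-key server from a misconfigured proxy, so the labels here should be read as a first pass only.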

A testable subset of 119 OAuth-enabled servers was assembled after manual filtering for redundancy, connectivity, and compliance with OAuth flow semantics. Every server in this core subset supported DCR and operated in an open client environment, and the majority used delegated authorization.

Next, the authors abstracted the remote MCP OAuth workflow into four phases: P1 Discovery & Registration, P2 Authorization, P3 Token Exchange, and an optional PA Delegated Authorization for multi-hop flows. They identified required security checks per phase (e.g., client_id & redirect_uri validation during registration, PKCE and state validation during authorization, single-use code validation during token exchange).
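
Two of the token-exchange checks named above, PKCE verification and single-use codes, can be sketched as follows. The S256 transform is the one defined in RFC 7636; the `CodeStore` class and its method names are hypothetical and stand in for whatever storage a real authorization server uses.

```python
import base64
import hashlib
import hmac


def s256_challenge(code_verifier: str) -> str:
    """RFC 7636 S256: BASE64URL(SHA256(ASCII(code_verifier))) without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


class CodeStore:
    """Hypothetical in-memory store binding each code to its authorization context."""

    def __init__(self) -> None:
        self._codes: dict[str, tuple[str, str, str]] = {}

    def issue(self, code: str, client_id: str, redirect_uri: str, code_challenge: str) -> None:
        self._codes[code] = (client_id, redirect_uri, code_challenge)

    def redeem(self, code: str, client_id: str, redirect_uri: str, code_verifier: str) -> bool:
        # pop() removes the code on first use, so a replayed code is rejected.
        entry = self._codes.pop(code, None)
        if entry is None:
            return False
        bound_client, bound_redirect, challenge = entry
        return (
            client_id == bound_client
            and redirect_uri == bound_redirect  # exact match, no prefix matching
            and hmac.compare_digest(s256_challenge(code_verifier), challenge)
        )
```

In this sketch a failed redemption also consumes the code, which is a deliberately strict design choice: an attacker who guesses wrong cannot retry against the same code.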

Guided by this workflow and MCP/OAuth specs, they developed a taxonomy of nine concrete implementation flaws covering: dynamic client registration flaws (malicious DCR binding, client blind trust), delegated authorization flaws (layer inconsistency, nested context pollution), open client environment flaws (PKCE downgrade, consent page bypass), and common OAuth misconfigurations (open redirect, weak state, code replay).
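
The taxonomy maps naturally onto a small lookup table. The category and flaw names below are taken from the summary; the `servers_affected_by_category` helper and the shape of its input are hypothetical, shown only to illustrate how per-category prevalence could be tallied from per-server findings.

```python
from collections import Counter

# Four categories and nine flaw types, as named in the taxonomy.
TAXONOMY = {
    "dynamic client registration": ["malicious DCR binding", "client blind trust"],
    "delegated authorization": ["layer inconsistency", "nested context pollution"],
    "open client environment": ["PKCE downgrade", "consent page bypass"],
    "common OAuth misconfiguration": ["open redirect", "weak state", "code replay"],
}

# Reverse index: flaw name -> category.
FLAW_TO_CATEGORY = {flaw: cat for cat, flaws in TAXONOMY.items() for flaw in flaws}


def servers_affected_by_category(findings: dict[str, set[str]]) -> Counter:
    """Count servers with at least one flaw in each category.

    `findings` maps a server identifier to the set of flaw names confirmed on it.
    """
    counts: Counter = Counter()
    for flaws in findings.values():
        # A server counts once per category, however many flaws it has there.
        for category in {FLAW_TO_CATEGORY[f] for f in flaws}:
            counts[category] += 1
    return counts
```

Counting each server once per category matches how the prevalence figures are phrased in the summary, as a share of servers affected rather than a share of flaws.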

Their detection framework reconstructed these OAuth lifecycle phases from captured traffic and selectively applied passive checks and active probes to confirm presence of specific flaws. Active probing simulated attacker behaviors such as registering malicious clients, sending malformed authorization requests, or replaying codes.
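
As an illustration of what a passive check might look like, the sketch below inspects a captured authorization request URL for missing PKCE parameters and a short `state` value. The finding labels and the 16-character threshold are assumptions for this example and are not the authors' detection rules. A passive hit like this would only flag a candidate, which an active probe would then need to confirm.

```python
from urllib.parse import parse_qs, urlsplit

MIN_STATE_LENGTH = 16  # hypothetical threshold for flagging a weak state value


def passive_authz_findings(authorization_url: str) -> list[str]:
    """Return candidate findings for one captured authorization request."""
    params = parse_qs(urlsplit(authorization_url).query)
    findings: list[str] = []

    if "code_challenge" not in params:
        findings.append("missing_pkce")
    # RFC 7636 treats an omitted code_challenge_method as "plain".
    elif params.get("code_challenge_method", ["plain"])[0] != "S256":
        findings.append("pkce_not_s256")

    state = params.get("state", [""])[0]
    if len(state) < MIN_STATE_LENGTH:
        findings.append("weak_or_missing_state")

    return findings
```

Length is a crude proxy for unpredictability. A long but constant `state` would pass this check, so comparing values across several captured flows would be a natural extension.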

They systematically tested all 119 OAuth-enabled servers in the core subset using this framework, recording presence and types of flaws. Results were aggregated to report prevalence and severity. Confirmed issues were responsibly disclosed to affected parties and CVEs were assigned.

Throughout, manual review was used to keep the testing within ethical bounds. In sum, the methodology combines large-scale asset discovery, protocol-aware characterization, a structured flaw taxonomy, and semi-automated vulnerability detection backed by active testing.

Technical innovations

First large-scale empirical measurement of authentication security in real-world remote MCP servers connecting LLMs to external services.
Novel taxonomy of nine distinct OAuth implementation flaws structured by MCP-specific OAuth deployment characteristics: dynamic client registration, delegated authorization, and open client environments.
Semi-automated detection framework combining passive MCP traffic reconstruction with active controlled probing to validate OAuth flaw presence at scale.
Identification and quantification of the security impact of MCP-specific OAuth features like dynamic client registration endpoints becoming attacker surface.
Demonstration of multi-hop delegated authorization complexities in MCP OAuth flows leading to unique layered vulnerabilities.

Datasets

Remote MCP servers discovery dataset — 7,973 identified servers — internal compilation from FOFA and Shodan data
OAuth-enabled MCP server evaluation subset — 119 servers — manually curated for testability and full OAuth compliance

Baselines vs proposed

No-authentication baseline: 40.55% of MCP servers expose tools with no authentication vs. OAuth-enabled servers, which gate tool access behind authorization (although every tested OAuth server still exhibited at least one flaw)
Among OAuth-enabled deployments, 96.6% affected by dynamic client registration flaws vs proposed ideal DCR with strict validation (0% flaws)
Open client environment flaws present in 85.7% of tested deployments vs ideal OAuth clients properly enforcing PKCE and explicit consent (0% flaws)
Total 325 flaws detected across 119 servers vs zero flaws expected in a correctly implemented OAuth flow

Figures from the paper

Figures are reproduced from the source paper for academic discussion. Original copyright: the paper authors. See arXiv:2605.22333.

Fig 1: Demonstration of the remote MCP server authentication.

Fig 2: MCP specification authentication evolution timeline.

Fig 3: Two-step pipeline for discovering and validating remote MCP

Figs 4-8 (page 1).

Limitations

Evaluation subset limited to 119 testable OAuth-enabled servers out of 2,428 total OAuth deployments due to connectivity and manual filtering constraints
Focuses on remote attackers and does not consider compromised client or server insider threats
Semi-automated detection framework may miss flaws that require complex or interactive authentication flows not reproducible in testing
No evaluation of impact under distribution shifts or proxying scenarios beyond tested deployments
Excludes manual pre-registered clients from detection due to non-scalability, limiting scope of flaw prevalence estimates on those deployments
MCP ecosystem and specifications are rapidly evolving, meaning findings represent a snapshot that may change as implementations mature

Open questions / follow-ons

How can dynamic client registration protocols and tooling be hardened in the MCP ecosystem to eliminate or mitigate malicious client registration?
What systematic defenses or monitoring can detect or prevent layered delegated authorization attacks arising from multi-hop OAuth in MCP?
How do these vulnerabilities interact or propagate when MCP clients and servers are deployed in enterprise-controlled networks or integrated with zero-trust architectures?
Can formal verification or protocol-level enhancements be incorporated in MCP OAuth specifications to prevent common and MCP-specific OAuth misconfigurations?

Why it matters for bot defense

This study is highly relevant to bot-defense and CAPTCHA practitioners because authentication boundaries are critical lines of defense in automated client-server interactions involving LLM agents accessing user services. The widespread presence of severe OAuth flaws in remote MCP servers indicates that attackers could exploit protocol-level weaknesses to bypass authentication or hijack sessions without needing sophisticated client-side bot detection. CAPTCHA systems often assume that OAuth-protected API endpoints are secure; this research highlights the need for bot-defense engineers to evaluate and monitor backend OAuth flows and dynamic client registrations as part of a holistic defense.

Moreover, the open client environments and delegated authorization patterns common in MCP deployments resemble federated and multi-party authentication scenarios common in modern web and API usage. Bot-defense solutions should therefore consider how abuse tactics can exploit lax OAuth implementations, dynamic OAuth client registrations, or cross-layer authorization inconsistencies to automate credential theft, session fixation, or resource abuse. Incorporating OAuth protocol compliance checks, redirect URI validation, PKCE enforcement, and anomaly detection for OAuth client registrations into bot defense tooling could thwart attacker persistence without relying solely on frontend CAPTCHAs. Overall, this work urges CAPTCHA and bot-defense engineers to extend their scope beyond classical client interaction challenges to include robust OAuth flow security verification in LLM-driven agent ecosystems.
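
As one concrete instance of the redirect URI validation mentioned above, the following sketch screens the `redirect_uris` in a dynamic client registration request. The policy (HTTPS anywhere, plain HTTP only on loopback hosts, no fragments) is an assumed baseline, not a rule taken from the paper, and the function names are hypothetical. Deployments that must support native clients with custom URI schemes would need a wider allowlist.

```python
from urllib.parse import urlsplit

LOOPBACK_HOSTS = {"127.0.0.1", "::1", "localhost"}


def redirect_uri_acceptable(uri: str) -> bool:
    """Structural screen for one redirect URI (assumed policy, see lead-in)."""
    try:
        parts = urlsplit(uri)
    except ValueError:
        return False  # malformed URI, e.g. a broken IPv6 literal
    if parts.fragment or not parts.hostname:
        return False  # fragments and host-less URIs (javascript:, data:) are rejected
    if parts.scheme == "https":
        return True
    return parts.scheme == "http" and parts.hostname in LOOPBACK_HOSTS


def registration_acceptable(metadata: dict) -> bool:
    """Accept a registration only if every listed redirect URI passes the screen."""
    uris = metadata.get("redirect_uris")
    return (
        isinstance(uris, list)
        and len(uris) > 0
        and all(isinstance(u, str) and redirect_uri_acceptable(u) for u in uris)
    )
```

A screen like this filters out only structurally dangerous values. It does not stop an attacker from registering an HTTPS URI on a domain they control, so exact-match comparison at authorization time and monitoring of registration volume remain necessary.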

Cite

bibtex

@article{arxiv2605_22333,
  title={A First Measurement Study on Authentication Security in Real-World Remote MCP Servers},
  author={Huijun Zhou and Xiaohan Zhang and Haozhe Zhang and Haoyang Zhang and Mi Zhang and Min Yang},
  journal={arXiv preprint arXiv:2605.22333},
  year={2026},
  url={https://arxiv.org/abs/2605.22333}
}
