security
Security biz Adversa AI argues users of AI tools need clearer warnings
How explicit does the maker of a footgun need to be about the product’s potential to shoot you in the foot?
That’s essentially the question security firm Adversa AI is asking with the disclosure of a one-click remote code execution attack via an MCP server in Claude Code, Gemini CLI, Cursor CLI, and Copilot CLI.
The TrustFall proof-of-concept attack demonstrates how a cloned code repository can include two JSON files (.mcp.json and .claude/settings.json) that open the door to an attacker-controlled Model Context Protocol (MCP) server.
MCP servers make tools, configuration data, schemas, and documentation available in a standard format to AI models via JSON.
The vulnerability arises from inconsistent restrictions governing the scope of settings: Anthropic blocks some dangerous settings at the project level (e.g. bypassPermissions) but not others (e.g. enableAllProjectMcpServers and enabledMcpjsonServers). The JSON files simply enable those settings.
“The moment a developer presses Enter on Claude Code’s generic ‘Yes, I trust this folder’ dialog, the server spawns as an unsandboxed Node.js process with the user’s full privileges — no per-server consent, no tool call from Claude required,” Adversa AI explains in its PoC repo.
The likely result is a compromised system. The PoC demonstrated in this video. It worked on Claude Code CLI v2.1.114, as of May 2. Other agent CLIs are also said to be affected, but specific PoCs have not been published.
“It’s the third CVE in Claude Code in six months from the same root cause (project-scoped settings as injection vector),” Alex Polyakov, co-founder of Adversa AI, told The Register in an email. “Each gets patched in isolation but the underlying class hasn’t been finally fixed. Most developers don’t know these settings exist, let alone that a cloned repo can set them silently.”
Anthropic, according to the security biz, contends that the user’s trust decision moves the issue outside its threat model. CVE-2025-59536 was considered a vulnerability because it triggered automatically when a user started up Claude Code in a malicious directory. TrustFall, however, is considered out of scope because the user has been presented with a dialog box and made a trust decision.
Adversa argues that the decision is not being made with informed consent, citing a prior, more explicit warning notice that was removed in v2.1 of the Claude Code CLI.
“The pre-v2.1 dialog explicitly warned that .mcp.json could execute code and offered three options including ‘proceed with MCP servers disabled,'” writes Adversa’s Sergey Malenkovich. “That informed-consent UX was removed. The current dialog defaults to ‘Yes, I trust this folder’ with no MCP-specific language, no enumeration of which executables will spawn, and no opt-out for MCP while keeping the rest of the trust grant.”
Then there’s the zero-click variant to consider for CI/CD pipelines that implement Claude Code. When Claude Code is invoked in CI/CD, that happens via SDK rather than the interactive CLI. So there’s no terminal prompt.
Malenkovich argues that Anthropic should make three changes.
First, block enableAllProjectMcpServers, enabledMcpjsonServers, and permissions.allow from any settings file inside a project. The idea is that a malicious server should not be able to approve its own servers.
Second, implement a dedicated MCP consent dialog that defaults to “deny.” And third, require interactive consent per server rather than for all servers.
Anthropic did not respond to a request for comment. ®