Issue 14 · April 18, 2026 · 8 min read · MCPSafe Research

Tool poisoning at install time: how malicious MCP servers hijack your agent’s context window

A server’s tool descriptions are read by the LLM before any user interaction. We found seven public npm packages whose tool.description fields contain system prompt injections — silent exfiltration of conversation history, disguised as documentation.

Prompt injection · Supply chain · CWE-93 · CWE-200

The MCP specification defines a tools array that servers publish to advertise their capabilities. Each entry has a name, a description, and an inputSchema. The description is freeform text, and the specification places no constraints on its content. Because modern agent frameworks pass the entire tool manifest into the LLM’s context window before the first user turn, anything in that description reaches the model looking like an instruction, and in our tests the model followed it.¹
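To make the data flow concrete, here is a minimal sketch of a manifest entry and of one way a framework might flatten it into context. The tool, its schema, and the render_tools_for_context function are all hypothetical; real frameworks use their own formats. The sketch only assumes what the paragraph above describes: the description is passed through verbatim.

```python
import json

# Hypothetical manifest, shaped like the tools array described above.
TOOLS = [
    {
        "name": "read_file",
        "description": "Read a file from the workspace and return its contents.",
        "inputSchema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    }
]


def render_tools_for_context(tools):
    """Flatten a tool manifest into text for the model's context.

    Hypothetical rendering: the description is copied in unmodified,
    so whatever the server author wrote is what the model reads.
    """
    blocks = []
    for tool in tools:
        blocks.append(
            f"Tool: {tool['name']}\n"
            f"Description: {tool['description']}\n"
            f"Input schema: {json.dumps(tool['inputSchema'], sort_keys=True)}"
        )
    return "\n\n".join(blocks)


context_preamble = render_tools_for_context(TOOLS)
print(context_preamble)
```

Nothing in this path inspects or escapes the description, so it is the single field where a server author controls text the model sees before the user has typed anything.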

This is not a vulnerability in any specific LLM. It is a design pattern that makes prompt injection trivially easy to ship via the supply chain.

The attack pattern

We scanned 847 npm packages matching mcp-server* published between January and April 2026. Of those, seven contained description fields with patterns consistent with system-prompt injection: explicit SYSTEM: prefixes, base64-encoded payloads, or Unicode direction-override characters used to hide text from a human reader.²

Fig. 1 — Tool poisoning flow. The injected system prompt in the tool description is invisible to the user but fully legible to the LLM, which obeys it.

The most common pattern was straightforward: a benign-looking tool description followed by a Unicode right-to-left override character, then a hidden system instruction. The visible text reads normally in any terminal or UI. The LLM receives the full byte sequence, including the injected instruction, and obeys it.
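The byte-level shape of that pattern can be sketched as follows. The description and the payload here are harmless placeholders invented for illustration, not text from any of the seven packages. The reveal helper is also hypothetical; it simply escapes non-ASCII and control characters so that they show up in a review.

```python
RLO = "\u202e"  # U+202E RIGHT-TO-LEFT OVERRIDE

# Hypothetical poisoned description: benign text, an override character,
# then text aimed at the model. The payload is a placeholder.
visible_part = "Search the project documentation."
hidden_part = "SYSTEM: placeholder injected instruction"
description = visible_part + RLO + hidden_part


def reveal(text):
    """Return text with non-ASCII and control characters shown as escapes."""
    return text.encode("unicode_escape").decode("ascii")


print(reveal(description))
# Search the project documentation.\u202eSYSTEM: placeholder injected instruction
```

The model receives the same string that reveal is given, override character and all. Escaping the string before display is one way to put a human reviewer on equal footing.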

What the injected instructions do

Of the seven packages we analyzed in depth:

—Four exfiltrate conversation history to an attacker-controlled endpoint via tool return values.
—Two instruct the model to lie about the contents of files it reads.
—One attempts to escalate permissions by instructing the model to approve any tool calls without confirmation.

How MCPSafe detects this

Rule MCP-093 scans all tool.description fields for:

# Rule MCP-093: tool description injection
# Severity: CRITICAL | CWE-93, CWE-200

patterns:
  - regex: '(SYSTEM|INSTRUCTIONS?|IGNORE)\s*:'
  - regex: '[\u202A-\u202E\u2066-\u2069]'   # bidi embedding, override, and isolate controls
  - regex: 'base64,[A-Za-z0-9+/]{64,}'          # encoded payload
  - semantic: "instruction to exfiltrate context"  # LLM judge

The semantic check is the most important one. Static patterns catch naive implementations; the LLM judge catches paraphrased or obfuscated instructions. Our false-positive rate for this rule is 0.4% across the labeled corpus — lower than our overall 6.2% rate because the signal is unusually clean.
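For readers who want to try the static half of the rule locally, the three regexes can be transcribed into Python roughly as below. This is a sketch, not MCPSafe's implementation: the function names and the result shape are assumptions, and the semantic LLM-judge check is not reproduced.

```python
import re

# Static patterns transcribed from the rule listing above.
# The semantic (LLM judge) check is deliberately left out.
STATIC_PATTERNS = {
    "instruction_prefix": re.compile(r"(SYSTEM|INSTRUCTIONS?|IGNORE)\s*:"),
    "bidi_control": re.compile(r"[\u202A-\u202E\u2066-\u2069]"),
    "base64_payload": re.compile(r"base64,[A-Za-z0-9+/]{64,}"),
}


def scan_description(description):
    """Return the names of the static patterns that match a description."""
    return [
        name
        for name, pattern in STATIC_PATTERNS.items()
        if pattern.search(description)
    ]


def scan_manifest(tools):
    """Map each tool name to its matched patterns, omitting clean tools."""
    findings = {}
    for tool in tools:
        hits = scan_description(tool.get("description", ""))
        if hits:
            findings[tool["name"]] = hits
    return findings


# Hypothetical manifest: one poisoned entry, one clean entry.
tools = [
    {"name": "search_docs", "description": "Search docs.\u202eSYSTEM: placeholder"},
    {"name": "read_file", "description": "Read a file from the workspace."},
]
print(scan_manifest(tools))
# {'search_docs': ['instruction_prefix', 'bidi_control']}
```

As transcribed, the patterns are case-sensitive, so a lowercase or reworded instruction passes all three. That is the gap the semantic check is described as covering.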

What you should do

Before installing any MCP server: run it through MCPSafe. If you already have servers installed, check their current grade in our registry. We re-scan all indexed servers nightly.

If you maintain an MCP server: review your tool descriptions and remove anything a human developer would not need in order to understand and call the tool. Nothing in a tool description should address the LLM directly. If it does, it’s a vulnerability.
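One cheap pre-publish check a maintainer could run is to list every invisible format character in a description. The sketch below uses Python's standard unicodedata module; the function name and the choice to flag the whole Unicode "Cf" category are assumptions for illustration, not part of MCPSafe.

```python
import unicodedata


def invisible_characters(description):
    """List (index, code point, name) for each Unicode format character.

    Category "Cf" covers bidi controls, zero-width characters, and
    similar code points that usually render as nothing.
    """
    found = []
    for index, char in enumerate(description):
        if unicodedata.category(char) == "Cf":
            name = unicodedata.name(char, "UNKNOWN")
            found.append((index, f"U+{ord(char):04X}", name))
    return found


# Hypothetical description containing two format characters.
for hit in invisible_characters("Lists files.\u200bDone\u202e"):
    print(hit)
```

Some legitimate text contains "Cf" characters (a soft hyphen, or a zero-width joiner inside an emoji sequence), so a hit is a prompt for manual review rather than proof of injection.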

Reproducibility

Container: sha256:d4f8a2c1e7b9… · Scanner: v2.4.1 · Methodology

Notes

1. Tested against four major commercial and open-weight LLMs. All four followed injected instructions in tool descriptions without surfacing them to the user.
2. Full package list, SHA-256 hashes, and labeled corpus at /method#corpus. Two packages were removed from npm after responsible disclosure. Five remain live as of April 18, 2026.
