Methodology
How we decide if an MCP server is safe.
No black box. Every grade on this site comes from a pipeline you can audit. Here’s the whole thing — the rules, the model votes, the scoring math, and what we explicitly don’t do.
The scan pipeline
When you submit a target, we fetch the source — an npm package, a PyPI package, or a GitHub repository — run it through our detection engine, and aggregate everything into a single letter grade with a signal-score breakdown. All infrastructure is hosted in the EU (Frankfurt).
A Fast scan runs static analysis, manifest checks, and CVE lookups; it targets a p95 of under 3 minutes, with a hard cap of 20. A Deep scan adds taint-flow analysis and an independent model consensus panel; it targets a p95 of under 20 minutes, with a hard cap of 30, and is available to signed-in users.
Supported targets
Paste any of the following into the scan box. The parser resolves bare names, URLs, version constraints, and official registry IDs.
npm
| Input | Resolves to |
|---|---|
| express | latest version on npm |
| @modelcontextprotocol/sdk | scoped package, latest |
| npm:fastify | explicit prefix, latest |
| npm:@modelcontextprotocol/server-filesystem | scoped with prefix, latest |
| npm:lodash@4.17.21 | pinned version |
| @modelcontextprotocol/sdk@1.0.0 | scoped, pinned version |
| https://www.npmjs.com/package/express | npm URL, latest |
| https://www.npmjs.com/package/@modelcontextprotocol/server-github/v/0.6.2 | npm URL, pinned version |
PyPI
Bare names (e.g. requests) default to npm. Use the pypi: prefix or a version constraint to target PyPI.
| Input | Resolves to |
|---|---|
| pypi:requests | latest on PyPI |
| pypi:mcp | Anthropic MCP Python SDK, latest |
| requests==2.31.0 | pinned — == operator detected as PyPI |
| mcp>=1.0.0 | range constraint, scans latest matching |
| httpx[http2]>=0.24.0 | extras stripped, range resolved to latest |
| https://pypi.org/project/mcp/ | PyPI URL, latest |
| https://pypi.org/project/requests/2.31.0/ | PyPI URL, pinned version |
GitHub
| Input | Resolves to |
|---|---|
| modelcontextprotocol/servers | HEAD of default branch |
| github:modelcontextprotocol/servers | explicit prefix, HEAD |
| https://github.com/modelcontextprotocol/servers | GitHub URL, HEAD |
| https://github.com/modelcontextprotocol/servers.git | .git suffix stripped, HEAD |
| https://github.com/modelcontextprotocol/servers/tree/main | pinned branch |
| https://github.com/modelcontextprotocol/servers/tree/v1.2.0 | pinned tag |
Docker
| Input | Resolves to |
|---|---|
| nginx:latest | Docker Hub image with tag |
| nginx:1.27-alpine | pinned tag |
| docker:mcp/fetch | explicit prefix, resolves :latest |
| ghcr.io/owner/image:tag | GitHub Container Registry |
| gcr.io/project/image:tag | Google Container Registry |
| mcr.microsoft.com/mcp-server:latest | Microsoft Container Registry |
| nginx@sha256:abc123 | pinned digest |
Official MCP Registry
Reverse-domain IDs from registry.modelcontextprotocol.io. io.github.* IDs resolve to the actual GitHub repo via the registry, so the version captured is the registry’s current release.
| Input | Resolves to |
|---|---|
| io.github.modelcontextprotocol/servers | looks up repo + version from MCP registry |
| io.github.punkpeye/fastmcp | resolves to github:punkpeye/fastmcp@<registry-version> |
| ai.anthropic/claude-code | MCP registry server ID (non-GitHub) |
| https://registry.modelcontextprotocol.io/servers/io.github.punkpeye/fastmcp | full registry URL |
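The resolution rules above can be approximated in a few lines. This is an illustrative sketch, not the production parser: `resolve_target` and the `mcp-registry` label are invented names, and Docker images plus several URL edge cases (version paths, registry hosts) are omitted.

```python
import re

def resolve_target(raw: str) -> tuple[str, str]:
    """Guess the ecosystem and canonical name for a raw scan input.
    Simplified sketch; Docker images and several edge cases are omitted."""
    s = raw.strip()
    # 1. Explicit prefixes always win.
    for eco in ("npm", "pypi", "github", "docker"):
        if s.startswith(eco + ":"):
            return eco, s[len(eco) + 1:]
    # 2. Registry URLs map to their ecosystem.
    if "npmjs.com/package/" in s:
        return "npm", s.split("npmjs.com/package/", 1)[1]
    if "pypi.org/project/" in s:
        return "pypi", s.split("pypi.org/project/", 1)[1].strip("/")
    if "github.com/" in s:
        return "github", s.split("github.com/", 1)[1].removesuffix(".git")
    # 3. Reverse-domain IDs (io.github.*, ai.anthropic/*) hit the MCP registry.
    if re.match(r"^[a-z]{2,}(\.[a-z0-9-]+)+/", s):
        return "mcp-registry", s
    # 4. A version operator (==, >=, ...) marks a PyPI requirement;
    #    extras like [http2] are stripped from the name.
    parts = re.split(r"==|>=|<=|~=|!=|>|<", s, maxsplit=1)
    if parts[0] != s:
        return "pypi", re.sub(r"\[[^\]]*\]", "", parts[0])
    # 5. owner/repo shorthand (no @scope) means GitHub.
    if re.fullmatch(r"[\w.-]+/[\w.-]+", s) and not s.startswith("@"):
        return "github", s
    # 6. Everything else defaults to npm: bare or @scoped names.
    return "npm", s
```

Note how order matters: the reverse-domain check must run before the owner/repo shorthand, or `io.github.punkpeye/fastmcp` would be mistaken for a GitHub slug.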
The rules
Every new rule goes through a precision review before any finding it produces affects a user-visible scan result.
What we check today: destructive-tool annotations without confirmation, runtime secret exfiltration, over-broad permissions, OAuth over-scoping, prompt injection into inner LLMs, over-broad input schemas, install-time remote-exec hooks, typosquat package names, known CVEs, containers running as root, plaintext secrets in environment files, and more. Browse the full detection rules list, or see how it maps to the MCP Top 10.
The model consensus (Deep scans only)
Five independent models from four different vendors vote on each tool handler. No single model can unilaterally move a score. We record every vote and show you the full judge panel on the result page — including disagreements.
- Anthropic Claude Haiku 4.5 (via Bedrock, Frankfurt)
- Anthropic Claude 3.7 Sonnet (via Bedrock, Frankfurt)
- Google Gemini 2.5 Flash (via Vertex AI, Frankfurt)
- Mistral Small (via Mistral La Plateforme, Paris)
- OpenAI GPT-4o-mini (via OpenAI API)
Per-judge verdicts are aggregated as a cross-judge median rather than a majority vote, so a single outlier vote can't move the score.
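A toy illustration of that property, assuming verdicts on a 0–10 risk scale (the actual verdict format may differ):

```python
from statistics import median

def aggregate_verdicts(scores: list[float]) -> float:
    """Cross-judge aggregation: the panel's median risk verdict."""
    return median(scores)

honest = [2.0, 2.5, 2.0, 3.0, 2.5]
with_outlier = [2.0, 2.5, 2.0, 3.0, 10.0]  # one judge flips to maximum risk

# The median is 2.5 in both cases; a mean would jump from 2.4 to 3.9.
```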
Model votes are not the scan. They are a second opinion on semantic intent. A rule finding alone will flag a server; model votes adjust the score, not the verdict.
The grade
We publish a 0–100 safety score and a letter grade. The score is a weighted average across signal categories including injection, secrets, permissions, supply chain, destructive actions, CVEs, typosquats, server configuration, and community signals.
The letter grade is derived from the package's AIVSS score (the maximum individual finding score). Grade thresholds: A (AIVSS < 2), B (2–3.9), C (4–6.9), D (7–8.9), F (≥ 9). A single high-severity finding (AIVSS ≥ 7) is enough to push the grade to D; a critical finding (AIVSS ≥ 9) pushes it to F.
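The threshold mapping can be sketched directly (the function name and input shape here are illustrative):

```python
def letter_grade(finding_scores: list[float]) -> str:
    """Map the worst individual finding's AIVSS score (0-10) to a letter
    grade, per the thresholds above. No findings at all grades as A."""
    worst = max(finding_scores, default=0.0)
    if worst < 2:
        return "A"
    if worst < 4:
        return "B"
    if worst < 7:
        return "C"
    if worst < 9:
        return "D"
    return "F"
```

For example, a single 7.2 finding among otherwise low-severity results yields a D, no matter how good the weighted average looks.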
Public vs Private scans
Public scans answer “is this MCP server safe to install?” for code anyone can pull from npm, PyPI, GitHub, or a container registry. Results are attested with a shareable URL and appear in the public /registry. They’re free to run, anonymously or signed in.
Private scans answer “is the server we’re shipping safe?” — the same engine pointed at code your users haven’t seen yet. Results are isolated to your account, encrypted at rest, never written to the public registry, and unreachable from any public-scan code path (enforced at the IAM policy layer). Private scans require a paid plan; see /pricing.
Both visibilities support Fast and Deep scan modes; the two axes are independent, so all four combinations are valid. The strongest pre-launch posture is a Deep scan on Private before the first public release.
What we don’t do
- We don’t execute your code. Static analysis only — no sandboxed runtime, no sample inputs, no side effects.
- We don’t store your source. Fetched packages are used for the duration of the scan only.
- We don’t harvest credentials. Secret detection flags secrets that leak out of the server; we never collect or reuse what we find.
- We don’t re-sell scan data. Public grades appear in /registry; private scans are isolated to your account and never visible to other users.
- We don’t claim perfect recall. Every scanner has blind spots — ours are listed below.
Limitations
Today we focus on Python, TypeScript, and JavaScript source files, Dockerfiles, and common manifest formats. Other languages pass through unscanned at the rule level, though CVE lookup and typosquat checks still apply.
A grade on MCPSafe reflects the code, not a running instance. Live endpoint probes — TLS enforcement, header audits, unauthenticated endpoint detection — are on the roadmap.
Cross-file data flow analysis is a future capability. Current taint-flow rules operate within a single file.
Found a problem?
A false positive is a bug, not acceptable noise. Email security@mcpsafe.io with the scan URL and we’ll add a fixture and retune the rule. The same goes for missing checks: if a published CVE or a class of attack isn’t caught, we want to know.