After ten days of running the v1.1.0 publishability rubric against every MCP server I can find on npm under the official @modelcontextprotocol scope, the cluster pattern is now hard to ignore.
6 of 6 official Anthropic-shipped MCP servers score 56–60/100 on the v1.1.0 publishability composite. The cap that fires is the same axis every time: description-five-axis.
| Server | Composite | Protocol | Edge cases | Publish | Per-tool axis avg | Cap |
|---|---:|---:|---:|---:|---:|---|
| server-sequential-thinking | 60 | 100 | 100 | 20 | n/a (single tool) | description-five-axis |
| server-memory | 60 | 100 | 85 | 50 | 1.00 / 5 | description-five-axis |
| server-everything | 60 | 100 | 94 | 20 | 0.55 / 5 | description-five-axis |
| server-filesystem | 60 | 100 | 57 | 50 | 0.88 / 5 | description-five-axis |
| server-github (legacy) | 60 | 100 | 26 | 50 | 0.44 / 5 | description-five-axis |
| server-puppeteer (deprecated) | 56 | 100 | 50 | 20 | 0.17 / 5 | description-five-axis |
Every protocol score is 100. The wire format is right on every server. The 40-to-44-point gap to a perfect composite comes entirely from how the schemas read.
## What "0.17 / 5" looks like in practice
Take Puppeteer's `puppeteer_navigate`. The full schema description is:

> Navigate to a URL.
Score that against the 5 axes:
- Purpose — "navigate to a URL" ✓ (1 axis)
- Mutation signal — does it read or write? Silent. ✗
- Side-effects — network call, can hit any URL, executes JS, arbitrary cookie state. High-blast. Silent. ✗
- Invariants — does it close existing tabs? Open a new one? Same tab? Silent. ✗
- Examples — none. ✗
1 / 5. The other six Puppeteer tools score the same way. Average 0.17.
A planner LLM that has to decide whether to call puppeteer_navigate from a tool list of 7 has nothing to pattern-match on. It cannot tell the difference between puppeteer_navigate (mutates browser state, can hit any URL) and puppeteer_screenshot (read-only, current page only) from the schema alone — they read identically.
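To make the five axes concrete, here is a minimal sketch of a manual checklist in Python. It is not how `mcp-probe` scores descriptions; it only counts which axes an author has filled in. The rewritten description is hypothetical, and its behavioral claims (same tab, cookies may change) are illustrative assumptions rather than verified facts about the Puppeteer server.

```python
# Manual five-axis checklist. Illustrative only: mcp-probe's real scoring
# logic is defined in its spec, not here.
AXES = ("purpose", "mutation_signal", "side_effects", "invariants", "examples")

# What the schema says today: only the purpose axis is filled.
original = {"purpose": "Navigate to a URL."}

# Hypothetical rewrite. The behavior described is an assumption made for
# illustration, not a statement about what the server actually does.
rewritten = {
    "purpose": "Navigate the current browser tab to a URL.",
    "mutation_signal": "Mutates browser state: replaces the current page.",
    "side_effects": "Sends network requests to the URL and runs the page's JavaScript; cookies may change.",
    "invariants": "Reuses the existing tab; does not open or close other tabs.",
    "examples": 'Example input: {"url": "https://example.com"}',
}


def axes_covered(parts: dict) -> int:
    """Count how many of the five axes have non-empty text."""
    return sum(1 for axis in AXES if parts.get(axis))


def render_description(parts: dict) -> str:
    """Join the filled-in axes, in a fixed order, into one description string."""
    return " ".join(parts[axis] for axis in AXES if parts.get(axis))


print(axes_covered(original), "/ 5")   # 1 / 5
print(axes_covered(rewritten), "/ 5")  # 5 / 5
print(render_description(rewritten))
```

Writing the axes as separate fields and joining them at the end keeps an empty axis visible while drafting, which is the point of the rubric.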
## Why this matters more than it looks
The reference servers are calibration anchors. When a server author opens the docs to figure out "what does a good MCP server look like", they read these. When an LLM coding agent autocompletes a new MCP server skeleton, it pattern-matches on these. When the spec doc shows "here's how to write a tool", it links to these.
If the bar Anthropic ships at is 56–60/100, that's the bar most third-party servers will start from too — and probably stay at, because there's no public benchmark telling them they're under it.
That's the v1.1.0 thesis: surface the bar so authors can decide where they want to land. `mcp-probe score` is one command.

```bash
npx -y @incultnitollc/mcp-probe score "" --full
```
The 5-axis breakdown tells you exactly which axis is empty on which tool. Per-tool axis avg below 3.0/5 fires the ≤60 publishability cap. Fix two axes per tool (mutation signal + one concrete example is usually fastest) and the cap lifts.
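The cap rule stated above can be sketched in a few lines of Python. This is a hedged illustration, not the tool's implementation: the function name, the way the uncapped composite is passed in, and the sample scores are all assumptions. Only the 3.0 threshold and the 60 ceiling come from the text.

```python
from statistics import mean

CAP_THRESHOLD = 3.0  # per-tool axis average below this fires the cap
CAP_CEILING = 60     # composite is held at or below this while the cap fires


def apply_description_cap(uncapped_composite, per_tool_axis_scores):
    """Return (final_composite, axis_average, cap_fired).

    uncapped_composite is a hypothetical input: how mcp-probe combines the
    other axes into a composite is not modeled here.
    """
    avg = mean(per_tool_axis_scores)
    cap_fired = avg < CAP_THRESHOLD
    final = min(uncapped_composite, CAP_CEILING) if cap_fired else uncapped_composite
    return final, avg, cap_fired


# Four hypothetical tools that each cover only the purpose axis.
print(apply_description_cap(85, [1, 1, 1, 1]))  # (60, 1, True)

# Same tools after adding a mutation signal and one example each (3 of 5 axes).
print(apply_description_cap(85, [3, 3, 3, 3]))  # (85, 3, False)
```

Under these assumptions, moving every tool from 1/5 to 3/5 brings the average to exactly 3.0, which is no longer below the threshold, so the cap stops applying.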
## Methodology
- v1.1.0 spec: <https://github.com/Incultnitollc/mcp-probe/blob/main/docs/specs/publishability-score-v1.1.0.md>
- Calibration drift notes: <https://github.com/Incultnitollc/mcp-probe/blob/main/docs/specs/publishability-score-v1.1.0-amendments.md>
- 6-server summary (canonical): <https://github.com/Incultnitollc/mcp-probe/blob/main/docs/publishability-scorecards/SUMMARY.md>
- Individual server scorecards: under `docs/publishability-scorecards/` in the same repo
## Caveat — install-time security is a different lane
`mcp-probe` checks pre-publish quality (for server authors, before they ship). For install-time security (for server installers, before they connect a third-party server), see [`@stephenywilson/mcp-doctor`](https://www.npmjs.com/package/@stephenywilson/mcp-doctor). Different audience, different lane, complementary tool.