If you’ve ever built or worked with AI agents that use tools via the Model Context Protocol (MCP), you’ve probably felt the pain that nobody talks about out loud:
The tool catalog is eating your entire context window and budget.
A single tool defined in MCP JSON Schema typically consumes 100–270 tokens. With 50 tools installed, you’re already spending 5,000–13,500 tokens before the user even writes their first message.
This isn’t just expensive — it actively hurts performance:
- Higher cost on every single request
- Lower tool-selection accuracy as the catalog grows (attention dilution)
- Less room for actual user instructions, memory, or reasoning
The good news? There’s a clean, elegant solution: TERSE Tool Catalog (TTC).
The Problem with Today’s MCP JSON Schema
The current MCP format was designed for machine-to-machine execution contracts, not for LLM reasoning. As a result:
- There is no explicit trigger condition (WHEN): the LLM has to guess from a free-form description string.
- There is no error contract (ERR): the model has no idea what to do when a tool fails.
- There is no retrieval taxonomy (TAGS): dynamic tool retrieval (RAG over tools) becomes painful.
- Verbose parameter descriptions add noise with almost zero signal for the LLM.
The result is high cost + mediocre tool selection.
Introducing the TERSE Tool Catalog (TTC)
TTC is an official extension of the TERSE Format — a specification for dense, deterministic, human-and-machine-readable representations optimized for LLMs.
It is not just a compression of MCP JSON. It is a semantic reformulation of the tool contract.
TTC keeps everything the LLM actually needs for execution, condenses the free-form description into a one-line PURPOSE (clear intent), and adds three fields that MCP is missing:
- WHEN: explicit semantic trigger (the most important field for selection)
- ERR: declared failure modes
- TAGS: taxonomy for semantic grouping and retrieval
Measured result: average 66.6% token reduction with net information gain.
TTC Syntax — Clean and Simple
TOOL <tool-id>
PURPOSE: <one-line description of what the tool does>
IN: <param1>:<type>, <param2>:<type>?
OUT: <return-type>
ERR: <error1> | <error2> | <error3>
WHEN: <natural language trigger condition>
TAGS: <tag1>, <tag2>, <tag3>
Supported Types
- string, int, float, bool
- array[string], array[int], etc.
- object, any
The ? suffix marks an optional parameter.
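To make the grammar concrete, here is a minimal parser sketch for a single TOOL entry. It is an illustration under stated assumptions, not the reference implementation: the `Param` and `Tool` classes and the `parse_params` and `parse_tool` functions are hypothetical names, and the sketch assumes one field per line, comma-separated parameters, and `|`-separated error codes, as in the syntax above.

```python
from dataclasses import dataclass, field


@dataclass
class Param:
    name: str
    type: str
    optional: bool = False


@dataclass
class Tool:
    tool_id: str
    purpose: str = ""
    params: list = field(default_factory=list)
    out: str = ""
    errors: list = field(default_factory=list)
    when: str = ""
    tags: list = field(default_factory=list)


def parse_params(spec: str) -> list:
    """Parse 'to:string, cc:string?' into Param objects."""
    params = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, type_ = item.partition(":")
        type_ = type_.strip()
        # A trailing '?' marks the parameter as optional.
        optional = type_.endswith("?")
        params.append(Param(name.strip(), type_.rstrip("?"), optional))
    return params


def parse_tool(text: str) -> Tool:
    """Parse one TOOL block (hypothetical helper, not part of the spec)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("TOOL "):
        raise ValueError("entry must start with 'TOOL <tool-id>'")
    tool = Tool(tool_id=lines[0][len("TOOL "):].strip())
    for line in lines[1:]:
        # Split on the first colon only, so 'IN: to:string' keeps its types.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'KEY: value', got {line!r}")
        key, value = key.strip(), value.strip()
        if key == "PURPOSE":
            tool.purpose = value
        elif key == "IN":
            tool.params = parse_params(value)
        elif key == "OUT":
            tool.out = value  # kept as raw text in this sketch
        elif key == "ERR":
            tool.errors = [e.strip() for e in value.split("|") if e.strip()]
        elif key == "WHEN":
            tool.when = value
        elif key == "TAGS":
            tool.tags = [t.strip() for t in value.split(",") if t.strip()]
        else:
            raise ValueError(f"unknown field {key!r}")
    return tool
```

Passing an entry such as `IN: to:string, cc:string?` through `parse_tool` should yield a required `to` and an optional `cc`, both typed `string`.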
Real-World Example: gmail_send_email
MCP JSON Schema (208 tokens):
{
"name": "gmail_send_email",
"description": "Sends an email message via the Gmail API to one or more recipients...",
"inputSchema": { ... } // very verbose
}
TTC (55 tokens):
TOOL gmail_send_email
PURPOSE: send email via Gmail
IN: to:string, subject:string, body:string, cc:string?
OUT: message_id:string
ERR: auth_failed | quota_exceeded | invalid_recipient
WHEN: user wants to send or compose an email
TAGS: gmail, email, communication
Same semantic content. 73.6% fewer tokens. And the LLM now has structured fields to make much better decisions.
Real Benchmark (10 Production Tools)
| Tool | JSON Schema | TTC | Reduction |
|---|---|---|---|
| gmail_send_email | 208 | 55 | 73.6% |
| gmail_read_inbox | 121 | 52 | 57.0% |
| drive_list_files | 141 | 53 | 62.4% |
| calendar_create_event | 262 | 78 | 70.2% |
| slack_send_message | 206 | 69 | 66.5% |
| github_create_issue | 269 | 84 | 68.8% |
| ... | ... | ... | ... |
| TOTAL (10 tools) | 1948 | 650 | 66.6% |
Projection at scale:
- 50 tools → ~9,740 → ~3,250 tokens
- 100 tools → ~19,480 → ~6,500 tokens
Savings at 100 tools: ~13,000 tokens per request
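The projection is plain arithmetic on the benchmark totals, sketched below. The only assumption is that a larger catalog has the same average tokens per tool as the 10 benchmarked tools. The `reduction` and `project` function names are illustrative.

```python
# Totals from the 10-tool benchmark table.
JSON_TOTAL = 1948
TTC_TOTAL = 650
N_BENCHMARKED = 10


def reduction(before: int, after: int) -> float:
    """Percentage of tokens saved, rounded to one decimal place."""
    return round(100 * (1 - after / before), 1)


def project(n_tools: int) -> tuple:
    """Scale the benchmark totals linearly to a catalog of n_tools.

    Returns (json_tokens, ttc_tokens, tokens_saved).
    Assumes the benchmark average per tool holds at any catalog size.
    """
    json_tokens = JSON_TOTAL * n_tools // N_BENCHMARKED
    ttc_tokens = TTC_TOTAL * n_tools // N_BENCHMARKED
    return json_tokens, ttc_tokens, json_tokens - ttc_tokens


print(reduction(JSON_TOTAL, TTC_TOTAL))  # 66.6
print(project(50))                       # (9740, 3250, 6490)
print(project(100))                      # (19480, 6500, 12980)
```

The 100-tool saving of 12,980 tokens is the "~13,000" figure quoted above.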
Why TTC Works So Well
It follows the core TERSE principles:
- Maximum information density per token
- Determinism (same input → same output)
- Human + machine readability
- Full composability (tools → servers → agent context)
And it adds exactly what LLMs need for better reasoning:
- WHEN becomes the primary discriminator for tool selection
- ERR enables graceful degradation and fallback strategies
- TAGS makes dynamic tool retrieval (RAG over tools) trivial
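As a rough illustration of what a declared ERR contract could enable on the agent side, the sketch below maps each declared error code to a recovery strategy. The strategy names and the `plan_recovery` helper are hypothetical and not part of TTC; only the error codes come from the gmail_send_email example.

```python
# Error codes declared in each tool's ERR field (from the TTC catalog).
DECLARED_ERRORS = {
    "gmail_send_email": {"auth_failed", "quota_exceeded", "invalid_recipient"},
}

# Hypothetical recovery policy; the strategy names are placeholders.
RECOVERY_POLICY = {
    "auth_failed": "reauthenticate",
    "quota_exceeded": "retry_later",
    "invalid_recipient": "ask_user",
}


def plan_recovery(tool_id: str, error_code: str) -> str:
    """Pick a fallback strategy for a failed tool call.

    Errors that the tool did not declare in ERR are treated as
    unexpected, and the agent aborts instead of guessing.
    """
    declared = DECLARED_ERRORS.get(tool_id, set())
    if error_code not in declared:
        return "abort"
    return RECOVERY_POLICY.get(error_code, "abort")
```

Because the failure modes are known before the call, this kind of policy can be decided up front instead of being improvised from an opaque error string.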
How to Use It in Your Agent Context
At the start of a conversation (or via dynamic retrieval), you inject:
TOOLS v1.0 [3/47]
MCP gmail v1.2
TOOL gmail_send_email
...
MCP google_drive v2.0
TOOL drive_read_file
...
With semantic tool retrieval, you inject only the 3–5 most relevant tools per request. Context cost is then bounded by the number of tools retrieved rather than by catalog size, so it stays roughly flat no matter how large your total catalog grows.
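Here is a minimal sketch of that retrieval step. It uses keyword overlap on TAGS and WHEN as a stand-in for real semantic retrieval, which would normally use embeddings. The function names, the stopword list, and the scoring weights are assumptions. So is the reading of `[3/47]` as "tools injected / tools in the catalog". The per-server `MCP <name> <version>` grouping lines are left out for brevity.

```python
import re

# Tiny illustrative stopword list; not part of TTC.
STOPWORDS = {"a", "an", "the", "to", "or", "and", "user", "wants"}


def keywords(text: str) -> set:
    """Lowercased word set with stopwords removed."""
    return set(re.findall(r"[a-z0-9_]+", text.lower())) - STOPWORDS


def score(query: str, tool: dict) -> int:
    """Score a tool against a query; tag hits count double (arbitrary weight)."""
    words = keywords(query)
    tag_hits = len(words & {t.lower() for t in tool["tags"]})
    when_hits = len(words & keywords(tool["when"]))
    return 2 * tag_hits + when_hits


def select_tools(query: str, catalog: list, k: int = 3) -> list:
    """Return up to k tools with a non-zero score, best first.

    Each tool is a dict with keys "id", "tags", "when" and "ttc"
    (the full TTC text of the entry).
    """
    scored = [(score(query, tool), tool) for tool in catalog]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:k]]


def render_context(selected: list, total: int) -> str:
    """Build the block to inject: a header line followed by the TTC entries."""
    header = f"TOOLS v1.0 [{len(selected)}/{total}]"
    return "\n".join([header] + [tool["ttc"] for tool in selected])
```

With a catalog containing the gmail_send_email entry above, a query like "send an email to Bob" matches both the `email` tag and the WHEN text, so that tool ranks first.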
Reference Converter (Python)
The author provides a ready-to-use reference implementation:
github.com/RudsonCarvalho/terse-format
It converts MCP JSON Schema → TTC with sensible defaults. For production use, you simply add explicit annotations for OUT, ERR, WHEN, and TAGS on the server side.
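To give a feel for what such a conversion involves, here is an independent sketch. It is not the code from the repository above. The type mapping, the choice to take the first sentence of the description as PURPOSE, and the `annotations` parameter are all assumptions made for illustration. OUT, ERR, WHEN, and TAGS are emitted only when supplied, since they cannot be derived from the JSON Schema alone.

```python
from typing import Optional

# Assumed mapping from JSON Schema types to TTC types.
TYPE_MAP = {
    "string": "string",
    "integer": "int",
    "number": "float",
    "boolean": "bool",
    "object": "object",
}


def ttc_type(schema: dict) -> str:
    """Map a JSON Schema property to a TTC type, defaulting to 'any'."""
    json_type = schema.get("type")
    if json_type == "array":
        item_type = ttc_type(schema.get("items", {}))
        return f"array[{item_type}]"
    return TYPE_MAP.get(json_type, "any")


def to_ttc(tool: dict, annotations: Optional[dict] = None) -> str:
    """Render an MCP-style tool definition as a TTC entry (sketch only)."""
    # Accept both the MCP spelling and the snake_case variant.
    schema = tool.get("inputSchema") or tool.get("input_schema") or {}
    required = set(schema.get("required", []))
    params = []
    for name, spec in schema.get("properties", {}).items():
        suffix = "" if name in required else "?"
        params.append(f"{name}:{ttc_type(spec)}{suffix}")
    # Assumed default: first sentence of the description becomes PURPOSE.
    purpose = tool.get("description", "").split(".")[0].strip()
    lines = [
        f"TOOL {tool['name']}",
        f"PURPOSE: {purpose}",
        f"IN: {', '.join(params)}",
    ]
    for key in ("OUT", "ERR", "WHEN", "TAGS"):
        if annotations and key in annotations:
            lines.append(f"{key}: {annotations[key]}")
    return "\n".join(lines)
```

A call such as `to_ttc(tool, {"WHEN": "user wants to send or compose an email"})` would append the WHEN line after IN. Without annotations, only the three fields derivable from the schema are emitted.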
Planned Future Extensions
- EXAMPLE block: input/output examples for few-shot learning
- COST annotation: estimated token/latency cost per call
- CHAIN annotation: tool dependencies and composition patterns
- ALIAS field: alternative trigger phrases
- AUTH annotation: required OAuth scopes
Conclusion
The TERSE Tool Catalog is not just a token-saving trick. It is a genuine improvement in agent quality — better tool selection, better error handling, and native support for semantic tool retrieval.
If you work with agents, MCP, LangGraph, CrewAI, AutoGen, or any modern agentic framework, TTC is worth trying today.
Links
📄 Full spec (Zenodo): https://doi.org/10.5281/zenodo.19869007
💻 GitHub: https://github.com/RudsonCarvalho/terse-format/tree/main/extensions/ttc
🌐 Landing page: https://rudsoncarvalho.github.io/terse-format/
📦 TERSE Format (parent spec): https://doi.org/10.5281/zenodo.19058364
Top comments (4)
Cutting tool-catalog token usage attacks a cost most people don't even see. The tool definitions sit in the prompt on every single call whether or not the agent uses them, so a fat catalog is a flat tax on every request, and it scales with the number of tools, not the work done. A terser catalog is pure margin: you pay less per call for the same capability. The deeper insight your title hints at is that the catalog is also a context-quality problem, not just a cost problem: a huge verbose tool list doesn't only cost tokens, it makes the model choose worse, because more options and more noise dilute attention on the right tool. So trimming it usually helps accuracy and cost together, which is the best kind of optimization. The thing I'd watch is the floor: terse can't become ambiguous. The tool name and signature still have to carry enough meaning for the model to pick correctly, so the art is minimum tokens that preserve unambiguous selection, not just shortest. Trim the always-present overhead, but keep each tool legible enough to choose right. That cut-the-flat-tax-without-losing-clarity instinct is core to how I think about cost in Moonshift. Did the terser catalog also improve tool-selection accuracy, or purely the token bill?
Both, but for structural reasons, not as a side effect of compression. TTC attacks selection quality through three mechanisms that operate on the context itself.
First, WHEN turns selection from inference into matching. With MCP, the model has to reconstruct the trigger condition from a free-form description; with TTC, the discriminator is declared. Less reasoning spent on "what is this tool for?" means more attention on "is this the right tool now?"
Second, exactly the attention-dilution point you raised: a 66% smaller catalog isn't just cheaper, it's a cleaner signal-to-noise ratio over the same capability set. The tokens removed were mostly redundant parameter prose, noise by your own definition.
Third, and this is where it compounds: TAGS makes RAG-over-tools trivial, so at scale the model never even sees 50 tools... it sees the 3–5 relevant ones. That doesn't mitigate attention dilution; it removes it. Catalog size stops being a variable in selection quality at all.
On the floor: fully agree, and it's why TTC is a semantic reformulation rather than maximal compression. PURPOSE + WHEN + typed signature are the legibility floor; below that, you trade a token tax for a selection tax. Formal accuracy numbers (MCP vs. TTC, same model, same tasks) are the next thing I'm publishing.