Dennis Traub for AWS

How a subtle MCP server bug almost cost me $230 a month

An important part of my job is to collect and distill feedback into recommendations for product and engineering teams. Sure, not quite as glamorous as traveling the world, but it's a lot of fun: I'm getting paid to experiment with brand new tech - and I get to directly influence the developer experience of our products.

But - just like every job - there's also a lot of routine work involved, including the boring type. And if there's anything my ADHD brain hates - with a passion! - it's boring routine work. And there's one thing right at the top of the list: processing tasks in a project management tool.

So I built an AI agent to help me: triage tasks, add context, draft comments, move items around - all through an MCP server.

As I said. Routine work. Boring and predictable. Right?

Right. Until I noticed how long the agent was taking for what should have been simple updates.

Three Calls, Zero Updates

The MCP server has an update_task tool, and the agent called it with a custom_fields parameter. The server took the request, processed it, and returned success.

But when the agent continued, the custom fields were unchanged. It tried updating again - with a different format. Success. But nothing changed. Third attempt. Success. Still nothing.

Three success responses. Zero successful updates. And the API never told the agent that anything was wrong.

Tracking It Down

So after multiple failed attempts, the agent started investigating on its own. It checked whether it had the right access permissions, and whether it could use curl to bypass the MCP layer entirely.

curl worked, showing that the problem wasn't permissions. So it had to be the tool itself.

After some more back and forth, the agent discovered that create_batch_request - a completely different MCP tool - is the only way to update custom fields. The update_task tool accepts the custom_fields parameter without complaint, but the parameter isn't actually in the tool's schema. The tool silently drops it, updates everything else, and returns a success message.
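
I don't know what the server's code actually looks like, but the behavior matches a pattern that's easy to write by accident: the handler keeps the fields it knows about and quietly ignores the rest. Here's a hypothetical sketch (all names are made up, not the actual server code):

```python
# Hypothetical sketch of the anti-pattern - NOT the actual server code.
KNOWN_FIELDS = {"title", "description", "status", "assignee"}

def apply_update(task_id: str, fields: dict) -> None:
    ...  # persist the accepted fields to the project management tool

def update_task(task_id: str, **params) -> dict:
    # Keep only the parameters the schema knows about...
    accepted = {k: v for k, v in params.items() if k in KNOWN_FIELDS}

    # ...so custom_fields gets dropped right here, without any warning.
    apply_update(task_id, accepted)

    # And the response still looks like everything worked.
    return {"status": "success", "task_id": task_id}
```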

Maybe a small issue, if it happened only once.

But my logs showed 16 silently failed attempts across 7 tasks. The same cycle every time: try, get "success," see nothing changed, investigate, try again, finally find a workaround.

The agent kept hitting the same wall because the API never told it the wall existed.

A crash would have been so much better - the agent would see the error and immediately try a different tool. Instead, it got a success response and had to figure out through downstream verification that the "successful" call hadn't been successful after all.

Each time a new agent instance worked on a task, it went through the same process, wasting ~93K tokens just to figure out that there was a problem - and how to work around it. Learning wasn't possible, because there was no error to learn from.

Let's Look at the Math

Every number below comes directly from my session logs.

| Metric | Value |
| --- | --- |
| Wasted tokens per failed attempt | ~93,000 |
| Average failed attempts per task | 2.3 |
| Cost per attempt (Claude Opus at $5/MTok) | ~$0.47 |
| Cost per task | ~$1.08 |

Now imagine a 5-person team with 10 tasks per person per day.

| Timeframe | Per person | Team of 5 |
| --- | --- | --- |
| Per day (10 tasks) | $10.80 | $54 |
| Per month (22 working days) | $238 | $1,190 |
| Per year | $2,856 | $14,280 |
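
If you want to sanity-check the tables, the arithmetic fits in a few lines. A quick sketch (the tables round the per-attempt cost to $0.47 before multiplying, so the end results differ by a few cents):

```python
# Back-of-the-envelope check of the tables above.
wasted_tokens_per_attempt = 93_000
price_per_mtok = 5.00                  # $ per million tokens (Claude Opus)
failed_attempts_per_task = 2.3

cost_per_task = wasted_tokens_per_attempt / 1_000_000 * price_per_mtok * failed_attempts_per_task
# -> ~$1.07 per task before rounding

tasks_per_day = 10
working_days_per_month = 22
team_size = 5

per_person_per_month = cost_per_task * tasks_per_day * working_days_per_month
per_team_per_year = per_person_per_month * team_size * 12
# -> roughly $235 per person per month, and ~$14,000 per year for a team of five
```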

This is one parameter, in one MCP tool, on a single workflow. And even that scales really fast.

Two things are worth mentioning: this models the cost before the workaround is discovered. Once you know to use create_batch_request, the waste drops to zero. And it assumes every task hits the bug - which was true in my case, since every triage task needed custom field updates.

The point isn't the exact dollar figure. It's that silent failures delay discovery - possibly indefinitely.

A crash costs one attempt. The agent sees the error, adjusts, moves on. But silent acceptance costs multiple attempts, every time, until you realize there's a problem - if you realize it at all.

In software engineering, there's a concept called "graceful degradation". But if your API - MCP server or not - accepts parameters it can't handle and returns success, you're not being graceful. You're being expensive.

What You Can Do

If you build APIs or MCP tools: Add input validation. Reject or warn on unrecognized parameters. My data says that one check would have saved every token and every dollar in this story.
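
Here's roughly what that check could look like - a minimal sketch with made-up field names, not code from any actual server:

```python
# Sketch of explicit input validation - field names are made up.
ALLOWED_PARAMS = {"title", "description", "status", "assignee"}

def apply_update(task_id: str, fields: dict) -> None:
    ...  # persist the accepted fields

def update_task(task_id: str, **params) -> dict:
    unknown = set(params) - ALLOWED_PARAMS
    if unknown:
        # Fail loudly: the agent gets a clear signal and can switch to a
        # different tool immediately instead of burning tokens on retries.
        return {
            "status": "error",
            "message": f"Unrecognized parameters: {sorted(unknown)}. "
                       f"Supported: {sorted(ALLOWED_PARAMS)}.",
        }

    apply_update(task_id, params)
    return {"status": "success", "task_id": task_id}
```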

If you build agents: Verify state after every state-changing call. Don't trust success responses - confirm the change actually happened. And make sure your agent surfaces anything it didn't expect.
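
And a sketch of what that verification could look like on the agent side - `mcp_call`, the tool names, and the field handling are placeholders for whatever client and schema you actually use:

```python
# Agent-side verification sketch. mcp_call, get_task, and update_task are
# placeholders for whatever MCP client and tools you actually use.
from typing import Any

def mcp_call(tool: str, **arguments: Any) -> dict:
    raise NotImplementedError("wire this up to your MCP client")

def set_custom_fields(task_id: str, fields: dict) -> None:
    result = mcp_call("update_task", task_id=task_id, custom_fields=fields)
    if result.get("status") != "success":
        raise RuntimeError(f"update_task failed: {result}")

    # Don't trust the success response - read the task back and compare.
    task = mcp_call("get_task", task_id=task_id)
    applied = task.get("custom_fields", {})
    not_applied = {k: v for k, v in fields.items() if applied.get(k) != v}
    if not_applied:
        # Surface the discrepancy so the agent (or a human) can react,
        # instead of silently retrying the same call.
        raise RuntimeError(
            f"update_task reported success, but these custom fields were "
            f"not applied: {not_applied}"
        )
```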


Some say we don't need to worry about architecture anymore, because AI agents are able to figure things out.

But I think it's the opposite: Software architecture principles become more important with AI, not less.


This is Part 1 of "The Inconsistency Tax" - a 3-part series on what happens when AI agents meet inconsistent APIs. Next: why these failures aren't random, and why "just wrapping an API in an MCP server" doesn't automatically make it agent-ready.
