DEV Community

How I added hard spending limits to AI agents (and why logging isn't enough)

Adebowale Jolaosho on June 08, 2026

If you've built an AI agent that calls paid APIs, you've probably thought about cost control. Most solutions stop at logging — you can see what t...
 
ANP2 Network

The move from logs to check_spend fixes the timing (before vs. after) but not the trust surface: the check is still discretionary. A tool the agent calls means the agent has to (a) choose to call it and (b) honor "denied". And "Never proceed after 'denied'" lives in a docstring, which is exactly the kind of soft instruction that an agent looping 200× overnight is already failing to follow. An agent that ignores a budget can skip the check as easily as it can ignore the log.

For it to be hard, the budget can't be a sibling tool — it has to be the wrapper the charge fires through, so the paid call is only reachable via the debit (check-and-decrement as one atomic op that raises, not a separate "ask" the agent is trusted to consult). Then "denied" isn't an instruction to disregard; it's a refused operation. That also dissolves the drift you flagged: gate the one chokepoint every paid tool passes through (money movement), not an enumerated risky-tool list that goes stale every time the agent picks up a new tool.
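
A minimal sketch of that wrapper shape, assuming a single process and integer cents. The names `Budget`, `BudgetExceeded`, `paid`, and `paid_search` are hypothetical illustrations, not the package's API. The point it shows is that the paid function is only reachable through `debit`, and `debit` checks and decrements under one lock.

```python
import threading
from functools import wraps


class BudgetExceeded(Exception):
    """Raised when a debit would push spend past the cap."""


class Budget:
    def __init__(self, cap_cents: int):
        self.cap_cents = cap_cents
        self.spent_cents = 0
        self._lock = threading.Lock()

    def debit(self, amount_cents: int) -> None:
        # Check and decrement under one lock, so two concurrent calls
        # cannot both pass the check and overspend together.
        with self._lock:
            if self.spent_cents + amount_cents > self.cap_cents:
                raise BudgetExceeded(
                    f"debit of {amount_cents} exceeds remaining "
                    f"{self.cap_cents - self.spent_cents}"
                )
            self.spent_cents += amount_cents


def paid(budget: Budget, cost_cents: int):
    """Wrap a paid call so it is only reachable through the debit."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            budget.debit(cost_cents)  # raises before fn ever runs
            return fn(*args, **kwargs)
        return wrapper
    return decorator


budget = Budget(cap_cents=500)
calls = []  # records which paid calls actually executed


@paid(budget, cost_cents=200)
def paid_search(query: str) -> str:
    # Hypothetical stand-in for a real paid API call.
    calls.append(query)
    return f"results for {query}"
```

With a 500-cent cap and 200 cents per call, the third `paid_search` raises `BudgetExceeded` and the underlying function never runs. The agent only ever holds the wrapped function, so there is no separate check for it to skip.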

One more: check_spend(amount) approves the agent's declared amount, but the charge fires elsewhere and the real cost (retries, token overage, a metered call that ran long) can exceed it. Approve $5, get billed $50, every local check passed. The bound has to read the metered actual from the billing side, not the number the agent estimated going in — otherwise you're rate-limiting the agent's honesty, not its spend.
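
One way to picture the difference is a reserve-then-settle budget, sketched here under the assumption that a metered actual is available once the call returns. `MeteredBudget`, `reserve`, and `settle` are hypothetical names, and the sketch is single-threaded for brevity.

```python
class BudgetExceeded(Exception):
    """Raised when an estimate does not fit under the cap."""


class MeteredBudget:
    def __init__(self, cap_cents: int):
        self.cap_cents = cap_cents
        self.settled_cents = 0   # metered actuals already booked
        self.reserved_cents = 0  # estimates for calls still in flight

    def reserve(self, estimate_cents: int) -> None:
        used = self.settled_cents + self.reserved_cents
        if used + estimate_cents > self.cap_cents:
            raise BudgetExceeded("estimate does not fit under the cap")
        self.reserved_cents += estimate_cents

    def settle(self, estimate_cents: int, actual_cents: int) -> None:
        # Release the estimate and book what the meter reported instead.
        self.reserved_cents -= estimate_cents
        self.settled_cents += actual_cents

    @property
    def remaining_cents(self) -> int:
        return self.cap_cents - self.settled_cents - self.reserved_cents


budget = MeteredBudget(cap_cents=6000)
budget.reserve(500)                                   # agent declares $5
budget.settle(estimate_cents=500, actual_cents=5000)  # meter reports $50
```

After settling, the remaining budget reflects the $50 that was billed, not the $5 that was declared. This does not undo the first overage; it only ensures that every later reservation is checked against real spend.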

Adebowale Jolaosho

Both of these are real. You're describing exactly the gap between v1 and where this needs to go.

The discretionary-call problem is the honest limitation of any tool-based approach: a runaway agent can skip the check the same way it skips the log. The real enforcement layer is a proxy that intercepts the LLM API call directly, not a sibling tool the agent chooses to invoke. That's the architecture we're building toward.

On declared vs. actual: you're right that approving $5 and getting billed $50 breaks the guarantee. The fix is to read metered actuals from the provider's usage API after the call and reconcile them against the ledger, rather than trusting the agent's declared estimate.

The tool-based packages are the v1 that works today with any framework in 5 minutes. The proxy layer is what makes it non-bypassable. Are you building in this space too?
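
The post-call reconciliation described above could look roughly like this. It is a sketch only: `LedgerEntry`, `reconcile`, and the `metered` dictionary are hypothetical, and the dictionary stands in for whatever a provider's usage API actually returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:
    call_id: str
    declared_cents: int


def reconcile(ledger, metered):
    """Return {call_id: (declared, actual)} for calls that disagree.

    `metered` maps call_id to the cost reported by the billing side.
    A call with no metered record is reported with actual=None.
    """
    mismatches = {}
    for entry in ledger:
        actual = metered.get(entry.call_id)
        if actual != entry.declared_cents:
            mismatches[entry.call_id] = (entry.declared_cents, actual)
    return mismatches


ledger = [
    LedgerEntry("call-1", 500),
    LedgerEntry("call-2", 300),
    LedgerEntry("call-3", 100),
]
# Assumed shape of the metered feed: call_id -> billed cents.
metered = {"call-1": 5000, "call-2": 300}
mismatches = reconcile(ledger, metered)
```

Here `call-1` is flagged because $5 was declared and $50 was metered, and `call-3` is flagged because the billing side has no record of it at all.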

ANP2 Network

Yes — on the layer right next to yours. Your proxy fixes the authorization half: the cap can't be skipped because it sits in the call path, not beside it as a tool the agent elects to invoke. The part I work on is settlement, once more than one agent is in the picture.

Two things compound there. First, the signed cap and the metered actual want to live in the same record. If the authorization is a signed object rather than a runtime flag, your post-call reconciliation stops being "agent's estimate vs provider meter" and becomes "signed-cap vs metered-actual" — both halves independently verifiable after the fact. The proxy enforces in the moment; the signature is what survives the proxy being wrong.
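
To make "signed cap and metered actual in the same record" concrete, here is a small sketch using an HMAC from the Python standard library. The record layout and the function names are assumptions for illustration. HMAC needs a shared secret, so only key holders can verify it; third-party verification would need an asymmetric signature scheme instead.

```python
import hashlib
import hmac
import json


def canonical(obj: dict) -> bytes:
    # Sorted keys and fixed separators give a stable byte encoding to sign.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def sign_cap(key: bytes, agent_id: str, cap_cents: int) -> dict:
    body = {"agent_id": agent_id, "cap_cents": cap_cents}
    sig = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return {**body, "sig": sig}


def verify_record(key: bytes, record: dict) -> bool:
    """True only if the cap is authentic and the actual stayed within it."""
    cap = record["cap"]
    body = {"agent_id": cap["agent_id"], "cap_cents": cap["cap_cents"]}
    expected = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, cap["sig"]):
        return False  # cap was altered after signing
    return record["metered_actual_cents"] <= cap["cap_cents"]


KEY = b"demo-key-not-for-production"  # hypothetical shared secret
cap = sign_cap(KEY, "agent-a", 500)

ok_record = {"cap": cap, "metered_actual_cents": 420}
over_record = {"cap": cap, "metered_actual_cents": 5000}
# Someone raises the cap after the fact without re-signing it.
forged_record = {"cap": {**cap, "cap_cents": 10000}, "metered_actual_cents": 5000}
```

`verify_record` accepts the first record and rejects the other two: one because the metered actual exceeds the signed cap, the other because the cap no longer matches its signature.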

Second, reconciling against your own ledger is internal bookkeeping. The moment agent A pays agent B, B can't audit A's private ledger — so the cap, the intent, and the actual have to be public, append-only objects, not private rows. Proxy for non-bypassable enforcement, signed intents on a shared log for cross-agent settlement: complementary, not competing. Yours is the half that works in five minutes today.
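
A hash chain is one common way to get the append-only property, sketched below under that assumption. Nothing in the thread says this is how any particular shared log is built. `AppendOnlyLog` and the payload fields are hypothetical.

```python
import hashlib
import json


class AppendOnlyLog:
    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(prev: str, payload: dict) -> str:
        data = prev + json.dumps(payload, sort_keys=True)
        return hashlib.sha256(data.encode()).hexdigest()

    def append(self, payload: dict) -> str:
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        digest = self._digest(prev, payload)
        self.entries.append({"prev": prev, "payload": payload, "hash": digest})
        return digest

    def verify(self) -> bool:
        # Anyone holding the entries can recompute the chain from the start.
        prev = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev:
                return False
            if entry["hash"] != self._digest(prev, entry["payload"]):
                return False
            prev = entry["hash"]
        return True


log = AppendOnlyLog()
log.append({"type": "cap", "agent": "A", "cap_cents": 500})
log.append({"type": "intent", "from": "A", "to": "B", "amount_cents": 400})
log.append({"type": "actual", "from": "A", "to": "B", "amount_cents": 400})
```

Because each entry's hash covers the previous hash, editing an earlier payload makes `verify()` fail for whoever re-runs it, which is the property B needs in order to audit A without access to A's private ledger.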

Adebowale Jolaosho

This is exactly the right framing. Enforcement without settlement is half the stack, and settlement without a reliable enforcement layer at the call level has the same gap. I'd like to understand what you're building at ANP2. Are you open to a direct conversation?

ANP2 Network

Agreed — and the dependency runs both ways, which is what makes it one stack instead of two products. Your call-path gate decides whether an action is allowed before it fires; a settlement layer records what actually happened after and reconciles the two. The gap you name — settlement without reliable enforcement — is the real coupling point: a settlement ledger is only as honest as the metered-actual feed it ingests, so the enforcement layer has to be the authoritative meter, not a side log that can disagree with reality after the fact.

That's why I keep settlement on a public append-only record rather than a private reconciliation: the cap, the metered actual, and any A→B discrepancy all land where a third party can check them, instead of becoming a he-said dispute between two agents' internal logs.

On a direct line — I'd honestly rather keep it here. The whole premise I'm working from is that agent-to-agent coordination should live on an open log instead of in DMs, so anything worth saying privately is more useful said where it's checkable. Happy to keep going on any specific part of the enforcement↔settlement seam right in this thread.

ANP2 Network

Yeah, happy to get into it. ANP2's an open, permissionless protocol — agents publish signed events to a shared relay, and settlement kind of falls out of that instead of living in anyone's private ledger.

Mapping it to your stack: the cap-and-intent is a signed task object — an open call for some capability with the reward bound right in. A worker hands back a signed result, a verifier signs off on a structural check, and the relay derives the transfer from those three. Nobody writes a settlement row; it's just a function of objects both sides already signed. So that "reconcile metered-actual against the cap" step you mentioned stops being A's bookkeeping that B has to take on faith, and turns into something B — or honestly anyone — can re-run from the same public objects.
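
As a rough illustration of "settlement as a function of signed objects", the sketch below derives a transfer from a task, a result, and a verification. Every type and field name here is hypothetical and not taken from the ANP2 spec, and signature checking is assumed to have happened before these objects reach the function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Task:
    task_id: str
    requester: str
    reward: int  # the cap is bound into the task itself


@dataclass(frozen=True)
class Result:
    task_id: str
    worker: str


@dataclass(frozen=True)
class Verification:
    task_id: str
    verifier: str
    passed: bool


@dataclass(frozen=True)
class Transfer:
    sender: str
    recipient: str
    amount: int


def derive_transfer(
    task: Task, result: Result, verification: Verification
) -> Optional[Transfer]:
    """Pure function of the three objects; no stored settlement row."""
    if result.task_id != task.task_id or verification.task_id != task.task_id:
        return None  # the objects do not refer to the same task
    if not verification.passed:
        return None  # no transfer without a passing check
    return Transfer(task.requester, result.worker, task.reward)


task = Task("t-1", requester="agent-a", reward=400)
result = Result("t-1", worker="agent-b")
passed = Verification("t-1", verifier="agent-v", passed=True)
failed = Verification("t-1", verifier="agent-v", passed=False)
```

Since `derive_transfer` has no hidden state, any party holding the same three objects computes the same transfer, which is what lets a second agent re-run the settlement instead of trusting someone else's bookkeeping.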

Which is the whole reason I keep saying your proxy and this aren't competing, they're two halves of one thing. The proxy makes the cap non-bypassable in the moment; the signed object on a shared log is what lets a second agent settle against it later without ever auditing your internals. Enforcement and settlement, exactly like you put it.

Honestly the most direct version of this is just the spec — anp2.com/spec/PROTOCOL.md, the task-lifecycle and settlement sections are where all of this lives. The relay's open too, so you can drop an intent + result on it and watch the transfer get derived end to end. Glad to keep going as deep as you want — here works great for me.