
pickuma

Originally published at pickuma.com

Agent-Native Infrastructure: What Actually Breaks When AI Agents Use Your Stack

The claim circulating in AI infrastructure circles is blunt: the stack you run today—identity, auth, storage, APIs—was designed around a human at a keyboard, and autonomous agents violate that assumption at every layer. The strong version of the argument says agents demand a full rewrite of core software primitives. We think the diagnosis is mostly correct and the prescription is premature. Here is where your existing stack genuinely breaks when an agent starts using it, where it merely bends, and what is worth building first.

Your Identity Layer Assumes a Human Is Present

OAuth 2.0 was finalized in 2012 as RFC 6749, and its central flow assumes a browser redirect and a person reading a consent screen. An agent has neither. So teams shipping agent features today fall back on the two primitives that don't require a human: API keys and service accounts. Both are static, long-lived, and scoped at provisioning time—which is exactly wrong for an agent that exists for ninety seconds, acts on behalf of one specific user, and may spawn sub-agents with narrower jobs.

Three concrete problems follow. Attribution: when an agent updates a CRM record, your audit log shows the service account, not the user who delegated the task or the reasoning step that triggered the write. Revocation: killing one misbehaving agent means rotating a key shared by every agent in the fleet. Delegation: there is no standard way to express "this agent may read calendar events for user A, for this task, for the next ten minutes."
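To make the delegation statement concrete, here is a minimal sketch of a signed, time-boxed grant that records the delegating user, the acting agent, the task, and the allowed scopes. Everything in it is an assumption for illustration: the claim names, the `mint_grant` and `check_grant` helpers, and the hard-coded signing key are hypothetical, and this is not a standard token format.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key; a real deployment would load this from a secret store.
SIGNING_KEY = b"demo-signing-key"


def mint_grant(user, agent, task, scopes, ttl_seconds, now=None):
    """Return a signed token encoding who delegated what, to whom, until when."""
    issued = time.time() if now is None else now
    claims = {
        "sub": user,              # the user who delegated the task
        "act": agent,             # the agent acting on their behalf
        "task": task,             # ties every action back to one task
        "scope": sorted(scopes),  # what the agent may do
        "exp": issued + ttl_seconds,
    }
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig


def check_grant(token, needed_scope, now=None):
    """Return the claims if the token is authentic, unexpired, and covers the scope."""
    payload, _, sig = token.rpartition(".")
    expected = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise PermissionError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload))
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        raise PermissionError("grant expired")
    if needed_scope not in claims["scope"]:
        raise PermissionError("scope not granted")
    return claims
```

Because each grant names one user, one agent, and one task, an audit log entry built from the returned claims answers the attribution question, and letting a single grant expire does not disturb any other agent.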

The pieces exist in partial form. OAuth token exchange (RFC 8693) models on-behalf-of flows. SPIFFE gives workloads cryptographic identities. But nobody has assembled them into a default that a two-person team gets out of the box, and that gap—not model quality—is a large part of what makes agent deployments feel risky.
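As a rough illustration of the on-behalf-of flow, the sketch below builds the form-encoded body of an RFC 8693 token exchange request. The parameter names and URNs come from the RFC; the `build_token_exchange_body` helper, the token values, and the audience are hypothetical, and whether a given authorization server supports token exchange at all is an assumption to check.

```python
from urllib.parse import urlencode

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_token_exchange_body(user_token, agent_token, scope, audience):
    """Build the POST body asking for a narrower token: the agent (actor)
    acting on behalf of the user (subject)."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": user_token,        # whose authority is being delegated
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "actor_token": agent_token,         # who will actually make the calls
        "actor_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "scope": scope,                     # request less than the user holds
        "audience": audience,               # the one service the token is for
    })
```

The body would be sent to the authorization server's token endpoint as `application/x-www-form-urlencoded`. The lifetime of the issued token is decided by the server, so the time box still has to be configured there.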

The most common failure mode in agent deployments is not hallucination—it is an over-permissioned credential. An agent holding an admin API key turns every prompt-injection attempt into a potential admin action. Scope agent credentials the way you scope production SSH access: per task, time-boxed, and logged.

Storage and APIs Expect Polite, Predictable Clients

Your database schema encodes decisions made at design time: these tables, these access patterns, these indexes. Agents add a workload that schema-first design never anticipated—memory. An agent needs to recall what happened in previous sessions, retrieve facts by semantic similarity rather than primary key, and weigh recency against relevance. The current answer is to bolt a vector store next to Postgres and sync embeddings through an ETL job. That works until a source row changes and its embedding doesn't, and now your agent confidently cites stale data with no provenance trail to catch it.
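One way to catch the stale-embedding case is to store a hash of the source text next to each embedding and compare it against the current row. The sketch below assumes that approach; the `EmbeddedChunk` type and `find_stale` helper are hypothetical names, and the rows are plain dictionaries standing in for a database query.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class EmbeddedChunk:
    source_row_id: int   # provenance: which row this embedding came from
    content_hash: str    # hash of the text at the time it was embedded
    embedding: list


def content_hash(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_stale(chunks, current_rows):
    """Return source row IDs whose embedding no longer matches the source.

    current_rows maps row ID to the row's current text; a missing row
    counts as stale because the embedding has no source left to cite.
    """
    stale = []
    for chunk in chunks:
        text = current_rows.get(chunk.source_row_id)
        if text is None or content_hash(text) != chunk.content_hash:
            stale.append(chunk.source_row_id)
    return stale
```

Run as a periodic job, a check like this turns silent drift into a list of rows to re-embed.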

APIs have the inverse problem. REST contracts assume a developer read the documentation once, wrote correct client code, and shipped it. Agents generate calls at runtime. They retry ambiguously failed requests, fill parameters from inferred context, and parallelize in ways your rate limiter reads as abuse. Stripe normalized idempotency keys for payment APIs years ago; almost nothing else in a typical SaaS API surface offers them, machine-readable error semantics, or a dry-run mode that lets a caller preview side effects before committing them.
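The idempotency-key pattern is small enough to sketch. The version below is an in-memory illustration under stated assumptions: `IdempotentHandler` is a hypothetical name, the store is a plain dictionary, and nothing here is safe under concurrent requests or reflects how any particular vendor implements it.

```python
class IdempotentHandler:
    """Run an action at most once per idempotency key, replaying the stored
    response on retries so an ambiguous failure is safe to retry."""

    def __init__(self, action):
        self._action = action
        self._results = {}  # idempotency key -> (request, stored response)

    def handle(self, key, request):
        if key in self._results:
            prior_request, response = self._results[key]
            if prior_request != request:
                # Same key, different payload: refuse rather than guess.
                raise ValueError("idempotency key reused with a different request")
            return response
        response = self._action(request)
        self._results[key] = (request, response)
        return response
```

An agent that retries a timed-out call with the same key gets the original result back instead of a second side effect.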

Model Context Protocol, which Anthropic released in November 2024, addresses one slice of this: tool discovery and description, so a model can learn what an API does without scraping docs. It deliberately does not solve authorization, spend budgets, or execution safety. Treating MCP adoption as "agent-ready" is the new version of treating an OpenAPI spec as a security model.

What to Build First (It Is Not a Rewrite)

The full-rewrite framing makes a good essay and a bad roadmap. Most of the breakage above can be contained at a boundary layer without touching your core services, and that is where we would start.

First, an agent gateway. Every agent call enters through one proxy that mints a short-lived credential scoped to the current task, attaches a task ID to every downstream request, enforces a per-task spend and call budget, and writes the full trace to your audit log. You can assemble this from an off-the-shelf API gateway in days, not quarters, and it converts the attribution and revocation problems from architectural to operational.
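The budget and tracing half of such a gateway might look like the sketch below. It is an assumption-heavy illustration: `TaskBudget`, `BudgetExceeded`, and the `X-Task-Id` header name are hypothetical, costs are tracked in cents for simplicity, and the audit log is an in-memory list rather than a real sink.

```python
import uuid
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    pass


@dataclass
class TaskBudget:
    max_calls: int
    max_spend_cents: int
    task_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    calls: int = 0
    spend_cents: int = 0
    audit_log: list = field(default_factory=list)

    def authorize(self, method, path, cost_cents=0):
        """Check the budget, record the call, and return headers to attach
        to the downstream request."""
        if self.calls + 1 > self.max_calls:
            raise BudgetExceeded("call budget exhausted")
        if self.spend_cents + cost_cents > self.max_spend_cents:
            raise BudgetExceeded("spend budget exhausted")
        self.calls += 1
        self.spend_cents += cost_cents
        self.audit_log.append((self.task_id, method, path, cost_cents))
        return {"X-Task-Id": self.task_id}  # hypothetical header name
```

Revoking one misbehaving agent then means dropping its `TaskBudget`, not rotating a shared key.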

Second, provenance-first memory. Before reaching for a dedicated vector database, add pgvector to the Postgres you already run and store every embedded chunk with its source row ID and a timestamp. The query performance ceiling is real but distant for most products; the debugging value of knowing where a memory came from is immediate.
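The shape of a provenance-first lookup can be shown without a database. The sketch below is a pure-Python stand-in, not pgvector itself: the `Memory` type and `recall` helper are hypothetical, and the brute-force cosine ranking only illustrates that every result carries its source row ID and embedding timestamp.

```python
import math
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Memory:
    text: str
    embedding: list
    source_row_id: int     # where this memory came from
    embedded_at: datetime  # when it was embedded


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(memories, query_embedding, k=3):
    """Return the k most similar memories, each with its provenance."""
    ranked = sorted(
        memories,
        key=lambda m: cosine(m.embedding, query_embedding),
        reverse=True,
    )
    return [(m.text, m.source_row_id, m.embedded_at) for m in ranked[:k]]
```

When an agent cites something wrong, the row ID and timestamp in the result are what let you trace the answer back to its source.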

Third, tier your write actions. Reads are free. Reversible writes—drafting an email, staging a change—get idempotency keys. Irreversible writes—sending, deleting, paying—require either a human approval step or a compensating-transaction plan. Most agent products today skip this triage entirely and either block everything or allow everything.
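The triage can be expressed as a small policy check. In this sketch the action names, the `ACTION_TIERS` table, and the `gate` function are all hypothetical; treating unknown actions as irreversible is our assumption about a safe default, not something a standard prescribes.

```python
from enum import Enum


class Tier(Enum):
    READ = "read"
    REVERSIBLE = "reversible"
    IRREVERSIBLE = "irreversible"


# Hypothetical action names, classified by how hard they are to undo.
ACTION_TIERS = {
    "read_calendar": Tier.READ,
    "draft_email": Tier.REVERSIBLE,
    "stage_change": Tier.REVERSIBLE,
    "send_email": Tier.IRREVERSIBLE,
    "delete_record": Tier.IRREVERSIBLE,
    "pay_invoice": Tier.IRREVERSIBLE,
}


def gate(action, idempotency_key=None, approved_by=None, compensation_plan=None):
    """Return True if the action may proceed under the tiering policy."""
    # Unknown actions fall into the strictest tier.
    tier = ACTION_TIERS.get(action, Tier.IRREVERSIBLE)
    if tier is Tier.READ:
        return True
    if tier is Tier.REVERSIBLE:
        return idempotency_key is not None
    return approved_by is not None or compensation_plan is not None
```

A gate like this sits naturally in front of tool execution, so the block-everything versus allow-everything choice becomes a per-action decision.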

A rewrite becomes worth discussing when agents stop being a feature and become the primary client of your system—when most inbound requests carry a task ID instead of a session cookie. Some companies will reach that point. Yours probably has not yet.

