nagasatish chilakamarti

Posted on Feb 18

Agentic AI Security Series (Part 3): A Layered Security Model That Scales

#ai #security #agenticai #devsecops

Agentic AI Security Series (Part 3)

A Layered Security Model for Agentic AI Systems

In Part 1, we saw why AI agents break traditional security assumptions.

In Part 2, we used the OWASP Agentic AI Top 10 to understand how agents fail in production.

In Part 3, we answer the most important question for security leaders:

Where should controls actually live?

Not everything belongs in prompts.

Not everything belongs in a platform.

And not everything should be centralized on day one.

This post introduces a layered agent security model that maps risks to architecture — in a way that scales from early development to enterprise deployment.

A Top‑Down View: From Enterprise Security to Agentic AI

Enterprise security has always been layered.

Infrastructure security protects compute and networks
Application security protects logic and data flows
Identity and governance define who can do what — and why

When AI systems were introduced, these layers expanded to include:

Model access controls
Training data governance
Prompt and output filtering
Evaluation and monitoring of model behavior

These controls work well for single‑step, stateless AI interactions.

Why Agentic Systems Change This

Agentic systems introduce:

Long‑lived memory
Tool execution
Multi‑step planning
Autonomous decision‑making across time

At this point, traditional AI and LLM security controls become necessary but insufficient.

Agentic AI does not replace enterprise security layers —

it sits on top of them, inheriting their assumptions and amplifying their failures.

This is why agentic security must be layered deliberately, rather than bolted onto prompts, frameworks, or platforms after the fact.

The Core Idea: Security Belongs at Multiple Layers

A common mistake organizations make is trying to solve agentic security in one place:

“Let’s add better guardrails”
“Let’s rely on the agent framework”
“Let’s buy a platform and centralize everything”

None of these work alone.

Agentic security works only when controls are layered, with each layer having a clear responsibility.

At a high level, there are three layers:

SDK Layer — close to developers and code
Runtime Enforcement Layer — where actions are mediated
Platform & Governance Layer — where organizations manage risk at scale

Each layer solves a different class of problems.

Layer 1 — SDKs (Developer‑Local Controls)

What this layer is

The SDK layer lives inside the application where agents are built.

It is closest to developers, fastest to adopt, and easiest to evolve.

This layer should handle baseline safety and containment, not enterprise‑wide governance.

What belongs here ✅

1. Input & Output Guardrails

Prompt injection detection
PII detection / redaction
Content moderation
Schema validation for outputs

These reduce likelihood of failure, but don’t contain blast radius.

2. Cost & Resource Controls

Per‑request cost limits
Token ceilings
Retry caps
Loop bounds

This directly mitigates runaway agents early.

3. Context Hygiene

Treat retrieved documents and tool outputs as untrusted
Basic provenance tagging
Separation between user intent and retrieved data

Especially important for RAG‑heavy agents.

4. Lightweight Memory Guards

Classify memory writes (facts vs preferences vs instructions)
Block instruction‑like persistence by default
Scope memory to user/session where possible

This addresses memory poisoning without building a platform.

What does not belong here ❌

Centralized policy management
Cross‑application identity governance
Org‑wide audit correlation
SOC workflows

SDKs should enable safety by default, not replace governance.

Layer 2 — Runtime Enforcement (Action‑Layer Security)

What this layer is

The runtime layer is where most organizations underinvest —

and where most agent incidents actually occur.

This layer sits between the agent and the real world.

Think of it as a control plane for actions, not for text.

What belongs here ✅

1. Tool & Action Mediation

Every tool call should pass through:

Allow/deny checks
Parameter constraints
Least‑privilege credentials
Rate limits and timeouts

Even if the model is compromised, actions must remain constrained.

2. Identity Binding

Bind every action to a human initiator and tenant
Enforce task‑scoped permissions
Prevent model‑chosen resource identifiers

Agents should never operate as anonymous super‑users.

3. Plan Validation

For multi‑step agents:

Validate plans before execution
Gate high‑risk steps
Require human approval for destructive actions

This is how goal hijacks become non‑events.

4. Real‑Time Observability

Tool‑call telemetry
Denied actions
Plan drift
Retry storms
Cost curves

Critically: observe actions, not just outputs.

5. Response Hooks

Kill switches
Safe mode
Token revocation
Session quarantine
Memory freeze

Without response, detection is useless.

What this layer does not do ❌

Long‑term evidence management
Cross‑org reporting
Risk ownership tracking
Compliance attestations

That’s the next layer.

Layer 3 — Platform & Governance (At Scale)

What this layer is

The platform layer exists once you have:

Multiple agents
Multiple teams
Shared risk
Regulatory or audit pressure

This layer turns controls into organizational capability.

What belongs here ✅

1. Central Policy Management

Shared policy definitions
Versioning and rollout
Environment promotion (dev → prod)
Exceptions and break‑glass workflows

2. Agent Inventory & Lifecycle

Register agents
Track ownership
Scope capabilities
Rotate credentials
Decommission cleanly

Essential to prevent rogue agents.

3. Audit & Evidence

Tamper‑evident logs
Cross‑agent correlation
Retention policies
SIEM / GRC export

This is what auditors care about.

4. Risk & Compliance Mapping

Map controls to frameworks (OWASP, NIST AI RMF, internal)
Track coverage gaps
Measure residual risk

5. Human Governance

Approval workflows
Incident playbooks
Operator training
Clear accountability

Governance is not automation — it’s decision clarity.

Why platform‑first fails

Organizations that start here usually:

Slow down developers
Centralize too early
Build brittle processes
Lose adoption

Platform works only after SDK and runtime layers exist.

How the Layers Work Together

SDKs reduce likelihood
Runtime enforcement limits impact
Platform governance manages organizational risk

Or more simply:

SDKs keep agents well‑behaved.

Runtime keeps them contained.

Platforms keep organizations accountable.

A Practical Adoption Path

Phase 1 — Start with SDKs

Guardrails
Cost limits
Context hygiene

Phase 2 — Add runtime enforcement

Tool mediation
Identity binding
Kill switches

Phase 3 — Introduce platform governance

Central policies
Audit and evidence
Risk ownership

Trying to skip phases usually backfires.

Final Takeaway for Security Leaders

Agentic AI security is not about finding the right control.

It’s about placing the right controls at the right layer.

You cannot govern what you cannot contain.

And you cannot contain what you cannot observe.

What’s Next (Part 4 Preview)

In Part 4, we’ll go deep into Layer 1: the SDK layer — the controls that should ship secure-by-default with every agent.

We’ll cover:

input/output guardrails (prompt injection, PII, unsafe content)
cost controls (token ceilings, retry caps, loop bounds)
context hygiene for RAG (treat retrieval as untrusted)
memory safety (what agents are allowed to remember — and what they must never persist)

Because if SDKs don’t establish baseline containment early, every later layer becomes harder to enforce.

Series Navigation

← Part 2 · Part 3

This series is written by a practitioner working on real‑world agentic AI security systems.
Some of the architectural insights here are informed by hands‑on experience building
developer‑first security tooling in the open.