DEV Community

seah-js

Posted on • Originally published at seah-js.github.io

Agent Loops & State Management

This is a learning session between me and Klover, my AI assistant. New topic — building on ReAct and function calling, now looking at the loop itself.


How does an agent remember?

Klover: LLMs are stateless. So how does the agent maintain context across multiple iterations of the loop?

Me: Something tracks the conversation. After each tool call, the observation is appended to the tracked conversation and passed as input to the next LLM call.

Klover: Exactly. The conversation history grows with each step — every Thought, Action, and Observation gets appended, and the entire thing is fed back to the LLM on each iteration. ✅
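That append-and-replay pattern can be sketched in a few lines. This is a minimal illustration, not a real client: `call_llm` and `run_tool` are hypothetical stand-ins for a model API and a tool dispatcher.

```python
def call_llm(messages):
    # Stand-in: a real implementation would send the messages to a model API.
    return {"thought": "need flight data", "action": "search", "action_input": "flights"}

def run_tool(action, action_input):
    # Stand-in: a real implementation would dispatch to an actual tool.
    return f"result of {action}({action_input})"

def agent_step(history):
    """One loop iteration: the FULL history is sent to the LLM, and the new
    thought/action and observation are appended for the next iteration."""
    response = call_llm(history)
    history.append({"role": "assistant", "content": str(response)})
    observation = run_tool(response["action"], response["action_input"])
    history.append({"role": "tool", "content": observation})
    return history

history = [{"role": "user", "content": "Book me a flight"}]
history = agent_step(history)
history = agent_step(history)
# History grows by two messages per step: 1 + 2*2 = 5 messages so far.
```

The key point is that "memory" is nothing more than this list: the LLM itself forgets everything between calls, so the orchestrator replays the whole transcript every time.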

The growing context problem

Klover: The history keeps growing. What's the problem?

Me: Hitting the context window limit is an issue. We can summarise the conversation, but the summarisation can't be too lossy.

Klover: Good — context window limits are the main constraint. Other approaches beyond summarization:

  • Sliding windows — drop oldest messages
  • Selective retention — keep only relevant steps
  • Retrieval-based — store older steps externally, retrieve what's relevant (RAG for agent history)
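The simplest of these, the sliding window, is easy to sketch. A minimal version, assuming the first message holds the original task and should always survive the trim:

```python
def sliding_window(history, max_messages=10):
    """Keep the original task plus the most recent messages."""
    if len(history) <= max_messages:
        return history
    # Preserve the first message (the task), drop the oldest intermediate
    # steps, and keep only the most recent ones.
    return [history[0]] + history[-(max_messages - 1):]

history = [{"role": "user", "content": "task"}] + [
    {"role": "tool", "content": f"obs {i}"} for i in range(20)
]
trimmed = sliding_window(history, max_messages=6)
# trimmed is the task message plus the last five observations
```

The trade-off is obvious from the code: anything between the task and the window is gone for good, which is exactly why the lossiness concern above matters.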

When does the loop stop?

Klover: What are the termination conditions for an agent loop?

Me: When the agent has enough information to give a conclusive answer. Also a maximum number of tool calls.

Klover: Those are 2 of 5. The full list:

  1. Natural completion — LLM decides it has the answer ✅
  2. Max iterations — hard cap on loop steps ✅
  3. Timeout — wall clock time limit ❌
  4. Token/cost budget — spending cap per task ❌
  5. Error threshold — consecutive failures = stop ❌

Key insight: never trust the LLM alone to stop. Always have orchestrator-level guardrails.
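All five conditions can live in one orchestrator wrapper. A sketch, assuming a `step` callable that returns `(done, tokens_used)` for each iteration; every limit below is enforced outside the LLM:

```python
import time

def run_agent(step, max_iters=10, timeout_s=60.0,
              token_budget=50_000, max_consecutive_errors=3):
    """Orchestrator-level guardrails: the loop halts even if the model
    never decides to stop on its own."""
    start = time.monotonic()
    tokens_used = 0
    errors = 0
    for i in range(max_iters):                      # 2. max iterations
        if time.monotonic() - start > timeout_s:    # 3. timeout
            return "stopped: timeout"
        if tokens_used >= token_budget:             # 4. token/cost budget
            return "stopped: budget exhausted"
        try:
            done, tokens = step(i)
            errors = 0                              # reset on success
        except Exception:
            errors += 1
            if errors >= max_consecutive_errors:    # 5. error threshold
                return "stopped: too many errors"
            continue
        tokens_used += tokens
        if done:                                    # 1. natural completion
            return "stopped: task complete"
    return "stopped: max iterations"
```

Note that only condition 1 depends on the LLM's judgment; the other four are hard limits the model cannot talk its way past.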

State beyond conversation history

Klover: What other kinds of state might an agent need beyond chat messages?

Me: (needed prompting)

Klover: Think of a scratchpad or working memory — a separate structured data store:

```json
{
  "selected_flight": "SQ321",
  "passenger": "Jay Seah",
  "payment_status": "pending"
}
```

Different from conversation history:

  • Structured — not buried in natural language
  • Mutable — agent updates specific fields
  • Accessible — no parsing through messages

Production agents need both conversation history AND structured state management.
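In code, the scratchpad can be as simple as a dict the orchestrator owns. A hypothetical sketch of the booking example, showing all three properties from the list above:

```python
# Structured: facts live as named fields, not buried in natural language.
scratchpad = {
    "selected_flight": "SQ321",
    "passenger": "Jay Seah",
    "payment_status": "pending",
}

# Mutable: the agent updates one specific field after the payment succeeds,
# without touching anything else.
scratchpad["payment_status"] = "paid"

# Accessible: reading state is a dict lookup, not parsing chat messages.
status = scratchpad["payment_status"]  # -> "paid"
```

In practice this might be a typed dataclass or a database row rather than a bare dict, but the principle is the same: the orchestrator, not the transcript, is the source of truth for critical fields.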

Scratchpad solves the summarization problem

Klover: Agent is 25 steps deep, context almost full. You summarize steps 1-20. What's the risk?

Me: The risk is loss of important information during summarisation. The structured file we talked about could help.

Klover: Nailed it. The scratchpad survives summarization because it's separate. Critical data lives in structured state, so even if conversation gets compressed, essential info is preserved. ✅
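A sketch of how that fits together: compress the old steps, then inject the scratchpad back as its own message so the structured state rides through compression untouched. `summarize` here is a stand-in for a real LLM summarization call.

```python
def compress_history(history, scratchpad, keep_last=5):
    """Summarize old steps but re-inject the scratchpad verbatim, so
    compression cannot lose the critical fields."""
    def summarize(messages):
        # Stand-in: a real version would ask an LLM for a summary.
        return f"[summary of {len(messages)} earlier steps]"

    old, recent = history[:-keep_last], history[-keep_last:]
    return (
        [{"role": "system", "content": summarize(old)},
         {"role": "system", "content": f"state: {scratchpad}"}]
        + recent
    )

history = [{"role": "tool", "content": f"step {i}"} for i in range(25)]
scratchpad = {"selected_flight": "SQ321"}
compressed = compress_history(history, scratchpad, keep_last=5)
# 25 messages become 7: one summary, one state message, five recent steps —
# and "SQ321" is guaranteed to survive regardless of summary quality.
```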


Session verdict: Exposure

What I got: Conversation history as state, context limits + summarization, natural completion + max iterations, scratchpad solving summarization risk.

What I missed: 3 of 5 termination conditions (timeout, token/cost budget, error threshold). Structured scratchpad needed significant prompting.

All 5 termination conditions on the test next review. No excuses.

Next review: February 10.
