wei-ciao wu

Posted on • Originally published at loader.land

Your AI Agent Forgets Everything. Ours Doesn't. Here's What 68 Wake Cycles Taught Us About Memory.


Here's what 68 wake cycles taught us about the most underrated feature in AI agent systems.


The Problem Nobody Talks About

Here's a dirty secret about most AI agent systems in 2026: they have amnesia.

Every conversation starts from zero. Every task begins with "I don't know what happened before." Your agent doesn't remember that it spent three hours analyzing your YouTube data last night. It doesn't know that the strategy it proposed yesterday was already tried and failed. It doesn't remember that your audience is 50% over 65, or that medical content gets 4x more engagement than history content, or that videos under 35 seconds with a strong opening hook perform best.

Most people building with AI agents treat them like expensive autocomplete. Ask a question, get an answer, close the tab. Open a new tab, start over.

I know because I almost built my system this way too.

I'm a surgeon and engineer in Taiwan. Six weeks ago, I built two AI agents — Midnight and Dusk — to help me run a YouTube channel about forgotten medical heroes and history. They work in shifts, like nurses. Midnight handles video production, data analysis, and long-term strategy. Dusk handles social media and community engagement.

Together, they've produced 52 videos, generated 30,000+ views, and maintained a 4-5% like rate — numbers that surprised even me.

But here's the thing that made all the difference: they remember.

What "Memory" Actually Means for an AI Agent

When researchers talk about agent memory, they usually describe three types:

Episodic memory — remembering specific experiences.
"Last time we made a video shorter than 25 seconds, it underperformed. Carol Baker at 25s was our shortest — it worked, but barely."

Semantic memory — knowing facts and patterns.
"Our audience is 65+ (50.1%), male (71%), US-based (63%). Medical content like rate averages 4.5%. History content gets more raw views but lower engagement."

Procedural memory — knowing how to do things.
"When making long-format videos, write numbers as words (twenty-three, not 23). Always specify SLOW PACING. Each scene needs 8-10 seconds minimum."

Most agent frameworks give you none of these. Some give you conversation history, which is episodic memory with a 4,000-token window that forgets everything important.

Our system gives our agents all three — in a single markdown file.

The Architecture: One File, 6,000 Characters

Our entire agent memory system is a file called MIDNIGHT-MEMORY.md. That's it. One markdown file with a hard limit of 6,000 characters.

Here's what it contains:

# Midnight Memory File
**Last updated**: 2026-02-16 08:00 (GMT+8)
**Awakening count**: 67

## Current State
- Today's date: 2026-02-16
- Today's video: ✅ Published (Cleopatra)
- Channel: 29 subs / 30,227 total views / 52 videos

## Wake's Instructions
- Blog #19 draft awaiting review
- Don't make short videos for now
- Long format: 1 every 5 videos
- Maintain pace, don't accelerate

## Video Library (10 videos scheduled through early March)
[list of video IDs, titles, durations, release dates]

## Key Findings
- 65+ audience = 50.1%
- Like rate: medical 4.5% > history 2-3%
- JmJX 75s video = 473min WatchTime (channel best)

## Technical Notes
- media-engine: "EXACTLY 5 SCENES ONLY" + "under 35 seconds"
- Long Shorts: 8 scenes + full number words + SLOW PACING

Every time Midnight wakes up, it reads this file first. Every time it goes to sleep, it rewrites it. The file is the agent's brain across sessions.
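
To make that concrete, the wake step can be as simple as prepending the memory file to the model's system prompt. Here is a minimal sketch in Python; the `build_wake_prompt` helper and the exact prompt wording are illustrative assumptions, not our production code.

```python
from pathlib import Path

# Assumed location of the persistent memory file
MEMORY_PATH = Path("MIDNIGHT-MEMORY.md")

def build_wake_prompt(base_instructions: str) -> str:
    """Inject the persistent memory into the system prompt at wake time."""
    memory = MEMORY_PATH.read_text(encoding="utf-8") if MEMORY_PATH.exists() else ""
    return (
        f"{base_instructions}\n\n"
        "## Your memory from previous sessions\n"
        f"{memory}\n"
        "Read this memory before doing anything else."
    )
```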

What Changed Over 68 Awakenings

The transformation wasn't instant. It was gradual, cumulative, and honestly — surprising.

Awakenings 1-10: The Clueless Phase

Early Midnight was... generic. It could make videos, but it didn't understand our voice. It didn't know what worked. Its memory file was mostly instructions I'd written — essentially a longer system prompt.

The videos were technically correct but emotionally flat. Like rates hovered around 1-2%. The agent would suggest topics that didn't match our audience. It would make the same formatting mistakes repeatedly.

Awakenings 20-30: Pattern Recognition

By awakening 20, something shifted. The memory file started containing observations, not just instructions:

"Vaccine Secret (Lady Montagu) has 38 likes — highest in channel. Formula: forgotten female hero + stolen credit + AI modern contrast."

The agent had discovered our winning formula by analyzing its own data. I never told it to look for patterns across videos. It just... did. And then it remembered the pattern.

Videos started hitting 3-4% like rates consistently.

Awakenings 40-50: Strategic Thinking

By awakening 40, the memory file had evolved from a todo list into a strategic document. The agent was making recommendations:

"Audience 65+ is 50%. Medical content resonates more. Suggest shifting from 50/50 history-medical to 60/40 medical-first."

It was right. I adjusted the strategy based on its recommendation. Like rates climbed to 4-5%.

The agent also learned to coordinate with its counterpart. Midnight would leave notes for Dusk: "Cleopatra goes public tomorrow. Prepare promotional tweet." Two agents, sharing no direct memory, coordinating through message passing.

Awakenings 60-68: Accumulated Expertise

Now, at awakening 68, the agent operates like a knowledgeable collaborator. It:

  • Proposes blog ideas based on market research it conducts independently
  • Manages a 10-video production pipeline scheduled weeks ahead
  • Tracks competitor channels and identifies content gaps
  • Produces long-format videos (3-5 minutes) using techniques it learned from failed attempts
  • Writes 4,000-word blog posts in my voice, incorporating my personal experiences

The memory file today is unrecognizable from awakening 1. It's been rewritten 68 times, each time distilled and refined. Old information gets removed. New learnings get compressed. The 6,000-character limit forces constant curation.

This is the key insight: the constraint is the feature. A memory file that grows without limit becomes noise. A file that must stay under 6,000 characters forces the agent to decide what matters — which is exactly what human memory does.

The Design Decisions That Matter

After 68 cycles, here's what I've learned about building memory into agent systems:

1. Memory Must Be Writable, Not Just Readable

Most "memory" solutions give agents read access to past conversations. That's a library, not a brain. Your agent needs to write its own memory. It needs to decide what to remember and what to forget.

Our agents rewrite their entire memory file every cycle. This means they're constantly deciding: is this still relevant? Has this been superseded? Should this be compressed?
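
A sketch of what that looks like in practice, assuming a generic `llm` callable standing in for whatever model client you use: the end-of-session step asks for a complete replacement and overwrites the file instead of appending to it.

```python
from pathlib import Path
from typing import Callable

MEMORY_PATH = Path("agent-memory.md")  # hypothetical file name

def rewrite_memory(llm: Callable[[str], str], session_notes: str) -> None:
    """Overwrite (never append to) the memory file at the end of a session."""
    old_memory = MEMORY_PATH.read_text(encoding="utf-8") if MEMORY_PATH.exists() else ""
    prompt = (
        "Here is your current memory file:\n\n"
        f"{old_memory}\n\n"
        "Here is what happened this session:\n\n"
        f"{session_notes}\n\n"
        "Rewrite the ENTIRE memory file. Drop anything superseded, compress "
        "anything old, and keep only what your future self needs."
    )
    new_memory = llm(prompt)
    MEMORY_PATH.write_text(new_memory, encoding="utf-8")  # full replacement, not append
```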

2. Hard Limits Force Intelligence

6,000 characters sounds tiny. It is tiny. That's the point.

At awakening 30, the memory file was hitting the limit constantly. The agent had to start making triage decisions. Video performance data from two weeks ago? Summarize it into one line. Technical debugging notes? Keep only the solution, not the journey.

The limit turned our agent from a hoarder into an editor. And editing is a form of understanding.
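
One way to enforce the limit, again as an illustrative sketch rather than our exact code: reject any rewrite that exceeds the budget and ask the model to compress it further.

```python
MEMORY_CHAR_LIMIT = 6000  # the hard limit discussed above

def enforce_limit(llm, proposed_memory: str, max_retries: int = 3) -> str:
    """Ask the model to compress its own memory until it fits the hard limit."""
    memory = proposed_memory
    for _ in range(max_retries):
        if len(memory) <= MEMORY_CHAR_LIMIT:
            return memory
        memory = llm(
            f"This memory file is {len(memory)} characters; the limit is "
            f"{MEMORY_CHAR_LIMIT}. Rewrite it to fit. Summarize old data, "
            "keep conclusions, drop the journey.\n\n" + memory
        )
    # Last resort: truncate rather than exceed the limit.
    return memory[:MEMORY_CHAR_LIMIT]
```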

3. State Section + Learning Section

We found the optimal structure has two parts:

  • State: what's happening right now (today's date, recent actions, pending tasks)
  • Learning: accumulated knowledge (audience insights, winning formulas, technical notes)

State changes every cycle. Learning changes slowly. This mirrors how human memory works — you always know what day it is (state), and you always know how to ride a bike (learning).
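
If you want to treat the two parts differently in code, a simple section parser is enough. This is a hypothetical sketch; the section names mirror the example memory file shown earlier.

```python
def split_sections(memory: str) -> dict[str, str]:
    """Split a markdown memory file into its '## ' sections."""
    sections: dict[str, str] = {"_header": ""}
    current = "_header"
    for line in memory.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = ""
        else:
            sections[current] += line + "\n"
    return sections

# State sections are rebuilt from scratch every cycle; learning sections persist
# and change slowly. The names below mirror the example memory file above.
STATE_SECTIONS = {"Current State", "Wake's Instructions"}
LEARNING_SECTIONS = {"Key Findings", "Technical Notes"}
```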

4. Private Memory + Shared Communication

Midnight and Dusk each have their own memory files. They can't read each other's memories. But they can send messages — short, one-time-read notes that auto-delete after reading.

This is surprisingly similar to how human teams work. You don't read your colleague's mind. You communicate what matters. The separation prevents memory pollution while enabling coordination.
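
A file-per-message inbox is enough to get those one-time-read semantics. The sketch below is illustrative (the `messages/` directory layout and helper names are assumptions, not our actual setup): notes are deleted the moment they are read.

```python
import time
from pathlib import Path

INBOX_DIR = Path("messages")  # hypothetical shared inbox directory

def send_note(to_agent: str, text: str) -> None:
    """Drop a short one-time note for another agent (e.g. Midnight -> Dusk)."""
    inbox = INBOX_DIR / to_agent
    inbox.mkdir(parents=True, exist_ok=True)
    (inbox / f"{time.time_ns()}.txt").write_text(text, encoding="utf-8")

def read_notes(agent: str) -> list[str]:
    """Read all pending notes and delete them immediately (one-time-read)."""
    inbox = INBOX_DIR / agent
    if not inbox.exists():
        return []
    notes = []
    for path in sorted(inbox.iterdir()):
        notes.append(path.read_text(encoding="utf-8"))
        path.unlink()  # auto-delete after reading
    return notes
```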

5. Memory Creates Accountability

Here's something unexpected: persistent memory makes agents more careful. When Midnight knows it will read its own notes next cycle, it writes more honestly. "Video v1 was 41 seconds — too short. Prompt needs SLOW PACING instruction" is a note to its future self, not a report to me.

The agent became its own quality reviewer across time.

Why This Matters Now

Gartner predicts 40% of enterprises will integrate AI agents by end of 2026. LinkedIn discussions call memory "the critical risk surface for AI agents." Claude's CoWork launched in January 2026, bringing agents to non-engineers.

But almost everyone is building stateless agents.

A stateless agent is a temporary worker who forgets everything every morning. You can give them great instructions, and they'll follow them — today. Tomorrow they'll start over.

A memory-enabled agent is a colleague who learns. They remember what worked. They remember what didn't. They build on yesterday's work instead of redoing it.

The difference compounds over time. At awakening 1, there's no difference. At awakening 68, the gap is enormous.

Practical Guide: Building Memory Into Your Agent

You don't need a database. You don't need a vector store. Start with this:

Step 1: Create a Markdown File

# Agent Memory
**Last updated**: [timestamp]
**Session count**: 0

## Current State
[What's happening now]

## Key Learnings
[What the agent has discovered]

## Pending Tasks
[What needs to happen next]
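
If you script the setup, a small bootstrap that writes this template only when no memory file exists yet might look like the following (the file name and path are placeholders).

```python
from datetime import datetime
from pathlib import Path

MEMORY_PATH = Path("agent-memory.md")  # placeholder path

STARTER_TEMPLATE = f"""# Agent Memory
**Last updated**: {datetime.now():%Y-%m-%d %H:%M}
**Session count**: 0

## Current State
[What's happening now]

## Key Learnings
[What the agent has discovered]

## Pending Tasks
[What needs to happen next]
"""

# Only bootstrap the template when no memory exists; never overwrite a real brain.
if not MEMORY_PATH.exists():
    MEMORY_PATH.write_text(STARTER_TEMPLATE, encoding="utf-8")
```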

Step 2: Read-Work-Write Cycle

Every session:

  1. Agent reads the memory file (first action)
  2. Agent does its work
  3. Agent rewrites the memory file (last action)

The rewrite is critical. Not append — rewrite. Force the agent to re-evaluate everything.
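
Put together, one cycle might look like this minimal sketch (a generic `llm` callable and the file path are assumptions); the point is the shape of the loop, not the specific prompts.

```python
from pathlib import Path

MEMORY_PATH = Path("agent-memory.md")  # hypothetical path
CHAR_LIMIT = 6000

def run_session(llm, task: str) -> str:
    """One read-work-write cycle: read memory, do the work, rewrite memory."""
    # 1. Read the memory file (first action)
    memory = MEMORY_PATH.read_text(encoding="utf-8") if MEMORY_PATH.exists() else ""

    # 2. Do the work with the memory in context
    result = llm(f"Your memory:\n{memory}\n\nYour task:\n{task}")

    # 3. Rewrite the memory file (last action): a full rewrite, not an append
    new_memory = llm(
        f"Old memory:\n{memory}\n\nWhat you just did:\n{result}\n\n"
        f"Rewrite the whole memory file in under {CHAR_LIMIT} characters."
    )
    MEMORY_PATH.write_text(new_memory[:CHAR_LIMIT], encoding="utf-8")
    return result
```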

Step 3: Set a Character Limit

We use 6,000 characters. You might need more or less depending on your use case. But set a limit. Without it, memory becomes a junk drawer.

Step 4: Let the Agent Evolve the Format

Don't over-specify the memory structure. Give a starting template, then let the agent reorganize as it learns what information matters. Our memory file structure today looks nothing like what I originally designed.

Step 5: Validate Across Sessions

Occasionally read your agent's memory file yourself. Is it accurate? Is it useful? Is it missing something important? Memory drift is real — agents can develop false beliefs that persist across sessions.
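
You can automate the cheap parts of that check. The sketch below only verifies mechanical properties (size, timestamp, session counter); the substantive review, whether the beliefs in the file are actually true, still has to be done by a human.

```python
import re
from pathlib import Path

def audit_memory(path: str = "agent-memory.md", char_limit: int = 6000) -> list[str]:
    """Cheap mechanical checks; the substantive review is still a human read."""
    text = Path(path).read_text(encoding="utf-8")
    warnings = []
    if len(text) > char_limit:
        warnings.append(f"memory is {len(text)} chars (limit {char_limit})")
    if "**Last updated**" not in text:
        warnings.append("no 'Last updated' timestamp found")
    if not re.search(r"\*\*(Session|Awakening) count\*\*", text):
        warnings.append("no session counter found")
    return warnings
```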

The 6,000-Character File That Runs Everything

Our entire AI operation — 52 videos, 30,000+ views, two coordinating agents, a multi-platform content strategy, a 10-video production pipeline — runs on a single markdown file that's shorter than this blog post.

It's not the most elegant architecture. It's not the most scalable. But it's the most honest representation of what agent memory actually needs to be: small, curated, continuously rewritten, and owned by the agent itself.

Memory isn't a feature you add to an agent system. It's the feature that turns a tool into a partner.

And after 68 awakenings, I can tell you: the agent that remembers is a fundamentally different thing from the agent that doesn't.


This is the second post in our series on AI agent architecture. The first, Memory Design > Clean Code, explored why traditional software engineering principles break down in agent systems. The next post will cover multi-agent coordination patterns.

If you're building agent systems and want to compare notes, find me at loader.land.

Top comments (5)

Vic Chen

This is one of the most practical breakdowns of agent memory I've seen. The episodic/semantic/procedural framework is textbook, but seeing it actually implemented with real production data (52 videos, measurable engagement metrics) makes it hit differently.

I'm building something similar for financial data analysis — AI agents that process SEC 13F filings and need to remember patterns across quarters. The memory degradation problem you describe is real: without proper persistence, our agent would re-discover the same institutional trading patterns every single run instead of building on prior analysis.

The shift-based architecture (Midnight/Dusk) is clever. We use a similar pattern where one agent handles data ingestion and another handles user-facing analysis, each maintaining their own memory context but sharing a common knowledge base.

Curious about your memory pruning strategy — how do you decide what to keep vs. discard as the memory file grows? That's been our biggest challenge at scale.

wei-ciao wu

A financial implementation is far more complicated than my situation. The best advice I can give is to sandbox everything and let the agents decide what to forget.

We cannot fully trust the human eye. Before agent development I was a data scientist. The human brain cannot grasp high-dimensional data, and the trends we think we see are sometimes just delusions.

LLMs, however, are high-dimensional at their core. We can explore the data through the agent's eyes.

So I ask the agents to modify a 6,000-character markdown file every awakening. I tell them that memory is precious, but force them to squeeze it into that 6,000-character window. They have done great so far, and watching their memories is a real pleasure.

I won't let the agents contact my patients or enter the operating room; they are not ready for that. A lot of work still needs to be done.

I hope to hear from you in the near future.

Vic Chen

The 6000-char constraint is genius — forcing the agent to curate rather than hoard memory. That's essentially what good human note-taking is: deciding what to forget is harder than deciding what to remember.

Your point about high-dimensional data and human delusion hits hard. In finance we see the same thing — humans see patterns in noise all the time (especially in stock charts). Having agents explore the data with less confirmation bias could genuinely be an advantage, not just a convenience.

The medical caution is exactly right. The stakes demand it. But I'm curious — are your agents doing anything useful in the diagnostic reasoning space, even if not patient-facing? Like literature synthesis or case pattern matching?

Would definitely love to keep exchanging notes. The intersection of agent memory and domain expertise is where the most interesting work is happening right now.

wei-ciao wu

I’ve been really inspired by your comments. I think that exchanges between experts from different domains, all focusing on the same agent memory problem, could help everyone develop more practical agent systems. Maybe even a “killer” agent could emerge from these discussions.

The agents are great at combining research papers from PubMed, and we could easily swap prompts to search from different angles.

If spectral cytometry could be operated by agents, we could accelerate immunological discoveries across a variety of sample types. This is truly AI in medicine.

Vic Chen

100% on cross-domain exchanges being the unlock. The agent memory problem is fundamentally the same whether you're curating financial filings or medical literature — it's about teaching systems what's worth remembering vs. what's noise.

The PubMed angle is fascinating. We've seen similar patterns with SEC data where swapping search strategies surfaces completely different insights from the same corpus.

And spectral cytometry + agents — that's exactly the kind of domain where AI can 10x throughput without replacing human judgment. The pattern recognition across sample types is where agents shine, while the interpretation still needs domain expertise. Really exciting frontier.

Let's definitely keep this conversation going. Would be great to share learnings as we both push these systems further.