Most Telegram bot tutorials show you how to echo messages back. This tutorial shows how to build a bot that actually remembers who it is talking to.
## What We Are Building
A Telegram bot where each user gets persistent memory. The bot stores facts about you in a markdown file and reads it before every response. After a week of chatting, the bot knows your name, your job, your interests, and what you talked about last time.
You can try the finished version here: t.me/adola2048_bot
## Architecture
```
Telegram -> Webhook -> Gateway -> Per-User Container -> AI Model
                                          |
                                    MEMORY.md (persistent)
```
Each user gets their own Docker container with a bind-mounted workspace directory. Inside that directory:
- MEMORY.md - everything the AI knows about this user
- SOUL.md - personality configuration
- SCHEDULES.json - proactive check-in times
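For a single user, the on-disk layout looks roughly like this (the `data/users` path comes from the compose file in Step 2; the numeric directory name is a hypothetical Telegram chat ID):

```
data/users/
  123456789/          # one directory per Telegram chat ID
    workspace/
      MEMORY.md
      SOUL.md
      SCHEDULES.json
```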
## Step 1: The Gateway
The gateway receives Telegram webhooks and routes to the right container:
```js
app.post("/webhook", async (req, reply) => {
  const chatId = req.body.message?.chat?.id;
  const text = req.body.message?.text;
  // Ignore updates that are not plain text messages (stickers, joins, edits, ...)
  if (!chatId || !text) return reply.send();

  const container = await getOrCreateContainer(chatId);
  const response = await sendToAgent(container, text);
  await sendTelegramMessage(chatId, response);
  // Acknowledge so Telegram stops retrying this update.
  return reply.send();
});
```
## Step 2: Per-User Containers
Each container runs an AI agent with access to the workspace:
```yaml
services:
  user-container:
    image: adola-agent
    volumes:
      - ./data/users/${USER_ID}/workspace:/workspace
    environment:
      - MODEL=google/gemini-2.5-flash
```
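With this compose file, `getOrCreateContainer` from Step 1 can be as simple as shelling out to `docker compose` with the user's chat ID as `USER_ID`; `up -d` is idempotent, so it starts the container on first contact and is a no-op afterwards. A sketch under those assumptions (the per-user project naming is illustrative, not the production code):

```js
const { execFile } = require("node:child_process");
const { promisify } = require("node:util");
const exec = promisify(execFile);

// Start (or reuse) the per-user container defined in the compose file above.
async function getOrCreateContainer(chatId) {
  await exec("docker", ["compose", "-p", `user-${chatId}`, "up", "-d"], {
    env: { ...process.env, USER_ID: String(chatId) },
  });
  return `user-${chatId}`; // handle later passed to sendToAgent
}
```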
The agent reads MEMORY.md at the start of every conversation and writes updates when it learns something new.
## Step 3: Memory Management
The AI manages its own memory file. A typical MEMORY.md after a few conversations:
```markdown
# About This Person
- Name: Alex
- Location: Berlin
- Job: Frontend developer at a startup
- Interests: climbing, cooking, sci-fi books

# Recent Context
- Feb 10: Mentioned job interview at Google on Wednesday
- Feb 11: Nervous about the interview, practiced questions together
```
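How the agent reads MEMORY.md at the start of every conversation depends on the agent framework, but the basic move is prepending the file to the system prompt. A hypothetical sketch (`buildSystemPrompt` and the prompt wording are illustrative, not the production code):

```js
const fs = require("node:fs/promises");

// Assemble the system prompt from the user's workspace files.
async function buildSystemPrompt() {
  const read = (path) => fs.readFile(path, "utf8").catch(() => "");
  const [soul, memory] = await Promise.all([
    read("/workspace/SOUL.md"),
    read("/workspace/MEMORY.md"),
  ]);
  return [
    soul,
    "## What you remember about this user",
    memory || "(nothing yet - ask questions and save what you learn to /workspace/MEMORY.md)",
  ].join("\n\n");
}
```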
## Step 4: Proactive Check-ins
A scheduler reads SCHEDULES.json every 30 seconds and fires messages when due:
```json
[
  {
    "task": "Ask about Google interview",
    "due": "2026-02-12T18:00:00Z",
    "recurring": false
  }
]
```
The AI creates these entries itself by writing to the file.
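A minimal version of that scheduler loop, assuming the `sendToAgent` and `sendTelegramMessage` helpers from Step 1 (recurring entries are simplified to keep the sketch short):

```js
const fs = require("node:fs/promises");

const SCHEDULES_PATH = "/workspace/SCHEDULES.json";

// Fire any due check-ins for one user, then write the remaining entries back.
async function checkSchedules(chatId, container) {
  let entries;
  try {
    entries = JSON.parse(await fs.readFile(SCHEDULES_PATH, "utf8"));
  } catch {
    return; // no schedule file yet
  }

  const remaining = [];
  for (const entry of entries) {
    if (Date.parse(entry.due) <= Date.now()) {
      // Let the agent phrase the check-in so it can draw on current memory.
      const message = await sendToAgent(container, `Scheduled task is due: ${entry.task}`);
      await sendTelegramMessage(chatId, message);
      // A fuller version would advance `due` for recurring entries;
      // here fired entries are simply dropped.
      continue;
    }
    remaining.push(entry);
  }
  await fs.writeFile(SCHEDULES_PATH, JSON.stringify(remaining, null, 2));
}

// Per user: setInterval(() => checkSchedules(chatId, container).catch(console.error), 30_000);
```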
## Results
This architecture supports 9 concurrent users on a $35/month GCP instance. Idle containers are stopped automatically and restart in under 2 seconds when a new message arrives.
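The idle-stop behavior can live in the gateway: record a timestamp on every incoming message and stop containers that have been quiet too long. A rough sketch; the ten-minute threshold and the compose project naming are assumptions, not the article's figures:

```js
const { execFile } = require("node:child_process");
const { promisify } = require("node:util");
const exec = promisify(execFile);

const IDLE_LIMIT_MS = 10 * 60 * 1000;  // assumption: stop after 10 quiet minutes
const lastActivity = new Map();        // chatId -> timestamp, updated in the webhook handler

setInterval(async () => {
  for (const [chatId, seenAt] of lastActivity) {
    if (Date.now() - seenAt > IDLE_LIMIT_MS) {
      await exec("docker", ["compose", "-p", `user-${chatId}`, "stop"]);
      lastActivity.delete(chatId);
    }
  }
}, 60_000);
```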
The memory feature is what keeps users coming back. Try it yourself: t.me/adola2048_bot