Every Python developer learns early: you call a function, it runs, it returns. Simple. And for most of your career, that model holds.
Until you hit async def. Until you hit yield. Until something suspends and resumes and you're not quite sure what's keeping state alive between those moments.
There's a single mechanism underneath all of it. One trick that Python learned in 2001 and has been reusing ever since. Once you see it, generators, coroutines, and async generators all collapse into variations of the same idea.
That mechanism is the detachable frame.
## What Is a Frame?
Before we can detach a frame, we need to know what one is.
Every time Python calls a function, it creates a frame object. This is a real thing — not a metaphor, not an abstraction. It's a C struct in CPython, and you can inspect it from Python:
```python
import sys

def show_frame():
    frame = sys._getframe()
    print(f"locals: {frame.f_locals}")
    print(f"code: {frame.f_code.co_name}")
    print(f"line: {frame.f_lineno}")
    print(f"caller: {frame.f_back.f_code.co_name}")

show_frame()
```
A frame holds everything a function needs to execute:

- `f_locals` — the local variables (`x = 1` lives here)
- `f_code` — pointer to the code object (the bytecode blueprint)
- `f_lasti` — the last bytecode instruction executed (where we are)
- `f_back` — pointer to the calling frame (the chain back up)
Frames live on the call stack. When you call a function, its frame gets pushed on top. When the function returns, its frame gets popped off. That's the fundamental rhythm of execution in Python: push, execute, return, pop. All the apparent complexity of your programs comes down to that one simple data structure, the stack.
But here's the question: what happens to the frame after it's popped?
## The One-Shot Frame
For a normal function, the answer is simple. It dies.
```python
def add(a, b):
    return a + b

result = add(1, 2)
```
The lifecycle:

1. `add(1, 2)` is called — frame created, pushed onto the call stack
2. `f_locals` gets `{'a': 1, 'b': 2}`
3. Bytecode executes: `LOAD_FAST a`, `LOAD_FAST b`, `BINARY_OP +`, `RETURN_VALUE`
4. Frame popped off the stack — destroyed
```
call add(1, 2) → frame ON stack → execute → return → frame DESTROYED
```
One shot. The frame goes on, does its job, comes off, and it's gone forever. You can't get back to it. The locals are gone. The instruction pointer is gone. Everything that function was while it was running — gone.
This is how most people think all functions work. Create, run, destroy.
And for normal functions, they're right.
Python has exactly four types of functions.
```python
def add(a, b):             # 1. Normal function
    return a + b

def counter():             # 2. Generator function
    yield 1
    yield 2

async def fetch_user(id):  # 3. Coroutine function
    return await db.query(id)

async def stream(ids):     # 4. Async generator function
    for id in ids:
        yield await db.query(id)
```
Two binary toggles — has `async`? has `yield` in the body? — give you a 2×2 matrix:

```
           No yield           Has yield
           ────────────────   ────────────────
No async   Normal function    Generator
Has async  Coroutine          Async generator
```
The compiler decides which type at compile time. Sets a flag in `co_flags`. That's it.
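You can check those flags yourself. The `inspect` module exposes both the convenience predicates and the raw bits:

```python
import inspect
from inspect import CO_GENERATOR, CO_COROUTINE, CO_ASYNC_GENERATOR

def normal(): pass
def gen(): yield
async def coro(): pass
async def agen(): yield

# Convenience predicates read the same co_flags bits:
print(inspect.isgeneratorfunction(gen))    # True
print(inspect.iscoroutinefunction(coro))   # True
print(inspect.isasyncgenfunction(agen))    # True

# The raw bits on the code objects:
print(bool(normal.__code__.co_flags & (CO_GENERATOR | CO_COROUTINE)))  # False
print(bool(gen.__code__.co_flags & CO_GENERATOR))                      # True
print(bool(coro.__code__.co_flags & CO_COROUTINE))                     # True
print(bool(agen.__code__.co_flags & CO_ASYNC_GENERATOR))               # True
```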
The normal function is the one we just saw — one-shot frame, create, run, destroy. The other three? They all break that model in the same way. And that way has been available since 2001.
## What If the Frame Could Survive?
Now imagine a different rule. What if, when a function gives up control, its frame doesn't die? What if it gets popped off the call stack but keeps living on the heap, with all its locals intact, its instruction pointer remembering exactly where it stopped?
And what if you could push it back onto the stack later and pick up right where you left off?
That's what I'm calling a detachable frame. You won't find this term in the CPython docs — but the mechanism is there. The CPython internals describe how generator and coroutine frames are embedded directly in their objects on the heap, rather than living and dying on the call stack like normal function frames. The concept is real. I'm just giving it a name.
And Python has had them since 2001.
## The Detachable Frame: Generators
```python
def counter():
    x = 1
    yield x
    x = 2
    yield x
    return "done"
```
When you call counter(), Python does something unexpected — it doesn't execute the body. It creates a generator object, allocates a frame, attaches the frame to the object, and hands it back to you. Not a single line of the body has run.
```python
g = counter()        # body does NOT execute
type(g)              # <class 'generator'>
g.gi_frame           # a live frame object — it exists!
g.gi_frame.f_locals  # {} — empty, nothing has run yet
g.gi_frame.f_lasti   # -1 — instruction pointer at the very start
```
The frame is sitting in the heap, attached to g, waiting. Now you drive it:
```python
next(g)  # → 1
```
What happened:

1. The frame gets pushed onto the call stack
2. Execution starts from where `f_lasti` points
3. `x = 1` runs
4. `yield x` fires — the value `1` is sent out to the caller
5. The frame gets popped off the call stack — but NOT destroyed
6. It goes back to the heap, attached to `g`, with all its state preserved
```python
g.gi_frame.f_locals  # {'x': 1} — state preserved!
g.gi_frame.f_lasti   # advanced past the yield
```
Call next(g) again:
```python
next(g)  # → 2
```
Same dance. Frame goes on the stack, resumes from exactly where it left off, x = 2 runs, hits yield, frame comes off again.
One more:
```python
next(g)     # StopIteration("done")
g.gi_frame  # None — NOW it's destroyed
```
The frame went on and off the call stack three times before finally being destroyed on return. Between each trip, it sat in the heap, holding its locals, its instruction pointer, its entire execution state — alive but dormant.
```
One-shot frame:    ON → execute → OFF → destroyed
Detachable frame:  ON → OFF → ON → OFF → ON → OFF → destroyed
```
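The whole round trip above fits in one runnable snippet:

```python
def counter():
    x = 1
    yield x
    x = 2
    yield x
    return "done"

g = counter()
print(dict(g.gi_frame.f_locals))  # {} — frame exists, nothing has run
first = next(g)                   # frame ON → x = 1 → yield → frame OFF
print(dict(g.gi_frame.f_locals))  # {'x': 1} — state survived off-stack
second = next(g)                  # frame ON again, resumes mid-body
try:
    next(g)                       # runs to `return` — frame destroyed
except StopIteration as e:
    final = e.value
print(first, second, final)       # 1 2 done
print(g.gi_frame)                 # None
```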
That's the whole trick. A frame that can leave the call stack and come back. Python learned this in 2001 with generators. And this — not async def, not await, not asyncio — is the mechanism that makes async Python possible.
## The Two-Way Door
There's more. yield isn't just an exit — it's a door that swings both ways.
```python
def ping_pong():
    received = yield "first"   # send "first" out, pause, wait for input
    print(f"got: {received}")
    received = yield "second"  # send "second" out, pause, wait for input
    print(f"got: {received}")
    return "done"
```
```python
g = ping_pong()
val = next(g)          # → "first" (value comes OUT)
val = g.send("hello")  # → "second" (prints: got: hello — value goes IN)
g.send("world")        # → StopIteration("done") (prints: got: world)
```
yield sends a value out and pops the frame off. .send() pushes the frame back on and delivers a value in. Two-way communication through a suspended frame.
```
           yield x                             send(y)
Generator ─────────→ Caller         Caller ─────────→ Generator
          "here's x,                        "here's y,
           your turn"                        your turn"
```
Read that twice. This bidirectional channel — yield out, send in, through a detachable frame — is the entire foundation of Python's async system.
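A minimal sketch of the classic running-average coroutine shows the channel in action (the `averager` name and setup here are illustrative, not from any library):

```python
def averager():
    # Running average: values go IN through send(),
    # the current average comes OUT at each yield.
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # frame OFF; send(value) puts it back ON
        total += value
        count += 1
        average = total / count

avg = averager()
next(avg)          # prime it: run to the first yield
a1 = avg.send(10)  # 10.0
a2 = avg.send(20)  # 15.0
print(a1, a2)      # 10.0 15.0
```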
## The Repackaging
Now the part that should make you pause.
In 2013, Python 3.4 introduced asyncio. It used generators for async I/O. Not a new mechanism — plain generators with yield from:
```python
import asyncio

# Python 3.4 — this is how async worked before async existed
# (asyncio.coroutine was deprecated in 3.8 and removed in 3.11)
@asyncio.coroutine
def fetch_data():
    data = yield from asyncio.sleep(1)
    return data
```
Generator suspends → event loop takes control → I/O completes → event loop calls .send() → generator resumes.
Same detachable frame. Same .send() protocol. The event loop was just a fancy caller doing next() and .send() based on I/O readiness instead of iteration needs.
It worked. But it was confusing — is this generator producing values or doing async I/O? The syntax doesn't tell you. You had to squint.
So in 2015, Python 3.5 introduced async def / await (PEP 492). And here's the thing — no new frame machinery was built. None. The detachable frame mechanism is identical. What changed:
**A new type.** Coroutine objects are not generator objects. Different type, different name.

**Renamed attributes.** Same internals, new labels:

```
Generator          Coroutine
─────────          ─────────
g.gi_frame      →  coro.cr_frame
g.gi_code       →  coro.cr_code
g.gi_running    →  coro.cr_running
g.gi_yieldfrom  →  coro.cr_await
```
**Guardrails.** You can't accidentally mix them up anymore:

```python
for x in my_coroutine(): ...  # TypeError!
await my_generator()          # TypeError!
```
**One flag.** One bit in `co_flags`:

```
Normal function: co_flags = 0b0000_0011
Async function:  co_flags = 0b1000_0011
                              ^
                              CO_COROUTINE — this one bit
```
When the VM sees this flag at call time, it says: "don't execute the body, wrap it in a coroutine object." Same thing it does for generators with CO_GENERATOR. Same logic, same mechanism.
That's it. async/await is a repackaging of generator frame suspension with type safety and clearer intent. New clothes on an old engine.
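A quick check, using throwaway `gen_fn`/`coro_fn` stand-ins: the two objects have different public types, but the renamed attributes point at the same kinds of internals:

```python
import types

def gen_fn():
    yield 1

async def coro_fn():
    return 1

gen, coro = gen_fn(), coro_fn()

# Different public types...
assert isinstance(gen, types.GeneratorType)
assert isinstance(coro, types.CoroutineType)

# ...but gi_frame/cr_frame and gi_code/cr_code hold the same machinery:
same_kind_of_frame = type(gen.gi_frame) is type(coro.cr_frame)
print(same_kind_of_frame)                         # True
print(gen.gi_code.co_name, coro.cr_code.co_name)  # gen_fn coro_fn

coro.close()  # silence the "coroutine was never awaited" warning
```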
## The Only Difference That Matters
If the engine is the same, what's actually different?
Who calls .send() to put the frame back on the stack.
That's it. That's the whole thing.
```
Generator:        YOUR code calls next()/send() — you decide when to resume
Coroutine:        EVENT LOOP calls .send()      — the orchestrator decides
Async generator:  BOTH take turns               — two drivers, one frame
```
The frame doesn't know or care who wakes it up. It gets pushed on the stack, runs until it suspends, gets popped off, and waits. The driver is external to the mechanism.
A generator suspends at yield and waits for your code to ask for the next value. A coroutine suspends at await and waits for the event loop to signal that I/O is ready. Same suspension. Different reason. Different caller.
You can prove this to yourself right now. Drive a coroutine manually, without any event loop:
```python
async def simple():
    return 42

coro = simple()
try:
    coro.send(None)  # YOU are the event loop
except StopIteration as e:
    print(e.value)   # 42
```
.send(None) pushes the frame on the stack, the body executes, return 42 triggers StopIteration with the value. Same protocol as generators. Because it is the same protocol.
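You can go one step further and play event loop for a coroutine that actually awaits something. The `Future` class here is a hypothetical, stripped-down awaitable (not `asyncio.Future`):

```python
# A minimal awaitable: yields itself, returns whatever is sent back in.
class Future:
    def __await__(self):
        result = yield self  # suspend the frame; .send(value) resumes it
        return result        # becomes the value of the `await` expression

async def task():
    value = await Future()
    return value * 2

coro = task()
pending = coro.send(None)      # run to the await; the Future pops out
print(type(pending).__name__)  # Future
try:
    coro.send(21)              # resume the frame, delivering the "I/O result"
except StopIteration as e:
    result = e.value
print(result)                  # 42
```

The event loop does exactly this, just with many frames at once: collect what each coroutine yields, wait for the corresponding I/O, then `.send()` the result back in.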
## The Async Generator: Both Reasons at Once
The fourth type in our matrix — async def with yield — creates something that suspends for two different reasons:
```python
async def stream_data():
    data = await fetch()  # returns ["a", "b", "c"]
    for item in data:
        yield item
    data = await fetch()  # returns ["d", "e", "f"]
    for item in data:
        yield item
```

```python
async for value in stream_data():  # inside some async function
    print(value)
```
Trace the frame:
```
anext() → frame ON stack
        → hits await fetch()
        → frame OFF (reason: I/O — event loop will resume)

I/O returns → frame ON stack
        → data = ["a", "b", "c"], enters loop
        → yield "a"
        → frame OFF (reason: producing value — async for will resume)
        → consumer prints "a"

anext() → frame ON → yield "b" → frame OFF → prints "b"
anext() → frame ON → yield "c" → frame OFF → prints "c"

anext() → frame ON (loop ends, continues to next line)
        → hits await fetch()
        → frame OFF (reason: I/O — event loop resumes)

I/O returns → frame ON → yield "d" → frame OFF → prints "d"
anext() → frame ON → yield "e" → frame OFF → prints "e"
anext() → frame ON → yield "f" → frame OFF → prints "f"
anext() → frame ON → end of function → StopAsyncIteration → frame DESTROYED
```
Eight suspensions. Two for I/O, six for producing values. Two different drivers taking turns pushing the same frame back onto the stack. The frame doesn't care. It just does its thing.
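Here is a runnable version of the trace, with a stub `fetch()` standing in for real I/O (`asyncio.sleep(0)` still forces a genuine suspension, so the frame really does leave the stack at each `await`):

```python
import asyncio

# Hypothetical stub for fetch(): no network, but a real suspension point.
async def fetch(batch):
    await asyncio.sleep(0)
    return batch

async def stream_data():
    for batch in (["a", "b", "c"], ["d", "e", "f"]):
        data = await fetch(batch)  # frame OFF for "I/O"
        for item in data:
            yield item             # frame OFF to hand a value out

async def main():
    values = []
    async for value in stream_data():
        values.append(value)
    return values

result = asyncio.run(main())
print(result)  # ['a', 'b', 'c', 'd', 'e', 'f']
```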
## The Full Picture
Four function types. One core mechanism. Three variations of it.
```
Normal function  → frame ON → execute → frame OFF → gone forever
                   one-shot, no suspension

Generator        → frame ON/OFF, driven by next()/for
                   suspends at: yield
                   purpose: produce values

Coroutine        → frame ON/OFF, driven by event loop
                   suspends at: await
                   purpose: wait for I/O

Async generator  → frame ON/OFF, driven by BOTH
                   suspends at: yield AND await
                   purpose: stream values with async I/O
```
The normal function is the simple case — frame created, used, destroyed. The other three all share the detachable frame: the frame leaves the call stack, survives in the heap, and comes back when someone calls .send().
They differ only in the answers to two questions:

1. **Why did the frame suspend?** To produce a value (`yield`), to wait for I/O (`await`), or both?
2. **Who puts it back?** Your code, the event loop, or both?
```
                  Why it suspends      Who resumes it
                  ───────────────      ──────────────
Generator         yield (value out)    your code
Coroutine         await (I/O wait)     event loop
Async generator   yield + await        both
```
That's the whole model. Everything else — asyncio, uvicorn, FastAPI's dependency injection, database connection pools — is built on top of this. Abstractions upon abstractions, all the way up. But at the bottom, there's a frame going on and off the call stack.
## The Timeline
This didn't appear overnight. It was a twenty-year evolution:
| Year | What Happened | PEP |
|---|---|---|
| 2001 | Generators + `yield` — detachable frames are born | PEP 255 |
| 2005 | `.send()` / `.throw()` — two-way communication | PEP 342 |
| 2012 | `yield from` — delegation to sub-generators | PEP 380 |
| 2013 | asyncio — generators repurposed for async I/O | PEP 3156 |
| 2015 | `async def` / `await` — dedicated type and syntax | PEP 492 |
| 2016 | Async generators — `yield` inside `async def` | PEP 525 |
Twenty years from detachable frames to the async ecosystem we use today. Not a revolution — an evolution. Each step built on the one before. The frame suspension mechanism from 2001 is still the same mechanism running your FastAPI handlers right now.
Pull back async/await and you find generators. Pull back generators and you find detachable frames. Pull back detachable frames and you find the simple insight that a function's state doesn't have to die when it gives up control.
That's the whole story.
All concepts in this post can be verified hands-on: create a coroutine, don't await it, inspect `cr_frame`, call `.send()` yourself. Drive it manually. Build a toy event loop in 20 lines. Once you see that an event loop is just "pick a frame, `.send()`, handle what it yields" — the magic evaporates and understanding takes its place.
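Here is one way such a toy loop might look. Everything in it (`Sleep`, `run`, `worker`) is invented for this sketch, and the "I/O" is just a timer:

```python
import collections
import heapq
import time

# A toy awaitable: yields its delay so the loop knows how long to wait.
class Sleep:
    def __init__(self, seconds):
        self.seconds = seconds
    def __await__(self):
        yield self.seconds  # frame OFF; the loop will .send() it back ON

# A toy event loop: "pick a frame, .send(), handle what it yields".
def run(*coros):
    ready = collections.deque(coros)
    sleeping = []  # heap of (wake_time, tie_breaker, coro)
    counter = 0
    while ready or sleeping:
        if not ready:
            wake, _, coro = heapq.heappop(sleeping)
            time.sleep(max(0.0, wake - time.monotonic()))
            ready.append(coro)
            continue
        coro = ready.popleft()
        try:
            delay = coro.send(None)  # frame ON: runs until it suspends
        except StopIteration:
            continue                 # coroutine returned; frame destroyed
        counter += 1
        heapq.heappush(sleeping, (time.monotonic() + delay, counter, coro))

log = []

async def worker(name, delay):
    for i in range(2):
        await Sleep(delay)
        log.append(f"{name}: tick {i}")

run(worker("a", 0.01), worker("b", 0.02))
print(log)
```

Two coroutines, interleaved by nothing more than `.send()` and a heap of wake-up times. That's the essence of what `asyncio` does, minus sockets, callbacks, and years of edge cases.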