DEV Community

Cover image for Space Management: Where Bytes Are Won and Reclaimed in SQLite
Athreya aka Maneshwar
Athreya aka Maneshwar

Posted on

Space Management: Where Bytes Are Won and Reclaimed in SQLite

Hello, I'm Maneshwar. I'm working on git-lrc: a Git hook for Checking AI generated code.
AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.
git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.

We stitched together all the control structures (Btree, BtShared, MemPage, BtCursor), today we move into something just as critical but less glamorous:

Space management.

Because trees don’t just grow and shrink structurally, they constantly allocate and free bytes inside pages and across the file and if that space isn’t managed carefully, performance collapses.

The tree module receives insert and delete requests from the VM in completely unpredictable order.

An insert requires space inside a tree page.

A delete frees cell space, overflow pages and sometimes entire tree pages

So SQLite must manage two kinds of free space:

  1. Free pages in the file
  2. Free space inside pages

Let’s examine both.

Management of Free Pages

When a page is removed from a tree — for example, after a merge during delete it does not disappear.

It is added to the file freelist.

The freelist originates in the database file header at offset 32. It is essentially a linked structure of free pages available for reuse.

The logic is simple:

  • When a tree needs to grow → take a page from freelist
  • If freelist empty → append a new page at end of file

Pages appended from the native file system always grow the file physically.

This is why databases sometimes grow but don’t shrink automatically.

To shrink the file, you must run:

VACUUM;
Enter fullscreen mode Exit fullscreen mode

The VACUUM command compacts the entire file and eliminates freelist pages.

If the database is created with autovacuum mode, SQLite automatically shrinks the file during each COMMIT instead of accumulating freelist pages.

So at the file level, space is never lost only recycled or compacted.

Management of Page Space

Now let’s zoom into a single tree page.

From earlier discussions, we know a page contains three types of free space:

  1. Middle unallocated region
    Between the cell pointer array and the cell content area.

  2. Free blocks
    Linked list inside the cell content area.

  3. Fragments
    Tiny ≤3-byte holes inside cell content area.

Only the first two partitions are used for allocation.

Fragments are bookkeeping leftovers until defragmentation.

The allocator’s fundamental responsibility is toEnsure the cell pointer array and cell content area never overlap.

Cell Allocation

Now we step into the allocator’s actual algorithm.

Suppose a request arrives for nRequired bytes (minimum 4 bytes — smaller requests are rounded up).

If total free space nFree is insufficient, allocation fails.

Otherwise, SQLite proceeds in three stages.

Step 1 — Search Free Block List (First Fit)

The allocator scans the free block linked list looking for a block large enough.

This is a first-fit strategy.

If a suitable block is found:

Case A: Block barely larger than required

If block size < nRequired + 4:

  • Entire block is consumed
  • Remaining ≤3 bytes become fragments

Case B: Block much larger than required

  • Allocate from bottom of block
  • Reduce block size by nRequired
  • Block remains on freelist

This approach minimizes fragmentation and avoids unnecessary compaction.

Step 2 — Defragment If Necessary

If no free block is large enough like middle region is too small or fragmentation too high

SQLite runs defragmentation.

Defragmentation:

  • Compacts all cells to bottom of page
  • Rebuilds pointer array
  • Consolidates free space into one contiguous middle region

After compaction, allocation becomes trivial.

This is expensive — so SQLite delays it as long as possible.

Step 3 — Allocate from Middle Region

Finally, allocation proceeds from the bottom of the free middle region.

The “top” boundary value (offset 5 in header) is increased by nRequired.

Space is carved out cleanly.

Cell Deallocation

Delete operations reverse the process.

Suppose nFree bytes are released.

The allocator:

  1. Creates a new free block of size nFree
  2. Inserts it into the free block list in address order
  3. Attempts to merge adjacent free blocks

If a fragment lies between two neighboring free blocks, it is absorbed.

If a free block sits exactly at the boundary of the middle unallocated region, SQLite merges them and adjusts the top pointer.

This merging prevents freelist explosion and maintains compactness.

Putting It All Together

Now the full lifecycle of a row looks like this:

Insert:

  • Cursor locates leaf
  • Allocator finds or creates space
  • Cell inserted
  • Overflow pages allocated if needed
  • Pager marks page dirty

Delete:

  • Cell removed
  • Space freed
  • Blocks merged
  • Possibly pages freed to freelist
  • Pager journals change

File grows only if freelist empty.
File shrinks only via VACUUM or autovacuum.

Nothing is wasted permanently.

git-lrc

👉 Check out: git-lrc
Any feedback or contributors are welcome! It’s online, source-available, and ready for anyone to use.
⭐ Star it on GitHub:

GitHub logo HexmosTech / git-lrc

Free, Unlimited AI Code Reviews That Run on Commit

git-lrc

Free, Unlimited AI Code Reviews That Run on Commit

AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.

git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.

See It In Action

See git-lrc catch serious security issues such as leaked credentials, expensive cloud operations, and sensitive material in log statements

git-lrc-intro-60s.mp4

Why

  • 🤖 AI agents silently break things. Code removed. Logic changed. Edge cases gone. You won't notice until production.
  • 🔍 Catch it before it ships. AI-powered inline comments show you exactly what changed and what looks wrong.
  • 🔁 Build a habit, ship better code. Regular review → fewer bugs → more robust code → better results in your team.
  • 🔗 Why git? Git is universal. Every editor, every IDE, every AI…




Top comments (0)