DEV Community

Zongzhi Chen

In 2026, Can AI Modify Database Kernel Code? Rewriting PostgreSQL with Claude Code: Full Page Write vs Doublewrite Buffer

It's 2026. Can AI actually modify database kernel code? I used Claude Code to replace PostgreSQL's Full Page Write with MySQL's Doublewrite Buffer approach. In the io-bound benchmarks below, DWB delivers roughly 1.5x to 2.3x the throughput of FPW.

Why I Did This

I've been bugged by this question for years: which torn page protection is better, PostgreSQL's Full Page Write (FPW) or MySQL's Doublewrite Buffer (DWB)? I brought it up on the pgsql-hackers mailing list a while back, but the discussion didn't really go anywhere.

Now in 2026, I figured I'd kill two birds with one stone: see if Claude Code can handle kernel-level modifications, and settle the FPW vs DWB debate with actual code. Show me the code -- so here we are.

The Torn Page Problem

Databases manage data in pages -- 8KB in PostgreSQL, 16KB in MySQL. But the OS and disk atomic write unit is usually 4KB.

So writing one database page takes multiple physical I/Os. If you lose power or crash halfway through, you get a partially written page -- a torn page with mixed old and new data. Corrupted.
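To make the failure mode concrete, here is a minimal simulation, assuming a 4KB atomic write unit and an in-memory bytearray standing in for the disk. The `write_page` helper and its `crash_after_sectors` parameter are hypothetical names used only for illustration.

```python
PAGE_SIZE = 8192     # PostgreSQL page size
SECTOR_SIZE = 4096   # assumed atomic write unit of the OS/disk


def write_page(disk, offset, page, crash_after_sectors=None):
    """Write a page one sector at a time; optionally 'lose power' partway."""
    for i in range(0, PAGE_SIZE, SECTOR_SIZE):
        if crash_after_sectors is not None and i // SECTOR_SIZE >= crash_after_sectors:
            return  # crash: the remaining sectors never reach the disk
        disk[offset + i : offset + i + SECTOR_SIZE] = page[i : i + SECTOR_SIZE]


old_page = b"A" * PAGE_SIZE
new_page = b"B" * PAGE_SIZE

disk = bytearray(old_page)                             # disk holds the old version
write_page(disk, 0, new_page, crash_after_sectors=1)   # crash after the first 4KB

torn = bytes(disk)
is_torn = torn != old_page and torn != new_page        # neither old nor new
```

After the simulated crash, the first half of the page is new data and the second half is old data, which is exactly the mixed state described above.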

PostgreSQL and MySQL handle this very differently.

PostgreSQL: Full Page Write

After each checkpoint, the first time a page gets modified, PostgreSQL dumps the entire page into WAL. If a crash corrupts that page, recovery grabs the full page image from WAL, overwrites the bad page, then replays the remaining WAL records.
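The logging rule and the recovery step can be sketched with a toy model. This is not PostgreSQL's actual code: the `MiniWal` class, its record tuples, and its method names are assumptions made up for illustration, and real WAL records carry far more metadata.

```python
PAGE_SIZE = 8192


class MiniWal:
    """Toy model: pages are bytearrays, the WAL is a list of tuples."""

    def __init__(self, num_pages):
        self.buffers = [bytearray(PAGE_SIZE) for _ in range(num_pages)]
        self.wal = []
        self.logged_since_checkpoint = set()

    def checkpoint(self):
        # After a checkpoint, every page counts as "not yet modified" again.
        self.logged_since_checkpoint.clear()

    def update(self, page_no, offset, data):
        self.buffers[page_no][offset : offset + len(data)] = data
        if page_no not in self.logged_since_checkpoint:
            # First touch since the checkpoint: log the entire page image.
            self.wal.append(("fpi", page_no, bytes(self.buffers[page_no])))
            self.logged_since_checkpoint.add(page_no)
        else:
            # Later touches only log the small change.
            self.wal.append(("delta", page_no, offset, data))

    def recover(self, disk_pages):
        # Replay: a full page image overwrites whatever is on disk,
        # so a torn page is repaired before the deltas are applied.
        for rec in self.wal:
            if rec[0] == "fpi":
                _, page_no, image = rec
                disk_pages[page_no] = bytearray(image)
            else:
                _, page_no, offset, data = rec
                disk_pages[page_no][offset : offset + len(data)] = data
        return disk_pages


model = MiniWal(num_pages=1)
model.checkpoint()
model.update(0, 0, b"hello")     # first touch: full page image
model.update(0, 100, b"world")   # second touch: delta only

torn_disk = [bytearray(b"\xff" * PAGE_SIZE)]   # pretend the on-disk page is garbage
recovered = model.recover(torn_disk)
```

The point of the sketch is that recovery never needs to trust the on-disk page: the full image in WAL replaces it wholesale.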

The problem? FPW makes checkpoint frequency a lose-lose decision.

You want fewer checkpoints because every checkpoint resets the "first modification" state: the next write to each page logs a full page image again. Frequent checkpoints mean a flood of full-page WAL writes, so WAL bloats and write performance tanks. That's why checkpoint_timeout has a floor of 30 seconds. And of course checkpoints can also be triggered earlier, when WAL exceeds max_wal_size.

But you also want more checkpoints because crash recovery replays less WAL, so the database comes back faster.

FPW wants fewer checkpoints. Fast recovery wants more. Pick one.
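A back-of-the-envelope model shows how checkpoint frequency drives WAL volume under FPW. The numbers here are assumptions, not measurements: a hypothetical 100-byte row-level WAL record, 10 hot pages updated round-robin, and a checkpoint every N updates instead of every N seconds.

```python
PAGE_SIZE = 8192
RECORD_SIZE = 100   # hypothetical size of a small row-level WAL record


def wal_bytes(num_updates, updates_per_checkpoint, hot_pages):
    """Total WAL volume when the first touch of a page after a checkpoint logs the full page."""
    touched = set()
    total = 0
    for i in range(num_updates):
        if i % updates_per_checkpoint == 0:
            touched.clear()           # checkpoint: all pages are "fresh" again
        page = i % hot_pages
        if page not in touched:
            touched.add(page)
            total += PAGE_SIZE        # full page image
        total += RECORD_SIZE          # the regular record
    return total


rare = wal_bytes(1000, updates_per_checkpoint=1000, hot_pages=10)
frequent = wal_bytes(1000, updates_per_checkpoint=100, hot_pages=10)
print(rare, frequent)   # prints: 181920 919200
```

In this toy setup, checkpointing 10x more often produces about 5x more WAL for the same 1000 updates, which is the write amplification that pushes people toward long checkpoint intervals.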

MySQL: Doublewrite Buffer

InnoDB does it differently. Before flushing dirty pages to their actual data file locations, it first writes them sequentially into a dedicated Doublewrite Buffer area on disk. Once the buffer fills, it issues one fsync() and then scatter-writes the pages to where they actually belong.

Crash? On restart, check the Doublewrite Buffer for intact copies of any torn pages, restore them, done. No torn pages.
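The flush and recovery flow can be sketched as follows. This is a toy model, not InnoDB's implementation: the `ToyStore` class, the `seal` and `is_intact` helpers, and the page layout (payload followed by a SHA-256 digest) are all assumptions for illustration, and dictionaries stand in for the data file and the doublewrite area.

```python
import hashlib

PAGE_SIZE = 8192
CHECKSUM_SIZE = 32   # toy layout: SHA-256 digest stored at the end of each page


def seal(payload):
    """Build a page: payload followed by its checksum."""
    assert len(payload) == PAGE_SIZE - CHECKSUM_SIZE
    return payload + hashlib.sha256(payload).digest()


def is_intact(page):
    payload, digest = page[:-CHECKSUM_SIZE], page[-CHECKSUM_SIZE:]
    return hashlib.sha256(payload).digest() == digest


class ToyStore:
    def __init__(self, num_pages):
        blank = seal(bytes(PAGE_SIZE - CHECKSUM_SIZE))
        self.data_file = {n: blank for n in range(num_pages)}
        self.doublewrite = {}   # stands in for the on-disk doublewrite area

    def flush(self, dirty, crash_during=None):
        # Step 1: write every dirty page into the doublewrite area (then one fsync).
        self.doublewrite = dict(dirty)
        # Step 2: scatter-write the pages to their home locations.
        for page_no, page in dirty.items():
            if page_no == crash_during:
                # Simulated torn write: first half new, second half old.
                half = PAGE_SIZE // 2
                self.data_file[page_no] = page[:half] + self.data_file[page_no][half:]
                return
            self.data_file[page_no] = page

    def recover(self):
        """Restore any torn data page from its intact doublewrite copy."""
        repaired = []
        for page_no, copy in self.doublewrite.items():
            if not is_intact(self.data_file[page_no]) and is_intact(copy):
                self.data_file[page_no] = copy
                repaired.append(page_no)
        return repaired


store = ToyStore(num_pages=2)
dirty = {
    0: seal(b"x" * (PAGE_SIZE - CHECKSUM_SIZE)),
    1: seal(b"y" * (PAGE_SIZE - CHECKSUM_SIZE)),
}
store.flush(dirty, crash_during=1)              # crash while writing page 1
torn_before = not is_intact(store.data_file[1])
repaired = store.recover()
```

If the crash instead hits during step 1, the copy in the doublewrite area may be torn, but the page in the data file is still the intact old version, so either way one good copy survives.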

Why I Think Doublewrite Buffer Wins

Foreground vs background

Without data merging:

  • FPW = 1 WAL write + 1 page write
  • DWB = 2 page writes

Both cost two I/Os. But WAL writes are on the foreground path -- they directly hit your SQL latency. DWB writes are on the background flush path -- users barely notice.

Batching potential

DWB doesn't fsync() every page. It fills a buffer, then syncs once. WAL writes can batch too, sure, but it's the foreground path -- you can't make users wait forever, so the batching window is small.
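The difference in sync count is easy to see in a small sketch. The batch size of 32 pages and the helper names are assumptions for illustration; it writes to a temporary file and simply counts fsync() calls rather than measuring time.

```python
import os
import tempfile

PAGE_SIZE = 8192


def flush_one_by_one(fd, pages):
    """Write and fsync every page individually; returns the number of fsyncs."""
    syncs = 0
    for page in pages:
        os.write(fd, page)
        os.fsync(fd)
        syncs += 1
    return syncs


def flush_batched(fd, pages, batch_size):
    """Write a whole batch sequentially, then fsync once per batch."""
    syncs = 0
    for start in range(0, len(pages), batch_size):
        os.write(fd, b"".join(pages[start : start + batch_size]))
        os.fsync(fd)
        syncs += 1
    return syncs


pages = [bytes([i % 256]) * PAGE_SIZE for i in range(64)]

with tempfile.TemporaryFile() as f:
    per_page_syncs = flush_one_by_one(f.fileno(), pages)

with tempfile.TemporaryFile() as f:
    batched_syncs = flush_batched(f.fileno(), pages, batch_size=32)

print(per_page_syncs, batched_syncs)   # prints: 64 2
```

A background flusher can afford to wait until a batch is full; a foreground commit path cannot, which is why the same trick helps WAL less.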

No checkpoint frequency trade-off

DWB doesn't depend on checkpoints for torn page protection. So you can crank up checkpoint frequency for faster crash recovery without the write amplification penalty.

Benchmarks

Config:

shared_buffers=4GB
wal_buffers=64MB
synchronous_commit=on
maintenance_work_mem=2GB
checkpoint_timeout=30s

Fresh database per scenario, VACUUM FULL + 60s warmup before each 300s run.

Scenario: io-bound, --tables=10 --table_size=10000000

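| Workload   | Threads | FPW OFF (QPS) | FPW ON (QPS) | DWB ON (QPS) | FPW OFF (TPS) | FPW ON (TPS) | DWB ON (TPS) | FPW OFF (ms) | FPW ON (ms) | DWB ON (ms) |
|------------|---------|---------------|--------------|--------------|---------------|--------------|--------------|--------------|-------------|-------------|
| read_write | 32      | 360,764       | 158,865      | 260,171      | 18,038        | 7,943        | 13,009       | 1.77         | 4.03        | 2.46        |
| read_write | 64      | 484,988       | 190,654      | 307,735      | 24,249        | 9,533        | 15,387       | 2.64         | 6.71        | 4.16        |
| read_write | 128     | 556,021       | 194,301      | 301,791      | 27,801        | 9,715        | 15,387       | 4.60         | 13.17       | 9.81        |
| write_only | 32      | 318,879       | 108,696      | 188,760      | 53,146        | 18,116       | 31,460       | 0.60         | 1.77        | 1.02        |
| write_only | 64      | 345,766       | 117,533      | 197,251      | 57,628        | 19,589       | 32,875       | 1.11         | 3.27        | 1.95        |
| write_only | 128     | 356,725       | 89,144       | 202,884      | 59,454        | 14,857       | 33,814       | 2.15         | 8.61        | 3.78        |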

The numbers speak for themselves. With FPW OFF as the baseline, at write_only with 128 threads FPW ON drops to ~25% of baseline while DWB ON holds at ~57%, so DWB delivers about 2.3x the throughput of FPW. In the other rows DWB is roughly 1.6x to 1.7x faster than FPW, and its latency is lower in every row.
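The ratios can be rechecked directly from the QPS columns of the table. The snippet below only does arithmetic on the published numbers; the dictionary layout and the `ratios` helper are my own naming.

```python
# (workload, threads) -> (FPW OFF, FPW ON, DWB ON) QPS, copied from the table
qps = {
    ("read_write", 32): (360_764, 158_865, 260_171),
    ("read_write", 64): (484_988, 190_654, 307_735),
    ("read_write", 128): (556_021, 194_301, 301_791),
    ("write_only", 32): (318_879, 108_696, 188_760),
    ("write_only", 64): (345_766, 117_533, 197_251),
    ("write_only", 128): (356_725, 89_144, 202_884),
}


def ratios(row):
    """Relative throughput: each mode vs the FPW OFF baseline, and DWB vs FPW."""
    fpw_off, fpw_on, dwb_on = row
    return {
        "fpw_vs_off": fpw_on / fpw_off,
        "dwb_vs_off": dwb_on / fpw_off,
        "dwb_vs_fpw": dwb_on / fpw_on,
    }


for (workload, threads), row in qps.items():
    r = ratios(row)
    print(
        f"{workload:<10} {threads:>3} threads: "
        f"FPW ON {r['fpw_vs_off']:.0%} of baseline, "
        f"DWB ON {r['dwb_vs_off']:.0%} of baseline, "
        f"DWB/FPW {r['dwb_vs_fpw']:.2f}x"
    )
```

The write_only 128-thread row is where the gap is widest; the other rows show a smaller but consistent advantage for DWB.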

Code

The modified PostgreSQL with Doublewrite Buffer support: https://github.com/baotiao/postgres

The whole kernel modification was done with Claude Code. So yeah, AI can hack on database internals now.

BTW: Check out AliSQL
