James Miller

How I Slashed API Response Time from 200ms to 10ms

Listen, young Padawan. When your API crawls like a snail, your P95 latency skyrockets, and your server crashes at 3 AM under a traffic spike, you have two choices:

  1. Spend three months rewriting everything in Rust.
  2. Watch your users churn.

Or, you can cheat like I did.

I grafted the extreme speed of Bun onto the massive ecosystem of Node.js. Don't laugh; I'm serious. I crushed a bloated backend endpoint down to under 10ms without rewriting 5 years of legacy business logic.

1. Bun on the Edge, Node.js Worker Pool in the Back

Everyone knows Node's HTTP layer carries real per-request overhead. But my business logic is full of legacy crypto libraries and old SDKs that simply cannot be ported to Bun.

So, my solution is "Front Shop, Back Factory."

I used Bun to build an extremely thin HTTP layer responsible solely for routing, parameter validation, and blocking invalid requests. Only when the actual legacy business logic is needed do I offload the task to a resident Node process via IPC (Inter-Process Communication).

Crucial Tip: Never spawn a Node process when a request arrives. That is slower than using Node alone. You must pre-start a group of Node Workers and keep them warm.

Bun Side (The Gateway):

// bun-gateway.ts
const textDecoder = new TextDecoder();
const textEncoder = new TextEncoder();

// Start a resident Node process once at boot, not one per request.
// In production, spawn a small pool of these and round-robin between them.
const nodeWorker = Bun.spawn(["node", "heavy-lifter.js"], {
  stdin: "pipe",
  stdout: "pipe",
});

// One long-lived reader. Calling getReader() per request would throw,
// because the first call locks the stream.
const nodeReader = nodeWorker.stdout.getReader();

// Simple wrapper to offload the dirty work
async function askNode(payload: any) {
  const msg = JSON.stringify(payload) + "\n";
  nodeWorker.stdin.write(textEncoder.encode(msg));
  nodeWorker.stdin.flush();

  // Simplified reading logic (handle partial and concatenated
  // newline-delimited messages in production!)
  const { value } = await nodeReader.read();
  return JSON.parse(textDecoder.decode(value));
}

Bun.serve({
  port: 3000,
  async fetch(req) {
    // Match on the pathname; matching on req.url breaks once query strings appear
    const path = new URL(req.url).pathname;
    if (path === "/fast") return new Response("Bun is fast!");

    // Only send heavy lifting to Node
    if (path === "/heavy") {
      const data = await req.json();
      const result = await askNode(data);
      return Response.json(result);
    }
    return new Response("Not Found", { status: 404 });
  },
});

Node Side (The Worker):

// heavy-lifter.js
const readline = require('readline');

const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
  terminal: false
});

rl.on('line', (line) => {
  const data = JSON.parse(line);
  // Pretend we are doing heavy crypto computation
  // Legacy Node ecosystem code runs here unchanged
  const result = { processed: true, echo: data };
  console.log(JSON.stringify(result));
});

With this setup, routing and I/O are sub-millisecond, while Node focuses purely on computation. Efficiency doubled instantly.

2. Stop the CPU from Moving Bricks: Zero-Copy with Bun

I noticed my server CPU was high just because we were reading local config files and static JSON, serializing them, and sending them to users.

In Node, you typically fs.readFile and then res.send. This involves multiple data copies: Disk -> Kernel -> User Space Buffer -> Socket.

In Bun, I switched to Bun.file(). This isn't just a syntax change; it tells the OS: "Throw this file directly to the network card; don't let it pass through my hands."

// Stop using readFile, stream directly
Bun.serve({
  fetch(req) {
    if (new URL(req.url).pathname === "/config") {
      // Hand the file straight to the socket; no JS-side buffer copies
      return new Response(Bun.file("./big-config.json"));
    }
    return new Response("Not Found", { status: 404 });
  }
});

This single line change tripled my static resource throughput.

3. Micro-batching: Queue Up!

What's the scariest thing about high concurrency? It's 1000 requests hitting at once, each triggering a separate database call or Node process invocation. It’s like students rushing the cafeteria at lunch.

I added a tiny buffer window. If 50 requests come in within 3ms, I pack them into an array and send them to Node or the DB in one go.

type Pending = { item: any; resolve: (result: any) => void };

let buffer: Pending[] = [];
let timer: Timer | null = null;

async function processBatch() {
  const currentBatch = buffer;
  buffer = [];
  timer = null;
  // Send 50 tasks to Node in one go, not 50 times
  const results = await askNode({ type: "batch", items: currentBatch.map((p) => p.item) });
  // Hand each caller its own result back (assumes the worker replies in order)
  currentBatch.forEach((p, i) => p.resolve(results.items[i]));
}

function enqueue(item: any): Promise<any> {
  return new Promise((resolve) => {
    buffer.push({ item, resolve });
    // Only start the timer on the first push
    if (!timer) {
      timer = setTimeout(processBatch, 3); // 3ms is imperceptible to users but huge for throughput
    }
  });
}

Waiting 3ms resulted in a 60% drop in CPU load.

4. Don't new Objects Inside Loops, Please

When reviewing code, I see people writing const db = new DatabaseClient() or const regex = new RegExp(...) inside fetch or handleRequest.

Reallocating memory, establishing connections, and compiling regex on every request is a recipe for a GC (Garbage Collection) explosion.

Lift everything reusable—Database pools, TextEncoder, RegEx, Encryption Keys—to the global scope. In a hybrid Bun/Node architecture, this is critical because we are chasing extreme low latency.
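
A quick before-and-after sketch (the handler and pattern names are illustrative, not from my real codebase):

// BAD: allocates and compiles on every single request
async function handleRequestBad(req: Request) {
  const encoder = new TextEncoder();           // fresh allocation per request
  const idPattern = new RegExp("^user-\\d+$"); // recompiled per request
  // ... use them once, then hand them to the GC
}

// GOOD: hoist reusable objects to module scope, created once at startup
const encoder = new TextEncoder();
const ID_PATTERN = /^user-\d+$/;

async function handleRequest(req: Request) {
  const id = new URL(req.url).searchParams.get("id") ?? "";
  if (!ID_PATTERN.test(id)) return new Response("Bad id", { status: 400 });
  // Shared instances: zero per-request allocations, nothing extra to collect
  return new Response(encoder.encode(`hello ${id}`));
}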

5. Dual-Layer Caching: When RAM Isn't Enough

I used to rely solely on Redis, but network requests still have overhead. Then I realized Bun reads files insanely fast.

So I implemented Dual-Layer Caching:

  1. L1 Memory Cache: Use LRU to store the hottest 1000 keys. Microsecond response.
  2. L2 File Cache: Write slightly colder data as JSON files to /tmp/cache/.

Checking if a file exists is much faster than initiating a TCP request to Redis.
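
Here's a minimal sketch of the read path. I'm assuming cache keys are filesystem-safe and values are JSON; a real version needs key sanitization and TTLs:

const L1_MAX = 1000;
const l1 = new Map<string, any>(); // Map keeps insertion order, enough for a cheap LRU

async function getCached(key: string): Promise<any | null> {
  // L1: hottest keys in memory, microsecond lookups
  if (l1.has(key)) {
    const value = l1.get(key);
    l1.delete(key);
    l1.set(key, value); // re-insert to mark as most recently used
    return value;
  }

  // L2: slightly colder data as JSON files on local disk
  const file = Bun.file(`/tmp/cache/${key}.json`);
  if (await file.exists()) {
    const value = await file.json();
    promoteToL1(key, value);
    return value;
  }

  return null; // miss: fall through to Redis or the origin
}

function promoteToL1(key: string, value: any) {
  l1.set(key, value);
  if (l1.size > L1_MAX) {
    // evict the least recently used entry (the oldest insertion)
    l1.delete(l1.keys().next().value!);
  }
}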

6. Ditch the Bloated npm Packages

In Node, we habitually npm install uuid or qs just to generate a UUID or parse params.

In Bun (and modern Node), crypto.randomUUID() and URLSearchParams are built-in and optimized at the C++ level.
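
The swap is tiny. One caveat: URLSearchParams only covers flat query strings, so if you depended on qs's nested-object parsing, keep qs for those routes:

// Before: npm install uuid qs
// import { v4 as uuidv4 } from "uuid";
// import qs from "qs";

// After: zero dependencies
const id = crypto.randomUUID();                // built into Bun and modern Node
const params = new URLSearchParams("a=1&b=2"); // replaces qs for flat params
console.log(id, params.get("a")); // e.g. "3f2c..." "1"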

I stripped out all unnecessary npm dependencies and switched to native APIs. This not only improved cold start times but, more importantly, reduced the node_modules I/O nightmare.

7. Taming the Split-Personality Dev Environment

This architecture uses Bun as the Gateway and Node as the Compute. But locally, I almost lost my mind.

My laptop runs Node 22. To maintain legacy projects, I need Node 14. I also need Bun, and occasionally Deno for scripts. Switching back and forth with nvm was exhausting—port conflicts, path errors, and environment variables were a mess. I’d fix the Bun environment, and the old Node project would break.

Then I found ServBay. It’s a lifesaver for developers. It’s not a crude version switcher; it’s a complete, isolated runtime environment platform.

  • Multi-Version Coexistence: I can run Node 14, Node 22, and Bun 1.1 environments simultaneously. They are completely isolated and don't fight each other.
  • One-Click Ecosystem: I can install databases with one click (Redis for caching, PostgreSQL for data), and even Caddy as a reverse proxy. Everything just works.
  • Zero Config: it made me realize how much time I used to waste configuring Docker and Homebrew.

With ServBay, I perfectly replicated the production hybrid architecture locally: Bun listening on port 3000, Node listening on internal pipes, and Redis running in the background. I no longer worry if it’s an environment issue or a code issue.

Conclusion

As long as I can crush response times into the 10ms range, I don't care how many runtimes I mix.

Bun gives me speed. Node gives me stability. ServBay gives me a sane environment.

Stop agonizing over whether to use Bun or Node.js. We are adults; why can't we have both? Combine them, and go cut your API latency by 90% right now.
