You hit a 429 error. "Too Many Requests." Your code was working fine, and now it's not. What happened?
Rate limits happened. And understanding them will save you hours of debugging and frustrated Slack messages to your team.
## What Rate Limits Actually Are
Rate limits cap how many requests you can make in a given time window. Hit the cap, and the API rejects further requests until the window resets.
Every API has them. Without limits, one runaway script could consume all the server's resources, taking down the service for everyone. Rate limits are protection — for the provider and for you.
Common patterns:
| Limit Type | Example | What It Means |
|---|---|---|
| Requests per second | 10 req/s | Max 10 calls in any 1-second window |
| Requests per minute | 100 req/min | Max 100 calls in any 60-second window |
| Requests per day | 10,000/day | Daily quota, usually resets at midnight UTC |
| Concurrent requests | 5 concurrent | Max 5 in-flight requests at once |
Some APIs use combinations. You might have 100 requests per minute and 5 per second. Both limits apply.
## Why You're Hitting Them
Nine times out of ten, it's one of these:
Loops without delays. You wrote a script to process 1,000 items, and it fires requests as fast as JavaScript can loop. That's way faster than most rate limits allow.
Retrying failures too aggressively. Something failed, so you retry immediately. And again. And again. Now you're hammering an already-stressed endpoint.
Parallel requests without limits. Promise.all() with 50 requests hits the API 50 times simultaneously. If the limit is 10 per second, you've exceeded it 5x in one moment.
Development quirks. Hot reloading, multiple browser tabs, test scripts running in loops — development environments can rack up requests faster than production ever would.
Shared rate limits. Some providers rate-limit by account, not by API key. If you have multiple apps using the same account, they share the limit.
## What Happens When You Hit One
The response is usually a 429 Too Many Requests status code. Sometimes 503 Service Unavailable or a custom error code.
Good APIs include helpful headers:
| Header | What It Tells You |
|---|---|
| Retry-After | Seconds to wait before trying again |
| X-RateLimit-Limit | Your total allowed requests |
| X-RateLimit-Remaining | How many you have left |
| X-RateLimit-Reset | When the window resets (timestamp) |
Read these headers. They're telling you exactly what to do.
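For example, here's a minimal sketch of checking them after a `fetch` call. The URL is a placeholder, and the header names follow the common convention above; your provider's exact names may differ:

```javascript
const response = await fetch('https://api.example.com/v1/lookup'); // placeholder URL

// Header names follow the common convention; check your provider's docs
const remaining = response.headers.get('X-RateLimit-Remaining');
console.log(`Requests left in this window: ${remaining}`);

if (response.status === 429) {
  // Retry-After is in seconds; fall back to 1 second if the header is missing
  const retryAfter = Number(response.headers.get('Retry-After') ?? 1);
  await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));
}
```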
## Strategies That Actually Work
### Add delays between requests
If you're processing items in a loop, space them out:
```javascript
// Small helper: resolve after the given number of milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

for (const item of items) {
  await processItem(item);
  await sleep(100); // 100ms between calls = max 10/second
}
```
Simple, effective, boring. That's the goal.
### Implement exponential backoff
When a request fails, don't retry immediately. Wait, then wait longer if it fails again:
- First retry: wait 1 second
- Second retry: wait 2 seconds
- Third retry: wait 4 seconds
- Fourth retry: wait 8 seconds
- Fifth retry: give up or alert someone
This gives the API time to recover and prevents you from making things worse.
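Here's a minimal sketch of that retry loop. `callApi` is a placeholder for whatever request you're making (assumed to return a fetch-style Response), and it honors Retry-After when the server provides it:

```javascript
// Sketch: callApi is a placeholder for your actual request function
async function withBackoff(callApi, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await callApi();
    if (response.status !== 429) return response;

    // Prefer the server's Retry-After (in seconds); otherwise double the wait each time
    const retryAfter = Number(response.headers.get('Retry-After'));
    const waitMs = Number.isFinite(retryAfter) && retryAfter > 0
      ? retryAfter * 1000
      : 1000 * 2 ** attempt; // 1s, 2s, 4s, 8s, ...

    await new Promise((resolve) => setTimeout(resolve, waitMs));
  }
  throw new Error('Rate limited: retries exhausted, time to alert someone');
}
```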
### Use a request queue
Instead of firing requests whenever your code wants to, funnel everything through a queue that respects the rate limit. Libraries like bottleneck or p-limit handle this for you.
The queue holds pending requests and releases them at a controlled pace. Your code doesn't need to think about limits — the queue handles it.
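For example, with bottleneck (a sketch; the option values here are assumptions you'd tune to your actual limit):

```javascript
import Bottleneck from 'bottleneck';

// At most 5 requests in flight, and at least 100ms between starts (~10/second)
const limiter = new Bottleneck({ maxConcurrent: 5, minTime: 100 });

// Wrap each call; the limiter queues them and releases them at the allowed pace
const results = await Promise.all(
  items.map((item) => limiter.schedule(() => processItem(item)))
);
```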
### Cache aggressively
The request you don't make can't be rate-limited.
IP addresses don't change location every minute. Validated emails stay validated. Currency exchange rates don't shift every second. Cache responses and skip redundant calls.
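A bare-bones in-memory cache with a time-to-live, as a sketch (a Map keyed by request parameters; `lookupIpLocation` is a hypothetical API call, and you'd swap in Redis or similar for anything shared across processes):

```javascript
const cache = new Map();
const TTL_MS = 60 * 60 * 1000; // keep entries for 1 hour; tune per data type

async function cachedLookup(key, fetchFresh) {
  const hit = cache.get(key);
  if (hit && Date.now() - hit.storedAt < TTL_MS) {
    return hit.value; // no request made, nothing to rate-limit
  }
  const value = await fetchFresh(key);
  cache.set(key, { value, storedAt: Date.now() });
  return value;
}

// Usage: repeated lookups for the same IP hit the cache, not the API
const geo = await cachedLookup('8.8.8.8', (ip) => lookupIpLocation(ip));
```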
## What Not to Do
Don't retry 429s immediately. You'll just get another 429. Respect the Retry-After header.
Don't create multiple accounts to bypass limits. Providers track this. You'll get all your accounts banned.
Don't assume the limit is wrong. If you're hitting limits with normal usage, either your usage pattern is inefficient or you need a higher tier. The limit isn't the problem.
Don't ignore rate limits during testing. Your test suite running 500 requests to validate "API connectivity" will burn through limits fast. Mock the API in tests.
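One way to do that is to stub `fetch` so tests never reach the real endpoint. A sketch, with a made-up response shape:

```javascript
// In test setup: replace global fetch with a stub that returns a canned response
const realFetch = globalThis.fetch;

globalThis.fetch = async () =>
  new Response(JSON.stringify({ status: 'ok', data: { valid: true } }), {
    status: 200,
    headers: { 'Content-Type': 'application/json' },
  });

// ...run the tests that exercise your integration code...

// In teardown: restore the real fetch
globalThis.fetch = realFetch;
```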
## Reading the Documentation
Before integrating any API, find the rate limit section. Look for:
- What's the limit? Requests per second, minute, hour, day?
- What's the window type? Rolling (sliding) or fixed (resets at intervals)?
- Is it per endpoint or global? Some APIs have different limits for different endpoints.
- What happens when exceeded? Hard block? Queued requests? Graceful degradation?
- How do you increase it? Paid tiers? Contacting support?
This takes five minutes and saves hours of debugging later.
## When Higher Limits Make Sense
Sometimes you legitimately need more capacity. Signs it's time to upgrade:
- You're regularly hitting limits during normal business hours
- Your backlog of queued requests keeps growing
- Users are seeing delays because of throttling
- You've already optimized caching and request patterns
APIVerve's Pro plan includes {{plan.pro.rateLimit}} requests per second — enough for most production workloads. For truly high-volume needs, custom limits are available.
## The Right Mindset
Rate limits aren't obstacles. They're part of the API's contract with you. The provider is saying: "We can guarantee this level of service for this many requests."
Exceed that, and the guarantee breaks down — for you and for everyone else using the API.
Build your integration assuming limits exist. Handle 429s gracefully. Implement backoff and queuing from the start, not after you hit problems in production.
Most rate limit issues come down to not reading the docs and not adding delays. Fix those two things, and you'll avoid 90% of problems.
Ready to integrate? Get your API key and check out the rate limit documentation for specifics.
Originally published at APIVerve Blog