You hit a 429 error. "Too Many Requests." Your code was working fine, and now it's not. What happened?
Rate limits happened. And understanding them will save you hours of debugging and frustrated Slack messages to your team.
## What Rate Limits Actually Are
Rate limits cap how many requests you can make in a given time window. Hit the cap, and the API rejects further requests until the window resets.
Every API has them. Without limits, one runaway script could consume all the server's resources, taking down the service for everyone. Rate limits are protection — for the provider and for you.
Common patterns:
| Limit Type | Example | What It Means |
|---|---|---|
| Requests per second | 10 req/s | Max 10 calls in any 1-second window |
| Requests per minute | 100 req/min | Max 100 calls in any 60-second window |
| Requests per day | 10,000/day | Daily quota, usually resets at midnight UTC |
| Concurrent requests | 5 concurrent | Max 5 in-flight requests at once |
Some APIs use combinations. You might have 100 requests per minute and 5 per second. Both limits apply.
## Why You're Hitting Them
Nine times out of ten, it's one of these:
Loops without delays. You wrote a script to process 1,000 items, and it fires requests as fast as JavaScript can loop. That's way faster than most rate limits allow.
Retrying failures too aggressively. Something failed, so you retry immediately. And again. And again. Now you're hammering an already-stressed endpoint.
Parallel requests without limits. Promise.all() with 50 requests hits the API 50 times simultaneously. If the limit is 10 per second, you've exceeded it 5x in one moment.
Development quirks. Hot reloading, multiple browser tabs, test scripts running in loops — development environments can rack up requests faster than production ever would.
Shared rate limits. Some providers rate-limit by account, not by API key. If you have multiple apps using the same account, they share the limit.
## What Happens When You Hit One
The response is usually a 429 Too Many Requests status code. Sometimes 503 Service Unavailable or a custom error code.
Good APIs include helpful headers:
| Header | What It Tells You |
|---|---|
| Retry-After | Seconds to wait before trying again |
| X-RateLimit-Limit | Your total allowed requests |
| X-RateLimit-Remaining | How many you have left |
| X-RateLimit-Reset | When the window resets (timestamp) |
Read these headers. They're telling you exactly what to do.
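For example, here's a minimal sketch of checking them after a `fetch` call. The URL is a placeholder, and the header names follow the common convention above; your provider's exact names may differ:

```javascript
const response = await fetch('https://api.example.com/v1/lookup'); // placeholder URL

// Header names follow the common convention; check your provider's docs
const remaining = response.headers.get('X-RateLimit-Remaining');
console.log(`Requests left in this window: ${remaining}`);

if (response.status === 429) {
  // Retry-After is in seconds; fall back to 1 second if the header is missing
  const retryAfter = Number(response.headers.get('Retry-After') ?? 1);
  await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));
}
```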
## Strategies That Actually Work
### Add delays between requests
If you're processing items in a loop, space them out:
```javascript
// Small helper: resolve after the given number of milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

for (const item of items) {
  await processItem(item);
  await sleep(100); // 100ms between calls = max 10/second
}
```
Simple, effective, boring. That's the goal.
### Implement exponential backoff
When a request fails, don't retry immediately. Wait, then wait longer if it fails again:
- First retry: wait 1 second
- Second retry: wait 2 seconds
- Third retry: wait 4 seconds
- Fourth retry: wait 8 seconds
- Fifth retry: give up or alert someone
This gives the API time to recover and prevents you from making things worse.
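Here's a minimal sketch of that retry loop. `callApi` is a placeholder for whatever request you're making (assumed to return a fetch-style Response), and it honors Retry-After when the server provides it:

```javascript
// Sketch: callApi is a placeholder for your actual request function
async function withBackoff(callApi, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await callApi();
    if (response.status !== 429) return response;

    // Prefer the server's Retry-After (in seconds); otherwise double the wait each time
    const retryAfter = Number(response.headers.get('Retry-After'));
    const waitMs = Number.isFinite(retryAfter) && retryAfter > 0
      ? retryAfter * 1000
      : 1000 * 2 ** attempt; // 1s, 2s, 4s, 8s, ...

    await new Promise((resolve) => setTimeout(resolve, waitMs));
  }
  throw new Error('Rate limited: retries exhausted, time to alert someone');
}
```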
### Use a request queue
Instead of firing requests whenever your code wants to, funnel everything through a queue that respects the rate limit. Libraries like bottleneck or p-limit handle this for you.
The queue holds pending requests and releases them at a controlled pace. Your code doesn't need to think about limits — the queue handles it.
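For example, with bottleneck (a sketch; the option values here are assumptions you'd tune to your actual limit):

```javascript
import Bottleneck from 'bottleneck';

// At most 5 requests in flight, and at least 100ms between starts (~10/second)
const limiter = new Bottleneck({ maxConcurrent: 5, minTime: 100 });

// Wrap each call; the limiter queues them and releases them at the allowed pace
const results = await Promise.all(
  items.map((item) => limiter.schedule(() => processItem(item)))
);
```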
### Cache aggressively
The request you don't make can't be rate-limited.
IP addresses don't change location every minute. Validated emails stay validated. Currency exchange rates don't shift every second. Cache responses and skip redundant calls.
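A bare-bones in-memory cache with a time-to-live, as a sketch (a Map keyed by request parameters; `lookupIpLocation` is a hypothetical API call, and you'd swap in Redis or similar for anything shared across processes):

```javascript
const cache = new Map();
const TTL_MS = 60 * 60 * 1000; // keep entries for 1 hour; tune per data type

async function cachedLookup(key, fetchFresh) {
  const hit = cache.get(key);
  if (hit && Date.now() - hit.storedAt < TTL_MS) {
    return hit.value; // no request made, nothing to rate-limit
  }
  const value = await fetchFresh(key);
  cache.set(key, { value, storedAt: Date.now() });
  return value;
}

// Usage: repeated lookups for the same IP hit the cache, not the API
const geo = await cachedLookup('8.8.8.8', (ip) => lookupIpLocation(ip));
```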
## What Not to Do
Don't retry 429s immediately. You'll just get another 429. Respect the Retry-After header.
Don't create multiple accounts to bypass limits. Providers track this. You'll get all your accounts banned.
Don't assume the limit is wrong. If you're hitting limits with normal usage, either your usage pattern is inefficient or you need a higher tier. The limit isn't the problem.
Don't ignore rate limits during testing. Your test suite running 500 requests to validate "API connectivity" will burn through limits fast. Mock the API in tests.
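One way to do that is to stub `fetch` so tests never reach the real endpoint. A sketch, with a made-up response shape:

```javascript
// In test setup: replace global fetch with a stub that returns a canned response
const realFetch = globalThis.fetch;

globalThis.fetch = async () =>
  new Response(JSON.stringify({ status: 'ok', data: { valid: true } }), {
    status: 200,
    headers: { 'Content-Type': 'application/json' },
  });

// ...run the tests that exercise your integration code...

// In teardown: restore the real fetch
globalThis.fetch = realFetch;
```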
## Reading the Documentation
Before integrating any API, find the rate limit section. Look for:
- What's the limit? Requests per second, minute, hour, day?
- What's the window type? Rolling (sliding) or fixed (resets at intervals)?
- Is it per endpoint or global? Some APIs have different limits for different endpoints.
- What happens when exceeded? Hard block? Queued requests? Graceful degradation?
- How do you increase it? Paid tiers? Contacting support?
This takes five minutes and saves hours of debugging later.
## When Higher Limits Make Sense
Sometimes you legitimately need more capacity. Signs it's time to upgrade:
- You're regularly hitting limits during normal business hours
- Your backlog of queued requests keeps growing
- Users are seeing delays because of throttling
- You've already optimized caching and request patterns
APIVerve's Pro plan includes {{plan.pro.rateLimit}} requests per second — enough for most production workloads. For truly high-volume needs, custom limits are available.
## The Right Mindset
Rate limits aren't obstacles. They're part of the API's contract with you. The provider is saying: "We can guarantee this level of service for this many requests."
Exceed that, and the guarantee breaks down — for you and for everyone else using the API.
Build your integration assuming limits exist. Handle 429s gracefully. Implement backoff and queuing from the start, not after you hit problems in production.
Most rate limit issues come down to not reading the docs and not adding delays. Fix those two things, and you'll avoid 90% of problems.
Ready to integrate? Get your API key and check out the rate limit documentation for specifics.
Originally published at APIVerve Blog