What is concurrency limiting?
When you send “action” requests to Kombo (e.g., creating a candidate, moving
an application, reading attachments), each request is forwarded to the underlying HR or ATS tool.
These tools have their own rate limits, which are often significantly lower than
what Kombo allows.
If too many action requests are in flight at once, they queue up waiting for
downstream capacity. This creates backpressure: requests can wait for minutes and
may eventually time out on your side, even though the action may have partially
completed on our side.
Concurrency limiting solves this by capping the number of action requests
that can be processed simultaneously per integration. When the cap is
reached, additional requests are immediately rejected with a 429 status code.
This gives you a fast, clear signal to retry rather than leaving requests stuck
in a slow queue.
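To make the "reject instead of queue" behavior concrete, here is a toy model of such a cap. It is only an illustration under simplifying assumptions, not Kombo's actual implementation; the class name `ConcurrencyLimiter` and its methods are hypothetical.

```python
import threading


class ConcurrencyLimiter:
    """Toy model: at most `limit` requests in flight; extras are rejected, never queued."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Claim a slot if one is free. Returns False immediately when the cap is reached."""
        with self._lock:
            if self._in_flight >= self.limit:
                return False  # a server would answer with HTTP 429 here
            self._in_flight += 1
            return True

    def release(self) -> None:
        """Free a slot once the in-flight request has completed."""
        with self._lock:
            if self._in_flight > 0:
                self._in_flight -= 1

    @property
    def remaining(self) -> int:
        """Number of additional requests that could be accepted right now."""
        with self._lock:
            return self.limit - self._in_flight
```

The key point is that `try_acquire` never blocks: a caller learns right away whether a slot is available, and a slot becomes available again as soon as any in-flight request calls `release`.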
Concurrency limiting only applies to action endpoints (POST requests that
write to or read from the underlying tool). Model endpoints for reading synced
data are not affected.
How it differs from rate limiting
| | Rate Limiting | Concurrency Limiting |
|---|---|---|
| What it caps | Total requests per time window | Simultaneous in-flight requests |
| Scope | All API requests | Actions only |
| When it resets | After the time window elapses | As in-flight requests complete |
| Error code | PLATFORM.RATE_LIMIT_EXCEEDED | PLATFORM.CONCURRENCY_LIMIT_EXCEEDED |
Both return HTTP 429, but with different error codes and headers.
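Since the status code alone cannot tell the two cases apart, a client has to look at the error code. The sketch below shows one way to do that. It assumes the error code sits under `error.code` in the JSON body for both kinds of 429 (this page only shows that shape for the concurrency case), and the function name `classify_429` is hypothetical.

```python
from typing import Optional


def classify_429(status: int, body: dict) -> Optional[str]:
    """Return "concurrency", "rate", or "unknown" for a 429 response, else None.

    Assumption: the error code is found at body["error"]["code"].
    """
    if status != 429:
        return None
    code = (body.get("error") or {}).get("code")
    if code == "PLATFORM.CONCURRENCY_LIMIT_EXCEEDED":
        return "concurrency"
    if code == "PLATFORM.RATE_LIMIT_EXCEEDED":
        return "rate"
    return "unknown"
```

A caller could use the result to pick a strategy: a short retry for "concurrency", and waiting for the rate limit window for "rate".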
When concurrency limiting is active, every action response includes two headers:
| Header | Example | Description |
|---|---|---|
| Concurrency-Limit | 30 | The maximum number of concurrent in-flight requests allowed for this integration |
| Concurrency-Remaining | 12 | How many additional concurrent requests can be accepted right now |
The default limit is 30, but the limit may vary per integration. Sensitive
integrations (e.g., reverse-engineered APIs) may have significantly lower
limits. Use the response headers to discover the actual limit rather than
assuming a specific number.
These headers appear on both successful responses and 429 rejections, so you can monitor utilization proactively.
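As a rough sketch of proactive monitoring, the helper below turns the two headers into a utilization ratio. It assumes you have the response headers as a plain string mapping; the function name `concurrency_utilization` is hypothetical, and the header names are matched case-insensitively because HTTP header names are not case-sensitive.

```python
from typing import Mapping, Optional


def concurrency_utilization(headers: Mapping[str, str]) -> Optional[float]:
    """Return the fraction of concurrency slots in use (0.0 to 1.0).

    Returns None if the headers are missing or malformed, for example when
    concurrency limiting is not active for the endpoint.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        limit = int(lowered["concurrency-limit"])
        remaining = int(lowered["concurrency-remaining"])
    except (KeyError, ValueError):
        return None
    if limit <= 0:
        return None
    return (limit - remaining) / limit
```

With the example values from the table (a limit of 30 and 12 remaining), 18 of 30 slots are in use. A client might start slowing down once this ratio crosses a threshold of its own choosing.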
Handling concurrency limit errors
When you exceed the concurrency limit, you receive a 429 response with the error code PLATFORM.CONCURRENCY_LIMIT_EXCEEDED. The response includes both concurrency and rate limit headers:
HTTP/1.1 429
concurrency-limit: 30
concurrency-remaining: 0
ratelimit-limit: 1000
ratelimit-remaining: 834
ratelimit-reset: 44

{
  "status": "error",
  "error": {
    "code": "PLATFORM.CONCURRENCY_LIMIT_EXCEEDED",
    "title": "Concurrency limit exceeded.",
    "message": "Maximum concurrent action requests per integration is 30. Currently 30 in flight."
  }
}
Notice how the rate limit headers show remaining quota (ratelimit-remaining: 834) while the concurrency headers show no remaining slots (concurrency-remaining: 0). This tells you:
- The underlying tool is at capacity for this integration, not Kombo itself.
- You still have rate limit quota, so waiting for ratelimit-reset is unnecessary.
- Slots free up quickly as in-flight requests complete, so a short retry is effective.
How to retry
- Retry the failed request with exponential backoff: start with a short delay (e.g., 1s) and increase it on repeated 429s (2s, 4s, 8s, …).
- Optionally, throttle proactively: check Concurrency-Remaining on successful responses to slow down before hitting the limit.
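The retry advice can be sketched as a small wrapper. This is a minimal example, not an official client: `send_with_backoff` and its parameters are hypothetical, and `send` stands in for whatever function performs your action request and returns a `(status, headers, body)` tuple. The `sleep` parameter exists so the delay can be swapped out in tests.

```python
import time
from typing import Callable, Tuple

# (HTTP status, response headers, parsed JSON body)
Response = Tuple[int, dict, dict]

CONCURRENCY_CODE = "PLATFORM.CONCURRENCY_LIMIT_EXCEEDED"


def send_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, retrying concurrency-limit 429s with delays of 1s, 2s, 4s, ...

    Any other response (success, other errors, rate-limit 429s) is returned
    to the caller unchanged, as is the last response once attempts run out.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, _headers, body = response
        code = (body.get("error") or {}).get("code")
        rejected = status == 429 and code == CONCURRENCY_CODE
        if not rejected or attempt == max_attempts:
            return response
        sleep(delay)  # wait for in-flight requests to free up slots
        delay *= 2    # double the delay after each repeated 429
    raise AssertionError("unreachable")
```

Only the concurrency error code triggers a retry here; a rate-limit 429 is handed back so the caller can apply its own handling for that case.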
What this means for your integration
Concurrency limiting does not reduce your overall throughput. The downstream
tool was always the bottleneck. Concurrency limiting just makes that bottleneck
visible through fast 429 responses instead of hiding it behind slow timeouts.