# Concurrency Limiting

> Concurrency limiting caps simultaneous in-flight unified actions per integration, giving you fast feedback instead of slow timeouts.

## What is concurrency limiting?

When you invoke unified actions in Kombo (e.g., creating a candidate, moving
an application, reading attachments), each call is forwarded to the underlying HR or ATS tool.
These tools have their own rate limits, which are often significantly lower than
what Kombo allows.

If many unified actions run in parallel, they all compete for limited
downstream throughput. Completion times grow, your HTTP clients may hit
timeouts while work is still in flight, and retries can overlap with operations
that are still running.

Concurrency limiting solves this by capping how many unified actions
can be processed simultaneously per integration. When the cap is
reached, additional requests are immediately rejected with a `429` status code.
This gives you a fast, clear signal to retry with backoff instead of stacking
unbounded in-flight work that might eventually time out on your side.

<Note>
  Concurrency limiting only applies to unified actions (calls that run work
  against the underlying tool). Model endpoints for reading synced data are not
  affected.
</Note>

<Frame>
  <img src="https://mintcdn.com/kombo/1ermrU0qvAfbWZQb/images/concurrency-limiting.png?fit=max&auto=format&n=1ermrU0qvAfbWZQb&q=85&s=73a2a69790775806d0aa7145a267821b" alt="Two-panel diagram comparing overload (client timeout before completion, ambiguous retries) with concurrency limiting (429 and controlled in-flight work)." width="1646" height="1426" data-path="images/concurrency-limiting.png" />
</Frame>

## How it differs from rate limiting

For Kombo-wide request volume over time, see [Rate limiting](../getting-started/querying-api#rate-limiting).

|                    | Rate Limiting                  | Concurrency Limiting                  |
| ------------------ | ------------------------------ | ------------------------------------- |
| **What it caps**   | Total requests per time window | Simultaneous in-flight requests       |
| **Scope**          | All API requests               | Unified actions only                  |
| **When it resets** | After the time window elapses  | As in-flight requests complete        |
| **Error code**     | `PLATFORM.RATE_LIMIT_EXCEEDED` | `PLATFORM.CONCURRENCY_LIMIT_EXCEEDED` |

Both return HTTP `429`, but with different error codes and headers.
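
Because the status code alone does not tell the two apart, client code can branch on the error code in the response body. The following is a minimal sketch, not an official client: `classify_429` is a hypothetical helper, and it assumes that rate limit errors use the same `error.code` envelope as the concurrency limit error shown later on this page.

```python
from typing import Optional

RATE_LIMIT = "PLATFORM.RATE_LIMIT_EXCEEDED"
CONCURRENCY_LIMIT = "PLATFORM.CONCURRENCY_LIMIT_EXCEEDED"


def classify_429(status: int, body: dict) -> Optional[str]:
    """Return "concurrency", "rate", or None for a parsed JSON response.

    Hypothetical helper: assumes both error types put their code at
    body["error"]["code"].
    """
    if status != 429:
        return None
    code = (body.get("error") or {}).get("code")
    if code == CONCURRENCY_LIMIT:
        return "concurrency"
    if code == RATE_LIMIT:
        return "rate"
    # A 429 with an unrecognized code: let the caller decide what to do.
    return None
```

A caller could then retry quickly on `"concurrency"` and wait for the rate limit window on `"rate"`.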

## Response headers

When concurrency limiting is active, every unified action response includes two headers:

| Header                  | Example | Description                                                                             |
| ----------------------- | ------- | --------------------------------------------------------------------------------------- |
| `Concurrency-Limit`     | `30`    | The maximum number of concurrent in-flight unified actions allowed for this integration |
| `Concurrency-Remaining` | `12`    | How many additional concurrent unified actions can be accepted right now                |

The default limit is 30, but the limit may vary per integration. Sensitive
integrations (e.g., reverse-engineered APIs) may have significantly lower
limits, so use the response headers to discover the actual limit rather than
assuming a specific number.

These headers appear on both successful responses and `429` rejections, so you can monitor utilization proactively.
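
As an illustration of proactive monitoring, the two headers can be turned into a utilization figure. This is a sketch under assumptions: `concurrency_utilization` is a hypothetical helper, and `headers` stands for whatever header mapping your HTTP client exposes.

```python
from typing import Mapping, Optional


def concurrency_utilization(headers: Mapping[str, str]) -> Optional[float]:
    """Return the fraction of concurrency slots in use (0.0 to 1.0).

    Returns None if the headers are missing or malformed, e.g. on
    endpoints that are not unified actions.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        limit = int(lowered["concurrency-limit"])
        remaining = int(lowered["concurrency-remaining"])
    except (KeyError, ValueError):
        return None
    if limit <= 0:
        return None
    return (limit - remaining) / limit
```

With the example values from the table (`Concurrency-Limit: 30`, `Concurrency-Remaining: 12`), 18 of 30 slots are in use, so the helper returns a utilization of 0.6.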

## Handling concurrency limit errors

When you exceed the concurrency limit, you receive a `429` response with the error code `PLATFORM.CONCURRENCY_LIMIT_EXCEEDED`. The response includes both concurrency and rate limit headers:

```http
HTTP/1.1 429
concurrency-limit: 30
concurrency-remaining: 0
ratelimit-limit: 1000
ratelimit-remaining: 834
ratelimit-reset: 44
```

```json
{
  "status": "error",
  "error": {
    "code": "PLATFORM.CONCURRENCY_LIMIT_EXCEEDED",
    "title": "Concurrency limit exceeded.",
    "message": "Maximum concurrent action requests per integration is 30. Currently 30 in flight."
  }
}
```

Notice how the rate limit headers show remaining quota (`ratelimit-remaining: 834`) while the concurrency headers show no remaining slots (`concurrency-remaining: 0`). This tells you:

* The underlying tool is at capacity for this integration, not Kombo itself.
* You still have rate limit quota, so waiting for `ratelimit-reset` is unnecessary.
* Slots free up quickly as in-flight requests complete, so a short retry is effective.

### How to retry

For `PLATFORM.CONCURRENCY_LIMIT_EXCEEDED`, retrying the same request is safe
and expected: Kombo rejects the call before the unified action runs (no
concurrency slot was acquired), so a retry cannot run the action twice. That
differs from failures where the outcome is unclear. For example, a write that
timed out on your side may still have succeeded, so retrying it needs extra care.

1. Retry the failed request with exponential backoff: start with a short delay (e.g., 1s) and increase on repeated `429`s (2s, 4s, 8s, ...).
2. Optionally, throttle proactively: check `Concurrency-Remaining` on successful responses to slow down before hitting the limit.
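
The retry step above can be sketched roughly as follows. This is an illustrative wrapper, not part of any Kombo SDK: `call_with_backoff`, `send`, and the parameter names are hypothetical. `send` stands for any function that performs one request and returns the status code plus the parsed JSON body.

```python
import time
from typing import Callable, Tuple

CONCURRENCY_LIMIT = "PLATFORM.CONCURRENCY_LIMIT_EXCEEDED"


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() and retry only on concurrency-limit 429 responses.

    Any other response (success, rate limit, other errors) is returned
    to the caller unchanged after the first attempt that produces it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    status, body = send()
    for attempt in range(max_attempts - 1):
        code = (body.get("error") or {}).get("code")
        if not (status == 429 and code == CONCURRENCY_LIMIT):
            break
        # Exponential backoff: 1s, 2s, 4s, ... capped at max_delay.
        sleep(min(base_delay * 2 ** attempt, max_delay))
        status, body = send()
    return status, body
```

The wrapper deliberately retries only the concurrency error code, since that is the case where the action is known not to have run. The `sleep` parameter is injectable so the delays can be observed in tests without waiting.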

## What this means for your integration

Concurrency limiting does not cap steady-state throughput below what the
downstream tool can sustain. The integrated system still bounds how fast work
completes. The limit turns that constraint into immediate `429` responses
instead of long waits and unbounded in-flight unified actions.
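
One way to avoid most `429`s in the first place is to bound in-flight unified actions on your side as well. The sketch below assumes an `asyncio`-based client running in a single process. `run_bounded` and `max_in_flight` are hypothetical names, and each item in `actions` is a zero-argument coroutine function that performs one unified action call.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")


async def run_bounded(
    actions: Iterable[Callable[[], Awaitable[T]]],
    max_in_flight: int,
) -> List[T]:
    """Run all actions with at most max_in_flight running at once.

    Results are returned in the same order as the input actions.
    """
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(action: Callable[[], Awaitable[T]]) -> T:
        # Wait for a free slot before starting the call.
        async with semaphore:
            return await action()

    return await asyncio.gather(*(guarded(action) for action in actions))
```

A client-side bound like this only covers one process. If several workers share an integration, they can still exceed the limit together, so handling `429` with backoff remains necessary.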
