
# How Kombo Syncs Data

> Understanding how Kombo keeps integration data up-to-date internally without sacrificing API reliability and performance.

# Syncs in Kombo

We mirror the data from the source systems into our database at regular
intervals (or when you [manually trigger a sync](../v1/post-force-sync)). When you
call our API, we read from our database rather than from the source system, so
endpoints always respond reliably and quickly.

Through our syncing approach, we can also provide [webhooks](./webhooks) for
systems that don't provide webhooks themselves.

<Info>
  This page explains how Kombo syncs data internally. If you're looking for how
  to fetch data from Kombo into your system, see our [Fetching Data
  guide](../getting-started/fetching-data).
</Info>

<Note>
  Syncs adhere to [your scope config](../features/scopes), so sensitive data
  points you're not using will **not** end up in our database.
</Note>

## Full Syncs vs. Delta Syncs

Kombo uses two different sync strategies to keep your data up-to-date. Each
serves a different purpose and has different trade-offs regarding speed,
completeness, and frequency.

### Full Syncs

A full sync **re-fetches all data** from the connected system. It pulls every
record (employees, candidates, jobs, etc.) regardless of whether it changed
since the last sync.

**Key characteristics:**

* Fetches the **complete dataset** from the remote system
* **Detects deletions**: records that existed before but are no longer returned
  by the source system are marked as deleted in Kombo
* **Reconciles data quality issues**: because everything is re-fetched, full
  syncs can correct inconsistencies that may have been missed by delta syncs or
  webhooks
* **Picks up configuration changes**: if you update your
  [scope config](../features/scopes) to include new fields, a full sync is needed to
  backfill those fields across all existing records
* Runs at a **lower frequency** than delta syncs
* Takes **longer to complete** since it processes all records

Full syncs are essential because they are the only sync type that can reliably
detect deletions and ensure the complete dataset is consistent.

### Delta Syncs

A delta sync **fetches only data that changed** since the last sync. It uses
upstream filters (like `updated_after` timestamps) to request only new or
modified records from the source system.

**Key characteristics:**

* Fetches only **new and modified records**, making it significantly faster
* **Cannot detect deletions**: since it only asks for changes, it has no way of
  knowing which records were removed
* Runs at a **higher frequency** than full syncs
* **Not available for all integrations**: delta syncing depends on the source
  system's API supporting change-based filtering
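
The difference in deletion handling can be illustrated with a toy in-memory source system. This is a minimal sketch, not Kombo's internal implementation: the record IDs, timestamps, and the `full_fetch` and `delta_fetch` helpers are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical source system: record ID -> last-modified timestamp.
# "emp-2" existed at the previous sync but has since been removed upstream.
source = {
    "emp-1": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "emp-3": datetime(2024, 1, 5, tzinfo=timezone.utc),
}
mirrored_ids = {"emp-1", "emp-2"}  # what the mirror held after the previous sync
last_sync = datetime(2024, 1, 3, tzinfo=timezone.utc)


def full_fetch(source):
    """Return every record ID the source currently has."""
    return set(source)


def delta_fetch(source, updated_after):
    """Return only the IDs of records modified after the given timestamp."""
    return {rid for rid, modified in source.items() if modified > updated_after}


full_result = full_fetch(source)
delta_result = delta_fetch(source, last_sync)

# Full sync: anything mirrored but no longer returned must have been deleted.
deleted = mirrored_ids - full_result

# Delta sync: "emp-2" is simply absent from the response, which looks the
# same as "emp-1" being unchanged, so no deletion can be inferred.
```

Here `deleted` contains only `emp-2`, while `delta_result` contains only `emp-3`.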

### How They Work Together

When an integration supports delta syncs, Kombo runs both types on
complementary schedules:

| Sync type  | Frequency | What it fetches      | Detects deletions |
| ---------- | --------- | -------------------- | ----------------- |
| Full sync  | Lower     | All records          | Yes               |
| Delta sync | Higher    | Only changed records | No                |

For integrations that **don't** support delta syncs, Kombo runs full syncs at a
higher frequency to compensate for the lack of incremental updates.

The exact sync frequency depends on your plan and can be customized on higher
tiers.

<Info>
  Some integrations also support **upstream webhooks**, which provide near
  real-time updates for individual record changes. When webhooks are active, the
  polling sync frequency may be reduced since webhooks already keep the data
  fresh. Learn more about [upstream webhooks](./upstream-webhooks).
</Info>

## Sync Scheduling

Kombo uses a frequency-based scheduling system that automatically determines
when syncs should run for each integration.

### How Scheduling Works

The scheduling system works by tracking the **last sync time** and adding the
**configured frequency** to determine when the next sync is due.

For example, if a full sync completed at 2:00 PM and the configured frequency is
10 hours, the next sync will be due at 12:00 AM (midnight).

### Sync Frequencies

Different sync types run at different frequencies:

* **Full syncs** run less often when delta syncs are available, and more often
  when they are not
* **Delta syncs** run more frequently than full syncs

The exact frequency depends on your plan and can be customized on higher tiers.

When you manually trigger a sync, it **moves the schedule forward**:

**Example**: Say automatic full syncs are scheduled every 10 hours:

* Last sync: 10:00 AM → Next due: 8:00 PM
* Manual sync at 2:00 PM → Next due: 12:00 AM (midnight) (not 8:00 PM)

The schedule resets from the time of the manual sync, rather than keeping the
original schedule.
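
The scheduling arithmetic from the examples in this section can be sketched as follows. The `next_due` helper and the calendar dates are hypothetical and only illustrate the calculation; they are not part of any Kombo API.

```python
from datetime import datetime, timedelta


def next_due(last_sync_at: datetime, frequency: timedelta) -> datetime:
    """The next sync is due one full interval after the most recent sync."""
    return last_sync_at + frequency


frequency = timedelta(hours=10)

# Automatic sync at 10:00 AM: the next one is due at 8:00 PM the same day.
scheduled = next_due(datetime(2024, 1, 1, 10, 0), frequency)

# A manual sync at 2:00 PM becomes the new "last sync", so the next sync
# is due at midnight instead of 8:00 PM.
after_manual = next_due(datetime(2024, 1, 1, 14, 0), frequency)
```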

## Real-time Data Updates

In addition to full and delta syncs, we can also receive webhooks from selected
tools to keep our data in sync more efficiently. Read more in the
[upstream webhooks guide](./upstream-webhooks).

## Data Normalization

When syncing data from connected systems, Kombo normalizes records into the
unified model. Here's how different data scenarios are handled:

### Missing and Unsupported Fields

If a field is not available in the source system or the integration doesn't
support it, Kombo sets it to `null`. Records are **never dropped** due to
missing fields — you'll always receive the full record with `null` for any
unavailable properties.

This means you should design your system to handle `null` values gracefully,
since different source systems expose different sets of fields.
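
As an illustration, consumer code might treat every field as optional. This is a minimal sketch; the `first_name` and `last_name` keys and the fallback text are assumptions chosen for the example.

```python
def display_name(employee: dict) -> str:
    """Build a display name without assuming any field is populated."""
    parts = [employee.get("first_name"), employee.get("last_name")]
    # Skip fields that are null (None) or empty instead of failing on them.
    name = " ".join(part for part in parts if part)
    return name or "(name unavailable)"


full = display_name({"first_name": "Ada", "last_name": "Lovelace"})
partial = display_name({"first_name": "Ada", "last_name": None})
empty = display_name({"first_name": None, "last_name": None})
```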

### Enum Normalization

For categorical fields like `gender`, `employment_type`, or
`employment_status`, Kombo maintains extensive mapping tables that translate
hundreds of system-specific values into unified enum values. For example,
"Vollzeit" from a German system and "CDI" from a French system both map to
`FULL_TIME`.

If a value from the source system cannot be mapped to a known enum value, Kombo
generally **passes through the original string** rather than discarding it. This
means you may occasionally see raw values from the source system alongside
standardized ones, especially for less common integrations. Some enums don't
pass unknown values through; see the documentation of the specific endpoint and
enum for details.

You can use [remapping](../features/remapping/introduction) to customize how specific
values are mapped if the defaults don't fit your needs.
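
The mapping and pass-through behavior can be sketched as follows. This is a simplified illustration under stated assumptions: the two-entry table stands in for the much larger real mapping tables, case-insensitive matching is assumed, and returning `None` when pass-through is disabled is a guess at how non-pass-through enums might behave.

```python
# Tiny stand-in for a mapping table (hypothetical; real tables are far larger).
EMPLOYMENT_TYPE_MAP = {"vollzeit": "FULL_TIME", "cdi": "FULL_TIME"}


def normalize_enum(raw, mapping, pass_through=True):
    """Map a source-system value to a unified enum value."""
    if raw is None:
        return None
    mapped = mapping.get(raw.strip().lower())
    if mapped is not None:
        return mapped
    # Unknown value: keep the original string, or drop it (assumed: None)
    # for enums that do not pass unknown values through.
    return raw if pass_through else None


known = normalize_enum("Vollzeit", EMPLOYMENT_TYPE_MAP)
unknown = normalize_enum("Werkstudent", EMPLOYMENT_TYPE_MAP)
dropped = normalize_enum("Werkstudent", EMPLOYMENT_TYPE_MAP, pass_through=False)
```

On the consuming side, this means code that switches on an enum field should include a fallback branch for strings outside the documented set.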

### Scope-Based Filtering

Fields that are not included in [your scope config](../features/scopes) are set to `null`
during sync. This ensures that sensitive data points you're not using never
enter the Kombo database, even if the source system provides them.
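
Conceptually, this filtering behaves like the sketch below. The `apply_scope` helper and the field names are hypothetical and only illustrate the outcome, not how Kombo implements it.

```python
def apply_scope(record: dict, enabled_fields: set) -> dict:
    """Keep values for enabled fields and null out everything else."""
    return {
        field: (value if field in enabled_fields else None)
        for field, value in record.items()
    }


raw = {"first_name": "Ada", "date_of_birth": "1815-12-10"}
scoped = apply_scope(raw, enabled_fields={"first_name"})
```

The out-of-scope `date_of_birth` key is still present in `scoped`, but its value is `None`.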

## How Syncs Write Data to the Database

Knowing how Kombo syncs write data to the database helps you reason about data
availability and timing.

### Non-Atomic Data Streaming

**Syncs are not atomic operations**. Instead of waiting to commit all data at
the end of a sync, Kombo streams data directly into the database as it's
processed during the sync. This means:

* **Data becomes available immediately**: Updated records appear in API
  responses as soon as they're processed, even while the sync is still running
* **Partial sync results**: If you query the API with `updated_after` during an
  ongoing sync, you'll receive records that have already been processed
* **No "commit phase"**: There's no single point where all sync data becomes
  available at once
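
The effect of streaming writes can be simulated with a generator that writes each record as soon as it is processed. This is a toy model with a dictionary standing in for the database; the `streaming_sync` helper is hypothetical.

```python
def streaming_sync(db: dict, incoming: list):
    """Write each record to the database as it is processed."""
    for record in incoming:
        db[record["id"]] = record  # visible to readers immediately
        yield record["id"]         # pause point: the sync is still running


db = {}
sync = streaming_sync(db, [{"id": "a"}, {"id": "b"}, {"id": "c"}])

next(sync)                     # the sync has processed its first record
visible_mid_sync = sorted(db)  # a reader querying now sees a partial result

for _ in sync:                 # let the sync run to completion
    pass
visible_after_sync = sorted(db)
```

A reader that queries mid-sync sees only record `a`; after completion it sees all three records, with no separate commit step in between.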

### Change Tracking with `changed_at`

Each record in Kombo's database has a `changed_at` timestamp that tracks when
the record was last modified. This field is automatically updated by database
triggers when any tracked field changes:

* **Immediate updates**: `changed_at` is set as soon as an upsert creates or
  changes a record during a sync
* **API filtering**: The `updated_after` parameter in API requests filters based
  on this `changed_at` timestamp
* **Real-time availability**: Records with updated `changed_at` timestamps
  become immediately visible in API responses that use `updated_after` filtering
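
The interaction between `changed_at` and `updated_after` can be sketched in plain Python. This is an approximation of the trigger behavior described above, not Kombo's actual code: the `TRACKED_FIELDS` tuple, the helper names, and the strict `>` comparison are assumptions.

```python
from datetime import datetime, timezone

TRACKED_FIELDS = ("first_name", "job_title")  # hypothetical tracked fields


def upsert(db: dict, record: dict, now: datetime) -> None:
    """Create or update a record; bump changed_at only on a real change."""
    existing = db.get(record["id"])
    changed = existing is None or any(
        existing.get(field) != record.get(field) for field in TRACKED_FIELDS
    )
    changed_at = now if changed else existing["changed_at"]
    db[record["id"]] = {**record, "changed_at": changed_at}


def filter_updated_after(db: dict, threshold: datetime) -> list:
    """Return records whose changed_at is later than the threshold."""
    return [r for r in db.values() if r["changed_at"] > threshold]


t1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2024, 1, 2, tzinfo=timezone.utc)
db = {}

# First sync creates both records.
upsert(db, {"id": "emp-1", "first_name": "Ada", "job_title": "Engineer"}, t1)
upsert(db, {"id": "emp-2", "first_name": "Grace", "job_title": "Admiral"}, t1)

# Second sync re-processes both, but only emp-2 actually changed.
upsert(db, {"id": "emp-1", "first_name": "Ada", "job_title": "Engineer"}, t2)
upsert(db, {"id": "emp-2", "first_name": "Grace", "job_title": "Rear Admiral"}, t2)

changed_ids = [r["id"] for r in filter_updated_after(db, t1)]
```

Only `emp-2` is returned, because re-processing an unchanged record leaves its `changed_at` untouched.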

### Sync Lifecycle and Database Operations

During a sync, Kombo performs these database operations:

1. **Upsert operations**: Each record is upserted (created or updated)
   immediately when processed. If there were changes, the `changed_at` timestamp
   is updated.
2. **Deletion tracking**: For full syncs, records not seen during the sync are
   marked with `remote_deleted_at` after the sync completes successfully.
3. **Reference validation**: Ensures data integrity by validating relationships
   between records.
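
The ordering of the first two steps can be sketched as follows: upserts happen while records stream in, and deletions are marked only after the sync finishes successfully. This is a simplified model with hypothetical names; reference validation is not modelled, and using `ConnectionError` to represent a failed sync is an assumption.

```python
from datetime import datetime, timezone


def run_full_sync(db: dict, records, now: datetime) -> bool:
    """Upsert records as they arrive; mark deletions only after success."""
    seen = set()
    try:
        for record in records:
            # Step 1: written immediately, not held back until the end.
            stored = db.setdefault(record["id"], {"remote_deleted_at": None})
            stored.update(record)
            seen.add(record["id"])
    except ConnectionError:
        # Failed sync: earlier upserts stay, but nothing is marked deleted.
        return False
    # Step 2: records not seen during a successful full sync are marked.
    for record_id, stored in db.items():
        if record_id not in seen and stored["remote_deleted_at"] is None:
            stored["remote_deleted_at"] = now
    return True


def failing_source():
    """Hypothetical source that breaks after returning one record."""
    yield {"id": "emp-1", "name": "Ada"}
    raise ConnectionError("source system unavailable")


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
db = {"emp-2": {"id": "emp-2", "name": "Grace", "remote_deleted_at": None}}

failed_ok = run_full_sync(db, failing_source(), now)
deleted_after_failure = db["emp-2"]["remote_deleted_at"]  # still None

succeeded_ok = run_full_sync(db, iter([{"id": "emp-1", "name": "Ada"}]), now)
deleted_after_success = db["emp-2"]["remote_deleted_at"]  # now set
```

Waiting for successful completion matters here: after the failed run, `emp-2` was never seen only because the sync aborted early, so marking it as deleted would have been wrong.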

### Implications for API Consumers

* **Progressive data updates**: You can start receiving updated data before the
  sync completes
* **Timestamp-based filtering works reliably**: Using `updated_after` will
  correctly capture changes made during ongoing syncs
* **No need to wait**: There's no need to wait for sync completion to access
  newly updated data
* **Consistent behavior**: This works the same way for all sync types (full,
  delta, and default syncs)
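
For consumers, these properties suggest a simple polling pattern: request everything changed after the last checkpoint and apply it idempotently by ID. The sketch below is a hedged illustration. `fetch_page` is a hypothetical callable that you would implement against the Kombo API, and the `results`/`next` page shape is an assumption for this example; see the Fetching Data guide for the actual request and response format.

```python
def fetch_changes(fetch_page, updated_after: str) -> list:
    """Collect all records changed after the checkpoint, following pagination."""
    results, cursor = [], None
    while True:
        page = fetch_page(updated_after=updated_after, cursor=cursor)
        results.extend(page["results"])
        cursor = page.get("next")
        if cursor is None:
            return results


def apply_changes(local: dict, records: list) -> None:
    """Upsert by ID so that receiving the same record twice is harmless."""
    for record in records:
        local[record["id"]] = record


def fake_fetch_page(updated_after, cursor):
    """Stand-in for a real API call, returning two fixed pages."""
    pages = {
        None: {"results": [{"id": "emp-1"}], "next": "page-2"},
        "page-2": {"results": [{"id": "emp-2"}], "next": None},
    }
    return pages[cursor]


local = {}
changes = fetch_changes(fake_fetch_page, updated_after="2024-01-01T00:00:00Z")
apply_changes(local, changes)
apply_changes(local, changes)  # re-applying the same changes is a no-op
```

Because records become visible while a sync is still running, a poll may return only part of that sync's changes; the next poll picks up the rest, and the ID-based upsert keeps repeated deliveries safe.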
