Before you can read data

Data starts syncing automatically once your customer has completed the connection flow for their HRIS.

While the first sync is still in progress, you cannot request any data from Kombo. If you try, the API responds with a 503 error that looks like this:

{
  "status": "error",
  "error": {
    "message": "The first sync of this integration didn't finish yet! You can keep polling this until you get a successful response or react to our webhooks."
  }
}

As the error message says, you can either keep polling the endpoint until you get a non-503 response or listen for our sync-finished webhook, as described in setting up your webhook.
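The polling option can be sketched roughly as follows. This is a minimal illustration, not an official client: `wait_for_first_sync`, the `fetch` callable (which is assumed to return an HTTP status code and the parsed JSON body), and the retry settings are hypothetical names and values chosen for this example.

```python
import time
from typing import Callable, Tuple


def wait_for_first_sync(
    fetch: Callable[[], Tuple[int, dict]],
    max_attempts: int = 60,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch` until it stops returning HTTP 503, then return the body.

    `fetch` is a hypothetical helper that performs one request against a
    Kombo endpoint and returns (status_code, parsed_json_body).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = fetch()
        if 200 <= status < 300:
            return body  # the first sync has finished
        if status != 503:
            # Any other error (e.g. bad credentials) will not fix itself by waiting.
            raise RuntimeError(f"unexpected status {status}: {body}")
        if attempt < max_attempts:
            sleep(delay_seconds)
    raise TimeoutError(f"first sync still running after {max_attempts} attempts")
```

Because the first sync can run for a long time, the webhook is usually the gentler option; polling like this is mostly useful as a fallback or during development.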

🦉 The first data sync can take anywhere from a few seconds to multiple hours, depending on your scope config, the system itself (some are heavily rate-limited), and how much data is in your customer’s system.

You should therefore design the UX for your customer around this waiting time.

Getting employees

Once the first sync is finished, you must fetch the data from the Kombo API and store it in your system so that you can show it to your customer without having to call the API again.

When querying data from the Kombo API, you should consider the following things:

  • We recommend setting the page_size query param to its maximum (250 elements) to minimize the number of API calls.
  • Our API is optimized to serve a few calls with large payloads at comparatively low latencies. That makes our API perfect for batch-requesting large amounts of data in a few seconds.
  • Make sure to implement pagination by taking the next key from our API response and passing it in the cursor query param of your following request.

A request to the get employees endpoint could therefore look like this:

curl --request GET \
  --url 'https://api.kombo.dev/v1/hris/employees?cursor=26vafvWSRmbhNcxJYqjCzuJg&page_size=250' \
  --header 'Authorization: Bearer <token>' \
  --header 'X-Integration-Id: <x-integration-id>'
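The pagination loop around such a request could be sketched like this. It is a minimal example under assumptions: the helper names `make_get_page` and `iter_employees` are hypothetical, and the response is assumed to carry the records under `data.results` and the cursor under `data.next`, so adjust the keys if your responses are shaped differently.

```python
import json
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://api.kombo.dev/v1/hris/employees"


def make_get_page(token: str, integration_id: str) -> Callable[[Optional[str]], dict]:
    """Build a function that fetches one page of employees for a cursor."""

    def get_page(cursor: Optional[str]) -> dict:
        params = {"page_size": 250}  # maximum page size, as recommended above
        if cursor:
            params["cursor"] = cursor
        request = Request(
            f"{API_URL}?{urlencode(params)}",
            headers={
                "Authorization": f"Bearer {token}",
                "X-Integration-Id": integration_id,
            },
        )
        with urlopen(request) as response:
            return json.load(response)

    return get_page


def iter_employees(get_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every employee by following the `next` cursor until it is empty."""
    cursor = None
    while True:
        data = get_page(cursor)["data"]  # assumed response shape
        yield from data["results"]
        cursor = data.get("next")
        if not cursor:
            break
```

Usage would look like `employees = list(iter_employees(make_get_page("<token>", "<x-integration-id>")))`. Passing the page fetcher in as a parameter keeps the loop easy to test without network access.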

Once you have the data, you can perform your specific business logic. Most customers in HRIS define their behavior like this:

  • Users to add: These are all of the records that, until now, did not appear in your query. They need to be inserted into your database and probably should get an invitation in case it’s the first time they are onboarded to your product.
  • Users to update: These are users that already have access but were changed in any given property (e.g. changed address, bank details, etc.). Those users can just be updated in the database.
  • Users to remove: These are users that you currently have in your database but that don’t show up in the API response anymore. You should notify these users that their access will be removed and mark them as deleted. A user may lose access and be onboarded again at a later time, so you should keep track of off-boarded employees to distinguish them from those who get access to the product for the first time. Storing only the employee’s ID from the HRIS system is a GDPR-compliant and relatively robust way to do so.
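The three buckets above can be computed with a simple comparison. The sketch below is illustrative only: `diff_employees` is a hypothetical helper, both inputs are assumed to be dictionaries keyed by the Kombo id, and it assumes `fetched` is a complete read of all employees (not a partial, change-tracked read).

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def diff_employees(
    stored: Dict[str, Record],
    fetched: Dict[str, Record],
) -> Tuple[List[Record], List[Record], List[Record]]:
    """Split a full fetch into (to_add, to_update, to_remove).

    `stored` is what your database currently holds and `fetched` is the
    complete API result; both map an employee id to its record.
    """
    # New ids: insert them and possibly send an invitation.
    to_add = [record for key, record in fetched.items() if key not in stored]
    # Known ids whose properties differ: update them in place.
    to_update = [
        record
        for key, record in fetched.items()
        if key in stored and stored[key] != record
    ]
    # Ids that disappeared: notify the user and mark them as deleted.
    to_remove = [record for key, record in stored.items() if key not in fetched]
    return to_add, to_update, to_remove
```

Marking removed users as deleted rather than dropping their rows keeps the off-boarded ids around, which is what lets you recognize a returning employee later.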

Fetching only updated data

Instead of reading the entire dataset, we highly recommend reading only employees that have been updated since you last read them. We have implemented change tracking to make this as easy as possible.

The change-tracking of Kombo (which you can learn more about here) centers around the updated_after query parameter, which you can use in the following way:

  1. Store the timestamp at which you start ingesting the data from the first sync in your own database. This field should probably be called something like this:

| customer_id | kombo_integration_id | kombo_last_sync_started_at |
| --- | --- | --- |
| <end_user.origin_id> | personio:8d1hpPsbjxUkoCoa1veLZGe5 | 1970-01-01T00:00:00.000Z |
| <end_user.origin_id> | hibob:B1hu5NGyhdjSq5X3hxEz4bAN | 1970-01-01T01:13:24.000Z |

  2. Every time Kombo is done syncing data, we send you a sync-finished webhook that looks like this:
{
  "id": "5gjAtURLPbnTiwgkaBfiA3WJ",
  "type": "sync-finished",
  "data": {
    "sync_id": "B89SCXXho7Yw8PGo8AKJxLn4",
    "sync_state": "SUCCEEDED",
    "sync_started_at": "2021-09-01T12:00:00.000Z",
    "sync_ended_at": "2021-09-01T12:30:00.000Z",
    "sync_duration_seconds": 1800,
    "integration_id": "personio:8d1hpPsbjxUkoCoa1veLZGe5",
    "integration_tool": "personio",
    "integration_category": "HRIS",
    "log_url": "https://app.kombo.dev/env/production/logs/C3xUo6XAsB2sbKC7M1gyXaRX"
  }
}
  3. Look up the kombo_last_sync_started_at for this specific integration in your database and pass it in the updated_after query param of the get employees endpoint, like this:
curl --request GET \
  --url 'https://api.kombo.dev/v1/hris/employees?updated_after=1970-01-01T00:00:00.000Z' \
  --header 'Authorization: Bearer <token>' \
  --header 'X-Integration-Id: <x-integration-id>'
  4. We will return all records that have been altered in one of the following ways:

    • property changed (e.g. the employment_status property of an employee)
    • relation property changed (e.g. the name of a group the employee is part of)
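Put together, the steps above could be sketched as one small handler. This is an illustration under assumptions: `plan_incremental_fetch` and the in-memory `last_sync_started_at` dictionary are hypothetical stand-ins for your own webhook handler and database table, and using the webhook's `sync_started_at` as the next stored timestamp is an assumption of this sketch.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode

EMPLOYEES_URL = "https://api.kombo.dev/v1/hris/employees"
# States in which the sync went through and updated data is available.
SYNCED_STATES = {"SUCCEEDED", "PARTIALLY_FAILED"}


def plan_incremental_fetch(
    payload: dict,
    last_sync_started_at: Dict[str, str],
) -> Optional[Tuple[str, str]]:
    """Return (request_url, timestamp_to_store_next), or None if nothing to fetch.

    `last_sync_started_at` maps an integration id to the stored
    kombo_last_sync_started_at value (a dict stands in for your database).
    """
    if payload.get("type") != "sync-finished":
        return None
    data = payload["data"]
    if data["sync_state"] not in SYNCED_STATES:
        return None  # keep the old timestamp and wait for the next sync

    params = {"page_size": 250}
    previous = last_sync_started_at.get(data["integration_id"])
    if previous:
        # Only ask for records changed since the last ingested sync.
        params["updated_after"] = previous
    url = f"{EMPLOYEES_URL}?{urlencode(params)}"
    return url, data["sync_started_at"]
```

A caller would fetch all pages from the returned URL and, only after ingesting them successfully, write the returned timestamp back to its database. Updating the timestamp last means a crash mid-ingest leads to a repeated read rather than missed changes.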

Implementing Kombo in an existing system (Matching users)

When fetching employees from Kombo, each employee falls into one of three cases: they exist both in Kombo and in your system, only in Kombo, or only in your system.

When you onboard a customer who already has some employees in your system, you need a way to match the existing user records in your database with the ones you’re getting from the Kombo API.

In practice, you want an upsert that finds existing employees and persists their Kombo ID, and that creates new entries for employees who are not yet in your database.

Before diving into the matching logic, let’s look at the different values that you can use to identify an employee contained in an API response:

  • id: This is a unique, Kombo-specific ID that is randomly generated and not present in any other system.
  • remote_id: This is the ID of the record in the “remote” system (the HRIS, e.g. Personio). The ID is always a string, but the format varies widely across tools. This is a fairly robust value to use as an identifier, but it will change if the employee is deleted and re-created in the HRIS.
  • work_email: The email address of the employee. While this is usually unique, a person’s email address can change over time (e.g. because the email naming convention changes from john@example.com to john.d@example.com, or the email domain changes from john@example.com to john@new-example.com). An employee’s work_email can also be added to a tool two or more times because they left a company and returned, or moved from a temporary to a full-time position. Make sure you are aware of these cases and ready to handle them if and when they come up.

Now for the matching logic. You should try to match the incoming records based on those values (in order of reliability):

  • remote_id: if you are already storing the external ID of a user in your system, you should use it whenever possible.
  • work_email: the next best option is matching on the email. As noted above, emails will not always be unique, so you should be ready to handle duplicate emails when matching (e.g. someone leaves a company and returns later on). For some tools, temporary workers all share the same email, which makes matching on work_email impossible.
  • work_email but with a changed domain: there might be cases in which the email differs only by the domain. You can probably match based on the “prefix” of the email, but it’s not a 100% reliable operation and should be done cautiously.
  • Edge cases:
    • In case you cannot match all employees between the API response and your own database, you should ask your user to match the remainder of the employees.
    • A possible solution is showing a UI with two sides (the new data on the left, and your current data on the right) that displays the mismatched records.
    • You should suggest a mapping based on the full name / private email of the employee but let your customer confirm that those mappings are correct.
    • If after the manual mapping, there are still some mismatched records, you should consider those employees to be removed/added to your system.
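The matching order above can be sketched as a single pass over the incoming employees. This is a minimal illustration, not a complete matcher: `match_employees` is a hypothetical helper, your user records are assumed to be dictionaries with `user_id`, `hris_id`, and `email` keys, and any email or prefix shared by several of your users is deliberately skipped so that ambiguous cases end up in the manual mapping step.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple


def _unique_index(users: List[dict], key_fn: Callable[[dict], Optional[str]]) -> Dict[str, dict]:
    """Index users by key, dropping empty keys and keys shared by several users."""
    buckets = defaultdict(list)
    for user in users:
        key = key_fn(user)
        if key:
            buckets[key].append(user)
    return {key: found[0] for key, found in buckets.items() if len(found) == 1}


def _local_part(email: Optional[str]) -> Optional[str]:
    """Return the part before the '@', lowercased (the email "prefix")."""
    return email.split("@", 1)[0].lower() if email else None


def match_employees(existing_users: List[dict], kombo_employees: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Return (matches, unmatched_employees), trying the most reliable key first."""
    by_remote_id = _unique_index(existing_users, lambda u: u.get("hris_id"))
    by_email = _unique_index(existing_users, lambda u: (u.get("email") or "").lower())
    by_prefix = _unique_index(existing_users, lambda u: _local_part(u.get("email")))

    matches, unmatched, used = [], [], set()
    for employee in kombo_employees:
        email = employee.get("work_email")
        candidates = [
            ("remote_id", by_remote_id.get(employee.get("remote_id"))),
            ("work_email", by_email.get((email or "").lower())),
            ("email_prefix", by_prefix.get(_local_part(email))),  # least reliable
        ]
        hit = next(((how, u) for how, u in candidates if u and u["user_id"] not in used), None)
        if hit is None:
            unmatched.append(employee)  # left for manual mapping by the customer
            continue
        how, user = hit
        used.add(user["user_id"])
        matches.append({
            "user_id": user["user_id"],
            "hris_id": employee["remote_id"],
            "kombo_id": employee["id"],
            "matched_by": how,
        })
    return matches, unmatched
```

Recording `matched_by` makes it possible to show prefix-based matches to your customer for confirmation instead of applying them silently.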

Once you match an employee you should persist both the id and remote_id in your database. This could look like this:

| user_id | hris_id | kombo_id |
| --- | --- | --- |
| <your_user_id1> | <kombo_remote_id1> | <kombo_id> |
| 1234 | 1239463 | 7E2gyuv6TmvtByzBxW9Sxt53 |

Handling failing syncs

It is possible that a sync fails, and if that happens, you will still be able to access the data based on the latest successful sync. Once the sync succeeds again, you will be able to get all updates that have happened since the last time.

When a sync fails, the sync-finished webhook has a data.sync_state property that is not "SUCCEEDED":

{
  "id": "5gjAtURLPbnTiwgkaBfiA3WJ",
  "type": "sync-finished",
  "data": {
    "sync_id": "B89SCXXho7Yw8PGo8AKJxLn4",
    "sync_state": "AUTHENTICATION_FAILED",
    "sync_started_at": "2021-09-01T12:00:00.000Z",
    "sync_ended_at": "2021-09-01T12:30:00.000Z",
    "sync_duration_seconds": 1800,
    "integration_id": "personio:CBNMt7dSNCzBdnRTx87dev4E",
    "integration_tool": "personio",
    "integration_category": "HRIS",
    "log_url": "https://app.kombo.dev/env/production/logs/C3xUo6XAsB2sbKC7M1gyXaRX"
  }
}

If you receive one of these values, it means the sync went through and you’ll get updates on the data:

| sync_state | explanation | next steps |
| --- | --- | --- |
| SUCCEEDED | Everything went fine | |
| PARTIALLY_FAILED | Succeeded with non-fatal errors | Kombo will be notified and look into the issue ASAP |

These values mean the sync failed and the problem is only fixable by Kombo

| sync_state | explanation | next steps |
| --- | --- | --- |
| CANCELLED | The sync was actively canceled by Kombo | This happens very rarely and has no negative side effects. If it does happen, we will schedule a new sync shortly after |
| FAILED | The sync failed with a fatal error | If this happens, we get an alert and will look into the issue to fix it ASAP |
| TIMED_OUT | The sync timed out before completion | This happens rarely and will cause an immediate and automatic restart of the sync. Kombo will be notified and look into the issue ASAP |

These values mean the sync failed because we were not able to authenticate. We will try to sync 3 more times. Afterwards, we will send the integration-state-changed webhook with state INVALID, and you or your customer will have to re-connect.

| sync_state | explanation | next steps |
| --- | --- | --- |
| AUTHENTICATION_FAILED | The sync couldn’t complete because the API credentials are invalid or don’t allow requesting all data points in your scope | This can only be fixed by your customer adding additional permissions to the credentials or updating the credentials altogether |
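
In a webhook handler, the tables above boil down to a small lookup. The following sketch is only one way to express it: the `NextStep` enum and `next_step_for` function are hypothetical names, and treating an unrecognized sync_state as "wait for Kombo" is an assumption of this example.

```python
from enum import Enum


class NextStep(Enum):
    INGEST_UPDATES = "ingest_updates"            # the sync went through
    WAIT_FOR_KOMBO = "wait_for_kombo"            # only fixable by Kombo
    ASK_CUSTOMER_TO_FIX_CREDENTIALS = "ask_customer_to_fix_credentials"


_NEXT_STEP_BY_STATE = {
    "SUCCEEDED": NextStep.INGEST_UPDATES,
    "PARTIALLY_FAILED": NextStep.INGEST_UPDATES,
    "CANCELLED": NextStep.WAIT_FOR_KOMBO,
    "FAILED": NextStep.WAIT_FOR_KOMBO,
    "TIMED_OUT": NextStep.WAIT_FOR_KOMBO,
    "AUTHENTICATION_FAILED": NextStep.ASK_CUSTOMER_TO_FIX_CREDENTIALS,
}


def next_step_for(payload: dict) -> NextStep:
    """Map a sync-finished webhook payload to the action your system takes."""
    if payload.get("type") != "sync-finished":
        raise ValueError("not a sync-finished webhook")
    state = payload["data"]["sync_state"]
    # Unknown states are treated conservatively: keep serving the data from
    # the latest successful sync and do nothing else (an assumption here).
    return _NEXT_STEP_BY_STATE.get(state, NextStep.WAIT_FOR_KOMBO)
```

For example, the AUTHENTICATION_FAILED payload shown earlier would map to `NextStep.ASK_CUSTOMER_TO_FIX_CREDENTIALS`, at which point you might prompt your customer to update their credentials.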