Extractor Builder
The Extractor Builder is a low-code development tool for data connectivity. It allows you to build your own Extractor using a guided interface. Once you’ve created your Extractor with a few clicks, users in your Data Pool can use it like any other native Celonis Extractor, including features such as table/column configuration, pseudonymization, and custom filtering.
Besides creating a new Extractor, you can customize one of the existing Extractors (for example: Bamboo, Ironclad, Happyfox, Jira, Greenhouse) and export any Extractor that you’ve built with the Extractor Builder in order to import it into a different Data Pool or even a different Celonis team.
The Extractor Builder supports GET requests to all REST APIs that return a JSON or XML response. As a team admin, you can find the Extractor Builder as a new menu option in Data Integration.
Features and requirements
Features:
All components of native Extractor available (table/column selection, filtering, data type conversion, pseudonymization etc.)
Sample response based on API call to source system
Define request parameters and headers for filtering
Pagination
Various error handling rules
Support for dependent endpoints
Export/import functionality
Customize existing Extractors (Happyfox, Bamboo)
Requirements:
Celonis Data Pool Admin Role
REST API only
Response in JSON or XML format
Only GET Requests
Authentication using Basic, Bearer, API Key or OAuth2
Step-by-Step Guide
Step 1: Create a new Extractor
As a Team or Data Pool Admin, you can find the Extractor Builder by going to “+ Add Data Connection” and scrolling down to the Custom Connections section. After selecting the Build a Custom Connection tile, you can create a new Extractor, customize an existing (pre-built*) Extractor, or export/import an Extractor from another Celonis team or Data Pool.
Once you’ve created an Extractor via the Extractor Builder, it will appear like any other source system on the connection overview page. To edit an existing Extractor, navigate to the connection overview page again, open the three-dot menu of the Extractor you'd like to configure, and click Edit.
* These Extractors were pre-built and shared by customers or the Celonis team. The number of Extractors available for customization will increase over time.

Step 2: Define Extractor information
After creating a new Extractor, you can provide its name (and optionally a description).
Step 3: Authentication method
Next, select your source system’s authentication method, which is usually described in the system’s API documentation. For each authentication method, you will find a short description including its required input fields. Your selection here determines which input fields are shown in the data connection configuration when users set up a data connection based on your Extractor.
Click Extractor Builder - Authentication Methods to learn more about authentication methods.
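As a rough illustration of what the simpler authentication methods mean at the HTTP level, the Python sketch below builds the request headers typically used for Basic, Bearer and API Key authentication. The function build_auth_headers and its argument names are hypothetical and not part of the Extractor Builder. In particular, the header that carries an API key differs between source systems (X-API-Key is only an assumed example, and some APIs expect the key as a request parameter instead). OAuth2 is left out because it requires a token exchange with the source system.

```python
import base64


def build_auth_headers(method: str, **credentials: str) -> dict:
    """Return HTTP headers for an authentication method (illustrative only)."""
    if method == "basic":
        # Basic auth: base64-encode "username:password".
        raw = f"{credentials['username']}:{credentials['password']}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    if method == "bearer":
        return {"Authorization": f"Bearer {credentials['token']}"}
    if method == "api_key":
        # Assumption: the key is sent in a header; the header name varies by API.
        header_name = credentials.get("header_name", "X-API-Key")
        return {header_name: credentials["api_key"]}
    raise ValueError(f"Unsupported authentication method: {method}")


basic = build_auth_headers("basic", username="user", password="pass")
bearer = build_auth_headers("bearer", token="abc123")
api_key = build_auth_headers("api_key", api_key="secret")
print(basic)
print(bearer)
print(api_key)
```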
When using OAuth (Authorization Code) authentication, use the following Redirect URL (Callback URL) when setting up the integration in the source system:
https://auth.redirect.celonis.cloud/extractor_redirect
If you cannot find the authentication method you are looking for, please get in touch with the Celonis ServiceDesk.
Step 4: Connection parameters
In addition to authentication method-specific parameters, you can define additional parameters that will be displayed as user input fields in the data connection configuration. These parameters can also be accessed in the next steps of the Extractor Builder configuration - such as in the endpoint configuration’s request URL, request parameter or header definition - via their respective placeholder. Example: parameter for your system’s API version.
By default, the {Connection.API_URL} parameter is created as a mandatory parameter; it usually contains the host. You can still define a default value for it.
For each parameter, you can define a default value, specify whether it is mandatory, and mark it as confidential (Is Confidential) so that it is stored as a secret and displayed as a password input.
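To make the placeholder mechanism more concrete, here is a minimal Python sketch of how {Connection.NAME} placeholders could be replaced with the values a user enters in the data connection configuration. The function resolve_placeholders, the API_VERSION parameter and the example host are assumptions made for illustration; how the Extractor Builder resolves placeholders internally is not described in this guide.

```python
import re


def resolve_placeholders(template: str, connection: dict) -> str:
    """Replace {Connection.NAME} placeholders with values from `connection`."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in connection:
            # A mandatory parameter without a value cannot be resolved.
            raise KeyError(f"Missing connection parameter: {name}")
        return str(connection[name])

    return re.sub(r"\{Connection\.([A-Za-z0-9_]+)\}", substitute, template)


# Hypothetical values entered by a user in the data connection configuration.
connection = {"API_URL": "https://example.zendesk.com/api", "API_VERSION": "v2"}

url = resolve_placeholders(
    "{Connection.API_URL}/{Connection.API_VERSION}/tickets", connection
)
print(url)
```

Other placeholders, such as {Dependency.id}, do not match the pattern and are left untouched by this sketch.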
Step 5: Define endpoints
The final step is to define the API endpoints to fetch relevant data.
Configure endpoint
Define a name for your endpoint. This name is only used to differentiate between endpoints and does not have any functional impact.
Configure request
The request URL defines the API endpoint that is called and can be identified in your source system’s API documentation. It always starts with the connection parameter {Connection.API_URL} (see Step 4).
Add Request Parameter
Add request parameters, for example to filter your API requests by creation date or last-updated date. Available request parameters can also be found in the source system’s API documentation. Users of your Extractor can provide parameter input in the extraction configuration.
Sticking with the Zendesk API ticket endpoint example from the last step, a common parameter use case here would be using the updated_at parameter (date format) to filter during delta loads.
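The following Python sketch shows how a request parameter ends up in the URL of a GET request. It follows the updated_at example above; the host, the date value and the helper build_request_url are illustrative assumptions, and the parameters an API actually accepts must be taken from its documentation.

```python
from urllib.parse import urlencode


def build_request_url(base_url: str, params: dict) -> str:
    """Append request parameters as a query string, skipping unset values."""
    query = urlencode({key: value for key, value in params.items() if value is not None})
    return f"{base_url}?{query}" if query else base_url


# Hypothetical endpoint URL and filter value.
base_url = "https://example.zendesk.com/api/v2/tickets"

full_load_url = build_request_url(base_url, {"updated_at": None})
delta_load_url = build_request_url(base_url, {"updated_at": "2024-01-01"})
print(full_load_url)
print(delta_load_url)
```

In this sketch, a full load simply omits the filter, while a delta load passes the date of the last extraction.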
Add Request Header
Add a request header by defining a key-value pair. (Example: define Accept as key and application/json as value for the API response to be returned as JSON.)
The connection parameters that have been defined in Step 4 can also be leveraged here for both request parameters and request headers.
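As a small illustration of request headers that reuse connection parameters, the Python sketch below resolves a placeholder inside a header value and attaches the headers to a GET request object without sending it. The X-Api-Version header, the API_VERSION parameter and the host are hypothetical; only the Accept header is taken from the example above.

```python
from urllib.request import Request

# Hypothetical connection parameter defined in Step 4.
connection = {"API_VERSION": "v2"}

# Header definitions as key-value pairs; values may contain placeholders.
header_templates = {
    "Accept": "application/json",
    "X-Api-Version": "{Connection.API_VERSION}",  # assumed header, for illustration
}


def resolve(value: str, connection: dict) -> str:
    """Replace each {Connection.NAME} placeholder with its configured value."""
    for name, param in connection.items():
        value = value.replace("{Connection." + name + "}", str(param))
    return value


headers = {key: resolve(value, connection) for key, value in header_templates.items()}

# Build (but do not send) the GET request.
request = Request(
    "https://example.zendesk.com/api/v2/tickets", headers=headers, method="GET"
)
print(request.get_method(), request.full_url)
print(headers)
```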
Choose Pagination Method
When extracting large data volumes via an API, the response usually does not return all records at once. Instead, it returns multiple pages, for example with 100 records per page. To fetch data from all pages when extracting via the Extractor Builder, you only have to select the pagination mechanism used by your source system’s API, which is usually described in its documentation.
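What following a pagination mechanism involves can be sketched in a few lines of Python. The example assumes a hypothetical API whose responses carry a next_page field that is empty on the last page; FAKE_PAGES and fetch_page stand in for real HTTP calls, and none of these names come from the Extractor Builder. Real APIs may use offsets, cursors or link headers instead.

```python
from typing import Callable, Iterator, Optional

# Hypothetical in-memory "API": three pages of ticket records.
FAKE_PAGES = {
    1: {"tickets": [{"id": 1}, {"id": 2}], "next_page": 2},
    2: {"tickets": [{"id": 3}, {"id": 4}], "next_page": 3},
    3: {"tickets": [{"id": 5}], "next_page": None},
}


def fetch_page(page: int) -> dict:
    """Stand-in for one GET request that returns a single page."""
    return FAKE_PAGES[page]


def extract_all(fetch: Callable[[int], dict], first_page: int = 1) -> Iterator[dict]:
    """Yield records page by page until the API reports no further page."""
    page: Optional[int] = first_page
    while page is not None:
        body = fetch(page)
        yield from body["tickets"]
        page = body["next_page"]


records = list(extract_all(fetch_page))
print(len(records), "records extracted")
```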
For the Zendesk example, information about the relevant pagination mechanism can be found here: Zendesk Documentation
If you cannot find the pagination method you are looking for, please contact Celonis ServiceDesk.
Configure Response
After configuring the API request, you can also configure its response. First, define the name of the target table into which the response data will be written in your Celonis Data Pool (example “tickets” in screenshot). To define the response structure and content, you have two options:
Copy and paste a JSON response example from the API documentation.
Sample a response directly from your source system (requirement: a data connection to the source system is already configured).
Based on the JSON response, the Extractor Builder automatically creates a table structure including all columns, data types and nested objects.
If your JSON response has multiple roots, you have the option to specify which root you want to extract via the dropdown.
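The idea of deriving a table structure from a sample response can be sketched as follows. This is not the Extractor Builder's actual algorithm: the sample response, the type names (INTEGER, STRING and so on) and the function infer_columns are assumptions for illustration, and the variable root stands in for the root chosen in the dropdown.

```python
import json

# Hypothetical sample response pasted from API documentation.
SAMPLE_RESPONSE = """
{
  "tickets": [
    {"id": 35436, "subject": "Help", "is_public": true, "score": 4.5,
     "tags": ["Enterprise"]}
  ],
  "count": 1
}
"""

# Assumed mapping from JSON value types to column types.
TYPE_NAMES = {bool: "BOOLEAN", int: "INTEGER", float: "FLOAT", str: "STRING"}


def infer_columns(record: dict) -> dict:
    """Map each scalar field of a sample record to a column type."""
    columns = {}
    for key, value in record.items():
        if isinstance(value, (list, dict)):
            continue  # nested structures would become separate tables
        columns[key] = TYPE_NAMES.get(type(value), "STRING")
    return columns


response = json.loads(SAMPLE_RESPONSE)
root = "tickets"  # the root selected for extraction
columns = infer_columns(response[root][0])
print(columns)
```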
Further, you can adjust column types/formats as required and select primary keys (will be used as default primary keys in the extraction configuration) in the table configuration.
You can easily delete elements from the created table structure by deleting the respective key value pair from the JSON response. Elements you remove here will not be extracted. (Example: remove raw_subject from extraction by deleting the corresponding key value pair from the response as shown in the screenshot below.)
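The effect of removing a key-value pair from the sample response can be pictured with this short Python sketch: only the keys that remain in the edited sample are kept when a record is processed. The sample values and the function project are hypothetical and only mirror the raw_subject example above.

```python
import json

# Hypothetical sample response for one ticket.
sample = json.loads(
    '{"id": 1, "subject": "Printer issue", "raw_subject": "Printer issue"}'
)

# Editing the sample: drop the raw_subject key-value pair.
edited_sample = {key: value for key, value in sample.items() if key != "raw_subject"}


def project(record: dict, edited_sample: dict) -> dict:
    """Keep only the fields that are still present in the edited sample."""
    return {key: record[key] for key in edited_sample if key in record}


api_record = {"id": 7, "subject": "Login fails", "raw_subject": "Login fails"}
row = project(api_record, edited_sample)
print(row)
```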
Note that if you define a primary key for a parent table, the primary key will automatically be created as a foreign key column in its nested tables, if any exist. (Example: in the screenshot, the tickets_id column in the nested table is automatically created after selecting the id column as the primary key of the parent table, tickets.)
Nested tables are created automatically from the JSON response. They are indicated by square brackets in the response.
For example, the sample response contains the following element:
"tags": [ "Enterprise", "Other_tag" ]
Based on this nested JSON response, the table tickets$tags is created automatically.
If you don’t want these dependent tables to be extracted, you can delete the respective parts from the JSON response.
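A minimal Python sketch of how a nested array could be split into a parent table and a nested table is given here. The table name tickets$tags and the foreign key column tickets_id follow the examples in this guide; the function split_nested and the name of the value column in the nested table are assumptions, not documented Extractor Builder behavior.

```python
def split_nested(table: str, records: list, primary_key: str) -> tuple:
    """Split list-valued fields into nested tables keyed by the parent's primary key."""
    parent_rows = []
    nested_tables = {}
    foreign_key = f"{table}_{primary_key}"  # e.g. tickets_id
    for record in records:
        parent_row = {}
        for key, value in record.items():
            if isinstance(value, list):
                nested_name = f"{table}${key}"  # e.g. tickets$tags
                rows = nested_tables.setdefault(nested_name, [])
                for item in value:
                    # Assumption: the value column is named after the array field.
                    rows.append({foreign_key: record[primary_key], key: item})
            else:
                parent_row[key] = value
        parent_rows.append(parent_row)
    return parent_rows, nested_tables


tickets = [{"id": 1, "subject": "Help", "tags": ["Enterprise", "Other_tag"]}]
parent_rows, nested_tables = split_nested("tickets", tickets, primary_key="id")
print(parent_rows)
print(nested_tables)
```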
That’s it! Click Finish to save your current configuration. Of course, you can always go back and adjust it.
Add an additional endpoint
You can flexibly add additional endpoints to your Extractor Builder. Often, the easiest way is to duplicate an existing endpoint and adjust its configuration.
Add a dependent endpoint
Dependent endpoints use an element of another endpoint’s response as input for their own request. For example: extracting audit logs for each Zendesk ticket via the ticket_audits endpoint (dependent endpoint) based on the ticket_id returned by the tickets endpoint. According to the API documentation, the request structure for ticket_audits looks like this: GET /api/v2/tickets/{ticket_id}/audits (that is, you query the ticket audits by iterating over every extracted ticket_id). Let’s set this up:
To add dependent endpoints, you follow the same steps as for creating normal endpoints with the only difference being that you have to define a dependency. You can define dependencies in your request URL (as in the example above), as well as in a request parameter. In both cases, you would use the dependency parameter {Dependency.id} and add it either to the endpoint URL (in this example, the URL would then be {Connection.API_URL}/v2/tickets/{Dependency.id}/audits) or as a parameter.
In the Zendesk example, we would configure the tickets table’s id column as the dependency ID, which automatically creates the {Dependency.id} parameter. The dependent endpoint is then requested once for every unique dependency ID value that was previously extracted.
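Conceptually, a dependent endpoint is requested once per unique parent value. The Python sketch below generates the request URLs for the audits example; the function build_dependent_urls, the host and the ticket IDs are hypothetical, and the sketch only illustrates the iteration, not the Extractor Builder's implementation.

```python
def build_dependent_urls(url_template: str, parent_rows: list, dependency_column: str) -> list:
    """Create one request URL per unique dependency value, in first-seen order."""
    seen = set()
    urls = []
    for row in parent_rows:
        value = row[dependency_column]
        if value in seen:
            continue  # each unique value is requested only once
        seen.add(value)
        urls.append(url_template.replace("{Dependency.id}", str(value)))
    return urls


# Hypothetical rows previously extracted by the tickets endpoint.
tickets = [{"id": 101}, {"id": 102}, {"id": 101}]

urls = build_dependent_urls(
    "https://example.zendesk.com/api/v2/tickets/{Dependency.id}/audits",
    tickets,
    dependency_column="id",
)
for url in urls:
    print(url)
```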
The remaining configuration steps are identical to those of other endpoints.
Error handling
By default, Extractor Builder extractions fail if an API request’s response status is not 200 (OK). Error handling rules let the extraction continue despite a non-200 response status. They can be based on the HTTP status/response body, or on a response field.
In the Zendesk dependent endpoint example from the previous section, a ticket might not have an audit log yet. Querying the dependent endpoint would then return response status 404, indicating that no audits were found for that ticket ID. To continue the extraction in those cases and extract the audit logs for the next ticket, you can configure an error handling rule as in the below screenshot.
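The behavior of such a rule can be sketched in Python as follows: a 200 response is processed, a status listed in the rule is skipped, and any other status fails the extraction. FakeResponse, fetch_audits and extract_audits are hypothetical stand-ins for real HTTP calls and for the Extractor Builder's own rule evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeResponse:
    status: int
    body: dict


def fetch_audits(ticket_id: int) -> FakeResponse:
    """Stand-in for the dependent endpoint; ticket 2 has no audit log yet."""
    if ticket_id == 2:
        return FakeResponse(404, {"error": "RecordNotFound"})
    return FakeResponse(200, {"audits": [{"ticket_id": ticket_id, "event": "created"}]})


def extract_audits(ticket_ids, fetch, skip_statuses) -> tuple:
    """Collect audits; skip tickets whose status matches an error handling rule."""
    rows, skipped = [], []
    for ticket_id in ticket_ids:
        response = fetch(ticket_id)
        if response.status == 200:
            rows.extend(response.body["audits"])
        elif response.status in skip_statuses:
            skipped.append(ticket_id)  # rule matched: continue with the next ticket
        else:
            raise RuntimeError(f"Ticket {ticket_id}: unexpected status {response.status}")
    return rows, skipped


rows, skipped = extract_audits([1, 2, 3], fetch_audits, skip_statuses={404})
print(len(rows), "audit rows, skipped tickets:", skipped)
```

Calling extract_audits with an empty set of skip statuses reproduces the default behavior in this sketch: the first non-200 response raises an error and stops the run.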