Celonis Product Documentation

Data Pools

The main structural element of Data Integration is the Data Pool. Data Pools cluster Data Connections, Data Jobs, Schedules, and Data Models, equipping you with everything you need to set up a data integration workflow.

The three key steps to set up a Data Pool are:

  1. Configure a Data Connection

  2. Create a Data Job

  3. Create a Data Model

After you have created at least one Data Pool, the home page of Data Integration provides you with the following options:

  1. Create a new Data Pool.

  2. Search for an existing Data Pool. The view updates as you type.

  3. See your overall Data Consumption

  4. Name of your Data Pool and APC consumption per Data Pool

  5. Tag of the Data Pool

  6. The aggregated status of all components within the Data Pool. If at least one component has the status Error or Canceled, the status of the Data Pool reflects this.

  7. Creator of the Data Pool

  8. Last execution time of components inside the Data Pool

  9. Additional configuration options for a Data Pool, such as permissions.
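
The aggregation rule in item 6 can be illustrated with a minimal Python sketch. The `Status` enum, the `aggregate_status` function, and the choice to rank an error above a cancellation are assumptions made for illustration only. They are not part of the Celonis product or its API.

```python
from enum import Enum
from typing import Iterable


class Status(Enum):
    # Hypothetical status values; the product may use more or different ones.
    SUCCESS = "success"
    ERROR = "error"
    CANCELED = "canceled"


def aggregate_status(component_statuses: Iterable[Status]) -> Status:
    """Return one status for a Data Pool from the statuses of its components.

    A single failed or canceled component is enough to change the
    aggregated status. Ranking ERROR above CANCELED is an assumption.
    """
    statuses = list(component_statuses)
    if Status.ERROR in statuses:
        return Status.ERROR
    if Status.CANCELED in statuses:
        return Status.CANCELED
    # Assumption: a pool with no failing components (or no components) is fine.
    return Status.SUCCESS


pool_status = aggregate_status([Status.SUCCESS, Status.CANCELED, Status.SUCCESS])
```

In this sketch, one canceled component among otherwise successful ones is enough for the whole Data Pool to be reported as canceled.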

General Structure

  1. The main structural element of Data Integration is the Data Pool. Data Pools cluster Data Connections, Data Jobs, Schedules, and Data Models, equipping you with everything you need to set up a data integration workflow.

  2. Data Connections allow you to connect to different source systems (e.g. SAP ECC, Salesforce) to extract data from them, and to check the status of each connection. Creating a Data Connection also defines the space on the underlying database in which the extracted data is stored.

  3. Data Jobs combine Extraction and Transformation Tasks. There are two distinct types:

    1. Data Connection Data Jobs: Data Jobs that are linked to a specific Data Connection and allow you to extract and transform data from that source.

    2. Global Data Jobs: Data Jobs that are not linked to a Data Connection. They cannot extract data, but they can be used to combine data from different systems in a common scope called the Global Scope.

  4. Schedules allow you to specify when and how often your Data Jobs are executed automatically.
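
The relationships described above can be sketched as a small Python data model. All class and field names here (`DataPool`, `DataJob`, `can_extract`, and so on) are hypothetical and chosen for illustration. They do not correspond to a Celonis API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataConnection:
    name: str
    source_system: str  # e.g. "SAP ECC" or "Salesforce"


@dataclass
class DataJob:
    name: str
    # A job linked to a connection can extract and transform data.
    # A job without a connection is treated as a Global Data Job.
    connection: Optional[DataConnection] = None

    @property
    def is_global(self) -> bool:
        return self.connection is None

    @property
    def can_extract(self) -> bool:
        return self.connection is not None


@dataclass
class Schedule:
    name: str
    frequency: str  # free-text placeholder, e.g. "daily"
    jobs: List[DataJob] = field(default_factory=list)


@dataclass
class DataPool:
    name: str
    connections: List[DataConnection] = field(default_factory=list)
    jobs: List[DataJob] = field(default_factory=list)
    schedules: List[Schedule] = field(default_factory=list)
    data_models: List[str] = field(default_factory=list)


sap = DataConnection(name="SAP production", source_system="SAP ECC")
extract_job = DataJob(name="Extract SAP orders", connection=sap)
global_job = DataJob(name="Combine systems")  # no connection: Global Data Job

pool = DataPool(
    name="Order Management",
    connections=[sap],
    jobs=[extract_job, global_job],
    schedules=[Schedule(name="Nightly load", frequency="daily", jobs=[extract_job, global_job])],
    data_models=["Order to Cash"],
)
```

Modeling the connection as an optional field keeps the two job types in one class: whether a job can extract data follows directly from whether it has a connection.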

Further elements

In addition to the main structural elements explained above, there are two other components that can be reused in multiple places inside a Data Pool:

  • Data Pool Parameters: Centrally defined values that can be reused in Extractions and Transformations across different Data Jobs.

  • Task Templates: Centrally stored Tasks that can be reused in different Data Jobs. These templates are useful when you want to connect multiple similar systems (e.g. multiple instances of SAP ERP).
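
The idea behind both components, define once and reuse in several Data Jobs, can be sketched in Python. The example uses Python's standard `string.Template` as a stand-in; the `$name` placeholder syntax, the parameter names, and the table names are assumptions for illustration and do not show the Celonis parameter syntax.

```python
from string import Template

# Hypothetical Data Pool Parameters: defined once, reused everywhere.
pool_parameters = {"start_date": "2024-01-01", "company_code": "1000"}

# Hypothetical Task Template: one stored statement shared by several Data Jobs.
filter_task_template = Template(
    "SELECT * FROM $table "
    "WHERE created_at >= '$start_date' AND company_code = '$company_code'"
)


def render_task(template: Template, table: str, parameters: dict) -> str:
    """Fill a task template with a job-specific table and the shared parameters."""
    return template.substitute(table=table, **parameters)


# Two Data Jobs for two similar source systems reuse the same template
# and the same centrally defined parameter values.
job_a_sql = render_task(filter_task_template, "orders_system_a", pool_parameters)
job_b_sql = render_task(filter_task_template, "orders_system_b", pool_parameters)
```

Changing `start_date` in one place would change the statement rendered for both jobs, which is the benefit of defining values and tasks centrally.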

Tip

To learn more about Data Integration, see the free online training track from Celonis Academy on how to Get Data into Celonis Platform.