Celonis Product Documentation

How do I structure my Data Pools with multiple processes and systems?

It is best practice to have one Data Pool per process cluster (e.g. one for your ERP processes, one for your ITSM processes, etc.) and/or one per region or legal entity.

The guiding principles are outlined below.

Permissions

Permissions for configuring the data pipeline, and therefore any restrictions on it, can only be applied at the Data Pool level. The one exception is the permission to use data models, which can be granted per data model.

Consequence

If you need to restrict access to the data pipeline (e.g. extractions, transformations, viewing data in transformations, building data models), you need to create a separate Data Pool for each group of users.

Example

Company A has two subsidiaries: A1 and A2. Data engineers from A1 should not manipulate the extractions from A2. Therefore, you need to create one Data Pool for A1 and one for A2.
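The pool-level granularity can be pictured with a small toy model. The Python sketch below is not a Celonis API; the `DataPool` class, the `pipeline_editors` field, and the user names are hypothetical and only illustrate that access is decided for the whole pool, not for individual extractions.

```python
from dataclasses import dataclass, field


@dataclass
class DataPool:
    """Toy model: pipeline permissions are granted for the whole pool."""

    name: str
    pipeline_editors: set[str] = field(default_factory=set)

    def can_edit_pipeline(self, user: str) -> bool:
        # This model has no per-extraction or per-transformation check:
        # access to the pool implies access to everything in its pipeline.
        return user in self.pipeline_editors


# One pool per subsidiary, each with its own (hypothetical) data engineer.
pool_a1 = DataPool("A1", pipeline_editors={"engineer_a1"})
pool_a2 = DataPool("A2", pipeline_editors={"engineer_a2"})

a1_can_edit_a1 = pool_a1.can_edit_pipeline("engineer_a1")
a1_can_edit_a2 = pool_a2.can_edit_pipeline("engineer_a1")
```

With two pools, the A1 engineer can work on the A1 pipeline but not on the A2 pipeline. With a single shared pool, both engineers would need to be editors of that pool and could change each other's extractions.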

Combining systems

The data in each Data Pool is isolated and cannot be shared with or moved to another Data Pool. If you create one Data Pool that connects to one source system and a second Data Pool that connects to a second source system, the data from the two systems cannot be combined in transformations or data models.

Consequence

Data that is meant to be used together must be placed in the same Data Pool.

Example

You want to analyze your purchasing process, one part of which happens in SAP ECC and the other part in SAP Ariba. To combine the data from both systems, you need to create Data Connections to both of them in the same Data Pool.
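This rule can be treated as a grouping problem during planning: any source systems that appear together in one analysis must end up in the same Data Pool. The following Python sketch is planning logic only and does not call any Celonis API; the function name `plan_data_pools` and the shape of its input are assumptions made for illustration.

```python
from collections import defaultdict


def plan_data_pools(analyses: dict[str, set[str]]) -> list[set[str]]:
    """Group source systems so that systems used together share one pool.

    `analyses` maps a (hypothetical) analysis name to the source systems
    it needs. Returns one set of systems per proposed Data Pool.
    """
    parent: dict[str, str] = {}

    def find(system: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(system, system)
        while parent[system] != system:
            parent[system] = parent[parent[system]]
            system = parent[system]
        return system

    def union(a: str, b: str) -> None:
        parent[find(a)] = find(b)

    for systems in analyses.values():
        ordered = sorted(systems)
        for system in ordered:
            find(system)  # register every system, even if used alone
        for other in ordered[1:]:
            union(ordered[0], other)

    pools: dict[str, set[str]] = defaultdict(set)
    for system in list(parent):
        pools[find(system)].add(system)
    return sorted(pools.values(), key=lambda pool: sorted(pool))


pools = plan_data_pools({
    "purchasing": {"SAP ECC", "SAP Ariba"},
    "incident management": {"ServiceNow"},
})
```

Here the purchasing analysis forces SAP ECC and SAP Ariba into one pool, while ServiceNow can stay in a pool of its own. The sketch only considers which data must be combined; permission boundaries still have to be weighed separately.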

Performance and system load

Because the data in each Data Pool is isolated, using one Data Pool per process means that data shared between processes has to be extracted once for every Data Pool that needs it.

Consequence

Because the shared data is loaded several times, the total time needed to extract all required data increases, and the same data is stored more than once. It is therefore recommended to keep multiple process data models in one Data Pool if they share data.

Example

You want to analyze your ServiceNow Incident Management and Service Request processes. Both require customer data. If you use one Data Pool per process, the customer data has to be extracted once per Data Pool. If you combine both processes in one Data Pool, a single extraction covers both, and you only have to create the Data Connection once.
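The extra load can be estimated with simple counting. The Python sketch below assumes that each Data Pool extracts every table it needs exactly once; the function `count_extractions` and the table labels are hypothetical placeholders, not actual ServiceNow table names.

```python
def count_extractions(pools: list[list[set[str]]]) -> int:
    """Count table extractions across pools.

    Each pool is a list of processes; each process is the set of tables
    it needs. A table shared by processes in the same pool is extracted
    once, but a table needed in two pools is extracted twice.
    """
    total = 0
    for processes in pools:
        tables_in_pool = set().union(*processes)
        total += len(tables_in_pool)
    return total


# Illustrative table labels; both processes need the customer data.
incident = {"incident_records", "customer_records"}
service_request = {"service_request_records", "customer_records"}

separate = count_extractions([[incident], [service_request]])  # 4
combined = count_extractions([[incident, service_request]])    # 3
```

With separate pools the customer data is extracted twice, giving four extractions in total; in a combined pool it is extracted once, giving three. The saving grows with the number of shared tables and the number of processes that use them.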