Celonis Product Documentation

Data Integration

Data Integration integrates and transforms data from a multitude of source systems into Celonis Execution Management System. It allows you to set up automatic data pipelines that provide a loaded data model to all other services in Celonis Execution Management System. Its main benefits include:

  • End-to-end solution: Data integration made with the process mining use case in mind

  • Easy set-up: Configuration through an intuitive UI

  • Up-to-date data: Small, frequent and configurable delta loads

  • Scalable and flexible infrastructure: Supports large data volumes both on-premise and in the cloud

  • Multi-functional: SAP and other systems can be connected

After you have created at least one Data Pool, the home page of Data Integration provides you with the following options:

  1. Create a new Data Pool.

  2. Import a template Data Pool.

  3. Search for an existing Data Pool - the view will update as you type.

  4. Group the Data Pools by tag.

  5. Filter the Data Pools by tag.

  6. Perform actions on a Data Pool.

  7. View the aggregate status of all Data Jobs within the Data Pool. If at least one Data Job has the status error or cancelled, the status of the Data Pool reflects this.

  8. Click on the Data Pool to see and modify its contents.
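The status roll-up described in item 7 can be pictured with a minimal sketch. This is only an illustration, not the product's implementation: the function name, the "success" status, and the precedence of "error" over "cancelled" are assumptions; only the rule that a single error or cancelled job affects the pool status comes from the text above.

```python
from typing import Iterable

# Statuses named in the text above. Any other status (e.g. "success")
# is a hypothetical placeholder used for illustration only.
FAILING_STATUSES = ("error", "cancelled")


def aggregate_pool_status(job_statuses: Iterable[str]) -> str:
    """Return an illustrative pool-level status for a set of Data Job statuses."""
    statuses = [s.lower() for s in job_statuses]
    # One failing job is enough to mark the whole pool.
    # Assumed precedence: "error" is reported before "cancelled".
    for failing in FAILING_STATUSES:
        if failing in statuses:
            return failing
    # Assumption: with no failing jobs, the pool is reported as "success".
    return "success"


print(aggregate_pool_status(["success", "cancelled", "success"]))  # cancelled
print(aggregate_pool_status(["success", "error", "cancelled"]))    # error
```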

General Structure

  1. The main structural element of Data Integration is the Data Pool. Data Pools cluster data connections, data jobs, schedules and data models, equipping you with everything you need to set up a data integration workflow.

  2. Data Connections allow you to connect to different source systems (e.g. SAP ECC, Salesforce) from which to extract data, and to check their status. Creating a data connection also defines the space in the database where the extracted data is stored.

  3. Data Jobs combine extraction and transformation tasks and allow you to execute them in sequence. There are two distinct types:

    1. Data connection jobs: Jobs that are linked to a data connection can extract and transform data from that source.

    2. Global jobs: Jobs without a data connection cannot extract data, but they can be used to combine data from different systems into a common pool space.

  4. Schedules allow you to specify when and how frequently your jobs are executed automatically.
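To make the relationships between these elements concrete, here is a small conceptual model in Python. It is a sketch for orientation only: the class and field names are hypothetical and do not correspond to any Celonis API. It encodes just what is stated above, namely that a pool groups connections, jobs, schedules and data models, and that a job without a data connection (a global job) cannot extract data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataConnection:
    name: str  # e.g. "SAP ECC" or "Salesforce"


@dataclass
class DataJob:
    name: str
    # None means the job is a global job (not linked to a data connection).
    connection: Optional[DataConnection] = None

    @property
    def is_global(self) -> bool:
        return self.connection is None

    @property
    def can_extract(self) -> bool:
        # Only jobs linked to a data connection can extract from a source.
        return not self.is_global


@dataclass
class Schedule:
    name: str
    jobs: List[DataJob] = field(default_factory=list)


@dataclass
class DataPool:
    name: str
    connections: List[DataConnection] = field(default_factory=list)
    jobs: List[DataJob] = field(default_factory=list)
    schedules: List[Schedule] = field(default_factory=list)
    data_models: List[str] = field(default_factory=list)


sap = DataConnection("SAP ECC")
pool = DataPool(
    name="Example pool",
    connections=[sap],
    jobs=[DataJob("Extract SAP", connection=sap), DataJob("Combine systems")],
)
for job in pool.jobs:
    kind = "global job" if job.is_global else "data connection job"
    print(f"{job.name}: {kind}, can extract: {job.can_extract}")
```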

Further elements

Apart from the main structural elements explained above, a Data Pool contains two other components that can be used in multiple places, which is why they are defined at the pool level:

  • Data pool parameters: They are centrally defined values which can be reused in extractions and transformations across different data jobs and templates.

  • Task templates: They are centrally stored tasks which can be reused in different data jobs. This is useful if you would like to connect multiple systems which are very similar (e.g. multiple instances of SAP ERP).
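The idea of reusing one centrally stored task with centrally defined values can be sketched as follows. This is a conceptual illustration only: the parameter names, the table and schema names, and the `$name` placeholder syntax (Python's `string.Template`) are all assumptions and are not the syntax the product uses.

```python
from string import Template

# Hypothetical pool-level parameters (names and values are made up).
pool_parameters = {"start_date": "2020-01-01", "client": "100"}

# Hypothetical task template. Python's $name placeholders stand in for
# whatever placeholder syntax the product actually uses.
task_template = Template(
    "SELECT * FROM $schema.orders "
    "WHERE client = '$client' AND created_at >= '$start_date'"
)


def render_task(schema: str) -> str:
    """Fill the shared template with pool parameters for one source system."""
    return task_template.substitute(pool_parameters, schema=schema)


# The same template is reused for two similar (hypothetical) systems.
for schema in ("erp_eu", "erp_us"):
    print(render_task(schema))
```

Changing a value in `pool_parameters` changes every rendered task at once, which is the point of defining such values at the pool level.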

Tip

To learn more about Data Integration, see the free online training track from Celonis Academy: Get Data into EMS (6h).

Please check that the column in the HANA database is actually a DATE or DATETIME column and not a string column like NVARCHAR. If the column is NVARCHAR, you can still use filters, but you have to use string values or text parameters instead of dates.
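One way to perform this check is to look up the column's declared type in the HANA catalog and decide from it which kind of filter value to use. The sketch below assumes the `SYS.TABLE_COLUMNS` system view and its `DATA_TYPE_NAME` column are readable by your user; the helper name and the type groupings beyond DATE, DATETIME and NVARCHAR are assumptions added for illustration.

```python
# Catalog lookup to run with your own HANA client (bind schema, table, column).
# Assumes read access to the SYS.TABLE_COLUMNS system view.
COLUMN_TYPE_QUERY = (
    "SELECT COLUMN_NAME, DATA_TYPE_NAME FROM SYS.TABLE_COLUMNS "
    "WHERE SCHEMA_NAME = ? AND TABLE_NAME = ? AND COLUMN_NAME = ?"
)

# DATE and DATETIME come from the text above; SECONDDATE and TIMESTAMP are
# added here as an assumption about other date-like HANA types.
DATE_TYPES = {"DATE", "DATETIME", "SECONDDATE", "TIMESTAMP"}
# NVARCHAR comes from the text above; the rest are assumed string types.
STRING_TYPES = {"NVARCHAR", "VARCHAR", "CHAR", "NCHAR"}


def filter_kind(data_type_name: str) -> str:
    """Classify a DATA_TYPE_NAME value as 'date', 'string' or 'other'."""
    type_name = data_type_name.strip().upper()
    if type_name in DATE_TYPES:
        return "date"    # date filters and date parameters can be used
    if type_name in STRING_TYPES:
        return "string"  # use string values or text parameters instead
    return "other"


print(filter_kind("NVARCHAR"))  # string
print(filter_kind("date"))      # date
```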

  1. Acquire a certificate for the domain you want to connect to, and store it somewhere memorable.

  2. Navigate to the folder in your JRE where "cacerts" is located. This is usually: .../jre/lib/security

  3. Open a terminal window from this folder.

  4. Run the following command to import the certificate from step 1.

    keytool -import -trustcacerts -keystore cacerts -storepass <KEYSTORE-PASSWORD> -noprompt -alias alias -file <PATH TO CERTIFICATE>

    By default, KEYSTORE-PASSWORD is "changeit", without quotes.
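    If you script this step, the command can be assembled as an argument list, together with a follow-up `keytool -list` call to confirm that the alias is now in the keystore. This is a sketch under assumptions: the helper names, the alias and the certificate path are placeholders, and the commands are only printed here, not executed.

```python
import shlex
from typing import List


def build_keytool_import(cert_path: str, alias: str,
                         storepass: str = "changeit",
                         keystore: str = "cacerts") -> List[str]:
    """Argument list for the import command shown above."""
    return [
        "keytool", "-import", "-trustcacerts",
        "-keystore", keystore,
        "-storepass", storepass,
        "-noprompt",
        "-alias", alias,
        "-file", cert_path,
    ]


def build_keytool_list(alias: str, storepass: str = "changeit",
                       keystore: str = "cacerts") -> List[str]:
    """Argument list that lists one alias, to confirm the import worked."""
    return ["keytool", "-list", "-keystore", keystore,
            "-storepass", storepass, "-alias", alias]


# Placeholder alias and path; replace them with your own values.
import_cmd = build_keytool_import("/path/to/certificate.cer", "my-alias")
print(shlex.join(import_cmd))
print(shlex.join(build_keytool_list("my-alias")))
```

    To run a command, pass the list to `subprocess.run(cmd, check=True)` from the folder that contains "cacerts".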

You need to mark your database as trustworthy by running the following command in MSSQL:

ALTER DATABASE <DATABASE_NAME> SET TRUSTWORTHY ON

You receive this error if the owner of your database is a domain user. To fix this, make a different user, e.g. the user SA, the owner of your database.
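Both fixes can be expressed as T-SQL statements. The sketch below only builds the statement text, so that you can review it and run it with your own SQL Server client. Using `ALTER AUTHORIZATION` to change the owner is one common approach and is an assumption here, as are the helper names and the example database name.

```python
def quote_identifier(name: str) -> str:
    """Bracket-quote a SQL Server identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def trustworthy_statement(database: str) -> str:
    """Statement from the text above, with the database name quoted."""
    return f"ALTER DATABASE {quote_identifier(database)} SET TRUSTWORTHY ON;"


def change_owner_statement(database: str, new_owner: str = "sa") -> str:
    """Assumed approach: transfer database ownership with ALTER AUTHORIZATION."""
    return (
        f"ALTER AUTHORIZATION ON DATABASE::{quote_identifier(database)} "
        f"TO {quote_identifier(new_owner)};"
    )


# "ExampleDb" is a placeholder database name.
print(trustworthy_statement("ExampleDb"))
print(change_owner_statement("ExampleDb"))
```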

Most likely, you do not have a 64-bit version of Java installed. To check, type the following in the command line:

java -version # should result in something like "Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)"

If it is not 64-bit, please download OpenJDK or the official Oracle release and make sure that it is the 64-bit server version.

Another possible cause is that you are running the software on a 32-bit operating system, which is not supported.
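Both checks can be automated with a short script. This is a sketch, not an official diagnostic: the function names are hypothetical, and the 64-bit test simply looks for the "64-Bit" marker that appears in the sample output above. Note that `java -version` writes its output to standard error.

```python
import platform
import subprocess


def is_64bit_jvm(version_output: str) -> bool:
    """True if the `java -version` output mentions a 64-bit VM."""
    return "64-bit" in version_output.lower()


def java_version_output() -> str:
    """Run `java -version` and return its text (it is printed to stderr)."""
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=False
    )
    return result.stderr + result.stdout


# Sample line taken from the text above.
sample = "Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)"
print(is_64bit_jvm(sample))  # True

# Machine architecture reported by the operating system, e.g. "x86_64".
print(platform.machine())
```

On a machine with Java on the PATH, `is_64bit_jvm(java_version_output())` performs the check against the installed version.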