
Celonis Product Documentation

Data Integration Release Notes

Three new audit logs have been introduced: one for creating a new version, one for deleting a version, and one for copying a version into another data pool on the target team.

For details, see Audit Logs.

Custom data pipeline monitoring enables you to use EMS analytics to monitor the health of your data pipeline.

Data Integration system events are collected and made available as tables in a dedicated Data Pool called 'Monitoring Pool'. You can adjust these tables to your needs in transformations and include them in Process Data Models to use the information in analyses, views, or signals.

All details can be accessed in Custom Data Pipeline Monitoring.


This is part of the cloud release v1.70 of Data Integration.

To further increase the data privacy and security of our customers, we changed the default behavior of data permissions in data models. If data permissions are in use, any user not explicitly mentioned has no data access at all. This might mean that users who previously had access to all data no longer see any data in their analyses or views. For those users, we recommend creating a group in the team settings and manually assigning that group unlimited access to the data model.

For further information on the Data Permissions Update and a detailed step-by-step guide on how to get data access rights, see Data Permissions Update.

For further information on Data Permissions, see Data Permissions.

This is part of the cloud release v1.62 of Data Integration.

Data Pool Versioning enables you to trace changes in your Data Pools. By backing up stable versions and letting you jump back to them if issues are detected in the current setup, Data Pool Versioning can increase the health of productive data pipelines.

Moreover, Data Pool Versioning introduces a new publishing workflow for Data Pools that allows you to clearly separate the development and productive environments. Versions of Data Pools can easily be copied from one data pool into another, which allows large teams to set up a more scalable software development lifecycle for their data pipelines.

All details can be accessed in Versioning of Data Pools.

This is part of the cloud release v1.70 of Data Integration.


Dynamic parameters can now also be created as pool parameters. You need to select the data connection and, just as for local parameters, the table and column to be used in the dynamic calculation. You can also select the function to be applied to the specified column.

See Parameters for more detailed information.

We are moving all teams to a more concise Data Integration Homepage format. Many teams already appreciate this concise format; for those teams nothing changes.

Teams that have so far experienced Data Integration in the format shown below will now also enjoy the new format.

Previous format

57542526.png

New concise format

57542527.png
Who is the change relevant to?

This change is only relevant for teams using Data Integration On-Premises, i.e. teams that use the marked tab shown below.

57543620.png
What changed?

We enabled permission handling in Admin and Settings for teams that use Data Integration On-Premises:

  • Even if Data Integration is disabled, the permission control in the team settings is now visible and usable.

  • To get an overview of which user has which permissions, you can use the 'Export As CSV' button. The exported CSV file includes all users and permissions for Data Integration & Data Integration On-Premises. The service name remains Data Integration for both applications.

  • If you use Data Integration and Data Integration On-Premises in parallel, the permission table will reflect the as-is status until you confirm for each user that he or she is also allowed to have the permissions for Data Integration On-Premises.

    That means that the table shown below will initially show only the permissions for Data Integration; they do not automatically include Data Integration On-Premises permissions. As soon as you change a user's permissions (even just deselecting and reselecting a permission) and save, that user has these permissions for both Data Integration and Data Integration On-Premises.

    If you are not sure for which users the permissions in the table apply only to Data Integration and for which users they apply to both Data Integration and Data Integration On-Premises, you can use the "Export as CSV" option mentioned above. In the exported file you can detect permissions on team settings level by looking for the service Data Integration with a blank object name. If this line appears twice for a user, he or she has the listed permissions for both Data Integration and Data Integration On-Premises; if it appears only once, the permissions apply to Data Integration only. A minimal sketch of how to check this in the exported file follows after this list.

57543621.png
  • For custom pool providers the 'CREATE DATA POOL' permission only works in combination with the 'EDIT ALL DATA POOLS' permission.
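
If you want to automate that check, a minimal Python sketch like the following could evaluate the exported CSV. The file name and the column names used here ('User', 'Service', 'Object Name') are assumptions; take the real ones from the header of your exported file.

    import csv
    from collections import Counter

    # Count how often the team-settings-level "Data Integration" line appears per user.
    # Column names "User", "Service" and "Object Name" are assumptions -- check the
    # header of your own export and adjust them accordingly.
    counts = Counter()
    with open("permissions_export.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row.get("Service") == "Data Integration" and not (row.get("Object Name") or "").strip():
                counts[row.get("User")] += 1

    for user, occurrences in counts.items():
        if occurrences >= 2:
            print(user, "- Data Integration and Data Integration On-Premises")
        else:
            print(user, "- Data Integration only")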

In addition to searching for the maximum value of a specified column of a specified table, dynamic parameters can now also be configured to do the following (a small sketch of these operation types follows after the list):

  1. Find the minimum value of a specified column of a specified table by using the operation type FIND_MIN.

  2. Find the list of distinct values of a specified column of a specified table by using the operation type LIST.
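
As a small illustration of what these operation types compute, consider the following Python sketch. In Data Integration the calculation is configured on the dynamic parameter itself rather than written as code; the snippet only mirrors the semantics described above.

    # Example values of the specified column of the specified table.
    values = [3, 1, 4, 1, 5]

    maximum = max(values)           # the previously available maximum-value search -> 5
    minimum = min(values)           # FIND_MIN -> 1
    distinct = sorted(set(values))  # LIST     -> [1, 3, 4, 5]

    print(maximum, minimum, distinct)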

For further information on Dynamic Parameters, see Parameters.

This is part of the cloud release v1.69 of Data Integration.

Understanding what is going on with your Data Jobs is important for monitoring the stability of your data pipeline. To give you more control over when you are notified about a certain Data Job, you can now define alerts for every Data Job. These replace the previously existing Data Job subscriptions. Existing subscriptions are migrated so that you automatically have alerts enabled and are notified in case of failures and when a Data Job succeeds after a failure.

This is part of release v1.35 of Data Integration.


You can define on a more detailed level when you would like to receive an email:

  • If the Data Job fails.

  • If the Data Job succeeds.

  • If the Data Job succeeds (but only once after a failure).

You can change this setting at any time.


To accelerate performance and guarantee a stable environment for all our customers, a limit of 200,000 records per name mapping was introduced. This limit is enforced for new uploads only.

See Name Mapping for more information.

If you want to duplicate a Data Pool, you can click on the three-dot menu of the Data Pool and select the Copy-to option.

All the teams you have access to, including the team you are logged in to, will be displayed to select the destination team.

If you select the team you are logged in to, the data pool will be duplicated to the same team.

This is part of the cloud release v1.69 of Data Integration.

When table metadata changes, e.g. columns are added or removed, the default behavior of the Data Push API is to reject new data in a different format to preserve data integrity. In these cases, you could either perform a full load or change the table structure manually. However, there might be some use cases where tables change constantly or where NULL values are not a big data integrity risk when analyzing the data. In those cases, you might simply want to nullify non-existent values. This is what the new API option provides.

Note

The default remains to reject metadata changes. Moreover, only added and removed columns will be handled. Data type changes still have to be resolved manually or with a full load.

Example

Existing data in the EMS

column_a (primary key)    column_b
a                         1
b                         2
c                         3

Newly pushed data

column_a    column_c
a           7
b           8
d           11

As you can see, one new entry (d) is added. However, column_b no longer exists in the pushed data and column_c is new.

Merged data set with nullification

column_a    column_b    column_c
a           1           7
b           2           8
c           3           NULL
d           NULL        11

As you can see, the existing entry c did not receive an update and therefore has a NULL value in column_c, which did not exist before. The new entry with primary key d has NULL in column_b because that column does not exist in the new data.
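
The merge behavior can also be expressed as a small Python sketch. This only models the semantics shown in the tables above by merging the rows by their primary key; it is not how Celonis implements it internally.

    # Existing data and newly pushed data, keyed by the primary key column_a.
    existing = {"a": {"column_b": 1}, "b": {"column_b": 2}, "c": {"column_b": 3}}
    pushed = {"a": {"column_c": 7}, "b": {"column_c": 8}, "d": {"column_c": 11}}

    all_columns = {"column_b", "column_c"}
    merged = {}
    for key in existing.keys() | pushed.keys():
        row = {column: None for column in all_columns}  # missing values become NULL
        row.update(existing.get(key, {}))               # keep what is already in the EMS
        row.update(pushed.get(key, {}))                 # upsert the newly pushed values
        merged[key] = row

    # merged["c"]["column_c"] is None and merged["d"]["column_b"] is None,
    # matching the merged data set above.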

In order to use this new upsert strategy, you need to specify the parameter upsertStrategy when creating a push job. See the documentation for details.
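
As a rough illustration of how such a request could look with Python and the requests library: the endpoint path, the payload fields other than upsertStrategy, and in particular the strategy value used below are assumptions for illustration only, so please take the exact request format and allowed values from the Data Push API documentation.

    import requests

    # Hedged sketch only: endpoint path, payload fields and the strategy value are
    # assumptions -- verify them against the Data Push API documentation.
    TEAM_URL = "https://<team>.<realm>.celonis.cloud"
    POOL_ID = "<data-pool-id>"
    API_KEY = "<api-key>"

    response = requests.post(
        f"{TEAM_URL}/integration/api/v1/data-push/{POOL_ID}/jobs/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "targetName": "MY_TABLE",
            "type": "DELTA",
            "upsertStrategy": "UPSERT_WITH_NULLIFICATION",  # assumed value, see the docs
        },
    )
    response.raise_for_status()
    print(response.json())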

This is part of release v1.35 of Data Integration.

You can now get an overview of the tables in your team and identify the tables with the biggest data consumption footprint.

In the section "Data Consumption" on the Data Integration homepage, you see a list of tables which you can order by size and row count.


If you only want to update a subset of tables in a Data Model, you no longer need to refresh all tables, which might take some time. Instead, you can take advantage of the new API that allows partial loads of tables.

See the Process Data Model API for further details.
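
A similarly hedged Python sketch of what a partial reload request could look like; the endpoint path and payload shape are assumptions for illustration, and the Process Data Model API documentation contains the authoritative request format.

    import requests

    # Hedged sketch: endpoint path and payload are assumptions for illustration only.
    TEAM_URL = "https://<team>.<realm>.celonis.cloud"
    POOL_ID = "<data-pool-id>"
    DATA_MODEL_ID = "<data-model-id>"
    API_KEY = "<api-key>"

    response = requests.post(
        f"{TEAM_URL}/integration/api/v1/data-pools/{POOL_ID}/data-models/{DATA_MODEL_ID}/reload",
        headers={"Authorization": f"Bearer {API_KEY}"},
        # Refresh only the listed tables instead of the whole Data Model.
        json={"dataModelTableIds": ["<activity-table-id>"]},
    )
    response.raise_for_status()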

This is part of release v1.31 of Data Integration.

You can now decide which tables should be refreshed when loading a Data Model after a manual Data Job execution. This is a follow-up feature to Partial table update in Process Data Models via API which already allowed you to do the same via API.

In order to load only a subset of tables in a Data Model load, you simply execute a Data Job and, when configuring the execution (see the screenshot below), choose among the Data Model tables.

Example: You only run one transformation changing the activity table of the Data Model, but nothing else. You can then choose the activity table to be refreshed and de-select all the other tables. This means the Data Model load will be much faster in most cases.

This is part of release v1.35 of Data Integration.

35554015.png

You can now find and install Process Connectors right from the Data Integration homepage by clicking on "Import Template". Alternatively, you can navigate to https://<team>.<realm>.celonis.cloud/store/ui/processes.

This will bring up the page with all Process Connectors just like you were able to access when using the "App Store" previously.

This is part of release v1.46 of Data Integration.


In a Data Job, you may specify which Data Model should be loaded if and only if the Data Job is successful. This helps you make sure that only the Data Models that use data coming from the Data Job in question are loaded.

The documentation contains details on how to set it up.

This is part of release v1.32 of Data Integration.