
Celonis Product Documentation

Data redaction

Data redaction is the process of removing sensitive or personally identifiable information from data to keep it secure. In Task Mining, data redaction lets you collect all the data needed for effective task mining without also collecting personal or sensitive data.

From a task mining perspective, data redaction is applied on the task mining client installed on the desktop directly after an event is captured. That data is then modified before any information is stored to disk or sent to the EMS.
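The order of operations described above (capture, redact, then store or send) can be sketched in a few lines of Python. This is only an illustration of the idea, not the task mining client's implementation: the `RULES` list, the `redact` and `handle_event` functions, and the event fields are all hypothetical names chosen for the example.

```python
import re

MASK = "***"

# Hypothetical rule set for illustration; real rules come from the
# configuration file created in the configuration editor.
RULES = [re.compile(r"\d{3}-\d{2}-\d{4}", re.IGNORECASE)]


def redact(value: str, rules=RULES) -> str:
    """Replace every match of every rule with the mask."""
    for rule in rules:
        value = rule.sub(MASK, value)
    return value


def handle_event(event: dict, sink: list) -> None:
    """Redact string attributes first, then hand the event to storage."""
    cleaned = {
        key: redact(val) if isinstance(val, str) else val
        for key, val in event.items()
    }
    sink.append(cleaned)  # only the redacted copy is ever persisted


stored = []
handle_event({"window_title": "Verification 123-21-2929.pdf", "pid": 42}, stored)
print(stored[0]["window_title"])  # Verification ***.pdf
```

The point of the sketch is that the unredacted event never reaches the sink: redaction happens on the copy that is stored or transmitted.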

Note

You can always define additional custom data redaction in the EMS using custom SQL transformations in case the options provided by the task mining client are not sufficient.

Setting up data redaction

Data redaction for the task mining client is configured in the configuration editor (available on Windows only). Data redaction rules use regular expressions (regex) to define which types of collected data are hidden. By adding, editing, and deleting regex rules, you decide whether data that matches a pattern is hidden.

Note

Masking personally identifiable information (PII) may reduce the quality of the Task Mining results.

To set up data redaction from your task mining project:

  1. Click Client Settings.

  2. Select Use advanced settings and then click Configure Settings.

  3. Click Download Configuration Editor and install the configuration editor on your Windows computer.

  4. Click Download File to download the configuration file.

  5. Open the newly installed Configuration Editor and then click the Open File button to open the downloaded file.

  6. Click Data Redaction on the left and then click the New Pattern button.

  7. Edit your pattern accordingly and then click OK. For further information about creating regular expressions, see Creating regular expression patterns.

  8. Preview your configuration locally by clicking Preview, interacting with data on your machine that should be masked, and then validating the redactions in the Live Event Monitor. Adjust your configuration if necessary and then preview again.

  9. Once you're satisfied with the configuration, click Save.

  10. Return to the EMS, upload the new configuration file, and then click Save Configuration.


    Your data redaction configuration is now applied to your task mining project. However, it can take a couple of minutes for this change to be applied to any connected task mining clients.

Creating regular expression patterns

You can create regular expression patterns with the help of free online tools that turn standard text into regex. Patterns are not case sensitive, so both uppercase and lowercase text can match your pattern.
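To illustrate what case-insensitive matching means in practice, here is a minimal Python sketch. It uses Python's `re.IGNORECASE` flag as a stand-in for the client's behavior; the client's own regex engine may differ in other details, and the sample window titles are made up for the example.

```python
import re

# A lowercase pattern, compiled case-insensitively (stand-in for the
# case-insensitive behavior described above).
pattern = re.compile(r"profile a\.doe", re.IGNORECASE)

titles = [
    "Saplogon - TR21 - Profile a.doe",
    "Saplogon - TR21 - PROFILE A.DOE",
]
masked = [pattern.sub("***", title) for title in titles]
print(masked)  # ['Saplogon - TR21 - ***', 'Saplogon - TR21 - ***']
```

Both spellings are masked by the same pattern, so there is no need to write separate rules for uppercase and lowercase variants.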

There are a number of commonly used regular expressions that we recommend:

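Description: Replace all Windows usernames with ‘***’
Regular expression: {{username}}
Example, attribute without masking: Saplogon - TR21 - Profile a.doe
Example, attribute with masking applied: Saplogon - TR21 - Profile ***

Description: Replace all machine names with ‘***’
Regular expression: {{machinename}}
Example, attribute without masking: System Settings - LP12392
Example, attribute with masking applied: System Settings - ***

Description: Replace all emails with ‘***’
Regular expression: [a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+
Example, attribute without masking: Outlook - Draft Mail - t.cookofficial@apple.com
Example, attribute with masking applied: Outlook - Draft Mail - ***

Description: Replace all social security numbers with ‘***’
Regular expression: \d{3}-\d{2}-\d{4}
Example, attribute without masking: Adobe Acrobat - Verification 123-21-2929.pdf
Example, attribute with masking applied: Adobe Acrobat - Verification ***.pdf

Description: Replace non-domain part in URL
Regular expression: (?<=((https|http):\/\/)?(www\.)?[a-zA-Z0-9@%()\-+~]+(\.[a-zA-Z0-9@%()\-+~]+)?\.[a-zA-Z0-9()]{1,6}\/)[\S]+
Example, attribute without masking: https://www.google.com/search?q=testquery
Example, attribute with masking applied: https://www.google.com/***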
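Before adding a pattern to the configuration, it can help to try it against sample strings. The Python sketch below does this for the email and social security number patterns listed above. The URL pattern relies on a variable-length lookbehind, which Python's built-in `re` module does not support, so the sketch substitutes a capturing-group rewrite (`URL_PREFIX`) that is assumed to behave the same way for simple URLs. The function names are hypothetical, and the Live Event Monitor preview remains the authoritative check.

```python
import re

MASK = "***"

EMAIL = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+", re.IGNORECASE)
SSN = re.compile(r"\d{3}-\d{2}-\d{4}")

# Capturing-group rewrite of the URL pattern (an assumption made for this
# sketch): group 1 holds the scheme and domain, the rest is the path.
URL_PREFIX = re.compile(
    r"(((https|http)://)?(www\.)?[a-zA-Z0-9@%()\-+~]+"
    r"(\.[a-zA-Z0-9@%()\-+~]+)?\.[a-zA-Z0-9()]{1,6}/)\S+",
    re.IGNORECASE,
)


def mask_email(text: str) -> str:
    return EMAIL.sub(MASK, text)


def mask_ssn(text: str) -> str:
    return SSN.sub(MASK, text)


def mask_url_path(text: str) -> str:
    # Keep the domain part (group 1) and mask everything after it.
    return URL_PREFIX.sub(lambda m: m.group(1) + MASK, text)


print(mask_email("Outlook - Draft Mail - t.cookofficial@apple.com"))
# Outlook - Draft Mail - ***
print(mask_ssn("Adobe Acrobat - Verification 123-21-2929.pdf"))
# Adobe Acrobat - Verification ***.pdf
print(mask_url_path("https://www.google.com/search?q=testquery"))
# https://www.google.com/***
```

Note that the social security number pattern is not anchored, so it also matches the last eleven characters of a longer digit run; add word boundaries if that matters for your data.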

Using variables in regular expression patterns

You can also use variables provided by the task mining client in regular expression patterns. A variable is resolved on each client, so you can match client-specific values such as the Windows username without specifying and maintaining a central list of all possible Windows usernames in the configuration.

Supported variables are:

Description: Windows username
Name: {{username}}
Example: m.doe

Description: Machine name
Name: {{machinename}}
Example: LP1924
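One way to picture how such variables could work is as placeholders that are replaced with the local value before the pattern is compiled. The Python sketch below makes that assumption explicit: the `expand_variables` and `redact` functions are hypothetical, and the documentation does not state how the client actually resolves variables or whether it escapes special characters.

```python
import re

MASK = "***"


def expand_variables(pattern: str, username: str, machinename: str) -> str:
    """Substitute the placeholders with regex-escaped literal values."""
    return (
        pattern.replace("{{username}}", re.escape(username))
        .replace("{{machinename}}", re.escape(machinename))
    )


def redact(text: str, pattern: str, username: str, machinename: str) -> str:
    """Expand the variables, then mask every case-insensitive match."""
    regex = re.compile(expand_variables(pattern, username, machinename), re.IGNORECASE)
    return regex.sub(MASK, text)


# Values a client might resolve locally (made up for this example).
user, machine = "a.doe", "LP12392"

print(redact("Saplogon - TR21 - Profile a.doe", "{{username}}", user, machine))
# Saplogon - TR21 - Profile ***
print(redact("System Settings - LP12392", "{{machinename}}", user, machine))
# System Settings - ***
```

Escaping matters here because a username such as `a.doe` contains a dot, which would otherwise match any character. When experimenting locally, the values could be read with `getpass.getuser()` and `socket.gethostname()` from the Python standard library.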