Celonis Product Documentation

Best practices for ML Workbench

When using the ML Workbench capabilities, we recommend the following best practices.

General Recommendations

When starting your development journey in the ML Workbench, we recommend following these principles to optimize the efficiency of your experience:

  • Ensure stability and consistency by always using the base environment in all notebooks.

  • Put your work under version control with a Git hosting service such as GitHub, GitLab, or Bitbucket. These tools are integrated into the notebooks through extensions.


While the above recommendations are essential, pairing them with the following additional principles will greatly enhance your development experience.

These principles, which focus on resource and environment management, are available to Dedicated Plan account holders only.

Environment Setup (Dedicated Plan)

When setting up your ML Workbench environment, you should first distinguish between development and production environments.

  • Development Environment: Used for prototyping and testing.

  • Production Environment: Used when deploying production-ready scripts and applications.

Once you have decided which environment you are setting up, configure the resource pools for it.

Resource Pools

A resource pool is a set of resources dedicated to specific activities. A pool must be set up for each of your environments.

Resource pools can be set up and configured by going to Configuration - Add pool.

As an example, the following resource pool is set up for a small team to create ML Workbench scripts:




Dedicated Plan           16 GB    50 GB
Resource Pool (PROD)      8 GB    25 GB
Resource Pool (DEV)       8 GB    25 GB

These resources are configured by going to Configuration - Add Pools.



Resource pools can then be associated with workspaces.

To create a workspace, go to Apps - New Workspace. Once the workspaces are created, assign each workspace the resource pool that belongs to its environment.



Once your workspaces are configured, you can set up your workbenches.

In the following example, the team decided to assign each developer a dedicated development workspace.


Because their workbenches have all been initialized with the same repositories, each developer implements their part of the script in their own feature branch. This approach allows developers to work in parallel, rather than within the same environment.

The final, stable version of the script is taken from the default stable branch rather than updated manually. This script runs in a dedicated workbench in the production workspace, and that workbench is set to productive. This ensures that the workbench is never automatically shut down.

Development Environments

When setting up development environments, we recommend following these principles:

  • Ensure optimal resource allocation by distributing resources as necessary among the notebooks

  • Maintain efficient resource utilization by promptly shutting down notebooks that are no longer required

Production Environments

When setting up production environments, we recommend following these principles:

  • Limit the number of notebooks within the ML Workbench by calculating the required resources and allocating a finite number of notebooks accordingly

  • Clearly identify productive notebooks by marking them as such in the configuration tab

  • Implement a structured resource allocation strategy by distributing resources in a fixed manner
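The calculation behind a fixed allocation can be sketched as follows. This is a minimal illustration, not a Celonis tool: the function name `max_notebooks`, the 2 GB per-notebook size, and the optional reserve are assumptions chosen for the example, while the 8 GB pool size matches the PROD pool in the example above.

```python
def max_notebooks(pool_gb: float, per_notebook_gb: float, reserve_gb: float = 0.0) -> int:
    """Return how many fixed-size notebooks fit into a resource pool.

    pool_gb:         total size of the resource pool
    per_notebook_gb: fixed amount assigned to every notebook (hypothetical value)
    reserve_gb:      optional headroom that is kept unassigned
    """
    if per_notebook_gb <= 0:
        raise ValueError("per_notebook_gb must be positive")
    usable = pool_gb - reserve_gb
    if usable <= 0:
        return 0
    # Floor division: a partially fitting notebook does not count.
    return int(usable // per_notebook_gb)


# An 8 GB pool split into fixed 2 GB notebooks.
print(max_notebooks(pool_gb=8, per_notebook_gb=2))  # 4
```

Keeping the per-notebook size fixed makes the upper limit on notebooks a simple division, which is easier to review than ad hoc allocations.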


When creating a notebook, we recommend following these principles:

  • Ensure consistency by checking out the main branch in all production notebooks

  • Implement an organized development process by performing development work in designated feature branches

  • Maintain accountability by having only one person work on a single feature branch in a single notebook at a time

  • Keep development and production environments separate by only using development data pools in development notebooks

  • Preserve the integrity of production data by only using production data pools in production notebooks
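The notebook principles above can be turned into a small pre-flight check at the top of a notebook. The sketch below is an assumption-laden illustration: the names `check_notebook`, `current_branch`, and `PRODUCTION_BRANCH` are hypothetical, the branch name `main` is taken from the list above, and the environment labels are plain strings that you would supply yourself. Only the standard `git rev-parse --abbrev-ref HEAD` command is relied on.

```python
import subprocess
from typing import List

PRODUCTION_BRANCH = "main"  # branch that production notebooks should check out


def current_branch(repo_dir: str = ".") -> str:
    """Return the branch currently checked out in repo_dir (requires git)."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def check_notebook(environment: str, branch: str, data_pool_environment: str) -> List[str]:
    """Return a list of rule violations; an empty list means the setup is consistent."""
    if environment not in ("development", "production"):
        raise ValueError("environment must be 'development' or 'production'")
    problems = []
    if environment == "production" and branch != PRODUCTION_BRANCH:
        problems.append("production notebooks should check out the main branch")
    if environment == "development" and branch == PRODUCTION_BRANCH:
        problems.append("development work belongs in a feature branch")
    if data_pool_environment != environment:
        problems.append("data pool environment does not match the notebook environment")
    return problems


# Typical use inside a notebook (hypothetical labels):
# problems = check_notebook("production", current_branch(), "production")
# if problems:
#     raise RuntimeError("; ".join(problems))
```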

Application Keys

When connecting your ML Workbench to a data model, we recommend that you use an application key. Application keys are user-independent API keys, allowing you to assign granular permissions to your applications rather than to individual users.

Application keys are created by account administrators in the Admin and Settings area. For more information, see: Creating and granting permissions to application keys


You can still create a personal API key. However, personal keys should only be used for private ML Workbenches and should not be used to grant permissions to other users.
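One way to keep an application key out of notebook code is to read it from an environment variable. The following is a minimal sketch under stated assumptions: the variable name `CELONIS_APP_KEY` and the helper `auth_header` are hypothetical, and the `AppKey` authorization scheme should be checked against the Celonis API documentation before use.

```python
import os


def auth_header(env_var: str = "CELONIS_APP_KEY") -> dict:
    """Build an Authorization header from a key stored in an environment variable.

    Both the variable name and the 'AppKey' scheme are assumptions for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {"Authorization": f"AppKey {key}"}
```

Because the key never appears in the notebook itself, it is not committed to version control along with the code.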

IP Restrictions

If you use IP restrictions in your EMS, the outbound IP of the ML Workbench must be added as an exception.

To find your outbound IP, run the following command on the terminal of your ML Workbench:


Your outbound IP must then be added to your Allowed IPs list, available in the Admin and Settings area.


When adding your outbound IP, it must be suffixed with /32.

For example: should be added as
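The /32 suffix marks a single IPv4 address in CIDR notation. As a hedged illustration, the sketch below validates an address with Python's standard `ipaddress` module and appends the suffix. The helper name `to_allowlist_entry` is hypothetical, and `203.0.113.7` is a reserved documentation address, not a real ML Workbench IP.

```python
import ipaddress


def to_allowlist_entry(ip: str) -> str:
    """Validate an IPv4 address and return it with the /32 suffix."""
    address = ipaddress.ip_address(ip.strip())  # raises ValueError if malformed
    if address.version != 4:
        raise ValueError("expected an IPv4 address")
    return f"{address}/32"


# 203.0.113.7 is a placeholder from the documentation address range.
print(to_allowlist_entry("203.0.113.7"))  # 203.0.113.7/32
```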

Allowed IPs

The following IPs are allowed:


ML Nodes

ML Clusters

















API Endpoints for Managing the ML Workbench

There are a number of API endpoints available, allowing you to trigger executions, manage resources, and more.

For more information, see: Connecting the audit log API to Celonis EMS

Upgrading an ML Workbench

ML Workbenches can be upgraded to the latest version (for example, to get new JupyterLab features) using the context menu on the ML Workbench overview page.



Upgrading a Workbench has no impact on the files in '/home/jovyan'. This includes user-generated files and log files. The versions of already installed Python libraries are also left unchanged.

For a larger change (for example, upgrading the Python version), an additional message is displayed asking the user for confirmation.

Storage Guidelines and Tips

Initial Storage

  • A default MLW App starts with 5 GB of total storage, of which 1.2 GB is already used.

    Figure 2. Initial Storage bar on a new MLW App

  • The 5 GB applies to each MLW App. It is not shared across all MLW Apps of a team.

  • The 1.2 GB used from the start is taken up by the MLW setup, in the same way that a new 64 GB iPhone has only 58 GB available because of the storage used by iOS.

Storage Tips - Current Storage

To view the largest files and folders, open the Terminal and run one of the following commands:

  • For an overview of the 10 largest top-level folders in human-readable format:

    du -hsx * | sort -rh | head -n 10
  • For a detailed list of the 10 largest files and folders at any depth, in raw format:

    du -a /home/jovyan | sort -n -r | head -n 10
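The same overview can be produced from inside a notebook. The sketch below is an illustrative alternative to the shell commands above, using only the Python standard library. The function names `folder_size` and `largest_entries` are hypothetical. It sums apparent file sizes in bytes, so the numbers can differ slightly from what `du` reports.

```python
import os


def folder_size(path: str) -> int:
    """Sum the apparent size in bytes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                if not os.path.islink(file_path):
                    total += os.path.getsize(file_path)
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def largest_entries(base: str, top: int = 10):
    """Return the `top` largest direct children of base as (bytes, name) tuples."""
    sizes = []
    with os.scandir(base) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                size = folder_size(entry.path)
            elif entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
            else:
                continue  # skip symlinks and special files
            sizes.append((size, entry.name))
    return sorted(sizes, reverse=True)[:top]


# Typical use inside the ML Workbench:
# for size, name in largest_entries("/home/jovyan"):
#     print(f"{size / 1024**2:8.1f} MB  {name}")
```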

Storage Tips - Cleaning/Deleting Storage

Folders cannot be deleted with the usual right-click menu. To delete a folder, use the Terminal and run:

rm -rf foldername


Make sure you reference the folder by its absolute path or by a correct relative path.
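Because a recursive delete cannot be undone, it can help to resolve the path and confirm where it points before removing anything. The following sketch shows one possible guard using `pathlib` and `shutil.rmtree` from the standard library. The helper name `delete_folder` is hypothetical, and treating relative paths as relative to `/home/jovyan` (rather than to the current working directory) is a design assumption of this example.

```python
import shutil
from pathlib import Path


def delete_folder(folder: str, base: str = "/home/jovyan") -> Path:
    """Delete a folder recursively, but only if it lies strictly inside base."""
    base_path = Path(base).resolve()
    target = Path(folder)
    if not target.is_absolute():
        # Assumption: relative paths are interpreted relative to base.
        target = base_path / target
    target = target.resolve()  # collapses '..' segments and symlinks
    if target == base_path or base_path not in target.parents:
        raise ValueError(f"refusing to delete {target}: not inside {base_path}")
    if not target.is_dir():
        raise NotADirectoryError(f"{target} is not a folder")
    shutil.rmtree(target)
    return target


# Typical use:
# delete_folder("old_experiments")                # /home/jovyan/old_experiments
# delete_folder("/home/jovyan/old_experiments")   # same folder, absolute path
```

Resolving the path first means that a typo such as a stray `..` raises an error instead of deleting a folder outside the home directory.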