
Celonis Product Documentation

4. Set Up the Machine Learning Workspace

In the Accounts Receivable App, two calculations are performed by Python scripts, described below. To run these scripts, you need to set up an ML Workbench.

  1. Customer Priority Calculation: We calculate a Customer Priority between 0 and 100 based on Open AR, AR Due, Disputed Invoices, and Invoices with a Broken Promise to Pay; a higher number means higher priority. Based on this priority, collectors can contact customers about their due invoices.

  2. Data Importer: If a customer wants to send data to Celonis as CSV files over FTP, the Data Importer downloads the files and loads them into Data Pool tables.
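The actual priority formula lives in the ar-backend-jobs repository; as a purely hypothetical illustration of how the four signals could be combined into a 0-100 score (the weights and normalization below are assumptions, not the shipped logic), consider:

```python
# Hypothetical sketch only -- the real weights and formula are defined in the
# ar-backend-jobs scripts. This just illustrates a 0-100 priority score.

def priority_score(open_ar, ar_due, disputed, broken_promises,
                   max_open_ar, max_ar_due):
    """Combine the four signals into a 0-100 priority (higher = contact first)."""
    # Normalize monetary amounts to 0..1 against the portfolio maximum.
    open_ratio = open_ar / max_open_ar if max_open_ar else 0.0
    due_ratio = ar_due / max_ar_due if max_ar_due else 0.0
    # Cap the count-based signals so a handful of items saturates them.
    disputed_ratio = min(disputed, 10) / 10
    broken_ratio = min(broken_promises, 10) / 10
    # Illustrative weights; the real script defines its own.
    score = (0.4 * open_ratio + 0.3 * due_ratio
             + 0.15 * disputed_ratio + 0.15 * broken_ratio)
    return round(100 * score)

print(priority_score(50_000, 20_000, 2, 1, max_open_ar=100_000, max_ar_due=40_000))
```

A higher resulting score means the collector should contact that customer first.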

To set up the ML Workbench, please follow the steps below:

  1. Open the Machine Learning tab from the top menu.

    image9.png
  2. Create a new Workbench App by clicking the New App button.

    image26.png
  3. Enter the app details shown below to create it. Please read the guide on which permissions the app needs in order to read data from the Data Model.

    image101.png
  4. Open the ML App: Open the ML App by clicking the App Card.

  5. Download and Configure the Python Script: Download the Python script from GitHub and follow the instructions below to configure it.

    1. Download the script from GitHub into the ML App: open the Terminal in the app and clone the GitHub repository using the command below:

      git clone https://github.com/celonis/ar-backend-jobs

      image80.png

      Open the terminal

      image18.png

      Clone the repository

      Important

      Prerequisites: You need access to the Celonis GitHub organization and Read access to the ar-backend-jobs repository. Please contact the AR Product team (ar.support@celonis.com) if you don't have access.

    2. Install the required libraries: Go to the ar-backend-jobs folder and install the required libraries using the following commands:

      pip3 install -r requirements.txt

      pip3 install -r requirements-internal.txt

      image17.png

      Install Python library

    3. Configure the Customer Priority Calculation Script: To configure this script, first set the environment variables by creating a .env file in the ar-backend-jobs folder.

      image84.png

      Environment variables for the Priority Calculation script

      1. API_KEY: Create an Application Key from User Profile > Team Settings > Applications > New Application Key and give it the appropriate permissions to access the Data Pool.

      2. CELONIS_URL: The URL of your team host.

      3. PRIORITY_DATAPOOL_KEY: The ID of the Data Pool of the AR Process Connector, which you can get from the browser's URL.
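A minimal .env sketch with the three variables above might look as follows; all values are placeholders you must replace with your own:

```
# Example .env for the Priority Calculation script -- placeholder values only
API_KEY=<application-key-with-data-pool-access>
CELONIS_URL=https://<your-team>.<cluster>.celonis.cloud
PRIORITY_DATAPOOL_KEY=<data-pool-id-from-browser-url>
```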

    4. Schedule the Priority Calculation Script: Next, schedule the priority script to run daily so that you get a daily revised priority score for each customer. Go to the Machine Learning tab > Scheduling > New Schedule, which opens the screen below. Enter a Schedule Name, choose the Workspace App you created in Step 3, select the execute_priority.ipynb notebook, and set the schedule frequency as required.

      image87.png

      Scheduling Settings for Priority Calculation Script

    5. Configure the Data Importer: After cloning the ar-backend-jobs repository, configure the Data Importer to make it work. Open config.py in the ar-backend-jobs > dataImporter folder and configure the following properties.

      1. API_KEY: Create an Application Key from User Profile > Team Settings > Applications > New Application Key and give it the appropriate permissions to access the Data Pool.

      2. CELONIS_URL: The URL of your team host.

      3. DATAPOOL_ID: The ID of the Data Pool of the AR Process Connector, which you can get from the browser's URL.

      4. DATAMODEL_ID: The ID of the Data Model of the AR Process Connector, which you can get from the browser's URL by opening the Process Data Model.

      5. BUCKET_ID: The bucket ID of the Celonis File Storage Manager; refer to the documentation for details. Required when you use Celonis FTP (IS_CELONIS_FTP = True).

      6. IS_SANDBOX: Set to True if the IBC is a Sandbox.

      7. TEAM_NAME: Name of the team, used in email communication.

      8. IS_CELONIS_FTP: Set to True if you use Celonis FTP to place the CSV files for import into the Celonis Data Pool.

      9. FTP_PORT: FTP server port. Required when IS_CELONIS_FTP = False.

      10. FTP_HOST: FTP server host. Required when IS_CELONIS_FTP = False.

      11. FTP_USERNAME: FTP server username. Required when IS_CELONIS_FTP = False.

      12. FTP_PASSWORD: FTP server password. Required when IS_CELONIS_FTP = False.

      13. SMTP_PORT: SMTP server port.

      14. SMTP_SERVER: SMTP server host.

      15. SMTP_USER: SMTP server username.

      16. SMTP_PASSWORD: SMTP server password.

      17. NOTIFICATION_SUBSCRIBER: Comma-separated email addresses of subscribers to Data Importer process notifications.

      18. RELOAD_DATAMODEL: Set to True to reload the Data Model after data import, otherwise set to False.

      19. FULL_LOAD: If set to True, the Data Model reload includes all tables; otherwise only the tables loaded by the Data Importer in that particular execution are reloaded.

      20. CSV_SEPERATOR: Column separator of the CSV file. By default it is a comma (,), but you can change it, e.g. to a pipe (|) for a pipe-separated file.

      21. DATE_FORMAT: Date format of the CSV files received from the customer.

      22. DATE_TIME_FORMAT: Date-time format of the CSV files received from the customer.

      23. ENABLE_CUSTOMIZATION: Set to True if you want to transform the CSV file received from the customer; write your logic in the onInit method of the DataImporterCustomizations class and return the newly created file path.

      24. PROCESSES: An array of the active processes (files) in sequence. The Data Importer imports files in the order given in this array. Each process name configured here must match a process name given in the PROCESS_DETAILS property.

      25. PROCESS_DETAILS: Contains the configuration of each CSV file as follows:

        1. tableName: The Data Pool table into which the CSV file is imported.

        2. columnMapping: Mapping of CSV columns to table columns, along with data types.

        3. fileName: Name of the CSV file to fetch from FTP for loading.

        4. filePath: FTP file path URL.

        5. primaryKeys: List of primary keys by which the Data Importer upserts the data.
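As a rough sketch of how these properties fit together in config.py (the exact structure is defined in the ar-backend-jobs repository; the value shapes below, especially for columnMapping and the date formats, are assumptions, and all values are placeholders):

```python
# Illustrative config.py sketch only -- the real file in ar-backend-jobs
# defines the authoritative structure. Replace all placeholder values.

API_KEY = "<application-key>"
CELONIS_URL = "https://<your-team>.<cluster>.celonis.cloud"
DATAPOOL_ID = "<data-pool-id>"
DATAMODEL_ID = "<data-model-id>"
IS_CELONIS_FTP = True
BUCKET_ID = "<bucket-id>"        # required when IS_CELONIS_FTP = True
RELOAD_DATAMODEL = True          # reload the Data Model after import
FULL_LOAD = False                # reload only the tables imported in this run
CSV_SEPERATOR = ","              # column separator; e.g. "|" for pipe-separated files
DATE_FORMAT = "<date-format>"          # format of dates in the customer's CSV
DATE_TIME_FORMAT = "<date-time-format>"

# Files are imported in the order listed here; each name must have a
# matching entry in PROCESS_DETAILS.
PROCESSES = ["invoices"]

PROCESS_DETAILS = {
    "invoices": {
        "tableName": "AR_INVOICES",                      # target Data Pool table
        "columnMapping": {"INVOICE_ID": "STRING"},       # assumed shape
        "fileName": "invoices.csv",                      # file to fetch from FTP
        "filePath": "<ftp-file-path>",
        "primaryKeys": ["INVOICE_ID"],                   # upsert keys
    },
}
```

Note how every entry in PROCESSES must have a matching key in PROCESS_DETAILS, since the importer looks up each process name there.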

    6. Schedule the Data Importer Module: Schedule the Data Importer script as required. Go to the Machine Learning tab > Scheduling > New Schedule, which opens the screen below. Enter a Schedule Name, choose the Workspace App you created in Step 3, select the data_importer.ipynb notebook, and set the schedule frequency as required.

      image71.png

      Scheduling Settings for Data Importer Script