
Connecting to Databricks

You can connect your Databricks tenant to the Celonis Platform, allowing you to integrate both your 'Data Science and Engineering' and your 'SQL Warehouse' clusters. The connection uses the Databricks JDBC driver. For more information about this driver, see: Databricks JDBC Driver.

Prerequisites for connecting to Databricks

To connect your Databricks tenant to the Celonis Platform you need access to a Personal Access Token from your Databricks account.

For information on how to create this token, see: Databricks Help Center - Databricks personal access token authentication.

With access to your Personal Access Token, you can create a connection between your Databricks instance and the Celonis Platform from your data pool diagram:

  1. Click Data Connections.

    A screenshot showing how to access the data connections screen from a data pool diagram.
  2. Click Add Data Connection and select Connect to Data Source.

    A screenshot showing how to add data connections from a data pool diagram.
  3. Click Cloud - Database.

  4. Configure the following connection details:

    • Name: An internal reference for this data connection.

    • Database type: Select Databricks.

    • Configuration type: Select Standard.

    • Host: The server name or IP address of the database server.

    • Port: The default is 443.

    • Schema name: The schema to use (optional).

    • Additional Properties: Extra connection properties, for example validateCertificate=false.

    • Personal Access Token: Enter the token obtained from your Databricks account.

    Note

    If you want to connect to your SQL Warehouse cluster, you must remove the Validation Certificate from the data connection. To do this:

    A screenshot showing how to validate the certificate for databricks.
    1. Click Advanced Settings.

    2. Select Validation Certificate - REMOVED

  5. Click Test Connection and correct any highlighted issues.

  6. Click Save.

    The connection between your Databricks tenant and the Celonis Platform is established. You can manage this connection at any time by clicking Options:

    A screenshot showing how to edit an existing data connection.
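
For orientation only: the Celonis Platform assembles the JDBC connection from the fields above, and its internal format is not described here. The Python sketch below shows how the Databricks JDBC driver's URL form for personal access token authentication could be put together from the same details. The function name and the http_path parameter are assumptions for illustration; http_path is a value the driver expects that does not appear in the field list above.

```python
def build_jdbc_url(host: str, http_path: str, token: str, port: int = 443) -> str:
    """Assemble a Databricks JDBC URL for personal access token authentication.

    Hypothetical helper: 'http_path' is an assumed extra input (the HTTP path
    of the cluster or SQL warehouse), not a field listed in the steps above.
    """
    # AuthMech=3 with UID=token tells the driver to treat PWD as a
    # personal access token.
    return (
        f"jdbc:databricks://{host}:{port};"
        f"httpPath={http_path};AuthMech=3;UID=token;PWD={token}"
    )


if __name__ == "__main__":
    # Placeholder values; never hard-code a real token.
    print(build_jdbc_url("example.cloud.databricks.com", "/sql/1.0/warehouses/abc", "<token>"))
```

Keeping the port as a defaulted argument mirrors the Port field above, where 443 is the default.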

You can also authenticate the connection to your Databricks instance using Databricks service principals:

Creating a service principal in Azure

  1. Log in to your Azure Databricks dashboard and click Select User - Settings - Identity and Access.

  2. Click Manage Service Principals.

    authentication_with_Microsoft_Entra.png
  3. Create a Databricks-managed service principal using these settings:

    • Management: Databricks managed.

    • Service principal name: Enter a reference here.

    • Status: Active

    • Entitlements: Databricks SQL Access, Allow workspace access.

  4. Click Add to save the service principal.

    add_service_principal.png
  5. Copy the application ID (to be used when creating the data connection in the Celonis Platform).

Adding permissions for the service principal

To add permissions for your newly created service principal:

  1. Click Catalog.

  2. Select the Catalog you will use in the Celonis Platform and then click Permissions.

  3. Grant Access to your service principal with the following privileges:

    • SELECT

    • USAGE

    • READ METADATA

    The service principal can then access all schemas under that catalog. Alternatively, you can manually assign the service principal to individual schemas only.

assign_permissions_to_service_principal.png
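
If you prefer SQL to the Catalog UI, the same privileges can typically be granted with GRANT statements. The Python sketch below only generates the statement text; it is an illustration under assumptions, not the procedure this guide prescribes. The function name is hypothetical, the READ_METADATA spelling follows the legacy Databricks privilege keyword, and Unity Catalog uses different privilege names (for example USE CATALOG and USE SCHEMA in place of USAGE), so check which model your catalog uses before running anything.

```python
def build_grant_statements(
    catalog: str,
    application_id: str,
    privileges=("SELECT", "USAGE", "READ_METADATA"),
) -> list[str]:
    """Return one GRANT statement per privilege for a service principal.

    Hypothetical helper. The service principal is referenced by its
    application ID, wrapped in backticks. Privilege keywords are an
    assumption based on the legacy privilege model.
    """
    return [
        f"GRANT {privilege} ON CATALOG {catalog} TO `{application_id}`;"
        for privilege in privileges
    ]


if __name__ == "__main__":
    for statement in build_grant_statements("my_catalog", "<application-id>"):
        print(statement)
```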

Creating OAuth secret for the service principal

To create an OAuth secret for the service principal:

  1. Go to the Service Principal Secrets page and generate a new secret. The ID is the same as the application ID generated previously.

    secret_ID.png
  2. For Identity Federation enabled accounts: Assign the Permissions on the SQL Warehouse to the Service Principal.

    SQL_permissions.png
  3. For Identity Federation enabled accounts: Assign the Service Principal to the Workspace.

    sharing_workspace.png

Creating the data connection in the Celonis Platform

You can now create the data connection in the Celonis Platform by following these steps: Creating a connection between Databricks and the Celonis Platform.

When prompted for credentials, enter the following:

  • Principal ID: The application ID copied when creating the service principal.

  • Principal Secret: The OAuth secret generated above.

data_connection.png
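
To check the Principal ID and Principal Secret independently of the Celonis Platform, you can request an OAuth token from the Databricks workspace yourself. The Python sketch below builds such a client-credentials request with the standard library only. It assumes the workspace-level token endpoint /oidc/v1/token and the all-apis scope; the function name and the host value are placeholders, and nothing here describes how the Celonis Platform performs the exchange internally.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(workspace_host: str, principal_id: str, principal_secret: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth client-credentials token request.

    Hypothetical helper; the endpoint path and scope are assumptions
    based on Databricks machine-to-machine OAuth.
    """
    url = f"https://{workspace_host}/oidc/v1/token"
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # The principal ID and secret are sent as HTTP Basic credentials.
    basic = base64.b64encode(
        f"{principal_id}:{principal_secret}".encode("utf-8")
    ).decode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Authorization", f"Basic {basic}")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


if __name__ == "__main__":
    req = build_token_request("<workspace-host>", "<principal-id>", "<principal-secret>")
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req) and read the JSON response.
```

A failed request with valid-looking values usually points to a wrong ID or secret, or to the service principal not being assigned to the workspace.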

You can also authenticate the connection to your Databricks instance using Microsoft Entra ID (formerly Azure Active Directory):

Creating a service principal in Azure

  1. Log in to your Azure Databricks dashboard and click Select User - Settings - Identity and Access.

  2. Click Manage Service Principals.

    authentication_with_Microsoft_Entra.png
  3. Create a service principal using these settings:

    • Management: Microsoft Entra ID managed.

    • Service principal name: Enter a reference here.

    • Status: Active

    • Entitlements: Databricks SQL Access, Allow workspace access.

  4. Click Add to save the service principal.

    add_service_principal.png

Adding permissions for the service principal

To add permissions for your newly created service principal:

  1. Click Catalog.

  2. Select the Catalog you will use in the Celonis Platform and then click Permissions.

  3. Grant Access to your service principal with the following privileges:

    • SELECT

    • USAGE

    • READ METADATA

    The service principal can then access all schemas under that catalog. Alternatively, you can manually assign the service principal to individual schemas only.

assign_permissions_to_service_principal.png

Creating OAuth secret for the service principal

To create an OAuth secret for the service principal:

  1. Go to the Service Principal Secrets page and copy the application ID.

    secret_ID.png
  2. Click Generate secret and copy the secret ID.

  3. For Identity Federation enabled accounts: Assign the Permissions on the SQL Warehouse to the Service Principal.

    SQL_permissions.png
  4. For Identity Federation enabled accounts: Assign the Service Principal to the Workspace.

    sharing_workspace.png

Creating the data connection in the Celonis Platform

You can now create the data connection in the Celonis Platform by following these steps: Creating a connection between Databricks and the Celonis Platform.

When prompted for credentials, enter the following:

  • Principal ID: The application ID generated when viewing the OAuth secret above.

  • Principal Secret: The OAuth secret generated above.

data_connection.png

If the JDBC driver you're using doesn't support certificate validation, you can edit the application-local.yml file to remove this from the configuration.

To do this, open the application-local.yml file (found in the package directory) and add the following configuration:

database:
    validateCertificateSupported: false
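
If you manage several installations, the edit above can be scripted. The Python sketch below is a minimal, text-based illustration that adds the setting only when it is missing; the function name is hypothetical, and it assumes a simple file layout in which database: is a top-level key. It does not parse YAML, and it leaves an existing validateCertificateSupported entry untouched whatever its value, so review the result before restarting anything.

```python
def ensure_validate_certificate_disabled(yaml_text: str) -> str:
    """Add 'validateCertificateSupported: false' under 'database:' if absent.

    Hypothetical helper working on plain text; assumes 'database:' is a
    top-level key when it exists.
    """
    key = "validateCertificateSupported:"
    lines = yaml_text.splitlines()
    if any(line.strip().startswith(key) for line in lines):
        return yaml_text  # already configured; leave the file as it is

    for i, line in enumerate(lines):
        if line.rstrip() == "database:":
            indent = "    "
            # Reuse the indentation of an existing child entry, if any.
            if i + 1 < len(lines):
                nxt = lines[i + 1]
                stripped = nxt.lstrip(" ")
                if stripped and len(nxt) > len(stripped):
                    indent = nxt[: len(nxt) - len(stripped)]
            lines.insert(i + 1, f"{indent}{key} false")
            break
    else:
        # No 'database:' block yet: append the block shown above.
        lines += ["database:", f"    {key} false"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(ensure_validate_certificate_disabled("server:\n  port: 8080\n"))
```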