Connecting to Snowflake
Limited availability feature - Bulk exporting
If you are using a direct connection to your Snowflake instance with the Celonis Platform, you can now take advantage of a bulk export for full extractions. This method offers significant reductions in extraction times without any alterations or new configurations needed on your end.
For more information on direct connections, see Choosing the connection type. If you are interested in participating in this limited availability release, contact us at Celopeers.
The Celonis Snowflake extractor allows you to transfer data from Snowflake databases into the Celonis Platform for process mining and analysis.
Prerequisites
This section details important prerequisites or prerequisite knowledge for using this extractor.
Before creating a connection between your database and the Celonis Platform you must decide which connection type you want to use. Except where stated in Supported database types, all databases have two basic connection types: Direct connections and Uplink connections via an on-premise extractor, as described below:
Direct connections: Use direct connections when you want to allow the Celonis Platform direct access to your database without additional infrastructure. This means you do not need to install, patch, or maintain on-premise extractors, which speeds up implementation, reduces complexity, and simplifies operations.
Note
By default, all cloud-based extractors are direct connections.
Uplink connections via an on-premise extractor: Use uplink connections when you don't want to or can't allow the Celonis Platform to directly access your on-premise or private cloud database. The connection between the database and Celonis is then established using an on-premise extractor that's installed within your network, ideally on a dedicated server.
The role of the extractor is to poll and fetch job requests from the Celonis Platform, and then submit the execution information to the database via an SQL query. Once the data is retrieved from the database, the extractor fetches it and sends it back to the Celonis Platform. As such, the connection between the database and the Celonis Platform is always made by the extractor, which continuously queries the Celonis Platform for any extractions to execute.
Note
To use an uplink connection, you must install an on-premise extractor in your environment. To do so, see Setting up. Additionally, if you want to use a proxy (optional), see Proxy settings for on-prem clients.
For the database extractor to communicate with your database and the Celonis Platform, you must modify your network settings to allow access.
Note
Follow the instructions in the network settings section below based on the connection type you are using. Additionally, if you are using uplink connections, follow the instructions in Celonis Platform IP addresses depending on the cluster.
The following network settings apply only for direct connections:
Source system | Target system | Port | Protocol | Description |
---|---|---|---|---|
Celonis Platform | Source system | Depends on the database; typical ports are 5432 for PostgreSQL and 30015 for HANA. | TCP | JDBC connection from the Celonis Platform to the database. The port is the one you normally use to connect to the database. The IPs of the Celonis Platform depend on the cloud cluster (which can be seen in the URL). |
The following network settings apply only for uplink connections (via the on-premise extractor):
Source system | Target system | Port | Protocol | Description |
---|---|---|---|---|
On-premise extractor server | Source system | Depends on the database; typical ports are 5432 for PostgreSQL and 30015 for HANA. | TCP | JDBC connection from the on-premise extractor server to the database. The port is the one you normally use to connect to the database. |
On-premise extractor server | Celonis Platform | 443 | TCP | HTTPS connection from the on-premise extractor server to the Celonis cloud endpoint. The IPs of the Celonis Platform depend on the cloud cluster (which can be seen in the URL). |
Each cluster uses multiple IP addresses, so you need to allow all of them in your firewall configuration to connect the on-premise extractor server and the cloud endpoint.
For a complete list of inbound and outbound Celonis Platform IP addresses to be allowlisted if needed, see: Allowlisting Celonis domain names, IP addresses, and third-party domains
This section describes the guidelines for using custom JDBC strings in extractor configurations:
Authentication: The Credentials fields in the extractor configuration are required and always used to authenticate the connection. Do not embed credentials directly in your JDBC string.
Encryption: For standard (unencrypted) extractors (examples: SAP HANA, PostgreSQL), you can enable encryption by adding encrypt=true to the JDBC string. For encrypted extractors (examples: SAP HANA encrypted, PostgreSQL encrypted), connections are established with encryption enabled (encrypt=true) by default. You do not need to include this parameter in your JDBC string.
Certificate validation: Do not include validateCertificate=true in your JDBC strings. Instead, use Advanced Settings > Validate Certificate > Enabled.
Additional properties: You can include additional properties in either the JDBC string or the Additional Properties field. Do not specify the same properties in both places.
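For example, a Snowflake JDBC string that follows these guidelines might look like the one below. The account identifier, database, warehouse, and role names are placeholders; the credentials stay in the Credentials fields rather than in the string, and the role property appears here only (not also in the Additional Properties field):
jdbc:snowflake://myaccount.snowflakecomputing.com/?db=MY_DB&schema=PUBLIC&warehouse=MY_WH&role=CELONIS_ROLE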
The Snowflake extractor supports the authentication methods described in the following sections.
This extractor can connect to the database using a database user account. Provide the username and password for this account to authenticate the connection. Ensure this database user has sufficient permissions to access the data to be extracted.
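As a rough sketch of how such permissions might be granted in Snowflake (the role, user, warehouse, database, and schema names are placeholders, not values required by the extractor):
-- Placeholder objects: celonis_role, celonis_user, MY_WH, MY_DB
create role celonis_role;
grant usage on warehouse MY_WH to role celonis_role;
grant usage on database MY_DB to role celonis_role;
grant usage on schema MY_DB.PUBLIC to role celonis_role;
grant select on all tables in schema MY_DB.PUBLIC to role celonis_role;
create user celonis_user password = 'choose-a-strong-password' default_role = celonis_role;
grant role celonis_role to user celonis_user;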
If you are using Snowflake’s built-in identity provider, you can connect to your public-facing Snowflake instance using OAuth authentication.
Important
If your Snowflake instance uses an external identity provider for SSO authentication, OAuth is not supported. We recommend using Key Pair authentication instead.
To use OAuth authentication with Snowflake’s built-in identity provider:
Sign into your Snowflake account.
Note
This account must be able to run commands as both the USERADMIN role for user and role management tasks, and the ACCOUNTADMIN role for account-level operations. Ensure you are using the appropriate role when performing each step.
As the USERADMIN role, create a new user role and grant the required role permissions by running the following commands in the Worksheets window:
-- Create a new role for celonis
create role celonis_role;

-- Grant permissions for that role
grant select on all tables in schema MY_DB.PUBLIC to role celonis_role;
As the USERADMIN role, create a new user and assign it the role by running the following commands in the Worksheets window:
-- Create a new user for celonis
create user celonis password = 'celonis123';

-- Grant celonis permissions to celonis user
grant role celonis_role to user celonis;
As the ACCOUNTADMIN role, create a security integration group by running the following commands in the Worksheets window:
-- Create security integration for the oauth
create security integration oauth_kp_int
  type = oauth
  enabled = true
  oauth_client = CUSTOM
  oauth_client_type = 'CONFIDENTIAL'
  oauth_redirect_uri = 'https://auth.redirect.celonis.cloud/snowflake_redirect'
  oauth_issue_refresh_tokens = true
  oauth_refresh_token_validity = 7776000;

-- Allow celonis user to use the integration as ACCOUNTADMIN
ALTER USER celonis ADD DELEGATED AUTHORIZATION OF ROLE celonis_role TO SECURITY INTEGRATION oauth_kp_int;
As the ACCOUNTADMIN role, retrieve the Client ID and Client secret for the OAuth integration by running the following commands in the Worksheets window:
-- Get the client-id & client-secret for the integration.
-- Note that the integration name is case-sensitive and must be uppercase and enclosed in single quotes.
select SYSTEM$SHOW_OAUTH_CLIENT_SECRETS('OAUTH_KP_INT');
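If you want to verify the setup from the Snowflake side before configuring the connection in Celonis, a quick check might look like this (using the integration and user names from the examples above; USE ROLE switches to the role mentioned in the note):
-- Switch to the account-level role
USE ROLE ACCOUNTADMIN;
-- Review the integration's type, redirect URI, and token settings
DESCRIBE SECURITY INTEGRATION oauth_kp_int;
-- Confirm the delegated authorization for the celonis user
SHOW DELEGATED AUTHORIZATIONS BY USER celonis;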
If you want to connect to Snowflake using key pair authentication, you need to generate a private key in your data source system. Celonis recommends using an encrypted private key with a passphrase. The passphrase is then encrypted by the Celonis Platform, giving you an extra layer of security.
Note
If you encrypt your private key with a passphrase, ensure you select the Is your key encrypted? checkbox in the Credentials configuration.
This authentication method can be used for both direct and uplink connections. When entering the Credentials > Private Key value in the configuration, include the value starting with -----BEGIN PRIVATE KEY----- and ending with -----END PRIVATE KEY-----.
For more information about key pair authentication, including generating an encrypted private key, see: Snowflake docs: Key pair authentication.
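Note that the public half of the key pair must be registered with the Snowflake user that the extractor authenticates as. As a minimal sketch (the user name and the truncated key value are placeholders; see the Snowflake documentation above for generating the key pair):
-- Register the public key with the user (paste the key body without its header and footer lines)
ALTER USER celonis SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';
-- Verify the key registration by checking the RSA_PUBLIC_KEY_FP property
DESC USER celonis;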
Configuring the Snowflake extractor
This section describes the basic setup of the Snowflake extractor. To configure the extractor:
In the Celonis Platform left navigation, select Data > Data Integration.
On the Data Pools screen, select the data pool you want to use for the extraction.
Note
If you do not have a data pool to use for this extraction, see Creating and managing data pools for instructions on how to create one.
In the Data Integration section, select Connect to Data Source.
In the Add Data Connection window, select Connect to Data Source.
In the Connect to Data Source window, depending on your use case, select either Database – On Premise or Database – Cloud.
Note
Select Database – On Premise to connect to on-premise or private cloud databases.
If you selected Database – On Premise, follow the on-screen instructions.
In the New Database Data Connection window, fill in the following information:
For Name, provide a name for this configuration.
For Database Type, select Snowflake.
For Connection Type, select either Standard or Custom JDBC Connection String.
If you selected Standard:
For Host, enter the hostname or IP address of your server.
For Port, provide the port to connect to (the default is 443).
For Database Name, enter the name of the database that contains the data you want to extract.
(Optional) For Warehouse, enter the name of the Snowflake virtual warehouse to use for running queries during data extraction.
(Optional) For Additional Properties, enter any additional connection properties required by your database or driver. Separate each property with a semicolon (;).
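For example, the Additional Properties field might contain something like the following, where role and loginTimeout are standard Snowflake JDBC connection parameters and the values are placeholders:
role=CELONIS_ROLE;loginTimeout=60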
If you selected Custom JDBC Connection String:
Important
When using JDBC strings, there are specific guidelines to follow. For more information, see JDBC string guidelines.
For JDBC Connection String, provide your string. Use the format:
jdbc:snowflake://<account_identifier>.snowflakecomputing.com/?db=<database>&schema=<schema>&warehouse=<warehouse>&role=<role>&property1=value1&property2=value2...
Note
For more information on connecting to Snowflake with JDBC strings, see the Snowflake documentation.
Optionally, provide values for:
Schema Name: Enter the name of the schema that contains the tables to extract.
Additional Properties: Enter any additional connection properties required by your database or driver. Separate each property with a semicolon (;).
For Credentials, select the type of credentials to use for this connection. For more information on the different credential types, see Authentication methods.
Note
Ensure the credentials used have sufficient permissions to access the data to be extracted.
If desired, select Advanced Settings, and update these parameters as needed.
Note
The Advanced Settings > Validate Certificate parameter (default: DISABLED) controls whether the extractor validates the server's SSL/TLS certificate:
Disabled: Disables certificate validation (validateCertificate=false).
Enabled: Enforces certificate validation (validateCertificate=true).
Removed: Uses the driver's default behavior. Check the driver documentation to confirm the default.
Select the Test Connection button to confirm the extractor can connect to the host system. If the test fails, adjust the data in the configuration fields as needed.
Once the test connection passes, select the Save button to continue.
Note
If using OAuth, the OAuth workflow starts, asking you to sign into your Snowflake account and authorize the connection.
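If a saved connection or a later extraction fails with permission errors, it can help to confirm from the Snowflake side what the role used by the connection is allowed to access, for example with the role name used in the examples on this page:
-- List every privilege granted to the role
SHOW GRANTS TO ROLE celonis_role;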