Celonis Product Documentation

Installing and updating the on-premise JDBC extractor

The JDBC extractor, available from the Celonis Platform Download Portal, allows you to connect to any SQL database.

From version 2.80.1 of the JDBC extractor, you're required to upgrade your Java version to at least Java 17.
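One way to check this requirement before upgrading is to parse the version string that `java -version` prints. The following is a minimal Python sketch, not part of the extractor: the helper names `java_major_version` and `meets_java_requirement` are hypothetical, and the sample strings in the usage note are typical formats rather than output from your server.

```python
import re


def java_major_version(version_output: str) -> int:
    """Return the major Java version found in `java -version` style output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x, for example "1.8.0_292".
    if major == 1 and match.group(2) is not None:
        major = int(match.group(2))
    return major


def meets_java_requirement(version_output: str, minimum: int = 17) -> bool:
    """True if the reported Java version is at least `minimum` (17 by default)."""
    return java_major_version(version_output) >= minimum
```

Note that `java -version` writes to standard error, so if you run it from a script, read the `stderr` stream. For example, `meets_java_requirement('openjdk version "17.0.2" 2022-01-18')` is true, while a string reporting `"1.8.0_292"` is treated as Java 8 and fails the check.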

Also ensure that all other system requirements of the extractor server are fulfilled. For the complete requirements, see System requirements.

The JDBC extractor is available to download from the Celonis Platform Download Portal for users with MANAGE DOWNLOAD PORTAL permissions in their Celonis Platform team.

With the necessary permissions, you can access the Download Portal by clicking Admin & Settings and then clicking Download Portal:

download_portal.png

Once in the Download Portal, you'll find the latest versions of the JDBC extractor:

download_jdbc.png

Celonis only provides fixes for the most recent version of the on-premise JDBC extractor. If you're running a JDBC extractor that's not the most recent version and you raise a support ticket, we'll check whether the bug or issue still exists in the most recent version. If it does, we’ll fix it in that version. In any case, you'll need to upgrade to the most recent version to get the fix.

Older versions of the JDBC extractor won't stop working - you can carry on using them until you need support, at which time you'll need to upgrade to get the bug fixes. We recommend you update the JDBC Extractor at least every quarter, even if you don't experience any issues, to get the latest capabilities and any fixes for security vulnerabilities.

In the Download Portal, under the JDBC Extractor section, the release date of the most recent on-premise JDBC extractor version appears at the start of the package name, which has this format:

<YYYY-MM-DD>-package-name-<major version>.<minor version>.<patch version>

For example, the release date for the version shown here is 6 March 2024:

2024-03-06-dockerized-package-jdbc-2.92.1
JDBC_portal.png
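If you track installed versions in a script, the package name format above can be split into its parts. This is a minimal Python sketch under the assumption that every package name follows the format exactly; the function names `parse_package_name` and `days_since_release` are hypothetical and not part of any Celonis tooling.

```python
import re
from datetime import date

# <YYYY-MM-DD>-package-name-<major version>.<minor version>.<patch version>
PACKAGE_PATTERN = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-(.+)-(\d+)\.(\d+)\.(\d+)$")


def parse_package_name(name: str):
    """Split a package name into (release date, package name, version tuple)."""
    match = PACKAGE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected package name: {name!r}")
    year, month, day = (int(match.group(i)) for i in (1, 2, 3))
    version = tuple(int(match.group(i)) for i in (5, 6, 7))
    return date(year, month, day), match.group(4), version


def days_since_release(name: str, today: date) -> int:
    """Number of days between the package's release date and `today`."""
    release_date, _, _ = parse_package_name(name)
    return (today - release_date).days
```

For the example above, `parse_package_name("2024-03-06-dockerized-package-jdbc-2.92.1")` yields the date 6 March 2024, the name `dockerized-package-jdbc`, and the version `(2, 92, 1)`. Comparing `days_since_release` against roughly one quarter is one way to flag an installation that is due for the recommended update.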

You can confirm the release date of your currently installed JDBC extractor version by going to Admin & Settings > Uplink Integrations. The release date is shown in the Connector Version field for the “Cleo Uplink” connector.

JDBC_uplink.png

If you have any questions or need help, talk to your Celonis point of contact, or find how to get in touch with us: Contacting Support.

After ensuring that you meet the latest system requirements and have downloaded the latest JDBC extractor, update the extractor with the following steps:

  1. If you've customized the JDBC extractor's YAML configuration file, proxy configuration file, or XML configuration file, save a copy of these files from the existing extractor directory.

    • YAML configuration: application-local.yml

    • Proxy configuration: Proxy.yml

    • XML configuration: CelonisJdbcExtractor.xml

    save_config_files.png
  2. In a sandbox environment, create a new directory and install the new version of the extractor package into it.

    The package creates subdirectories and extracts the files from the jar file. It's important to do this in a new directory, rather than replacing the jar file in the existing directory, so that the new package can fully validate that it's been correctly installed.

  3. If updating from a version earlier than 2.82.0: Transfer your customizations for the file CelonisJdbcExtractor.xml to the CelonisJdbcExtractor.xml file supplied with the new version of the extractor package. You need to use the updated version of this file from 2.82.0 or later.

    If updating from version 2.82.0 or later: Replace the CelonisJdbcExtractor.xml file in your new directory with your customized version.

  4. Replace the following files in your new directory with the copies you saved in step 1:

    • YAML configuration: application-local.yml

    • Proxy configuration: Proxy.yml

    You don't need to keep the versions of these files that were supplied with the new package.

  5. Start the new version of the extractor following the instructions in "Step D: Run The Extractor" in How do I set up an on-premise Extractor?

  6. Verify that the extractor is working correctly in your sandbox environment. If it is, follow the above steps to install the extractor to your production environment.

  7. Once the updated version of the extractor is working in your production environment, stop and uninstall the older version of the extractor.
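Steps 1 and 3 above can be sketched in a short script. This is a minimal Python sketch, not an official tool: it assumes the three configuration files sit directly in the extractor directory (check your installation, as the location may differ), and the function names `back_up_config` and `xml_action` are hypothetical.

```python
import shutil
from pathlib import Path

# File names from step 1.
CONFIG_FILES = ("application-local.yml", "Proxy.yml", "CelonisJdbcExtractor.xml")


def back_up_config(extractor_dir, backup_dir):
    """Copy the configuration files that exist into backup_dir.

    Returns the names of the files that were copied.
    """
    source = Path(extractor_dir)
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in CONFIG_FILES:
        candidate = source / name
        if candidate.is_file():
            shutil.copy2(candidate, target / name)  # keeps file timestamps
            copied.append(name)
    return copied


def xml_action(installed_version: str) -> str:
    """Decide how to handle CelonisJdbcExtractor.xml, following step 3.

    'merge'   : installed version is earlier than 2.82.0, so transfer your
                customizations into the file supplied with the new package.
    'replace' : installed version is 2.82.0 or later, so your customized
                file can replace the supplied one.
    """
    parts = tuple(int(part) for part in installed_version.split("."))
    return "merge" if parts < (2, 82, 0) else "replace"
```

For example, `xml_action("2.80.1")` returns `"merge"` and `xml_action("2.92.1")` returns `"replace"`. Comparing versions as integer tuples avoids the string-comparison trap where `"2.100.0"` would sort before `"2.82.0"`.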

JDBC extractor change history

Version

Release date

Changes

3.3.0

2024-12-13

  • Added JOIN support for Snowflake bulk extractions.

  • Improved memory consumption during Parquet file uploads.

3.2.0

2024-12-10

  • Limited availability release of VARCHAR optimization version 2 for on-prem extractors.

  • Limited availability release of Snowflake bulk export feature for JDBC full extractions on direct connections.

3.1.1

2024-11-27

  • Upgraded MSSQL JDBC driver to 12.8.1.

  • Improved messaging for Snowflake connection test if private key is stored outside of Extractor directory.

3.0.0

2024-10-30

2.106.2

2024-10-16

  • Added UI indicator when there is a filter for Join Config and removed warning related to the filter.

  • Added support for EXTERNAL table type in Athena.

2.105.0

2024-09-30

  • Internal improvements.

2.104.1

2024-09-05

  • Upgraded the Amazon Athena driver to version 2.1.5.1000.

2.103.0

2024-08-23

  • Fixed preview and limit for Impala connector.

  • Upgraded the IBM DB2 driver.

  • Fixed a Veracode security issue related to the crypto algorithm used by Vault.

    • The existing vault.jar will not work from JDBC v2.103.0. Instead, upgrade to the latest version of the extractor for full functionality.

  • Fixed Extension Provider query for target tables with special characters.

  • Fixed vulnerabilities in JDBC Connection String.

  • Upgraded the Oracle driver.

2.102.1

2024-08-08

  • Added a readme file listing the libraries that the extractor uses.

  • Made the extractor version naming on the uplink integrations page consistent with the download portal.

  • Fixed some security vulnerabilities.

2.101.5

2024-07-23

  • Improved the date selection for downloading on-premise extractor logs.

  • Upgraded various third-party libraries to their latest version.

  • We now prevent the extractor starting if your proxy configuration is invalid.

2.100.0

2024-07-05

  • You can now search BigQuery tables using one or more projects from the search bar.

  • Upgraded MSSQL driver to the latest (12.6.3.jre11) and related msal4j dependency to 1.15.1.

  • Fixed a bug where on-premise extractor failed to shut down immediately after closing it.

  • Fixed a bug where on-premise extractor failed to report issues with broken proxy configuration at start up.

2.99.0

2024-06-19

  • We've added the capability to increase parallel executions up to 40 through local configurations. For more information, see: Configuring an on-premise extractor.

  • Upgraded the Google BigQuery driver to version 1.5.4.1008.

  • Fixed struct columns parsing in Oracle tables after Oracle JDBC driver upgrade.

2.98.5

2024-06-10

  • Added support for materialized views and snapshots in BigQuery.

  • Retry file upload on receiving HTTP status code 408 to overcome temporary CloudFlare issue.

  • Optimized the way file upload works to improve performance and avoid connection timeout issues.

  • Upgraded Snowflake JDBC driver to version 3.16.1 to fix issues with nested paths on Windows machines.

  • Upgraded several third party libraries to fix vulnerabilities.
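The retry behavior for HTTP 408 in version 2.98.5 can be illustrated with a generic pattern. This is a minimal Python sketch of retrying on a 408 status, not the extractor's actual implementation: the function name `upload_with_retry`, the attempt count, and the delay are all assumptions chosen for illustration.

```python
import time


def upload_with_retry(send, max_attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 408.

    `send` is any callable that performs one upload attempt and returns the
    HTTP status code. The last status is returned if every attempt gets 408.
    """
    status = None
    for attempt in range(1, max_attempts + 1):
        status = send()
        if status != 408:
            return status
        if attempt < max_attempts:
            sleep(delay_seconds)  # wait before the next attempt
    return status
```

Passing `sleep` as a parameter keeps the sketch easy to test without real delays.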

2.97.1

2024-05-29

  • For Oracle extractions, you can now configure type casting between Integer and Float.

2.96.1

2024-05-14

  • If you’re extracting data from a Google BigQuery database, you can now get data from external tables, which reference data stored outside BigQuery, as well as from standard BigQuery tables.

  • Fixed an issue where a schedule consisting of an uplinked job was conflicting with a direct job.

2.95.2

2024-05-03

  • For filters, fixed an additional comma (,) that was being added to an IN clause.

  • Fixed an issue with table hash keys colliding for metadata caching, which skipped the extraction of one of the tables.

2.94.0

2024-04-08

  • Delta loads strictly default to canceling the job if the metadata from your source system changes, to avoid data inconsistency (Option A). Do a full load if this happens.

  • Fixed a bug where MAX_STRING_LENGTH was not getting applied to a column during the table configuration.

2.93.2

2024-03-25

  • Added caching for InformationSchema-based and SampleQuery-based retrieved metadata.

  • Updated third-party dependencies.

2.92.1

2024-03-06

  • Connect directly to Oracle EBS as a Cloud Connection - no need to use an uplink.

  • The extractor now shows tables only from the schema specified in the Connection Configuration.

  • MS SQL extractions for tables with clustered column index now work when the metadata source is SAMPLE_QUERY.

2.91.5

2024-02-16

  • Added support for Optimizer hints while extracting data from Oracle databases.

  • The Windows installer executable is now signed with a Celonis certificate.

  • Unified the three metadata retrieval approaches so that they all return consistent information.

  • Upgraded to newer versions of the logback and json libraries.

2.90.0

2024-01-26

  • Fixed Databricks extraction issues when using the default catalog and default database.

  • Fixed some security vulnerabilities.

2.89.0

2024-01-11

  • For driver metadata, fixed a Microsoft SQL Server extraction issue that occurred when a table has a clustered columnstore index.

  • Optimized memory allocations during extractions.

2.88.1

2023-12-13

  • Beta release of Oracle Smart Extraction, which parallelizes extractions of larger Oracle tables to reduce data extraction times. The feature is shipped disabled. If you want to try it out, we recommend that you do so in a sandbox environment. To get Oracle Smart Extraction enabled, talk to your Celonis point of contact or create a support ticket.

  • Fixed an SAP HANA filter parser error that occurred when concatenation was used.

2.87.0

2023-10-26

  • Fixed deviations for dates earlier than 1900 due to timezone changes.

  • Fixed an issue with the Snowflake driver for a new installation on Microsoft Windows.

  • SQL ID will now be logged for Oracle if debug mode is enabled.

2.86.0

2023-09-22

  • Enabled STRING to DATETIME conversion for BigQuery and Trino.

  • For Oracle, we’ve improved the query for the INFORMATION_SCHEMA metadata source.

  • Quotes in the filter statement are now recognized.

  • Fixed some security vulnerabilities.

  • Fixed an issue for the custom BigQuery driver where classes were not loaded in the correct order.

  • Fixed an issue for uplinked extractors using a proxy configuration.

2.85.0

2023-09-01

  • Upgraded the JDBC extractor’s internal libraries. If you’re linking the BigQuery driver, you’ll need to exclude all SLF4J .jar files from the driver package.

  • On Microsoft Windows, we’ve changed the JDBC extractor’s dependency from the Microsoft Visual C++ 2010 Redistributable Package to the Microsoft Visual C++ 2015-2019 Redistributable Package (x64). Install that package when you install this version of the JDBC extractor.

  • Wildcards in Snowflake metadata calls are now escaped to improve load.

2.84.0

2023-08-10

  • Added support for Oracle CLOB (Character Large Object) and NCLOB (National Character Large Object) data types.

2.83.0

2023-06-21

  • Extractions that hang can be resumed from the last table, instead of restarting them.

  • Upgraded the driver for Snowflake to version 3.13.33.

  • Upgraded the driver for Athena to version 2.0.36.

  • Upgraded the driver for IBM DB2 to version 11.5.

  • Implemented TO_DATE functionality for Oracle filters.

2.82.0 and 2.80.1

2023-08-10

  • Upgraded the JDBC extractor to Java 17.

  • From version 2.80.1, you're required to upgrade your Java version to at least Java 17 to ensure compatibility and leverage the latest enhancements and security features.

  • With version 2.82.0, we’ve removed some additional steps from the upgrade process, so use this or a later version of the extractor package.

2.77.0

2023-02-27

  • Upgraded MSSQL-JDBC Driver to latest version.

  • Set trustServerCertificate=true and encrypt=false by default in case they are not set in the additional properties field (required by driver upgrade).
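The defaulting rule in version 2.77.0 can be illustrated as follows. This is a minimal Python sketch, not the extractor's code: it assumes the additional properties are written as semicolon-separated `key=value` pairs, as in a SQL Server JDBC connection string, and the function name `apply_mssql_defaults` is hypothetical.

```python
def apply_mssql_defaults(additional_properties: str) -> str:
    """Append the two default properties unless they are already set."""
    defaults = {"trustServerCertificate": "true", "encrypt": "false"}
    pairs = [p.strip() for p in additional_properties.split(";") if p.strip()]
    # Compare keys case-insensitively so "Encrypt=true" counts as set.
    present = {p.split("=", 1)[0].strip().lower() for p in pairs}
    for key, value in defaults.items():
        if key.lower() not in present:
            pairs.append(f"{key}={value}")
    return ";".join(pairs)
```

With this sketch, an empty field becomes `trustServerCertificate=true;encrypt=false`, while an explicit `encrypt=true` is left untouched and only `trustServerCertificate=true` is appended.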

2.76.0

2023-02-15

  • Upgraded MySQL driver to latest version.

2.75.0

2023-02-01

  • Fixed security vulnerabilities.

2.71.0

2022-11-25

  • Added support for extractions from Databricks.

  • Oracle DB: Fixed scenarios where the driver metadata was used even though Information Schema was selected.

  • Improved clean-up of changelog tables for real-time extractions by doing the clean-up in chunks.

2.70.0

2022-11-16

  • Improved metadata query for Oracle databases.

2.69.0

2022-10-27

  • Added support for Analytical Views for SAP HANA.

  • Fixed the feature to clear the metadata cache.

2.67.0

2022-09-28

  • Added support for Analytical Views for SAP HANA.

2.66.0

2022-09-15

  • Fixed Java-based vulnerabilities.

2.65.0

2022-09-09

  • Added support for key pair authentication for Snowflake.

2.64.0

2022-08-24

  • Added extraction of synonyms for Oracle databases.

2.63.0

2022-06-30

  • Extended the logging messages.

2.62.0

2022-06-30

  • Enabled Materialized view for the Postgres database.

  • Added support for Attribute(Joined) views for SAP HANA.

2.61.0

2022-06-30

  • BigQuery Get Tables from Additional Projects.

  • BigQuery ADC authentication (hosted in GCP).

  • Fixed test connection issue for BigQuery.

  • Fixed input box for BigQuery data connection form.

2.60.0

2022-06-16

  • Minor improvements and fixes.

2.59.0

2022-06-02

  • Added SHA-256 and SHA-512 support.

  • Fixed the order of delete and insert executed at the same time.

2.57.0

2022-05-05

  • For Google BigQuery, fixed duplicate records caused by a LIMIT/OFFSET in SELECT queries without ORDER BY (primary key) clause not guaranteeing proper pagination.

  • Added validation to inform the user with an error message if a primary key is not selected.

  • Fixed NullPointerException in uploading results leading to duplicated push jobs.

  • Extended invalidate cache also for real-time integration column selection.
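The pagination issue fixed in version 2.57.0 comes from a general SQL property: without ORDER BY, the row order is not guaranteed, so consecutive LIMIT/OFFSET pages can overlap or skip rows. The following minimal Python sketch shows ordered pagination on a primary key. It uses an in-memory SQLite database as a stand-in for BigQuery, and the table name `items` and the function names are hypothetical.

```python
import sqlite3


def fetch_all_pages(conn, page_size):
    """Read the whole table page by page, ordered by the primary key."""
    rows = []
    offset = 0
    while True:
        page = conn.execute(
            # ORDER BY on the primary key makes every page deterministic.
            "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
            (page_size, offset),
        ).fetchall()
        if not page:
            break
        rows.extend(page)
        offset += page_size
    return rows


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    # Insert in reverse order to show that insertion order does not matter.
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [(i, f"row-{i}") for i in range(10, 0, -1)],
    )
    return fetch_all_pages(conn, page_size=3)
```

Because each page is sorted on a unique key, every row appears exactly once across the pages, which is why selecting a primary key matters for this kind of extraction.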

2.56.0

2022-05-05

  • Added initialization for JDBC real-time via Replication Cockpit.

  • Fixed out of memory error due to unlimited threads.

  • Fixed column order changes after deselecting some of the columns in JDBC.

2.55.0

2022-05-05

  • Added a database connection timeout setting in the UI. This overwrites the local timeout in application-local.yml in case of uplinked connections.

2.54.0

2022-05-05

  • Added an authentication option SERVICE_ACCOUNT_AUTHENTICATION for Google BigQuery database connection. Inputs are the service authentication account email ID and the service account key file.

  • Improved logging for JDBC extraction in DEBUG extraction mode.

  • Added support for Vertica database type.

  • Performance improvements for JDBC data extractions on the upload mechanism.

2.51.0

2022-05-05

  • Removed the option to include a changelog time stamp in JDBC real-time extractions, and made it the default, to support real-time transformations.

2.50.0

2022-05-05

  • Extended duplicate removal for all database types by adding properties in application-local.yml for uplink database connections. To enable this, add the following to the application-local.yml file:

    duplicate-removal:
      enabled: true
      strategy: CLOUD

2.49.0

2022-05-05

  • Logical change in reading change log tables in the JDBC real-time scenario to improve performance.

  • Fixed change in metadata source not being consistently reflected in the metadata query.