Connect your Databricks warehouse to Optimizely Analytics to unify customer and product data, create advanced segments, and measure key outcomes like revenue and retention. This direct connection keeps your analytics accurate, eliminates manual exports, and ensures experimentation insights align with your single source of truth.
This integration uses JDBC to connect your data to Optimizely Analytics. To complete the connection, you must use a service account to generate a personal access token (PAT) and retrieve the JDBC URL from your SQL warehouse in Databricks.
Prerequisites
- Databricks service account
- Optimizely Analytics account
- A SQL warehouse in Databricks
Generate a personal access token (PAT) in Databricks
A personal access token authenticates access to resources and APIs at the Databricks workspace level. Generate the PAT with a service account rather than your personal user account, because a user-account PAT:
- Ties automation to an individual user.
- Breaks access if that user leaves the organization or loses permissions.
- Is harder to audit and rotate.
In your Databricks workspace, click the service account's Databricks username and select Settings from the drop-down list. Follow the steps in Databricks personal access tokens for workspace users and copy the displayed token to a secure location. You will enter this token in Optimizely Analytics later.
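If you prefer to script this step, the token can also be created through the Databricks Token API. The sketch below assumes you already hold a credential for the service account (for example, an OAuth token or a bootstrap PAT) in the DATABRICKS_TOKEN environment variable, with the workspace URL in DATABRICKS_HOST; the function names are ours, not part of the product.

```python
"""Sketch: create a Databricks PAT for the service account via the Token API
(POST /api/2.0/token/create). Environment variable names are assumptions."""
import json
import os
import urllib.request


def build_token_request(host: str, comment: str, lifetime_seconds: int) -> tuple[str, bytes]:
    """Return the (url, body) pair for a token-create call."""
    url = f"{host.rstrip('/')}/api/2.0/token/create"
    body = json.dumps(
        {"comment": comment, "lifetime_seconds": lifetime_seconds}
    ).encode()
    return url, body


def create_pat(comment: str = "optimizely-analytics",
               lifetime_seconds: int = 90 * 24 * 3600) -> str:
    """Call the Token API and return the new token value."""
    url, body = build_token_request(os.environ["DATABRICKS_HOST"],
                                    comment, lifetime_seconds)
    req = urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        # The response includes token_value (the secret) and token_info.
        return json.load(resp)["token_value"]
```

Setting a finite lifetime_seconds, as above, makes rotation easier than issuing a non-expiring token.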
Retrieve the JDBC URL from your Databricks SQL warehouse
- Create a SQL warehouse or locate an existing SQL warehouse in Databricks that you want to connect to Optimizely Analytics.
- Open the Connection details tab.
- Copy the JDBC URL and save it in a safe location for later.
- Add the suffix ;UID=token; to the JDBC URL string if it is not present.

The JDBC URL must be in the following format:

jdbc:databricks://<server-hostname>:443;httpPath=<http-path>;AuthMech=3;UID=token;
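If you handle the URL in a script, a small helper can append the suffix only when it is missing. This is a sketch; the function name is ours, and the format it enforces is the one shown above.

```python
# Sketch: normalize a Databricks JDBC URL for Optimizely Analytics by
# appending the required ";UID=token;" suffix when it is not already present.
def normalize_jdbc_url(url: str) -> str:
    """Ensure the JDBC URL ends with the UID=token parameter."""
    if "UID=token" not in url:
        url = url.rstrip(";") + ";UID=token;"
    return url
```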
IP allowlisting (conditional)
Ensure the warehouse accepts incoming Optimizely Analytics requests over the public internet. This step is only required if the warehouse cluster sits behind a security group or firewall that blocks access from Optimizely Analytics.

Allow inbound connections from the following Optimizely Analytics IP addresses:
- 35.196.71.222
- 34.73.142.185
- 34.148.77.115
- 34.73.63.141
- 34.74.199.69
- 34.74.109.219
- 34.139.128.201
- 35.243.168.58
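When reviewing firewall logs or writing an allow rule, a quick check against the list can help. The helper below is our own illustration; the addresses are the ones listed above.

```python
# Sketch: check whether a source IP is one of the Optimizely Analytics
# addresses listed above. The helper function is ours, not part of the product.
import ipaddress

OPTIMIZELY_IPS = {
    "35.196.71.222", "34.73.142.185", "34.148.77.115", "34.73.63.141",
    "34.74.199.69", "34.74.109.219", "34.139.128.201", "35.243.168.58",
}


def is_allowlisted(ip: str) -> bool:
    """Return True if ip is on the Optimizely Analytics allowlist."""
    # ip_address() validates the string and normalizes its form.
    return str(ipaddress.ip_address(ip)) in OPTIMIZELY_IPS
```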
Performance guidance
Use the following techniques to run Optimizely Analytics at an optimal cost and performance profile:
- Ensure your events/conversions table is clustered by event date (not time), and event type (the same column selected in the Semantics tab of the dataset in Optimizely Analytics), in that order.
- (Experimentation Analytics only) If you have a separate decisions table, ensure it is clustered by the experiment ID column and decision date (not time), in that order.
- Create a new schema in your warehouse, grant Optimizely Analytics read and write access to it, and enter the schema name in the Materialization section of the Optimizely Analytics app settings. This enables materialization of repeated queries, a significant cost and performance boost.
- Ensure your warehouse instance size is appropriate for your query volume and data size.
- If the warehouse instance is shared with other workloads, consider creating a dedicated warehouse for Optimizely Analytics so its queries are isolated.
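The clustering guidance above can be applied with any SQL client. The sketch below assumes the databricks-sql-connector package and Delta liquid clustering (ALTER TABLE ... CLUSTER BY); the table and column names (analytics.events, event_date, event_type, and so on) are placeholders for your own schema.

```python
"""Sketch: cluster the events and decisions tables as recommended above.
Table/column names are placeholders; adapt them to your warehouse."""


def cluster_statement(table: str, columns: list[str]) -> str:
    """Build a liquid-clustering ALTER TABLE statement."""
    return f"ALTER TABLE {table} CLUSTER BY ({', '.join(columns)})"


def apply_clustering(host: str, http_path: str, token: str) -> None:
    # Imported here so the sketch loads without the package installed.
    from databricks import sql  # pip install databricks-sql-connector

    with sql.connect(server_hostname=host, http_path=http_path,
                     access_token=token) as conn:
        with conn.cursor() as cur:
            # Events table: event date first, then event type, in that order.
            cur.execute(cluster_statement("analytics.events",
                                          ["event_date", "event_type"]))
            # Decisions table (Experimentation Analytics only).
            cur.execute(cluster_statement("analytics.decisions",
                                          ["experiment_id", "decision_date"]))
```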
Create a writable schema in Databricks
After you configure the warehouse and grant permissions, create a writable scratch schema for Optimizely Analytics. Optimizely Analytics uses this space for internal operations, which improves performance and resource utilization when running advanced analytics in Databricks.
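As a sketch of this step, the statements below create the scratch schema and grant the service account access. The schema name (optimizely_scratch) and principal are placeholders, and the privilege names assume Unity Catalog; adjust them to your governance model.

```python
# Sketch: SQL for the writable scratch schema. Names are placeholders;
# run these with any SQL client connected to the warehouse.
SCRATCH_SCHEMA_SQL = [
    "CREATE SCHEMA IF NOT EXISTS optimizely_scratch",
    "GRANT USE SCHEMA, CREATE TABLE, SELECT, MODIFY "
    "ON SCHEMA optimizely_scratch TO `optimizely-service-account`",
]
```

Enter the same schema name in the Materialization section of the Optimizely Analytics app settings so repeated queries can be materialized there.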