Connect your Redshift warehouse to Optimizely Analytics to unify customer and product data, create advanced segments, and measure key outcomes like revenue and retention. This direct connection keeps your analytics accurate, eliminates the need for manual exports, and ensures experimentation insights align with your single source of truth.
Perform the following steps to configure your Redshift warehouse for Optimizely Analytics:
Retrieve your JDBC URL
The first parameter required to set up your Amazon Redshift connection is the JDBC connection string for your cluster. You can find the cluster connection string in the Amazon Redshift console, on a cluster's details page.
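A Redshift JDBC connection string generally follows this shape (the cluster endpoint, port, and database name below are placeholders; use the values shown for your own cluster):

```
jdbc:redshift://examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev
```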
Create a Redshift user
- Configure a Redshift user with a unique username and password. By default, only the admin user you create when you launch the data warehouse has access to the default database. To grant other users access, create one or more accounts. Read about authenticating with username and password.
- Provide the Redshift user with permissions for the specific table, views, or schemas you want to analyze with Analytics.
- Enable case-sensitive lookup for databases, tables, and column names using the following SQL:

  ```sql
  ALTER USER <username> SET enable_case_sensitive_identifier = true;
  ```

- Update the maximum query execution time to five minutes:
  - Go to the workgroup configuration.
  - Select the Limits tab, and update the limit for Maximum query execution time from 14400s to 300s.
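As a sketch, the user setup described above might look like the following SQL. The user name, password, and schema name are placeholders; substitute your own, and grant access only to the objects you want Analytics to read:

```sql
-- Create a dedicated user for Optimizely Analytics (placeholder credentials).
CREATE USER optimizely_user PASSWORD 'ChangeMe123';

-- Grant read access to the schema and tables you want to analyze.
GRANT USAGE ON SCHEMA analytics TO optimizely_user;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO optimizely_user;

-- Enable case-sensitive identifier lookup for this user.
ALTER USER optimizely_user SET enable_case_sensitive_identifier = true;
```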
IP allowlisting (conditional)
Ensure the warehouse accepts incoming Optimizely Analytics requests over the public internet. This action is conditional and only required if the warehouse cluster is guarded by a security group (such as a firewall) that prevents access to the cluster from Optimizely Analytics.
Following is the IP allowlist:
- 35.196.71.222
- 34.73.142.185
- 34.148.77.115
- 34.73.63.141
- 34.74.199.69
- 34.74.109.219
- 34.139.128.201
- 35.243.168.58
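If the cluster sits behind a security group that you manage with the AWS CLI, a sketch like the following can generate ingress rules for the addresses above. The security group ID is a placeholder, and the loop only prints the commands so you can review them before running anything:

```shell
SG_ID="sg-0123456789abcdef0"   # placeholder: the security group guarding your cluster
PORT=5439                      # default Redshift port; adjust if your cluster differs

allowlist="35.196.71.222 34.73.142.185 34.148.77.115 34.73.63.141 \
34.74.199.69 34.74.109.219 34.139.128.201 35.243.168.58"

for ip in $allowlist; do
  # Preview the command; remove 'echo' to actually apply the rule.
  echo aws ec2 authorize-security-group-ingress \
    --group-id "$SG_ID" --protocol tcp --port "$PORT" --cidr "$ip/32"
done
```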
Configure CUPED (Experimentation only)
To enable CUPED (Controlled-experiment using Pre-Experiment Data) in your Amazon Redshift environment, a one-time manual setup is required. This guide will walk you through creating and configuring the necessary AWS Lambda function and IAM roles.
Create the Amazon Web Services (AWS) Lambda function
Create the basic Lambda function from the AWS console.
- Go to the AWS Lambda Console.
- Click Functions > Create function.
- Ensure the Author from scratch option is selected.
- Configure the basic function settings:
  - Function name – Enter cuped_thetas.
  - Runtime – Select Python 3.12.
  - Architecture – Keep the default value, x86_64.
  - Execution role – Select Create a new role with basic Lambda permissions.
- Click Create function to finalize.
Configure the function code and layers
Add the Python code and the required dependencies (layers) to your new function.
- Select the Code tab in your newly created function's console.
- Replace the default code in the editor with the following snippet:

  ```python
  import numpy as np, pandas as pd, math, json

  def create_matrix(triangleVals, n):
      out = []
      outIdx = 0
      for i in range(n):
          out.append([])
          for j in range(n):
              if j >= i:
                  out[i].append(triangleVals[outIdx])
                  outIdx += 1
              else:
                  out[i].append(out[j][i])
      return out

  def lambda_handler(event, context):
      input = event['arguments'][0][0]
      df = pd.DataFrame(np.array([0 if i is None else i for i in json.loads(input)]))
      m = len(df)
      n = int((-3 + math.sqrt(9 + 8*m)) / 2)
      XtX = create_matrix(df.iloc[0:m-n].values.flatten(), n)
      Xty = df.iloc[m-n:].values.flatten()
      beta_hat = np.linalg.inv(XtX).dot(Xty)
      results = dict()
      results['results'] = [json.dumps(beta_hat.tolist())]
      return json.dumps(results)
  ```

- Scroll down to the Layers section and click Add a Layer.
- Select AWS Layers under Choose a layer.
- Choose AWSSDKPandas-Python312 from the list of AWS layers.
- Click Add to attach the layer to your function.
Test the Lambda function
Before proceeding, verify that the function is working correctly.
- Select Test in the Code tab.
- Click Create new event.
- Enter an Event name (e.g., TestEvent).
- Paste the following test event data in the JSON editor:

  ```json
  {
    "arguments": [
      [
        "[1,2,3]"
      ]
    ]
  }
  ```

- Click Save, and then click Test again.
- Verify that the function executes successfully and returns the following response:

  ```json
  "{\"results\": [\"[3.0]\"]}"
  ```
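Because the handler is plain Python, you can also sanity-check the computation locally before testing in the console. This sketch reproduces the function code with the same TestEvent payload (it assumes numpy and pandas are installed locally):

```python
import json
import math

import numpy as np
import pandas as pd

def create_matrix(triangleVals, n):
    # Expand the flattened upper triangle back into a symmetric n x n matrix.
    out = []
    outIdx = 0
    for i in range(n):
        out.append([])
        for j in range(n):
            if j >= i:
                out[i].append(triangleVals[outIdx])
                outIdx += 1
            else:
                out[i].append(out[j][i])
    return out

def lambda_handler(event, context):
    data = event['arguments'][0][0]
    df = pd.DataFrame(np.array([0 if i is None else i for i in json.loads(data)]))
    m = len(df)
    n = int((-3 + math.sqrt(9 + 8 * m)) / 2)
    # Solve XtX * beta = Xty for the CUPED theta coefficients.
    XtX = create_matrix(df.iloc[0:m - n].values.flatten(), n)
    Xty = df.iloc[m - n:].values.flatten()
    beta_hat = np.linalg.inv(XtX).dot(Xty)
    return json.dumps({'results': [json.dumps(beta_hat.tolist())]})

# Same payload as the console TestEvent.
response = lambda_handler({"arguments": [["[1,2,3]"]]}, None)
print(response)  # {"results": ["[3.0]"]}
```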
Configure the Redshift IAM role
Finally, you need to grant your Redshift cluster permission to invoke the Lambda function you just created.
- Ensure a default IAM role – Confirm that your Redshift cluster has a default IAM role attached to it.
- Attach a new policy – Attach a new policy to that role that grants permission for the lambda:InvokeFunction action on the cuped_thetas function:

  ```json
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": "lambda:InvokeFunction",
        "Resource": "arn:aws:lambda:<REGION>:<AWS account>:function:cuped_thetas"
      }
    ]
  }
  ```
Performance guidance
Use the following techniques to keep Optimizely Analytics running at an optimal cost and performance profile:
- Ensure your events/conversions table is clustered by event date (not time), and event type (the same column selected in the Semantics tab of the dataset in Optimizely Analytics), in that order.
- (Experimentation Analytics only) If the warehouse has a separate decisions table, ensure that table is clustered by the experiment ID column and decision date (not time).
- Ensure that you have created a new schema in your warehouse, give Optimizely Analytics read and write access to it, and then enter the name of that schema in the Optimizely Analytics app settings, in the Materialization section. This enables the materialization of repeated queries, which is a large cost/performance boost.
- Ensure your warehouse instance size is appropriate.
- Consider creating a new warehouse for Optimizely Analytics, if this same warehouse instance is also used by other workloads, so that Optimizely Analytics queries are isolated from other workloads.
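In Redshift, clustering corresponds to sort keys. A sketch of the clustering recommendations above, using hypothetical table and column names (match these to your own schema and to the columns selected in the Semantics tab):

```sql
-- Events/conversions table: sort by event date, then event type.
ALTER TABLE analytics.events ALTER SORTKEY (event_date, event_type);

-- Experimentation Analytics only: a separate decisions table.
ALTER TABLE analytics.decisions ALTER SORTKEY (experiment_id, decision_date);
```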
Create an empty scratch schema
After configuring your warehouse and granting permissions, create a writable scratch space for Optimizely Analytics. This space optimizes internal operations, enhancing performance, resource utilization, and user experience, making it essential for advanced analytics in Redshift.
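A minimal sketch of this setup, assuming the placeholder names optimizely_scratch and optimizely_user; enter the schema name in the Materialization section of the Optimizely Analytics app settings afterward:

```sql
-- Placeholder schema and user names; substitute your own.
CREATE SCHEMA optimizely_scratch;

-- GRANT ALL on a schema gives USAGE and CREATE, allowing read and write.
GRANT ALL ON SCHEMA optimizely_scratch TO optimizely_user;
```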