This topic outlines the complete setup process for Warehouse-Native Analytics: creating a service account, accessing the platform, creating an application, connecting to your data warehouse, defining datasets, and building an exploration. See Terms and Concepts for explanations of these terms.
Log into Warehouse-Native Analytics
- Access the Warehouse-Native Analytics web application URL on any browser.
- Enter the username and password that the Warehouse-Native Analytics administrator provided and click Sign In.
If you encounter difficulties logging into Analytics, contact the Analytics administrator or Support.
Create a service account with read/write schema
You must create a service account for Analytics in the data warehouse to query the warehouse.
You should add a schema with read/write access granted. This schema is usually called netspring_rw. Analytics uses it to cache common computations, such as the drop-down list of unique values in a column, which substantially improves performance and reduces the warehouse cost of Analytics.
- Request the creation of this schema. The request does not block your setup; you can return to this step when the schema is ready.
- Go to Settings > General Settings > Materialization and enable the feature. Analytics creates materialized tables in the data warehouse, which contain intermediate results for improving performance.
- Configure the following fields:
- Database – The database name in the data warehouse where the materialized tables are created. The example shows netspring.
- Schema – The schema's name in the database where the materialized tables are created. The example shows netspring_rw.
- Refresh Cron Schedule – The refresh periodicity of the materialized tables, in cron syntax. The recommended periodicity is daily, so the schedule is 0 0 * * *.
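The exact statements for creating the read/write schema depend on your warehouse. A hedged, Snowflake-style sketch is shown below; the database name netspring and schema netspring_rw come from the examples above, while the role NETSPRING_SVC is a hypothetical name for the Analytics service account's role.

```sql
-- Sketch only: Snowflake-style syntax; adjust for your warehouse.
-- NETSPRING_SVC is a hypothetical role for the Analytics service account.
CREATE SCHEMA IF NOT EXISTS netspring.netspring_rw;

GRANT USAGE ON DATABASE netspring TO ROLE NETSPRING_SVC;
GRANT ALL PRIVILEGES ON SCHEMA netspring.netspring_rw TO ROLE NETSPRING_SVC;
```

Your warehouse administrator may prefer narrower grants; the essential point is that the service account can both read and write in this schema.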
Create an application
Create an application in your organization to store data and analysis generated in the subsequent steps.
- Click Application > +.
- Enter the application details in the Creating new application window. Provide a Name and Description for the application, select a color, and click Save.
- Select a data warehouse at Data > Connections > + New Connection.
- Enter the connection details.
- Test your connection by clicking the clockwise arrow icon next to the Health field. A successful connection shows as good in green; a failure displays an error. Common sources of error are incorrect warehouse credentials or insufficient permissions for the Analytics service account.
- Click Confirm to finish the application setup. Click Application on the Menu panel to verify the new application is under your organization.
Learn how to set up Connections in Analytics.
Create datasets and a relationship for explorations
You must create user and event datasets and a relationship between them before creating explorations in Analytics.
Create event datasets
Before you create a dataset, determine whether your events are represented by one table or by many tables in your warehouse. These are the two common cases, though your events may not fit either.
Single table
- Click + > Dataset > Source Dataset. Source datasets read from a single table in your warehouse.
- Go to the Pick a Source section and expand the database using the tree view. You can also enter the table path directly.
- Select the required table or view and click Confirm.
- Provide a name for the dataset and click Save.
- Annotate the dataset as an event stream to tell Analytics that it is a stream of events performed by users, accounts, and so on. To annotate your dataset as an event stream:
- Go to Semantics > Event Stream.
- Click + Add.
- In the New datasets are events that occur at field, select a column that orders events by time, such as event_ts. In the with event type field, select the column that represents the event's name, for example, event_type. The Event Type of an event is the name shown in the Analytics UI when you select events.
- Click Save.
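For reference, a table suited to this annotation has one row per event, with a timestamp column and an event-name column. A hypothetical sketch (column names match the examples above; the table name and the properties column are illustrative):

```sql
-- Hypothetical shape of a single event table.
CREATE TABLE product_events (
    user_id    VARCHAR,     -- identifier of the user who performed the event
    event_ts   TIMESTAMP,   -- orders events by time ("events that occur at")
    event_type VARCHAR,     -- event name shown in the Analytics UI ("with event type")
    properties VARCHAR      -- optional event attributes, if any
);
```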
Multiple tables
When you have multiple tables, create a Union Dataset. A union dataset combines multiple warehouse tables into a single logical dataset exposed to Analytics users. This dataset type unifies your view of events across tables and queries them efficiently. You can create a union dataset by following the procedure in this section.
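Conceptually, a union dataset behaves like a UNION ALL view over the underlying tables. A sketch under the assumption of three hypothetical per-event tables that share user_id and event_ts columns:

```sql
-- Illustration only: Analytics performs this unification for you;
-- the table and event names here are hypothetical.
SELECT user_id, event_ts, 'signup' AS event_type FROM signups
UNION ALL
SELECT user_id, event_ts, 'login'  AS event_type FROM logins
UNION ALL
SELECT user_id, event_ts, 'play'   AS event_type FROM plays;
```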
In the Semantics tab, annotate the dataset as an event stream. To do that:
- Choose Event Stream in the This is an drop-down list.
- Click + Add under Event Streams.
- In the New datasets are events that occur at field, select a column that orders events by time, such as event_ts. In the with event type field, select the column that represents the event's name, for example, event_type.
- Click Save.
Create user datasets
You must create a User Dataset. Analytics requires a dimension table of users with one row per user.
- Click + > Dataset > Source Dataset.
- Go to the Pick a Source section and expand the database using the tree view. You can also enter the table path directly.
- Select the required table or view, for example, the USER table from the PUBLIC schema, and click Confirm.
- Provide a name for the dataset and click Save.
Create a relationship between the user and event datasets
Creating a relationship between datasets tells Analytics how to join these tables together.
- Go to Related Datasets > + Add Related Dataset.
- Select the event dataset – ProductEvents.
- Select one to many for Cardinality because one user will have multiple events.
- Select the columns. In the Users table, select the id column — this is usually the primary key. In the Events table, choose the column representing the ID of the user who performed that event – here, user_id.
- Click Save.
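The relationship above corresponds to a one-to-many join that Analytics can generate when querying the warehouse, roughly:

```sql
-- One user to many events: Users.id is the primary key,
-- Events.user_id identifies the user who performed each event.
-- Table names are illustrative.
SELECT u.id, COUNT(*) AS events_per_user
FROM users u
JOIN product_events e ON e.user_id = u.id
GROUP BY u.id;
```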
Create Column Actor datasets
An actor dataset typically maps to a dimension table in the data warehouse. In some cases, however, there is no such dimension table in the warehouse; there is only an identifier column in an event table (Events.user_id). In this case, you can create a Column Actor dataset. A Column Actor dataset does not map directly to a warehouse table; it is defined solely by its relationship to the identifier columns in other datasets.
- Go to Data > Datasets > + New Dataset > Column Actor Dataset.
- Give your dataset a name, add a description, and click Save.
- Click + Add Related Dataset to establish a relationship with another dataset.
- Click the Dataset dropdown and select the dataset you want to link to the current column actor dataset.
- Select the Cardinality for the relationship.
- Define the relationship. You choose a column only on the non-column-actor side, because the column actor dataset has no physical columns. For example, if you choose user_id, the column actor dataset is defined by the user_id column of the related dataset, with no underlying physical table.
- Click Save.
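Because a column actor dataset has no backing table, its set of actors is effectively the distinct identifiers found in the related dataset, roughly:

```sql
-- Illustration only: the column actor is defined by the user_id
-- column of the related event dataset, not by a physical table.
SELECT DISTINCT user_id FROM product_events;
```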
Build an exploration
Build an exploration using the datasets you created.
- Before you create an exploration, go to Settings > Defaults. Select the Users dataset for the Actors dataset field, and choose Product Events for the Event Stream field. With these defaults set, Users is populated automatically as the selected actor for new measures. Now you are ready to build your exploration.
- Click + > Event Segmentation.
- Select the Count of unique actors that performed event measure.
- Click Select Events and choose Play Content.
- Click Run to display a chart in the visualization window.
- Name the exploration and click Save.