A derived column is a computed property linked to a dataset. You define it within Analytics, and it is not present in the source data. You can include various calculations in a derived column, such as joins with other datasets, filters, and aggregations, as long as the result applies to each record in the target dataset.
For example, if you have a users dataset and an events dataset with a boolean column named vote, you can define a derived column in the users dataset that counts the total number of true votes per user.
The configuration for a derived column is saved within Analytics and accessed through the Catalog. Like other columns in the dataset, you can use a derived column anywhere the catalog is accessible.
A derived column has the following configuration:
- Dataset – Associate the derived column with exactly one dataset, which is selected when defining it. It displays in the Catalog like any other column, and you can use it for any analysis with catalog access. You cannot move a derived column to another dataset.
- Name – Ensure the derived column's name is unique across columns (including derived columns and cohorts) within its dataset.
- Definition – Specify the computation for a derived column using the block editor.
- Default aggregation – Set a default aggregation type, such as count or sum, for contexts where you use the derived column as a measure.
Create derived columns
- Click + and select Derived Column.
- Choose one of the templates to get started.
- Aggregation/First/Last – Create an aggregate quantity for each actor.
- Formula – Write custom logic using the NetScript formula language.
- Conditional on Property – Assign values based on logical conditions.
- Custom – Calculate values by composing multiple blocks.
- Conditional on Aggregate – Calculate an aggregated quantity per actor and assign each actor to a bucket based on the aggregate value.
- Previous/Next Value – Look up the value of the temporally previous or next event for each actor, optionally filtering events.
- Time Since – Calculate how long ago a timestamp property is.
- Enter a name and description.
- Select a dataset for the derived column in the Column of field.
- Decide if you want the derived column to be a measure or attribute.
- Set the Default Aggregation.
- Use blocks to define the computation for your derived column. Ensure that the resulting value is at the grain of the selected dataset or can be grouped by the selected dataset.
- Click Run to preview the computed values.
- Click Save.
The derived column can now be used anywhere, such as inside exploration templates and the block editor.
Types of derived columns
Analytics offers several templates to help you get started. Choosing a template is the first step in creating a derived column, and you can change it at any time without losing your work.
Aggregation/first/last
Calculate an aggregate quantity for each actor. For example, you can create a derived column to calculate the signup date.
- Go to + > Derived Column > Aggregation/First/Last.
- Enter a name and description.
- Choose a dataset in the Column of field. For this example, select the Users dataset.
- Decide if this is a measure or an attribute. Choose Measure for this example.
- Select the default aggregation from the drop-down list. Choose avg.
- Select the aggregate block's Aggregator and a value from the drop-down list. For example, select min as the aggregator and set event_ts as the value. Group the results by Users.
- Click Run.
- Click Save.
Conditional on property
Assign values based on logical conditions. For example, you can create a derived column to determine whether users like or dislike the website's content.
- Go to + > Derived Column > Conditional on Property.
- Enter a name and description.
- Choose the Content dataset in the Column of field.
- Select avg from the Default Aggregation drop-down list.
- Configure the blocks using an If/Then/Else block. Set the condition as follows: if vote_avg >= 7.5, assign the value Thumbs Up; otherwise, assign Thumbs Down.
- Click Run.
- Click Save.
Formula
Write custom logic using the NetScript formula language.
- Go to + > Derived Column > Formula.
- Enter a name and description.
- Choose the Campaigns dataset in the Column of field.
- Choose Attribute in the Measure or Attribute field.
- Keep the Default Aggregation drop-down list unselected.
-
Enter the following formula to calculate the Numeric ID:
id = Data.Campaigns.id; if not(isnull(id) or strlen(id) == 0) then to_int(id) end - Add a bin block to categorize types using Numeric ID as the value. Create the following three bins:
- Bin 1 – Social Media, where x < 151.
- Bin 2 – Organic Search, where 151 <= x < 501.
- Bin 3 – Paid Search, where x >= 501.
- Enable the output toggle for this block.
- Click Run.
- Click Save.
Custom
Calculate values by composing multiple blocks. For example, you can create a derived column to determine the incremental campaign revenue.
- Go to + > Derived Column > Custom.
- Enter a name and description.
- Choose the Product Events dataset in the Column of field.
- Choose Measure in the Measure or Attribute field.
- Select avg from the Default Aggregation drop-down list.
- Add an aggregate block to calculate the number of Ads Viewed. Choose count as the aggregator and select the Product Events dataset. Group the result by Campaigns.
-
Add a formula block to calculate campaign revenue. Use the following formula:
"Ads Viewed" * Data.Campaigns.cpm / 1000 - Click Run.
- Click Save.
Conditional on aggregate
Calculate an aggregate quantity for each actor and categorize them into buckets based on that aggregate value. For example, you can identify high-value app users who actively engage by making purchases or playing content.
- Go to + > Derived Column > Conditional on aggregate.
- Enter a name and description.
- Choose the Users dataset in the Column of field.
- Choose Measure in the Measure or Attribute field.
- Select avg from the Default Aggregation drop-down list.
- Use the If/Then/Else block to categorize high-value users. Set the following conditions: if event_type is Play Content or Make Purchase. Name this block High Value Users.
- Add an Aggregate block. Choose count as the aggregator and use the If/Then/Else block as the value.
- Click Run.
- Click Save.
Previous/next value
Look up the value of each actor's previous or next event, optionally including only certain events. For example, you can find the previous value of the last campaign type in the product events time series.
- Go to + > Derived Column > Previous/Next Value.
- Enter a name and description.
- Choose the Product Events dataset in the Column of field.
- Choose Attribute in the Measure or Attribute field.
- Keep the Default Aggregation drop-down list unselected.
- Choose previous from the drop-down list and choose Product Events as the value in the Previous Value Lookup block.
- Choose Type (a custom derived column) as the value in the output specific property field.
- Add for each conditions if necessary. For example, add the Users dataset.
- Click Run.
- Click Save.
Time since
Calculate how long ago a timestamp property was. For example, you can calculate the time since the last signup.
- Go to + > Derived Column > Time Since.
- Enter a name and description.
- Choose the Users dataset in the Column of field.
- Choose Measure in the Measure or Attribute field.
- Select avg from the Default Aggregation drop-down list.
- Choose the Sign Up Date derived column using the drop-down list in the Timestamp block.
-
Enter the logic in the formula block:
now - Timestamp - Click Run.
- Click Save.
View derived columns
Derived columns filter
- Click Home.
- Select the Derived Columns checkbox.
- Use the search bar to find individual derived columns by name.
You can find a paginated listing of derived columns in the system, pooled across the datasets. Each entry contains the following information:
- Name (with description, if any).
- Created by.
- Updated by.
- Updated at.
- Type (refers to the template used when defining the cohort).
Dataset page
Alternatively, click Derived Columns inside the dataset page to access the derived columns for a given dataset.
Edit derived columns
- Click Home.
- Select the Derived Columns checkbox.
- Identify the derived column you want to edit and click the Name.
Alternatively, click Derived Columns inside the dataset page to access the derived columns for a given dataset and click the Name.
Delete derived columns
- Click Home.
- Select the Derived Columns checkbox.
- Identify the derived column you want to delete and click More (⋮) > Delete.
- Click Yes to confirm.
Alternatively, click the Name to open the derived column and click More (⋮) > Delete.
You can also go to Data > Datasets and click the Name to open the dataset. Click Derived Column > More (⋮) > Delete.
Please sign in to leave a comment.