Derived columns

  • Updated

A derived column is a computed property linked to a dataset. This property is not present in the source data. Instead, you define it within Analytics. You can include various calculations in a derived column, such as joins with other datasets, filters, and aggregations, as long as the result applies to each record in the target dataset.

For example, if you have a users dataset and an events dataset with a boolean column named vote, you can define a derived column in the users dataset that counts the total number of true votes per user.

The configuration for a derived column is saved within Analytics and accessed through the Catalog. Like other columns in the dataset, you can use a derived column anywhere the catalog is accessible.

A derived column has the following configuration:

  • Dataset – Associate the derived column with exactly one dataset, which is selected when defining it. It displays in the Catalog like any other column, and you can use it for any analysis with catalog access. You cannot move a derived column to another dataset.
  • Name – Ensure the derived column's name is unique across columns (including derived columns and cohorts) within its dataset.
  • Definition – Specify the computation for a derived column using the block editor.
  • Default aggregation – Set a default aggregation type, such as count or sum, for contexts where you use the derived column as a measure.

Create a derived column

  1. Click + and select Derived Column.
  2. Choose one of the templates to get started.
    • Aggregation/First/Last – Create an aggregate quantity for each actor.
    • Formula – Write custom logic using the NetScript formula language.
    • Conditional on Property – Assign values based on logical conditions.
    • Custom – Calculate values by composing multiple blocks.
    • Conditional on Aggregate – Calculate an aggregated quantity per actor and assign each actor to a bucket based on the aggregate value.
    • Previous/Next Value – Look up the value of the temporally previous or next event for each actor, optionally filtering events.
    • Time Since – Calculate how long ago a timestamp property is.
  3. (Optional) Click Unnamed Derived Column, and enter a name and description.
  4. Select a dataset for the derived column in the Column of field.

  5. Decide if you want the derived column to be a measure or attribute.
  6. Set the Default Aggregation.
  7. Use blocks to define the computation for your derived column. Ensure that the resulting value is at the grain of the selected dataset or can be grouped by the selected dataset.
  8. Click Run to preview the computed values.
  9. (Optional) Choose a different view for your visualization from the chart drop-down list.
  10. Click Save.

The derived column can now be used anywhere, such as inside exploration templates and the block editor.

Derived column

Types of derived columns

Analytics offers several templates to help you get started. Choosing a template is the first step in creating a derived column, and you can change it at any time without losing your work.

Aggregation/first/last

This template lets you calculate an aggregate quantity for each actor. For example, you can create a derived column to calculate the signup date.

  1. Go to + > Derived Column > Aggregation/First/Last.
  2. (Optional) Click Unnamed Derived Column, and enter a name and description.
  3. Choose a dataset in the Column of field. For this example, select the Users dataset.
  4. Decide if this is a measure or an attribute. Choose Measure for this example.
  5. Select the default aggregation from the drop-down list. Choose avg.
  6. Select the aggregate block's Aggregator and a value from the drop-down list. For the example, select min as the aggregator and set event_ts as the value. Group the results by Users.
  7. Click Run.
  8. Click Save.

Aggregation/First/Last

Conditional on property

This template lets you assign values based on logical conditions. For example, you can create a derived column to determine whether users like or dislike the website's content.

  1. Go to + > Derived Column > Conditional on Property.
  2. (Optional) Click Unnamed Derived Column, and enter a name and description.
  3. Choose the Content dataset in the Column of field.
  4. Select avg from the Default Aggregation drop-down list.
  5. Configure the blocks using an If/Then/Else block. Set the condition as follows: if vote_avg >= 7.5, assign the value Thumbs Up; otherwise, assign Thumbs Down.
  6. Click Run.
  7. Click Save.

Conditional on property

Formula

This template lets you write custom logic using the NetScript formula language.

  1. Go to + > Derived Column > Formula.
  2. (Optional) Click Unnamed Derived Column, and enter a name and description.
  3. Choose the Campaigns dataset in the Column of field.
  4. Choose Attribute in the Measure or Attribute field.
  5. Keep the Default Aggregation drop-down list unselected.
  6. Enter the following formula to calculate Numeric ID:
    id = Data.Campaigns.id;
    if not(isnull(id) or strlen(id) == 0) then to_int(id) end
  7. Add a bin block to categorize types using Numeric ID as the value. Create the following three bins:

    • Bin 1 – Social Media, where x < 151.
    • Bin 2 – Organic Search, where 151 <= x < 501.
    • Bin 3 – Paid Search, where x >= 501.
  8. Enable the output toggle for this block.
  9. Click Run.
  10. Click Save.

Formula

Custom

This template lets you calculate values by composing multiple blocks. For example, you can create a derived column to determine the incremental campaign revenue.

  1. Go to + > Derived Column > Custom.
  2. (Optional) Click Unnamed Derived Column, and enter a name and description.
  3. Choose the Product Events dataset in the Column of field.
  4. Choose Measure in the Measure or Attribute field.
  5. Select avg from the Default Aggregation drop-down list.
  6. Add an aggregate block to calculate the number of Ads Viewed. Choose count as the aggregator and select the Product Events dataset. Group the result by Campaigns.
  7. Add a formula block to calculate campaign revenue. Use the following formula:
    "Ads Viewed" * Data.Campaigns.cpm / 1000
  8. Click Run.
  9. Click Save.

Custom

Conditional on aggregate

This template lets you calculate an aggregate quantity for each actor and categorize them into buckets based on that aggregate value. For example, you can identify high-value app users who actively engage by making purchases or playing content.

  1. Go to + > Derived Column > Conditional on aggregate.
  2. (Optional) Click Unnamed Derived Column, and enter a name and description.
  3. Choose the Users dataset in the Column of field.
  4. Choose Measure in the Measure or Attribute field.
  5. Select avg from the Default Aggregation drop-down list.
  6. Use the If/Then/Else block to categorize high-value users. Set the following conditions: if event_type is Play Content or Make Purchase. Name this block High Value Users.
  7. Add an Aggregate block. Choose count as the aggregator and use the If/Then/Else block as the value.
  8. Click Run.
  9. Click Save.

condtional on aggregate

Previous/next value

This template lets you look up the value of each actor's previous or next event, optionally including only certain events. For example, you can find the previous value of the last campaign type in the product events time series.

  1. Go to + > Derived Column > Previous/Next Value.
  2. (Optional) Click Unnamed Derived Column, and enter a name and description.
  3. Choose the Product Events dataset in the Column of field.
  4. Choose Attribute in the Measure or Attribute field.
  5. Keep the Default Aggregation drop-down list unselected.
  6. Choose previous from the drop-down list and choose Product Events as the value in the Previous Value Lookup block.
  7. Choose Type (a custom derived column) as the value in the output specific property field.
  8. You can also add for each conditions if necessary. For example, you can add the Users dataset.
  9. Click Run.
  10. Click Save.

Prev/Next

Time since

This template lets you calculate how long ago a timestamp property is. For example, you can calculate the time since the last signup.

  1. Go to + > Derived Column > Time Since.
  2. (Optional) Click Unnamed Derived Column, and enter a name and description.
  3. Choose the Users dataset in the Column of field.
  4. Choose Measure in the Measure or Attribute field.
  5. Select avg from the Default Aggregation drop-down list.
  6. Choose the Sign Up Date derived column using the drop-down list in the Timestamp block.
  7. Enter the logic in the formula block:
    now - Timestamp
  8. Click Run.
  9. Click Save.

Time since

View derived columns

Derived columns filter

  1. Click Home.
  2. Select the Derived Columns checkbox.
  3. Use the search bar to find individual derived columns by name.

You can find a paginated listing of derived columns in the system, pooled across the datasets. Each entry contains the following information:

  1. Name (with description, if any).
  2. Created by.
  3. Updated by.
  4. Updated at.
  5. Type (refers to the template used when defining the cohort).

Derived column list

Dataset page

Alternatively, click Derived Columns inside the dataset page to access the derived columns for a given dataset.

Derived column for dataset