Primary metrics, secondary metrics, and monitoring goals in Optimizely Experimentation

  • Updated
  • Optimizely Web Experimentation
  • Optimizely Performance Edge
  • Optimizely Feature Experimentation
  • Optimizely Full Stack (Legacy)
This article is part of The Optimization Methodology series.

The right metrics can validate or disprove your hypothesis and help you progress toward your business goals.

In experience optimization, metrics have different roles depending on where you set them and what you want to know:

  • Primary metric – Determines whether the test wins or loses. It tracks how your changes affect visitor behavior.
  • Secondary metrics – Provide additional information about visitor behavior related to your change and across your site.
  • Monitoring goals – Are goals and events that are not your primary or secondary metrics. Monitoring goals do not affect the speed of primary or secondary metrics.

Here are a few tips for setting goals and events:

  • Focus on a direct visitor action that is on the same page as the changes you made.

  • Consider how your changes affect other parts of your site. Set goals and events to measure potential interaction to see if your test moves customers in the right direction.

  • Place different goals at different points in your funnel to gather timely data about visitor behavior.

See the Differences between events and metrics for more information.

Primary metric

See how to set up a primary metric in Optimizely Web Experimentation. Create events in Optimizely Feature Experimentation and track them in your experiments.

Optimizely Experimentation lets you set a primary metric for each experiment to determine its success. It is the most important goal of the experiment and proves or disproves your hypothesis.

The secondary metrics and monitoring goals do not affect how fast the primary metric reaches statistical significance. Stats Engine treats the primary metric separately because of its importance.

However, the more goals and variations you include in an experiment, the longer each takes to reach significance, because Stats Engine corrects for false discovery rate to help you make better business decisions. This is why you should distinguish a primary metric from secondary metrics and monitoring goals.

Use these questions to choose a primary metric:

  • What visitor action indicates that this variation is a success? – Measure the action visitors take as a result of this test.
  • Does this event directly measure the behavior you are trying to influence? – Use an action directly affected by your change to decide whether your test helped.
  • Does the event fully capture the behavior you are trying to influence? – Consider whether your primary metric captures the behavior you are trying to influence.

Many optimization teams automatically track revenue per visitor as the primary metric. Top-level metrics like revenue and conversion rate are important, but the events involved are often far away from the changes made, and your test may take a long time to reach statistical significance or end up inconclusive.

For example, you test the design and placement of an Add-to-Cart button. Your business cares about revenue, measured five pages down the funnel. You may devote a large amount of traffic to this test and risk an inconclusive result.

You decide to measure clicks on the Add-to-Cart button on product pages instead. This is a primary metric directly affected by the changes you make. With a goal tree, you know this metric is directly related to company goals.

If you are testing bulk discounts on your site, your primary metric might be conversion rate or average order value (AOV). Neither metric fully accounts for the behavior you are trying to affect.

The conversion rate could rise as customers are incentivized or decrease as customers wait to create large, discounted orders. AOV could rise as customers buy more in bulk or decrease as discounts replace full-price orders.

From this perspective, revenue-per-visitor is the best metric. It equals the conversion rate (how often customers purchase) multiplied by the AOV (how much they spend). It is the best overarching goal in this test, where smaller goals may provide conflicting information.
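The relationship above can be sketched directly. The numbers below are hypothetical, chosen only to show how RPV resolves conflicting signals from conversion rate and AOV:

```python
def revenue_per_visitor(conversion_rate, average_order_value):
    """RPV = how often customers purchase x how much they spend."""
    return conversion_rate * average_order_value

# Hypothetical bulk-discount test: conversion rate dips slightly while
# average order value rises as customers buy in bulk.
baseline = revenue_per_visitor(0.050, 40.00)   # original experience
variation = revenue_per_visitor(0.045, 48.00)  # bulk-discount variation

# RPV rises despite the lower conversion rate, so the overarching
# metric declares the variation a net win.
print(baseline, variation)
```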

Secondary metrics

Secondary metrics help you gather insights for long-term success. Optimizely Experimentation lets you rank up to four secondary metrics, in positions two through five.

Secondary metrics track long-distance events and more ambitious metrics, such as end-of-funnel events like order value and order confirmation. These events provide valuable information but would reach significance more slowly if used as the primary metric.

Secondary metrics also give you visibility across the different funnel steps. For example, if you change your product page and display shipping costs, your secondary metric might measure the change in drop-offs from the shipping page in your funnel. Use secondary metrics to learn when visitors drop off or return to the home page and how these patterns compare between the original and variations.

Here is a list of common secondary metrics and the reason for tracking:

  • Searches submitted – See how many searches are submitted.
  • Category pageview – Discover whether visitors go to the site through category pages.
  • Subcategory pageview – Learn whether visitors reach subcategory pages.
  • Product pageview – Know the percentage of visitors who do or do not view a product during a visit.
  • Add-to-cart – Understand what percentage of visitors add-to-cart per test, category, or product type. 
  • Shopping cart pageview – See how many visitors progress to the shopping cart.
  • Checkout pageview – Understand how many visitors continue from the shopping cart to checkout. 
  • Payment pageview – Learn what percentage of visitors continue from checkout to payment.
  • Conversion rate – Know what percentage of visitors ultimately convert or complete payment.

If you use Optimizely Web Experimentation to test on a checkout page, you may need to configure your site for PCI compliance.

Estimate time to statistical significance for multiple secondary metrics

In Optimizely Experimentation's Sample Size Calculator, fill out your baseline conversion rate and minimum detectable effect (MDE). For the statistical significance threshold, enter 100 - (100 - S)/N, where S is your desired threshold (the default is 90) and N is the number of metrics multiplied by variations other than baseline.

For example, if you run an experiment with 2 metrics and 2 variations plus a baseline at 90 significance, estimate the visitors needed as if your secondary metric had to reach 100 - (100 - 90)/(2*2) = 97.5 significance with 1 goal and 1 variation.

This is a higher threshold than the default, which means the secondary metric requires more visitors and likely reaches significance later than the primary metric.
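The calculation above can be sketched as a small illustration (this is not part of the Optimizely product, just the arithmetic from the formula):

```python
def adjusted_threshold(desired_threshold=90, metrics=2, variations=2):
    """Significance threshold to enter in the Sample Size Calculator when
    estimating time to significance for non-primary metrics.
    N = number of metrics multiplied by variations other than baseline."""
    n = metrics * variations
    return 100 - (100 - desired_threshold) / n

# 2 metrics and 2 variations at a desired threshold of 90:
print(adjusted_threshold(90, 2, 2))  # 97.5
```

With a single goal and a single variation, the formula collapses back to the desired threshold itself, which matches the default behavior.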

Monitoring goals

Monitoring goals are goals and events that are not your primary or secondary metrics. They help you gather insights that are key to long-term success but are diagnostic. They do not impact the speed of primary or secondary metrics.

Monitoring goals track whether your experiment moves visitors in the right direction. Your change might create adverse effects in another metric. Monitoring goals form a warning system that alerts you when you hurt another revenue path.

For example, you show visitors more products on the product category page. With your primary metric, you find that people view more products. Here are some other questions monitoring goals can help answer:

  • Are people more price-conservative when presented with more products? (Average order value)
  • Are people buying more products? (Conversion rate)
  • Are people frustrated and unable to find what they are looking for? (Subcategory filters)

Here are common monitoring goals and their reason for tracking:

  • Search bar opened – Learn what percentage of search bar interactions do not lead to submissions.
  • Top menu Clickthrough rate (CTR) – Discover how often visitors use the top menu per page or step in the funnel.
  • Home page CTR – See how often visitors exit to the Home page from any given page. 
  • Category page filter usage – Understand the frequency of filter usage.
  • Product page quantity selection – Understand the percentage of visitors interacting with quantity selection.
  • Product page more info – Understand how many visitors seek information about a product.
  • Product page tabs – Discover how often visitors interact with each tab.
  • Payment type chosen – See which payment type users prefer per experiment.
  • Return/back button CTR – Learn how often visitors exit a page with a specific button.

Stats Engine approach to metrics and goals

When you run an experiment with many variations and metrics, you increase the chance of false positive results: your test data may show a significant difference between your original and your variation that is just noise. This makes it harder to declare winners.

Stats Engine uses false discovery rate control to reduce your chance of making an incorrect business decision or implementing a false positive among conclusive results. As a result, Stats Engine becomes more conservative when you add more metrics to an experiment.

Here is how Stats Engine ranks primary and secondary metrics and monitoring goals:

  1. The primary metric, independent of other metrics

  2. Secondary metrics as an independent group of up to four metrics (metrics #2-5)

  3. All monitoring goals together as one group (metrics #6+)
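Optimizely does not publish Stats Engine's exact procedure in this article, but the effect of false discovery rate control can be illustrated with the classical Benjamini-Hochberg procedure. This is an assumption for illustration only; Stats Engine's sequential method differs in detail, yet shows the same behavior of becoming more conservative as metrics are added:

```python
def benjamini_hochberg(p_values, alpha=0.10):
    """Indices of metrics declared significant under classical
    Benjamini-Hochberg false discovery rate control."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value clears the stepped
    # threshold (k / m) * alpha; reject everything up to that rank.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    return {idx for rank, idx in enumerate(order, start=1) if rank <= max_k}

# One metric with p = 0.04 is a winner at the 0.10 level...
print(benjamini_hochberg([0.04]))            # {0}
# ...but the same p-value alongside two noisy metrics no longer clears the bar.
print(benjamini_hochberg([0.04, 0.5, 0.8]))  # set()
```

This is why grouping matters: because the primary metric is evaluated independently of the other groups, adding monitoring goals does not raise the bar your primary metric must clear.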

Revenue goals and skew correction with Stats Engine

Stats Engine works the same for revenue-per-visitor goals as for other goals. You can look at your results and get an accurate assessment of your error rates on winners and losers and confidence intervals on the average revenue per visitor (RPV).

However, when interpreting your results for RPV goals, you should be aware of some differences. Testing for a difference in average revenue between a variation and baseline is more challenging because revenue distributions are often heavily skewed. This skew reduces the detection power of many techniques, including t-tests and Stats Engine: they have less power to detect differences in average revenue when those differences exist.

Optimizely Experimentation's Stats Engine regains some of this lost power through skew correction. Skew corrections were designed to work well with other aspects of Stats Engine, so the underlying skewness of the distributions is correctly factored into the shape of the confidence interval. As a result, detecting differences in average revenue becomes feasible at the visitor counts Optimizely Experimentation regularly sees in A/B tests.

Strategies for metrics

Use these strategies to help you decide what metrics to use for your experiments.

Consider speed and impact

In a funnel, the most immediate effects are directly downstream from your changes. When an event is closer to the change, the measurable impact is bigger. As you move downstream, the signal lessens as visitors from different paths and motivations enter the stream. The effect may not be measurable at the end of the funnel.

Metrics with lower conversion rates require more visitors to reach statistical significance. Events further from the changed page show smaller improvements in conversion rate as visitors enter from different paths, leave the site before they convert, and so on, so your test will take longer to reach significance.
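A common rule-of-thumb sample-size approximation makes this concrete. This is a textbook two-proportion formula for illustration, not Stats Engine's sequential method, and the rates below are hypothetical:

```python
def approx_visitors_per_variation(baseline_rate, relative_mde):
    """Rule-of-thumb sample size for a two-proportion test at roughly
    95% significance and 80% power: n = 16 * p(1 - p) / (p * MDE)^2."""
    absolute_effect = baseline_rate * relative_mde
    return 16 * baseline_rate * (1 - baseline_rate) / absolute_effect ** 2

# Detecting a 10% relative lift on a 20% on-page click rate...
print(round(approx_visitors_per_variation(0.20, 0.10)))  # 6400
# ...versus on a 2% end-of-funnel purchase rate.
print(round(approx_visitors_per_variation(0.02, 0.10)))  # 78400
```

The same relative lift needs roughly twelve times the traffic when measured on the rarer end-of-funnel event, which is why on-page metrics reach significance faster.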

Consider setting a primary metric on the same page as your change. This captures the impact of your change immediately, so you can find a winning variation faster. Quick wins help generate credibility and interest in your testing program and provide fast, reliable insights about visitor behavior. You can build a data-rich testing program and iterate on its insights.

Revenue and conversion rate make excellent secondary metrics and help keep your program focused on long-term success.

Choose high-signal goals

Optimizely Experimentation's Stats Engine reacts to the number of goals or events and variations in your experiment to align statistical significance with your risk in making business decisions from experiments. 

Adding more goals or events and variations to your experiment increases your chance of implementing a falsely significant result with traditional statistics, which Stats Engine corrects with false discovery rate control. However, not all goals and events are equal. High-signal goals and events are less likely to contribute to false discoveries because they are often less noisy.

Similarly, with Stats Engine, the more high-signal goals in your experiment, the faster your secondary metrics and monitoring goals reach significance. If speed to significance concerns your organization, consider limiting the number of non-primary metrics in your experiment and focusing on goals or events related to your variations.