- Optimizely Web Experimentation
- Optimizely Feature Experimentation
The new A/B results page builds on the capabilities of the classic results page and introduces advanced analytical features — including CUPED, configurable stats engines, cohort segmentation, and deeper data health checks — to give Web Experimentation and Feature Experimentation customers a more powerful way to measure and interpret A/B test performance.
Review each metric and variation to see how visitors respond to your site or application changes. Use segments to investigate visitor behavior in more detail.
Optimizely Experimentation's Stats Engine powers the experiment results page. It provides a data-rich view of visitor interactions and variation performance, includes confidence intervals, and applies false discovery rate control based on the selected statistical method, either sequential or fixed-horizon (frequentist).
The following details are important to know:
- Visitor bucketing – Visitors are randomly bucketed into variations based on your traffic distribution settings, so variation audiences may differ in size.
- Local time and data freshness – Results display in your computer’s time zone and typically become available within five to ten minutes of data ingestion. See Data freshness.
Access experiment results
To access the Optimizely results page:
- Optimizely Web Experimentation – Go to Optimizations > Overview > Results > Real-time Results (New Version).
- Optimizely Feature Experimentation – Go to Reports and select a report, or go to Flags and select a flag to display the results for a specific rule within the flag.
The results page has a Summary tab and an Explore tab. Learn how to access Experiments.
The Summary tab presents key insights and an overview of how each variation in the selected experiment performs against your metrics, supporting decision-making.
The Explore tab has two functions:
- It lets you explore data and get deeper insights using features like segmentation and filtering.
- It lets you create a compelling narrative for experiment results by reusing previously created explorations and adding explanatory comments.
Summary
The Summary tab provides an overview of the experiment information configured during experiment setup in Optimizely, including the following:
- Name – Shows the name of the experiment.
- Status – Shows the experiment type (A/B test) and the current live status. See the difference between publish, start, and pause.
- Visitors – Shows a high-level snapshot of total traffic exposure and conversion volume across all variations. See Target visitors with audience conditions.
- Project – Shows the Optimizely project this experiment belongs to.
- Environment (Feature Experimentation only) – Shows the environment the experiment is running in.
- Audiences – Lists the audiences targeted in the A/B test.
- Experiment health and Sample Ratio Mismatch (SRM) detection – Indicates experiment health. Unlike the legacy results page, SRM detection is not triggered automatically; you must trigger it manually to see the result. See automatic experiment health indicator.
- Results last updated and last event – Shows when results were last updated and the timestamp of the most recent event based on your computer's time zone.
- Days running – Shows the total number of full days the A/B test has run. Optimizely Experimentation truncates the fractional part of a day; for example, 17.8 displays as 17.
Under Advanced Settings, you have four configurable controls:
- Baseline – Compares the results for all variations against the variation you choose here. You can change this to any variation using the drop-down list.
- CUPED – Reduces metric variance using pre-experiment data to help reach significance faster.
- Stats engine – Shows the stats engine the experiment uses. Optimizely's sequential Stats Engine (the default) lets you peek at results at any time without inflating false positive rates. You configure the stats engine when you set up your experiment; the options are Sequential (default), Frequentist (Fixed Horizon), and Bayesian. The stats engine that you select displays on the results page.
- Statistical significance threshold – Sets the significance threshold against which results are evaluated. You configure the threshold when you set up your experiment. Results are considered statistically significant when they reach or exceed this level (see the sketch after this list for an illustration).
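To make the threshold concrete, the following is a minimal sketch of a fixed-horizon (frequentist) significance check using a two-proportion z-test. It is an illustration only, not Stats Engine's sequential methodology; the function name and the "significance as 1 minus p-value" convention are assumptions for the example.

```python
# Illustrative fixed-horizon check; Optimizely's sequential Stats Engine
# uses its own methodology, so treat this as a conceptual sketch only.
from statsmodels.stats.proportion import proportions_ztest

def meets_threshold(conv_base, n_base, conv_var, n_var, threshold=0.95):
    """Report significance as 1 - p-value and compare it to the threshold."""
    _, p_value = proportions_ztest([conv_base, conv_var], [n_base, n_var])
    return (1 - p_value) >= threshold

# Example: 480/10,000 baseline conversions vs. 560/10,000 in the variation.
print(meets_threshold(480, 10_000, 560, 10_000))
```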
Edit experiment
To change experiment settings or view the experiment within Optimizely, click the external link icon to return to your experiment setup.
Manage metrics
The new results page lets you sync metrics from the experiment setup or add new ones. Unlike the classic results page, you can add decision-making or guardrail metrics, and remove or edit previously added metrics, directly within the results page.
Each metric has the following advanced options:
Add conversion window
Optimizely Experimentation calculates results by linking decision events (when a user is bucketed into a variation) with conversion events (actions the user takes, such as clicks or purchases). By default, all conversions that occur after the decision event are attributed to that variation, no matter how many days later they occur, as long as the experiment is still running.
However, with the conversion window feature, you can customize how long conversions are counted after a user is assigned to a variation (is bucketed).
For example, when creating a metric, you might define a window like the following: Only count conversions that occur within 1 day of the user being bucketed into the experiment.
This allows tighter control over what behavior is a valid conversion, focusing your analysis on the experiment's immediate impact rather than long-tail effects.
This is especially useful for actions expected to happen quickly (for instance, form submissions, clicks, and purchases) and gives you more flexibility in interpreting experiment performance.
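As a rough sketch of this attribution rule, the following hypothetical function counts only the conversions that fall within the configured window after the decision event (the names here are illustrative, not Optimizely internals):

```python
from datetime import datetime, timedelta

# Hypothetical sketch: attribute only conversions that occur within the
# configured window after the visitor's decision (bucketing) event.
def conversions_in_window(decision_time, conversion_times, window=timedelta(days=1)):
    return [t for t in conversion_times
            if decision_time <= t <= decision_time + window]

bucketed = datetime(2024, 5, 1, 9, 0)
events = [datetime(2024, 5, 1, 10, 30),   # inside the 1-day window
          datetime(2024, 5, 3, 8, 0)]     # outside, so not attributed
print(len(conversions_in_window(bucketed, events)))  # 1
```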
Add CUPED duration
This option changes the period of data that CUPED uses. By default, CUPED uses two weeks of historical data; however, you can change this to a custom period.
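For context, CUPED applies the standard covariate-adjustment formula, using each user's pre-experiment value of the metric as the covariate. A minimal sketch, with illustrative variable names:

```python
import numpy as np

def cuped_adjust(y, x):
    """Return variance-reduced values y - theta * (x - mean(x)), where y is
    the in-experiment metric and x is the pre-experiment covariate."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    theta = np.cov(x, y, bias=True)[0, 1] / np.var(x)  # cov(x, y) / var(x)
    return y - theta * (x - x.mean())

# Users whose pre-period behavior predicts their in-experiment behavior
# end up with less spread after adjustment, so significance arrives sooner.
adjusted = cuped_adjust(y=[3, 8, 5, 12], x=[2, 7, 4, 11])
print(adjusted.var() <= np.var([3, 8, 5, 12]))  # True: variance is reduced
```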
Outlier management
The scorecard presents metric results for your experiments. Although metrics display together in the scorecard, each metric is treated as an independent entity behind the scenes. In addition, you can apply variance reduction techniques to these metrics to enhance result reliability.
Outlier management helps improve the reliability and clarity of your metrics by adjusting extreme or anomalous values that could skew results and reduce metric variance. This is particularly useful for conversion metrics calculated as ratios (for example, total clicks per user or total purchase value per user).
The following are the two types of outlier management:
- Percentile – Uses the Winsorization method, which supports robust data analysis by adjusting outlier values in a dataset. All metric values from users are collected and represented as a range of values. A user-specified percentile, such as the 99th percentile, is then calculated to determine the range that covers the most common values (99% of the data). Values exceeding this range are adjusted down to the specified percentile value, so extreme outliers do not skew the analysis while the integrity of the dataset is maintained. See the sketch after this list.
- Constant – Uses the metric capping method, where extreme values are replaced with values that are more common for the observed distribution. This lets you limit metric values using user-defined constant thresholds rather than percentiles (as in Winsorization). It is useful when you already know the acceptable range for your data and want to force all values to stay within a fixed minimum or maximum. Setting the upper bound replaces all values greater than this constant with the upper cap.
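A minimal sketch of the percentile (Winsorization) approach, assuming a simple upper-tail cap (the function name and one-sided treatment are illustrative):

```python
import numpy as np

def winsorize_upper(values, percentile=99):
    """Pull values above the chosen percentile down to that percentile."""
    values = np.asarray(values, dtype=float)
    cap = np.percentile(values, percentile)
    return np.minimum(values, cap)

# The extreme 400 is pulled down toward the bulk of the distribution.
print(winsorize_upper([5, 8, 12, 9, 400]))
```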
For outlier management methods (percentile and constant), you can choose to apply smoothing at the Users dataset level or the Product Events dataset level.
- Users level – Smooths outliers at the users dataset level.
- Product Events level – Smooths outliers at the product events dataset level.
Consider the following example with a constant outlier threshold of 500 USD:
Kate and Josh are shopping on an ecommerce website and make the following transactions:
- Kate – 200 USD
- Kate – 600 USD
- Josh – 800 USD
User-level smoothing:
- Kate's total = 200 + 600 = 800 → smoothed to 500
- Josh's total = 800 → smoothed to 500
- Total purchase = 500 + 500 = 1,000 USD

Product Event-level smoothing:
- Kate's purchases = 200 (unchanged), 600 → smoothed to 500
- Josh's purchase = 800 → smoothed to 500
- Total purchase = 200 + 500 + 500 = 1,200 USD
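The arithmetic above can be reproduced in a few lines; this is only a sketch of the two capping levels, not Optimizely's implementation:

```python
CAP = 500  # constant upper bound in USD

purchases = [("Kate", 200), ("Kate", 600), ("Josh", 800)]

# Product Event-level: cap each transaction, then sum.
event_level = sum(min(amount, CAP) for _, amount in purchases)

# User-level: sum each user's transactions first, then cap the totals.
totals = {}
for user, amount in purchases:
    totals[user] = totals.get(user, 0) + amount
user_level = sum(min(total, CAP) for total in totals.values())

print(user_level, event_level)  # 1000 1200, matching the example above
```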
Guardrail alerts
The Set alert option lets you set thresholds on key experiment metrics and receive alerts when these thresholds are exceeded. It helps you detect negative impacts early, enabling informed decisions about whether to continue, halt, or adjust an experiment. There are two types of alert notifications: email and Slack. Alert checks run every six hours during the first 15 days of the experiment, then once per day until day 30, after which no further checks are performed.
Enable guardrails
To set alerts, you must enable the Guardrails feature under Settings > General Settings > Optimizely Integration.
To add alerts:
- Click More (⋮) > Set alert.
- Set the threshold in the Alert when threshold is breached field. This triggers an alert when outcomes surpass or fall below the specified threshold relative to the baseline. Define a percentage change (positive or negative); variations are assessed for the selected metric using relative improvement. For example, a change from 10% to 11% represents a 10% relative improvement. (See the sketch after these steps.)
- Add a visitor count in the Alert only if users amount is more than field. This triggers an alert after checking the difference between variations and the baseline for a metric, and only after the visitor count specified in this field is reached. When the visitor count is low, the metric is volatile and can fluctuate significantly; a higher number of visitors ensures that the metric value is more stable and closer to its true value. For example, you may require at least 10,000 users before an alert is sent, even if the threshold is breached earlier.
- Choose a list of users in the Notify field.
- Check the Alert only when statistical significance is reached option so that alerts trigger only if the threshold was breached by a variation that reached statistical significance, reducing noise from early or incomplete data.
- Click Save.
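The following is a hedged sketch of how these trigger conditions combine; the function and argument names are hypothetical and only mirror the settings described above:

```python
def should_alert(baseline, variation, visitors, *,
                 threshold_pct=-10.0, min_visitors=10_000,
                 is_stat_sig=False, require_stat_sig=False):
    """Alert when the relative change breaches the threshold, the minimum
    visitor count is reached, and (optionally) the result is significant."""
    relative_change = (variation - baseline) / baseline * 100
    if threshold_pct < 0:
        breached = relative_change <= threshold_pct
    else:
        breached = relative_change >= threshold_pct
    if visitors < min_visitors:
        return False
    if require_stat_sig and not is_stat_sig:
        return False
    return breached

# A conversion-rate drop from 10% to 8.5% is a -15% relative change,
# so with 12,000 visitors this breaches a -10% guardrail threshold.
print(should_alert(baseline=0.10, variation=0.085, visitors=12_000))  # True
```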
Types of alerts
Two types of alerts can trigger when experiment outcomes exceed the threshold: Slack alerts and email alerts.
Slack alerts
- Add the Optimizely app to Slack.
- Click Login to Experimentation. After you log in, the following Slack commands let you receive notifications in different channels:
  - /subscribe – Subscribe to a particular project in a channel.
  - /unsubscribe – Unsubscribe from a particular project.
  - /unsubscribe-all – Unsubscribe from all project notifications within the channel.
  - /show-subscribed-projects – View all experimentation projects subscribed to the channel.
  - /optimizely-help – Trigger the help prompt containing help guidelines.
- Open a channel of your choice and invite the Optimizely app. Type @Optimizely and hit Send.
- Type /subscribe. When you subscribe to a project, all guardrail alerts set in any experiments of that project display in the channel automatically.
- Click the Select a project drop-down list that displays the available projects, and select a project to receive alerts.
The following is an example of a Slack alert:
Email alerts
Email alerts are triggered for the users you list in the Notify field. For email notifications, you can add existing Optimizely Analytics users.
The following is an example of an email alert:
Sample Ratio Mismatch (SRM) detection
A Sample Ratio Mismatch (SRM) occurs when the distribution of users across your experiment's variations becomes unexpectedly imbalanced. This can signal potential issues with your experiment configuration or external factors, which may impact the validity of your results. Learn more about Sample Ratio Mismatch (SRM) detection.
Click Check SRM status in the Experiments section to view the latest health status of your experiment's traffic distribution.
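A standard way to test for SRM is a chi-square goodness-of-fit test against the configured traffic split. The sketch below uses that textbook approach for illustration; Optimizely's own detection may differ in detail:

```python
from scipy.stats import chisquare

observed = [5_090, 4_910]        # visitors actually bucketed per variation
split = [0.5, 0.5]               # configured traffic distribution
expected = [r * sum(observed) for r in split]

stat, p_value = chisquare(observed, f_exp=expected)
# Very small p-values indicate the observed split is unlikely under the
# configured ratios; p < 0.001 is a commonly used SRM alarm level.
print(f"p = {p_value:.4f}, possible SRM: {p_value < 0.001}")
```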
Health check
The Check health option focuses on verifying the integrity of the data used in experiments to ensure accurate results and reliable decision-making. It checks for the following:
- Dataset primary key uniqueness – Verifies the uniqueness and integrity of your actor identifiers by running a primary key check on the actor dataset. Learn more about the primary key health check.
- Actor identifier alignment – Assesses the alignment of actor identifiers across your event dataset, decision dataset, and actor dataset. Significant misalignment suggests an incorrect column selection during experiment configuration or a broader data integrity issue that needs to be addressed.
- Single variation per actor – Evaluates how many actors were assigned multiple (conflicting) variations within an experiment. It excludes such actors from the experiment analysis. A significant number of these conflicts often indicates a misconfiguration of the experiment.
There are four different health statuses possible for each check. Depending on the check, you may need to adjust different data configurations to ensure accurate results:
- Healthy – Indicates that the data integrity for the applicable check is good.
- Unhealthy – Indicates a critical data integrity issue was found.
- Warning – Indicates that a potential issue was detected that could impact the accuracy of your results.
- Skipped – Indicates that the health check could not be performed due to an invalid primary key configuration. This occurs when the selected columns are incompatible or misconfigured.
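For illustration, here is a rough pandas sketch of two of these checks (duplicate primary keys and multi-variation actors); the column names are hypothetical, not Optimizely's schema:

```python
import pandas as pd

actors = pd.DataFrame({"actor_id": [1, 2, 2, 3]})
decisions = pd.DataFrame({"actor_id": [1, 1, 2, 3],
                          "variation": ["A", "B", "A", "B"]})

# Dataset primary key uniqueness: duplicated actor_ids in the actor dataset.
duplicate_keys = actors["actor_id"].duplicated().sum()

# Single variation per actor: actors assigned more than one variation.
conflicting = (decisions.groupby("actor_id")["variation"]
               .nunique().gt(1).sum())

print(f"duplicate keys: {duplicate_keys}, conflicting actors: {conflicting}")
# duplicate keys: 1, conflicting actors: 1
```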
Share experiment results
To share the results page with key stakeholders, click the share icon. The Share Results dialog displays the link to the results page. Click Copy Link to copy and send the provided URL, or click Reset Link to reset the URL.
Graphs
Graphs provide a granular view of the data. The following graph types are available:
- Improvement over time (default) – Improvement in this metric for each variation compared to the baseline. See Confidence intervals and improvement intervals.
- Metric over time – Conversions per day in this metric for each variation, including the original.
- Statistical significance over time – Cumulative statistical significance for the variation.
Explore
The Explore tab lets you perform segmentation comparisons, funnel analysis, and other explorations for additional insights.
There are three ways to access the Explore tab:
- Click Explore on the experiment results page.
- Click one of the following exploration options within the Summary tab:
  - Change Date Range – Narrow or expand the results view to a specific time window within the experiment's runtime.
  - Segment by Cohorts – Filter results to a defined group of actors to understand how a specific cohort responds to each variation.
  - Group By Property – Break down results by one or more attributes (for example, device type or browser) to identify performance differences across audience segments.
  - Filter out data – Exclude specific data points or conditions from the results view to focus analysis on relevant traffic.
- Click Explore results data on a metric within the Summary tab to open the Explore tab and dive deeper into that metric's data.
Exploration summary with Optimizely Opal
Prerequisites
- You must use Opti ID to access Opal.
- Your Optimizely Web Experimentation and Feature Experimentation instances must be enabled for Opti ID.
- You must have generative AI enabled in Optimizely.
If you use Opti ID, administrators can turn off generative AI in the Opti ID Admin Center. See Turn generative AI off across Optimizely applications.
Using Optimizely Opal, you can interpret and summarize the data presented in the results module without having to scan through the result tables manually. Click the summarize icon in the visualization window to summarize your exploration.
The chat displays the following information:
- A brief summary of your experiment results
- Key takeaways
- Next steps and suggestions
Segment experiment results
You can segment your results by cohorts and attributes.
- Segment – Click Segment to segment your results by a chosen cohort of actors. Filter results based on a dynamically defined behavioral cohort, such as visitors who performed a specific sequence of actions during the experiment (for example, visitors who did action A and then action B). Click Add to Explore to add it to the result analysis.
- Group By – Refine your results using one or more attributes. Grouping results by attributes creates a results breakdown based on the selected property. For example, you can compare the results for desktop versus mobile users, or for users in different countries. Click Add to Explore to add it to the result analysis.
Add new tiles
Click + Add Tile to customize your visualization window.
- New Visualization – Add a new exploration to the Explore tab.
- Existing Visualization – Select an existing exploration and add it directly to the Explore tab.
- Filter – Add filters that you can use to narrow down data in a visualization.
- Parameter – Modify the value of any placeholder parameters used in the queries of linked visualization tiles.
- Text – Add blocks of text anywhere in the Explore tab to provide additional context.
Visualization options
The visualization module interprets the results of the metrics defined in the Summary tab. It displays a summary first, followed by individual metric interpretations. Select Explore results data on any metric to dive deeper into that metric's data in the Explore tab.
The Summary section provides a high-level view of how each variation performs across all metrics. For each variation, it displays:
- Visitors – The total number of visitors and their percentage share of experiment traffic.
- Metric value – The raw metric value for each variation across all configured metrics.
- Improvement – The relative percentage change compared to the baseline (Original). The Original row displays – because it is the comparison point.
The Primary label identifies which metric the stats engine prioritizes for significance.
Following the summary, each metric is listed individually and numbered by priority. The primary metric is ranked 1 and labeled Primary. For each metric, the following columns are displayed:
- Metric value – The raw computed value of the metric for each variation.
- Unique conversion – The numerator (unique conversions) and denominator (total visitors) used to calculate the metric value.
- Improvement – The relative percentage change over the baseline. Displays – for the Original and a positive or negative percentage for each variation.
- Confidence interval – A visual bar representing the range within which the true improvement is estimated to fall. When the interval sits entirely above or below zero, the result is statistically significant (see the sketch after this list).
- Stat Sig Level – The statistical significance level reached by the variation for that metric.
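A minimal sketch of the Improvement and Confidence interval logic, with illustrative formulas rather than Stats Engine internals:

```python
def improvement(baseline_value, variation_value):
    """Relative percentage change of a variation over the baseline."""
    return (variation_value - baseline_value) / baseline_value * 100

def is_significant(ci_low, ci_high):
    """Significant when the confidence interval excludes zero."""
    return ci_low > 0 or ci_high < 0

print(f"{improvement(0.040, 0.046):.1f}%")  # 15.0% lift over the baseline
print(is_significant(2.1, 27.9))            # True: interval sits above zero
```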
Each metric includes an Analyze Over Time graph with three toggleable views:
- Improvement (Default) – Tracks how the relative lift of each variation evolves over the experiment duration.
- Metric value – Shows how the raw metric value changes over time for each variation.
- Statistical significance – Displays how the significance level trends as data accumulates.
Comments
You can add comments about items in the visualization by clicking the Comment icon, entering your notes, and clicking Send.
To edit a comment, click More (...) > Edit Comment. Make your changes, and click Confirm to save.
To delete a comment, click More (...) > Delete Comment. Click Confirm to delete.
Status
Select Status to update the current state of the experiment using the drop-down list. The available options are:
- Running – Keeps the experiment live and actively collecting data.
- Pause – Temporarily stops the experiment without archiving it.
- Conclude – Ends the experiment and locks the results.
- Archive – Removes the experiment from the active view.
Download results
Click the download icon to export the current results as a CSV file. The export reflects the data currently displayed on the page, including any active filters, segments, or date range selections.
Other options
Click More (⋮) to access two additional actions:
- Switch to classic view – Returns to the classic experiment results page.
- Reset results – Clears the current results data and returns counts to zero. Note that resetting results does not affect variation bucketing — visitors continue to see the same variation they were previously assigned.