Optimizely's automatic Sample Ratio Mismatch (SRM) detection discovers experiment deterioration early

  • Optimizely Web Experimentation
  • Optimizely Performance Edge
  • Optimizely Feature Experimentation
  • Optimizely Full Stack (Legacy)

Optimizely's automatic SRM detection beta is now closed, and we are preparing for wide release! If you are an Optimizely customer, check out our other fantastic programs on the beta signup page.

Sample Ratio Mismatch occurs when the traffic distribution between variations in an A/B experiment becomes severely and unexpectedly unbalanced, often due to an implementation issue. If an SRM does occur, it indicates a potential external influence affecting the distribution of traffic. Exercise caution and do not overreact to every traffic disparity; a disparity does not automatically mean an experiment is useless.

How Optimizely protects your A/B experiments with its automatic SRM detection

Optimizely aims to alert customers to experiment deterioration as soon as possible. Early detection helps you assess the severity of the imbalance and stop a faulty experiment quickly, which greatly reduces the number of users exposed to it.

To rapidly detect deterioration caused by mismanaged traffic distribution, Optimizely's automatic SRM detection uses a statistical method called sequential sample ratio mismatch (SSRM). Optimizely's SSRM algorithm continuously checks traffic counts throughout an experiment, so an imbalance can be detected early in the experiment's lifecycle instead of waiting until the experiment's end to test for it.

For information on why Optimizely does not use chi-squared tests to evaluate for imbalances, see A Better Way to Test for Sample Ratio Mismatches (SRMs) and Validate Experiment Implementations.
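To build intuition for how a sequential check differs from a single end-of-experiment test, the sketch below implements a simple Bayes-factor-based sequential test for a two-variation experiment with an intended 50/50 split, broadly in the spirit of the approach described in that paper. It is not Optimizely's production algorithm: the Beta(1, 1) prior, the 0.001 threshold, and all names here are illustrative assumptions.

```typescript
// A minimal sketch of a sequential SRM check for a two-variation, 50/50 experiment.
// NOT Optimizely's production algorithm: prior, threshold, and names are assumptions.
class SequentialSrmMonitor {
  private countA = 0;
  private countB = 0;
  private logFactA = 0;   // ln(countA!)
  private logFactB = 0;   // ln(countB!)
  private logFactN1 = 0;  // ln((countA + countB + 1)!)
  private maxLogBF = 0;   // running maximum of the log Bayes factor

  constructor(private readonly threshold = 0.001) {}

  /** Record one bucketed visitor; returns true once an SRM is flagged. */
  record(variation: "A" | "B"): boolean {
    if (variation === "A") {
      this.countA += 1;
      this.logFactA += Math.log(this.countA);
    } else {
      this.countB += 1;
      this.logFactB += Math.log(this.countB);
    }
    const n = this.countA + this.countB;
    this.logFactN1 += Math.log(n + 1);

    // Log Bayes factor of "split ~ Beta(1, 1)" versus "split = 50/50" after n visitors.
    const logBF = this.logFactA + this.logFactB - this.logFactN1 + n * Math.LN2;
    this.maxLogBF = Math.max(this.maxLogBF, logBF);

    // Always-valid sequential p-value: safe to check after every single visitor,
    // unlike a fixed-horizon test that is only valid at a preplanned sample size.
    const sequentialP = Math.min(1, Math.exp(-this.maxLogBF));
    return sequentialP < this.threshold;
  }
}

// Usage: a persistent 40/60 split is typically flagged within a few hundred
// visitors, long before the experiment would normally end.
const monitor = new SequentialSrmMonitor();
for (let visitors = 1; visitors <= 100000; visitors++) {
  const variation = Math.random() < 0.4 ? "A" : "B";
  if (monitor.record(variation)) {
    console.log(`SRM flagged after ${visitors} visitors`);
    break;
  }
}
```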

Sample ratio mismatch (SRM)

An SRM occurs when the traffic distribution between variations in an A/B experiment becomes significantly imbalanced due to an implementation issue. Optimizely's Stats Engine does not generate SRMs and its traffic-splitting mechanism is trustworthy. A severe traffic distribution imbalance may lead to experiment degradation and, in extreme cases, inaccurate results.

For example, in an A/B test, you set a 50/50 traffic split between Variation A and Variation B. But instead, you observe a 40/60 traffic distribution.

It is important to remember that not every imbalance calls for immediate panic and abandonment of the experiment. If the cause of the traffic distribution imbalance is well understood, you can still draw concrete conclusions. An imbalance does not automatically invalidate your experiment results.

Evaluating experiments for traffic imbalances is most helpful at the start of your experiment's launch period. Catching an experiment with an unknown source of traffic allocation imbalance early lets you turn it off quickly and reduce the blast radius.

Remember, Optimizely's automatic SRM detection leverages a sequential sample ratio mismatch algorithm that continuously and efficiently checks traffic counts throughout an experiment. Currently, Optimizely's automatic SRM detection is only available for A/B experiments.

Causes of imbalance

An imbalance flagged by Optimizely's automatic SRM detection is a symptom of an underlying data quality issue. Implementation errors and third-party bots are the most common culprits behind experiment imbalances. To minimize the likelihood of an imbalance, set up your experiment carefully.

Specific experiment configurations pose a greater risk of an imbalance occurring. Assess the following scenarios to see if they are relevant to your experiment structure.

Redirect experiments

Redirect experiments are a known and reasonable cause of traffic imbalance. In Optimizely Web Experimentation or Optimizely Performance Edge, they allow you to compare two separate URLs as variations in an A/B test. For example, you might create a redirect experiment that compares two landing page versions.

Due to the nature of redirects, visitors may close the window or tab and leave the page before the redirect finishes executing. The Optimizely Web Experimentation code does not activate in this situation, so the event data is never sent to Optimizely and the user is not counted. Redirect experiments are still valid experiments, but it is reasonable to expect that a slight imbalance may occur.

URL redirects vary and cannot be relied on to behave consistently, so it is unreasonable to expect a specific, fixed rate of successful redirects. Therefore, Optimizely does not endorse ad hoc over- or under-correction of traffic for redirect experiments, nor running redirect experiments for an extended period of time solely to rebalance visitor counts.

There are two major reasons Optimizely delays event tracking until after the redirect is completed:

  1. Performance

    Optimizely's redirect hides the initial page's content through CSS. Naturally, there is a delay between a user accessing a webpage and actually receiving content, and end users are rightfully sensitive to site performance. If Optimizely waited for the event to be sent before redirecting, that delay would grow. And if changes are applied to the page the customer is redirected to, the snippet still needs to apply them, pushing the user's experience out even further. Optimizely minimizes this extra time by waiting to send the event until the customer is on the second page.

  2. Accuracy

    The only way to know for certain that the redirect completed, and that the user is receiving the variation they were bucketed into, is to send the event once the second page loads. You might think that giving the snippet time to send the event and confirm receipt before redirecting would ensure accurate results. However, that approach would count users in the redirect variation even when the redirect never finished, and include their data (or lack thereof) in results processing, distorting the reliability and precision of the metrics reported on the results page.

    There are a multitude of reasons why a redirect may fail. For example:

    1. The browser may reject it if there are too many redirects. Optimizely may not be the only thing redirecting the user, and it may be one step in a series of redirects.
    2. A user could have a browser setting or extension that rejects redirects.
    3. The delay could be long enough that a user closes the tab before the redirect has finished. 

For more on this topic, refer to our documentation on redirect experiments.

Reduce and then increase traffic allocation

There are two major ways an experimenter can irreparably harm their experiment results. The first is ramping traffic allocation down and then back up, which causes rebucketing.

Bucketing at Optimizely is:

  • Deterministic – Because of the way Optimizely hashes user IDs, a returning user is not reassigned to a new variation.
    In most cases, the user's visitor ID is not a consistent ID that works across devices (like a customer ID for logged-in users). Therefore, the user does not see the same variation across devices.
  • Sticky unless reconfigured – If you reconfigure a live, running experiment (for example, by decreasing and then increasing traffic allocation), a user may be rebucketed into a different variation.

When users are rebucketed because an experimenter ramped their traffic allocation down and then back up, visitor counts for each variation are distorted, resulting in a traffic imbalance caused by the experimenter.
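To make the rebucketing mechanism concrete, here is a simplified sketch of range-based deterministic bucketing. The FNV-1a hash, the 10,000-bucket space, and the range layout are illustrative stand-ins rather than the SDKs' actual implementation; the point is that a visitor's bucket value never changes, but ramping traffic down and back up moves the range boundaries underneath it.

```typescript
// Simplified sketch of deterministic, range-based bucketing. The hash and the
// range layout below are illustrative stand-ins, not the SDKs' actual code.
function bucketValue(userId: string, experimentId: string): number {
  // FNV-1a hash of userId + experimentId, mapped onto 10,000 buckets, so the
  // same visitor always lands on the same bucket value for this experiment.
  const input = userId + experimentId;
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) % 10000;
}

// Variation ranges scale with the experiment's traffic allocation (0.0 to 1.0),
// assuming a 50/50 split between two variations.
function assignVariation(bucket: number, trafficAllocation: number): string {
  const perVariation = 5000 * trafficAllocation;
  if (bucket < perVariation) return "A";
  if (bucket < perVariation * 2) return "B";
  return "excluded";
}

// The bucket value is stable across visits, but the ranges are not stable across
// allocation changes, so down-ramping and then up-ramping can flip the assignment.
const bucket = bucketValue("visitor-123", "exp-456");
console.log(assignVariation(bucket, 1.0)); // original 100% allocation
console.log(assignVariation(bucket, 0.5)); // ramped down to 50%: may flip or drop out
console.log(assignVariation(bucket, 1.0)); // ramped back up: may flip again
```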

Forced-bucketing

The second major way you can irreparably harm your experiment results is with forced-bucketing into a legacy Full Stack Experimentation experiment.

If a user first gets bucketed in an Optimizely Web Experimentation experiment and then that decision is used to force-bucket them in a legacy Full Stack Experimentation experiment, then the results of that Full Stack Experimentation experiment become imbalanced.

See the following example of how force-bucketing can cause an imbalance:

Imagine an experiment with two variations: Variation A and Variation B.

Variation A is providing a superior user experience in comparison to Variation B. Visitors who were assigned to Variation A are finding it enjoyable, and many of them continue to log in and land in the Full Stack Experimentation experiment, where they are force-bucketed to Variation A.

In contrast, visitors who were assigned to Variation B are not having a good experience and only a few proceed to log in and land in the Full Stack Experimentation experiment, where they are assigned to Variation B.

As a result, there are significantly more visitors in Variation A than in Variation B, and these Variation A visitors are likely to convert more in the Full Stack Experimentation experiment because they are happier with their experience. In addition to the visitor traffic imbalance, metrics and conversion rates are also skewed.
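A small simulation makes this skew concrete. The 50/50 Web bucketing mirrors the scenario above, while the login rates (60 percent for Variation A visitors, 30 percent for Variation B visitors) are invented for illustration only.

```typescript
// Illustrative simulation of the force-bucketing scenario above. The login rates
// are assumptions chosen only to show the direction of the skew.
let fullStackA = 0;
let fullStackB = 0;

for (let i = 0; i < 100000; i++) {
  // Fair 50/50 bucketing in the Web Experimentation experiment.
  const webVariation = Math.random() < 0.5 ? "A" : "B";

  // Variation A's happier visitors log in (and reach the Full Stack experiment)
  // more often than Variation B's visitors.
  const loginRate = webVariation === "A" ? 0.6 : 0.3;

  if (Math.random() < loginRate) {
    // Force-bucketing reuses the Web decision instead of bucketing independently,
    // so the Full Stack experiment inherits the biased entry population.
    if (webVariation === "A") fullStackA++;
    else fullStackB++;
  }
}

// Roughly a 2:1 split (about 30,000 vs. 15,000) instead of the intended 50/50.
console.log({ fullStackA, fullStackB });
```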

Delayed or failed Optimizely API calls

The Event API sends event data directly to Optimizely Experimentation. A traffic imbalance may occur if anything causes these calls to be delayed or not fire, as in the following scenarios.

Differences in the snippet or event dispatch timing

The Optimizely Web Experimentation snippet is one line of JavaScript code that executes on your website. If something causes the snippet code to misfire, a traffic distribution imbalance may occur. Additionally, if you use the holdEvents or sendEvents JavaScript APIs anywhere other than in the project JavaScript, the script may not load properly, resulting in a traffic distribution imbalance. Adding more scripts to your webpage may also cause implementation and loading rates to differ across variations, particularly in the case of redirects.
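For reference, this is the hold-and-release pattern the paragraph above describes, placed in the project JavaScript. Only the holdEvents and sendEvents call types come from the text above; when to release the queued events is site-specific and is shown here purely as an assumption.

```typescript
// Sketch of the hold/release pattern. Per the paragraph above, these calls belong
// in the project JavaScript; calling them elsewhere can keep events from sending.
const optimizelyQueue: Array<{ type: string }> =
  ((window as any).optimizely = (window as any).optimizely || []);

// Queue events instead of sending them immediately.
optimizelyQueue.push({ type: "holdEvents" });

// Later, once the page reaches a state where tracking should resume (assumption:
// your own readiness condition), release everything that was queued.
optimizelyQueue.push({ type: "sendEvents" });
```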

The Optimizely Feature Experimentation SDKs make HTTP requests for every decision event or conversion event that gets triggered. Each SDK has a built-in event dispatcher for handling these events. A traffic distribution imbalance may occur if events are dispatched incorrectly due to misconfiguration or other dispatching issues.

For more information, refer to your Feature Experimentation SDK's configure event dispatcher documentation.
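As an illustration of why dispatching matters, the sketch below retries transient network failures so decision and conversion events are less likely to be silently dropped. The LogEvent shape, the function name, and how a custom dispatcher is registered with your SDK are assumptions; the exact interface differs by SDK and version, so follow the documentation referenced above.

```typescript
// Hedged sketch of a retrying event dispatcher. The event shape and how this
// function plugs into a specific SDK are assumptions; consult your SDK's docs.
interface LogEvent {
  url: string;
  httpVerb: "POST";
  params: Record<string, unknown>;
}

async function dispatchWithRetry(event: LogEvent, maxAttempts = 3): Promise<void> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await fetch(event.url, {
        method: event.httpVerb,
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(event.params),
      });
      if (response.ok) return; // event accepted; done
    } catch {
      // Network error: fall through and retry below.
    }
    // Brief backoff so a momentary blip does not permanently drop the event.
    await new Promise((resolve) => setTimeout(resolve, 200 * attempt));
  }
  console.warn(`Event to ${event.url} not delivered after ${maxAttempts} attempts`);
}
```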

What to do if Optimizely identifies an imbalance in your experiment

If Optimizely identifies an imbalance in your experiment, troubleshooting depends on the cause of the imbalance and whether you can correct the problem directly.

If you can determine the root cause, you can quickly stop your experiment, fix the underlying issue, duplicate the experiment, and start that new experiment. You can continue monitoring your fix in the new experiment to verify that you have corrected the problem.

If you cannot determine the root cause of the imbalance, stop the experiment or remove the affected variation quickly to lessen the negative impact on customers while you investigate further. For more information, refer to Imbalance detected: What to do next if Optimizely identifies an SRM.