Imbalance detected: What to do if Optimizely's automatic SRM detection alerts you to an imbalance in your Stats Engine A/B test

  • Optimizely Web Experimentation
  • Optimizely Personalization
  • Optimizely Performance Edge
  • Optimizely Feature Experimentation
  • Optimizely Full Stack (Legacy)

A sample ratio mismatch (SRM) occurs when the traffic distribution between variations in a Stats Engine A/B experiment becomes severely imbalanced because of an implementation issue. This imbalance may lead to experiment degradation and, in extreme cases, inaccurate results. It is important to remember that not all imbalances should cause immediate panic and abandonment of the experiment.

When you check for an imbalance matters, which is why Optimizely's automatic SRM detection evaluates an experiment continuously. A detected imbalance is a symptom that can point to various data quality issues.

Automatic SRM detection alerts

An automatic SRM detection alert does not necessarily mean your experiment is ruined. If Optimizely's automatic SRM detection alerts you to an imbalance in your A/B test, that may indicate an external influence affecting the distribution of traffic. It is important to exercise caution and refrain from overreacting to every traffic disparity, as this does not instantly signify that an experiment is useless.

The Experiment Health indicator on the Optimizely Experiment Results page alerts you if your experiment is experiencing an SRM.

SRM detected health status

An SRM detected experiment health status indicates that there is a visitor imbalance. 

This means that the visitor counts per variation differ from the split you intended by a statistically significant margin, which indicates a consistent and non-ignorable underlying assignment bias.

Because Optimizely's SRM detection can uncover a potential SRM earlier than a traditional chi-squared test, it is important to examine the data on the specific date when SRM is identified.

See Possible causes for traffic imbalances for reasons why a visitor imbalance may occur, and what you should do if your experiment reports an SRM detected status.

Good health status

A Good experiment health status indicates that no visitor imbalance is detected.

You do not need to do anything, and your experiment is running smoothly.

Information about a normal traffic split

You may observe that the number of visitors assigned to each experiment variation is never exactly at a 50/50 split, yet the experiment shows a green checkmark and a Good health status. This is not a bug. Do not expect a perfect 50/50 split for every experiment you run. There is always some slight deviation.

An imbalance occurs when the actual proportion of traffic a variation receives does not match the proportion you assigned to it. You cannot judge by eye whether assignment bias is present, or how severe it is, over the life of an experiment. When an experiment shows a Good health status with a slightly imperfect split, the algorithm has found nothing that departs unusually from the split you intended.

Optimizely's bucketing supports this. A MurmurHash function determines which variation each user sees. For bucketing, Optimizely assigns each user a number between 0 and 10,000 to determine if they qualify for the experiment and, if so, which variation they see.
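The idea of deterministic bucketing can be illustrated with a minimal Python sketch. This is not Optimizely's implementation: it uses SHA-256 from the standard library as a stand-in for MurmurHash, and the names `bucket_for`, `variation_for`, the key format, and the 5,000/5,000 bucket ranges are all hypothetical choices made for illustration.

```python
import hashlib
from collections import Counter

BUCKET_COUNT = 10_000  # assumption: buckets numbered 0..9999


def bucket_for(user_id: str, experiment_id: str) -> int:
    """Map a user to a stable bucket number.

    SHA-256 is a stand-in here; the article says Optimizely uses MurmurHash.
    """
    key = f"{experiment_id}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % BUCKET_COUNT


def variation_for(user_id, experiment_id,
                  split=(("control", 5000), ("treatment", 5000))):
    """Return the variation whose bucket range contains the user's bucket."""
    bucket = bucket_for(user_id, experiment_id)
    upper = 0
    for name, width in split:
        upper += width
        if bucket < upper:
            return name
    return None  # bucket falls outside the allocated traffic


# The same user always lands in the same variation.
assert variation_for("user_42", "exp_1") == variation_for("user_42", "exp_1")

# Across many users the split is close to, but rarely exactly, 50/50.
counts = Counter(variation_for(f"user_{i}", "exp_1") for i in range(10_000))
print(counts)
```

Because the assignment depends only on the hashed key, no per-user state is needed to keep a visitor in the same variation across visits.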

Think of it as a coin flip per user, but that coin flip always gives the same result for the same user. If you flip a coin 10,000 times, it is extremely unlikely you get precisely 5,000 heads and 5,000 tails.

  • The probability of getting exactly 5,000 heads, a perfect 50/50 split, is about 0.008 (0.8%).
  • What you can expect is approximately 5,000 heads in 10,000 independent and identically distributed (IID) fair coin flips.
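The 0.008 figure can be checked directly from the binomial distribution. The short Python sketch below computes the exact probability of 5,000 heads in 10,000 fair flips, along with the standard deviation of the head count, which gives a sense of how much drift from 5,000 is ordinary. The function name `prob_exact_heads` is a hypothetical helper introduced here.

```python
import math


def prob_exact_heads(flips: int, heads: int) -> float:
    """Exact binomial probability of `heads` heads in `flips` fair coin flips."""
    return math.comb(flips, heads) / 2 ** flips


p_exact = prob_exact_heads(10_000, 5_000)
print(f"{p_exact:.4f}")  # about 0.008

# Standard deviation of the head count for a fair coin: sqrt(n * 0.5 * 0.5)
std_dev = math.sqrt(10_000 * 0.5 * 0.5)
print(std_dev)  # 50.0
```

With a standard deviation of 50, a head count a few dozen away from 5,000 is entirely ordinary, which is why a slightly uneven split alone is not a sign of a problem.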

Good traffic splits and SRM imbalances

In a live experiment, the actual traffic split roughly aligns with the intended traffic split. An SRM is a statistically significant deviation of the traffic split from the intended experiment design. Such a deviation is not necessarily visible to the eye.

Optimizely Experimentation's automatic SRM detection algorithm considers the following:

  • The severity of the imbalance.
  • The consistency of the imbalance over time.

For the detection algorithm to declare an imbalance, it must observe the traffic distribution settle into a consistent pattern that weights one variation more than another. Over time, the cumulative evidence of that pattern crosses the test's significance threshold.

Not all traffic splits are the same. In the early stages of an experiment, when traffic volume is relatively low, percentage differences between variations can look exaggerated, yet Optimizely's imbalance detection test may find that they are not statistically extreme. This is partly why the automatic SRM detection algorithm does not start its analysis until more than 1,000 visitors have arrived at a Stats Engine experiment.
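To see why the same percentage gap means different things at different traffic volumes, consider a traditional fixed-sample chi-squared test for two variations. This is only an illustration: Optimizely's automatic detection is a different, continuously evaluated test, and the helper name `srm_p_value_two_arms` and the visitor counts below are hypothetical.

```python
import math


def srm_p_value_two_arms(count_a: int, count_b: int,
                         intended_share_a: float = 0.5) -> float:
    """P-value of a chi-squared goodness-of-fit test for two variations.

    With two variations the test has 1 degree of freedom, so the
    survival function reduces to erfc(sqrt(chi2 / 2)).
    """
    total = count_a + count_b
    expected_a = total * intended_share_a
    expected_b = total - expected_a
    chi2 = ((count_a - expected_a) ** 2 / expected_a
            + (count_b - expected_b) ** 2 / expected_b)
    return math.erfc(math.sqrt(chi2 / 2))


# Same 55/45 observed split, very different evidence.
p_low_traffic = srm_p_value_two_arms(55, 45)          # 100 visitors
p_high_traffic = srm_p_value_two_arms(5_500, 4_500)   # 10,000 visitors

print(f"{p_low_traffic:.3f}")   # roughly 0.317, not statistically extreme
print(p_high_traffic < 1e-6)    # True: far too unlikely under a 50/50 design
```

A 55/45 split among 100 visitors is well within ordinary chance variation, while the same split among 10,000 visitors would be extremely unlikely under an intended 50/50 design.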