- Optimizely Web Experimentation
- Optimizely Performance Edge
- Optimizely Feature Experimentation
- Optimizely Full Stack (Legacy)
A traffic imbalance flagged by Optimizely Experimentation's automatic sample ratio mismatch (SRM) detection system is a symptom of an underlying data quality issue. Implementation errors and third-party bots are the most common culprits behind experiment imbalances. To minimize the likelihood of an imbalance, set up and run your experiment carefully.
Specific experiment configurations pose a greater risk of an imbalance occurring. Assess the following scenarios to see if they are relevant to your experiment structure.
Redirect experiments
Redirect experiments are a known and reasonable cause of traffic imbalance. In Optimizely Web Experimentation or Optimizely Performance Edge, you can compare two separate URLs as variations in a Stats Engine A/B test. For example, you might create a redirect experiment that compares two landing page versions.
Due to the nature of redirects, users may close the window or tab and leave the page before the redirect finishes executing. The Optimizely Experimentation code does not activate in this situation, so the event data is never sent and Optimizely does not count the user. Redirect experiments are valid experiments, but it is reasonable to expect that a slight imbalance may occur.
URL redirects vary, and you cannot rely on them to behave consistently, so it is unreasonable to expect a specific, fixed rate of failed redirects. Do not make ad-hoc adjustments that over- or under-correct traffic for redirect experiments, and do not run redirect experiments for an extended period solely to rebalance visitor counts.
There are two major reasons Optimizely Experimentation delays event tracking until after the redirect is completed:
- Performance – Optimizely Experimentation's redirect hides the initial page's content through CSS. There is naturally a delay between a user requesting a webpage and receiving content, and end users are rightfully sensitive to site performance. That delay would be exacerbated if Optimizely Experimentation waited until the event was sent before redirecting. If changes are applied to the destination page, the snippet still needs to apply them, pushing the user's experience out even further. Optimizely Experimentation minimizes this extra time by waiting until the customer is on the second page.
- Accuracy – The only way Optimizely Experimentation knows the redirect is complete, the user is bucketed, and the user received the variation is to send the event when the second page loads. You might think that giving the snippet time to send the event and confirm receipt before redirecting would ensure accurate results. However, Optimizely Experimentation would then count users in the redirect variation who never completed the redirect and include their data (or lack thereof) in results processing. That would distort the reliability and precision of the metrics reported on the Experimentation Results page.
There are many reasons outside your control why a redirect may fail. For example:
- The browser may reject it if there are too many redirects. Optimizely Experimentation may not be the only thing redirecting the user, and it may be one step in a series of redirects.
- A user can have a browser setting or extension that rejects redirects.
- The delay can be long enough that a user closes the tab before the redirect has finished.
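For reference, a redirect variation ultimately comes down to a navigation call like the one below. This is only an illustrative sketch of manual redirect code with a hypothetical destination URL; Optimizely Web Experimentation's built-in redirect feature handles the CSS hiding and event timing described above for you.

```js
// Illustrative sketch of a manual redirect in variation code.
// The destination URL is hypothetical.
var destination = 'https://www.example.com/landing-b';

// Preserve the original query string so campaign parameters survive the redirect.
if (window.location.search) {
  destination += window.location.search;
}

// replace() does not add a history entry. If the user closes the tab before
// the new page loads, no snippet runs there and no event reaches Optimizely,
// which is exactly how a slight imbalance can creep in.
window.location.replace(destination);
```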
Reduce and then increase traffic allocation
There is a high risk of corrupting your experiment data if the percentage of overall traffic allocation is moved down and then back up. If you (or a teammate) down-ramp traffic (reduce traffic allocation) and then up-ramp traffic (increase traffic allocation), you can directly cause a traffic imbalance in your experiments. Bucketing at Optimizely is:
- Deterministic – Because of the way Optimizely Experimentation hashes user IDs, a returning user is not reassigned to a new variation.
- Sticky unless reconfigured – If you reconfigure a "live," running experiment, for example, by decreasing and then increasing traffic, a user may get rebucketed into a different variation.
If you down-ramp and then up-ramp traffic, rebucketing occurs, and you can irreparably harm the results of your experiment. Rebucketed users distort the visitor counts for each variation of that experiment, resulting in a traffic imbalance that you caused.
For example, suppose you launch a few Optimizely Feature Experimentation experiments with 80% of an audience allocated, down-ramp traffic allocation to 50%, and then up-ramp it back to 80%. Users previously exposed to the Optimizely Feature Experimentation flag may no longer see it when you ramp the traffic allocation back up.
Check your experiment history for Web Experimentation or the flag's history for Feature Experimentation to troubleshoot traffic allocation changes. This lets you determine if traffic allocation was changed and by whom.
The easiest way to avoid imbalances associated with traffic allocation is to refrain from decreasing and increasing total traffic while an experiment is live. To avoid this issue entirely in Optimizely Feature Experimentation, take a proactive approach by only ever raising traffic (ramping up monotonically) or by implementing a user profile service (UPS), as sketched below.
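For illustration, here is a minimal sketch of a user profile service for the Feature Experimentation JavaScript SDK, backed by localStorage; the storage key prefix and SDK key placeholder are assumptions. Once a UPS is in place, a returning user keeps their original assignment even if traffic allocation changes while the experiment is live.

```js
// Sketch: a localStorage-backed user profile service (UPS) for the
// JavaScript SDK. The storage key prefix is illustrative.
const optimizelySdk = require('@optimizely/optimizely-sdk');

const STORAGE_PREFIX = 'optly_ups_';

const userProfileService = {
  // Called before bucketing; return the saved profile for this user, or null.
  lookup(userId) {
    const raw = window.localStorage.getItem(STORAGE_PREFIX + userId);
    return raw ? JSON.parse(raw) : null;
  },
  // Called after bucketing; persist { user_id, experiment_bucket_map }.
  save(userProfile) {
    window.localStorage.setItem(
      STORAGE_PREFIX + userProfile.user_id,
      JSON.stringify(userProfile)
    );
  },
};

const optimizelyClient = optimizelySdk.createInstance({
  sdkKey: '<YOUR_SDK_KEY>',
  userProfileService,
});
```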
Reset the results of an experiment that already has visitors
If you manually reset the results on the Experiment Results page, you can cause a traffic imbalance in your experiment. Resetting the results resets the decision and conversion counts to zero, but it does not change the bucketing of visitors who viewed the experiment before the reset. If pre-reset visitors return to one variation more than the others after the reset, the visitor counts are skewed toward that variation.
For example, you run an experiment for a while with a control and two additional variations. After this time, you add a third variation, reset the results, and let the experiment run again. Visitors bucketed into the experiment before the reset continue to see one of the pre-reset variations. This can cause those variations to receive more visitors than the third variation after the reset, especially if the experiment had a lot of pre-reset traffic. When this happens, an SRM is triggered.
| Variation | Pre-Reset Visitors | Returning Visitors (Post-Reset) | New Visitors (Post-Reset) | Total Visitors (Post-Reset) |
|---|---|---|---|---|
| Control | 100,000 | 25,000 | 50,000 | 75,000 |
| Variation 1 | 100,000 | 25,000 | 50,000 | 75,000 |
| Variation 2 | 100,000 | 25,000 | 50,000 | 75,000 |
| Variation 3 | 0 | 0 | 50,000 | 50,000 (imbalance) |
This can also occur if the variations remain the same after the reset. If one or more variations provided a better experience than the others, more of those happy visitors return to the page after the results reset, and they are bucketed into those same better-performing variations. While the new traffic may be evenly split, the returning visitor behavior causes an imbalance between the variations.
| Variation | Pre-Reset Visitors | Returning Visitors (Post-Reset) | New Visitors (Post-Reset) | Total Visitors (Post-Reset) |
|---|---|---|---|---|
| Control | 100,000 | 25,000 | 50,000 | 75,000 |
| Variation 1 | 100,000 | 50,000 | 50,000 | 100,000 (imbalance) |
| Variation 2 | 100,000 | 25,000 | 50,000 | 75,000 |
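Stats Engine runs its own sequential SRM test automatically, but a classic chi-square goodness-of-fit check on the post-reset totals above illustrates why a split like this gets flagged. The helper function and critical value below are only an illustration, not Optimizely's implementation.

```js
// Sketch: chi-square goodness-of-fit check for a sample ratio mismatch.
// Illustration only; not Optimizely's sequential SRM detection.
function chiSquareStatistic(observedCounts, expectedShares) {
  const total = observedCounts.reduce((sum, n) => sum + n, 0);
  return observedCounts.reduce((chi2, observed, i) => {
    const expected = total * expectedShares[i];
    return chi2 + ((observed - expected) ** 2) / expected;
  }, 0);
}

// Post-reset totals from the table above: Control, Variation 1, Variation 2.
const observed = [75000, 100000, 75000];
const expectedShares = [1 / 3, 1 / 3, 1 / 3]; // the experiment intended an even split

const chi2 = chiSquareStatistic(observed, expectedShares);
// The critical value for 2 degrees of freedom at alpha = 0.05 is about 5.99.
console.log(chi2.toFixed(1), chi2 > 5.99 ? '-> imbalance flagged' : '-> no SRM');
// Prints roughly 5000.0, far beyond 5.99, so the returning-visitor skew is flagged.
```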
Forced-bucketing
If a user is first bucketed in an Optimizely Web Experimentation experiment and that decision is then used to force-bucket them in a legacy Full Stack Experimentation experiment, the results of that Full Stack Experimentation experiment become imbalanced.
See the following example of how force-bucketing can cause an imbalance:
An experiment has two variations: Variation A and Variation B.
Variation A provides a superior user experience in comparison to Variation B. Visitors assigned to Variation A find it enjoyable, and many of them continue to log in and land in the Full Stack Experimentation experiment, where they are force-bucketed to Variation A.
In contrast, visitors assigned to Variation B do not have a good experience, and only a few proceed to log in and land in the Full Stack Experimentation experiment, where they are assigned to Variation B.
As a result, there are significantly more visitors in Variation A than in Variation B. These Variation A visitors are also more likely to convert in the Full Stack Experimentation experiment because they are happier with their experience. So, in addition to the imbalanced traffic split, metrics and conversion rates are skewed.
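Here is a hedged sketch of the pattern described above, using the legacy Full Stack JavaScript SDK's setForcedVariation; the experiment key, variation key, and helper function are hypothetical.

```js
// Sketch: forcing a legacy Full Stack assignment to match the Web decision.
// The experiment key and variation key are hypothetical.
const optimizelySdk = require('@optimizely/optimizely-sdk');

const optimizelyClient = optimizelySdk.createInstance({ sdkKey: '<YOUR_SDK_KEY>' });

function forceBucketFromWebDecision(userId, webVariationKey) {
  // Returns true if the forced mapping was stored for this user.
  return optimizelyClient.setForcedVariation(
    'checkout_redesign', // hypothetical experiment key
    userId,
    webVariationKey      // for example, 'variation_a'
  );
}

// Only visitors who stayed engaged on the Web side ever log in, so the forced
// assignments inherit that bias and Variation A accumulates far more users.
```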
Delayed or failed Optimizely API calls
The Event API sends event data directly to Optimizely Experimentation. A traffic imbalance may occur if anything delays these calls or prevents them from firing. For example:
- Changes can misfire in variation code.
- Redirect experiments, where visitors are sent to a different domain, can prevent the call to Optimizely from taking place, but this can also happen to visitors on the same origin. See the Redirect experiments section.
- Errors can occur in event batching and cause queued events to not send correctly in the Optimizely Feature Experimentation SDKs. See the event batching documentation for your Feature Experimentation SDK and the configuration sketch after this list.
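As a hedged illustration, the JavaScript SDK exposes batching options such as eventBatchSize and eventFlushInterval; the values below are assumptions rather than recommendations, and close() flushes anything still queued.

```js
// Sketch: tuning event batching in the JavaScript SDK. Smaller batches and a
// shorter flush interval leave fewer queued events to lose if the page
// unloads early. The values shown are illustrative.
const optimizelySdk = require('@optimizely/optimizely-sdk');

const optimizelyClient = optimizelySdk.createInstance({
  sdkKey: '<YOUR_SDK_KEY>',
  eventBatchSize: 10,       // flush after 10 queued events
  eventFlushInterval: 1000, // or after 1 second, whichever comes first
});

// Flush any remaining queued events before shutdown (for example, on page
// unload or process exit) so they are not silently dropped.
optimizelyClient.close();
```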
Differences in IDs across devices
In some cases, the chosen user ID is not a consistent identifier that works across devices (such as a customer ID for logged-in users), so the user does not see the same variation on every device.
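A minimal sketch of the idea, assuming the Feature Experimentation JavaScript SDK; getLoggedInCustomerId() and the flag key are hypothetical placeholders.

```js
// Sketch: prefer a cross-device identifier (such as a logged-in customer ID)
// over a per-device ID so the same person is bucketed identically everywhere.
const optimizelySdk = require('@optimizely/optimizely-sdk');

const optimizelyClient = optimizelySdk.createInstance({ sdkKey: '<YOUR_SDK_KEY>' });

function decideForUser(deviceId) {
  const customerId = getLoggedInCustomerId(); // hypothetical helper
  const userId = customerId || deviceId;      // fall back only for anonymous users

  const userContext = optimizelyClient.createUserContext(userId);
  return userContext.decide('new_checkout_flow'); // hypothetical flag key
}
```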
Differences in the snippet or event dispatch timing
A traffic imbalance may occur if something causes the Optimizely Web Experimentation snippet code to misfire. Additionally, if you use the holdEvents or sendEvents JavaScript APIs anywhere other than in the project JavaScript, the script may not load properly, resulting in a traffic distribution imbalance. Adding more scripts to your webpage may cause implementation or loading rates to differ across variations, particularly in the case of redirects.
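For reference, the documented call shape looks like the following; keeping both calls in the project JavaScript ensures they run before the snippet starts dispatching events. The consent-check scenario is only an example.

```js
// Sketch: holdEvents / sendEvents calls, placed in the project JavaScript so
// they execute before the snippet begins sending events.
window['optimizely'] = window['optimizely'] || [];

// Pause event dispatch, for example until a consent check completes.
window['optimizely'].push({ type: 'holdEvents' });

// ...later, once it is safe to send data...
window['optimizely'].push({ type: 'sendEvents' });
```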
The Optimizely Feature Experimentation SDKs make HTTP requests for every decision event or conversion event that gets triggered. Each SDK has a built-in event dispatcher for handling these events. A traffic distribution imbalance may occur if events are dispatched incorrectly due to misconfiguration or other dispatching issues.
For information, refer to your Feature Experimentation SDK's configure event dispatcher documentation (a custom dispatcher sketch follows this list):
- Android
- C#
- Flutter
- Go
- Java
- JavaScript (Browser)
- JavaScript (Node)
- PHP
- Python
- React
- React Native
- Ruby
- Swift
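As a hedged sketch for the JavaScript SDK, a custom dispatcher implements dispatchEvent(event, callback) and must always signal completion back to the SDK; the single retry below is an illustrative choice, not SDK-provided behavior.

```js
// Sketch: a custom event dispatcher for the JavaScript SDK that retries once
// on network failure before signaling completion back to the SDK.
const optimizelySdk = require('@optimizely/optimizely-sdk');

const retryingEventDispatcher = {
  dispatchEvent(event, callback) {
    const send = () =>
      fetch(event.url, {
        method: event.httpVerb,
        headers: { 'content-type': 'application/json' },
        body: JSON.stringify(event.params),
      });

    send()
      .catch(() => send())      // one retry on failure (illustrative choice)
      .then(() => callback())
      .catch(() => callback()); // always tell the SDK the dispatch finished
  },
};

const optimizelyClient = optimizelySdk.createInstance({
  sdkKey: '<YOUR_SDK_KEY>',
  eventDispatcher: retryingEventDispatcher,
});
```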
Highly specific audience targeted
When there are conditions and constraints on exposure to an experiment, such as targeting a very specific audience, a slight imbalance can emerge before the experiment's first business cycle is complete (usually seven days). These experiments can still offer an experimentation program valid results, but acting on them is the experimenter's decision and depends on their tolerance for imbalances before the completion of one business cycle. These imbalances do resolve after one business cycle.
What to do if Optimizely identifies an imbalance in your experiment
If Optimizely Experimentation identifies an imbalance in your experiment, troubleshooting depends on the cause of the imbalance and if you can correct the problem directly.
- If you can determine the root cause – Stop your experiment, fix the underlying issue, duplicate the experiment, and start that new experiment. You can continue monitoring your fix in the new experiment to verify that you have corrected the problem.
- If you cannot determine the root cause of the imbalance – Stop the experiment or remove the variations to lessen the negative impact on customers while you investigate further.
For information, see Imbalance detected: What to do next if Optimizely identifies an SRM in your Stats Engine A/B test.