- Optimizely Feature Experimentation
- Optimizely Web Experimentation
- Optimizely Analytics
Optimizely offers the following statistical analysis methods for running A/B tests:
- Frequentist (Fixed Horizon) – Uses a predetermined sample size and fixed plan.
- Bayesian – Updates your beliefs about a hypothesis as new evidence arrives.
- Sequential (Optimizely Stats Engine) – Uses sequential testing that lets you monitor results continuously.
Choosing the right method depends on your traffic levels, experiment goals, and how flexible you need to be while the experiment runs. The following sections explain each method, compare their strengths and limitations, and help you decide which fits your needs.
Frequentist (Fixed Horizon)
Frequentist (Fixed Horizon) overview
Frequentist (Fixed Horizon) is a traditional statistical framework based on the idea that probabilities represent the long-term frequency of events. In this approach, you make decisions using fixed rules and predetermined sample sizes.
Frequentist (Fixed Horizon) methods use p-values, confidence intervals, and statistical significance to determine whether an observed result is likely due to chance. Fixed horizon testing requires you to define the sample size and analysis plan before the experiment starts. You evaluate results only after all data arrives.
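As an illustration of the p-value and confidence interval mentioned above, the following is a minimal sketch of a two-proportion z-test, a common textbook choice for conversion-rate experiments. It is not Optimizely's implementation. The function name, parameters, and visitor counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Return (p_value, confidence interval for rate_b - rate_a)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    diff = rate_b - rate_a

    # The test uses a pooled standard error, which assumes no true difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # The interval uses an unpooled standard error around the observed difference.
    se_unpooled = sqrt(rate_a * (1 - rate_a) / n_a + rate_b * (1 - rate_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)
    return p_value, ci


# Hypothetical counts: conversions and visitors for baseline (a) and variation (b).
p_value, ci = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=570, n_b=10_000)
print(f"p-value: {p_value:.4f}")
print(f"95% CI for the difference: ({ci[0]:.4f}, {ci[1]:.4f})")
```

In a fixed horizon test, you would run this calculation once, after the planned sample size is reached.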
Frequentist (Fixed Horizon) experiments
Frequentist (Fixed Horizon) experiments require you to calculate the sample size before you start, and you can only view results after the experiment reaches the required sample size and minimum duration.
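The sample size calculation can be sketched with the standard normal-approximation formula for comparing two conversion rates. This is an illustrative sketch, not the calculator Optimizely provides. The function name, the baseline rate, and the minimum detectable effect are assumptions chosen for the example.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, relative_mde, alpha=0.05, power=0.8):
    """Visitors needed per variation for a two-sided two-proportion test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)  # rate you want to be able to detect
    p_bar = (p1 + p2) / 2

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # statistical power

    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Hypothetical inputs: 5% baseline conversion rate, 10% relative lift to detect.
n = sample_size_per_variation(baseline_rate=0.05, relative_mde=0.10)
print(f"Visitors needed per variation: {n}")
```

Smaller detectable effects require substantially more visitors, which is why an inaccurate estimate can leave a test underpowered.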
Frequentist (Fixed Horizon) is strict and controlled. Remember the following:
- No mid-test peeking – Do not check full results before the experiment reaches the required sample size and duration. Early checks increase false positives.
- Single analysis at the end – Review outcomes only after you meet the experiment conditions (sample size and duration).
- Rigorous statistical standards – Maintain strict statistical standards through a traditional A/B testing process.
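To see why mid-test peeking increases false positives, consider the following small simulation. It runs A/A tests, where both variations share the same true conversion rate, so any "significant" result is a false positive. It then compares stopping at the first significant interim look with analyzing once at the end. The simulation settings, function names, and the z-test used are assumptions for illustration only.

```python
import random
from math import sqrt
from statistics import NormalDist


def is_significant(conv_a, conv_b, n, z_crit):
    """Two-proportion z-test with n visitors in each arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled in (0.0, 1.0):
        return False
    se = sqrt(pooled * (1 - pooled) * 2 / n)
    return abs(conv_b / n - conv_a / n) / se > z_crit


def simulate_aa_tests(n_sims=500, n_per_arm=2000, look_every=200,
                      rate=0.10, alpha=0.05, seed=7):
    """Return (false positive rate with peeking, rate with one final analysis)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    peeking_hits = 0
    final_hits = 0
    for _ in range(n_sims):
        conv_a = conv_b = 0
        flagged_at_any_look = False
        flagged_at_end = False
        for n in range(1, n_per_arm + 1):
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            if n % look_every == 0:
                significant = is_significant(conv_a, conv_b, n, z_crit)
                flagged_at_any_look = flagged_at_any_look or significant
                if n == n_per_arm:
                    flagged_at_end = significant
        peeking_hits += flagged_at_any_look
        final_hits += flagged_at_end
    return peeking_hits / n_sims, final_hits / n_sims


peeking_rate, single_look_rate = simulate_aa_tests()
print(f"False positive rate with peeking: {peeking_rate:.3f}")
print(f"False positive rate with a single final analysis: {single_look_rate:.3f}")
```

Because the final analysis is also one of the looks, the peeking rate can never be lower than the single-analysis rate. In practice it is typically well above the nominal 5%.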
Bayesian
Bayesian overview
Bayesian statistics is a framework in which probabilities represent degrees of belief. The framework updates these beliefs as evidence arrives. It provides the probability of a given variation being better than a baseline.
In simpler terms, if you say, "I believe there is an 80% chance Variation A is better than the baseline", you are expressing a degree of belief. Bayesian methods update these beliefs as the A/B test progresses and you collect data.
Bayesian experiments
Using the Bayesian method in A/B tests lets you do the following:
- Monitor results continuously during the experiment.
- Update results as new data comes in.
- Match how you typically understand probability. For example, the statement "There is a 90% chance Variation A beats Variation B" is easier to communicate than the p-values used by Frequentist (Fixed Horizon) and Sequential methods.
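A statement such as "there is a 90% chance the variation beats the baseline" can be computed in several ways. The sketch below uses one common approach, a Beta-Binomial model with a uniform prior and Monte Carlo sampling. The model, prior, function name, and counts are assumptions for illustration and may differ from what Optimizely uses.

```python
import random


def probability_variation_beats_baseline(conv_base, n_base, conv_var, n_var,
                                         draws=20_000, seed=11):
    """Estimate P(variation rate > baseline rate) from posterior samples."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Beta(1, 1) is a uniform prior. Observed conversions and
        # non-conversions update it into the posterior for each rate.
        base_sample = rng.betavariate(1 + conv_base, 1 + n_base - conv_base)
        var_sample = rng.betavariate(1 + conv_var, 1 + n_var - conv_var)
        wins += var_sample > base_sample
    return wins / draws


# Hypothetical counts: 100/1000 baseline conversions, 125/1000 for the variation.
prob = probability_variation_beats_baseline(100, 1000, 125, 1000)
print(f"Probability the variation beats the baseline: {prob:.1%}")
```

You can rerun the calculation whenever new data arrives. The posterior reflects all data collected so far.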
Sequential (Optimizely Stats Engine)
Sequential overview
Sequential testing has several variants. Each one lets you analyze experiment results continuously as data arrives, rather than waiting until the experiment ends. You can monitor performance in real time, make decisions earlier, and stop an experiment once there is enough evidence to declare a winner or make changes.
Sequential testing at Optimizely is based on Optimizely's proprietary Stats Engine. See the links under Sequential in the Next steps section for more information.
Sequential experiments
Using the sequential method in A/B tests lets you do the following:
- Monitor results continuously during the experiment. The results stay statistically valid at every check because Stats Engine accounts for continuous monitoring and controls the false discovery rate.
- Stop the experiment early if you have reached a clear winner.
- Skip doing pre-experiment work, such as calculating the sample size before launching the experiment.
Sequential testing provides flexibility and speed but uses more advanced statistical methods to maintain accuracy.
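To give a sense of how a sequential method keeps results valid under continuous monitoring, the following sketch implements one textbook variant: a mixture sequential probability ratio test (mSPRT) for normally distributed observations with known variance. Stats Engine is proprietary, and this sketch is not its implementation. The function name, the parameters `sigma` and `tau`, and the simulated data are assumptions.

```python
import random
from math import exp, sqrt


def always_valid_p_values(observations, null_mean=0.0, sigma=1.0, tau=1.0):
    """Return a p-value after each observation that is safe to check anytime.

    Assumes observations ~ Normal(mean, sigma^2) with known sigma and a
    Normal(null_mean, tau^2) mixing distribution over the alternative.
    """
    p_values = []
    p_value = 1.0
    running_sum = 0.0
    for n, x in enumerate(observations, start=1):
        running_sum += x
        mean = running_sum / n
        variance_term = sigma**2 + n * tau**2
        # Mixture likelihood ratio of the alternative against the null.
        likelihood_ratio = sqrt(sigma**2 / variance_term) * exp(
            (n**2 * tau**2 * (mean - null_mean) ** 2)
            / (2 * sigma**2 * variance_term)
        )
        # The p-value never increases, so a threshold crossing is final.
        p_value = min(p_value, 1.0 / likelihood_ratio)
        p_values.append(p_value)
    return p_values


# Hypothetical stream: per-visitor differences with a true mean effect of 0.5.
rng = random.Random(3)
stream = [rng.gauss(0.5, 1.0) for _ in range(300)]
p_values = always_valid_p_values(stream)
stopping_point = next((n for n, p in enumerate(p_values, 1) if p <= 0.05), None)
print(f"First observation where p <= 0.05: {stopping_point}")
```

The trade-off is that a sequential test usually needs stronger evidence at any single look than a fixed horizon test does at its one planned analysis.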
Difference between Frequentist (Fixed Horizon), Bayesian, and Sequential
The three methods differ in how often you can check results and how much advance planning they require. The following baking analogy makes the trade-off concrete:
- Frequentist (Fixed Horizon) testing – You must first decide how long to bake the cake before putting it in the oven. After putting it in the oven, you must wait for the timer to go off before you look inside, even if the cake might have been done earlier. If you take the cake out early, even if it looks ready on the outside, it may still be raw inside.
- Bayesian testing – You can put the cake in the oven with an initial idea of how long it might take. You can open the oven door at any time to check its progress, and each check updates your belief about whether it is done. If it looks done, you can take it out, and your confidence in its readiness is based on all the information gathered so far.
- Sequential testing – You can put the cake in the oven without deciding how long it needs to bake. You can open the oven door at any time to see if the cake is ready without ruining it. If it looks done, it is ready.
When to use each method
Use the following table to compare how Frequentist (Fixed Horizon), Bayesian, and sequential methods differ in planning, peeking, speed, and decision-making:
| Feature | Frequentist (Fixed Horizon) | Bayesian | Sequential |
|---|---|---|---|
| Sample size calculation | Required before starting | Not required | Not required |
| Peeking at results mid-test | Not allowed | Allowed | Allowed |
| Early stopping | Not allowed | Allowed | Allowed |
| Best for | Fixed duration with an estimable sample size. | Flexibility and intuitive results. | Flexible results analysis with Frequentist-style interpretation. |
When to use Frequentist (Fixed Horizon)
Use Frequentist (Fixed Horizon) if the following are true:
- You have strong statistical expertise – Your team has a collaborator who understands traditional A/B testing and can design and analyze a Frequentist (Fixed Horizon) experiment.
- You can accurately estimate the sample size – You must be able to estimate the required sample size before starting your experiment. If the estimate is too low, your test may be underpowered. If it is too high, the test collects more data and runs longer than necessary.
- You expect small, precise changes – You are testing subtle optimizations where tight control matters.
- You can wait until the experiment ends to review results – You do not need to monitor or adjust results mid-test.
When to use Bayesian
Use Bayesian if the following are true:
- You need flexibility to check results anytime – You want to monitor results continuously throughout the experiment and make decisions as soon as sufficient evidence is gathered, without invalidating the test.
- You prefer intuitive interpretation – You want results expressed as direct probabilities (for example, "there is a 90% chance that Variant B is better than Variant A"), which can be easier to communicate to stakeholders.
- You want to make early decisions or reallocate traffic mid-test – You plan to act on results before the experiment ends, potentially leading to faster iterations and optimizations.
When to use sequential
Use sequential if the following are true:
- You need flexibility to check results anytime – You want to monitor results continuously throughout the experiment.
- You want to make early decisions or reallocate traffic mid-test – You plan to act on results before the experiment ends.
Next steps
See the following documentation to learn more about the different statistical methods Optimizely employs.
Frequentist (Fixed Horizon)
- Configure a Frequentist (Fixed Horizon) A/B test – Follow the step-by-step guide to configure a Frequentist (Fixed Horizon) A/B test.
- Frequentist (Fixed Horizon) statistics – Learn the mathematical and statistical concepts behind Frequentist (Fixed Horizon) testing.
Bayesian
- Configure a Bayesian A/B test – Follow the step-by-step guide to configure a Bayesian A/B test.
- Bayesian statistics – Learn the mathematical and statistical concepts behind Bayesian testing.
Sequential
- Stats Engine resources – Lists documentation showing how Stats Engine works, including comparisons with classical statistical methods.
- Statistical concepts – Provides a deeper explanation of confidence intervals, improvement intervals, false discovery rate, outliers, and related concepts.
- How and why statistical significance changes over time in Stats Engine – Explains how evidence builds and results evolve during an experiment.
- Calculate the statistical likelihoods of variations using Stats Engine – Explains what variation lifts and confidence levels mean.
- Why Stats Engine results sometimes differ from classical statistics – Clarifies the differences you may see if you are used to traditional statistical methods.