Section rollups in multivariate tests for Optimizely Web Experimentation

Relevant Products:
  • Optimizely Web Experimentation

This topic describes how to interpret section rollups and determine how interaction effects might affect your experiment.

In a multivariate test (MVT), section rollups let you see how each of a section’s variations performed against the others, aggregated across all the combinations it appears in.

Section rollups can help you:

  • Get insight into why a combination won or lost by understanding how a specific section variation contributed to the performance of combinations that used it

  • Understand how interaction effects impact the performance of combinations (knowing this can be critical to making responsible business decisions based on the outcome of a MVT)

  • Get insights from "noisy" MVT results where none of the combinations reached significance (each rollup pools visitors from several combinations, so a rollup has more statistical power than any single combination)

As an example, imagine your MVT has two sections (1 and 2), each with two variations (A and B).

These sections would generate the following combinations:

  • 1A and 2A

  • 1A and 2B

  • 1B and 2A

  • 1B and 2B

Combination   Section 1     Section 2
1             Variation A   Variation A
2             Variation A   Variation B
3             Variation B   Variation A
4             Variation B   Variation B
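
As a rough illustration of how a full factorial design produces the combinations in the table, here is a minimal Python sketch. The section names and the `full_factorial` helper are hypothetical and only mirror the example above; they are not part of any Optimizely API.

```python
from itertools import product

# Hypothetical section and variation labels matching the example above.
sections = {"Section 1": ["A", "B"], "Section 2": ["A", "B"]}

def full_factorial(sections):
    """Return every combination as a dict mapping section name to variation."""
    names = list(sections)
    return [dict(zip(names, combo)) for combo in product(*sections.values())]

combinations = full_factorial(sections)
for number, combo in enumerate(combinations, start=1):
    # The last section varies fastest, which matches the table's ordering.
    print(number, combo)
```

With two sections of two variations each, this yields four combinations; adding a third two-variation section would yield eight.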

If you roll up on section 1, you see combinations 1 and 2 compared against combinations 3 and 4. Optimizely measures the overall conversion rate of 1A vs. 1B, averaging across the variations of section 2.
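
The rollup on section 1 can be sketched in a few lines of Python. The visitor and conversion counts below are made up, and pooling raw counts per variation is an assumption about how the averaging is done; with equal traffic per combination it gives the same answer as averaging the per-combination rates.

```python
from collections import defaultdict

# Hypothetical counts: (section 1, section 2) -> (visitors, conversions).
results = {
    ("A", "A"): (1000, 100),  # combination 1
    ("A", "B"): (1000, 120),  # combination 2
    ("B", "A"): (1000, 130),  # combination 3
    ("B", "B"): (1000, 150),  # combination 4
}

def rollup(results, section_index):
    """Pool visitors and conversions by the variation shown in one section."""
    pooled = defaultdict(lambda: [0, 0])
    for combo, (visitors, conversions) in results.items():
        pooled[combo[section_index]][0] += visitors
        pooled[combo[section_index]][1] += conversions
    return {variation: conv / vis for variation, (vis, conv) in pooled.items()}

section_1 = rollup(results, 0)  # 1A pools combinations 1 and 2; 1B pools 3 and 4
print(section_1)
```

With these numbers, variation 1A converts at 11% and 1B at 14%, each computed over 2,000 visitors rather than the 1,000 in any single combination.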

Section rollups are only available for full factorial MVTs. This is because rolling up in a partial factorial MVT would be “unbalanced,” which means that there would not be an equal representation of all combinations.
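
To make the "unbalanced" problem concrete, the sketch below checks whether a set of combinations is a full factorial and shows which section 2 variations each section 1 variation is paired with. Both helpers are hypothetical illustrations, not Optimizely functions.

```python
from itertools import product

def is_full_factorial(combinations):
    """Return True if every pairing of the observed variations is present.

    Only variations that appear at least once are considered, so a design
    that drops a variation entirely is not detected by this check.
    """
    combos = set(combinations)
    if not combos:
        return False
    n_sections = len(next(iter(combos)))
    levels = [{combo[i] for combo in combos} for i in range(n_sections)]
    return combos == set(product(*levels))

def partners(combinations, section, variation):
    """Variations of the other sections shown alongside one variation."""
    return {
        tuple(v for i, v in enumerate(combo) if i != section)
        for combo in combinations
        if combo[section] == variation
    }

full = [("A", "A"), ("A", "B"), ("B", "A"), ("B", "B")]
partial = [("A", "A"), ("A", "B"), ("B", "A")]  # hypothetical partial design

print(is_full_factorial(full), is_full_factorial(partial))
print(partners(partial, 0, "A"), partners(partial, 0, "B"))
```

In the partial design, 1A is averaged over both section 2 variations while 1B is only ever seen with 2A, so a rollup on section 1 would mix the effect of section 1 with the effect of section 2.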

Optimizely treats these section rollup A/B tests just like any normal A/B test created outside of a MVT. It calculates significance and confidence intervals the same way, and applies the same false discovery rate (FDR) corrections to account for multiple variations (and multiple metrics if applicable) within that section.

The one crucial difference between a section rollup A/B test and an A/B test on that section run in isolation is the possibility of interaction effects appearing in the section rollup A/B test results.

Interaction effects

Interactions are effects that occur only when two or more variations from different sections appear together. They can be positive or negative, but either way, they make it more difficult to understand what causes the lift you see in a rollup view.

For example, suppose that you’re using a MVT to test a button’s text color in one section and the button’s background color in another section at the same time. If you test black variations for both text and background simultaneously, the combination will likely have a far stronger negative effect than either variation on its own.

That additional kick from the two black variations appearing together is the interaction effect.

If you had run an A/B test on text color alone, the improvement reported on the Results page would show the expected gain in conversions from changing the text color to black.

However, the section rollup A/B test improvement should not be interpreted this way. Because a large share of the visitors who saw black text came from the MVT combination showing black text against a black background, the conversion rate for black text is much lower than you would expect in cases where the text is black but the background is not. And because this effect is absent from the control for the text color section, the improvement (of black text over the control text color) reported at the rollup level is biased downward.

In other words, the negative response to black text on a black background will make it appear as though you should expect a smaller improvement from changing the text color to black than you are actually likely to get.
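
A small numeric sketch may make the bias easier to see. The conversion rates below are invented for illustration, and traffic is assumed to be split equally across the four combinations, so each rollup rate is a simple mean of two combination rates.

```python
# Hypothetical conversion rates, keyed by (text color, background color).
rates = {
    ("control", "control"): 0.10,
    ("black", "control"): 0.12,
    ("control", "black"): 0.09,
    ("black", "black"): 0.02,  # black text on a black background
}

def relative_improvement(variation_rate, baseline_rate):
    return (variation_rate - baseline_rate) / baseline_rate

def text_rollup_rate(text):
    # Equal traffic per combination is assumed, so the pooled rate is the mean.
    return (rates[(text, "control")] + rates[(text, "black")]) / 2

# Improvement from black text when the background stays at its control.
isolated = relative_improvement(rates[("black", "control")],
                                rates[("control", "control")])

# Improvement from black text as reported by the text color rollup.
rolled_up = relative_improvement(text_rollup_rate("black"),
                                 text_rollup_rate("control"))

# Interaction: how much the effect of black text changes once the
# background is also black (a difference of differences).
interaction = ((rates[("black", "black")] - rates[("control", "black")])
               - (rates[("black", "control")] - rates[("control", "control")]))

print(f"isolated: {isolated:+.1%}  rollup: {rolled_up:+.1%}  "
      f"interaction: {interaction:+.3f}")
```

With these made-up numbers, black text alone gives roughly a 20% improvement, while the rollup reports a decline of roughly 26%, because half of the black-text visitors saw the unreadable black-on-black combination.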

Find interaction effects in your results

To help protect yourself against interaction effects unduly influencing your business decisions, we recommend a section rollup workflow that starts with the MVT combinations and proceeds to section rollups from there.

  1. Examine MVT combinations for statistically significant winners

    The main reason you would use a MVT in the first place (instead of a standard A/B test) is that the MVT will give you information on potential interaction effects between sections. So the first step should always be to look at the MVT combinations. If you find statistically significant results here, use this combination as a basis for your decision, and do not proceed to the next step in this workflow.

    Otherwise, move on to the section rollup level.

  2. Examine section rollups for statistically significant winners

    Interaction effects make it more difficult to interpret results at the section rollup level. However, these interaction effects can be quite small, or they may have no practical impact at all. The severity of the interaction effects determines how you should interpret what’s reported on your Results page.

To determine whether strong interactions exist, start by comparing the estimated improvement from the variations in the section in question at both:

  • the section rollup level, and

  • the slice of the MVT containing combinations with controls in all sections except the section in question

For example, say you are working with a MVT containing three sections, each with two variations:


  • Button copy

  • Button background color

  • Button text color

In this example, you want to make a decision based on the rollup for the button copy section. To determine if there are interaction effects in your sections, compare the improvement shown in the button copy section rollup to the metrics shown for combinations 1 and 5 (the combinations where both button background color and button text color use their original colors).

If the MVT is free of interactions, the estimated improvement from these two levels should be close. If they’re not—especially when the sample size within each MVT cell is large—then that should be taken as a sign that interactions are present, and you should be cautious in interpreting results at the section rollup level.
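
The comparison between the two levels can be sketched as follows. The counts, the "original" and "variation" labels, and the `improvement_at_both_levels` helper are all hypothetical; the source does not define a threshold for "close", so the sketch only reports both numbers.

```python
def pooled_rate(cells):
    """Conversion rate after summing visitors and conversions over cells."""
    visitors = sum(v for v, _ in cells)
    conversions = sum(c for _, c in cells)
    return conversions / visitors

def improvement_at_both_levels(results, section, variation, control="original"):
    """Relative improvement of `variation` at the rollup and control-slice levels."""
    def select(level, controls_only):
        return [
            counts
            for combo, counts in results.items()
            if combo[section] == level
            and (not controls_only
                 or all(v == control for i, v in enumerate(combo) if i != section))
        ]

    improvements = {}
    for name, controls_only in (("rollup", False), ("control_slice", True)):
        baseline = pooled_rate(select(control, controls_only))
        treated = pooled_rate(select(variation, controls_only))
        improvements[name] = (treated - baseline) / baseline
    return improvements

# Hypothetical (visitors, conversions), keyed by
# (button copy, button background color, button text color).
results = {
    ("original", "original", "original"): (1000, 100),
    ("original", "original", "variation"): (1000, 100),
    ("original", "variation", "original"): (1000, 100),
    ("original", "variation", "variation"): (1000, 100),
    ("variation", "original", "original"): (1000, 120),
    ("variation", "original", "variation"): (1000, 120),
    ("variation", "variation", "original"): (1000, 120),
    ("variation", "variation", "variation"): (1000, 60),  # interaction
}

levels = improvement_at_both_levels(results, section=0, variation="variation")
print(levels)
```

Here the control slice shows about a 20% improvement for the new button copy while the rollup shows about 5%. A gap of that size, on reasonably large cells, would be the kind of sign that interactions are present.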

If there are no interaction effects, then the improvement reported in the section rollup is equivalent to the improvement that you would see from running this A/B test separately. The improvement you actually observe from deploying the variation should closely match the improvement reported on the Results page, regardless of what variations you deploy in the other sections.

If there are interaction effects, then the improvement reported in this section rollup is essentially an average over all the candidate variations in the other sections. The improvement you actually observe from deploying the variation may deviate substantially from the improvement reported on the Results page, depending on exactly which combination of variations you deploy from the other sections.

View results for section rollups

To see the results of an experiment broken out on the section rollup level, follow these steps:

  1. In your experiment, navigate to the Results tab.

  2. From the Section dropdown menu, select the section you are interested in.


The results for the rollup you selected appear.
