Section rollups in multivariate tests for Optimizely Web Experimentation

In a multivariate test (MVT) in Web Experimentation, using section rollups lets you see how each variation of a section performed against the others, summing across all the combinations it appears in. 

Section rollups let you:

  • Get an insight into why a combination won or lost by understanding how a specific section variation contributes to the performance of combinations that use it.

  • Understand how interaction effects impact the performance of combinations. You can make business decisions based on the outcome of an MVT.

  • Get insights from noisy MVT results where none of the combinations reach significance. Each section rollup pools visitors from several combinations, so it has more statistical power than any single combination.

Section rollups are only available in Optimizely Web Experimentation.

For example, an MVT has two sections (1 and 2), each with two variations (A and B). This generates the following combinations:

  • Section 1: Variation A and Section 2: Variation A – 1A and 2A

  • Section 1: Variation A and Section 2: Variation B – 1A and 2B

  • Section 1: Variation B and Section 2: Variation A – 1B and 2A

  • Section 1: Variation B and Section 2: Variation B – 1B and 2B

Or explained in a table:

Combination   Section 1     Section 2
1             Variation A   Variation A
2             Variation A   Variation B
3             Variation B   Variation A
4             Variation B   Variation B
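
The combinations of a full factorial MVT are the Cartesian product of each section's variations. The following is a minimal Python sketch of that enumeration. The section names, variation labels, and the `full_factorial` helper are illustrative assumptions, not Optimizely identifiers or API calls.

```python
from itertools import product

# Hypothetical section names mapped to their variations (illustrative only).
sections = {
    "Section 1": ["A", "B"],
    "Section 2": ["A", "B"],
}

def full_factorial(sections):
    """Return every combination as a dict of section name -> variation."""
    names = list(sections)
    return [dict(zip(names, combo)) for combo in product(*sections.values())]

combinations = full_factorial(sections)
for number, combo in enumerate(combinations, start=1):
    print(number, combo)
```

The order produced by `product` matches the numbering in the table: the last section changes fastest.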

If you roll up on Section 1, you can see a comparison for combinations 1 and 2 against combinations 3 and 4. From there, Optimizely measures the overall conversion rate of 1A versus 1B, and averages across the different values for Section 2.
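
As a rough illustration of the rollup arithmetic, the sketch below pools visitors and conversions by the variation shown in one section. The visitor and conversion counts are made up, the `rollup` function is hypothetical, and pooling is an assumption about how the aggregation works (with equal traffic per combination, pooling and a simple average of rates give the same number).

```python
# Hypothetical per-combination results:
# (Section 1 variation, Section 2 variation) -> (visitors, conversions).
results = {
    ("A", "A"): (1000, 100),
    ("A", "B"): (1000, 120),
    ("B", "A"): (1000, 130),
    ("B", "B"): (1000, 150),
}

def rollup(results, section_index):
    """Pool visitors and conversions by the variation shown in one section."""
    pooled = {}
    for combo, (visitors, conversions) in results.items():
        key = combo[section_index]
        v, c = pooled.get(key, (0, 0))
        pooled[key] = (v + visitors, c + conversions)
    # Conversion rate per variation of the chosen section.
    return {key: c / v for key, (v, c) in pooled.items()}

# Roll up on Section 1: combinations 1 and 2 versus combinations 3 and 4.
section_1_rates = rollup(results, section_index=0)
print(section_1_rates)
```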

Section rollups are only available for full factorial MVTs. A partial factorial MVT does not give every combination equal representation, so a rollup would be unbalanced: the variations of a section would not be averaged over the same mix of variations in the other sections. See Traffic allocation in MVTs: Full factorial versus partial factorial.

Optimizely treats a section rollup A/B test like any A/B test created outside of an MVT. It calculates significance and confidence intervals in the same way and applies the same false discovery rate (FDR) corrections to account for multiple variations (and multiple metrics, if applicable) within that section.
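
To give a feel for what an FDR correction does, here is a sketch of the classic Benjamini-Hochberg procedure. This article does not say which procedure Optimizely applies, so treat this as a generic textbook example rather than a description of Optimizely's implementation. The p-values are made up.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * q:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[index] = True
    return rejected

# Hypothetical p-values, one per variation-and-metric comparison in a section.
p_values = [0.01, 0.04, 0.03, 0.20]
decisions = benjamini_hochberg(p_values, q=0.05)
print(decisions)
```

Note that 0.03 and 0.04 would each pass an uncorrected 0.05 threshold, but neither survives the correction in this example.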

The main difference between a section rollup A/B test and an A/B test on that section run in isolation is that interaction effects can appear in the section rollup A/B test results.

Interaction effects

Interactions are effects that occur only when two or more variations from different sections appear together. They can be positive or negative, but either way they can make it more difficult to understand what causes the lift in a rollup view.

For example, suppose that you are using an MVT to test a button’s text color in one section and the button’s background color in another section at the same time. If you test black variations for both text and background simultaneously, the combined effect can be more negative than the effects of each variation on its own.

The interaction effect is the additional effect from the two black variations appearing together.

If you run an A/B test on text color alone, the improvement reported on the experimentation results page shows the expected gain in conversions from changing the text color to black.

However, the section rollup A/B test improvement should not be interpreted this way. Many of the visitors who saw black text in the MVT saw it against a black background, so the rolled-up conversion rate for black text is much lower than you would expect when the text is black but the background is not.

Because this effect is absent in the control for the text color section, the improvement (of black text over the control text color) reported at the rollup level is biased downwards.

In other words, the negative response to black text on a black background will make it appear that you should expect a smaller improvement from changing the text color to black than you are likely to get.
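
The downward bias can be seen in a small worked example. All conversion rates below are invented for illustration, and the calculation assumes equal traffic to each combination.

```python
# Hypothetical true conversion rates, keyed by (text color, background color).
rates = {
    ("control", "control"): 0.10,
    ("black", "control"): 0.12,
    ("control", "black"): 0.10,
    ("black", "black"): 0.04,  # black on black: strong negative interaction
}

def relative_improvement(variation_rate, control_rate):
    """Relative lift of the variation over the control."""
    return (variation_rate - control_rate) / control_rate

# Text color tested alone, with the background left at its control value.
isolated = relative_improvement(
    rates[("black", "control")], rates[("control", "control")]
)

# Text color rollup: average over both backgrounds (equal traffic assumed).
black_text = (rates[("black", "control")] + rates[("black", "black")]) / 2
control_text = (rates[("control", "control")] + rates[("control", "black")]) / 2
rolled_up = relative_improvement(black_text, control_text)

print(f"isolated: {isolated:.2f}, rollup: {rolled_up:.2f}")
```

With these made-up numbers, black text looks like a loss in the rollup even though it would be a gain on the original background, because the black-on-black combination drags the rolled-up average down.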

Find interaction effects in your results

To help protect yourself against interaction effects unduly influencing your business decisions, Optimizely recommends a section rollup workflow that begins by looking at MVT combinations and proceeds to section rollups.

  1. Examine MVT combinations for statistically significant winners.

    The main reason to use an MVT in the first place (instead of a standard A/B test) is that the MVT gives you information on potential interaction effects between sections. So the first step should always be to look at the MVT combinations. If you find statistically significant results here, use this combination as a basis for your decision, and do not proceed to the next step in this workflow.

    Otherwise, move on to the section rollup level.

  2. Examine section rollups for statistically significant winners.

    Interaction effects make it more difficult to interpret results at the section rollup level. However, these interaction effects can easily be quite small, or they may have no practical impact. The severity of the interaction effects will determine how you interpret what is reported on your experimentation results page.

To determine whether strong interactions exist, start by comparing the estimated improvement from the variations in the section in question at both:

  • The section rollup level.
  • The slice of the MVT containing combinations with controls in all sections except the section in question.

For example, say you are working with an MVT containing three sections, each with two variations:

Combination   Button copy   Button background color   Button text color
1             Original      Original                  Original
2             Original      Original                  Variation
3             Original      Variation                 Original
4             Original      Variation                 Variation
5             Variation     Original                  Original
6             Variation     Original                  Variation
7             Variation     Variation                 Original
8             Variation     Variation                 Variation

In this example, you want to make a decision based on the rollup for the button copy section. To determine if there are interaction effects in your sections, compare the improvement shown in the button copy section rollup to the metrics shown for combinations 1 and 5 (the combinations where both button background color and button text color use their original colors).

If the MVT is free of interactions, the estimated improvements from these two levels should be close. If they are not, especially when the sample size within each MVT cell is large, take that as a sign that interactions are present, and be cautious in interpreting results at the section rollup level.
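
The comparison described above can be sketched as follows for the button copy section. The counts are invented, the `improvement` helper is hypothetical, and the sketch compares point estimates only: it ignores statistical noise, so a gap between the two numbers is a prompt for caution rather than proof of an interaction.

```python
O, V = "Original", "Variation"

# Hypothetical results keyed by (copy, background color, text color)
# -> (visitors, conversions).
results = {
    (O, O, O): (1000, 100),  # combination 1
    (O, O, V): (1000, 102),
    (O, V, O): (1000, 98),
    (O, V, V): (1000, 100),
    (V, O, O): (1000, 120),  # combination 5
    (V, O, V): (1000, 118),
    (V, V, O): (1000, 122),
    (V, V, V): (1000, 60),
}

def improvement(results, section, keep=lambda combo: True):
    """Relative lift of Variation over Original in one section,
    pooled over the combinations selected by `keep`."""
    totals = {O: [0, 0], V: [0, 0]}
    for combo, (visitors, conversions) in results.items():
        if keep(combo):
            totals[combo[section]][0] += visitors
            totals[combo[section]][1] += conversions
    control = totals[O][1] / totals[O][0]
    variation = totals[V][1] / totals[V][0]
    return (variation - control) / control

# Section rollup level: all eight combinations.
rollup_lift = improvement(results, section=0)
# Control slice: combinations 1 and 5 only (other sections at Original).
slice_lift = improvement(results, section=0, keep=lambda c: c[1] == O and c[2] == O)

print(f"rollup: {rollup_lift:.2f}, control slice: {slice_lift:.2f}")
```

In this invented data set the two estimates differ widely, which is the pattern that should make you cautious about the rollup.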

If there are no interaction effects, then the improvement reported in the section rollup is equivalent to the improvement that you would see from running this A/B test separately. The improvement you actually observe from deploying the variation should closely match the improvement reported on the results page, regardless of what variations you deploy in the other sections.

If there are interaction effects, then the improvement reported in this section rollup is essentially an average over all the candidate variations in the other sections. The improvement you actually observe from deploying the variation may differ substantially from the improvement reported on the results page, depending on exactly which combination of variations you deploy from the other sections.

View results for section rollups

To see the results of an experiment broken out on the section rollup level, follow these steps:

  1. In Optimizely Web Experimentation, select your MVT experiment and open the Results page.

  2. From the Section dropdown menu, select the section you are interested in.

The results for the rollup you selected appear.