Building an Experiment Scorecard

The Experiment Scorecard compares experiment variations against business metrics stored in your data warehouse. This links experiments to business outcomes and gives deeper insight into user behavior and experiment performance.

As an example, suppose you are testing new personalized-content algorithms in the Flix streaming app, with the goal of boosting user engagement and increasing revenue. This example uses the Flix app and its demo datasets. You can also find these examples inside the app.

  1. On the left navigation panel, click + and select the Experiment Scorecard template.
  2. On the definition page, select the metric type from the drop-down. Here, we select Summary.
  3. Choose the Experiment using the selector. This can be an Optimizely Feature or Web experiment. Here, we choose an experiment by name. Next, select the dataset, which is the actor dataset containing the users in the experiment, and set the significance threshold. Here, we set them to Users and 95%, respectively.
exp-scr-1.gif
  4. Configure the Decision-making Metric. In this section, you can create a new metric or use a previously created one. A new metric can be one of three types: Conversion, Numeric aggregation, or Ratio. To use an existing metric, select one from the catalog or add a new block. Learn more about metrics in Experiment Scorecard. For this example, let's configure a Conversion metric: for the Measure Type, choose Conversion Rate, and select Play Content as the Event. Then select the format for the metric, which is set to Percentage by default. You can also add Secondary metrics if required.
exp-scr-2.gif
  5. (Optional) Add Guardrail Metrics. These are critical business indicators that you monitor closely during experiments. As in the previous step, you can choose from three options: Conversion, Numeric aggregation, or an existing metric. For our example, let's select an existing metric, Total Ad Revenue.
  6. Choose a Baseline from the drop-down. The Baseline is the variant against which the other variants are compared.
exp-scr-3.gif
  7. Once an experiment is selected, the time interval is automatically configured to match its duration, using metadata from Optimizely. Here, it is set to Since Sat Feb 12 2022.
  8. On the header, click Run Exploration to run the analysis. The scorecard displays a statistical significance table that shows which variation performed better on the selected metrics.
  9. Give the analysis a descriptive name and save it.
exp-scr-4.gif
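
To make the scorecard's output easier to interpret, here is a minimal sketch of how a Conversion Rate metric could be compared against a Baseline at a 95% significance threshold. It is an illustration only: the article does not state which statistical method the scorecard uses, so the two-proportion z-test below is an assumption, and the class, function, variation names, and user counts are all hypothetical rather than part of the product or the Flix demo data.

```python
from dataclasses import dataclass
from math import erfc, sqrt


@dataclass
class VariationStats:
    """Per-variation counts (hypothetical numbers, not Flix demo data)."""
    name: str
    users: int       # actors exposed to this variation
    converted: int   # actors who triggered the conversion event, e.g. Play Content

    @property
    def conversion_rate(self) -> float:
        return self.converted / self.users


def compare_to_baseline(baseline: VariationStats,
                        variation: VariationStats,
                        threshold: float = 0.95) -> dict:
    """Compare one variation to the baseline with a two-sided,
    pooled two-proportion z-test (an assumed method, see lead-in)."""
    if baseline.users <= 0 or variation.users <= 0:
        raise ValueError("each variation needs at least one user")

    p_base = baseline.conversion_rate
    p_var = variation.conversion_rate

    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (baseline.converted + variation.converted) / (
        baseline.users + variation.users
    )
    std_err = sqrt(pooled * (1 - pooled)
                   * (1 / baseline.users + 1 / variation.users))

    z = (p_var - p_base) / std_err if std_err > 0 else 0.0
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = erfc(abs(z) / sqrt(2))

    return {
        "variation": variation.name,
        "conversion_rate": p_var,
        "relative_lift": (p_var - p_base) / p_base if p_base > 0 else float("nan"),
        "p_value": p_value,
        # A 95% threshold corresponds to requiring p < 0.05.
        "significant": p_value < (1 - threshold),
    }


# Usage with made-up counts: one control and one new algorithm.
control = VariationStats("control", users=1000, converted=200)
algo_b = VariationStats("algo_b", users=1000, converted=260)

row = compare_to_baseline(control, algo_b, threshold=0.95)
print(f"{row['variation']}: rate={row['conversion_rate']:.1%}, "
      f"lift={row['relative_lift']:+.1%}, significant={row['significant']}")
```

Each variation is compared only against the chosen Baseline, which mirrors the Baseline drop-down in the steps above. Raising the threshold makes it harder for a difference to be flagged as significant.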