Use contextual bandits with Optimizely to maximize conversions by serving each user the variation predicted to perform best on the primary metric, given that user's attributes. While multi-armed bandits (MABs) look for the single best-performing variation for all users, contextual bandits pick a winning variation for each user based on their contextual profile and its impact on the primary metric.
The benefits include the following:
- Provides the most personalized experience for every user.
- Increases the chances of conversions on the primary metric.
- Adapts to changes in visitor behavior, dynamically serving the best variation in every session.
Like MABs, contextual bandits optimize toward the primary metric but also account for user attributes in that optimization. They do not have sticky bucketing like A/B tests. See Experimentation distribution modes for information on Optimizely's automated distribution modes and use cases.
Configuration
How do you configure a contextual bandit?
You can create a contextual bandit within an Optimizely Personalization campaign by selecting Contextual Bandit as the Distribution Mode. This requires you to choose your distribution goal and attach at least one user attribute.
See Configure a contextual bandit.
Are there any restrictions after you start a contextual bandit?
You cannot change the distribution goal while the contextual bandit is running, but you can update the distribution goal while the campaign is paused.
You cannot add or remove user attributes after the contextual bandit has started, even while the campaign is paused.
What does each distribution goal mean?
- Automated – No static values defined. The values of the exploration and exploitation rates are dynamically updated over time, with the exploitation rate gradually increasing as the model receives more data.
- Maximize Personalization – Static values of 10% exploration rate and 90% exploitation rate.
- Manual – Configure exploration and exploitation rates as you choose.
See Distribution goals.
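The three goals above can be summarized in a short sketch. This is illustrative only: the function and parameter names are hypothetical and are not part of Optimizely's API; the 5%–50% manual bounds are described later in this article.

```python
def rates_for_goal(goal, exploration=None):
    """Hypothetical helper summarizing how each distribution goal
    determines the exploration and exploitation rates."""
    if goal == "maximize_personalization":
        # Static split: 10% exploration, 90% exploitation.
        return {"exploration": 0.10, "exploitation": 0.90}
    if goal == "manual":
        # Manual rates must stay within the allowed thresholds.
        if not 0.05 <= exploration <= 0.50:
            raise ValueError("manual exploration rate must be between 5% and 50%")
        return {"exploration": exploration, "exploitation": 1 - exploration}
    if goal == "automated":
        # No static values: rates adjust dynamically as the model learns.
        return None
    raise ValueError(f"unknown goal: {goal}")
```

For example, `rates_for_goal("manual", exploration=0.20)` yields an 80% exploitation rate.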
What is the difference between the exploration rate and the exploitation rate?
- Exploration rate – The probability of the model serving a random variation to a visitor, also known as the learning rate.
- Exploitation rate – The probability of the model serving the most personalized variation to a visitor.
For example, setting the exploration rate to 10% means that on one out of every ten visits, Optimizely intentionally serves a random variation to observe user behavior, letting the model learn continuously.
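The explore/exploit split described above can be sketched as a simple randomized choice. This is a minimal illustration of the concept, not Optimizely's implementation; the variation names are hypothetical.

```python
import random

def choose_variation(variations, best_for_user, exploration_rate=0.10):
    """With probability equal to the exploration rate, serve a random
    variation so the model keeps learning; otherwise serve the variation
    predicted best for this user. Returns (variation, explored_flag)."""
    if random.random() < exploration_rate:
        return random.choice(variations), True   # explore
    return best_for_user, False                  # exploit

random.seed(0)  # deterministic for the demonstration
variations = ["A", "B", "C"]
results = [choose_variation(variations, "B") for _ in range(10_000)]
explored = sum(1 for _, was_explore in results if was_explore)
# At a 10% exploration rate, roughly 1,000 of 10,000 visits are exploratory.
```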
What are the thresholds on the exploration rate and exploitation rate, and why?
When configured manually, the minimum exploration rate is 5%, as it is necessary for the model to continuously learn to increase its confidence and accuracy over time. The maximum exploration rate is 50%. Otherwise, the model serves random variations most of the time, defeating the purpose of running a contextual bandit.
As a result, the exploitation rate limits for manual configurations are 50% for the minimum and 95% for the maximum. When set to automatic, the exploration rate adjusts from 100% to 5%, decreasing with a cadence dependent on several parameters, like the volume of events.
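The automatic schedule can be pictured as a decaying exploration rate. The sketch below is illustrative only: Optimizely's real cadence depends on several internal parameters (such as event volume) and is not a simple linear decay; the function name and `decay_events` threshold are assumptions.

```python
def automated_exploration_rate(events_seen, decay_events=50_000, floor=0.05):
    """Illustrative decay: start at a 100% exploration rate and decrease
    toward the 5% floor as the model observes more events."""
    return max(floor, 1.0 - events_seen / decay_events)
```

Early on the model explores every visit; once enough events accumulate, it settles at the 5% minimum needed for continuous learning.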
Is there an optimal number of variations to add to a contextual bandit? What about user attributes?
While there is no set limit, adding more variations or user attributes increases the size of the model, which requires more events for the model to learn appropriately before exploiting traffic and can slow responses for bucketing decisions. For optimal performance, Optimizely recommends a maximum of 20 variations and 50 user attributes.
Attributes (the user profile)
Where is the list of user attributes coming from?
The list of user attributes comes from custom attributes created using the API and from external table attributes (not list attributes) uploaded using Dynamic Customer Profiles (DCP). You can find these on the Audiences > Attributes page of your project. Several standard attributes are also available by default.
Why are list attributes not available to add to a contextual bandit?
List attributes are used to upload a list of values (like IDs or zip codes) to target as audience conditions. The contextual bandit model instead needs key-value pairs, a value associated with each ID, to process attributes.
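The difference in data shape can be shown with a small example. The attribute names and values here are hypothetical, chosen only to contrast a bare list with the key-value profile the model can learn from.

```python
# A list attribute is just values with no per-user signal attached;
# it can answer "is this visitor in the list?" but nothing more.
list_attribute = ["94105", "10001", "60601"]

# The contextual bandit model needs key-value pairs per user (a profile),
# so each attribute carries a value the model can condition on.
user_profile = {"zip_code": "94105", "plan": "pro", "visits": 12}
```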
What user attributes should you use?
The user attributes you should use depend on the use case of your contextual bandit. You should use user attributes that are directly related to the use case. Ask yourself, "What data do we have that would be important for the model to know to effectively distribute the variations?"
It is critical that the attributes you feed into the model are enriched with actual user data. Otherwise, they do not significantly impact the model distribution.
How does the model prioritize the user attributes you select?
Tree-based models can determine how much a given attribute contributes to predictive accuracy; this model type has a built-in capability to quantify and rank the importance of attributes. Because of this, Optimizely can report the attributes of importance on the Contextual Bandit Results page.
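A simplified version of this ranking is the impurity reduction a tree gains by splitting on each attribute. The sketch below is a toy stand-in for how tree-based models score attribute importance, not Optimizely's model; the user profiles and attribute names are invented for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of 0/1 conversion labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def importance(rows, labels, attribute):
    """Impurity reduction from splitting on one categorical attribute,
    a simplified stand-in for tree-based attribute importance."""
    n = len(labels)
    parent = gini(labels)
    weighted_child = 0.0
    for value in set(r[attribute] for r in rows):
        idx = [i for i, r in enumerate(rows) if r[attribute] == value]
        weighted_child += len(idx) / n * gini([labels[i] for i in idx])
    return parent - weighted_child

# Hypothetical profiles: "plan" separates converters cleanly,
# "browser" carries no signal, so "plan" should rank higher.
rows = [
    {"plan": "pro",  "browser": "chrome"},
    {"plan": "pro",  "browser": "safari"},
    {"plan": "free", "browser": "chrome"},
    {"plan": "free", "browser": "safari"},
]
labels = [1, 1, 0, 0]
ranked = sorted(["plan", "browser"],
                key=lambda a: importance(rows, labels, a), reverse=True)
```

Attributes that never separate converters from non-converters score near zero, which is why unenriched attributes do not significantly influence the distribution.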
Does Optimizely send user profile information as part of the Web Experimentation snippet?
No, Optimizely only sends identifiers to the snippet. Based on those identifiers, the snippet calls the Dynamic Customer Profiles (DCP) API to evaluate the audience targeting conditions and returns true or false for each condition. You can choose which information to share in the browser: DCP lets you send user profile information to the Web Experimentation snippet through the DCP API, which you must enable when creating the DCP table.
Model
Which type of machine learning (ML) model do contextual bandits use?
Optimizely developed tree-based models for binary classification (deciding between binary categories) and regression (predicting numerical values) tasks.
How does the ML model balance the impact on the primary metric and user attributes with dynamic traffic allocation?
The ML model optimizes towards the selected primary metric and uses the user attributes accordingly to determine the best variation to serve to a given visitor.
How long is the learning period (where the exploration rate is 100%) for the model?
Optimizely does not quantify the learning period in terms of the number of days. Instead, it depends on traffic and volume of events, as the model begins exploiting when it reaches a threshold for quality control.
Where do contextual bandits run?
The contextual bandit model (owned by the Optimizely Data Science team) runs on a server behind a prediction API, not on the browser.
When you activate a contextual bandit experiment and the system buckets the visitor into it, the snippet asynchronously calls the prediction API endpoint. This endpoint evaluates the visitor information and returns the variation ID to display to the visitor. Because the call is asynchronous, it does not impact page performance.
The Experimentation snippet does not include the model’s information apart from the prediction API URL.
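The activation flow above can be sketched as a non-blocking call with a callback. This is illustrative only: the real snippet runs in the browser and makes an HTTP request to the prediction API; here a stub response and hypothetical names stand in for the remote model.

```python
import threading

applied = []  # records which variation the "page" ended up showing

def apply_variation(variation_id):
    """Stand-in for the snippet rendering the returned variation."""
    applied.append(variation_id)

def call_prediction_api(visitor_id, attributes, on_result):
    """Fire the prediction request without blocking page execution.
    The worker simulates the HTTP round trip to the prediction endpoint."""
    def worker():
        variation_id = "variation_1"  # hypothetical API response
        on_result(variation_id)
    t = threading.Thread(target=worker)
    t.start()  # page execution continues while the request is in flight
    return t

t = call_prediction_api("visitor_42", {"plan": "pro"}, apply_variation)
t.join()  # for the demonstration only; the snippet never blocks like this
```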
How does Optimizely treat new users with no context? Do known users influence the variation shown to new users?
Optimizely does not require visitors to have previous visits or sessions. Each visitor has attributes, and the more complete the attributes are, the better the model performs. The model uses knowledge obtained from previous visitors with similar attributes to enhance its performance.
Metrics
What metric does a contextual bandit move?
A contextual bandit optimizes toward the primary metric. You should choose a primary metric that is tracked on the same page as the contextual bandit, or one close to it, so the model can tell whether a user converted because of the contextual bandit. For this reason, overall revenue is not a good primary metric for contextual bandits.
Do contextual bandits get to statistical significance?
Contextual bandits (like multi-armed bandits (MABs)) do not achieve statistical significance because their goal is to optimize traffic allocation dynamically to maximize conversions rather than focus on testing for statistical significance. MAB algorithms do not rely on a fixed sample size or equal traffic allocation, making traditional statistical significance measures less relevant.
Do contextual bandits perform better than multi-armed bandits (MABs)?
Ideally, contextual bandits get better results than regular MABs, but that depends on multiple factors, such as data volume (events) and the number and quality of attributes.