Preferred output examples

Preferred output examples are agent outputs you save as quality benchmarks for Output Evaluation. Opal uses these examples to score future runs of the same agent against your defined standard.

In Optimizely Opal, preferred output examples are on the Output Evaluation sub-tab of the Quality tab, in the Output Evaluation Examples (Single-Turn) section. For multi-turn agents, use the Conversation Evaluation Samples (Multi-Turn) section instead.

For details on adding examples and configuring the rest of Output Evaluation, see Configure Output Evaluation.

How preferred output examples work

Preferred output examples drive Opal's automated quality scoring. When you add an example, Opal stores both the input variables and the resulting output. Opal then compares future runs of the agent against your examples using an LLM-as-Judge model, which is a large language model (LLM) used to evaluate other model outputs against quality standards. Each run produces an Evaluation Score between 0% and 100%.

Opal marks runs at or above your baseline evaluation score as Passed. Opal marks runs below the baseline as Failed. Set the baseline between 75% and 95% in 5% increments.
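The pass/fail rule above can be sketched in a few lines of Python. This is only an illustration of the logic as described here, not Opal's implementation or API: the names `ALLOWED_BASELINES`, `validate_baseline`, and `classify_run` are hypothetical.

```python
# Hypothetical sketch of the pass/fail rule described above.
# None of these names come from Opal; they are illustrative only.

# Baselines run from 75% to 95% in 5% increments.
ALLOWED_BASELINES = (75, 80, 85, 90, 95)


def validate_baseline(baseline: int) -> int:
    """Return the baseline if it is one of the allowed values, else raise."""
    if baseline not in ALLOWED_BASELINES:
        raise ValueError(
            f"Baseline must be one of {ALLOWED_BASELINES}, got {baseline}"
        )
    return baseline


def classify_run(score: float, baseline: int) -> str:
    """Label a run by comparing its Evaluation Score (0-100) to the baseline."""
    validate_baseline(baseline)
    if not 0 <= score <= 100:
        raise ValueError(f"Score must be between 0 and 100, got {score}")
    # Runs at or above the baseline pass; runs below it fail.
    return "Passed" if score >= baseline else "Failed"
```

With a baseline of 85, for example, `classify_run(85, 85)` returns "Passed" because the comparison is inclusive, while `classify_run(84.9, 85)` returns "Failed".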

Preferred output examples work alongside these other Output Evaluation inputs:

  • Evaluation criteria – Statements that define what a quality output looks like.
  • Conversation samples – Reference conversations for multi-turn agents.
  • Default rubric – Quality dimensions covering accuracy, completeness, format consistency, and usefulness.

When to use preferred output examples

Preferred output examples work best when your agent produces consistent, structured outputs. Use them in the following scenarios:

  • You have an exemplary output that future runs should match.
  • The output format is consistent in structure, length, or style.
  • You want quality scoring to focus on similarity to specific examples rather than abstract criteria.

Evaluation criteria work better than examples for agents with high output variation, such as creative writing or brainstorming.

Add preferred output examples

Add up to five preferred output examples per agent, using either of the following methods:

  • Manually – Click Add Example in the Output Evaluation Examples section and provide the input variables and preferred output.
  • From execution logs – Open the run in the Logs tab, click More, and select Link as Output Eval.

See Configure Output Evaluation for the full procedure.
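To make the shape of an example concrete, here is a minimal Python sketch of what one holds (input variables plus the preferred output) and of the five-example limit. It is a mental model only and assumes nothing about how Opal stores examples; `PreferredOutputExample`, `AgentExamples`, and `MAX_EXAMPLES` are hypothetical names, and the sample variables are made up.

```python
# Hypothetical model of preferred output examples; not Opal's data format.
from dataclasses import dataclass, field
from typing import Dict, List

MAX_EXAMPLES = 5  # up to five preferred output examples per agent


@dataclass
class PreferredOutputExample:
    input_variables: Dict[str, str]  # the inputs the agent ran with
    preferred_output: str            # the output saved as the benchmark


@dataclass
class AgentExamples:
    examples: List[PreferredOutputExample] = field(default_factory=list)

    def add(self, example: PreferredOutputExample) -> None:
        """Add an example, enforcing the per-agent limit."""
        if len(self.examples) >= MAX_EXAMPLES:
            raise ValueError(f"An agent can have at most {MAX_EXAMPLES} examples")
        self.examples.append(example)


# Usage: one example for an imaginary product-description agent.
agent_examples = AgentExamples()
agent_examples.add(
    PreferredOutputExample(
        input_variables={"product": "Trail shoe", "tone": "friendly"},
        preferred_output="Meet the trail shoe built for muddy mornings.",
    )
)
```

Attempting a sixth `add` raises a `ValueError`, mirroring the limit stated above.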

Related articles

If you use Opti ID, administrators can turn off generative AI in the Opti ID Admin Center. See Turn generative AI off across Optimizely applications.