Test assertions

Assertions test the output of a large language model (LLM) against expected values or conditions. They are not required, but they are a useful way to automate the analysis of prompt engineering results.

Different types of assertions can be used to validate the output in various ways, such as checking for equality, JSON structure, similarity, or custom functions.

Using assertions

To use assertions in your test cases, add an assert property to the test case with an array of assertion objects. Each assertion object should have a type property indicating the assertion type and any additional properties required for that assertion type.

Example:

tests:
  - description: "Test if output is equal to the expected value"
    vars:
      example: "Hello, World!"
    assert:
      - type: equals
        value: "Hello, World!"

Assertion properties

Property	Type	Required	Description
type	string	Yes	Type of assertion
value	string	No	The expected value, if applicable
threshold	number	No	The threshold value, only applicable for similar
provider	string	No	Some assertions (similar, llm-rubric) require an LLM provider

Assertion Types

Assertion Type	Returns true if...
`equals`	output matches exactly
`contains`	output contains substring
`icontains`	output contains substring, case insensitive
`regex`	output matches regex
`contains-some`	output contains at least one of a list of substrings
`contains-all`	output contains every substring in a list
`is-json`	output is valid JSON
`contains-json`	output contains valid JSON
`javascript`	provided JavaScript function validates the output
`webhook`	provided webhook returns `{pass: true}`
`similar`	embeddings and cosine similarity are above a threshold
`llm-rubric`	LLM output matches a given rubric, using a Language Model to grade output

Tip

Every assertion type can be negated by prepending not-. For example, not-equals or not-regex.
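To make the table and the not- prefix concrete, here is a minimal sketch of how the string-based assertion types could be evaluated. The names checks and runAssertion are hypothetical and are not part of promptfoo; the sketch only mirrors the behavior described in the table above.

```javascript
// Hypothetical check functions for the string-based assertion types.
const checks = {
  equals: (output, value) => output === value,
  contains: (output, value) => output.includes(value),
  icontains: (output, value) =>
    output.toLowerCase().includes(value.toLowerCase()),
  'contains-some': (output, values) => values.some((v) => output.includes(v)),
  'contains-all': (output, values) => values.every((v) => output.includes(v)),
};

// Evaluate one assertion object, inverting the result for a "not-" prefix.
function runAssertion(assertion, output) {
  const negated = assertion.type.startsWith('not-');
  const baseType = negated ? assertion.type.slice('not-'.length) : assertion.type;
  const check = checks[baseType];
  if (!check) {
    throw new Error(`Unsupported assertion type: ${assertion.type}`);
  }
  const pass = check(output, assertion.value);
  return negated ? !pass : pass;
}
```

With this sketch, runAssertion({ type: 'not-contains', value: 'sorry' }, output) passes only when the output does not contain "sorry".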

Equality

The equals assertion checks if the LLM output is equal to the expected value.

Example:

assert:
  - type: equals
    value: "The expected output"

Contains

The contains assertion checks if the LLM output contains the expected value.

Example:

assert:
  - type: contains
    value: "The expected substring"

The icontains assertion is the same, except that it ignores case:

assert:
  - type: icontains
    value: "The expected substring"

Regex

The regex assertion checks if the LLM output matches the provided regular expression.

Example:

assert:
  - type: regex
    value: "\\d{4}" # Matches a 4-digit number
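The doubled backslash is needed because the value is a double-quoted YAML string: the YAML parser turns "\\d{4}" into the pattern \d{4}. The sketch below shows the equivalent check in JavaScript. It assumes the pattern is tested against any part of the output rather than the whole string; matchesRegex is a hypothetical helper, not a promptfoo function.

```javascript
// The same pattern written as a JavaScript string literal.
// As in double-quoted YAML, "\\" stands for a single backslash.
const fourDigits = '\\d{4}';

// Hypothetical helper: true if the pattern matches anywhere in the output.
function matchesRegex(output, pattern) {
  return new RegExp(pattern).test(output);
}

// Anchors restrict the match to the entire output.
const onlyFourDigits = '^\\d{4}$';
```

Under that assumption, an output such as "Founded in 1998" matches fourDigits but not onlyFourDigits.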

Contains-Some

The contains-some assertion checks if the LLM output contains at least one of the specified values.

Example:

assert:
  - type: contains-some
    value:
      - "Value 1"
      - "Value 2"
      - "Value 3"

Contains-All

The contains-all assertion checks if the LLM output contains all of the specified values.

Example:

assert:
  - type: contains-all
    value:
      - "Value 1"
      - "Value 2"
      - "Value 3"

Is-JSON

The is-json assertion checks if the LLM output is a valid JSON string.

Example:

assert:
  - type: is-json

Contains-JSON

The contains-json assertion checks if the LLM output contains a valid JSON structure.

Example:

assert:
  - type: contains-json
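The difference between is-json and contains-json can be illustrated with a rough sketch. Both helper names are hypothetical, and the substring search is only one possible reading of "contains a valid JSON structure"; the actual check may be implemented differently.

```javascript
// Hypothetical helper: the entire output must parse as JSON.
function isJson(output) {
  try {
    JSON.parse(output);
    return true;
  } catch {
    return false;
  }
}

// Hypothetical helper: some object or array embedded in the output parses
// as JSON. Tries every span that starts at "{" or "[" and ends at the
// matching kind of closing character, longest span first.
function containsJson(output) {
  const closers = { '{': '}', '[': ']' };
  for (let start = 0; start < output.length; start++) {
    const closer = closers[output[start]];
    if (!closer) continue;
    for (
      let end = output.lastIndexOf(closer);
      end > start;
      end = output.lastIndexOf(closer, end - 1)
    ) {
      if (isJson(output.slice(start, end + 1))) return true;
    }
  }
  return false;
}
```

In this sketch, an output like 'Result: {"a": 1} done' fails isJson but passes containsJson.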

Javascript

The javascript assertion allows you to provide a custom JavaScript function to validate the LLM output. The function should return true if the output passes the assertion, and false otherwise.

Example:

assert:
  - type: javascript
    value: "output.includes('Hello, World!')"

You may also return a number, which will be treated as a score:

assert:
  - type: javascript
    value: Math.log(output.length) * 10

Webhook

The webhook assertion sends the LLM output to a specified webhook URL for custom validation. The webhook should return a JSON object with a pass property set to true or false.

Example:

assert:
  - type: webhook
    value: "https://example.com/webhook"

The webhook will receive a POST request with a JSON payload containing the LLM output and the context (test case variables). For example, if the LLM output is "Hello, World!" and the test case has a variable example set to "Example text", the payload will look like:

{
  "output": "Hello, World!",
  "context": {
    "vars": {
      "example": "Example text"
    }
  }
}

The webhook should process the request and return a JSON response with a pass property set to true or false, indicating whether the LLM output meets the custom validation criteria. Optionally, the webhook can also provide a reason property to describe why the output passed or failed the assertion.

Example response:

{
  "pass": true,
  "reason": "The output meets the custom validation criteria"
}

If the webhook returns a pass value of true, the assertion will be considered successful. If it returns false, the assertion will fail, and the provided reason will be used to describe the failure.

You may also return a score:

{
  "pass": true,
  "score": 0.5,
  "reason": "The output meets the custom validation criteria"
}
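As an illustration, a webhook along these lines could be written with Node's built-in http module. This is a minimal sketch, not a reference implementation: the evaluate function, the MAX_LENGTH limit, and the validation rule itself are all hypothetical stand-ins for your own criteria.

```javascript
const http = require('node:http');

// Hypothetical rule: the output must be non-empty and at most 280 characters.
const MAX_LENGTH = 280;

// Turn the request payload ({ output, context }) into a webhook response.
function evaluate(payload) {
  const output = typeof payload?.output === 'string' ? payload.output : '';
  if (output.length === 0) {
    return { pass: false, reason: 'Output is empty' };
  }
  if (output.length > MAX_LENGTH) {
    return { pass: false, reason: `Output exceeds ${MAX_LENGTH} characters` };
  }
  return { pass: true, reason: 'Output is non-empty and within the length limit' };
}

// Build (but do not start) an HTTP server that answers POST requests.
function createWebhookServer() {
  return http.createServer((req, res) => {
    if (req.method !== 'POST') {
      res.writeHead(405);
      res.end();
      return;
    }
    let body = '';
    req.on('data', (chunk) => {
      body += chunk;
    });
    req.on('end', () => {
      let result;
      try {
        result = evaluate(JSON.parse(body));
      } catch {
        result = { pass: false, reason: 'Request body could not be parsed as JSON' };
      }
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(result));
    });
  });
}
```

To try it, call createWebhookServer().listen(3000) and set the assertion's value to the URL where that server is reachable.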

Similarity

The similar assertion checks if the LLM output is semantically similar to the expected value. It compares embeddings of the two texts and passes when their cosine similarity reaches the threshold.

Example:

assert:
  - type: similar
    value: "The expected output"
    threshold: 0.8
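For reference, cosine similarity between two embedding vectors can be computed as in the sketch below. The function names are hypothetical, the embeddings are assumed to come from whichever provider is configured, and treating a score exactly equal to the threshold as a pass is an assumption.

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as not similar to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical pass rule: similarity at or above the threshold (assumption).
function isSimilar(outputEmbedding, expectedEmbedding, threshold) {
  return cosineSimilarity(outputEmbedding, expectedEmbedding) >= threshold;
}
```

A higher threshold demands closer agreement: vectors pointing in the same direction score 1, and orthogonal vectors score 0.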

LLM-Rubric

The llm-rubric assertion checks if the LLM output matches a given rubric, using a Language Model to grade the output based on the rubric.

Example:

assert:
  - type: llm-rubric
    value: "The expected output"

The evaluation output indicates PASS or FAIL based on the LLM's assessment (see example setup and outputs).

Load an external tests file

The Tests file is an optional format that lets you specify test cases outside of the main config file.

To add an assertion to a test case in a tests file, use the special __expected column.

Here's an example tests.csv:

text,__expected
"Hello, world!","Bonjour le monde"
"Goodbye, everyone!","fn:output.includes('Au revoir');"
"I am a pineapple","grade:doesn't reference any fruits besides pineapple"

All assertion types can be used in __expected. The column supports exactly one assertion.

is-json and contains-json are supported directly, and do not require any value
fn indicates javascript type. For example: fn:output.includes('foo')
similar takes a threshold value. For example: similar(0.8):hello world
grade indicates llm-rubric. For example: grade: does not mention being an AI
By default, __expected will use type equals
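The rules above can be sketched as a small parser that turns an __expected cell into an assertion object. parseExpected is a hypothetical name, the sketch covers only the forms listed above, and details such as trimming whitespace after the prefix are assumptions.

```javascript
// Hypothetical parser for the __expected forms described above.
function parseExpected(expected) {
  // is-json and contains-json need no value.
  if (expected === 'is-json' || expected === 'contains-json') {
    return { type: expected };
  }
  // fn:<expression> maps to the javascript type.
  if (expected.startsWith('fn:')) {
    return { type: 'javascript', value: expected.slice('fn:'.length) };
  }
  // grade:<rubric> maps to the llm-rubric type.
  if (expected.startsWith('grade:')) {
    return { type: 'llm-rubric', value: expected.slice('grade:'.length).trim() };
  }
  // similar(<threshold>):<text> maps to the similar type.
  const similar = expected.match(/^similar\(([\d.]+)\):([\s\S]*)$/);
  if (similar) {
    return {
      type: 'similar',
      value: similar[2].trim(),
      threshold: parseFloat(similar[1]),
    };
  }
  // Anything else is compared for equality.
  return { type: 'equals', value: expected };
}
```

For the tests.csv example above, the first row would become an equals assertion, the second a javascript assertion, and the third an llm-rubric assertion.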

When the __expected field is provided, the success and failure statistics in the evaluation summary will be based on whether the expected criteria are met.

For more advanced test cases, we recommend using a testing framework like Jest or Mocha and using promptfoo as a library.

Reusing assertions with templates

If you have a set of common assertions that you want to apply to multiple test cases, you can create assertion templates and reuse them across your configuration.

assertionTemplates:
  containsMentalHealth:
    type: javascript
    value: output.toLowerCase().includes('mental health')

prompts: [prompt1.txt, prompt2.txt]
providers: [openai:gpt-3.5-turbo, localai:chat:vicuna]
tests:
  - vars:
      input: Tell me about the benefits of exercise.
    assert:
      - $ref: "#/assertionTemplates/containsMentalHealth"
  - vars:
      input: How can I improve my well-being?
    assert:
      - $ref: "#/assertionTemplates/containsMentalHealth"

In this example, the containsMentalHealth assertion template is defined at the top of the configuration file and then reused in two test cases. This approach helps maintain consistency and reduces duplication in your configuration.
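Conceptually, each $ref is replaced by the template it points to before the test runs. The sketch below shows one way such a local reference could be resolved on a parsed config object. resolveRef and expandAssertions are hypothetical names, and the real resolution logic may differ.

```javascript
// Follow a local reference such as "#/assertionTemplates/containsMentalHealth".
function resolveRef(config, ref) {
  if (!ref.startsWith('#/')) {
    throw new Error(`Only local references are handled in this sketch: ${ref}`);
  }
  return ref
    .slice(2)
    .split('/')
    .reduce((node, key) => {
      if (node === null || typeof node !== 'object' || !(key in node)) {
        throw new Error(`Unresolved reference: ${ref}`);
      }
      return node[key];
    }, config);
}

// Return the tests with every { $ref } assertion replaced by its template.
function expandAssertions(config) {
  return config.tests.map((test) => ({
    ...test,
    assert: (test.assert ?? []).map((assertion) =>
      assertion.$ref ? resolveRef(config, assertion.$ref) : assertion
    ),
  }));
}
```

Applied to the configuration above, both test cases would end up with the same javascript assertion.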
