hbllmutils.testing.alive
This module provides basic binary (pass/fail) tests for checking that an LLM model is responsive.
Each test verifies that the model can handle a simple interaction. The tests are built on the BinaryTest framework, can be run once or repeatedly, and return structured results.
- Classes:
_HelloTest: Internal test class that implements a basic greeting test.
_PingTest: Internal test class that implements a ping-pong response test.
- Functions:
hello: Performs a hello test on an LLM model.
ping: Performs a ping-pong test on an LLM model.
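The run-once-or-many pattern described above can be sketched without the library. This is a minimal illustration, not the BinaryTest implementation: `SingleResult`, `MultiResult`, `run_binary_test`, and `echo_model` are hypothetical names, the "model" is a plain function, and treating any non-empty reply as a pass is an assumption made only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class SingleResult:
    """Hypothetical stand-in for one pass/fail outcome."""
    passed: bool
    content: str


@dataclass
class MultiResult:
    """Hypothetical stand-in for an aggregate over several runs."""
    results: List[SingleResult]

    @property
    def passed_count(self) -> int:
        return sum(1 for r in self.results if r.passed)

    @property
    def passed_ratio(self) -> float:
        return self.passed_count / len(self.results)


def run_binary_test(
    ask: Callable[[str], str],
    prompt: str,
    check: Callable[[str], bool],
    n: int = 1,
) -> Union[SingleResult, MultiResult]:
    """Send `prompt` n times and judge each reply with `check`."""
    if n < 1:
        # Sketch-only guard; the real library may handle this differently.
        raise ValueError("n must be at least 1")
    results = []
    for _ in range(n):
        reply = ask(prompt)
        results.append(SingleResult(passed=check(reply), content=reply))
    # Mirror the documented return shape: one result for n=1, an aggregate otherwise.
    return results[0] if n == 1 else MultiResult(results)


def echo_model(prompt: str) -> str:
    """Stand-in "model": a plain function instead of a real LLM client."""
    return "Hello! How can I help?"


def non_empty(reply: str) -> bool:
    """Assumed pass criterion for this sketch: any non-blank reply."""
    return bool(reply.strip())


single = run_binary_test(echo_model, "hello!", non_empty)
many = run_binary_test(echo_model, "hello!", non_empty, n=4)
print(single.passed, many.passed_count, many.passed_ratio)
```

Keeping the pass criterion as a separate callable means the same loop can serve both a greeting test and a ping-pong test, with only the prompt and the check changing.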
hello
- hbllmutils.testing.alive.hello(model: str | LLMModel, n: int = 1) → MultiBinaryTestResult | BinaryTestResult
Perform a hello test on an LLM model.
This function tests whether the given LLM model can respond to a basic greeting (“hello!”). It can run the test once or multiple times to gather statistical results.
- Parameters:
model (LLMModelTyping) – The LLM model to test, given as either a str or an LLMModel instance.
n (int) – The number of times to run the test. Defaults to 1.
- Returns:
If n=1, returns a single BinaryTestResult. If n>1, returns a MultiBinaryTestResult containing all test results and statistics.
- Return type:
MultiBinaryTestResult | BinaryTestResult
- Example:
>>> # Single test
>>> result = hello(my_model)
>>> print(result.passed)
True

>>> # Multiple tests
>>> results = hello(my_model, n=10)
>>> print(results.passed_count)
10
>>> print(results.passed_ratio)
1.0
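Because the return type depends on n, calling code may want one helper that accepts either shape. The sketch below is an assumption-laden illustration: `is_alive` and `min_ratio` are hypothetical names, and the helper relies only on the `passed` and `passed_ratio` attributes shown in the example above, using `SimpleNamespace` stand-ins rather than real result objects.

```python
from types import SimpleNamespace


def is_alive(result, min_ratio: float = 1.0) -> bool:
    """Return True if a single or aggregate result counts as healthy."""
    # Aggregate results expose `passed_ratio`; single results expose `passed`.
    if hasattr(result, "passed_ratio"):
        return result.passed_ratio >= min_ratio
    return bool(result.passed)


# Stand-ins shaped like the two documented result types.
single = SimpleNamespace(passed=True)
multi = SimpleNamespace(passed_count=8, passed_ratio=0.8)

print(is_alive(single))                 # True
print(is_alive(multi))                  # False: 0.8 is below the default 1.0
print(is_alive(multi, min_ratio=0.75))  # True
```

A threshold below 1.0 can be useful when occasional failures are tolerable, since repeated runs against a live model need not all succeed.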
ping
- hbllmutils.testing.alive.ping(model: str | LLMModel, n: int = 1) → MultiBinaryTestResult | BinaryTestResult
Perform a ping-pong test on an LLM model.
This function tests whether the given LLM model can respond appropriately to a “ping!” message by including “pong” in its response. It can run the test once or multiple times to gather statistical results.
- Parameters:
model (LLMModelTyping) – The LLM model to test, given as either a str or an LLMModel instance.
n (int) – The number of times to run the test. Defaults to 1.
- Returns:
If n=1, returns a single BinaryTestResult. If n>1, returns a MultiBinaryTestResult containing all test results and statistics.
- Return type:
MultiBinaryTestResult | BinaryTestResult
- Example:
>>> # Single test
>>> result = ping(my_model)
>>> print(result.passed)
True
>>> print(result.content)
Pong!

>>> # Multiple tests
>>> results = ping(my_model, n=5)
>>> print(results.passed_count)
5
>>> print(results.passed_ratio)
1.0
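The pass criterion described for ping (the reply includes "pong") might look like the check below. This is a sketch, not the code in _PingTest: `contains_pong` is a hypothetical name, and case-insensitive matching is an assumption suggested by the example above, where a reply of Pong! passes.

```python
def contains_pong(reply: str) -> bool:
    """Return True when the reply mentions "pong", ignoring case."""
    return "pong" in reply.lower()


# Hypothetical replies, including two that should fail the check.
replies = ["Pong!", "pong", "PONG, of course.", "Hello there!", ""]
outcomes = [contains_pong(r) for r in replies]

# The same aggregates that the multi-run result reports.
passed_count = sum(outcomes)
passed_ratio = passed_count / len(outcomes)
print(outcomes, passed_count, passed_ratio)
```

A substring check like this accepts replies that wrap the keyword in extra text, which suits chat models that rarely answer with a single bare word.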