hbllmutils.testing.alive

Alive tests for verifying basic LLM responsiveness.

This module provides minimal pass/fail checks that verify whether an LLM responds to simple prompts. The public API offers two convenience functions, hello() and ping(), each backed by an internal BinaryTest implementation and able to run once or several times. A single run returns a BinaryTestResult; multiple runs return a MultiBinaryTestResult with aggregated statistics.

The module contains the following public components:

  • hello() - Run a basic greeting test (expects any non-empty response).

  • ping() - Run a ping-pong test (expects a response containing “pong”).

Note

These tests are intentionally lightweight and do not validate the semantic correctness of responses beyond the basic criteria described.
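The two pass criteria described above can be sketched in plain Python. This is a minimal illustration, not the module's actual implementation: the function names hello_passes and ping_passes are hypothetical, stripping whitespace before the non-empty check is an assumption, and case-insensitive matching is assumed only because the ping example shows 'Pong!' passing.

```python
def hello_passes(response: str) -> bool:
    # Assumed criterion: the response is non-empty after stripping whitespace.
    return bool(response and response.strip())


def ping_passes(response: str) -> bool:
    # Assumed criterion: case-insensitive substring match on "pong".
    return "pong" in response.lower()


print(hello_passes("Hi there!"))               # True
print(hello_passes("   "))                     # False
print(ping_passes("Pong!"))                    # True
print(ping_passes("Hello, how can I help?"))   # False
```

Neither check inspects what the response means, which is why a pass only indicates that the model is reachable and replying.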

Example:

>>> from hbllmutils.testing.alive import hello, ping
>>> result = hello(my_model)
>>> result.passed
True
>>> results = ping(my_model, n=3)
>>> results.passed_ratio
1.0

hello

hbllmutils.testing.alive.hello(model: str | LLMModel, n: int = 1) → MultiBinaryTestResult | BinaryTestResult

Perform a hello test on an LLM model.

This function tests whether the given LLM model can respond to a basic greeting ("hello!"). It can run the test once or multiple times to gather statistical results.

Parameters:
  • model (LLMModelTyping) – The LLM model to test.

  • n (int) – The number of times to run the test. Defaults to 1.

Returns:

If n == 1, returns a single BinaryTestResult. If n > 1, returns a MultiBinaryTestResult containing all test results and statistics.

Return type:

Union[MultiBinaryTestResult, BinaryTestResult]

Example:

>>> # Single test
>>> result = hello(my_model)
>>> print(result.passed)
True
>>> # Multiple tests
>>> results = hello(my_model, n=10)
>>> print(results.passed_count)
10
>>> print(results.passed_ratio)
1.0
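Because the return type depends on n, calling code may want to handle both shapes uniformly. The helper below is one possible sketch: pass_ratio is a hypothetical name, and the SimpleNamespace objects are stand-ins that mimic only the attributes shown in the examples above (passed, passed_count, passed_ratio). Whether the real result classes expose further attributes is not assumed here.

```python
from types import SimpleNamespace


def pass_ratio(result) -> float:
    """Return a pass ratio for either result shape (hypothetical helper)."""
    # Multi-run results expose passed_ratio, per the examples above.
    if hasattr(result, "passed_ratio"):
        return float(result.passed_ratio)
    # Single-run results expose a boolean passed flag.
    return 1.0 if result.passed else 0.0


# Stand-ins for BinaryTestResult and MultiBinaryTestResult (illustration only).
single = SimpleNamespace(passed=True)
multi = SimpleNamespace(passed_count=8, passed_ratio=0.8)

print(pass_ratio(single))  # 1.0
print(pass_ratio(multi))   # 0.8
```

With the real library, the same helper would presumably be called as pass_ratio(hello(my_model, n=10)).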

ping

hbllmutils.testing.alive.ping(model: str | LLMModel, n: int = 1) → MultiBinaryTestResult | BinaryTestResult

Perform a ping-pong test on an LLM model.

This function tests whether the given LLM model can respond appropriately to a "ping!" message by including "pong" in its response. It can run the test once or multiple times to gather statistical results.

Parameters:
  • model (LLMModelTyping) – The LLM model to test.

  • n (int) – The number of times to run the test. Defaults to 1.

Returns:

If n == 1, returns a single BinaryTestResult. If n > 1, returns a MultiBinaryTestResult containing all test results and statistics.

Return type:

Union[MultiBinaryTestResult, BinaryTestResult]

Example:

>>> # Single test
>>> result = ping(my_model)
>>> print(result.passed)
True
>>> print(result.content)
Pong!
>>> # Multiple tests
>>> results = ping(my_model, n=5)
>>> print(results.passed_count)
5
>>> print(results.passed_ratio)
1.0
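The repeated-run statistics in the example above can be approximated with a small self-contained simulation. Everything here is a stand-in: RunSummary is a hypothetical class rather than MultiBinaryTestResult, ask is an assumed callable that maps a prompt to a reply, and computing passed_ratio as passed_count divided by the number of runs is an assumption about how the aggregate is defined.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RunSummary:
    """Hypothetical stand-in for an aggregated multi-run result."""
    outcomes: List[bool]

    @property
    def passed_count(self) -> int:
        # Each True outcome counts as one passed run.
        return sum(self.outcomes)

    @property
    def passed_ratio(self) -> float:
        # Assumed definition: passed runs divided by total runs.
        return self.passed_count / len(self.outcomes) if self.outcomes else 0.0


def run_ping_checks(ask: Callable[[str], str], n: int) -> RunSummary:
    # Send "ping!" n times and record whether each reply contains "pong".
    return RunSummary(["pong" in ask("ping!").lower() for _ in range(n)])


# A scripted responder in place of a real model; one reply misses "pong".
replies = iter(["Pong!", "pong", "Sorry, what?", "PONG", "pong!"])
summary = run_ping_checks(lambda prompt: next(replies), n=5)

print(summary.passed_count)  # 4
print(summary.passed_ratio)  # 0.8
```

A ratio below 1.0, as here, would suggest intermittent failures rather than a model that is entirely unreachable.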