hbllmutils.model.base

Abstract base interfaces for Large Language Model (LLM) implementations.

This module defines the LLMModel abstract base class, which serves as the contract for LLM backends in the hbllmutils package. Implementations are expected to provide synchronous and streaming query methods while supporting optional reasoning output.

The module contains the following main components:

  • LLMModel - Abstract interface for LLM model implementations

Example:

>>> from hbllmutils.model.base import LLMModel
>>> class MyLLM(LLMModel):
...     @property
...     def _logger_name(self) -> str:
...         return "my-llm"
...
...     def ask(self, messages, with_reasoning=False, **params):
...         return "Hello"
...
...     def ask_stream(self, messages, with_reasoning=False, **params):
...         raise NotImplementedError
...
...     def _params(self):
...         return ("my-llm",)
...
>>> model = MyLLM()
>>> model.ask([{"role": "user", "content": "Hi"}])
'Hello'

LLMModel

class hbllmutils.model.base.LLMModel

Abstract base class for Large Language Model implementations.

This class defines the interface that all LLM implementations must follow. It declares two main methods: ask(), which returns a complete response, and ask_stream(), which delivers the response incrementally. Subclasses must implement both methods to provide concrete LLM functionality.

The class supports both synchronous and streaming responses, as well as optional reasoning output for models that support chain-of-thought or similar capabilities.

Subclasses should also implement _params() to provide a stable, hashable representation of their configuration, enabling reliable equality checks and usage as dictionary keys.

Example:

>>> class EchoModel(LLMModel):
...     @property
...     def _logger_name(self) -> str:
...         return "echo"
...
...     def ask(self, messages, with_reasoning=False, **params):
...         return messages[-1]["content"]
...
...     def ask_stream(self, messages, with_reasoning=False, **params):
...         raise NotImplementedError
...
...     def _params(self):
...         return ("echo",)
...
>>> model = EchoModel()
>>> model.ask([{"role": "user", "content": "Hello"}])
'Hello'

__eq__(other: object) → bool

Check equality between this model and another object.

Two LLMModel instances are considered equal if they are of the same class and have the same parameters as returned by _values().

Parameters:

other (object) – The object to compare with.

Returns:

True if the objects are equal, False otherwise.

Return type:

bool

__hash__() → int

Get the hash value of this model instance.

The hash is computed from the values returned by _values(), which include the model’s class and parameters. This allows LLMModel instances to be used as dictionary keys or in sets.

Returns:

The hash value of this model instance.

Return type:

int
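The equality and hashing contract described above can be sketched without the library. This is a minimal, self-contained illustration, assuming that _values() returns the class together with the tuple from _params(). The SketchModel and OtherModel classes and that exact return shape are assumptions made for the example, not the actual hbllmutils implementation.

```python
class SketchModel:
    """Hypothetical stand-in for an LLMModel subclass (not the real class)."""

    def __init__(self, name, temperature=0.0):
        self.name = name
        self.temperature = temperature

    def _params(self):
        # Stable, hashable snapshot of the configuration.
        return (self.name, self.temperature)

    def _values(self):
        # Assumed shape: the class plus the parameters.
        return (type(self), self._params())

    def __eq__(self, other):
        if not isinstance(other, SketchModel):
            return NotImplemented
        return self._values() == other._values()

    def __hash__(self):
        return hash(self._values())


class OtherModel(SketchModel):
    """Same parameters as SketchModel, but a different class."""


a = SketchModel("demo", 0.2)
b = SketchModel("demo", 0.2)
c = OtherModel("demo", 0.2)

# Equal configurations hash alike, so b finds the entry stored under a.
cache = {a: "cached answer"}
same_config_equal = a == b
cache_hit = cache.get(b)

# Identical parameters on a different class do not compare equal.
different_class_equal = a == c
```

Because the parameters must be hashable, mutable settings such as lists or dicts would need to be converted to tuples before being returned from _params().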

ask(messages: List[dict], with_reasoning: bool = False, **params) → str | Tuple[str | None, str]

Ask a question to the language model and get a response.

This method sends the conversation to the model synchronously and returns the complete response. It can optionally return reasoning information along with the answer, which is useful for models that support explicit reasoning steps.

Parameters:
  • messages (List[dict]) – A list of message dictionaries containing the conversation history. Each dictionary typically contains 'role' and 'content' keys. Example: [{"role": "user", "content": "What is 2+2?"}]

  • with_reasoning (bool) – If True, return both reasoning and answer as a tuple. If False, return only the answer string. Default is False.

  • params (dict) – Additional parameters to pass to the model implementation. These may include temperature, max_tokens, top_p, etc., depending on the specific model implementation.

Returns:

If with_reasoning is False, returns the answer as a string. If with_reasoning is True, returns a tuple of (reasoning, answer), where reasoning can be None if not available or not supported by the model.

Return type:

Union[str, Tuple[Optional[str], str]]

Raises:

NotImplementedError – This method must be implemented by subclasses.
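Since the return shape depends on with_reasoning, calling code may want to normalize both forms into one. The following is a minimal sketch of that idea; the split_answer helper and the AskResult alias are hypothetical names introduced for illustration and are not part of hbllmutils.

```python
from typing import Optional, Tuple, Union

# Hypothetical alias mirroring the documented return type of ask().
AskResult = Union[str, Tuple[Optional[str], str]]


def split_answer(result: AskResult) -> Tuple[Optional[str], str]:
    """Normalize either return shape of ask() to (reasoning, answer)."""
    if isinstance(result, tuple):
        reasoning, answer = result
        return reasoning, answer
    # Plain string: no reasoning was requested.
    return None, result


# Values shaped like the documented return forms.
plain = split_answer("4")
with_reasoning = split_answer(("Adding 2 and 2", "4"))
no_reasoning_available = split_answer((None, "4"))
```

With such a helper, downstream code can always unpack two values and treat a None reasoning as "not requested or not supported".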

Example:

>>> model = SomeLLMModel()
>>> messages = [{"role": "user", "content": "What is 2+2?"}]
>>> model.ask(messages)
'4'
>>> model.ask(messages, with_reasoning=True)
('Adding 2 and 2', '4')

ask_stream(messages: List[dict], with_reasoning: bool = False, **params) → ResponseStream

Ask a question to the language model and get a streaming response.

This method allows for real-time streaming of the model’s response, which is useful for long responses or interactive applications where immediate feedback is desired. The response is delivered incrementally as it is generated by the model.

Parameters:
  • messages (List[dict]) – A list of message dictionaries containing the conversation history. Each dictionary typically contains 'role' and 'content' keys. Example: [{"role": "user", "content": "Tell me a story"}]

  • with_reasoning (bool) – If True, the stream should include reasoning information. If False, only the answer is streamed. Default is False.

  • params (dict) – Additional parameters to pass to the model implementation. These may include temperature, max_tokens, top_p, etc., depending on the specific model implementation.

Returns:

A ResponseStream object that can be iterated to receive response chunks. The stream yields text chunks as they become available from the model.

Return type:

ResponseStream

Raises:

NotImplementedError – This method must be implemented by subclasses.

Example:

>>> model = SomeLLMModel()
>>> messages = [{"role": "user", "content": "Tell me a story"}]
>>> stream = model.ask_stream(messages)
>>> for chunk in stream:
...     print(chunk, end='')
# Prints the story as it's generated, chunk by chunk
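
A common pattern is to display chunks as they arrive while also keeping the full text. The sketch below assumes only what is stated above, namely that the stream can be iterated to yield text chunks; fake_stream is a hypothetical stand-in for a ResponseStream, and collect is an illustrative helper, neither of which is part of hbllmutils.

```python
from typing import Iterable, Iterator


def fake_stream() -> Iterator[str]:
    """Hypothetical stand-in for a ResponseStream: yields text chunks."""
    yield from ["Once ", "upon ", "a ", "time."]


def collect(stream: Iterable[str]) -> str:
    """Print chunks as they arrive and return the full text."""
    parts = []
    for chunk in stream:
        # flush=True makes each chunk visible immediately.
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()
    return "".join(parts)


full_text = collect(fake_stream())
```

Collecting chunks in a list and joining once at the end avoids repeated string concatenation on long responses.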