hbllmutils.model.base

This module defines the abstract base class for Large Language Model (LLM) implementations.

The module provides a common interface that all LLM implementations should follow, including methods for chat interactions, question-answering, and streaming responses. It serves as a contract that ensures consistent behavior across different LLM implementations.

LLMModel

class hbllmutils.model.base.LLMModel

Abstract base class for Large Language Model implementations.

This class defines the interface that all LLM implementations must follow. It declares two methods, ask and ask_stream, for different interaction patterns with language models. Subclasses must implement both to provide concrete LLM functionality.

The class supports both complete (non-streaming) and streaming responses, as well as optional reasoning output for models that support chain-of-thought or similar capabilities.
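
As an illustration of the contract described above, here is a minimal sketch of a subclass. The LLMModel class in the snippet is a local stand-in so the code runs without the library installed; real code would subclass hbllmutils.model.base.LLMModel instead. EchoModel, its echoing behavior, and the use of a plain generator in place of ResponseStream are all assumptions made for the example.

```python
from typing import Iterator, List


class LLMModel:
    """Local stand-in for hbllmutils.model.base.LLMModel (assumption).

    As documented, both methods raise NotImplementedError until overridden.
    """

    def ask(self, messages: List[dict], with_reasoning: bool = False, **params):
        raise NotImplementedError

    def ask_stream(self, messages: List[dict], with_reasoning: bool = False, **params):
        raise NotImplementedError


class EchoModel(LLMModel):
    """Hypothetical subclass that echoes the last message back."""

    def ask(self, messages, with_reasoning=False, **params):
        answer = messages[-1]["content"]
        if with_reasoning:
            # This toy model has no reasoning to report, so it returns None.
            return None, answer
        return answer

    def ask_stream(self, messages, with_reasoning=False, **params) -> Iterator[str]:
        # A plain generator stands in for ResponseStream here (assumption).
        text = self.ask(messages)
        return (text[i:i + 4] for i in range(0, len(text), 4))
```

Implementing both methods in the subclass is what turns the abstract interface into something callable; calling either method on the base class itself raises NotImplementedError.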

__eq__(other)

Check equality between this model and another object.

Two LLMModel instances are considered equal if they are of the same class and have the same parameters as returned by _values().

Parameters:

other (object) – The object to compare with.

Returns:

True if the objects are equal, False otherwise.

Return type:

bool

__hash__()

Get the hash value of this model instance.

The hash is computed from the values returned by _values(), which include the model’s class and parameters. This allows LLMModel instances to be used as dictionary keys or in sets.

Returns:

The hash value of this model instance.

Return type:

int
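
The equality and hashing rules above can be sketched as follows. This is one plausible way to implement them, not the library's actual source: the exact shape of the _values() return value is an assumption (here a tuple of the class plus its parameters), and RemoteModel with its name and temperature parameters is hypothetical.

```python
class LLMModel:
    """Local stand-in for the real base class (assumption)."""

    def _values(self):
        # Assumed to return a hashable tuple: the class plus its parameters.
        raise NotImplementedError

    def __eq__(self, other):
        # Same class and same _values() means equal.
        if type(self) is not type(other):
            return False
        return self._values() == other._values()

    def __hash__(self):
        # Hash derives from the same values used for equality,
        # so equal models always hash alike.
        return hash(self._values())


class RemoteModel(LLMModel):
    """Hypothetical model identified by a name and a temperature."""

    def __init__(self, name: str, temperature: float = 0.0):
        self.name = name
        self.temperature = temperature

    def _values(self):
        return (type(self), self.name, self.temperature)


# Two separately constructed but identical models share one cache slot.
cache = {RemoteModel("m1"): "cached client"}
hit = cache.get(RemoteModel("m1"))
miss = cache.get(RemoteModel("m2"))
```

Deriving both __eq__ and __hash__ from the same values keeps them consistent, which Python requires for objects used as dictionary keys or set members.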

ask(messages: List[dict], with_reasoning: bool = False, **params) → str | Tuple[str | None, str]

Ask the language model a question and get a response.

This method sends the conversation to the model and returns the complete answer. It can optionally return reasoning information along with the answer, which is useful for models that expose explicit reasoning steps.

Parameters:
  • messages (List[dict]) – A list of message dictionaries containing the conversation history. Each dictionary typically contains 'role' and 'content' keys. Example: [{"role": "user", "content": "What is 2+2?"}]

  • with_reasoning (bool) – If True, return both reasoning and answer as a tuple. If False, return only the answer string. Default is False.

  • params (dict) – Additional parameters to pass to the model implementation. These may include temperature, max_tokens, top_p, etc., depending on the specific model implementation.

Returns:

If with_reasoning is False, returns the answer as a string. If with_reasoning is True, returns a tuple of (reasoning, answer), where reasoning can be None if not available or not supported by the model.

Return type:

Union[str, Tuple[Optional[str], str]]

Raises:

NotImplementedError – This method must be implemented by subclasses.

Example:

>>> model = SomeLLMModel()
>>> messages = [{"role": "user", "content": "What is 2+2?"}]
>>> model.ask(messages)
'4'
>>> model.ask(messages, with_reasoning=True)
('Adding 2 and 2', '4')
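
Because the return type of ask depends on with_reasoning, calling code that accepts either form may want to normalize it. The helper below is a hypothetical convenience, not part of the library; it relies only on the documented return shapes (a plain string, or a (reasoning, answer) tuple).

```python
from typing import Optional, Tuple, Union

# The two documented return shapes of ask().
AskResult = Union[str, Tuple[Optional[str], str]]


def split_result(result: AskResult) -> Tuple[Optional[str], str]:
    """Return (reasoning, answer) regardless of how ask() was called.

    Hypothetical helper: a bare string is treated as an answer
    with no reasoning attached.
    """
    if isinstance(result, tuple):
        reasoning, answer = result
        return reasoning, answer
    return None, result
```

With this in place, split_result(model.ask(messages)) and split_result(model.ask(messages, with_reasoning=True)) can be handled by the same downstream code.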

ask_stream(messages: List[dict], with_reasoning: bool = False, **params) → ResponseStream

Ask the language model a question and get a streaming response.

This method allows for real-time streaming of the model’s response, which is useful for long responses or interactive applications where immediate feedback is desired. The response is delivered incrementally as it’s generated by the model.

Parameters:
  • messages (List[dict]) – A list of message dictionaries containing the conversation history. Each dictionary typically contains 'role' and 'content' keys. Example: [{"role": "user", "content": "Tell me a story"}]

  • with_reasoning (bool) – If True, the stream should include reasoning information. If False, only the answer is streamed. Default is False.

  • params (dict) – Additional parameters to pass to the model implementation. These may include temperature, max_tokens, top_p, etc., depending on the specific model implementation.

Returns:

A ResponseStream object that can be iterated to receive response chunks. The stream yields text chunks as they become available from the model.

Return type:

ResponseStream

Raises:

NotImplementedError – This method must be implemented by subclasses.

Example:

>>> model = SomeLLMModel()
>>> messages = [{"role": "user", "content": "Tell me a story"}]
>>> stream = model.ask_stream(messages)
>>> for chunk in stream:
...     print(chunk, end='')
# Prints the story as it's generated, chunk by chunk
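
Interactive applications often need the full text as well as the incremental chunks. The sketch below shows one way to get both. It assumes only what is documented above, namely that the stream can be iterated and yields text chunks; collect_stream and its on_chunk callback are hypothetical names, and a plain iterator stands in for ResponseStream in the usage lines.

```python
from typing import Callable, Iterable, List, Optional


def collect_stream(
    stream: Iterable[str],
    on_chunk: Optional[Callable[[str], None]] = None,
) -> str:
    """Consume a stream of text chunks and return the joined text.

    Hypothetical helper: on_chunk, if given, is called with each chunk
    as it arrives (for example, to print it immediately).
    """
    parts: List[str] = []
    for chunk in stream:
        if on_chunk is not None:
            on_chunk(chunk)
        parts.append(chunk)
    return "".join(parts)


# Usage with a plain iterator standing in for model.ask_stream(messages).
seen: List[str] = []
full_text = collect_stream(iter(["Once ", "upon ", "a time"]), seen.append)
```

Joining the chunks once at the end avoids repeated string concatenation inside the loop.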