hbllmutils.model.task

LLM task management utilities.

This module defines a high-level wrapper for executing Large Language Model (LLM) tasks using a model implementation and an optional conversation history. It provides convenience methods to send prompts, retrieve responses, and stream content while keeping conversation context available for multi-turn interaction patterns.

The module contains the following public components:

  • LLMTask - Abstract base class for LLM task management and execution

Key features provided by LLMTask include:

  • Standard question-and-answer interactions with optional reasoning output

  • Streaming responses for real-time consumption

  • Conversation history management via LLMHistory

  • Flexible model initialization via load_llm_model()

Note

This module does not mutate history automatically. Returned content should be appended to history by callers that want to persist conversation state.

Warning

Because history is retained across the lifetime of a task, long-running sessions may grow memory usage. Consider truncating or resetting history periodically.

Example:

>>> from hbllmutils.model.task import LLMTask
>>> from hbllmutils.history import LLMHistory
>>> from hbllmutils.model.load import load_llm_model
>>>
>>> model = load_llm_model('gpt-4')
>>> history = LLMHistory().with_system_prompt('You are a helpful assistant.')
>>> task = LLMTask(model, history)
>>>
>>> # Ask a standard question
>>> answer = task.ask("What is the capital of France?")
>>> print(answer)
The capital of France is Paris.
>>>
>>> # Stream a response
>>> stream = task.ask_stream("Tell me a short story")
>>> for chunk in stream:
...     print(chunk, end='', flush=True)
Once upon a time...

LLMTask

class hbllmutils.model.task.LLMTask(model: str | LLMModel, history: LLMHistory | None = None)[source]

Abstract base class for managing LLM task execution and conversation history.

This class provides a high-level interface for interacting with language models, handling both standard and streaming responses while automatically managing conversation context. It wraps an LLM model and maintains a conversation history, simplifying the process of multi-turn interactions.

The class supports:

  • Standard question-answer interactions with automatic history updates

  • Streaming responses for real-time output processing

  • Optional reasoning output for models supporting chain-of-thought

  • Flexible model initialization from various input types

  • Conversation history persistence and management

Parameters:
  • model (LLMModelTyping) – The LLM model to use. Can be a model name string, an LLMModel instance, or None to load the default model from configuration.

  • history (Optional[LLMHistory]) – Optional conversation history. If not provided, a new empty history is created. The history maintains the context of the conversation.

Variables:
  • model (LLMModel) – The initialized LLM model instance used for generating responses.

  • history (LLMHistory) – The conversation history tracking all messages in the task.

Note

This is an abstract base class. While it can be instantiated directly, subclasses may provide additional specialized functionality.

Example:

>>> # Initialize with model name
>>> task = LLMTask('gpt-4')
>>>
>>> # Initialize with existing model and history
>>> model = load_llm_model('gpt-4')
>>> history = LLMHistory().with_system_prompt('You are helpful.')
>>> task = LLMTask(model, history)
>>>
>>> # Basic usage
>>> response = task.ask("Hello!")
>>> print(response)
Hello! How can I help you today?
__eq__(other: object) bool[source]

Check equality between this LLMTask and another object.

Two LLMTask instances are considered equal if they have the same class type and the same model and history parameters. This allows for proper comparison of task instances in collections and conditional logic.

The comparison is based on the values returned by _values(), which includes both the class type and the internal parameters (model and history).

Parameters:

other (object) – The object to compare with.

Returns:

True if the objects are equal (same class and same parameters), False otherwise.

Return type:

bool

Example:

>>> model = load_llm_model('gpt-4')
>>> history = LLMHistory()
>>> task1 = LLMTask(model, history)
>>> task2 = LLMTask(model, history)
>>> task1 == task2
True
>>>
>>> task3 = LLMTask(model, history.with_user_message("Hello"))
>>> task1 == task3
False
__hash__() int[source]

Get the hash value of this LLMTask instance.

The hash is computed based on the class type and the model and history parameters. This allows LLMTask instances to be used as dictionary keys or in sets, provided the underlying model and history are also hashable.

The hash is derived from the values returned by _values(), ensuring consistency with the equality comparison implemented in __eq__.

Returns:

The hash value of this task instance.

Return type:

int

Raises:

TypeError – If the underlying model or history is not hashable.

Example:

>>> model = load_llm_model('gpt-4')
>>> history = LLMHistory()
>>> task = LLMTask(model, history)
>>> hash_value = hash(task)
>>> isinstance(hash_value, int)
True
>>>
>>> # Can be used in sets and as dict keys
>>> task_set = {task}
>>> task_dict = {task: "some_value"}
__init__(model: str | LLMModel, history: LLMHistory | None = None)[source]

Initialize the LLMTask with a model and optional conversation history.

The model parameter is flexible and can accept various input types:

  • A string representing the model name (loaded from configuration)

  • An existing LLMModel instance

  • None to load the default model from configuration

If no history is provided, a new empty LLMHistory instance is created.

Parameters:
  • model (LLMModelTyping) – The LLM model specification. Can be a model name string, an LLMModel instance, or None for the default model.

  • history (Optional[LLMHistory]) – Optional conversation history. If None, creates a new empty history to track the conversation.

Raises:
  • TypeError – If model is not a valid type (string, LLMModel, or None).

  • ValueError – If model name is invalid or not found in configuration.

Example:

>>> # With model name
>>> task = LLMTask('gpt-4')
>>>
>>> # With existing model
>>> model = load_llm_model('gpt-4')
>>> task = LLMTask(model)
>>>
>>> # With model and history
>>> history = LLMHistory().with_system_prompt('Be concise.')
>>> task = LLMTask('gpt-4', history)
ask(input_content: str | None = None, with_reasoning: bool = False, **params: Any) str | Tuple[str | None, str][source]

Ask a question to the LLM model and receive a response.

This method sends the current conversation history (optionally with new user input) to the model and retrieves a response. The conversation history is used as context but is not automatically updated - use the returned response to update history manually if needed.

The method supports two response formats:

  • Standard mode (with_reasoning=False): Returns only the response text

  • Reasoning mode (with_reasoning=True): Returns a tuple of (reasoning, response)

Parameters:
  • input_content (Optional[str]) – Optional user input to add to the history before asking. If None, uses the existing history without modification. The original history is not modified; a temporary copy is used.

  • with_reasoning (bool) – If True, returns both reasoning and response as a tuple. If False, returns only the response string. Defaults to False.

  • params (dict) – Additional parameters to pass to the model’s ask method. May include temperature, max_tokens, top_p, etc., depending on the specific model implementation.

Returns:

If with_reasoning is False, returns the response string. If with_reasoning is True, returns a tuple of (reasoning, response) where reasoning may be None if not supported by the model.

Return type:

Union[str, Tuple[Optional[str], str]]

Note

This method does not modify the task’s history. If you want to maintain the conversation context, you need to manually update the history with the input and response.

Example:

>>> task = LLMTask('gpt-4')
>>>
>>> # Simple question
>>> response = task.ask("What is 2+2?")
>>> print(response)
4
>>>
>>> # With reasoning
>>> reasoning, response = task.ask(
...     "Explain quantum entanglement",
...     with_reasoning=True
... )
>>> print(f"Reasoning: {reasoning}")
>>> print(f"Response: {response}")
>>>
>>> # With additional parameters
>>> response = task.ask(
...     "Write a poem",
...     temperature=0.9,
...     max_tokens=100
... )
ask_stream(input_content: str | None = None, with_reasoning: bool = False, **params: Any) ResponseStream[source]

Ask a question to the LLM model and receive a streaming response.

This method sends the current conversation history (optionally with new user input) to the model and retrieves a streaming response. This is useful for long responses or interactive applications where immediate feedback is desired. The response is delivered incrementally as it’s generated by the model.

The stream can optionally include reasoning information when with_reasoning=True, which will be separated from the regular content using configurable splitters.

Parameters:
  • input_content (Optional[str]) – Optional user input to add to the history before asking. If None, uses the existing history without modification. The original history is not modified; a temporary copy is used.

  • with_reasoning (bool) – If True, the stream includes reasoning information separated from the regular content. If False, only the response content is streamed. Defaults to False.

  • params (dict) – Additional parameters to pass to the model’s ask_stream method. May include temperature, max_tokens, top_p, etc., depending on the specific model implementation.

Returns:

A ResponseStream object that can be iterated to receive response chunks in real-time. The stream yields text chunks as they become available.

Return type:

ResponseStream

Note

This method does not modify the task’s history. The stream must be fully consumed before the response content is available via stream properties.

Warning

The ResponseStream can only be iterated once. After iteration completes, attempting to iterate again will raise a RuntimeError.

Example:

>>> task = LLMTask('gpt-4')
>>>
>>> # Basic streaming
>>> stream = task.ask_stream("Tell me a story")
>>> for chunk in stream:
...     print(chunk, end='', flush=True)
Once upon a time, there was...
>>>
>>> # With reasoning
>>> stream = task.ask_stream(
...     "Solve this problem",
...     with_reasoning=True
... )
>>> for chunk in stream:
...     print(chunk, end='', flush=True)
>>>
>>> # Access full content after streaming
>>> print(stream.reasoning_content)
>>> print(stream.content)
>>>
>>> # With additional parameters
>>> stream = task.ask_stream(
...     "Write a poem",
...     temperature=0.9,
...     max_tokens=200
... )