hbllmutils.model.remote
This module provides a remote LLM (Large Language Model) client implementation.
It offers a unified interface for interacting with OpenAI-compatible API endpoints, supporting both synchronous and asynchronous operations, streaming responses, and customizable parameters.
- Classes:
RemoteLLMModel: Main class for managing remote LLM API interactions.
RemoteLLMModel
- class hbllmutils.model.remote.RemoteLLMModel(base_url: str, api_token: str, model_name: str, organization_id: str | None = None, timeout: int = 30, max_retries: int = 3, headers: Dict[str, str] | None = None, **default_params)
A client for interacting with remote Large Language Model APIs.
This class provides a unified interface for communicating with OpenAI-compatible API endpoints. It supports both synchronous and asynchronous operations, streaming responses, and allows customization of request parameters.
- Variables:
base_url (str) – API base URL (e.g., “https://api.openai.com/v1”)
api_token (str) – API access token for authentication
model_name (str) – Name of the model to use (e.g., “gpt-3.5-turbo”, “claude-3-opus”)
organization_id (Optional[str]) – Organization ID (required by some APIs)
timeout (int) – Request timeout in seconds
max_retries (int) – Maximum number of retry attempts
headers (Dict[str, str]) – Custom request headers
default_params (Dict[str, Any]) – Default parameters for API requests
- __init__(base_url: str, api_token: str, model_name: str, organization_id: str | None = None, timeout: int = 30, max_retries: int = 3, headers: Dict[str, str] | None = None, **default_params)
Initialize the RemoteLLMModel instance.
- Parameters:
base_url (str) – API base URL (e.g., “https://api.openai.com/v1”)
api_token (str) – API access token for authentication
model_name (str) – Name of the model to use (e.g., “gpt-3.5-turbo”)
organization_id (Optional[str]) – Organization ID (optional, required by some APIs)
timeout (int) – Request timeout in seconds (default: 30)
max_retries (int) – Maximum number of retry attempts (default: 3)
headers (Optional[Dict[str, str]]) – Custom request headers (optional)
default_params – Default parameters for API requests (optional)
- Raises:
ValueError – If base_url format is invalid
ValueError – If api_token is empty
ValueError – If model_name is empty
ValueError – If timeout is not positive
ValueError – If max_retries is negative
- Example::
>>> model = RemoteLLMModel(
...     base_url="https://api.openai.com/v1",
...     api_token="sk-xxx",
...     model_name="gpt-3.5-turbo"
... )
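The ValueError conditions listed under Raises can be pictured with a small stand-alone check. This is only a sketch: `validate_config` is a hypothetical helper, not part of hbllmutils, and the rule used for an "invalid" base_url (missing http(s) scheme or host) is an assumption, since the documentation does not spell it out.

```python
from urllib.parse import urlparse


def validate_config(base_url: str, api_token: str, model_name: str,
                    timeout: int = 30, max_retries: int = 3) -> None:
    """Hypothetical helper mirroring the documented ValueError conditions."""
    parsed = urlparse(base_url)
    # Assumption: "invalid format" means no http(s) scheme or no host part.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"invalid base_url: {base_url!r}")
    if not api_token:
        raise ValueError("api_token must not be empty")
    if not model_name:
        raise ValueError("model_name must not be empty")
    if timeout <= 0:
        raise ValueError("timeout must be positive")
    if max_retries < 0:
        raise ValueError("max_retries must not be negative")


def is_valid(**kwargs) -> bool:
    """Return True when the arguments pass every check, False otherwise."""
    try:
        validate_config(**kwargs)
    except ValueError:
        return False
    return True
```

Running a check like this before constructing the client can surface configuration mistakes early, for example when the values come from environment variables.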
- __repr__() → str
Return a string representation of the RemoteLLMModel instance.
All constructor parameters including default_params are displayed at the same level. The API token is masked for security purposes.
- Returns:
String representation of the instance
- Return type:
str
- Example::
>>> model = RemoteLLMModel(
...     base_url="https://api.openai.com/v1",
...     api_token="sk-xxx",
...     model_name="gpt-3.5-turbo",
...     max_tokens=1000
... )
>>> repr(model)
'RemoteLLMModel(base_url=..., api_token=..., max_tokens=1000, ...)'
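The documentation states that the token is masked but not how. One possible scheme, shown purely as an illustration, keeps a short prefix and replaces the rest with asterisks. `mask_token` and its `visible` parameter are hypothetical names; the library's actual masking may differ.

```python
def mask_token(token: str, visible: int = 3) -> str:
    """Hypothetical masking helper: keep a short prefix, hide the rest."""
    if len(token) <= visible:
        # Very short tokens are hidden entirely.
        return "*" * len(token)
    return token[:visible] + "*" * (len(token) - visible)


print(mask_token("sk-abcdef"))  # prints sk-******
```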
- ask(messages: List[dict], with_reasoning: bool = False, **params) → str | Tuple[str | None, str]
Send a chat request and get the text response.
- Parameters:
messages (List[dict]) – List of message dictionaries for the conversation
with_reasoning (bool) – Whether to return reasoning content along with the response (default: False)
params (Any) – Additional parameters to pass to the API
- Returns:
If with_reasoning is False, returns the content string. If with_reasoning is True, returns a tuple of (reasoning_content, content).
- Return type:
Union[str, Tuple[Optional[str], str]]
- Example::
>>> model = RemoteLLMModel(base_url="...", api_token="...", model_name="...")
>>> messages = [{"role": "user", "content": "Explain quantum computing"}]
>>> # Get only the response content
>>> response = model.ask(messages)
>>> print(response)
>>> # Get both reasoning and response content
>>> reasoning, response = model.ask(messages, with_reasoning=True)
>>> print(f"Reasoning: {reasoning}")
>>> print(f"Response: {response}")
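Because the return type depends on with_reasoning, calling code that handles both modes may find it convenient to normalize the result into one shape. The sketch below does that with a hypothetical helper, `normalize_ask_result`, which is not part of the library; it relies only on the documented return type.

```python
from typing import Optional, Tuple, Union

# The documented return type of ask().
AskResult = Union[str, Tuple[Optional[str], str]]


def normalize_ask_result(result: AskResult) -> Tuple[Optional[str], str]:
    """Hypothetical helper: always return (reasoning_content, content)."""
    if isinstance(result, tuple):
        reasoning, content = result
        return reasoning, content
    # Plain string: there is no reasoning content to report.
    return None, result
```

With this in place, downstream code can unpack `reasoning, content` regardless of how ask was called, treating a reasoning value of None as "not available".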
- ask_stream(messages: List[dict], with_reasoning: bool = False, **params) → ResponseStream
Send a chat request and get a streaming response.
- Parameters:
messages (List[dict]) – List of message dictionaries for the conversation
with_reasoning (bool) – Whether to include reasoning content in the stream (default: False)
params (Any) – Additional parameters to pass to the API
- Returns:
A ResponseStream object for iterating over the streaming response
- Return type:
ResponseStream
- Example::
>>> model = RemoteLLMModel(base_url="...", api_token="...", model_name="...")
>>> messages = [{"role": "user", "content": "Write a story"}]
>>> stream = model.ask_stream(messages)
>>> for chunk in stream:
...     print(chunk, end='', flush=True)
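A common need when streaming is to display chunks as they arrive while also keeping the full text. The sketch below assumes, as the example above suggests, that iterating the stream yields text chunks. `collect_stream` is a hypothetical helper, and `fake_stream` is a stand-in generator used so the snippet runs without network access.

```python
from typing import Iterable, Iterator


def fake_stream() -> Iterator[str]:
    """Stand-in for a streaming response; yields text chunks."""
    yield from ["Once ", "upon ", "a time"]


def collect_stream(chunks: Iterable[str]) -> str:
    """Hypothetical helper: echo each chunk and return the joined text."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)  # show output incrementally
        parts.append(chunk)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


full_text = collect_stream(fake_stream())
```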
- create_message(messages: List[dict], **params) → ChatCompletionMessage
Send a chat request and get the complete message response.
- Parameters:
messages (List[dict]) – List of message dictionaries for the conversation
params (Any) – Additional parameters to pass to the API
- Returns:
The message object from the first choice in the response
- Return type:
ChatCompletionMessage
- Example::
>>> model = RemoteLLMModel(base_url="...", api_token="...", model_name="...")
>>> messages = [{"role": "user", "content": "What is AI?"}]
>>> response = model.create_message(messages)
>>> print(response.content)
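For multi-turn conversations, the returned message can be appended to the history before the next request. The following is a minimal sketch: `append_reply` is a hypothetical helper, and a `SimpleNamespace` stands in for the returned message object, on the assumption that it exposes `role` and `content` attributes.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def append_reply(messages: List[Dict[str, Any]], reply: Any) -> List[Dict[str, Any]]:
    """Hypothetical helper: return a new history that includes the reply."""
    # Build a new list so the caller's original history is left untouched.
    return messages + [{"role": reply.role, "content": reply.content}]


history = [{"role": "user", "content": "What is AI?"}]
# Stand-in for the object returned by create_message (assumed attributes).
reply = SimpleNamespace(role="assistant", content="AI is the study of ...")
history = append_reply(history, reply)
history.append({"role": "user", "content": "Give one example."})
```

The resulting `history` list has the same shape as the `messages` argument shown above, so it could be passed to the next call.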