hbllmutils.entry.code.pydoc

Python Documentation Generation CLI Utilities.

This module provides command-line utilities and supporting functions for generating Python documentation with an LLM. Its Python API, generate_pydoc_for_file(), processes one file at a time; the CLI subcommand also accepts a directory for batch processing.

The module contains the following main components:

generate_pydoc_for_file() - Generate documentation for a single Python file

Note

This module requires a model to be configured, either through command-line parameters or through one of the environment variables OPENAI_MODEL_NAME, LLM_MODEL_NAME, or MODEL_NAME.
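
The lookup described in this note can be sketched as follows. This is a minimal illustration, not the module's actual code: resolve_model_name is a hypothetical helper, and both the precedence of the command-line value and the order in which the variables are checked are assumptions.

```python
import os
from typing import Mapping, Optional

# Environment variables named in the note; the lookup order is an assumption.
MODEL_ENV_VARS = ("OPENAI_MODEL_NAME", "LLM_MODEL_NAME", "MODEL_NAME")


def resolve_model_name(cli_value: Optional[str] = None,
                       env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Hypothetical helper: prefer the CLI value, then the first non-empty variable."""
    if cli_value:
        return cli_value
    env = os.environ if env is None else env
    for name in MODEL_ENV_VARS:
        value = env.get(name)
        if value:
            return value
    return None  # nothing configured
```

Passing env explicitly keeps the helper easy to test without touching the real process environment.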

Warning

Documentation generation may consume significant API tokens for large files or directories. Monitor your API usage when processing multiple files.

Example:

>>> # Command line usage
>>> # Generate docs for a single file
>>> # hbllmutils code pydoc -i mymodule.py -m gpt-4
>>>
>>> # Generate docs for a directory
>>> # hbllmutils code pydoc -i mypackage/ -m gpt-4 --timeout 300
>>>
>>> # With additional parameters
>>> # hbllmutils code pydoc -i myfile.py --param max_tokens=128000
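
The --param option takes key=value pairs, while extra_params expects values typed as str, int, or float. One plausible way to bridge the two is sketched below; parse_params is a hypothetical helper, and the rule of trying int, then float, then falling back to str is an assumption rather than the CLI's documented behavior.

```python
from typing import Dict, Iterable, Union

ParamValue = Union[str, int, float]


def parse_params(pairs: Iterable[str]) -> Dict[str, ParamValue]:
    """Hypothetical helper: turn 'key=value' strings into a typed dict."""
    result: Dict[str, ParamValue] = {}
    for pair in pairs:
        key, sep, raw = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        value: ParamValue = raw  # fall back to the raw string
        for cast in (int, float):
            try:
                value = cast(raw)
                break
            except ValueError:
                continue
        result[key] = value
    return result
```

Under these assumptions, parse_params(['max_tokens=128000']) yields a dictionary suitable for the extra_params argument.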

generate_pydoc_for_file

hbllmutils.entry.code.pydoc.generate_pydoc_for_file(file: str, model_name: str | None = None, timeout: int = 240, extra_params: Dict[str, str | int | float] | None = None, ignore_modules: Tuple[str, ...] | None = None, no_ignore_modules: Tuple[str, ...] | None = None, docstyle: Literal['sphinx', 'google', 'numpy', 'epytext', 'pep257'] = 'sphinx') → None

Generate Python documentation for a single file using LLM.

This function reads a Python source file, generates documentation for it using an LLM, and writes the documented code back to the same file, replacing the original content. The output covers the module-level docstring as well as class and function docstrings, written in reStructuredText for the default 'sphinx' docstyle.

The function uses the cached LLM task from _get_llm_task() to perform the documentation generation. The generated documentation is automatically formatted and validated before being written back to the file.

Parameters:

file (str) – Path to the Python file to document
model_name (Optional[str]) – Name of the LLM model to use. If None, uses default from configuration.
timeout (int) – Timeout in seconds for LLM API requests. Defaults to 240 seconds.
extra_params (Optional[Dict[str, Union[str, int, float]]]) – Additional parameters to pass to the LLM model as a dictionary. Common parameters include ‘max_tokens’, ‘temperature’, etc.
ignore_modules (Optional[Tuple[str, ...]]) – Tuple of module names to explicitly ignore during dependency analysis.
no_ignore_modules (Optional[Tuple[str, ...]]) – Tuple of module names to never ignore during dependency analysis.
docstyle (Literal['sphinx', 'google', 'numpy', 'epytext', 'pep257']) – Documentation style for generation. Supported values are ‘sphinx’, ‘google’, ‘numpy’, ‘epytext’, and ‘pep257’.

Raises:

FileNotFoundError – If the specified file does not exist
PermissionError – If the file cannot be read or written
ValueError – If the file is not a valid Python file
RuntimeError – If documentation generation fails

Warning

This function overwrites the original file with the documented version. Ensure you have backups or version control in place before running.
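
When version control is not available, a simple copy-aside step is one way to follow this advice. The sketch below is an illustration only: run_with_backup and the .bak suffix are hypothetical, and the generator is passed in as a callable so the helper does not depend on this package.

```python
import shutil
from pathlib import Path
from typing import Callable


def run_with_backup(path: str, generate: Callable[[str], None],
                    suffix: str = ".bak") -> Path:
    """Hypothetical helper: copy `path` aside, run `generate`, restore on error."""
    source = Path(path)
    backup = source.with_name(source.name + suffix)
    shutil.copy2(source, backup)  # keep the original content and metadata
    try:
        generate(str(source))
    except Exception:
        shutil.copy2(backup, source)  # put the original back if generation failed
        raise
    return backup
```

A call might look like run_with_backup('mymodule.py', lambda p: generate_pydoc_for_file(p, model_name='gpt-4')); the returned path points at the untouched copy.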

Note

The function uses max_retries=0 when calling the LLM task, meaning it will not retry on failure. Any errors during generation will be propagated.
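
Because no retry happens internally, callers who want one have to add it themselves. A minimal sketch follows; call_with_retries is a hypothetical helper, and retrying on RuntimeError is an assumption based on the Raises list above. Whether a failed run leaves the target file unchanged is not stated here, so it may be wise to combine retries with a backup.

```python
import time
from typing import Callable, Tuple, Type


def call_with_retries(action: Callable[[], None], attempts: int = 3,
                      delay: float = 0.0,
                      retry_on: Tuple[Type[BaseException], ...] = (RuntimeError,)) -> int:
    """Hypothetical helper: run `action` up to `attempts` times; return attempts used."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            action()
            return attempt
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: propagate the last error
            time.sleep(delay)  # wait before the next try
    raise AssertionError("unreachable")
```

For example, call_with_retries(lambda: generate_pydoc_for_file('mymodule.py', model_name='gpt-4'), attempts=3, delay=5.0) would try up to three times, pausing five seconds between attempts.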

Example:

>>> from hbllmutils.entry.code.pydoc import generate_pydoc_for_file
>>>
>>> # Generate docs for a single file
>>> generate_pydoc_for_file('mymodule.py', model_name='gpt-4')
>>>
>>> # With custom timeout and parameters
>>> params = {'max_tokens': 128000, 'temperature': 0.7}
>>> generate_pydoc_for_file(
...     'mymodule.py',
...     model_name='gpt-4',
...     timeout=300,
...     extra_params=params
... )
>>>
>>> # With module filtering
>>> generate_pydoc_for_file(
...     'mymodule.py',
...     model_name='gpt-4',
...     ignore_modules=('numpy', 'pandas'),
...     no_ignore_modules=('mypackage',)
... )
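
Directory-level processing is exposed only through the CLI, but a similar loop can be written around the single-file function. The sketch below is an assumption about how such a loop might look, not the CLI's implementation: document_tree is a hypothetical helper, and it takes the generator as a callable so that each failure can be recorded without stopping the run.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def document_tree(root: str,
                  generate: Callable[[str], None]) -> Dict[str, Optional[Exception]]:
    """Hypothetical helper: apply `generate` to every .py file under `root`."""
    results: Dict[str, Optional[Exception]] = {}
    for path in sorted(Path(root).rglob("*.py")):
        try:
            generate(str(path))
            results[str(path)] = None  # success
        except Exception as exc:  # record the failure and keep going
            results[str(path)] = exc
    return results
```

A call such as document_tree('mypackage/', lambda p: generate_pydoc_for_file(p, model_name='gpt-4')) returns a mapping from file path to None on success or to the exception that was raised.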