hbllmutils.entry.code.pydoc

Python Documentation Generation Module

This module provides the command-line interface for generating Python documentation with an LLM. It processes a single Python file or a directory of Python files and generates formatted, structured documentation for each.

The module contains the following main components:

  • generate_pydoc_for_file() - Generate documentation for a single Python file

  • _add_pydoc_subcommand() - Register the pydoc CLI subcommand

  • _get_llm_task() - Create and cache LLM task instances for documentation generation

Note

This module requires an LLM model configuration to be available either through command-line parameters or environment variables (OPENAI_MODEL_NAME, LLM_MODEL_NAME, or MODEL_NAME).
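The fallback described in this note can be sketched as follows. This is a minimal illustration, not the module's own code: the helper name `resolve_model_name` is hypothetical, and the lookup order (command-line value first, then the three environment variables in the order listed) is an assumption.

```python
import os
from typing import Mapping, Optional

# Environment variables named in the note. The order in which they are
# checked is an assumption made for this sketch.
MODEL_ENV_VARS = ("OPENAI_MODEL_NAME", "LLM_MODEL_NAME", "MODEL_NAME")


def resolve_model_name(
    cli_value: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: pick a model name from the CLI or the environment."""
    # An explicit command-line value wins over anything in the environment.
    if cli_value:
        return cli_value
    env = os.environ if env is None else env
    # Otherwise return the first non-empty environment variable.
    for name in MODEL_ENV_VARS:
        value = env.get(name)
        if value:
            return value
    # Nothing configured; the caller decides how to report this.
    return None
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.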

Warning

Documentation generation may consume significant API tokens for large files or directories. Monitor your API usage when processing multiple files.

Example:

>>> # Command line usage
>>> # Generate docs for a single file
>>> # hbllmutils code pydoc -i mymodule.py -m gpt-4
>>> 
>>> # Generate docs for a directory
>>> # hbllmutils code pydoc -i mypackage/ -m gpt-4 --timeout 300
>>> 
>>> # With additional parameters
>>> # hbllmutils code pydoc -i myfile.py --param max_tokens=128000
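The `--param max_tokens=128000` option in the last command has to end up as an entry in a `Dict[str, str | int | float]`, the type accepted by `extra_params`. How the CLI actually converts the value is not documented here; the sketch below shows one plausible conversion, with the hypothetical helper `parse_params` trying `int`, then `float`, and falling back to the raw string.

```python
from typing import Dict, Iterable, Union

ParamValue = Union[str, int, float]


def parse_params(pairs: Iterable[str]) -> Dict[str, ParamValue]:
    """Hypothetical helper: turn KEY=VALUE strings into a typed dictionary."""
    params: Dict[str, ParamValue] = {}
    for pair in pairs:
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, raw = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        value: ParamValue
        try:
            value = int(raw)
        except ValueError:
            try:
                value = float(raw)
            except ValueError:
                # Not numeric: keep the text as-is.
                value = raw
        params[key] = value
    return params
```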

generate_pydoc_for_file

hbllmutils.entry.code.pydoc.generate_pydoc_for_file(file: str, model_name: str | None = None, timeout: int = 240, extra_params: Dict[str, str | int | float] | None = None, ignore_modules: Tuple[str, ...] | None = None, no_ignore_modules: Tuple[str, ...] | None = None) → None

Generate Python documentation for a single file using LLM.

This function reads a Python source file, generates comprehensive documentation using an LLM model, and writes the documented code back to the same file, replacing the original content. The generated documentation covers module-level, class, and function docstrings in reStructuredText format.

The function uses the cached LLM task from _get_llm_task() to perform the documentation generation. The generated documentation is automatically formatted and validated before being written back to the file.
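The idea of a cached task can be illustrated with `functools.lru_cache`. This sketch does not reproduce `_get_llm_task()`, whose implementation and cache key are not shown here; `FakeTask` and `get_task` are hypothetical stand-ins, and keying the cache on the model name and timeout is an assumption.

```python
from functools import lru_cache
from typing import Optional


class FakeTask:
    """Stand-in for an LLM task object; not part of hbllmutils."""

    def __init__(self, model_name: Optional[str], timeout: int) -> None:
        self.model_name = model_name
        self.timeout = timeout


@lru_cache(maxsize=None)
def get_task(model_name: Optional[str] = None, timeout: int = 240) -> FakeTask:
    """Hypothetical factory: build a task once per distinct argument set."""
    # lru_cache stores the result, so repeated calls with the same
    # arguments return the same object instead of building a new one.
    return FakeTask(model_name, timeout)
```

Caching this way means every file in a directory run can share one task object as long as the settings do not change.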

Parameters:
  • file (str) – Path to the Python file to document

  • model_name (Optional[str]) – Name of the LLM model to use. If None, uses default from configuration.

  • timeout (int) – Timeout in seconds for LLM API requests. Defaults to 240 seconds.

  • extra_params (Optional[Dict[str, Union[str, int, float]]]) – Additional parameters to pass to the LLM model as a dictionary. Common parameters include ‘max_tokens’, ‘temperature’, etc.

  • ignore_modules (Optional[Tuple[str, ...]]) – Tuple of module names to explicitly ignore during dependency analysis.

  • no_ignore_modules (Optional[Tuple[str, ...]]) – Tuple of module names to never ignore during dependency analysis.

Raises:
  • FileNotFoundError – If the specified file does not exist

  • PermissionError – If the file cannot be read or written

  • ValueError – If the file is not a valid Python file

  • RuntimeError – If documentation generation fails

Warning

This function overwrites the original file with the documented version. Ensure you have backups or version control in place before running.
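When version control is not available, a simple copy made before the call is one way to follow this advice. The helper `backup_file` below is hypothetical and not part of hbllmutils; it only uses the standard library.

```python
import shutil
from pathlib import Path


def backup_file(path: str, suffix: str = ".bak") -> Path:
    """Hypothetical helper: copy a file next to itself before it is overwritten."""
    source = Path(path)
    # 'mymodule.py' becomes 'mymodule.py.bak' in the same directory.
    target = source.with_name(source.name + suffix)
    # copy2 also preserves timestamps and permission bits where possible.
    shutil.copy2(source, target)
    return target


# Intended use (not executed here):
#     backup_file('mymodule.py')
#     generate_pydoc_for_file('mymodule.py', model_name='gpt-4')
```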

Note

The function uses max_retries=0 when calling the LLM task, meaning it will not retry on failure. Any errors during generation will be propagated.
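Because no retry happens inside the function, a caller who wants retries has to add them. The wrapper below is a generic sketch under stated assumptions: `call_with_retries` is a hypothetical name, and retrying only on `RuntimeError` is a choice based on the exceptions listed above. Whether a failed call leaves the target file untouched is not stated in this documentation, so check that before retrying on real files.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    attempts: int = 3,
    delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (RuntimeError,),
) -> T:
    """Hypothetical helper: call func, retrying on selected exceptions."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            # Re-raise once the final attempt has failed.
            if attempt == attempts:
                raise
            # Linear backoff: wait longer after each failure.
            time.sleep(delay * attempt)
    raise AssertionError("unreachable")


# Intended use (not executed here):
#     call_with_retries(
#         lambda: generate_pydoc_for_file('mymodule.py', model_name='gpt-4')
#     )
```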

Example:

>>> from hbllmutils.entry.code.pydoc import generate_pydoc_for_file
>>> 
>>> # Generate docs for a single file
>>> generate_pydoc_for_file('mymodule.py', model_name='gpt-4')
>>> 
>>> # With custom timeout and parameters
>>> params = {'max_tokens': 128000, 'temperature': 0.7}
>>> generate_pydoc_for_file(
...     'mymodule.py',
...     model_name='gpt-4',
...     timeout=300,
...     extra_params=params
... )
>>> 
>>> # With module filtering
>>> generate_pydoc_for_file(
...     'mymodule.py',
...     model_name='gpt-4',
...     ignore_modules=('numpy', 'pandas'),
...     no_ignore_modules=('mypackage',)
... )
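The function documents one file per call, so applying it to a whole package from Python means collecting the files first. The sketch below shows one way to do that with `pathlib`; `iter_python_files` is a hypothetical helper, and how the CLI itself traverses a directory is not described here.

```python
from pathlib import Path
from typing import Iterator


def iter_python_files(root: str) -> Iterator[Path]:
    """Hypothetical helper: yield .py files under root in a stable order."""
    base = Path(root)
    # A single file is yielded as-is if it looks like Python source.
    if base.is_file():
        if base.suffix == ".py":
            yield base
        return
    # Recurse into subdirectories; sorting keeps runs reproducible.
    for path in sorted(base.rglob("*.py")):
        if path.is_file():
            yield path


# Intended use (not executed here):
#     for path in iter_python_files('mypackage/'):
#         generate_pydoc_for_file(str(path), model_name='gpt-4')
```

Each file triggers its own LLM request, so the token-usage warning at the top of this page applies to every iteration of such a loop.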