hbllmutils.meta.code.task

LLM task implementations for Python code generation with validation and parsing capabilities.

This module provides specialized LLM task classes for generating and validating Python code using Large Language Models. It extends the base parsable task framework to provide Python-specific functionality with automatic syntax validation and comprehensive source file analysis.

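The module contains the following main components:

  • PythonCodeGenerationLLMTask – Generates Python code and validates it with automatic syntax checking

  • PythonDetailedCodeGenerationLLMTask – Generates Python code with comprehensive source file analysis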

Key features include:

  • Automatic Python syntax validation using AST parsing

  • Configurable retry mechanisms for code generation failures

  • Integration with source file analysis and dependency tracking

  • Support for detailed code generation with customizable prompts

  • Module directory tree visualization support

  • Comprehensive error handling and logging

Note

By default, all code generation tasks validate Python syntax using the ast module before returning results, so the returned code is syntactically correct. Passing force_ast_check=False skips this validation.

Warning

Code generation may require multiple LLM API calls if parsing fails, which can increase costs and response times. Set default_max_retries (or the per-call max_retries argument) with this in mind.
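To illustrate why retries add cost, here is a minimal sketch of a parse-and-retry loop. It is not the hbllmutils implementation: generate_with_retries and fake_llm are hypothetical names, and treating max_retries as the total number of attempts is an assumption. Every failed parse leads to one more model call.

```python
import ast
from typing import Callable, List


def generate_with_retries(ask: Callable[[], str], max_retries: int) -> str:
    """Call ``ask`` until its output parses as Python, at most ``max_retries`` times.

    Hypothetical helper for illustration only; not part of hbllmutils.
    """
    errors: List[Exception] = []
    for _ in range(max_retries):
        candidate = ask()  # one (potentially billed) model call per attempt
        try:
            ast.parse(candidate)
            return candidate
        except (SyntaxError, ValueError) as err:
            errors.append(err)
    raise RuntimeError(f"no valid code after {len(errors)} attempts")


# Stand-in for a model: the first reply is broken, the second is valid.
replies = iter(["def add(a, b) return a + b", "def add(a, b):\n    return a + b"])
call_log: List[str] = []


def fake_llm() -> str:
    reply = next(replies)
    call_log.append(reply)
    return reply


code = generate_with_retries(fake_llm, max_retries=3)
print(len(call_log))  # prints 2: one failed attempt plus one successful attempt
```

In this sketch a single syntax error doubles the number of model calls, which is the effect the warning describes.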

Example:

>>> from hbllmutils.model import LLMModel
>>> from hbllmutils.meta.code.task import PythonCodeGenerationLLMTask
>>> 
>>> # Basic code generation
>>> model = LLMModel(...)
>>> task = PythonCodeGenerationLLMTask(model, default_max_retries=3)
>>> code = task.ask_then_parse(input_content="Write a function to add two numbers")
>>> print(code)
def add(a, b):
    return a + b
>>> 
>>> # Detailed code generation with source analysis
>>> from hbllmutils.meta.code.task import PythonDetailedCodeGenerationLLMTask
>>> task = PythonDetailedCodeGenerationLLMTask(
...     model=model,
...     code_name="calculator",
...     description_text="Generate comprehensive unit tests",
...     show_module_directory_tree=True
... )
>>> code = task.ask_then_parse(input_content="path/to/calculator.py")

PythonCodeGenerationLLMTask

class hbllmutils.meta.code.task.PythonCodeGenerationLLMTask(model: str | LLMModel, history: LLMHistory | None = None, default_max_retries: int = 5, force_ast_check: bool = True)[source]

An LLM task for generating and validating Python code with automatic syntax checking.

This task extends ParsableLLMTask to provide Python-specific code generation capabilities. It automatically extracts code from the model’s response and validates it using Python’s AST parser to ensure syntactic correctness. The task will retry on parsing failures up to the configured maximum number of retries.

The validation process:

  1. Extracts code from the model’s response (handles both plain code and fenced code blocks)

  2. Parses the code using ast.parse() to validate Python syntax

  3. Returns the validated code if successful

  4. Raises an exception and retries if parsing fails

The task catches SyntaxError and ValueError exceptions during parsing, which trigger automatic retries. Other exceptions will propagate immediately.
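The validation steps above can be sketched in a few lines of standard-library Python. This is only an approximation under stated assumptions: extract_and_validate is a hypothetical helper, and the extraction rules actually used by hbllmutils may differ (for example in how multiple fenced blocks are handled).

```python
import ast
import re

FENCE = "`" * 3  # the Markdown code-fence marker

# First fenced block, with an optional language tag on the opening line.
_FENCE_RE = re.compile(
    re.escape(FENCE) + r"[^\n]*\n(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_and_validate(response: str) -> str:
    """Return syntactically valid Python extracted from a model response.

    Hypothetical helper; not the library's own extraction logic.
    """
    match = _FENCE_RE.search(response)
    code = match.group(1) if match else response  # fall back to plain code
    code = code.rstrip()
    ast.parse(code)  # raises SyntaxError on invalid Python
    return code


fenced = f"Here you go:\n{FENCE}python\ndef add(a, b):\n    return a + b\n{FENCE}\n"
plain = "def add(a, b):\n    return a + b\n"
print(extract_and_validate(fenced) == extract_and_validate(plain))  # prints True
```

A caller that wants retry behaviour would catch the SyntaxError raised by ast.parse() and ask the model again.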

Parameters:
  • model (LLMModelTyping) – The LLM model to use for code generation.

  • history (Optional[LLMHistory]) – Optional conversation history. If None, a new history will be created.

  • default_max_retries (int) – Maximum number of retry attempts for code generation and parsing. Defaults to 5.

  • force_ast_check (bool) – If True, always validate code with AST parsing. If False, skip AST validation (useful for code snippets that may not be complete valid Python modules). Defaults to True.
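As an illustration of why disabling the check can be useful, the following sketch shows a snippet that is reasonable as a fragment but is rejected when parsed as a complete module. It uses only the standard library; the fragment itself is a made-up example.

```python
import ast
import textwrap

# A method copied out of a class body: fine as a snippet, invalid as a module.
fragment = "    def area(self):\n        return self.width * self.height\n"

try:
    ast.parse(fragment)
    parses_as_module = True
except SyntaxError:  # IndentationError is a subclass of SyntaxError
    parses_as_module = False

# Removing the common indentation turns the fragment into a valid module.
dedented_ok = isinstance(ast.parse(textwrap.dedent(fragment)), ast.Module)

print(parses_as_module, dedented_ok)  # prints: False True
```

With AST validation enabled, output like the indented fragment would count as a parse failure and trigger a retry.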

Variables:
  • force_ast_check (bool) – Whether to enforce AST validation on generated code.

Note

The task strips trailing whitespace from the extracted code to ensure clean output formatting.

Warning

AST validation only checks syntax, not semantic correctness. The generated code may still contain logical errors or runtime issues.
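A short standard-library example makes the limitation concrete: the source below passes ast.parse() yet fails when the function is called. The snippet is a made-up illustration, not output from any model.

```python
import ast

# Syntactically valid, but it references an undefined name, ``total``.
source = "def mean(values):\n    return total(values) / len(values)\n"

ast.parse(source)  # passes: the syntax is fine

namespace = {}
exec(source, namespace)  # defining the function also succeeds
try:
    namespace["mean"]([1, 2, 3])
    runtime_error = None
except NameError as err:
    runtime_error = err

print(type(runtime_error).__name__)  # prints NameError
```

Catching problems like this requires running tests or a linter on the generated code; syntax validation alone will not find them.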

Example:

>>> from hbllmutils.model import LLMModel
>>> from hbllmutils.meta.code.task import PythonCodeGenerationLLMTask
>>> 
>>> model = LLMModel(...)
>>> task = PythonCodeGenerationLLMTask(model, default_max_retries=3)
>>> 
>>> # Generate a simple function
>>> code = task.ask_then_parse(input_content="Write a function to add two numbers")
>>> print(code)
def add(a, b):
    return a + b
>>> 
>>> # Generate with forced AST checking
>>> task = PythonCodeGenerationLLMTask(model, force_ast_check=True)
>>> code = task.ask_then_parse(input_content="Write a class for a calculator")
>>> print(code)
class Calculator:
    def add(self, a, b):
        return a + b
    def subtract(self, a, b):
        return a - b
>>> 
>>> # Handle generation failures (OutputParseFailed must first be imported from the module that defines it)
>>> try:
...     code = task.ask_then_parse(
...         input_content="Generate invalid code",
...         max_retries=2
...     )
... except OutputParseFailed as e:
...     print(f"Failed after {len(e.tries)} attempts")

__init__(model: str | LLMModel, history: LLMHistory | None = None, default_max_retries: int = 5, force_ast_check: bool = True)[source]

Initialize the PythonCodeGenerationLLMTask.

Parameters:
  • model (LLMModelTyping) – The LLM model to use for code generation.

  • history (Optional[LLMHistory]) – Optional conversation history. If None, creates a new history.

  • default_max_retries (int) – Maximum retry attempts for parsing. Defaults to 5.

  • force_ast_check (bool) – Whether to enforce AST validation. Defaults to True.

PythonDetailedCodeGenerationLLMTask

class hbllmutils.meta.code.task.PythonDetailedCodeGenerationLLMTask(model: str | LLMModel, code_name: str, description_text: str, history: LLMHistory | None = None, default_max_retries: int = 5, show_module_directory_tree: bool = False, skip_when_error: bool = True, force_ast_check: bool = True, ignore_modules: Iterable[str] | None = None, no_ignore_modules: Iterable[str] | None = None)[source]

An advanced LLM task for generating Python code with comprehensive source file analysis.

This task extends PythonCodeGenerationLLMTask to provide detailed code generation capabilities that include:

  • Full source file analysis with package namespace information

  • Dependency analysis showing all imports and their implementations

  • Optional module directory tree visualization

  • Customizable code generation prompts and descriptions

  • Configurable error handling behavior

The task preprocesses input by generating a comprehensive prompt that includes the source file content, its dependencies, and optional contextual information like the module directory structure. This enriched context helps the LLM generate more accurate and contextually appropriate code.

The generated prompt includes:

  • Source file location and package namespace

  • Complete source code of the target file

  • Optional directory tree visualization

  • Dependency analysis with import statements and implementations

  • Custom description text for additional context
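The section order listed above could be assembled roughly as follows. This is a sketch only: build_title, build_prompt, and every section label are hypothetical, and the real prompt produced by hbllmutils is likely formatted differently. The title rule follows the documented behaviour of code_name ("primary" gives "Primary Source Code Analysis").

```python
from typing import Optional


def build_title(code_name: Optional[str]) -> str:
    """Prefix the title with ``code_name`` if given, else use the generic title."""
    if code_name:
        return f"{code_name.capitalize()} Source Code Analysis"
    return "Source Code Analysis"


def build_prompt(
    code_name: Optional[str],
    source_path: str,
    package: str,
    source_code: str,
    dependencies: str,
    description_text: str,
    directory_tree: Optional[str] = None,
) -> str:
    """Join the prompt sections in the listed order (layout is illustrative)."""
    parts = [
        build_title(code_name),
        f"Location: {source_path}",
        f"Package namespace: {package}",
        "Source code:\n" + source_code,
    ]
    if directory_tree is not None:  # only when the tree option is enabled
        parts.append("Directory tree:\n" + directory_tree)
    parts.append("Dependencies:\n" + dependencies)
    parts.append(description_text)
    return "\n\n".join(parts)


prompt = build_prompt(
    code_name="primary",
    source_path="pkg/calc.py",
    package="pkg.calc",
    source_code="def add(a, b):\n    return a + b",
    dependencies="(no imports)",
    description_text="Generate unit tests",
)
print(prompt.splitlines()[0])  # prints: Primary Source Code Analysis
```

Because the source file and each dependency are included verbatim, the prompt grows with the size of the analyzed module.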

Parameters:
  • model (LLMModelTyping) – The LLM model to use for code generation.

  • code_name (str) – The name/label for the code section in the generated prompt. Used as a prefix for the title (e.g., “primary” results in “Primary Source Code Analysis”). If None, uses “Source Code Analysis”.

  • description_text (str) – Descriptive text to include in the prompt, providing additional context or instructions for code generation.

  • history (Optional[LLMHistory]) – Optional conversation history. If None, a new history will be created.

  • default_max_retries (int) – Maximum number of retry attempts for code generation and parsing. Defaults to 5.

  • show_module_directory_tree (bool) – If True, include a directory tree visualization of the module structure in the generated prompt. Defaults to False.

  • skip_when_error (bool) – If True, skip imports that fail to load during analysis and issue warnings instead of raising exceptions. Defaults to True.

  • force_ast_check (bool) – If True, always validate generated code with AST parsing. Defaults to True.

  • ignore_modules (Optional[Iterable[str]]) – Optional iterable of module names that should be explicitly ignored during dependency analysis regardless of download count or other criteria.

  • no_ignore_modules (Optional[Iterable[str]]) – Optional iterable of module names that should never be ignored during dependency analysis regardless of download count or other filtering criteria.

Variables:
  • code_name (str) – The name/label for the code section in prompts.

  • description_text (str) – Description text for prompt context.

  • show_module_directory_tree (bool) – Whether to include directory tree in prompts.

  • skip_when_error (bool) – Whether to skip failed imports during analysis.

  • ignore_modules (Optional[Iterable[str]]) – Module names to explicitly ignore during analysis.

  • no_ignore_modules (Optional[Iterable[str]]) – Module names to never ignore during analysis.

Note

This task is particularly useful for generating documentation, unit tests, or refactored code that requires understanding of the full module context.

Warning

Analyzing large modules with many dependencies can generate very long prompts, which may exceed token limits for some LLM models. Consider the model’s context window when using this task.

Example:

>>> from hbllmutils.model import LLMModel
>>> from hbllmutils.meta.code.task import PythonDetailedCodeGenerationLLMTask
>>> 
>>> model = LLMModel(...)
>>> 
>>> # Generate unit tests with full context
>>> task = PythonDetailedCodeGenerationLLMTask(
...     model=model,
...     code_name="calculator",
...     description_text="Generate comprehensive unit tests for this module",
...     show_module_directory_tree=True,
...     default_max_retries=3
... )
>>> code = task.ask_then_parse(input_content="path/to/calculator.py")
>>> print(code)
import unittest
from calculator import add, subtract, multiply, divide

class TestCalculator(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(2, 3), 5)
    ...
>>> 
>>> # Generate documentation with context
>>> task = PythonDetailedCodeGenerationLLMTask(
...     model=model,
...     code_name="api_handler",
...     description_text="Generate comprehensive API documentation",
...     skip_when_error=True
... )
>>> docs = task.ask_then_parse(input_content="mypackage/api.py")
>>> 
>>> # Raise on analysis errors instead of skipping failed imports
>>> task = PythonDetailedCodeGenerationLLMTask(
...     model=model,
...     code_name="module",
...     description_text="Analyze this code",
...     skip_when_error=False
... )
>>> try:
...     code = task.ask_then_parse(input_content="problematic_module.py")
... except Exception as e:
...     print(f"Analysis failed: {e}")

__init__(model: str | LLMModel, code_name: str, description_text: str, history: LLMHistory | None = None, default_max_retries: int = 5, show_module_directory_tree: bool = False, skip_when_error: bool = True, force_ast_check: bool = True, ignore_modules: Iterable[str] | None = None, no_ignore_modules: Iterable[str] | None = None)[source]

Initialize the PythonDetailedCodeGenerationLLMTask.

Parameters:
  • model (LLMModelTyping) – The LLM model to use for code generation.

  • code_name (str) – The name/label for the code section in the generated prompt.

  • description_text (str) – Descriptive text providing context for code generation.

  • history (Optional[LLMHistory]) – Optional conversation history. If None, creates a new history.

  • default_max_retries (int) – Maximum retry attempts for parsing. Defaults to 5.

  • show_module_directory_tree (bool) – Whether to include directory tree in the prompt. Defaults to False.

  • skip_when_error (bool) – Whether to skip failed imports during analysis. Defaults to True.

  • force_ast_check (bool) – Whether to enforce AST validation. Defaults to True.

  • ignore_modules (Optional[Iterable[str]]) – Optional iterable of module names to explicitly ignore during dependency analysis.

  • no_ignore_modules (Optional[Iterable[str]]) – Optional iterable of module names to never ignore during dependency analysis.