hbllmutils.meta.code.prompt

Code prompt generation for LLM analysis and documentation tasks.

This module provides functionality for generating comprehensive code prompts suitable for Large Language Model (LLM) analysis. It creates structured prompts containing source code and its dependencies, formatted in Markdown with clear sections for primary source code and dependency analysis.

The module is designed to support various LLM tasks including:

Automated documentation generation
Unit test creation
Code analysis and review
Code understanding and explanation
Refactoring suggestions

The generated prompts include:

Primary source code with file location and package namespace information
Optional module directory tree visualization
Comprehensive dependency analysis with import statements
Full implementation source code for each dependency
Proper Markdown formatting with hierarchical headers and code blocks

The module contains the following main components:

is_python_code() - Validate if text is syntactically correct Python code
is_python_file() - Check if a file contains valid Python code
get_prompt_for_source_file() - Generate comprehensive code prompts for LLM analysis

Note

This module requires the source file to be part of a valid Python package structure with proper __init__.py files for accurate package name resolution.
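
To illustrate why the `__init__.py` files matter, the sketch below derives a dotted package name by walking up from a source file for as long as each parent directory contains an `__init__.py`. The helper `guess_package_name` is hypothetical and not part of hbllmutils; the library's actual resolution logic may differ.

```python
from pathlib import Path


def guess_package_name(source_file: str) -> str:
    """Hypothetical helper: infer a dotted module path from __init__.py markers."""
    path = Path(source_file).resolve()
    # A package's own __init__.py names the package, not a submodule.
    parts = [] if path.stem == "__init__" else [path.stem]
    parent = path.parent
    # Climb while the directory is marked as a package (stop at the filesystem root).
    while parent != parent.parent and (parent / "__init__.py").is_file():
        parts.append(parent.name)
        parent = parent.parent
    return ".".join(reversed(parts))
```

In this sketch, a file outside any package resolves to its bare module name, which is one way a missing `__init__.py` can produce an inaccurate package path.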

Warning

Large projects with many dependencies may generate very large prompts that exceed the token limits of some LLMs. Consider filtering dependencies or processing files individually.
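
One way to act on this warning is to estimate the prompt size and fall back to a dependency-free prompt when it is too large. The sketch below is an assumption-laden illustration: `estimate_tokens` and `build_within_budget` are hypothetical helpers, and the four-characters-per-token ratio is a rough rule of thumb, not a real tokenizer.

```python
from typing import Callable


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Crude size estimate; the ratio is a rule of thumb, not a tokenizer."""
    return int(len(text) / chars_per_token) + 1


def build_within_budget(build: Callable[..., str], source_file: str,
                        max_tokens: int) -> str:
    """Try the full prompt first, then retry without the dependency section."""
    prompt = build(source_file)
    if estimate_tokens(prompt) <= max_tokens:
        return prompt
    # no_imports=True is documented to skip the dependency analysis section.
    prompt = build(source_file, no_imports=True)
    if estimate_tokens(prompt) > max_tokens:
        raise ValueError(
            f"{source_file}: prompt exceeds {max_tokens} tokens even without imports"
        )
    return prompt
```

Passing `get_prompt_for_source_file` as `build` would apply this fallback to real prompts; for accurate limits, the estimate should be replaced with the target model's own tokenizer.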

Example:

>>> from hbllmutils.meta.code.prompt import get_prompt_for_source_file
>>> 
>>> # Generate a basic prompt for documentation
>>> prompt = get_prompt_for_source_file('mymodule.py')
>>> prompt[:200]
'## Primary Source Code Analysis\n\n**Source File Location:** `mymodule.py`...'
>>> 
>>> # Generate with custom description and directory tree
>>> prompt = get_prompt_for_source_file(
...     'mymodule.py',
...     description_text='Generate comprehensive pydoc for this module.',
...     show_module_directory_tree=True
... )

is_python_code

hbllmutils.meta.code.prompt.is_python_code(code_text: str) → bool

Check if the given text is valid Python code.

This function attempts to parse the provided text using Python’s AST parser to determine if it represents syntactically valid Python code. It does not execute the code or check for semantic correctness, only syntax validity.

Parameters:

code_text (str) – The text string to check for Python code validity.

Returns:

True if the text is valid Python code, False otherwise.

Return type:

bool

Note

This function only validates syntax, not semantics. Code that parses successfully may still have runtime errors or logical issues.

Example:

>>> is_python_code("print('hello')")
True
>>> is_python_code("def foo(): return 42")
True
>>> is_python_code("invalid python code {{{")
False
>>> is_python_code("x = 1 + 2")
True
>>> is_python_code("")
True
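
A syntax-only check of this kind can be sketched with the standard library's `ast.parse`. The function name `looks_like_python` is a hypothetical stand-in, and the actual implementation in hbllmutils may handle errors differently.

```python
import ast


def looks_like_python(code_text: str) -> bool:
    """Hypothetical stand-in: True if the text parses as Python source."""
    try:
        # Parsing builds a syntax tree only; nothing is executed.
        ast.parse(code_text)
    except (SyntaxError, ValueError):
        # ValueError covers inputs such as embedded null bytes on some versions.
        return False
    return True
```

Because only parsing takes place, a statement like `undefined_name + 1` passes this check even though running it would raise `NameError`.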

is_python_file

hbllmutils.meta.code.prompt.is_python_file(code_file: str) → bool

Check if a file contains valid Python code.

This function first checks whether the file is binary. If it is not, the function reads the file's content and validates it as syntactically valid Python using AST parsing.

Parameters:

code_file (str) – The path to the file to check.

Returns:

True if the file is a text file containing valid Python code, False if it’s binary or contains invalid Python syntax.

Return type:

bool

Raises:

FileNotFoundError – If the specified file does not exist.
PermissionError – If the file cannot be read due to permissions.

Note

This function reads the entire file content into memory, which may be inefficient for very large files.

Example:

>>> is_python_file('module.py')
True
>>> is_python_file('data.json')
False
>>> is_python_file('script.sh')
False
>>> is_python_file('image.png')
False
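
The two-stage check described above can be sketched as follows. This is not the library's code: `looks_like_python_file` is a hypothetical name, and treating a null byte in the first 8 KiB or a UTF-8 decoding failure as "binary" is an assumed heuristic.

```python
import ast
from pathlib import Path


def looks_like_python_file(code_file: str) -> bool:
    """Hypothetical stand-in: reject binary files, then validate Python syntax."""
    raw = Path(code_file).read_bytes()  # reads the whole file into memory
    # Assumed heuristic: text files rarely contain null bytes.
    if b"\x00" in raw[:8192]:
        return False
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    try:
        ast.parse(text)
    except (SyntaxError, ValueError):
        return False
    return True
```

As in the documented function, a missing or unreadable file surfaces as `FileNotFoundError` or `PermissionError` from the read step rather than as a `False` result.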

get_prompt_for_source_file

hbllmutils.meta.code.prompt.get_prompt_for_source_file(source_file: str, level: int = 2, code_name: str | None = 'primary', description_text: str | None = None, show_module_directory_tree: bool = True, skip_when_error: bool = True, min_last_month_downloads: int = 1000000, no_imports: bool = False, ignore_modules: Iterable[str] | None = None, no_ignore_modules: Iterable[str] | None = None, warning_when_not_python: bool = True, return_imported_items: bool = False) → str | Tuple[str, List[str]]

Generate a comprehensive code prompt for LLM analysis.

This function creates a structured prompt containing source code and its dependencies, suitable for various LLM tasks like documentation generation, unit testing, or code analysis. The prompt includes:

Primary source code analysis section with file location, package namespace, and complete source
Optional module directory tree visualization showing the file’s location in the project structure
Dependency analysis section with all imported dependencies and their implementations
For each import, includes the import statement, source file location, full package path, and either the implementation source code or object representation

The generated prompt is formatted in Markdown with code blocks and hierarchical headers, making it easy for LLMs to parse and understand the code structure and dependencies.
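
To make the Markdown layout concrete, the sketch below renders only the opening of a primary section: a heading at the requested level, an optional description, the file location line, and a fenced code block. `render_primary_section` is a hypothetical helper. Capitalizing `code_name` is an assumption inferred from the documented 'Primary Source Code Analysis' title, and the real prompt contains further parts (package namespace, directory tree, dependencies) that are omitted here.

```python
from typing import Optional


def render_primary_section(source_file: str, source: str, level: int = 2,
                           code_name: Optional[str] = "primary",
                           description_text: Optional[str] = None) -> str:
    """Hypothetical renderer mirroring the documented title and location lines."""
    title = "Source Code Analysis"
    if code_name:
        # Assumption: the name is capitalized and prefixed to the title.
        title = f"{code_name.capitalize()} {title}"
    fence = "`" * 3  # built at runtime so the example nests cleanly in Markdown
    lines = ["#" * level + " " + title, ""]
    if description_text:
        lines += [description_text, ""]
    lines += [f"**Source File Location:** `{source_file}`", ""]
    lines += [f"{fence}python", source.rstrip("\n"), fence]
    return "\n".join(lines)
```

With `level=3` the heading becomes `###`, which matches the documented rule that main sections use `level` and subsections use `level+1`.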

Dependencies can be filtered based on popularity (download count) and explicit inclusion/exclusion lists to control the size and relevance of the generated prompt.

Parameters:

source_file (str) – The path to the Python source file to generate a prompt for.
level (int) – The heading level for the main sections in the generated Markdown. Defaults to 2 (##). Subsections will use level+1.
code_name (Optional[str]) – The name prefixed to the code section title. The default ‘primary’ yields ‘Primary Source Code Analysis’. If None, the title is ‘Source Code Analysis’.
description_text (Optional[str]) – Optional description text to include after the title and before the source file information. Can be used to provide context or instructions for the LLM.
show_module_directory_tree (bool) – If True, include a directory tree visualization of the module structure with the current file highlighted. Defaults to True. For non-Python files, this parameter is ignored.
skip_when_error (bool) – If True, skip imports that fail to load and issue warnings instead of raising exceptions. Defaults to True. For non-Python files, this parameter is ignored.
min_last_month_downloads (int) – Last-month download threshold used to filter dependencies. Dependencies whose download count exceeds this value may be omitted to reduce prompt size. Defaults to 1000000. For non-Python files, this parameter is ignored.
no_imports (bool) – If True, skip the dependency analysis section entirely and only include the primary source code. Defaults to False. For non-Python files, this parameter is ignored.
ignore_modules (Optional[Iterable[str]]) – Optional iterable of module names that should be explicitly ignored regardless of download count or other criteria. For non-Python files, this parameter is ignored.
no_ignore_modules (Optional[Iterable[str]]) – Optional iterable of module names that should never be ignored regardless of download count or other filtering criteria. For non-Python files, this parameter is ignored.
warning_when_not_python (bool) – If True, issue warnings when Python-specific parameters are set to non-default values for non-Python files. Defaults to True.
return_imported_items (bool) – If True, return a tuple of (prompt_text, imported_items_list) instead of just the prompt text. The imported_items_list contains the full package paths of all imported dependencies that were included in the prompt. Defaults to False.

Returns:

A formatted Markdown string containing the comprehensive code prompt. If return_imported_items is True, returns a tuple of (prompt_text, imported_items_list).

Return type:

str or tuple[str, list[str]]
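
Since the return shape depends on `return_imported_items`, calling code that accepts either form may find it convenient to normalize the result. The helper below is a hypothetical convenience wrapper, not something the library provides.

```python
from typing import List, Tuple, Union

PromptResult = Union[str, Tuple[str, List[str]]]


def unpack_prompt_result(result: PromptResult) -> Tuple[str, List[str]]:
    """Normalize either documented return shape to (prompt_text, imported_items)."""
    if isinstance(result, tuple):
        prompt_text, imported_items = result
        return prompt_text, list(imported_items)
    # Plain string: no import list was requested.
    return result, []
```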

Warns UserWarning:

When Python-specific parameters are set for non-Python files and warning_when_not_python is True.

Note

The function uses get_source_info() to analyze the source file and extract import information. Import failures can be handled gracefully with skip_when_error.

Warning

Large dependency trees can generate very large prompts. Consider using min_last_month_downloads to filter out common dependencies or set no_imports=True to exclude all dependencies.
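
The interaction between the three filtering parameters can be pictured with the sketch below. It is an interpretation of the parameter descriptions, not the library's algorithm: `keep_dependency` is hypothetical, the download counts are supplied by the caller rather than fetched, and two points are assumptions (the keep-list wins when a module matches both lists, and modules with unknown download counts are kept).

```python
from typing import Iterable, Mapping, Optional


def _matches(module: str, names: Iterable[str]) -> bool:
    """True if module equals a listed name or is a submodule of one."""
    return any(module == n or module.startswith(n + ".") for n in names)


def keep_dependency(module: str,
                    last_month_downloads: Mapping[str, int],
                    min_last_month_downloads: int = 1_000_000,
                    ignore_modules: Optional[Iterable[str]] = None,
                    no_ignore_modules: Optional[Iterable[str]] = None) -> bool:
    """Hypothetical filter: True if the dependency stays in the prompt."""
    # Assumption: the keep-list takes precedence over the ignore-list.
    if _matches(module, no_ignore_modules or ()):
        return True
    if _matches(module, ignore_modules or ()):
        return False
    # Assumption: download counts are keyed by top-level package name,
    # and packages with no recorded count are kept.
    downloads = last_month_downloads.get(module.split(".")[0], 0)
    return downloads < min_last_month_downloads
```

Under this reading, raising `min_last_month_downloads` keeps more dependencies in the prompt, while lowering it drops more of them.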

Example:

>>> # Generate a prompt for a Python module
>>> prompt = get_prompt_for_source_file('mypackage/mymodule.py')
>>> prompt[:100]
'## Primary Source Code Analysis\n\n**Source File Location:** `mypackage/mymodule.py`\n\n**Package...'

>>> # Generate with custom heading level
>>> prompt = get_prompt_for_source_file('mymodule.py', level=3)
>>> # Will use ### for main sections and #### for subsections

>>> # Use the prompt for LLM tasks
>>> prompt = get_prompt_for_source_file('calculator.py')
>>> # Feed this prompt to an LLM for documentation generation, testing, etc.

>>> # Generate without code name prefix
>>> prompt = get_prompt_for_source_file('mymodule.py', code_name=None)
>>> # Title will be 'Source Code Analysis' instead of 'Primary Source Code Analysis'

>>> # Skip errors when analyzing problematic imports
>>> prompt = get_prompt_for_source_file('module_with_issues.py', skip_when_error=True)
>>> # Warnings will be issued for failed imports, but processing continues

>>> # Add custom description text
>>> prompt = get_prompt_for_source_file(
...     'mymodule.py',
...     description_text='This module implements core business logic for user authentication.'
... )
>>> # The description will appear after the title

>>> # Include module directory tree visualization
>>> prompt = get_prompt_for_source_file('mymodule.py', show_module_directory_tree=True)
>>> # The prompt will include a tree view showing the module's location in the project structure

>>> # Filter dependencies and preserve specific modules
>>> prompt = get_prompt_for_source_file(
...     'mymodule.py',
...     min_last_month_downloads=5000000,
...     no_ignore_modules=['mypackage.utils', 'mypackage.config']
... )
>>> # Dependencies above 5M monthly downloads may be omitted; the specified modules are always kept

>>> # Explicitly ignore certain modules
>>> prompt = get_prompt_for_source_file(
...     'mymodule.py',
...     ignore_modules=['deprecated_module', 'legacy_code']
... )
>>> # The specified modules will be excluded from the dependency analysis

>>> # Generate prompt without any dependencies
>>> prompt = get_prompt_for_source_file('mymodule.py', no_imports=True)
>>> # Only the primary source code will be included, no dependency analysis

>>> # Generate prompt for a non-Python file
>>> prompt = get_prompt_for_source_file('config.yaml')
>>> # Will generate a simplified prompt without Python-specific analysis

>>> # Get both prompt and list of imported items
>>> prompt, imports = get_prompt_for_source_file(
...     'mymodule.py',
...     return_imported_items=True
... )
>>> print(f"Generated prompt with {len(imports)} dependencies")
Generated prompt with 2 dependencies
>>> print(imports)
['mypackage.utils.helper', 'mypackage.config.settings']