hbllmutils.meta.code.tree
File path pattern management and directory tree building utilities for Python projects.
This module manages file path patterns and determines which files should be ignored, based on Python project conventions and custom patterns. It builds filtered directory tree structures that respect the gitignore-style patterns commonly used in Python projects.
The module contains the following main components:
is_file_should_ignore() - Check if a file should be ignored based on patterns
build_python_project_tree() - Build a filtered directory tree structure
get_python_project_tree_text() - Generate a formatted text tree representation
Key features include:
Pattern matching using gitignore-style patterns via pathspec library
Comprehensive default Python gitignore patterns covering common artifacts
Support for custom additional ignore patterns
Directory tree building with optional focus item highlighting
Text-based tree visualization with box-drawing characters
LRU caching of pattern matchers for performance optimization
Note
The module uses pathspec library for gitignore-style pattern matching, which provides robust and standards-compliant pattern evaluation.
Warning
Large directory structures may require significant time to traverse. Consider using extra_patterns to filter out unnecessary directories early.
Example:
>>> from hbllmutils.meta.code.tree import build_python_project_tree, get_python_project_tree_text
>>>
>>> # Build a directory tree structure
>>> root, tree = build_python_project_tree('/path/to/project')
>>> print(root)
project
>>>
>>> # Generate formatted text representation
>>> print(get_python_project_tree_text('/path/to/project'))
project
├── src
│   ├── main.py
│   └── utils.py
└── tests
    └── test_main.py
>>>
>>> # Highlight specific files with focus labels
>>> print(get_python_project_tree_text(
...     '/path/to/project',
...     focus_items={'entry': 'src/main.py', 'config': 'config.yaml'}
... ))
project
├── src
│   ├── main.py <-- (entry)
│   └── utils.py
├── tests
│   └── test_main.py
└── config.yaml <-- (config)
is_file_should_ignore
- hbllmutils.meta.code.tree.is_file_should_ignore(path: str | Path, extra_patterns: List[str] | None = None) -> bool
Determine whether a file should be ignored based on Python gitignore patterns.
This function checks if the given file path matches any of the default Python gitignore patterns or any additional custom patterns provided. It uses a cached PathSpec matcher for efficient pattern matching. The function handles both string paths and pathlib.Path objects, converting them to POSIX-style paths for consistent pattern matching across platforms.
- Parameters:
path (Union[str, pathlib.Path]) – The file path to check against ignore patterns. Can be absolute or relative.
extra_patterns (Optional[List[str]]) – Optional list of additional patterns to check beyond the default Python gitignore patterns. Patterns follow gitignore syntax.
- Returns:
True if the file should be ignored (matches any pattern), False otherwise.
- Return type:
bool
Note
The extra_patterns list is sorted and converted to a tuple for caching purposes. This ensures consistent cache keys regardless of the original list order.
Example:
>>> is_file_should_ignore('__pycache__/test.pyc')
True
>>> is_file_should_ignore('main.py')
False
>>> is_file_should_ignore('test.txt', extra_patterns=['*.txt'])
True
>>>
>>> # Works with pathlib.Path objects
>>> from pathlib import Path
>>> is_file_should_ignore(Path('build/output.so'))
True
>>>
>>> # Custom patterns can be added
>>> is_file_should_ignore('data.csv', extra_patterns=['*.csv', '*.json'])
True
build_python_project_tree
- hbllmutils.meta.code.tree.build_python_project_tree(root_path: str, extra_patterns: List[str] | None = None, focus_items: dict | None = None) -> Tuple[str, List]
Build a directory tree structure for a Python project while respecting ignore patterns.
This function recursively traverses the directory structure starting from the root path, filtering out files and directories that match the Python gitignore patterns or any additional custom patterns provided. It returns a tree structure representation of the project suitable for visualization or further processing. Optionally, specific files or directories can be highlighted with focus labels to draw attention to important items.
- Parameters:
root_path (str) – The root directory path to start building the tree from. Can be absolute or relative to the current working directory.
extra_patterns (Optional[List[str]]) – Optional list of additional patterns to ignore beyond the default Python gitignore patterns. Patterns follow gitignore syntax.
focus_items (Optional[dict]) – Optional dictionary mapping focus labels to file/directory paths that should be highlighted. The paths must be within the root_path or its subdirectories. Paths can be either absolute or relative to root_path. Focus items are marked with a “ <-- (label)” suffix in their names.
- Returns:
A tuple containing:
- The name of the root directory (str)
- A list of tree nodes representing the directory structure
Each tree node is a tuple of (name, children) where:
- name (str) is the file/directory name, optionally with focus suffix
- children (list) is a list of child nodes (empty for files)
- Return type:
Tuple[str, List]
- Raises:
ValueError – If a focus item path is not within the root path or its subdirectories.
PermissionError – Occurs when a directory cannot be accessed due to permissions; it is caught during traversal and the directory is marked in the tree as “(Permission Denied)”.
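The containment check behind the ValueError can be sketched as follows. This is a minimal illustration using `pathlib`, not the module's actual code: the helper names `resolve_focus_items` and `with_focus_suffix` and the error message are hypothetical.

```python
from pathlib import Path
from typing import Dict


def resolve_focus_items(root_path: str, focus_items: Dict[str, str]) -> Dict[str, str]:
    """Map each focus path (POSIX style, relative to root) to its label."""
    root = Path(root_path).resolve()
    resolved: Dict[str, str] = {}
    for label, item in focus_items.items():
        candidate = Path(item)
        if not candidate.is_absolute():
            # Relative focus paths are interpreted relative to the root.
            candidate = root / candidate
        candidate = candidate.resolve()
        try:
            relative = candidate.relative_to(root)
        except ValueError:
            # relative_to() fails when candidate lies outside root.
            raise ValueError(
                f"Focus item {item!r} is not within root path {str(root)!r}"
            ) from None
        resolved[relative.as_posix()] = label
    return resolved


def with_focus_suffix(name: str, label: str) -> str:
    """Append the focus marker shown in the examples of this section."""
    return f"{name} <-- ({label})"
```

Resolving both paths before comparing them means that `..` segments cannot be used to slip outside the root in this sketch.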
Note
Empty directories (after filtering) are excluded from the tree structure. Only directories containing at least one non-ignored file are included.
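The pruning rule in this note can be sketched with a small recursive builder. This is an illustration under assumptions, not the module's implementation: `build_tree`, `demo`, and the `is_ignored` callback are hypothetical names, entries are sorted alphabetically (the real ordering is not stated here), and permission handling is left out.

```python
import os
import tempfile
from typing import Callable, List, Tuple

Node = Tuple[str, list]


def build_tree(dir_path: str, is_ignored: Callable[[str], bool]) -> List[Node]:
    """Return (name, children) nodes, skipping ignored and empty entries."""
    nodes: List[Node] = []
    for name in sorted(os.listdir(dir_path)):
        full = os.path.join(dir_path, name)
        if is_ignored(full):
            continue
        if os.path.isdir(full):
            children = build_tree(full, is_ignored)
            if children:  # drop directories left empty after filtering
                nodes.append((name, children))
        else:
            nodes.append((name, []))
    return nodes


def demo() -> List[Node]:
    """Build a throwaway project and return its filtered tree."""
    with tempfile.TemporaryDirectory() as tmp:
        for sub in ("src", "empty", "cache"):
            os.makedirs(os.path.join(tmp, sub))
        open(os.path.join(tmp, "src", "main.py"), "w").close()
        open(os.path.join(tmp, "cache", "mod.pyc"), "w").close()
        # Ignore compiled files; 'cache' then becomes empty and is pruned.
        return build_tree(tmp, lambda p: p.endswith(".pyc"))
```

Here `demo()` keeps only `src/main.py`: the `empty` directory has no files, and `cache` contains only an ignored file.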
Warning
Large directory structures may take significant time to traverse. Consider using extra_patterns to filter out large directories early in the traversal.
Example:
>>> root, tree = build_python_project_tree('/path/to/project')
>>> print(root)
project
>>> print(tree)
[('src', [('main.py', []), ('utils.py', [])]), ('tests', [('test_main.py', [])])]
>>>
>>> # With focus items to highlight specific files
>>> root, tree = build_python_project_tree(
...     '/path/to/project',
...     focus_items={'entry': 'src/main.py', 'config': 'config.yaml'}
... )
>>> print(tree)
[('src', [('main.py <-- (entry)', []), ('utils.py', [])]), ('tests', [('test_main.py', [])]), ('config.yaml <-- (config)', [])]
>>>
>>> # With extra ignore patterns
>>> root, tree = build_python_project_tree(
...     '/path/to/project',
...     extra_patterns=['*.md', 'docs/']
... )
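For further processing, the returned (name, children) nodes can be walked recursively. The sketch below flattens a tree into relative file paths. The helper `iter_paths` is hypothetical, and the input is the literal tree from the example above rather than the result of a real call.

```python
from typing import Iterator, List, Tuple

Node = Tuple[str, list]


def iter_paths(nodes: List[Node], prefix: str = "") -> Iterator[str]:
    """Yield a POSIX-style relative path for every leaf node."""
    for name, children in nodes:
        path = f"{prefix}/{name}" if prefix else name
        if children:
            # A node with children is a directory; descend into it.
            yield from iter_paths(children, path)
        else:
            # Empty directories are excluded from the tree, so a node
            # without children is treated as a file here.
            yield path


tree = [
    ("src", [("main.py", []), ("utils.py", [])]),
    ("tests", [("test_main.py", [])]),
]
print(list(iter_paths(tree)))
# prints ['src/main.py', 'src/utils.py', 'tests/test_main.py']
```

If focus items are used, the focus suffix is part of the node name and would appear in the yielded paths.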
get_python_project_tree_text
- hbllmutils.meta.code.tree.get_python_project_tree_text(root_path: str, extra_patterns: List[str] | None = None, focus_items: dict | None = None, encoding: str | None = None) -> str
Generate a formatted text representation of a Python project’s directory tree.
This function builds a directory tree structure for a Python project and formats it as a text string with tree-like visual formatting using box-drawing characters (UTF-8) or ASCII characters depending on the encoding. It respects Python gitignore patterns and can optionally highlight specific files or directories with focus labels.
- Parameters:
root_path (str) – The root directory path to start building the tree from. Can be absolute or relative to the current working directory.
extra_patterns (Optional[List[str]]) – Optional list of additional patterns to ignore beyond the default Python gitignore patterns. Patterns follow gitignore syntax.
focus_items (Optional[dict]) – Optional dictionary mapping focus labels to file/directory paths that should be highlighted. The paths must be within the root_path or its subdirectories. Focus items are marked with a “ <-- (label)” suffix.
encoding (Optional[str]) – Encoding used to choose the tree-drawing characters. Defaults to None, which means the system encoding. With an ASCII encoding, plain ASCII characters are used instead of Unicode box-drawing characters for wider compatibility.
- Returns:
A formatted string representation of the directory tree with visual tree structure using box-drawing characters (├──, └──, │) or ASCII equivalents.
- Return type:
str
- Raises:
ValueError – If a focus item path is not within the root path or its subdirectories.
Note
The function automatically selects appropriate characters based on encoding:
- UTF-8 encoding uses Unicode box-drawing characters for better visual appearance
- ASCII encoding uses simple ASCII characters (+, |, -) for maximum compatibility
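The character selection described above can be sketched as a small renderer over (name, children) nodes. This is an assumption-laden illustration, not the module's implementation: `CHARSETS` and `render_tree` are hypothetical names, the character sets are inferred from the example output in this section, and testing whether the box-drawing characters can be encoded is only one possible way to choose between them.

```python
from typing import Dict, List, Tuple

Node = Tuple[str, list]

# Assumed character sets, inferred from the example output in this section.
CHARSETS: Dict[str, Dict[str, str]] = {
    "unicode": {"tee": "├── ", "last": "└── ", "pipe": "│   ", "blank": "    "},
    "ascii": {"tee": "+-- ", "last": "+-- ", "pipe": "|   ", "blank": "    "},
}


def render_tree(root: str, nodes: List[Node], encoding: str = "utf-8") -> str:
    """Format a tree as text, falling back to ASCII when needed."""
    try:
        # If the box-drawing characters cannot be encoded, use ASCII.
        "├└│".encode(encoding)
        chars = CHARSETS["unicode"]
    except (UnicodeEncodeError, LookupError):
        chars = CHARSETS["ascii"]

    lines = [root]

    def walk(children: List[Node], indent: str) -> None:
        for index, (name, sub) in enumerate(children):
            is_last = index == len(children) - 1
            branch = chars["last"] if is_last else chars["tee"]
            lines.append(indent + branch + name)
            # Under the last entry there is no vertical bar to continue.
            walk(sub, indent + (chars["blank"] if is_last else chars["pipe"]))

    walk(nodes, "")
    return "\n".join(lines)
```

Calling `render_tree("project", tree, "ascii")` on the example tree yields the `+--` and `|` layout, while the default uses `├──`, `└──`, and `│`.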
Example:
>>> print(get_python_project_tree_text('/path/to/project'))
project
├── src
│   ├── main.py
│   └── utils.py
└── tests
    └── test_main.py
>>>
>>> # With focus items to highlight specific files
>>> print(get_python_project_tree_text(
...     '/path/to/project',
...     focus_items={'entry': 'src/main.py', 'test': 'tests/test_main.py'}
... ))
project
├── src
│   ├── main.py <-- (entry)
│   └── utils.py
└── tests
    └── test_main.py <-- (test)
>>>
>>> # With ASCII encoding for compatibility
>>> print(get_python_project_tree_text('/path/to/project', encoding='ASCII'))
project
+-- src
|   +-- main.py
|   +-- utils.py
+-- tests
    +-- test_main.py
>>>
>>> # With extra ignore patterns
>>> print(get_python_project_tree_text(
...     '/path/to/project',
...     extra_patterns=['*.md', 'docs/', 'examples/']
... ))