Performance Benchmarks

tdom-path is optimized for real-world usage, particularly Static Site Generation (SSG) workflows where components are reused across multiple pages. The library caches module loading with an LRU cache, which gives a 17.9x speedup on cached accesses.

Quick Benchmark

# Run standalone performance benchmark
just benchmark

# Run pytest-based performance tests
just test -m slow

Real-World Performance Results

Based on benchmarks simulating typical SSG workflows (120+ component tree, multiple pages):

Operation	Cold Cache	Warm Cache	Speedup
`make_traversable()` - module access	25.8μs	1.4μs	17.9x faster
`make_traversable()` - package path	~25μs	1.3μs	19x faster
`make_path_nodes()` - tree transform	758μs	758μs	(no change)
`render_path_nodes()` - per page	684μs	684μs	(no change)

Cache Impact: 1688% faster (17.9x) with warm cache ✓ EXCELLENT

Why This Matters

SSG Scenario: Building 100 pages with the same component:

Without cache: 100 × 25μs = 2,500μs = 2.5ms
With cache: 1 × 25μs + 99 × 1.4μs = 164μs = 0.16ms
Savings: about 93% less time (2.34ms saved)

For sites with 1000+ pages, the absolute savings grow roughly in proportion to the page count.
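
The arithmetic above generalizes to any page count. A minimal sketch, assuming the rounded timings quoted in this section (25μs cold, 1.4μs warm) and a single shared component; the function name `build_cost_us` is illustrative and not part of tdom-path:

```python
COLD_US = 25.0  # first access, cold cache (rounded figure quoted above)
WARM_US = 1.4   # later accesses, warm cache (rounded figure quoted above)


def build_cost_us(pages: int, cached: bool) -> float:
    """Total module-access time in microseconds for pages sharing one component."""
    if pages <= 0:
        return 0.0
    if not cached:
        # Every page pays the full cold-cache cost.
        return pages * COLD_US
    # One cold access, then every remaining page hits the cache.
    return COLD_US + (pages - 1) * WARM_US


for pages in (100, 1000):
    without = build_cost_us(pages, cached=False)
    with_cache = build_cost_us(pages, cached=True)
    saved = 1 - with_cache / without
    print(f"{pages} pages: {without:.0f}μs vs {with_cache:.1f}μs ({saved:.1%} less time)")
```

With these rounded inputs, the saving works out to roughly 93% at 100 pages and 94% at 1000 pages.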

Performance Characteristics

Excellent:

Path resolution (cached): 1.4μs ✓ EXCELLENT
Module loading optimization: 17.9x speedup ✓ EXCELLENT

Good:

Tree traversal: ~450μs for 120+ components ✓ GOOD
Multi-page rendering: 684μs/page ✓ GOOD

How the Cache Works

The library uses @lru_cache(maxsize=128) for module loading via importlib.resources.files():

from functools import lru_cache
from importlib.resources import files
from tdom_path.webpath import Traversable

@lru_cache(maxsize=128)
def _get_module_files(module_name: str) -> Traversable:
    """Cache Traversable roots to avoid repeated module loading."""
    return files(module_name)

First access (cold cache):

Loads module metadata: ~20μs
Sets up resource reader: ~5μs
Total: ~25μs

Subsequent accesses (warm cache):

Dictionary lookup: ~1.4μs
Total: ~1.4μs

Cache benefits:

Negligible overhead on first use (one extra cache insert)
Massive speedup on repeated use
Automatic cleanup (LRU eviction)
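Thread-safe (functools.lru_cache keeps its internal state consistent under concurrent calls, though two threads that miss at the same time may both run the loader)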
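
The cold and warm paths described above can be observed with the standard library alone. This is a sketch, not tdom-path's code: `cached_files` is a hypothetical stand-in for `_get_module_files`, the stdlib `json` package stands in for a component package, and absolute timings will vary by machine.

```python
import time
from functools import lru_cache
from importlib.resources import files


@lru_cache(maxsize=128)
def cached_files(module_name: str):
    """Hypothetical stand-in for _get_module_files: cache the Traversable root."""
    return files(module_name)


def time_call_us(module_name: str) -> float:
    """Time a single call to cached_files, in microseconds."""
    start = time.perf_counter()
    cached_files(module_name)
    return (time.perf_counter() - start) * 1e6


cached_files.cache_clear()      # start from a cold cache
cold_us = time_call_us("json")  # miss: runs importlib.resources.files()
warm_us = time_call_us("json")  # hit: served from the LRU cache

print(f"cold: {cold_us:.1f}μs, warm: {warm_us:.1f}μs")
print(cached_files.cache_info())
```

A single timed call is noisy. A proper measurement repeats the call many times, which is what pytest-benchmark does.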

Running Benchmarks

Standalone Benchmark

just benchmark

This runs a comprehensive benchmark suite that measures:

Cold vs warm cache performance
SSG workflow simulation (multi-page rendering)
Clear performance analysis with thresholds
Real-world usage patterns

Pytest-Based Tests

# Run performance tests
just test -m slow

# Run with free-threaded Python (regression detection)
just test-freethreaded -m slow

# Run with parallel execution (8 threads, 10 iterations)
just test-freethreaded -m slow --threads=8 --iterations=10

Benchmark Infrastructure

The test infrastructure uses:

pytest-benchmark for standardized timing
tracemalloc for memory profiling
Realistic test data (trees of 100+ components)
Free-threaded Python compatibility
Baseline metrics documented in tests

Free-threaded Python Testing

The library is tested with Python’s free-threaded mode (GIL-less Python) to ensure:

No threading regressions
Thread-safe cache operations
Consistent performance across Python versions

just test-freethreaded -m slow
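
One way to exercise cache thread safety is to call a cached loader from several threads at once and check that every caller gets the same root. The sketch below assumes a hypothetical `cached_files` stand-in for `_get_module_files` and uses the stdlib `json` package as the target; it is not the project's test suite.

```python
import sys
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
from importlib.resources import files


@lru_cache(maxsize=128)
def cached_files(module_name: str):
    """Hypothetical stand-in for _get_module_files."""
    return files(module_name)


def hammer(threads: int = 8, iterations: int = 10) -> list[str]:
    """Call cached_files concurrently and return each result as a string."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [
            pool.submit(cached_files, "json")
            for _ in range(threads * iterations)
        ]
        return [str(f.result()) for f in futures]


cached_files.cache_clear()
results = hammer()

# sys._is_gil_enabled() exists on Python 3.13+; assume the GIL is on otherwise.
gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()
print(f"GIL enabled: {gil_enabled}")
print(f"distinct results: {len(set(results))} from {len(results)} calls")
print(cached_files.cache_info())
```

Several threads may miss before the first result is stored, so more than one miss is normal. What matters is that every caller sees an equivalent result.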

Performance Optimization Tips

Reuse components - Same component across pages = cache hits
Build incrementally - Keep Python process alive between builds
Use package paths - Already optimized with cache
Profile your workflow - Use just benchmark to measure your patterns
Monitor cache - Check _get_module_files.cache_info() for hit/miss ratio

The library is designed for the common case: building multiple pages with shared components. The LRU cache ensures this workflow is extremely fast.

Profiling Tools

The library includes standalone profiling tools for performance analysis:

# Run comprehensive benchmark suite
just benchmark

# Profile specific operations
uv run python -m tdom_path.profiling.benchmark

Benchmark features:

Cold vs warm cache comparison
SSG workflow simulation (multi-page rendering)
Clear performance analysis with thresholds
Real-world usage patterns

Optimization Details

What was optimized:

Module loading via importlib.resources.files() (80% of transformation time)
Added LRU cache for Traversable module roots
One-line change at call sites

What wasn’t optimized (and why):

Tree traversal - already efficient (~2μs per node)
Path calculations - necessary operations
isinstance() checks - highly optimized in CPython

Memory Usage

LRU cache: up to 128 entries × ~1KB = ~128KB max
Per operation: Minimal overhead (~10-50KB)
Tree operations: Linear with tree size (~1-5MB for 100+ components)
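
Figures like these can be checked for a given workload with `tracemalloc`, the same module the benchmark infrastructure uses for memory profiling. A minimal sketch, assuming a hypothetical `build_tree` helper that stands in for a real component tree; it does not use tdom-path itself.

```python
import tracemalloc


def build_tree(components: int) -> list[dict]:
    """Hypothetical stand-in for a component tree: one dict per component."""
    return [
        {"tag": "link", "attrs": {"href": f"static/style-{i}.css"}, "children": []}
        for i in range(components)
    ]


def peak_memory_kb(components: int) -> float:
    """Return the peak traced allocation, in KB, while building the tree."""
    tracemalloc.start()
    try:
        build_tree(components)
        # get_traced_memory() returns (current, peak) in bytes.
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / 1024


for n in (120, 1200):
    print(f"{n} components: {peak_memory_kb(n):.1f} KB peak")
```

Comparing two tree sizes gives a quick check of the linear growth described above.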

When to Expect Peak Performance

Best case (warm cache):

SSG workflows (reusing components)
Long-running servers (modules stay loaded)
Component libraries (shared across pages)
Development with hot reload (cache persists)

First-time use (cold cache):

Initial page build
Fresh Python process
New module references
Still fast (25μs), just not cached

Monitoring Cache Performance

You can monitor cache statistics to understand hit rates:

from tdom_path.webpath import _get_module_files

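# Check cache info
info = _get_module_files.cache_info()
total = info.hits + info.misses
print(f"Cache hits: {info.hits}")
print(f"Cache misses: {info.misses}")
# Guard against division by zero before the first call
hit_rate = info.hits / total if total else 0.0
print(f"Hit rate: {hit_rate:.1%}")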

Performance Thresholds

The library targets these performance thresholds:

Operation	Target	Status
Path resolution (warm)	< 2μs	✅ 1.4μs
Tree transformation	< 1ms	✅ 758μs
Page rendering	< 1ms	✅ 684μs
Memory overhead	< 1MB	✅ 128KB
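
Targets like these can be turned into a simple regression check. The sketch below hard-codes the targets and measured values from the table; the `THRESHOLDS` and `MEASURED` mappings and the `check_thresholds` function are illustrative names, not part of tdom-path.

```python
# Targets from the table above (times in microseconds, memory in KB).
THRESHOLDS = {
    "path_resolution_warm_us": 2.0,
    "tree_transformation_us": 1000.0,
    "page_rendering_us": 1000.0,
    "memory_overhead_kb": 1024.0,
}

# Measured values reported in the table above.
MEASURED = {
    "path_resolution_warm_us": 1.4,
    "tree_transformation_us": 758.0,
    "page_rendering_us": 684.0,
    "memory_overhead_kb": 128.0,
}


def check_thresholds(measured: dict[str, float]) -> list[str]:
    """Return the names of metrics that meet or exceed their target limit."""
    return [
        name
        for name, limit in THRESHOLDS.items()
        if measured[name] >= limit
    ]


failures = check_thresholds(MEASURED)
print("all targets met" if not failures else f"over target: {failures}")
```

In a real suite the measured values would come from the benchmark run instead of constants.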

Conclusion

tdom-path is optimized for the common SSG use case: building multiple pages with shared components. The LRU cache provides massive speedups for repeated operations, making it ideal for:

Static site generators
Component libraries
Reusable web components
Framework-agnostic asset management

The library achieves a 17.9x speedup with a warm cache while keeping its API simple.