Performance Benchmarks¶
tdom-path is optimized for real-world usage, particularly Static Site Generation (SSG) workflows where components are reused across multiple pages. The library uses LRU caching for module loading, which gives a 17.9x speedup for cached accesses in the benchmarks below.
Quick Benchmark¶
# Run standalone performance benchmark
just benchmark
# Run pytest-based performance tests
just test -m slow
Real-World Performance Results¶
Based on benchmarks simulating typical SSG workflows (120+ component tree, multiple pages):
| Operation | Cold Cache | Warm Cache | Speedup |
|---|---|---|---|
| | 25.8μs | 1.4μs | 17.9x faster |
| | ~25μs | 1.3μs | 19x faster |
| | 758μs | 758μs | (no change) |
| | 684μs | 684μs | (no change) |
Cache Impact: 1688% faster (17.9x) with warm cache ✓ EXCELLENT
Why This Matters¶
SSG Scenario: Building 100 pages with the same component:
Without cache: 100 × 25μs = 2,500μs = 2.5ms
With cache: 1 × 25μs + 99 × 1.4μs = 164μs = 0.16ms
Savings: about 93% less time (2.34ms saved)
For sites with 1000+ pages, the absolute savings grow linearly: by the same arithmetic, 1000 pages cost about 25ms without the cache and about 1.4ms with it.
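The arithmetic above generalizes to any page count. The following is a minimal sketch, assuming the per-access costs quoted in this section (25μs cold, 1.4μs warm) and a single shared component per page. The name `build_cost_us` is illustrative and not part of tdom-path.

```python
COLD_US = 25.0  # first access (cold cache), figure quoted in this section
WARM_US = 1.4   # later accesses (warm cache), figure quoted in this section


def build_cost_us(pages: int, cached: bool) -> float:
    """Estimate total path-resolution time in microseconds for one shared component."""
    if pages <= 0:
        return 0.0
    if not cached:
        # Every page pays the cold-cache cost.
        return pages * COLD_US
    # One cold access fills the cache; every later page is a warm hit.
    return COLD_US + (pages - 1) * WARM_US


for pages in (100, 1000):
    uncached = build_cost_us(pages, cached=False)
    cached = build_cost_us(pages, cached=True)
    saved_pct = 100 * (uncached - cached) / uncached
    print(f"{pages} pages: {uncached:.0f}us uncached, {cached:.1f}us cached, "
          f"{saved_pct:.1f}% less time")
```

The relative saving approaches a ceiling set by the warm/cold ratio, so larger sites mostly gain in absolute time.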
Performance Characteristics¶
Excellent:
Path resolution (cached): 1.4μs ✓ EXCELLENT
Module loading optimization: 17.9x speedup ✓ EXCELLENT
Good:
Tree traversal: ~450μs for 120+ components ✓ GOOD
Multi-page rendering: 684μs/page ✓ GOOD
How the Cache Works¶
The library uses @lru_cache(maxsize=128) for module loading via importlib.resources.files():
from functools import lru_cache
from importlib.resources import files
from tdom_path.webpath import Traversable

@lru_cache(maxsize=128)
def _get_module_files(module_name: str) -> Traversable:
    """Cache Traversable roots to avoid repeated module loading."""
    return files(module_name)
First access (cold cache):
Loads module metadata: ~20μs
Sets up resource reader: ~5μs
Total: ~25μs
Subsequent accesses (warm cache):
Dictionary lookup: ~1.4μs
Total: ~1.4μs
Cache benefits:
Negligible overhead on first use (one extra cache lookup on a miss)
Massive speedup on repeated use
Automatic cleanup (LRU eviction)
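Thread-safe (functools.lru_cache keeps its internal state consistent under concurrent calls, although the wrapped function may run more than once for the same key if calls race before the first result is cached)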
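The cold/warm difference can be observed with the standard library alone. The sketch below mirrors the shape of the cached helper but is not the library's code: `cached_files` and `time_call_us` are illustrative names, and the stdlib `json` package stands in for a component package. Absolute timings depend on the machine and are not expected to match the figures quoted here.

```python
import time
from functools import lru_cache
from importlib.resources import files


@lru_cache(maxsize=128)
def cached_files(module_name: str):
    """Return the Traversable root for a package, cached per module name."""
    return files(module_name)


def time_call_us(module_name: str) -> float:
    """Time a single cached_files() call in microseconds."""
    start = time.perf_counter()
    cached_files(module_name)
    return (time.perf_counter() - start) * 1e6


cached_files.cache_clear()
cold_us = time_call_us("json")  # miss: actually runs files("json")
warm_us = time_call_us("json")  # hit: served from the cache
for _ in range(98):
    cached_files("json")        # 98 more hits

info = cached_files.cache_info()
print(f"cold: {cold_us:.1f}us, warm: {warm_us:.1f}us")
print(f"hits={info.hits} misses={info.misses} size={info.currsize}")
```

A single timed call is noisy. For stable numbers, repeat the measurement many times, for example with `timeit`.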
Running Benchmarks¶
Standalone Benchmark¶
just benchmark
This runs a comprehensive benchmark suite that measures:
Cold vs warm cache performance
SSG workflow simulation (multi-page rendering)
Clear performance analysis with thresholds
Real-world usage patterns
Pytest-Based Tests¶
# Run performance tests
just test -m slow
# Run with free-threaded Python (regression detection)
just test-freethreaded -m slow
# Run with parallel execution (8 threads, 10 iterations)
just test-freethreaded -m slow --threads=8 --iterations=10
Benchmark Infrastructure¶
The test infrastructure uses:
pytest-benchmark for standardized timing
tracemalloc for memory profiling
Realistic test data (100+ component trees)
Free-threaded Python compatibility
Baseline metrics documented in tests
Free-threaded Python Testing¶
The library is tested with Python’s free-threaded mode (GIL-less Python) to ensure:
No threading regressions
Thread-safe cache operations
Consistent performance across Python versions
just test-freethreaded -m slow
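To illustrate the kind of property such tests look for, here is a hedged sketch that calls a cached lookup from several threads and checks that every caller sees the same result. It is not the project's test suite: `cached_files` and `hammer_cache` are illustrative names, the stdlib `json` package is a stand-in, and the defaults of 8 threads and 10 iterations only echo the command shown earlier.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
from importlib.resources import files


@lru_cache(maxsize=128)
def cached_files(module_name: str):
    """Illustrative stand-in for a cached module-root lookup."""
    return files(module_name)


def hammer_cache(threads: int = 8, iterations: int = 10) -> list:
    """Call cached_files() from several threads and collect every result."""
    def worker(_):
        return [cached_files("json") for _ in range(iterations)]

    with ThreadPoolExecutor(max_workers=threads) as pool:
        batches = list(pool.map(worker, range(threads)))
    # Flatten the per-thread batches into one list of results.
    return [root for batch in batches for root in batch]


cached_files.cache_clear()
roots = hammer_cache()
# Every thread should see an equal root, and the cache should hold one entry.
assert all(root == roots[0] for root in roots)
print(len(roots), cached_files.cache_info())
```

The miss count may exceed one if threads race on the first call, which is why the check is on result equality and cache size rather than on an exact hit count.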
Performance Optimization Tips¶
Reuse components - Same component across pages = cache hits
Build incrementally - Keep Python process alive between builds
Use package paths - Already optimized with cache
Profile your workflow - Use just benchmark to measure your patterns
Monitor cache - Check _get_module_files.cache_info() for hit/miss ratio
The library is designed for the common case: building multiple pages with shared components. The LRU cache ensures this workflow is extremely fast.
Profiling Tools¶
The library includes standalone profiling tools for performance analysis:
# Run comprehensive benchmark suite
just benchmark
# Profile specific operations
uv run python -m tdom_path.profiling.benchmark
Optimization Details¶
What was optimized:
Module loading via importlib.resources.files() (80% of transformation time)
Added LRU cache for Traversable module roots
One-line change at call sites
What wasn’t optimized (and why):
Tree traversal - already efficient (~2μs per node)
Path calculations - necessary operations
isinstance() checks - highly optimized in CPython
Memory Usage¶
LRU cache: ~128 entries × ~1KB = ~128KB max
Per operation: Minimal overhead (~10-50KB)
Tree operations: Linear with tree size (~1-5MB for 100+ components)
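The cache bound can be checked empirically with `tracemalloc`. This is a rough sketch under stated assumptions: `cached_payload` is a hypothetical function whose 1KB return value only imitates the ~1KB-per-entry estimate above, and real Traversable objects will have a different footprint.

```python
import tracemalloc
from functools import lru_cache


@lru_cache(maxsize=128)
def cached_payload(key: str) -> bytes:
    """Hypothetical cached value of roughly 1KB per entry."""
    return bytes(1024)


tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()
for i in range(200):  # more keys than maxsize, so LRU eviction kicks in
    cached_payload(f"module_{i}")
after, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

info = cached_payload.cache_info()
print(f"entries: {info.currsize}/{info.maxsize}")
print(f"retained: {(after - before) / 1024:.0f}KB, peak: {peak / 1024:.0f}KB")
```

Because eviction caps the entry count at `maxsize`, retained memory stops growing once the cache is full, no matter how many distinct keys are requested.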
When to Expect Peak Performance¶
Best case (warm cache):
SSG workflows (reusing components)
Long-running servers (modules stay loaded)
Component libraries (shared across pages)
Development with hot reload (cache persists)
First-time use (cold cache):
Initial page build
Fresh Python process
New module references
Still fast (25μs), just not cached
Monitoring Cache Performance¶
You can monitor cache statistics to understand hit rates:
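from tdom_path.webpath import _get_module_files

# Check cache info (hits, misses, maxsize, currsize)
info = _get_module_files.cache_info()
total = info.hits + info.misses
print(f"Cache hits: {info.hits}")
print(f"Cache misses: {info.misses}")
# Guard against division by zero when the cache has not been used yet
print(f"Hit rate: {info.hits / total:.1%}" if total else "Hit rate: n/a")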
Performance Thresholds¶
The library targets these performance thresholds:
| Operation | Target | Status |
|---|---|---|
| Path resolution (warm) | < 2μs | ✅ 1.4μs |
| Tree transformation | < 1ms | ✅ 758μs |
| Page rendering | < 1ms | ✅ 684μs |
| Memory overhead | < 1MB | ✅ 128KB |
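Thresholds like these can be encoded as data and checked automatically. The sketch below is illustrative only: the dictionary keys and the `check_thresholds` helper are hypothetical names, and the numbers are copied from the table rather than measured.

```python
# (target, measured) pairs copied from the table; times in microseconds, memory in KB.
THRESHOLDS = {
    "path_resolution_warm_us": (2.0, 1.4),
    "tree_transformation_us": (1000.0, 758.0),
    "page_rendering_us": (1000.0, 684.0),
    "memory_overhead_kb": (1024.0, 128.0),
}


def check_thresholds(results: dict) -> list:
    """Return the names of metrics whose measured value is not below its target."""
    return [name for name, (target, measured) in results.items() if measured >= target]


failures = check_thresholds(THRESHOLDS)
print("all thresholds met" if not failures else f"over target: {failures}")
```

In a real regression check, the measured values would come from a benchmark run instead of constants.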
Conclusion¶
tdom-path is optimized for the common SSG use case: building multiple pages with shared components. The LRU cache provides massive speedups for repeated operations, making it ideal for:
Static site generators
Component libraries
Reusable web components
Framework-agnostic asset management
The library achieves a 17.9x speedup with a warm cache while keeping its API simple.