Memory Usage Benchmarks
Benchmarks for measuring memory consumption of the Horus stack components.
Overview
The memory benchmarks measure memory usage, observed as changes in the process's resident set size (RSS), for various operations:
- Job creation and storage
- Client instantiation
- Payload size impact
- Memory growth under load
Benchmarks
1. memory_job_creation
Measures memory usage when creating multiple Job objects in memory.
Test sizes: 10, 50, 100, 200 jobs
What it measures:
- Memory allocated per job object
- Heap growth with increasing job count
- Memory efficiency of Job structure
Expected results:
- Linear memory growth with job count
- ~1-2 KB per job object (depending on payload)
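One way to check the linear-growth expectation is to estimate the marginal cost per job from two measurements, which cancels any fixed overhead. The following is a minimal sketch, not part of the benchmark code; `per_item_kb` is a hypothetical helper name. With the sample figures from the Example Output section (24 KB for 10 jobs, 187 KB for 100 jobs) it gives roughly 1.8 KB per job, which sits inside the expected 1-2 KB range.

```rust
/// Estimated marginal memory cost per item in KB, computed as the slope
/// between two (item_count, delta_kb) measurements.
/// Returns None when the second measurement does not use more items.
fn per_item_kb(small: (u32, f64), large: (u32, f64)) -> Option<f64> {
    let (n_small, kb_small) = small;
    let (n_large, kb_large) = large;
    if n_large <= n_small {
        return None;
    }
    Some((kb_large - kb_small) / f64::from(n_large - n_small))
}
```

For example, `per_item_kb((10, 24.0), (100, 187.0))` evaluates to `Some(163.0 / 90.0)`.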
2. memory_client_creation
Measures memory overhead of creating multiple Supervisor client instances.
Test sizes: 1, 10, 50, 100 clients
What it measures:
- Memory per client instance
- Connection pool overhead
- HTTP client memory footprint
Expected results:
- ~10-50 KB per client instance
- Includes HTTP client, connection pools, and buffers
3. memory_payload_sizes
Measures memory usage with different payload sizes.
Test sizes: 1KB, 10KB, 100KB, 1MB payloads
What it measures:
- Memory overhead of JSON serialization
- String allocation costs
- Payload storage efficiency
Expected results:
- Memory usage should scale linearly with payload size
- Small overhead for JSON structure (~5-10%)
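The overhead figure can be computed directly from a payload size and its measured delta. This is a small illustrative sketch; `overhead_percent` is a hypothetical helper, not a function from the benchmark suite. For the 100KB payload sample shown under Interpreting Results (105 KB delta), the overhead works out to 5%.

```rust
/// Relative overhead of a measured memory delta over the raw payload size,
/// expressed in percent. Both arguments are in KB.
fn overhead_percent(payload_kb: f64, delta_kb: f64) -> f64 {
    (delta_kb - payload_kb) / payload_kb * 100.0
}
```

Very small payloads (such as 1KB) tend to give misleading percentages because the RSS granularity is of the same order as the payload itself.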
Running Memory Benchmarks
# Run all memory benchmarks
cargo bench --bench memory_usage
# Run specific memory test
cargo bench --bench memory_usage -- memory_job_creation
# Run with verbose output to see memory deltas
cargo bench --bench memory_usage -- --verbose
Interpreting Results
The benchmarks print memory deltas to stderr during execution:
Memory delta for 100 jobs: 156 KB
Memory delta for 50 clients: 2048 KB
Memory delta for 100KB payload: 105 KB
Memory Delta Interpretation
- Positive delta: Memory was allocated during the operation
- Zero delta: No significant memory change (may be reusing existing allocations)
- Negative delta: Memory was freed and returned to the OS (deallocations; Rust has no garbage collector)
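The three cases above can be sketched as follows. This is an assumption about how a delta might be computed, not the benchmark's actual code; `delta_kb`, `classify`, and `DeltaKind` are hypothetical names. The point of the sketch is that the subtraction has to be signed, because an unsigned subtraction would overflow when memory shrinks.

```rust
use std::cmp::Ordering;

#[derive(Debug, PartialEq)]
enum DeltaKind {
    Grew,
    Unchanged,
    Shrank,
}

/// Signed difference between two RSS readings in KB.
/// Realistic RSS values fit comfortably in i64, so the casts are lossless here.
fn delta_kb(before_kb: u64, after_kb: u64) -> i64 {
    after_kb as i64 - before_kb as i64
}

/// Maps a signed delta to one of the three interpretation cases.
fn classify(delta_kb: i64) -> DeltaKind {
    match delta_kb.cmp(&0) {
        Ordering::Greater => DeltaKind::Grew,
        Ordering::Equal => DeltaKind::Unchanged,
        Ordering::Less => DeltaKind::Shrank,
    }
}
```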
Platform Differences
macOS: Uses ps command to read RSS (Resident Set Size)
Linux: Reads /proc/self/status for VmRSS
RSS includes:
- Heap allocations
- Stack memory
- Shared libraries (mapped into process)
- Memory-mapped files
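A minimal sketch of how RSS could be read on both platforms, using only the standard library. The function names `parse_vm_rss_kb` and `current_rss_kb` are assumptions for illustration and may differ from what the benchmark code actually uses. The macOS branch relies on `ps -o rss= -p <pid>`, which prints the RSS in kilobytes without a header line.

```rust
use std::fs;
use std::process::Command;

/// Extracts the VmRSS value (in KB) from the text of /proc/self/status.
fn parse_vm_rss_kb(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("VmRSS:"))
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|number| number.parse().ok())
}

/// Reads the current process RSS in KB.
/// Returns None on unsupported platforms or if the reading fails.
fn current_rss_kb() -> Option<u64> {
    if cfg!(target_os = "linux") {
        let status = fs::read_to_string("/proc/self/status").ok()?;
        parse_vm_rss_kb(&status)
    } else if cfg!(target_os = "macos") {
        let output = Command::new("ps")
            .arg("-o")
            .arg("rss=")
            .arg("-p")
            .arg(std::process::id().to_string())
            .output()
            .ok()?;
        String::from_utf8(output.stdout).ok()?.trim().parse().ok()
    } else {
        None
    }
}
```

Taking one reading before an operation and one after it yields the kind of delta reported by the benchmarks.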
Limitations
- Granularity: OS-level memory reporting may not capture small allocations
- Timing: Memory measurements happen before/after operations, not continuously
- Allocator effects: Rust's allocator may not immediately release freed memory to the OS, so RSS can stay high after objects are dropped
- Shared memory: RSS includes shared library memory
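Where RSS granularity is too coarse, one possible alternative is to count bytes at the allocator level. The sketch below wraps the system allocator with a counter; it is not what these benchmarks do, and `CountingAlloc` and `bytes_allocated_by` are hypothetical names. It tracks cumulative bytes requested (it does not subtract frees), so it answers "how much was allocated" rather than "how much is still live".

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Cumulative bytes requested from the allocator since process start.
static ALLOCATED_BYTES: AtomicUsize = AtomicUsize::new(0);

/// Forwards every request to the system allocator and counts requested bytes.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATED_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns its result together with the bytes allocated meanwhile.
/// Allocations made by other threads during the call are counted as well.
fn bytes_allocated_by<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = ALLOCATED_BYTES.load(Ordering::Relaxed);
    let value = f();
    let after = ALLOCATED_BYTES.load(Ordering::Relaxed);
    (value, after - before)
}
```

Unlike RSS, this approach sees even small allocations, but it ignores stack memory, shared libraries, and memory-mapped files.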
Best Practices
For Accurate Measurements
- Run multiple iterations: Criterion handles this automatically
- Warm up: First iterations may show higher memory due to lazy initialization
- Isolate tests: Run memory benchmarks separately from performance benchmarks
- Monitor trends: Compare results across runs over time rather than relying on absolute values
Memory Optimization Tips
If benchmarks show high memory usage:
- Check payload sizes: Large payloads consume proportional memory
- Limit concurrent operations: Too many simultaneous jobs/clients increase memory
- Review data structures: Ensure efficient serialization
- Profile with tools: Use heaptrack (Linux) or instruments (macOS) for detailed analysis
Advanced Profiling
For detailed memory profiling beyond these benchmarks:
macOS
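# Use Instruments (the standalone instruments CLI is deprecated since Xcode 12)
instruments -t Allocations -D memory_trace.trace ./target/release/horus
# On newer Xcode versions, the equivalent xctrace invocation is:
xcrun xctrace record --template Allocations --output memory_trace.trace --launch -- ./target/release/horus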
# Use heap profiler
cargo install cargo-instruments
cargo instruments --bench memory_usage --template Allocations
Linux
# Use Valgrind massif
valgrind --tool=massif --massif-out-file=massif.out \
./target/release/deps/memory_usage-*
# Visualize with massif-visualizer
massif-visualizer massif.out
# Use heaptrack
heaptrack ./target/release/deps/memory_usage-*
heaptrack_gui heaptrack.memory_usage.*.gz
Cross-platform
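# Use dhat (heap profiler); it is a library crate, so add it as a dependency
# rather than installing it with cargo install
cargo add --dev dhat
# Instrument your benchmark with dhat, then run
# (assumes a dhat-heap feature is defined in Cargo.toml)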
cargo bench --bench memory_usage --features dhat-heap
Continuous Monitoring
Integrate memory benchmarks into CI/CD:
# Run and save baseline
cargo bench --bench memory_usage -- --save-baseline memory-main
# Compare in PR
cargo bench --bench memory_usage -- --baseline memory-main
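# Fail if memory usage increases >10%
# (requires custom scripting: Criterion baselines compare timings, so the
# "Memory delta" lines printed to stderr must be captured and compared separately)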
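The custom check could look roughly like the sketch below. It assumes the stderr of a baseline run and of the current run have each been captured as text, and that the lines follow the "Memory delta for <label>: <N> KB" format shown under Interpreting Results. `parse_deltas` and `regressions` are hypothetical helpers, not part of Criterion or of this repository.

```rust
use std::collections::HashMap;

/// Parses lines like "Memory delta for 100 jobs: 156 KB" into label -> KB.
/// Lines that do not match the format are ignored.
fn parse_deltas(output: &str) -> HashMap<String, i64> {
    output
        .lines()
        .filter_map(|line| {
            let rest = line.trim().strip_prefix("Memory delta for ")?;
            let (label, value) = rest.rsplit_once(": ")?;
            let kb: i64 = value.strip_suffix(" KB")?.trim().parse().ok()?;
            Some((label.to_string(), kb))
        })
        .collect()
}

/// Returns the sorted labels whose delta grew by more than `max_increase_pct`
/// percent relative to the baseline. Labels missing from the baseline, or with
/// a non-positive baseline, are skipped.
fn regressions(
    baseline: &HashMap<String, i64>,
    current: &HashMap<String, i64>,
    max_increase_pct: f64,
) -> Vec<String> {
    let mut out: Vec<String> = current
        .iter()
        .filter_map(|(label, &now)| {
            let &base = baseline.get(label)?;
            let increase_pct = (now - base) as f64 / base as f64 * 100.0;
            if base > 0 && increase_pct > max_increase_pct {
                Some(label.clone())
            } else {
                None
            }
        })
        .collect();
    out.sort();
    out
}
```

A CI step would call `regressions(&baseline, &current, 10.0)` and exit with a non-zero status when the returned list is not empty.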
Troubleshooting
"Memory delta is always 0"
- OS may not update RSS immediately
- Allocations might be too small to measure
- Try increasing iteration count or operation size
"Memory keeps growing"
- Check for memory leaks
- Verify objects are being dropped
- Use cargo clippy to find potential issues
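To verify that objects are being dropped, one option is to count live instances with a `Drop` implementation. The sketch below uses a hypothetical `Tracked` type as a stand-in for a real Job or client; it is an illustration of the technique, not code from the Horus stack.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of Tracked values currently alive.
static LIVE: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for an object whose lifetime should be verified.
struct Tracked {
    _payload: Vec<u8>,
}

impl Tracked {
    fn new(size: usize) -> Self {
        LIVE.fetch_add(1, Ordering::SeqCst);
        Tracked { _payload: vec![0; size] }
    }
}

impl Drop for Tracked {
    fn drop(&mut self) {
        LIVE.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Current number of live Tracked values.
fn live_count() -> usize {
    LIVE.load(Ordering::SeqCst)
}
```

If `live_count()` does not return to its starting value after a benchmark iteration, something is still holding on to the objects (for example a long-lived collection or a reference cycle).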
"Results are inconsistent"
- Other processes may be affecting measurements
- Run benchmarks on idle system
- Increase sample size in benchmark code
Example Output
memory_job_creation/10 time: [45.2 µs 46.1 µs 47.3 µs]
Memory delta for 10 jobs: 24 KB
memory_job_creation/50 time: [198.4 µs 201.2 µs 204.8 µs]
Memory delta for 50 jobs: 98 KB
memory_job_creation/100 time: [387.6 µs 392.1 µs 397.4 µs]
Memory delta for 100 jobs: 187 KB
memory_client_creation/1 time: [234.5 µs 238.2 µs 242.6 µs]
Memory delta for 1 clients: 45 KB
memory_payload_sizes/1KB time: [12.3 µs 12.6 µs 13.0 µs]
Memory delta for 1KB payload: 2 KB
memory_payload_sizes/100KB time: [156.7 µs 159.4 µs 162.8 µs]
Memory delta for 100KB payload: 105 KB