horus/benches/MEMORY_BENCHMARKS.md
2025-11-18 20:39:25 +01:00

Memory Usage Benchmarks

Benchmarks for measuring memory consumption of the Horus stack components.

Overview

The memory benchmarks measure heap memory usage for various operations:

  • Job creation and storage
  • Client instantiation
  • Payload size impact
  • Memory growth under load

Benchmarks

1. memory_job_creation

Measures memory usage when creating multiple Job objects in memory.

Test sizes: 10, 50, 100, 200 jobs

What it measures:

  • Memory allocated per job object
  • Heap growth with increasing job count
  • Memory efficiency of Job structure

Expected results:

  • Linear memory growth with job count
  • ~1-2 KB per job object (depending on payload)
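The pattern behind this benchmark can be sketched as below. `Job` here is a hypothetical stand-in for the real Horus type (its actual fields are not shown in this document), and the per-job estimate only counts the struct plus its string buffers:

```rust
use std::mem::size_of;

// Hypothetical stand-in for the real Horus Job type.
struct Job {
    id: u64,
    queue: String,
    payload: String,
}

// Create `n` jobs, as the benchmark body does between RSS samples.
fn create_jobs(n: usize, payload: &str) -> Vec<Job> {
    (0..n)
        .map(|i| Job {
            id: i as u64,
            queue: "default".to_string(),
            payload: payload.to_string(),
        })
        .collect()
}

// Rough lower bound on heap cost per job: the struct itself
// plus the heap buffers behind its two strings.
fn approx_job_bytes(job: &Job) -> usize {
    size_of::<Job>() + job.queue.len() + job.payload.len()
}

fn main() {
    let jobs = create_jobs(100, &"x".repeat(1024));
    let total: usize = jobs.iter().map(approx_job_bytes).sum();
    println!("~{} KB for {} jobs", total / 1024, jobs.len());
}
```

With a 1 KB payload this estimate lands in the same ~1-2 KB per job range quoted above; allocator bookkeeping and `Vec` growth push the measured RSS delta somewhat higher.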

2. memory_client_creation

Measures memory overhead of creating multiple Supervisor client instances.

Test sizes: 1, 10, 50, 100 clients

What it measures:

  • Memory per client instance
  • Connection pool overhead
  • HTTP client memory footprint

Expected results:

  • ~10-50 KB per client instance
  • Includes HTTP client, connection pools, and buffers

3. memory_payload_sizes

Measures memory usage with different payload sizes.

Test sizes: 1KB, 10KB, 100KB, 1MB payloads

What it measures:

  • Memory overhead of JSON serialization
  • String allocation costs
  • Payload storage efficiency

Expected results:

  • Memory usage should scale linearly with payload size
  • Small overhead for JSON structure (~5-10%)
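The linear-scaling expectation is easy to see with a toy envelope. The field names below are illustrative only (not the real Horus job schema); the fixed envelope contributes a constant number of bytes, while escaping of realistic payloads is what produces the percentage-level overhead quoted above:

```rust
// Hypothetical JSON envelope; field names are illustrative only.
fn wrap_payload(payload: &str) -> String {
    format!("{{\"job_id\":\"abc\",\"payload\":\"{}\"}}", payload)
}

fn main() {
    for kb in [1, 10, 100] {
        let payload = "x".repeat(kb * 1024);
        let doc = wrap_payload(&payload);
        let overhead = doc.len() - payload.len();
        println!(
            "{} KB payload -> {} bytes total, {} bytes envelope overhead",
            kb,
            doc.len(),
            overhead
        );
    }
}
```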

Running Memory Benchmarks

# Run all memory benchmarks
cargo bench --bench memory_usage

# Run specific memory test
cargo bench --bench memory_usage -- memory_job_creation

# Run with verbose output to see memory deltas
cargo bench --bench memory_usage -- --verbose

Interpreting Results

The benchmarks print memory deltas to stderr during execution:

Memory delta for 100 jobs: 156 KB
Memory delta for 50 clients: 2048 KB
Memory delta for 100KB payload: 105 KB

Memory Delta Interpretation

  • Positive delta: Memory was allocated during the operation
  • Zero delta: No significant memory change (may be reusing existing allocations)
  • Negative delta: Memory was freed (deallocations, allocator returning pages to the OS)
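The three cases above amount to the sign of a signed difference between two RSS samples; a minimal sketch:

```rust
// Signed delta between two RSS samples, in KB.
fn delta_kb(before_kb: u64, after_kb: u64) -> i64 {
    after_kb as i64 - before_kb as i64
}

// Map a delta onto the three cases described above.
fn interpret(delta: i64) -> &'static str {
    if delta > 0 {
        "allocated"
    } else if delta == 0 {
        "no significant change"
    } else {
        "freed"
    }
}

fn main() {
    // e.g. RSS grew from 1024 KB to 1180 KB during the operation
    println!("{}", interpret(delta_kb(1024, 1180)));
}
```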

Platform Differences

  • macOS: Uses the ps command to read RSS (Resident Set Size)
  • Linux: Reads /proc/self/status for VmRSS

RSS includes:

  • Heap allocations
  • Stack memory
  • Shared libraries (mapped into process)
  • Memory-mapped files
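The Linux side of this can be sketched with a small std-only reader; this mirrors the approach described above but is not necessarily the exact code the benchmarks use:

```rust
use std::fs;

// Read this process's VmRSS (in KB) from /proc/self/status.
// Linux only; on macOS the benchmarks shell out to `ps` instead.
fn current_rss_kb() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    status
        .lines()
        .find(|line| line.starts_with("VmRSS:"))?
        .split_whitespace()
        .nth(1)?
        .parse()
        .ok()
}

fn main() {
    match current_rss_kb() {
        Some(kb) => println!("VmRSS: {} KB", kb),
        None => println!("VmRSS unavailable (non-Linux platform?)"),
    }
}
```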

Limitations

  1. Granularity: OS-level memory reporting may not capture small allocations
  2. Timing: Memory measurements happen before/after operations, not continuously
  3. Allocator behavior: Rust has no garbage collector, but its allocator may not immediately return freed memory to the OS
  4. Shared memory: RSS includes shared library memory

Best Practices

For Accurate Measurements

  1. Run multiple iterations: Criterion handles this automatically
  2. Warm up: First iterations may show higher memory due to lazy initialization
  3. Isolate tests: Run memory benchmarks separately from performance benchmarks
  4. Monitor trends: Compare results over time, not absolute values

Memory Optimization Tips

If benchmarks show high memory usage:

  1. Check payload sizes: Large payloads consume proportional memory
  2. Limit concurrent operations: Too many simultaneous jobs/clients increase memory
  3. Review data structures: Ensure efficient serialization
  4. Profile with tools: Use heaptrack (Linux) or instruments (macOS) for detailed analysis

Advanced Profiling

For detailed memory profiling beyond these benchmarks:

macOS

# Use Instruments
instruments -t Allocations -D memory_trace.trace ./target/release/horus

# Use heap profiler
cargo install cargo-instruments
cargo instruments --bench memory_usage --template Allocations

Linux

# Use Valgrind massif
valgrind --tool=massif --massif-out-file=massif.out \
    ./target/release/deps/memory_usage-*

# Visualize with massif-visualizer
massif-visualizer massif.out

# Use heaptrack
heaptrack ./target/release/deps/memory_usage-*
heaptrack_gui heaptrack.memory_usage.*.gz

Cross-platform

# Use dhat (heap profiler)
# dhat is a library crate, not an installable binary: add it to
# Cargo.toml as an optional dependency behind a `dhat-heap` feature,
# wire it into the benchmark, then run:
cargo bench --bench memory_usage --features dhat-heap
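The wiring behind that `dhat-heap` feature flag looks roughly like this in the benchmark binary (a sketch following the dhat crate's documented usage; the feature name matches the command above):

```rust
// In benches/memory_usage.rs: route all allocations through dhat
// when the `dhat-heap` feature is enabled. Without the feature,
// both items compile out and the binary behaves as before.
#[cfg(feature = "dhat-heap")]
#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

fn main() {
    // The profiler writes dhat-heap.json when it is dropped.
    #[cfg(feature = "dhat-heap")]
    let _profiler = dhat::Profiler::new_heap();

    // ... benchmark body ...
}
```

Cargo.toml then needs `dhat` as an optional dependency and a `dhat-heap = ["dep:dhat"]` feature.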

Continuous Monitoring

Integrate memory benchmarks into CI/CD:

# Run and save baseline
cargo bench --bench memory_usage -- --save-baseline memory-main

# Compare in PR
cargo bench --bench memory_usage -- --baseline memory-main

# Fail if memory usage increases >10%
# (requires custom scripting to parse Criterion output)
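That custom scripting can be quite small. A sketch that parses the "Memory delta ..." lines these benchmarks print (line format taken from the examples in this document; the baseline source and 10% threshold are assumptions for illustration):

```rust
// Parse "Memory delta for 100 jobs: 156 KB" -> Some(("100 jobs", 156)).
fn parse_delta(line: &str) -> Option<(&str, u64)> {
    let rest = line.strip_prefix("Memory delta for ")?;
    let (label, tail) = rest.split_once(": ")?;
    let kb = tail.strip_suffix(" KB")?.parse().ok()?;
    Some((label, kb))
}

// Fail the gate if current usage exceeds the baseline by more than 10%.
fn within_budget(baseline_kb: u64, current_kb: u64) -> bool {
    current_kb as f64 <= baseline_kb as f64 * 1.10
}

fn main() {
    let line = "Memory delta for 100 jobs: 156 KB";
    if let Some((label, kb)) = parse_delta(line) {
        println!("{} -> {} KB, ok = {}", label, kb, within_budget(150, kb));
    }
}
```

In CI, the same functions would run over the captured benchmark stderr and exit non-zero on the first budget violation.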

Troubleshooting

"Memory delta is always 0"

  • OS may not update RSS immediately
  • Allocations might be too small to measure
  • Try increasing iteration count or operation size

"Memory keeps growing"

  • Check for memory leaks
  • Verify objects are being dropped
  • Use cargo clippy to find potential issues

"Results are inconsistent"

  • Other processes may be affecting measurements
  • Run benchmarks on idle system
  • Increase sample size in benchmark code

Example Output

memory_job_creation/10  time:   [45.2 µs 46.1 µs 47.3 µs]
Memory delta for 10 jobs: 24 KB

memory_job_creation/50  time:   [198.4 µs 201.2 µs 204.8 µs]
Memory delta for 50 jobs: 98 KB

memory_job_creation/100 time:   [387.6 µs 392.1 µs 397.4 µs]
Memory delta for 100 jobs: 187 KB

memory_client_creation/1    time:   [234.5 µs 238.2 µs 242.6 µs]
Memory delta for 1 clients: 45 KB

memory_payload_sizes/1KB    time:   [12.3 µs 12.6 µs 13.0 µs]
Memory delta for 1KB payload: 2 KB

memory_payload_sizes/100KB  time:   [156.7 µs 159.4 µs 162.8 µs]
Memory delta for 100KB payload: 105 KB