herocode/horus


Timur Gordon 4142f62e54 add complete binary and benchmarking

2025-11-18 20:39:25 +01:00


Memory Usage Benchmarks

Benchmarks for measuring memory consumption of the Horus stack components.

Overview

The memory benchmarks measure memory usage, reported as changes in resident set size (RSS), for the following operations:

Job creation and storage
Client instantiation
Payload size impact
Memory growth under load

Benchmarks

1. `memory_job_creation`

Measures memory usage when creating multiple Job objects.

Test sizes: 10, 50, 100, 200 jobs

What it measures:

Memory allocated per job object
Heap growth with increasing job count
Memory efficiency of Job structure

Expected results:

Linear memory growth with job count
~1-2 KB per job object (depending on payload)
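The linear-growth expectation can be sanity-checked with a rough size estimate. The sketch below is an illustration only: `Job` here is a hypothetical two-field stand-in, not the actual Horus `Job` type, and the estimate ignores allocator bookkeeping.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a job; the real Horus `Job` has a different layout.
struct Job {
    id: String,
    payload: String,
}

/// Approximate footprint of one job: the struct itself plus the heap
/// buffers owned by its two strings.
fn approx_job_bytes(job: &Job) -> usize {
    size_of::<Job>() + job.id.capacity() + job.payload.capacity()
}

/// Builds `count` jobs with `payload_len`-byte payloads and sums their
/// approximate footprints.
fn estimate_jobs_bytes(count: usize, payload_len: usize) -> usize {
    let jobs: Vec<Job> = (0..count)
        .map(|i| Job {
            id: format!("job-{i}"),
            payload: "x".repeat(payload_len),
        })
        .collect();
    jobs.iter().map(approx_job_bytes).sum()
}
```

Calling `estimate_jobs_bytes` with the test sizes (10, 50, 100, 200) should give totals that grow roughly in proportion to the job count, since each job contributes about the same number of bytes.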

2. `memory_client_creation`

Measures memory overhead of creating multiple Supervisor client instances.

Test sizes: 1, 10, 50, 100 clients

What it measures:

Memory per client instance
Connection pool overhead
HTTP client memory footprint

Expected results:

~10-50 KB per client instance
Includes HTTP client, connection pools, and buffers

3. `memory_payload_sizes`

Measures memory usage with different payload sizes.

Test sizes: 1KB, 10KB, 100KB, 1MB payloads

What it measures:

Memory overhead of JSON serialization
String allocation costs
Payload storage efficiency

Expected results:

Memory usage should scale linearly with payload size
Small overhead for JSON structure (~5-10%)
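To relate a measured delta to the 5-10% expectation, the overhead can be computed as a percentage of the raw payload size. This is a small helper sketch; the function name is hypothetical and not part of the benchmark code.

```rust
/// Overhead of a measured memory delta relative to the raw payload size,
/// in percent. Both arguments use the same unit (for example KB).
fn overhead_percent(payload_kb: f64, measured_kb: f64) -> f64 {
    (measured_kb - payload_kb) * 100.0 / payload_kb
}
```

For a 100 KB payload with a 105 KB delta this gives 5%. For very small payloads the percentage can be much higher, because fixed costs and the coarse granularity of RSS dominate the result.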

Running Memory Benchmarks

```bash
# Run all memory benchmarks
cargo bench --bench memory_usage

# Run specific memory test
cargo bench --bench memory_usage -- memory_job_creation

# Run with verbose output to see memory deltas
cargo bench --bench memory_usage -- --verbose
```

Interpreting Results

The benchmarks print memory deltas to stderr during execution:

```
Memory delta for 100 jobs: 156 KB
Memory delta for 50 clients: 2048 KB
Memory delta for 100KB payload: 105 KB
```
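Lines in this format are easy to parse when you want per-item numbers. The following is a minimal sketch that assumes the exact `Memory delta for <label>: <n> KB` layout shown above; `DeltaLine`, `parse_delta_line`, and `per_item_kb` are hypothetical names.

```rust
/// One parsed "Memory delta for ..." line.
#[derive(Debug, PartialEq)]
struct DeltaLine {
    label: String,
    delta_kb: i64,
}

/// Parses a line such as "Memory delta for 100 jobs: 156 KB".
/// Returns None for any line that does not match that layout.
fn parse_delta_line(line: &str) -> Option<DeltaLine> {
    let rest = line.trim().strip_prefix("Memory delta for ")?;
    let (label, value) = rest.rsplit_once(": ")?;
    let delta_kb: i64 = value.strip_suffix(" KB")?.trim().parse().ok()?;
    Some(DeltaLine {
        label: label.to_string(),
        delta_kb,
    })
}

/// Divides the delta by the leading count in the label ("100 jobs" -> 100).
/// Returns None when the label does not start with a plain non-zero integer.
fn per_item_kb(parsed: &DeltaLine) -> Option<f64> {
    let count: u64 = parsed.label.split_whitespace().next()?.parse().ok()?;
    if count == 0 {
        return None;
    }
    Some(parsed.delta_kb as f64 / count as f64)
}
```

Applied to the sample lines, this works out to about 1.56 KB per job and about 41 KB per client, both within the expected ranges listed for those benchmarks.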

Memory Delta Interpretation

Positive delta: Memory was allocated during the operation
Zero delta: No significant memory change (may be reusing existing allocations)
Negative delta: Memory was returned to the OS (deallocations; Rust has no garbage collector)
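The three cases can be expressed directly in code. This sketch assumes the before and after readings are unsigned KB values, so the subtraction has to be done in a signed type; `DeltaKind`, `signed_delta_kb`, and `classify` are hypothetical names.

```rust
use std::cmp::Ordering;

/// How a memory delta should be read.
#[derive(Debug, PartialEq)]
enum DeltaKind {
    Allocated,
    Unchanged,
    Released,
}

/// Signed difference between two RSS readings in KB.
/// The casts avoid the underflow that `after - before` would hit on u64.
fn signed_delta_kb(before_kb: u64, after_kb: u64) -> i64 {
    after_kb as i64 - before_kb as i64
}

/// Maps the sign of a delta to one of the three interpretations.
fn classify(delta_kb: i64) -> DeltaKind {
    match delta_kb.cmp(&0) {
        Ordering::Greater => DeltaKind::Allocated,
        Ordering::Equal => DeltaKind::Unchanged,
        Ordering::Less => DeltaKind::Released,
    }
}
```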

Platform Differences

macOS: Uses the ps command to read RSS (Resident Set Size)
Linux: Reads /proc/self/status for VmRSS
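A reader for both sources might look like the sketch below. It is not the benchmark's actual implementation: the function names are hypothetical, and the `ps -o rss= -p <pid>` invocation is an assumption about how the ps-based path could be written.

```rust
use std::process::Command;

/// Extracts the VmRSS value (in kB) from the text of /proc/self/status.
fn parse_vm_rss_kb(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("VmRSS:"))?
        .split_whitespace()
        .next()?
        .parse()
        .ok()
}

/// Parses the single number printed by `ps -o rss= -p <pid>`.
fn parse_ps_rss_kb(output: &str) -> Option<u64> {
    output.trim().parse().ok()
}

/// Current RSS of this process in KB, or None if it cannot be read.
fn current_rss_kb() -> Option<u64> {
    if cfg!(target_os = "linux") {
        let status = std::fs::read_to_string("/proc/self/status").ok()?;
        parse_vm_rss_kb(&status)
    } else {
        // Assumed fallback for macOS: ask ps for this process's RSS.
        let pid = std::process::id().to_string();
        let out = Command::new("ps")
            .args(["-o", "rss=", "-p", pid.as_str()])
            .output()
            .ok()?;
        parse_ps_rss_kb(&String::from_utf8_lossy(&out.stdout))
    }
}
```

Keeping the parsing separate from the I/O makes both paths testable with fixed strings, without depending on the platform the tests run on.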

RSS includes:

Heap allocations
Stack memory
Shared libraries (mapped into process)
Memory-mapped files

Limitations

Granularity: OS-level memory reporting may not capture small allocations
Timing: Memory measurements happen before/after operations, not continuously
Allocator effects: Rust's allocator may not immediately release freed memory to the OS
Shared memory: RSS includes shared library memory

Best Practices

For Accurate Measurements

Run multiple iterations: Criterion handles this automatically
Warm up: First iterations may show higher memory due to lazy initialization
Isolate tests: Run memory benchmarks separately from performance benchmarks
Monitor trends: Compare results over time, not absolute values
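If you collect the printed deltas yourself, one simple way to apply the first two points is to drop the warm-up samples and take the median of the rest. This is a sketch with a hypothetical helper name; it is not something Criterion provides for the stderr output.

```rust
/// Median of the samples that remain after skipping `warmup` leading samples.
/// Returns None when nothing is left. For an even number of samples the two
/// middle values are averaged (integer division).
fn median_after_warmup(samples_kb: &[i64], warmup: usize) -> Option<i64> {
    let mut kept: Vec<i64> = samples_kb.iter().skip(warmup).copied().collect();
    if kept.is_empty() {
        return None;
    }
    kept.sort_unstable();
    let mid = kept.len() / 2;
    if kept.len() % 2 == 1 {
        Some(kept[mid])
    } else {
        Some((kept[mid - 1] + kept[mid]) / 2)
    }
}
```

The median is used instead of the mean so that a single outlier, such as a first iteration inflated by lazy initialization, does not skew the result.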

Memory Optimization Tips

If benchmarks show high memory usage:

Check payload sizes: Large payloads consume proportional memory
Limit concurrent operations: Too many simultaneous jobs/clients increase memory
Review data structures: Ensure efficient serialization
Profile with tools: Use heaptrack (Linux) or Instruments (macOS) for detailed analysis

Advanced Profiling

For detailed memory profiling beyond these benchmarks:

macOS

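```bash
# Use Instruments via xctrace (the standalone `instruments` CLI is deprecated in recent Xcode releases)
xcrun xctrace record --template 'Allocations' --output memory_trace.trace --launch -- ./target/release/horus

# Or use cargo-instruments
cargo install cargo-instruments
cargo instruments --bench memory_usage --template Allocations
```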

Linux

```bash
# Use Valgrind massif
valgrind --tool=massif --massif-out-file=massif.out \
    ./target/release/deps/memory_usage-*

# Visualize with massif-visualizer
massif-visualizer massif.out

# Use heaptrack
heaptrack ./target/release/deps/memory_usage-*
heaptrack_gui heaptrack.memory_usage.*.gz
```

Cross-platform

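```bash
# Use dhat (heap profiler); it is a library crate, so add it as a dependency instead of installing it
cargo add dhat --optional
# Define a dhat-heap feature that enables it, instrument your benchmark, and run
cargo bench --bench memory_usage --features dhat-heap
```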

Continuous Monitoring

Integrate memory benchmarks into CI/CD:

```bash
# Run and save baseline
cargo bench --bench memory_usage -- --save-baseline memory-main

# Compare in PR
cargo bench --bench memory_usage -- --baseline memory-main

# Fail if memory usage increases >10%
# (requires custom scripting to parse Criterion output)
```
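The threshold check itself is simple once the baseline and current deltas have been extracted. The sketch below assumes both values were collected from the benchmark's stderr output by your own script; the function names are hypothetical, and the 10% limit is passed in as a parameter.

```rust
/// Percentage change from baseline to current.
/// Returns None when the baseline is zero or negative, since a relative
/// change is not meaningful in that case.
fn percent_change(baseline_kb: f64, current_kb: f64) -> Option<f64> {
    if baseline_kb <= 0.0 {
        return None;
    }
    Some((current_kb - baseline_kb) * 100.0 / baseline_kb)
}

/// True when memory grew by more than `max_increase_percent`.
fn exceeds_threshold(baseline_kb: f64, current_kb: f64, max_increase_percent: f64) -> bool {
    percent_change(baseline_kb, current_kb)
        .map_or(false, |change| change > max_increase_percent)
}
```

A CI step could call `exceeds_threshold(baseline, current, 10.0)` for each benchmark and exit with a non-zero status when any call returns true.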

Troubleshooting

"Memory delta is always 0"

OS may not update RSS immediately
Allocations might be too small to measure
Try increasing iteration count or operation size
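When RSS is too coarse to show a change, counting bytes at the allocator level is one alternative. The sketch below is not part of the Horus benchmarks: it wraps the system allocator with a hypothetical `CountingAlloc` that tracks requested bytes (it does not include allocator bookkeeping overhead).

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Bytes currently allocated through the global allocator.
static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

/// Wrapper around the system allocator that counts live bytes.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Snapshot of the live byte count.
fn live_heap_bytes() -> usize {
    LIVE_BYTES.load(Ordering::Relaxed)
}
```

Reading `live_heap_bytes()` before and after an operation gives a byte-level delta even for allocations far too small to move RSS.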

"Memory keeps growing"

Check for memory leaks
Verify objects are being dropped
Use cargo clippy to catch some suspicious patterns; for actual leak detection, use one of the heap profilers listed under Advanced Profiling
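One way to verify that objects are being dropped is to count live instances with a `Drop` implementation. This is a debugging sketch: `TrackedJob` is a hypothetical type, not the Horus `Job`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of TrackedJob values that currently exist.
static LIVE_JOBS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical job wrapper that counts its own instances.
struct TrackedJob {
    payload: String,
}

impl TrackedJob {
    /// Creates a job and increments the live counter.
    fn new(payload: &str) -> Self {
        LIVE_JOBS.fetch_add(1, Ordering::SeqCst);
        TrackedJob {
            payload: payload.to_string(),
        }
    }

    /// Length of the stored payload in bytes.
    fn payload_len(&self) -> usize {
        self.payload.len()
    }
}

impl Drop for TrackedJob {
    /// Decrements the live counter when the job is dropped.
    fn drop(&mut self) {
        LIVE_JOBS.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Snapshot of the live instance count.
fn live_jobs() -> usize {
    LIVE_JOBS.load(Ordering::SeqCst)
}
```

If `live_jobs()` does not return to its starting value after a benchmark iteration, something is still holding on to the jobs, for example a long-lived collection or a reference cycle.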

"Results are inconsistent"

Other processes may be affecting measurements
Run benchmarks on an idle system
Increase sample size in benchmark code

Example Output

```
memory_job_creation/10  time:   [45.2 µs 46.1 µs 47.3 µs]
Memory delta for 10 jobs: 24 KB

memory_job_creation/50  time:   [198.4 µs 201.2 µs 204.8 µs]
Memory delta for 50 jobs: 98 KB

memory_job_creation/100 time:   [387.6 µs 392.1 µs 397.4 µs]
Memory delta for 100 jobs: 187 KB

memory_client_creation/1    time:   [234.5 µs 238.2 µs 242.6 µs]
Memory delta for 1 clients: 45 KB

memory_payload_sizes/1KB    time:   [12.3 µs 12.6 µs 13.0 µs]
Memory delta for 1KB payload: 2 KB

memory_payload_sizes/100KB  time:   [156.7 µs 159.4 µs 162.8 µs]
Memory delta for 100KB payload: 105 KB
```
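To check the linear-growth expectation against output like this, a least-squares line can be fitted through the (job count, delta) pairs. The sketch below uses a hypothetical `linear_fit` helper and the three job-creation deltas from the example above.

```rust
/// Ordinary least-squares fit of y = slope * x + intercept.
/// Returns None with fewer than two points or when all x values are equal.
fn linear_fit(points: &[(f64, f64)]) -> Option<(f64, f64)> {
    if points.len() < 2 {
        return None;
    }
    let n = points.len() as f64;
    let mean_x = points.iter().map(|p| p.0).sum::<f64>() / n;
    let mean_y = points.iter().map(|p| p.1).sum::<f64>() / n;
    let sxx: f64 = points.iter().map(|p| (p.0 - mean_x).powi(2)).sum();
    let sxy: f64 = points
        .iter()
        .map(|p| (p.0 - mean_x) * (p.1 - mean_y))
        .sum();
    if sxx == 0.0 {
        return None;
    }
    let slope = sxy / sxx;
    let intercept = mean_y - slope * mean_x;
    Some((slope, intercept))
}
```

For the points (10, 24), (50, 98), and (100, 187), the fit gives a slope of about 1.8 KB per job and an intercept of roughly 6.5 KB, which matches the expected 1-2 KB per job plus a small fixed cost.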
