
# Memory Usage Benchmarks

Benchmarks for measuring memory consumption of the Horus stack components.

## Overview

The memory benchmarks track changes in the process's resident set size (RSS), used here as a proxy for heap usage, across several kinds of operations:
- Job creation and storage
- Client instantiation
- Payload size impact
- Memory growth under load

## Benchmarks

### 1. `memory_job_creation`
Measures memory usage when creating multiple `Job` objects.

**Test sizes**: 10, 50, 100, 200 jobs

**What it measures**:
- Memory allocated per job object
- Heap growth with increasing job count
- Memory efficiency of Job structure

**Expected results**:
- Linear memory growth with job count
- ~1-2 KB per job object (depending on payload)

### 2. `memory_client_creation`
Measures memory overhead of creating multiple Supervisor client instances.

**Test sizes**: 1, 10, 50, 100 clients

**What it measures**:
- Memory per client instance
- Connection pool overhead
- HTTP client memory footprint

**Expected results**:
- ~10-50 KB per client instance
- Includes HTTP client, connection pools, and buffers

### 3. `memory_payload_sizes`
Measures memory usage with different payload sizes.

**Test sizes**: 1KB, 10KB, 100KB, 1MB payloads

**What it measures**:
- Memory overhead of JSON serialization
- String allocation costs
- Payload storage efficiency

**Expected results**:
- Memory usage should scale linearly with payload size
- Small overhead for JSON structure (~5-10%)

## Running Memory Benchmarks

```bash
# Run all memory benchmarks
cargo bench --bench memory_usage

# Run specific memory test
cargo bench --bench memory_usage -- memory_job_creation

# Run with verbose output to see memory deltas
cargo bench --bench memory_usage -- --verbose
```

## Interpreting Results

The benchmarks print memory deltas to stderr during execution:

```
Memory delta for 100 jobs: 156 KB
Memory delta for 50 clients: 2048 KB
Memory delta for 100KB payload: 105 KB
```

### Memory Delta Interpretation

- **Positive delta**: Memory was allocated during the operation
- **Zero delta**: No significant memory change (may be reusing existing allocations)
- **Negative delta**: Memory was returned to the OS during the operation (Rust has no garbage collector, so this reflects deallocations and the allocator releasing pages)
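
The interpretation above comes down to a signed subtraction of two RSS readings, optionally averaged over the number of items created. The following is a minimal sketch of that arithmetic; the function names `memory_delta_kb` and `per_item_kb` are illustrative and not taken from the benchmark source.

```rust
/// Signed RSS change in KB between two readings.
/// Positive means the process grew, negative means it shrank.
fn memory_delta_kb(before_kb: u64, after_kb: u64) -> i64 {
    after_kb as i64 - before_kb as i64
}

/// Average KB attributed to each item (job, client, ...) created in a run.
fn per_item_kb(delta_kb: i64, count: u32) -> f64 {
    delta_kb as f64 / f64::from(count)
}
```

For the sample line `Memory delta for 100 jobs: 156 KB`, `per_item_kb(156, 100)` gives about 1.56 KB per job, which falls inside the ~1-2 KB range expected for `memory_job_creation`.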

### Platform Differences

- **macOS**: Uses the `ps` command to read RSS (Resident Set Size)
- **Linux**: Reads `VmRSS` from `/proc/self/status`

RSS includes:
- Heap allocations
- Stack memory
- Shared libraries (mapped into process)
- Memory-mapped files
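
As a rough illustration of the two approaches, the sketch below reads RSS from `/proc/self/status` on Linux and falls back to `ps -o rss= -p <pid>` elsewhere. It is an assumption about how such a reading can be taken, not the benchmark's actual code; `parse_vm_rss_kb` and `current_rss_kb` are hypothetical names.

```rust
use std::fs;
use std::process::Command;

/// Extracts the `VmRSS` value (in KB) from the text of `/proc/self/status`.
fn parse_vm_rss_kb(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("VmRSS:"))
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|value| value.parse().ok())
}

/// Hypothetical helper: returns the current process RSS in KB,
/// or `None` if it cannot be read on this platform.
fn current_rss_kb() -> Option<u64> {
    if cfg!(target_os = "linux") {
        let status = fs::read_to_string("/proc/self/status").ok()?;
        parse_vm_rss_kb(&status)
    } else {
        // `ps -o rss= -p <pid>` prints the RSS in KB with no header line.
        let pid = std::process::id().to_string();
        let output = Command::new("ps")
            .args(["-o", "rss=", "-p", pid.as_str()])
            .output()
            .ok()?;
        String::from_utf8(output.stdout).ok()?.trim().parse().ok()
    }
}
```

Taking one reading before an operation and one after it yields the delta that the benchmarks print. Spawning `ps` allocates memory of its own in the child process, but that does not count toward the RSS of the process being measured.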

## Limitations

1. **Granularity**: OS-level memory reporting works in whole pages (typically 4 KB or 16 KB), so small allocations may not register
2. **Timing**: Memory measurements happen before/after operations, not continuously
3. **Allocator effects**: Rust has no garbage collector, but its allocator may not immediately return freed memory to the OS
4. **Shared memory**: RSS includes shared library memory
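
When RSS is too coarse, one alternative is to count bytes at the allocator level. The sketch below wraps the system allocator using only the standard library; it is not part of the Horus benchmarks, and `CountingAllocator` and `bytes_allocated_during` are hypothetical names used for illustration.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Total bytes requested from the allocator since process start.
static ALLOCATED_BYTES: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical wrapper that forwards to the system allocator and counts
/// every requested byte. Frees are not subtracted, so the counter only grows.
struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATED_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Runs `f` and returns its result together with the number of bytes
/// requested while it ran. Allocations from other threads are included.
fn bytes_allocated_during<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = ALLOCATED_BYTES.load(Ordering::Relaxed);
    let value = f();
    let after = ALLOCATED_BYTES.load(Ordering::Relaxed);
    (value, after - before)
}
```

Unlike RSS, this counter sees every allocation regardless of page granularity, but it reports bytes requested rather than memory actually resident, so the two numbers are not directly comparable.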

## Best Practices

### For Accurate Measurements

1. **Run multiple iterations**: Criterion handles this automatically
2. **Warm up**: First iterations may show higher memory due to lazy initialization
3. **Isolate tests**: Run memory benchmarks separately from performance benchmarks
4. **Monitor trends**: Compare results over time, not absolute values

### Memory Optimization Tips

If benchmarks show high memory usage:

1. **Check payload sizes**: Large payloads consume proportional memory
2. **Limit concurrent operations**: Too many simultaneous jobs/clients increase memory
3. **Review data structures**: Ensure efficient serialization
4. **Profile with tools**: Use `heaptrack` (Linux) or Instruments (macOS) for detailed analysis

## Advanced Profiling

For detailed memory profiling beyond these benchmarks:

### macOS
```bash
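# Use Instruments via xctrace (the standalone `instruments` CLI is
# deprecated in recent Xcode releases)
xcrun xctrace record --template Allocations --output memory_trace.trace \
    --launch -- ./target/release/horus

# Or use cargo-instruments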
cargo install cargo-instruments
cargo instruments --bench memory_usage --template Allocations
```

### Linux
```bash
# Use Valgrind massif
valgrind --tool=massif --massif-out-file=massif.out \
    ./target/release/deps/memory_usage-*

# Visualize with massif-visualizer
massif-visualizer massif.out

# Use heaptrack
heaptrack ./target/release/deps/memory_usage-*
heaptrack_gui heaptrack.memory_usage.*.gz
```

### Cross-platform
```bash
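# Use dhat (heap profiler). dhat is a library crate, so add it as a
# dependency instead of installing it with `cargo install`
cargo add dhat
# Define a `dhat-heap` feature, wire dhat into the benchmark, then run
cargo bench --bench memory_usage --features dhat-heap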
```

## Continuous Monitoring

Integrate memory benchmarks into CI/CD:

```bash
# Run and save baseline
cargo bench --bench memory_usage -- --save-baseline memory-main

# Compare in PR
cargo bench --bench memory_usage -- --baseline memory-main

# Fail if memory usage increases >10%
# (Criterion baselines compare timings by default; the memory deltas on
# stderr need custom scripting to capture and compare)
```
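
The custom scripting mentioned above could look roughly like the following sketch. It assumes the stderr lines keep the `Memory delta for <label>: <n> KB` shape shown under Interpreting Results; `parse_delta_line` and `is_regression` are hypothetical helpers, not part of Criterion or Horus.

```rust
/// Parses a line such as "Memory delta for 100 jobs: 156 KB" into
/// (label, delta in KB). Returns `None` for any other line.
fn parse_delta_line(line: &str) -> Option<(String, i64)> {
    let rest = line.trim().strip_prefix("Memory delta for ")?;
    let (label, value) = rest.rsplit_once(": ")?;
    let kb = value.strip_suffix(" KB")?.trim().parse().ok()?;
    Some((label.to_string(), kb))
}

/// True if `current_kb` exceeds `baseline_kb` by more than
/// `max_increase_pct` percent.
fn is_regression(baseline_kb: i64, current_kb: i64, max_increase_pct: f64) -> bool {
    if baseline_kb <= 0 {
        // A relative change is not meaningful without a positive baseline.
        return false;
    }
    let increase_pct = (current_kb - baseline_kb) as f64 / baseline_kb as f64 * 100.0;
    increase_pct > max_increase_pct
}
```

A CI job could redirect stderr to a file on the main branch, repeat that on the PR branch, parse both files with `parse_delta_line`, and fail when `is_regression(baseline, current, 10.0)` is true for any label. Because RSS readings are noisy, comparing an average over several runs is likely to be more stable than comparing single readings.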

## Troubleshooting

### "Memory delta is always 0"
- OS may not update RSS immediately
- Allocations might be too small to measure
- Try increasing iteration count or operation size

### "Memory keeps growing"
- Check for memory leaks
- Verify objects are being dropped
- Use `cargo clippy` to flag some suspicious patterns, but note that it does not detect leaks; confirm with a heap profiler such as `heaptrack`

### "Results are inconsistent"
- Other processes may be affecting measurements
- Run benchmarks on an idle system
- Increase sample size in benchmark code

## Example Output

```
memory_job_creation/10  time:   [45.2 µs 46.1 µs 47.3 µs]
Memory delta for 10 jobs: 24 KB

memory_job_creation/50  time:   [198.4 µs 201.2 µs 204.8 µs]
Memory delta for 50 jobs: 98 KB

memory_job_creation/100 time:   [387.6 µs 392.1 µs 397.4 µs]
Memory delta for 100 jobs: 187 KB

memory_client_creation/1    time:   [234.5 µs 238.2 µs 242.6 µs]
Memory delta for 1 clients: 45 KB

memory_payload_sizes/1KB    time:   [12.3 µs 12.6 µs 13.0 µs]
Memory delta for 1KB payload: 2 KB

memory_payload_sizes/100KB  time:   [156.7 µs 159.4 µs 162.8 µs]
Memory delta for 100KB payload: 105 KB
```

## Related Documentation

- [Performance Benchmarks](./README.md)
- [Stress Tests](./README.md#stress-tests)
- [Rust Performance Book](https://nnethercote.github.io/perf-book/)
- [Criterion.rs Documentation](https://bheisler.github.io/criterion.rs/book/)