# Horus Stack Benchmarks
Comprehensive benchmark suite for the entire Horus stack, testing performance through the client APIs.
## Overview
These benchmarks test the full Horus system including:
- **Supervisor API** - Job management, runner coordination
- **Coordinator API** - Job routing and execution
- **Osiris API** - REST API for data queries
All benchmarks interact with the stack through the official client libraries in `/lib/clients`, which are the only supported way to interact with the system.
## Prerequisites
Before running benchmarks, you must have the Horus stack running:
```bash
# Start Redis
redis-server
# Start all Horus services
cd /Users/timurgordon/code/git.ourworld.tf/herocode/horus
RUST_LOG=info ./target/release/horus all --admin-secret SECRET --kill-ports
```
The benchmarks expect:
- **Supervisor** running on `http://127.0.0.1:3030`
- **Coordinator** running on `http://127.0.0.1:9652` (HTTP) and `ws://127.0.0.1:9653` (WebSocket)
- **Osiris** running on `http://127.0.0.1:8081`
- **Redis** running on `127.0.0.1:6379`
- Admin secret: `SECRET`
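Before kicking off a run, it can help to confirm that each service is actually reachable. A quick sanity check along these lines works (the Osiris health path is an assumption; adjust to whatever your deployment exposes):
```bash
# Redis: should print PONG
redis-cli -h 127.0.0.1 -p 6379 ping
# Osiris health endpoint (path assumed; see the osiris_health_check benchmark)
curl -s http://127.0.0.1:8081/health
# Supervisor and Coordinator: confirm the ports are listening
lsof -iTCP:3030 -sTCP:LISTEN
lsof -iTCP:9652 -sTCP:LISTEN
```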
## Running Benchmarks
### Run all benchmarks
```bash
cargo bench --bench horus_stack
```
### Run a specific benchmark
```bash
cargo bench --bench horus_stack -- supervisor_discovery
```
### Run with a specific filter
```bash
cargo bench --bench horus_stack -- concurrent
```
### Run with verbose output
```bash
cargo bench --bench horus_stack -- --verbose
```
## Benchmark Categories
### 1. API Discovery & Metadata (`horus_stack`)
- `supervisor_discovery` - OpenRPC metadata retrieval
- `supervisor_get_info` - Supervisor information and stats
### 2. Runner Management (`horus_stack`)
- `supervisor_list_runners` - List all registered runners
- `get_all_runner_status` - Get status of all runners
### 3. Job Operations (`horus_stack`)
- `supervisor_job_create` - Create job without execution
- `supervisor_job_list` - List all jobs
- `job_full_lifecycle` - Complete job lifecycle (create → execute → result)
### 4. Concurrency Tests (`horus_stack`)
- `concurrent_jobs` - Submit multiple jobs concurrently (1, 5, 10, 20 jobs)
### 5. Health & Monitoring (`horus_stack`)
- `osiris_health_check` - Osiris server health endpoint
### 6. API Latency (`horus_stack`)
- `api_latency/supervisor_info` - Supervisor info latency
- `api_latency/runner_list` - Runner list latency
- `api_latency/job_list` - Job list latency
### 7. Stress Tests (`stress_test`)
- `stress_high_frequency_jobs` - High-frequency submissions (50-200 jobs)
- `stress_sustained_load` - Continuous load testing
- `stress_large_payloads` - Large payload handling (1KB-100KB)
- `stress_rapid_api_calls` - Rapid API calls (100 calls/iteration)
- `stress_mixed_workload` - Mixed operation scenarios
- `stress_connection_pool` - Connection pool exhaustion (10-100 clients)
### 8. Memory Usage (`memory_usage`)
- `memory_job_creation` - Memory per job object (10-200 jobs)
- `memory_client_creation` - Memory per client instance (1-100 clients)
- `memory_payload_sizes` - Memory vs payload size (1KB-1MB)
See [MEMORY_BENCHMARKS.md](./MEMORY_BENCHMARKS.md) for detailed memory profiling documentation.
## Interpreting Results
Criterion outputs detailed statistics including:
- **Mean time** - Average execution time
- **Std deviation** - Variability in measurements
- **Median** - Middle value (50th percentile)
- **MAD** - Median Absolute Deviation
- **Throughput** - Operations per second
Results are saved in `target/criterion/` with:
- HTML reports with graphs
- JSON data for further analysis
- Historical comparison with previous runs
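To open the aggregated HTML report after a run (Criterion's default output location, provided report generation is enabled):
```bash
open target/criterion/report/index.html        # macOS
# xdg-open target/criterion/report/index.html  # Linux
```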
## Performance Targets
Expected performance (on modern hardware):
| Benchmark | Target | Notes |
|-----------|--------|-------|
| supervisor_discovery | < 10ms | Metadata retrieval |
| supervisor_get_info | < 5ms | Simple info query |
| supervisor_list_runners | < 5ms | List operation |
| supervisor_job_create | < 10ms | Job creation only |
| job_full_lifecycle | < 100ms | Full execution cycle |
| osiris_health_check | < 2ms | Health endpoint |
| concurrent_jobs (10) | < 500ms | 10 parallel jobs |
## Customization
To modify benchmark parameters, edit `benches/horus_stack.rs`:
```rust
// Change URLs
const SUPERVISOR_URL: &str = "http://127.0.0.1:3030";
const OSIRIS_URL: &str = "http://127.0.0.1:8081";

// Change admin secret
const ADMIN_SECRET: &str = "SECRET";

// Adjust concurrent job counts
for num_jobs in [1, 5, 10, 20, 50].iter() {
    // ...
}
```
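If you change the job counts, the loop typically feeds a Criterion benchmark group so that each count appears as its own data point. A minimal sketch of that pattern (the `submit_jobs` helper below is a hypothetical stand-in for the submission logic in `horus_stack.rs`):
```rust
use criterion::{BenchmarkId, Criterion};

// Hypothetical stand-in for the real job-submission helper in horus_stack.rs.
async fn submit_jobs(n: i32) {
    let _ = n; // ... submit `n` jobs through the supervisor client and await the results ...
}

fn bench_concurrent_jobs(c: &mut Criterion) {
    let rt = tokio::runtime::Runtime::new().expect("failed to build Tokio runtime");
    let mut group = c.benchmark_group("concurrent_jobs");
    for num_jobs in [1, 5, 10, 20, 50].iter() {
        group.bench_with_input(BenchmarkId::from_parameter(num_jobs), num_jobs, |b, &n| {
            b.to_async(&rt).iter(|| async move { submit_jobs(n).await });
        });
    }
    group.finish();
}
```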
## CI/CD Integration
To keep benchmark runs short in CI and compare results against saved baselines:
```bash
# Run only fast benchmarks
cargo bench --bench horus_stack -- --quick
# Save baseline for comparison
cargo bench --bench horus_stack -- --save-baseline main
# Compare against baseline
cargo bench --bench horus_stack -- --baseline main
```
## Troubleshooting
### "Connection refused" errors
- Ensure the Horus stack is running
- Check that all services are listening on expected ports
- Verify firewall settings
### "Job execution timeout" errors
- Increase timeout values in benchmark code
- Check that runners are properly registered
- Verify Redis is accessible
### Inconsistent results
- Close other applications to reduce system load
- Run benchmarks multiple times for statistical significance
- Use `--warm-up-time` flag to increase warm-up period
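The warm-up and measurement windows can also be baked into the benchmark configuration instead of being passed on the command line. A minimal sketch using Criterion's `criterion_group!` config form (the durations are arbitrary examples, not tuned values):
```rust
use std::time::Duration;
use criterion::{criterion_group, criterion_main, Criterion};

fn bench_example(c: &mut Criterion) {
    c.bench_function("example", |b| b.iter(|| 1 + 1));
}

criterion_group! {
    name = benches;
    // Longer warm-up and measurement windows smooth out noisy, network-bound runs.
    config = Criterion::default()
        .warm_up_time(Duration::from_secs(5))
        .measurement_time(Duration::from_secs(10));
    targets = bench_example
}
criterion_main!(benches);
```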
## Adding New Benchmarks
To add a new benchmark:
1. Create a new function in `benches/horus_stack.rs`:
```rust
fn bench_my_feature(c: &mut Criterion) {
    let rt = create_runtime();
    let client = /* create client */;
    c.bench_function("my_feature", |b| {
        b.to_async(&rt).iter(|| async {
            // Your benchmark code
        });
    });
}
```
2. Add it to the `criterion_group!` macro:
```rust
criterion_group!(
    benches,
    // ... existing benchmarks
    bench_my_feature,
);
```
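The `create_runtime()` call in step 1 refers to a helper already defined in the bench file; if you are starting from scratch, it is typically just a Tokio runtime constructor along these lines (a sketch, assuming the multi-threaded runtime):
```rust
use tokio::runtime::Runtime;

// Builds the runtime that `b.to_async(&rt)` uses to drive the async benchmark bodies.
fn create_runtime() -> Runtime {
    tokio::runtime::Builder::new_multi_thread()
        .enable_all()
        .build()
        .expect("failed to build Tokio runtime")
}
```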
## Resources
- [Criterion.rs Documentation](https://bheisler.github.io/criterion.rs/book/)
- [Horus Client Documentation](../lib/clients/)
- [Performance Tuning Guide](../docs/performance.md)