# Horus Stack Benchmarks

Comprehensive benchmark suite for the entire Horus stack, testing performance through the client APIs.

## Overview

These benchmarks test the full Horus system, including:
- **Supervisor API** - Job management, runner coordination
- **Coordinator API** - Job routing and execution
- **Osiris API** - REST API for data queries

All benchmarks interact with the stack through the official client libraries in `/lib/clients`, which is the only supported way to interact with the system.

## Prerequisites

Before running benchmarks, you must have the Horus stack running:

```bash
# Start Redis
redis-server

# Start all Horus services (adjust the path to your local checkout)
cd /Users/timurgordon/code/git.ourworld.tf/herocode/horus
RUST_LOG=info ./target/release/horus all --admin-secret SECRET --kill-ports
```

The benchmarks expect:
- **Supervisor** running on `http://127.0.0.1:3030`
- **Coordinator** running on `http://127.0.0.1:9652` (HTTP) and `ws://127.0.0.1:9653` (WebSocket)
- **Osiris** running on `http://127.0.0.1:8081`
- **Redis** running on `127.0.0.1:6379`
- Admin secret: `SECRET`

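
Before starting a long benchmark run, it can help to confirm that each service is accepting connections. The following is a minimal sketch using only the Rust standard library. It is not part of the benchmark suite; the `ENDPOINTS` table and the `is_listening` helper are illustrative names, and the addresses are copied from the list above. It only checks that a TCP connection can be opened, not that the service behind the port is healthy.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Service endpoints copied from the list of expected services.
const ENDPOINTS: [(&str, &str); 5] = [
    ("Supervisor", "127.0.0.1:3030"),
    ("Coordinator (HTTP)", "127.0.0.1:9652"),
    ("Coordinator (WebSocket)", "127.0.0.1:9653"),
    ("Osiris", "127.0.0.1:8081"),
    ("Redis", "127.0.0.1:6379"),
];

/// Returns true if a TCP connection to `addr` succeeds within `timeout`.
/// An address that cannot be parsed counts as not listening.
fn is_listening(addr: &str, timeout: Duration) -> bool {
    match addr.parse::<SocketAddr>() {
        Ok(sock) => TcpStream::connect_timeout(&sock, timeout).is_ok(),
        Err(_) => false,
    }
}

fn main() {
    for (name, addr) in ENDPOINTS.iter() {
        let status = if is_listening(addr, Duration::from_millis(500)) {
            "up"
        } else {
            "DOWN"
        };
        println!("{:<24} {:<16} {}", name, addr, status);
    }
}
```

Any endpoint reported as `DOWN` would most likely surface later as a "Connection refused" error in the benchmarks.
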
## Running Benchmarks

### Run all benchmarks
```bash
cargo bench --bench horus_stack
```

### Run a specific benchmark
```bash
cargo bench --bench horus_stack -- supervisor_discovery
```

### Run with a specific filter
```bash
cargo bench --bench horus_stack -- concurrent
```

### Generate detailed reports
```bash
cargo bench --bench horus_stack -- --verbose
```

## Benchmark Categories

### 1. API Discovery & Metadata (`horus_stack`)
- `supervisor_discovery` - OpenRPC metadata retrieval
- `supervisor_get_info` - Supervisor information and stats

### 2. Runner Management (`horus_stack`)
- `supervisor_list_runners` - List all registered runners
- `get_all_runner_status` - Get status of all runners

### 3. Job Operations (`horus_stack`)
- `supervisor_job_create` - Create job without execution
- `supervisor_job_list` - List all jobs
- `job_full_lifecycle` - Complete job lifecycle (create → execute → result)

### 4. Concurrency Tests (`horus_stack`)
- `concurrent_jobs` - Submit multiple jobs concurrently (1, 5, 10, 20 jobs)

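
To illustrate what "submit multiple jobs concurrently" means for the batch sizes above, here is a rough standard-library sketch. It is an assumption-laden illustration, not the code in `benches/horus_stack.rs`: `submit_job` is a hypothetical stand-in that only sleeps, and OS threads are used here in place of whatever async runtime the real benchmark uses.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a client call that submits one job.
/// It sleeps briefly to simulate network latency and returns the job id.
fn submit_job(id: usize) -> usize {
    thread::sleep(Duration::from_millis(5));
    id
}

/// Submits `num_jobs` jobs in parallel, one thread per job, and returns
/// the results in submission order together with the wall-clock time.
fn run_batch(num_jobs: usize) -> (Vec<usize>, Duration) {
    let start = Instant::now();
    let handles: Vec<_> = (0..num_jobs)
        .map(|id| thread::spawn(move || submit_job(id)))
        .collect();
    let results: Vec<usize> = handles
        .into_iter()
        .map(|h| h.join().expect("job thread panicked"))
        .collect();
    (results, start.elapsed())
}

fn main() {
    // Batch sizes taken from the `concurrent_jobs` description.
    for num_jobs in [1usize, 5, 10, 20].iter() {
        let (results, elapsed) = run_batch(*num_jobs);
        println!("{:>2} jobs: {} completed in {:?}", num_jobs, results.len(), elapsed);
    }
}
```

If the submissions really run in parallel, the wall-clock time per batch should grow far more slowly than the batch size.
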
### 5. Health & Monitoring (`horus_stack`)
- `osiris_health_check` - Osiris server health endpoint

### 6. API Latency (`horus_stack`)
- `api_latency/supervisor_info` - Supervisor info latency
- `api_latency/runner_list` - Runner list latency
- `api_latency/job_list` - Job list latency

### 7. Stress Tests (`stress_test`)
- `stress_high_frequency_jobs` - High-frequency submissions (50-200 jobs)
- `stress_sustained_load` - Continuous load testing
- `stress_large_payloads` - Large payload handling (1KB-100KB)
- `stress_rapid_api_calls` - Rapid API calls (100 calls/iteration)
- `stress_mixed_workload` - Mixed operation scenarios
- `stress_connection_pool` - Connection pool exhaustion (10-100 clients)

### 8. Memory Usage (`memory_usage`)
- `memory_job_creation` - Memory per job object (10-200 jobs)
- `memory_client_creation` - Memory per client instance (1-100 clients)
- `memory_payload_sizes` - Memory vs payload size (1KB-1MB)

See [MEMORY_BENCHMARKS.md](./MEMORY_BENCHMARKS.md) for detailed memory profiling documentation.

## Interpreting Results

Criterion outputs detailed statistics including:
- **Mean time** - Average execution time
- **Std deviation** - Variability in measurements
- **Median** - Middle value (50th percentile)
- **MAD** - Median Absolute Deviation
- **Throughput** - Operations per second

Results are saved in `target/criterion/` with:
- HTML reports with graphs
- JSON data for further analysis
- Historical comparison with previous runs

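
To make the statistics in this section concrete, the sketch below computes them by hand for a small, made-up set of timings. It only illustrates the textbook definitions; Criterion derives its reported estimates from many samples using resampling, so its numbers are not produced by this code. The sample values are hypothetical.

```rust
/// Arithmetic mean of the samples.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Sample standard deviation (n - 1 in the denominator).
fn std_dev(xs: &[f64]) -> f64 {
    let m = mean(xs);
    let var = xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / (xs.len() as f64 - 1.0);
    var.sqrt()
}

/// Median: middle value, or the average of the two middle values.
fn median(xs: &[f64]) -> f64 {
    let mut v = xs.to_vec();
    v.sort_by(|a, b| a.partial_cmp(b).expect("samples must not be NaN"));
    let n = v.len();
    if n % 2 == 1 {
        v[n / 2]
    } else {
        (v[n / 2 - 1] + v[n / 2]) / 2.0
    }
}

/// Median absolute deviation: the median of |x - median(xs)|, unscaled.
fn mad(xs: &[f64]) -> f64 {
    let m = median(xs);
    let devs: Vec<f64> = xs.iter().map(|x| (x - m).abs()).collect();
    median(&devs)
}

fn main() {
    // Hypothetical timings in milliseconds, with one slow outlier.
    let samples_ms = [4.0, 5.0, 5.0, 6.0, 30.0];
    println!("mean       = {:.2} ms", mean(&samples_ms));
    println!("std dev    = {:.2} ms", std_dev(&samples_ms));
    println!("median     = {:.2} ms", median(&samples_ms));
    println!("MAD        = {:.2} ms", mad(&samples_ms));
    // Throughput for one operation per iteration: 1000 ms / mean time.
    println!("throughput = {:.1} ops/s", 1000.0 / mean(&samples_ms));
}
```

With these samples the single outlier pulls the mean to 10 ms while the median stays at 5 ms, which is why comparing both is useful when results look inconsistent.
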
## Performance Targets

Expected performance (on modern hardware):

| Benchmark | Target | Notes |
|-----------|--------|-------|
| supervisor_discovery | < 10ms | Metadata retrieval |
| supervisor_get_info | < 5ms | Simple info query |
| supervisor_list_runners | < 5ms | List operation |
| supervisor_job_create | < 10ms | Job creation only |
| job_full_lifecycle | < 100ms | Full execution cycle |
| osiris_health_check | < 2ms | Health endpoint |
| concurrent_jobs (10) | < 500ms | 10 parallel jobs |

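
One way to use the table is to compare measured mean times against the targets automatically. The sketch below is a hypothetical helper, not something shipped with the suite: the `TARGETS_MS` table mirrors the rows above, the lookup keys are the names as written in the table (which may differ from the IDs Criterion prints), and the measurements in `main` are invented for illustration.

```rust
/// Targets in milliseconds, copied from the table of expected performance.
const TARGETS_MS: [(&str, f64); 7] = [
    ("supervisor_discovery", 10.0),
    ("supervisor_get_info", 5.0),
    ("supervisor_list_runners", 5.0),
    ("supervisor_job_create", 10.0),
    ("job_full_lifecycle", 100.0),
    ("osiris_health_check", 2.0),
    ("concurrent_jobs (10)", 500.0),
];

/// Looks up the target for a benchmark name, if one is defined.
fn target_for(name: &str) -> Option<f64> {
    TARGETS_MS.iter().find(|(n, _)| *n == name).map(|(_, t)| *t)
}

/// Returns Some(true) if the measurement is strictly below the target,
/// Some(false) if it is not, and None if the benchmark has no target.
fn meets_target(name: &str, measured_ms: f64) -> Option<bool> {
    target_for(name).map(|t| measured_ms < t)
}

fn main() {
    // Hypothetical measurements in milliseconds, for illustration only.
    let measured = [
        ("supervisor_get_info", 3.2),
        ("osiris_health_check", 2.4),
        ("unknown_bench", 1.0),
    ];
    for (name, ms) in measured.iter() {
        match meets_target(name, *ms) {
            Some(true) => println!("{}: {} ms (within target)", name, ms),
            Some(false) => println!("{}: {} ms (over target)", name, ms),
            None => println!("{}: no target defined", name),
        }
    }
}
```
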
## Customization

To modify benchmark parameters, edit `benches/horus_stack.rs`:

```rust
// Change URLs
const SUPERVISOR_URL: &str = "http://127.0.0.1:3030";
const OSIRIS_URL: &str = "http://127.0.0.1:8081";

// Change admin secret
const ADMIN_SECRET: &str = "SECRET";

// Adjust concurrent job counts
for num_jobs in [1, 5, 10, 20, 50].iter() {
    // ...
}
```

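
If editing constants for every environment becomes tedious, one possible alternative is to read the values from environment variables and fall back to the defaults shown above. This is only a sketch of that idea, not how `benches/horus_stack.rs` currently works: the `env_or` helper and the variable names `HORUS_SUPERVISOR_URL`, `HORUS_OSIRIS_URL`, and `HORUS_ADMIN_SECRET` are hypothetical.

```rust
use std::env;

// Defaults copied from the constants in the customization example.
const DEFAULT_SUPERVISOR_URL: &str = "http://127.0.0.1:3030";
const DEFAULT_OSIRIS_URL: &str = "http://127.0.0.1:8081";
const DEFAULT_ADMIN_SECRET: &str = "SECRET";

/// Returns the value of the environment variable `key`, or `default`
/// when the variable is unset, empty, or not valid Unicode.
fn env_or(key: &str, default: &str) -> String {
    match env::var(key) {
        Ok(value) if !value.is_empty() => value,
        _ => default.to_string(),
    }
}

fn main() {
    // Hypothetical variable names; pick whatever fits your setup.
    let supervisor_url = env_or("HORUS_SUPERVISOR_URL", DEFAULT_SUPERVISOR_URL);
    let osiris_url = env_or("HORUS_OSIRIS_URL", DEFAULT_OSIRIS_URL);
    let admin_secret = env_or("HORUS_ADMIN_SECRET", DEFAULT_ADMIN_SECRET);

    println!("supervisor: {}", supervisor_url);
    println!("osiris:     {}", osiris_url);
    // Avoid printing the secret itself; report only whether it was overridden.
    println!("secret overridden: {}", admin_secret != DEFAULT_ADMIN_SECRET);
}
```
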
## CI/CD Integration

To shorten benchmark runs in CI and track regressions (the services listed under Prerequisites still need to be running):

```bash
# Quick mode: shorter, less precise runs
cargo bench --bench horus_stack -- --quick

# Save baseline for comparison
cargo bench --bench horus_stack -- --save-baseline main

# Compare against baseline
cargo bench --bench horus_stack -- --baseline main
```

## Troubleshooting

### "Connection refused" errors
- Ensure the Horus stack is running
- Check that all services are listening on the expected ports
- Verify firewall settings

### "Job execution timeout" errors
- Increase timeout values in the benchmark code
- Check that runners are properly registered
- Verify Redis is accessible

### Inconsistent results
- Close other applications to reduce system load
- Run benchmarks multiple times for statistical significance
- Use the `--warm-up-time` flag to lengthen the warm-up period

## Adding New Benchmarks

To add a new benchmark:

1. Create a new function in `benches/horus_stack.rs`:
```rust
fn bench_my_feature(c: &mut Criterion) {
    let rt = create_runtime();
    let client = /* create client */;

    c.bench_function("my_feature", |b| {
        b.to_async(&rt).iter(|| async {
            // Your benchmark code
        });
    });
}
```

2. Add it to the `criterion_group!` macro:
```rust
criterion_group!(
    benches,
    // ... existing benchmarks
    bench_my_feature,
);
```

## Resources

- [Criterion.rs Documentation](https://bheisler.github.io/criterion.rs/book/)
- [Horus Client Documentation](../lib/clients/)
- [Performance Tuning Guide](../docs/performance.md)