# Horus Stack Benchmarks

Comprehensive benchmark suite for the entire Horus stack, testing performance through the client APIs.

## Overview

These benchmarks test the full Horus system including:

- **Supervisor API** - Job management, runner coordination
- **Coordinator API** - Job routing and execution
- **Osiris API** - REST API for data queries

All benchmarks exercise the stack through the official client libraries in `/lib/clients`, which is the only supported way to interact with the system.

## Prerequisites

Before running benchmarks, you must have the Horus stack running:

```bash
# Start Redis
redis-server

# Start all Horus services
cd /Users/timurgordon/code/git.ourworld.tf/herocode/horus
RUST_LOG=info ./target/release/horus all --admin-secret SECRET --kill-ports
```

The benchmarks expect:

- **Supervisor** running on `http://127.0.0.1:3030`
- **Coordinator** running on `http://127.0.0.1:9652` (HTTP) and `ws://127.0.0.1:9653` (WebSocket)
- **Osiris** running on `http://127.0.0.1:8081`
- **Redis** running on `127.0.0.1:6379`
- Admin secret: `SECRET`

## Running Benchmarks

### Run all benchmarks

```bash
cargo bench --bench horus_stack
```

### Run a specific benchmark

```bash
cargo bench --bench horus_stack -- supervisor_discovery
```

### Run with a specific filter

```bash
cargo bench --bench horus_stack -- concurrent
```

### Generate detailed reports

```bash
cargo bench --bench horus_stack -- --verbose
```

## Benchmark Categories

### 1. API Discovery & Metadata (`horus_stack`)

- `supervisor_discovery` - OpenRPC metadata retrieval
- `supervisor_get_info` - Supervisor information and stats

### 2. Runner Management (`horus_stack`)

- `supervisor_list_runners` - List all registered runners
- `get_all_runner_status` - Get status of all runners

### 3. Job Operations (`horus_stack`)

- `supervisor_job_create` - Create job without execution
- `supervisor_job_list` - List all jobs
- `job_full_lifecycle` - Complete job lifecycle (create → execute → result)

### 4. Concurrency Tests (`horus_stack`)

- `concurrent_jobs` - Submit multiple jobs concurrently (1, 5, 10, 20 jobs); a sketch of this pattern follows the category list below

### 5. Health & Monitoring (`horus_stack`)

- `osiris_health_check` - Osiris server health endpoint

### 6. API Latency (`horus_stack`)

- `api_latency/supervisor_info` - Supervisor info latency
- `api_latency/runner_list` - Runner list latency
- `api_latency/job_list` - Job list latency

### 7. Stress Tests (`stress_test`)

- `stress_high_frequency_jobs` - High-frequency submissions (50-200 jobs)
- `stress_sustained_load` - Continuous load testing
- `stress_large_payloads` - Large payload handling (1KB-100KB)
- `stress_rapid_api_calls` - Rapid API calls (100 calls/iteration)
- `stress_mixed_workload` - Mixed operation scenarios
- `stress_connection_pool` - Connection pool exhaustion (10-100 clients)

### 8. Memory Usage (`memory_usage`)

- `memory_job_creation` - Memory per job object (10-200 jobs)
- `memory_client_creation` - Memory per client instance (1-100 clients)
- `memory_payload_sizes` - Memory vs payload size (1KB-1MB)

See [MEMORY_BENCHMARKS.md](./MEMORY_BENCHMARKS.md) for detailed memory profiling documentation.
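As an illustration of how the concurrency tests above are parameterized, the sketch below shows a Criterion group that submits batches of 1, 5, 10, and 20 jobs concurrently. It is a minimal sketch, not the actual benchmark code: it assumes Criterion's `async_tokio` feature, and `submit_job` is a hypothetical placeholder for a call made through the supervisor client in `/lib/clients`.

```rust
// Minimal sketch of a parameterized concurrency benchmark.
// Assumes criterion with the "async_tokio" feature and tokio with "rt-multi-thread".
// `submit_job` is a hypothetical stand-in for a supervisor client call.
use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion};
use tokio::runtime::Runtime;

async fn submit_job(payload: &str) {
    // Placeholder: in the real benchmark this would create a job,
    // execute it, and await the result through the client library.
    let _ = payload;
}

fn bench_concurrent_jobs(c: &mut Criterion) {
    let rt = Runtime::new().expect("failed to create Tokio runtime");
    let mut group = c.benchmark_group("concurrent_jobs");

    for num_jobs in [1usize, 5, 10, 20] {
        group.bench_with_input(
            BenchmarkId::from_parameter(num_jobs),
            &num_jobs,
            |b, &n| {
                b.to_async(&rt).iter(|| async move {
                    // Submit `n` jobs concurrently and wait for all of them.
                    let handles: Vec<_> =
                        (0..n).map(|_| tokio::spawn(submit_job("ping"))).collect();
                    for handle in handles {
                        handle.await.expect("job task panicked");
                    }
                });
            },
        );
    }
    group.finish();
}

criterion_group!(benches, bench_concurrent_jobs);
criterion_main!(benches);
```

The stress and memory groups listed above sweep similar parameters (job counts, payload sizes, client counts) and could follow the same `BenchmarkId::from_parameter` pattern.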
## Interpreting Results

Criterion outputs detailed statistics including:

- **Mean time** - Average execution time
- **Std deviation** - Variability in measurements
- **Median** - Middle value (50th percentile)
- **MAD** - Median Absolute Deviation
- **Throughput** - Operations per second

Results are saved in `target/criterion/` with:

- HTML reports with graphs
- JSON data for further analysis
- Historical comparison with previous runs

## Performance Targets

Expected performance (on modern hardware):

| Benchmark | Target | Notes |
|-----------|--------|-------|
| supervisor_discovery | < 10ms | Metadata retrieval |
| supervisor_get_info | < 5ms | Simple info query |
| supervisor_list_runners | < 5ms | List operation |
| supervisor_job_create | < 10ms | Job creation only |
| job_full_lifecycle | < 100ms | Full execution cycle |
| osiris_health_check | < 2ms | Health endpoint |
| concurrent_jobs (10) | < 500ms | 10 parallel jobs |

## Customization

To modify benchmark parameters, edit `benches/horus_stack.rs`:

```rust
// Change URLs
const SUPERVISOR_URL: &str = "http://127.0.0.1:3030";
const OSIRIS_URL: &str = "http://127.0.0.1:8081";

// Change admin secret
const ADMIN_SECRET: &str = "SECRET";

// Adjust concurrent job counts
for num_jobs in [1, 5, 10, 20, 50].iter() {
    // ...
}
```

## CI/CD Integration

When running benchmarks in CI (the stack must still be running; see Prerequisites), the following Criterion options are useful:

```bash
# Run only fast benchmarks
cargo bench --bench horus_stack -- --quick

# Save baseline for comparison
cargo bench --bench horus_stack -- --save-baseline main

# Compare against baseline
cargo bench --bench horus_stack -- --baseline main
```

## Troubleshooting

### "Connection refused" errors

- Ensure the Horus stack is running
- Check that all services are listening on the expected ports
- Verify firewall settings

### "Job execution timeout" errors

- Increase timeout values in the benchmark code
- Check that runners are properly registered
- Verify Redis is accessible

### Inconsistent results

- Close other applications to reduce system load
- Run benchmarks multiple times for statistical significance
- Use the `--warm-up-time` flag to increase the warm-up period

## Adding New Benchmarks

To add a new benchmark:

1. Create a new function in `benches/horus_stack.rs`:

```rust
fn bench_my_feature(c: &mut Criterion) {
    let rt = create_runtime();
    let client = /* create client */;

    c.bench_function("my_feature", |b| {
        b.to_async(&rt).iter(|| async {
            // Your benchmark code
        });
    });
}
```

2. Add it to the `criterion_group!`:

```rust
criterion_group!(
    benches,
    // ... existing benchmarks
    bench_my_feature,
);
```

A self-contained skeleton combining these two steps is sketched at the end of this document.

## Resources

- [Criterion.rs Documentation](https://bheisler.github.io/criterion.rs/book/)
- [Horus Client Documentation](../lib/clients/)
- [Performance Tuning Guide](../docs/performance.md)
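For reference, here is a self-contained sketch combining the two steps from "Adding New Benchmarks" into a complete benchmark file, including the imports and `criterion_main!` wiring that the snippets above omit. `MyClient` and `my_feature_call` are hypothetical placeholders for a client from `/lib/clients` and the operation under test; the sketch assumes Criterion's `async_tokio` feature and a Tokio runtime.

```rust
// Minimal, self-contained skeleton for a new benchmark file.
// Assumes criterion with the "async_tokio" feature and tokio with "rt-multi-thread".
use criterion::{criterion_group, criterion_main, Criterion};
use tokio::runtime::Runtime;

// Hypothetical placeholder for a client from /lib/clients.
struct MyClient;

impl MyClient {
    fn new() -> Self {
        // Placeholder: construct the real supervisor/coordinator/Osiris client here.
        MyClient
    }

    async fn my_feature_call(&self) {
        // Placeholder: the API call being benchmarked.
    }
}

fn bench_my_feature(c: &mut Criterion) {
    let rt = Runtime::new().expect("failed to create Tokio runtime");
    let client = MyClient::new();

    c.bench_function("my_feature", |b| {
        b.to_async(&rt).iter(|| async {
            client.my_feature_call().await;
        });
    });
}

criterion_group!(benches, bench_my_feature);
criterion_main!(benches);
```

As with the existing benches, the file needs a `[[bench]]` entry with `harness = false` in `Cargo.toml` for `cargo bench` to pick it up.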