# Horus Stack Benchmarks - Summary

## ✅ Created Comprehensive Benchmark Suite

Successfully created a complete benchmark suite for the Horus stack that tests the entire system through the official client APIs.

### Files Created
- `benches/horus_stack.rs` - Main benchmark suite
  - API discovery and metadata retrieval
  - Runner management operations
  - Job lifecycle testing
  - Concurrent job submissions (1, 5, 10, 20 jobs)
  - Health checks
  - API latency measurements
- `benches/stress_test.rs` - Stress and load testing
  - High-frequency job submissions (50-200 jobs)
  - Sustained load testing
  - Large payload handling (1KB-100KB)
  - Rapid API calls (100 calls/iteration)
  - Mixed workload scenarios
  - Connection pool exhaustion tests (10-100 clients)
- `benches/memory_usage.rs` - Memory profiling
  - Job object memory footprint (10-200 jobs)
  - Client instance memory overhead (1-100 clients)
  - Payload size impact on memory (1KB-1MB)
  - Real-time memory delta reporting
- `benches/README.md` - Comprehensive documentation
  - Setup instructions
  - Benchmark descriptions
  - Performance targets
  - Customization guide
  - Troubleshooting tips
- `benches/QUICK_START.md` - Quick reference guide
  - Fast setup steps
  - Common commands
  - Expected performance metrics
- `benches/MEMORY_BENCHMARKS.md` - Memory profiling guide
  - Memory benchmark descriptions
  - Platform-specific measurement details
  - Advanced profiling tools
  - Memory optimization tips
- `benches/run_benchmarks.sh` - Helper script
  - Automated prerequisite checking
  - Service health verification
  - One-command benchmark execution
## Architecture

The benchmarks interact with the Horus stack exclusively through the client libraries:

- `hero-supervisor-openrpc-client` - Supervisor API (job management, runner coordination)
- `osiris-client` - Osiris REST API (data queries)
- `hero-job` - Job model definitions

This ensures the benchmarks test the real-world API surface that users interact with.
## Key Features

- ✅ **Async/await support** - Uses Criterion's `async_tokio` feature
- ✅ **Realistic workloads** - Tests actual job submission and execution
- ✅ **Concurrent testing** - Measures performance under parallel load
- ✅ **Stress testing** - Pushes system limits with high-frequency operations
- ✅ **HTML reports** - Beautiful visualizations with historical comparison
- ✅ **Automated checks** - Helper script verifies the stack is running
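The suites follow Criterion's `async_tokio` pattern. Below is a minimal, self-contained sketch of that pattern; the bare `reqwest::get` call and the URL are stand-ins, since the real benchmarks go through the client crates listed under Architecture.

```rust
use criterion::{criterion_group, criterion_main, Criterion};
use tokio::runtime::Runtime;

// Stand-in for a client call; the real suites use
// hero-supervisor-openrpc-client / osiris-client instead.
async fn health_check(url: &str) -> reqwest::Result<()> {
    reqwest::get(url).await?.error_for_status()?;
    Ok(())
}

fn bench_health(c: &mut Criterion) {
    let rt = Runtime::new().expect("failed to build Tokio runtime");
    c.bench_function("osiris_health_check", |b| {
        // to_async wires the Tokio runtime into Criterion's timing loop.
        b.to_async(&rt).iter(|| health_check("http://127.0.0.1:8081"));
    });
}

criterion_group!(benches, bench_health);
criterion_main!(benches);
```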
## Benchmark Categories

### Performance Benchmarks (`horus_stack`)

- `supervisor_discovery` - OpenRPC metadata (target: <10ms)
- `supervisor_get_info` - Info retrieval (target: <5ms)
- `supervisor_list_runners` - List operations (target: <5ms)
- `supervisor_job_create` - Job creation (target: <10ms)
- `supervisor_job_list` - Job listing (target: <10ms)
- `osiris_health_check` - Health endpoint (target: <2ms)
- `job_full_lifecycle` - Complete job cycle (target: <100ms)
- `concurrent_jobs` - Parallel submissions (target: <500ms for 10 jobs)
- `get_all_runner_status` - Status queries
- `api_latency/*` - Detailed latency measurements
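For a one-off spot check against any of these latency targets outside the harness, `std::time::Instant` is enough; a sketch, again with a plain HTTP GET standing in for the client call:

```rust
use std::time::{Duration, Instant};

// One-off latency probe; Criterion's statistics replace this
// in the actual api_latency/* benchmarks.
async fn spot_check(url: &str) -> reqwest::Result<Duration> {
    let start = Instant::now();
    reqwest::get(url).await?.error_for_status()?;
    Ok(start.elapsed())
}
```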
### Stress Tests (`stress_test`)

- `stress_high_frequency_jobs` - 50-200 concurrent jobs
- `stress_sustained_load` - Continuous submissions over time
- `stress_large_payloads` - 1KB-100KB payload handling
- `stress_rapid_api_calls` - 100 rapid calls per iteration
- `stress_mixed_workload` - Combined operations
- `stress_connection_pool` - 10-100 concurrent clients
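The high-frequency and connection-pool tests share one pattern: spawn N tasks and await them all. A sketch with a hypothetical `submit_job` stub in place of the real supervisor-client call:

```rust
use tokio::task::JoinSet;

// Hypothetical stand-in for a supervisor-client submission.
async fn submit_job(i: usize) {
    let _ = i; // the real call would build and send a Job here
}

// Submit `n` jobs concurrently and wait for all of them to finish.
async fn submit_batch(n: usize) {
    let mut set = JoinSet::new();
    for i in 0..n {
        set.spawn(submit_job(i));
    }
    while let Some(res) = set.join_next().await {
        res.expect("submission task panicked");
    }
}
```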
### Memory Profiling (`memory_usage`)

- `memory_job_creation` - Memory footprint per job (10-200 jobs)
- `memory_client_creation` - Memory per client instance (1-100 clients)
- `memory_payload_sizes` - Memory vs payload size (1KB-1MB)
- Reports memory deltas in real-time during execution
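Rust has no portable built-in for process memory, so delta reporting must read a platform-specific source; `MEMORY_BENCHMARKS.md` covers the per-platform details. A minimal Linux-only sketch that samples `VmRSS` from `/proc/self/status` before and after an allocation:

```rust
// Linux-only sketch: current resident set size in KiB.
fn rss_kib() -> Option<u64> {
    std::fs::read_to_string("/proc/self/status")
        .ok()?
        .lines()
        .find(|line| line.starts_with("VmRSS:"))?
        .split_whitespace()
        .nth(1)? // "VmRSS:   12345 kB" -> "12345"
        .parse()
        .ok()
}

fn main() {
    let before = rss_kib().unwrap_or(0);
    let jobs: Vec<String> = (0..200).map(|i| format!("job-{i}")).collect();
    let after = rss_kib().unwrap_or(0);
    println!("delta: {} KiB for {} objects", after.saturating_sub(before), jobs.len());
}
```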
## Usage

```bash
# Quick start
./benches/run_benchmarks.sh

# Run specific suite
cargo bench --bench horus_stack
cargo bench --bench stress_test
cargo bench --bench memory_usage

# Run specific test
cargo bench -- supervisor_discovery

# Run memory benchmarks with verbose output (shows memory deltas)
cargo bench --bench memory_usage -- --verbose

# Save baseline
cargo bench -- --save-baseline main

# Compare against baseline
cargo bench -- --baseline main
```
## Prerequisites

The benchmarks require the full Horus stack to be running:

```bash
# Start Redis
redis-server

# Start Horus (with auto port cleanup)
RUST_LOG=info ./target/release/horus all --admin-secret SECRET --kill-ports
```
## Configuration

All benchmarks use these defaults (configurable in source):

- Supervisor: `http://127.0.0.1:3030`
- Osiris: `http://127.0.0.1:8081`
- Coordinator HTTP: `http://127.0.0.1:9652`
- Coordinator WS: `ws://127.0.0.1:9653`
- Admin secret: `SECRET`
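In the benchmark sources these defaults would typically live as constants near the top of each file; a sketch with hypothetical names, mirroring the values above:

```rust
// Hypothetical constant names; edit these to target a different deployment.
const SUPERVISOR_URL: &str = "http://127.0.0.1:3030";
const OSIRIS_URL: &str = "http://127.0.0.1:8081";
const COORDINATOR_HTTP_URL: &str = "http://127.0.0.1:9652";
const COORDINATOR_WS_URL: &str = "ws://127.0.0.1:9653";
const ADMIN_SECRET: &str = "SECRET";
```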
## Results

Results are saved to `target/criterion/` with:
- HTML reports with graphs and statistics
- JSON data for programmatic analysis
- Historical comparison with previous runs
- Detailed performance metrics (mean, median, std dev, throughput)
## Integration

The benchmarks are integrated into the workspace:

- Added to `Cargo.toml` with proper dependencies
- Uses workspace-level dependencies for consistency
- Configured with `harness = false` for Criterion
- Includes all necessary dev-dependencies
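The `Cargo.toml` wiring follows the standard Criterion pattern; a sketch (the workspace file is authoritative):

```toml
[dev-dependencies]
criterion = { version = "0.5", features = ["async_tokio", "html_reports"] }

[[bench]]
name = "horus_stack"
harness = false
```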
## Next Steps
- Run benchmarks to establish baseline performance
- Monitor performance over time as code changes
- Use stress tests to identify bottlenecks
- Customize benchmarks for specific use cases
- Integrate into CI/CD for automated performance tracking
## Technical Details

### Dependencies Added

- `criterion` v0.5 with the `async_tokio` and `html_reports` features
- `osiris-client` from the workspace
- `reqwest` v0.12 with the `json` feature
- `serde_json`, `uuid`, `chrono` from the workspace
### Benchmark Harness
Uses Criterion.rs for:
- Statistical analysis
- Historical comparison
- HTML report generation
- Configurable sample sizes
- Warm-up periods
- Outlier detection
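Sample sizes and warm-up periods are tuned through the `config` form of `criterion_group!`; a sketch with illustrative values:

```rust
use criterion::{criterion_group, criterion_main, Criterion};
use std::time::Duration;

fn bench_noop(c: &mut Criterion) {
    c.bench_function("noop", |b| b.iter(|| ()));
}

// Illustrative values; tune per suite.
fn configured() -> Criterion {
    Criterion::default()
        .sample_size(50)                          // measurements per benchmark
        .warm_up_time(Duration::from_secs(1))     // warm-up before sampling
        .measurement_time(Duration::from_secs(5)) // time budget per benchmark
}

criterion_group! {
    name = benches;
    config = configured();
    targets = bench_noop
}
criterion_main!(benches);
```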
### Job Creation

The helper function `create_test_job()` creates properly structured `Job` instances:
- Unique UUIDs for each job
- Proper timestamps
- JSON-serialized payloads
- Empty signatures (for testing)
- Configurable runner and command
This ensures benchmarks test realistic job structures that match production usage.
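A sketch of the kind of data such a helper assembles, using only the listed workspace crates; the actual `Job` fields are defined in `hero-job` and may differ:

```rust
use chrono::Utc;
use uuid::Uuid;

// Hypothetical helper mirroring create_test_job(); field layout is illustrative.
fn test_job_parts(runner: &str, command: &str) -> (String, i64, String, Vec<String>) {
    let id = Uuid::new_v4().to_string();      // unique UUID per job
    let created_at = Utc::now().timestamp();  // proper timestamp
    let payload = serde_json::json!({         // JSON-serialized payload
        "runner": runner,
        "command": command,
    })
    .to_string();
    let signatures = Vec::new();              // empty signatures (testing only)
    (id, created_at, payload, signatures)
}
```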