# Horus Stack Benchmarks - Summary

## ✅ Created Comprehensive Benchmark Suite

The benchmark suite for the Horus stack exercises the whole system through the official client APIs.

### Files Created

1. **`benches/horus_stack.rs`** - Main benchmark suite
   - API discovery and metadata retrieval
   - Runner management operations
   - Job lifecycle testing
   - Concurrent job submissions (1, 5, 10, 20 jobs)
   - Health checks
   - API latency measurements

2. **`benches/stress_test.rs`** - Stress and load testing
   - High-frequency job submissions (50-200 jobs)
   - Sustained load testing
   - Large payload handling (1KB-100KB)
   - Rapid API calls (100 calls/iteration)
   - Mixed workload scenarios
   - Connection pool exhaustion tests (10-100 clients)

3. **`benches/memory_usage.rs`** - Memory profiling
   - Job object memory footprint (10-200 jobs)
   - Client instance memory overhead (1-100 clients)
   - Payload size impact on memory (1KB-1MB)
   - Real-time memory delta reporting

4. **`benches/README.md`** - Comprehensive documentation
   - Setup instructions
   - Benchmark descriptions
   - Performance targets
   - Customization guide
   - Troubleshooting tips

5. **`benches/QUICK_START.md`** - Quick reference guide
   - Fast setup steps
   - Common commands
   - Expected performance metrics

6. **`benches/MEMORY_BENCHMARKS.md`** - Memory profiling guide
   - Memory benchmark descriptions
   - Platform-specific measurement details
   - Advanced profiling tools
   - Memory optimization tips

7. **`benches/run_benchmarks.sh`** - Helper script
   - Automated prerequisite checking
   - Service health verification
   - One-command benchmark execution

### Architecture

The benchmarks interact with the Horus stack exclusively through the client libraries:

- **`hero-supervisor-openrpc-client`** - Supervisor API (job management, runner coordination)
- **`osiris-client`** - Osiris REST API (data queries)
- **`hero-job`** - Job model definitions

This ensures benchmarks test the real-world API surface that users interact with.

### Key Features

- ✅ **Async/await support** - Uses Criterion's `async_tokio` feature
- ✅ **Realistic workloads** - Tests actual job submission and execution
- ✅ **Concurrent testing** - Measures performance under parallel load
- ✅ **Stress testing** - Pushes system limits with high-frequency operations
- ✅ **HTML reports** - Plots and statistics with comparison against previous runs
- ✅ **Automated checks** - Helper script verifies that the stack is running

### Benchmark Categories

#### Performance Benchmarks (`horus_stack`)

- `supervisor_discovery` - OpenRPC metadata (target: <10ms)
- `supervisor_get_info` - Info retrieval (target: <5ms)
- `supervisor_list_runners` - List operations (target: <5ms)
- `supervisor_job_create` - Job creation (target: <10ms)
- `supervisor_job_list` - Job listing (target: <10ms)
- `osiris_health_check` - Health endpoint (target: <2ms)
- `job_full_lifecycle` - Complete job cycle (target: <100ms)
- `concurrent_jobs` - Parallel submissions (target: <500ms for 10 jobs)
- `get_all_runner_status` - Status queries
- `api_latency/*` - Detailed latency measurements

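
The targets above can be encoded as data so that a script or CI step can flag a benchmark that misses its target. The sketch below is a minimal illustration using only the standard library. The `TARGETS` table and the `meets_target` helper are hypothetical names that are not part of the suite, and the measured values would have to be taken from Criterion's output.

```rust
use std::time::Duration;

/// Latency targets copied from the list above: (benchmark name, upper bound).
/// The table and helper are illustrative and not part of the benchmark suite.
const TARGETS: &[(&str, Duration)] = &[
    ("supervisor_discovery", Duration::from_millis(10)),
    ("supervisor_get_info", Duration::from_millis(5)),
    ("supervisor_list_runners", Duration::from_millis(5)),
    ("supervisor_job_create", Duration::from_millis(10)),
    ("supervisor_job_list", Duration::from_millis(10)),
    ("osiris_health_check", Duration::from_millis(2)),
    ("job_full_lifecycle", Duration::from_millis(100)),
    // This target applies to the 10-job case.
    ("concurrent_jobs", Duration::from_millis(500)),
];

/// Returns `Some(true)` if `measured` is strictly below the target,
/// `Some(false)` if it is not, and `None` if the benchmark has no target.
fn meets_target(name: &str, measured: Duration) -> Option<bool> {
    TARGETS
        .iter()
        .find(|(n, _)| *n == name)
        .map(|(_, limit)| measured < *limit)
}
```

Benchmarks listed without a target, such as `get_all_runner_status`, yield `None` rather than a pass or a fail, so a caller can decide how to treat them.
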
#### Stress Tests (`stress_test`)

- `stress_high_frequency_jobs` - 50-200 concurrent jobs
- `stress_sustained_load` - Continuous submissions over time
- `stress_large_payloads` - 1KB-100KB payload handling
- `stress_rapid_api_calls` - 100 rapid calls per iteration
- `stress_mixed_workload` - Combined operations
- `stress_connection_pool` - 10-100 concurrent clients

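
Several of these tests follow the same fan-out pattern: start N submissions at once and time the whole batch. A minimal sketch of that pattern follows. It uses standard-library threads to stay dependency-free, whereas the suite itself runs on Tokio through Criterion's `async_tokio` feature. `submit_job` is a hypothetical stand-in for a real client call, and `run_batch` is an illustrative name.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a client call that submits one job.
/// A real benchmark would call the supervisor client here.
fn submit_job(id: usize) -> Result<usize, String> {
    thread::sleep(Duration::from_millis(1)); // simulate a network round trip
    Ok(id)
}

/// Submits `n` jobs in parallel and returns (successful submissions, wall time).
fn run_batch(n: usize) -> (usize, Duration) {
    let start = Instant::now();

    // Fan out: one thread per submission.
    let handles: Vec<_> = (0..n)
        .map(|id| thread::spawn(move || submit_job(id)))
        .collect();

    // Fan in: wait for every thread and count the successful submissions.
    let ok = handles
        .into_iter()
        .filter_map(|h| h.join().ok()) // skip threads that panicked
        .filter(|result| result.is_ok())
        .count();

    (ok, start.elapsed())
}
```

Calling `run_batch(50)` and `run_batch(200)` would mirror the lower and upper bounds of `stress_high_frequency_jobs`. Counting successes separately from elapsed time keeps a fast but failing run from looking like a good result.
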
#### Memory Profiling (`memory_usage`)

- `memory_job_creation` - Memory footprint per job (10-200 jobs)
- `memory_client_creation` - Memory per client instance (1-100 clients)
- `memory_payload_sizes` - Memory vs payload size (1KB-1MB)
- Reports memory deltas in real-time during execution

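
Reporting a memory delta means sampling process memory before and after an operation. This summary does not say how the suite takes its samples, so the sketch below is only one possible approach: it assumes Linux and reads the `VmRSS` field of `/proc/self/status`. All function names are hypothetical.

```rust
use std::fs;

/// Extracts the resident set size in kB from the text of `/proc/self/status`.
/// Returns `None` if the `VmRSS` line is missing or malformed.
fn parse_vm_rss_kb(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("VmRSS:"))
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|kb| kb.parse().ok())
}

/// Reads the current resident set size in kB (Linux only; `None` elsewhere).
fn current_rss_kb() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    parse_vm_rss_kb(&status)
}

/// Runs `op` and returns its result plus the change in resident memory in kB.
/// The delta can be negative if pages are released while `op` runs.
fn measure_delta_kb<T>(op: impl FnOnce() -> T) -> (T, Option<i64>) {
    let before = current_rss_kb();
    let out = op();
    let after = current_rss_kb();
    let delta = match (before, after) {
        (Some(b), Some(a)) => Some(a as i64 - b as i64),
        _ => None, // sampling is unavailable on this platform
    };
    (out, delta)
}
```

Resident set size is a coarse signal: it moves in whole pages and includes allocator overhead, so small per-job footprints are only visible when many jobs are created in one measurement, which is consistent with the 10-200 job ranges listed above.
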
### Usage

```bash
# Quick start
./benches/run_benchmarks.sh

# Run specific suite
cargo bench --bench horus_stack
cargo bench --bench stress_test
cargo bench --bench memory_usage

# Run specific test
cargo bench -- supervisor_discovery

# Run memory benchmarks with verbose output (shows memory deltas)
cargo bench --bench memory_usage -- --verbose

# Save baseline
cargo bench -- --save-baseline main

# Compare against baseline
cargo bench -- --baseline main
```

### Prerequisites

The benchmarks require the full Horus stack to be running:

```bash
# Start Redis
redis-server

# Start Horus (with auto port cleanup)
RUST_LOG=info ./target/release/horus all --admin-secret SECRET --kill-ports
```

### Configuration

All benchmarks use these defaults (configurable in source):

- Supervisor: `http://127.0.0.1:3030`
- Osiris: `http://127.0.0.1:8081`
- Coordinator HTTP: `http://127.0.0.1:9652`
- Coordinator WS: `ws://127.0.0.1:9653`
- Admin secret: `SECRET`

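
One way to keep these defaults in a single place is a small configuration struct with a `Default` implementation. The sketch below is illustrative only: the `BenchConfig` type and its field names are assumptions, and the benchmark sources may store these values differently, for example as constants. The values themselves are the ones listed above.

```rust
/// Endpoints and credentials used by the benchmarks.
/// The struct and its field names are hypothetical; the default values are
/// the ones listed in this section.
#[derive(Debug, Clone, PartialEq)]
struct BenchConfig {
    supervisor_url: String,
    osiris_url: String,
    coordinator_http_url: String,
    coordinator_ws_url: String,
    admin_secret: String,
}

impl Default for BenchConfig {
    fn default() -> Self {
        Self {
            supervisor_url: "http://127.0.0.1:3030".into(),
            osiris_url: "http://127.0.0.1:8081".into(),
            coordinator_http_url: "http://127.0.0.1:9652".into(),
            coordinator_ws_url: "ws://127.0.0.1:9653".into(),
            admin_secret: "SECRET".into(),
        }
    }
}
```

With this shape, a benchmark that targets a different deployment overrides only the field it needs through struct update syntax, for example `BenchConfig { admin_secret: "other".into(), ..BenchConfig::default() }`.
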
### Results

Results are saved to `target/criterion/` with:

- HTML reports with graphs and statistics
- JSON data for programmatic analysis
- Historical comparison with previous runs
- Detailed performance metrics (mean, median, std dev, throughput)

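
For readers who post-process the raw timings, the three summary metrics named above follow their textbook definitions. The sketch below computes them over a slice of samples in milliseconds. The `Summary` type and `summarize` function are hypothetical, and this is not how Criterion produces its figures, since Criterion also reports confidence intervals that this sketch does not attempt.

```rust
/// Summary statistics over a set of timing samples, in milliseconds.
#[derive(Debug, PartialEq)]
struct Summary {
    mean: f64,
    median: f64,
    std_dev: f64,
}

/// Computes mean, median, and sample standard deviation (n - 1 denominator).
/// Returns `None` for fewer than two samples, where the sample standard
/// deviation is undefined.
fn summarize(samples: &[f64]) -> Option<Summary> {
    let n = samples.len();
    if n < 2 {
        return None;
    }

    let mean = samples.iter().sum::<f64>() / n as f64;

    // The median needs the samples in order; sort a copy to leave the input intact.
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let median = if n % 2 == 1 {
        sorted[n / 2]
    } else {
        (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0
    };

    let variance = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1) as f64;

    Some(Summary { mean, median, std_dev: variance.sqrt() })
}
```

A median well below the mean usually points to a few slow outliers, which is worth checking before a run is saved as a baseline.
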
### Integration

The benchmarks are integrated into the workspace:

- Registered in `Cargo.toml` with the required dependencies
- Use workspace-level dependencies for consistency
- Configured with `harness = false` for Criterion
- Include all necessary dev-dependencies

### Next Steps

1. Run benchmarks to establish baseline performance
2. Monitor performance over time as code changes
3. Use stress tests to identify bottlenecks
4. Customize benchmarks for specific use cases
5. Integrate into CI/CD for automated performance tracking

## Technical Details

### Dependencies Added

- `criterion` v0.5 with `async_tokio` and `html_reports` features
- `osiris-client` from workspace
- `reqwest` v0.12 with `json` feature
- `serde_json`, `uuid`, `chrono` from workspace

### Benchmark Harness

Uses Criterion.rs for:

- Statistical analysis
- Historical comparison
- HTML report generation
- Configurable sample sizes
- Warm-up periods
- Outlier detection

### Job Creation

Helper function `create_test_job()` creates properly structured Job instances:

- Unique UUIDs for each job
- Proper timestamps
- JSON-serialized payloads
- Empty signatures (for testing)
- Configurable runner and command

This ensures benchmarks test realistic job structures that match production usage.

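
The properties listed for `create_test_job()` can be illustrated with a dependency-free sketch. Everything here is an assumption made for illustration: the `TestJob` struct is a simplified stand-in for the real model in `hero-job`, the function signature is guessed from "configurable runner and command", and a timestamp plus counter replaces the UUIDs that the actual helper generates.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

/// Simplified stand-in for the job model. The real type lives in `hero-job`
/// and its fields may differ.
#[derive(Debug, Clone)]
struct TestJob {
    id: String,
    runner: String,
    command: String,
    payload: String,         // JSON text
    created_at_ms: u128,     // milliseconds since the Unix epoch
    signatures: Vec<String>, // left empty for benchmark jobs
}

/// Process-wide counter that keeps IDs unique within one benchmark run.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Builds a job with a process-unique ID, the current timestamp, a JSON
/// payload, and no signatures. The suite uses UUIDs; a counter is used here
/// only to avoid an external dependency.
fn create_test_job(runner: &str, command: &str) -> TestJob {
    let created_at_ms = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis())
        .unwrap_or(0);
    let seq = NEXT_ID.fetch_add(1, Ordering::Relaxed);

    TestJob {
        id: format!("bench-{created_at_ms}-{seq}"),
        runner: runner.to_string(),
        command: command.to_string(),
        payload: format!(r#"{{"seq":{seq}}}"#),
        created_at_ms,
        signatures: Vec::new(),
    }
}
```

Generating a fresh ID on every call matters for benchmarking, because repeated submissions with the same ID could be rejected or deduplicated and the measurement would then no longer reflect a real job submission.
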