# HeroDB Performance Benchmarking Guide

## Overview

This document describes the comprehensive benchmarking suite for HeroDB, designed to measure and compare the performance characteristics of the two storage backends: **redb** (default) and **sled**.

## Benchmark Architecture

### Design Principles

1. **Fair Comparison**: Identical test datasets and operations across all backends
2. **Statistical Rigor**: Criterion is used for statistically sound measurements
3. **Real-World Scenarios**: A mix of synthetic and realistic workload patterns
4. **Reproducibility**: Deterministic test data generation with fixed seeds
5. **Isolation**: Each benchmark runs in a clean environment

### Benchmark Categories

#### 1. Single-Operation CRUD Benchmarks
Measures the performance of individual database operations (see the sketch after this list):

- **String Operations**
  - `SET` - Write a single key-value pair
  - `GET` - Read a single key-value pair
  - `DEL` - Delete a single key
  - `EXISTS` - Check key existence

- **Hash Operations**
  - `HSET` - Set a single field in a hash
  - `HGET` - Get a single field from a hash
  - `HGETALL` - Get all fields from a hash
  - `HDEL` - Delete a field from a hash
  - `HEXISTS` - Check field existence

- **List Operations**
  - `LPUSH` - Push to list head
  - `RPUSH` - Push to list tail
  - `LPOP` - Pop from list head
  - `RPOP` - Pop from list tail
  - `LRANGE` - Get a range of elements

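As a minimal sketch, a single-operation benchmark might look like the following. It assumes the `Storage` API shown under Backend Setup below; the exact import path depends on the crate layout.

```rust
use criterion::{criterion_group, criterion_main, Criterion};
use tempfile::TempDir;

fn bench_set(c: &mut Criterion) {
    // Isolated database instance, as described under Backend Setup.
    // `Storage` is HeroDB's redb-backed storage type.
    let temp_dir = TempDir::new().unwrap();
    let storage = Storage::new(temp_dir.path().join("bench.db"), false, None).unwrap();

    let mut i: u64 = 0;
    c.bench_function("single_ops/redb/set/small", |b| {
        b.iter(|| {
            // Unique, sortable keys in the documented bench:key:{id} format.
            let key = format!("bench:key:{:08}", i);
            i += 1;
            storage.set(key, "value".to_string()).unwrap();
        })
    });
}

criterion_group!(benches, bench_set);
criterion_main!(benches);
```
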
#### 2. Bulk Operation Benchmarks
Tests throughput with varying batch sizes (see the sketch after this list):

- **Bulk Insert**: 100, 1,000, and 10,000 records
- **Bulk Read**: Sequential and random access patterns
- **Bulk Update**: Modify existing records
- **Bulk Delete**: Remove multiple records

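Declaring throughput lets Criterion report records per second alongside latency. A sketch of a bulk-insert benchmark, assuming the `Storage` type from Backend Setup and the `generate_test_data` helper from Data Generation below:

```rust
use criterion::{BatchSize, BenchmarkId, Criterion, Throughput};
use tempfile::TempDir;

fn bench_bulk_insert(c: &mut Criterion) {
    let temp_dir = TempDir::new().unwrap();
    let storage = Storage::new(temp_dir.path().join("bench.db"), false, None).unwrap();

    let mut group = c.benchmark_group("bulk_ops/redb/insert");
    for &size in &[100usize, 1_000, 10_000] {
        // Criterion divides elapsed time by this count to report records/sec.
        group.throughput(Throughput::Elements(size as u64));
        group.bench_with_input(BenchmarkId::from_parameter(size), &size, |b, &size| {
            b.iter_batched(
                || generate_test_data(size, 42), // fixed seed, built outside the timing loop
                |data| {
                    for (key, value) in data {
                        storage.set(key, value).unwrap();
                    }
                },
                BatchSize::SmallInput,
            )
        });
    }
    group.finish();
}
```
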
#### 3. Query and Scan Benchmarks
Evaluates iteration and filtering performance (see the sketch after this list):

- **SCAN**: Cursor-based key iteration
- **HSCAN**: Hash field iteration
- **KEYS**: Pattern matching (with various patterns)
- **Range Queries**: List range operations

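For illustration, a KEYS-style benchmark might look like this. Both `setup_populated_storage` and the `keys` method are hypothetical stand-ins; the real suite would use the helpers in `benches/common/` and whatever scan API the backends expose.

```rust
use criterion::Criterion;

fn bench_keys(c: &mut Criterion) {
    // Hypothetical helper: a storage instance pre-loaded with 10,000
    // records in the bench:key:{id} format.
    let storage = setup_populated_storage(10_000);

    c.bench_function("scan_ops/redb/keys/prefix", |b| {
        // `keys` is assumed to implement KEYS-style glob matching.
        b.iter(|| storage.keys("bench:key:*").unwrap())
    });
}
```
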
#### 4. Concurrent Operation Benchmarks
Simulates multi-client scenarios:

- **10 Concurrent Clients**: Light load
- **50 Concurrent Clients**: Medium load
- **Mixed Workload**: 70% reads, 30% writes

#### 5. Memory Profiling
Tracks memory usage patterns:

- **Allocation Tracking**: Total allocations per operation
- **Peak Memory**: Maximum memory usage
- **Memory Efficiency**: Bytes per record stored

### Test Data Specifications

#### Dataset Sizes
- **Small**: 1,000 - 10,000 records
- **Medium**: 10,000 records (primary focus)

#### Data Characteristics
- **Key Format**: `bench:key:{id}` (predictable, sortable)
- **Value Sizes**:
  - Small: 50-100 bytes
  - Medium: 500-1000 bytes
  - Large: 5000-10000 bytes
- **Hash Fields**: 5-20 fields per hash
- **List Elements**: 10-100 elements per list

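The value-size classes above can be produced by a small helper. A sketch of one possible `generate_value` (the name matches the call in the Data Generation section below, but the actual implementation in `data_generator.rs` may differ):

```rust
use rand::Rng;
use rand::rngs::StdRng;

/// Generate a printable ASCII value whose length falls in the documented
/// range for a size class (e.g. 50-100 bytes when `max_len` is 100).
fn generate_value(rng: &mut StdRng, max_len: usize) -> String {
    let len = rng.gen_range(max_len / 2..=max_len);
    (0..len).map(|_| rng.gen_range(b'a'..=b'z') as char).collect()
}
```
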
### Metrics Collected

For each benchmark, we collect:

1. **Latency Metrics**
   - Mean execution time
   - Median (p50)
   - 95th percentile (p95)
   - 99th percentile (p99)
   - Standard deviation

2. **Throughput Metrics**
   - Operations per second
   - Records per second (for bulk operations)

3. **Memory Metrics**
   - Total allocations
   - Peak memory usage
   - Average bytes per operation

4. **Initialization Overhead**
   - Database startup time
   - First operation latency (cold cache)

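Criterion's sample size and measurement time control how tightly these statistics converge. A sketch of a custom configuration (the values are illustrative, not the suite's actual settings; `bench_set` is the earlier single-operation sketch):

```rust
use std::time::Duration;
use criterion::{criterion_group, Criterion};

fn custom_criterion() -> Criterion {
    Criterion::default()
        .sample_size(200)                          // more samples -> tighter percentile estimates
        .measurement_time(Duration::from_secs(10)) // longer runs reduce noise
        .warm_up_time(Duration::from_secs(3))      // warm caches before measuring
}

criterion_group! {
    name = benches;
    config = custom_criterion();
    targets = bench_set
}
```
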
## Benchmark Structure

### Directory Layout

```
benches/
├── common/
│   ├── mod.rs              # Shared utilities
│   ├── data_generator.rs   # Test data generation
│   ├── metrics.rs          # Custom metrics collection
│   └── backends.rs         # Backend setup helpers
├── single_ops.rs           # Single-operation benchmarks
├── bulk_ops.rs             # Bulk operation benchmarks
├── scan_ops.rs             # Scan and query benchmarks
├── concurrent_ops.rs       # Concurrent operation benchmarks
└── memory_profile.rs       # Memory profiling benchmarks
```

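Each top-level file under `benches/` must be registered in `Cargo.toml` with the default test harness disabled so that Criterion can drive the run. A sketch (the dependency version is illustrative):

```toml
[dev-dependencies]
criterion = "0.5"

[[bench]]
name = "single_ops"
harness = false
```
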
### Running Benchmarks

#### Run All Benchmarks
```bash
cargo bench
```

#### Run Specific Benchmark Suite
```bash
cargo bench --bench single_ops
cargo bench --bench bulk_ops
cargo bench --bench concurrent_ops
```

#### Run Specific Backend
```bash
cargo bench -- redb
cargo bench -- sled
```

#### Generate Reports
```bash
# Run benchmarks and save results
cargo bench -- --save-baseline main

# Compare against baseline
cargo bench -- --baseline main

# Export to CSV
cargo bench -- --output-format csv > results.csv
```

### Output Formats

#### 1. Terminal Output (Default)
Real-time progress with statistical summaries:
```
single_ops/redb/set/small
                        time:   [1.234 µs 1.245 µs 1.256 µs]
                        thrpt:  [802.5K ops/s 810.2K ops/s 818.1K ops/s]
```

#### 2. CSV Export
Structured data for analysis:
```csv
backend,operation,dataset_size,mean_ns,median_ns,p95_ns,p99_ns,throughput_ops_sec
redb,set,small,1245,1240,1890,2100,810200
sled,set,small,1567,1550,2340,2890,638000
```

#### 3. JSON Export
Detailed metrics for programmatic processing:
```json
{
  "benchmark": "single_ops/redb/set/small",
  "metrics": {
    "mean": 1245,
    "median": 1240,
    "p95": 1890,
    "p99": 2100,
    "std_dev": 145,
    "throughput": 810200
  },
  "memory": {
    "allocations": 3,
    "peak_bytes": 4096
  }
}
```

## Benchmark Implementation Details

### Backend Setup

Each benchmark creates isolated database instances:

```rust
use tempfile::TempDir;

// Redb backend
let temp_dir = TempDir::new()?;
let db_path = temp_dir.path().join("bench.db");
let storage = Storage::new(db_path, false, None)?;

// Sled backend
let temp_dir = TempDir::new()?;
let db_path = temp_dir.path().join("bench.sled");
let storage = SledStorage::new(db_path, false, None)?;
```

### Data Generation

Deterministic data generation ensures reproducibility:

```rust
use rand::{SeedableRng, Rng};
use rand::rngs::StdRng;

fn generate_test_data(count: usize, seed: u64) -> Vec<(String, String)> {
    let mut rng = StdRng::seed_from_u64(seed);
    (0..count)
        .map(|i| {
            let key = format!("bench:key:{:08}", i);
            let value = generate_value(&mut rng, 100);
            (key, value)
        })
        .collect()
}
```

### Concurrent Testing

Using Tokio for async concurrent operations:

```rust
use std::sync::Arc;

async fn concurrent_benchmark(
    storage: Arc<dyn StorageBackend>,
    num_clients: usize,
    operations: usize,
) {
    // One spawned task per simulated client, all writing concurrently.
    let tasks: Vec<_> = (0..num_clients)
        .map(|client_id| {
            let storage = storage.clone();
            tokio::spawn(async move {
                for i in 0..operations {
                    let key = format!("client:{}:key:{}", client_id, i);
                    storage.set(key, "value".to_string()).unwrap();
                }
            })
        })
        .collect();

    futures::future::join_all(tasks).await;
}
```

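To drive this from Criterion, the async bencher can be used. A sketch, assuming Criterion's `async_tokio` feature is enabled and that a `setup_storage` helper (hypothetical here) returns a shared backend handle:

```rust
use std::sync::Arc;
use criterion::Criterion;
use tokio::runtime::Runtime;

fn bench_concurrent(c: &mut Criterion) {
    let rt = Runtime::new().unwrap();
    // Hypothetical helper returning Arc<dyn StorageBackend>.
    let storage: Arc<dyn StorageBackend> = setup_storage();

    c.bench_function("concurrent_ops/redb/10_clients", |b| {
        // 10 clients, 100 operations each, per measured iteration.
        b.to_async(&rt)
            .iter(|| concurrent_benchmark(storage.clone(), 10, 100))
    });
}
```
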
## Interpreting Results

### Performance Comparison

When comparing backends, consider:

1. **Latency vs Throughput Trade-offs**
   - Lower latency = better for interactive workloads
   - Higher throughput = better for batch processing

2. **Consistency**
   - Lower standard deviation = more predictable performance
   - Check p95/p99 for tail latency

3. **Scalability**
   - How performance changes with dataset size
   - Concurrent operation efficiency

### Backend Selection Guidelines

Based on benchmark results, choose:

**redb** when:
- You need predictable latency
- You work with structured data (separate tables)
- You require high concurrent read performance
- Memory efficiency is important

**sled** when:
- You need high write throughput
- You work with uniform data types
- You require lock-free operations
- Crash recovery is critical

## Memory Profiling

### Using DHAT

For detailed memory profiling:

```bash
# Install valgrind (the DHAT viewer ships with it)
sudo apt-get install valgrind

# Run with DHAT
cargo bench --bench memory_profile -- --profile-time=10
```

### Custom Allocation Tracking

The benchmarks include custom allocation tracking:

```rust
#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

fn track_allocations<F>(f: F) -> dhat::HeapStats
where
    F: FnOnce(),
{
    // Heap profiling is active while `_profiler` is alive.
    let _profiler = dhat::Profiler::new_heap();
    f();
    // Snapshot of allocation totals and peaks for the profiled closure.
    dhat::HeapStats::get()
}
```

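A usage sketch, assuming a `storage` instance from Backend Setup is in scope (`total_blocks` and `max_bytes` are fields of `dhat::HeapStats`):

```rust
let stats = track_allocations(|| {
    storage.set("bench:key:00000001".to_string(), "value".to_string()).unwrap();
});
println!("total allocations: {}", stats.total_blocks);
println!("peak heap bytes:   {}", stats.max_bytes);
```
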
## Continuous Benchmarking

### Regression Detection

Compare against a baseline to detect performance regressions:

```bash
# Save current performance as baseline
cargo bench -- --save-baseline v0.1.0

# After changes, compare
cargo bench -- --baseline v0.1.0

# Criterion will highlight significant changes
```

### CI Integration

Add to the CI pipeline:

```yaml
- name: Run Benchmarks
  run: |
    cargo bench --no-fail-fast -- --output-format json > bench-results.json

- name: Compare Results
  run: |
    python scripts/compare_benchmarks.py \
      --baseline baseline.json \
      --current bench-results.json \
      --threshold 10  # Fail if >10% regression
```

## Troubleshooting

### Common Issues

1. **Inconsistent Results**
   - Ensure the system is idle during benchmarks
   - Disable CPU frequency scaling
   - Run multiple iterations

2. **Out of Memory**
   - Reduce dataset sizes
   - Run benchmarks sequentially
   - Increase system swap space

3. **Slow Benchmarks**
   - Reduce the sample size in the Criterion config
   - Use the `--quick` flag for faster runs
   - Focus on specific benchmarks

### Performance Tips

```bash
# Quick benchmark run (fewer samples)
cargo bench -- --quick

# Verbose output for debugging
cargo bench -- --verbose

# Profile specific operation
cargo bench -- single_ops/redb/set
```

## Future Enhancements

Potential additions to the benchmark suite:

1. **Transaction Performance**: Measure MULTI/EXEC overhead
2. **Encryption Overhead**: Compare encrypted vs. non-encrypted operation
3. **Persistence Testing**: Measure flush/sync performance
4. **Recovery Time**: Database restart and recovery speed
5. **Network Overhead**: Redis protocol parsing impact
6. **Long-Running Stability**: Performance over extended periods

## References

- [Criterion.rs Documentation](https://bheisler.github.io/criterion.rs/book/)
- [DHAT Memory Profiler](https://valgrind.org/docs/manual/dh-manual.html)
- [Rust Performance Book](https://nnethercote.github.io/perf-book/)