- Add a new service manager crate for dynamic service management - Integrate service manager with Rhai for scripting - Provide examples for circle worker management and basic usage - Add comprehensive tests for service lifecycle and error handling - Implement cross-platform support for macOS and Linux (zinit/systemd)
178 lines
6.2 KiB
Markdown
178 lines
6.2 KiB
Markdown
# 🔧 Service Manager Production Readiness Fix Plan
|
|
|
|
## 📋 Executive Summary
|
|
|
|
This plan addresses all critical issues found in the service manager code review to achieve production readiness. The approach prioritizes quick fixes while maintaining code quality and stability.
|
|
|
|
## 🚨 Critical Issues Identified
|
|
|
|
1. **Runtime Creation Anti-Pattern** - Creating new tokio runtimes for every operation
|
|
2. **Trait Design Inconsistency** - Sync/async method signature mismatches
|
|
3. **Placeholder SystemdServiceManager** - Incomplete Linux implementation
|
|
4. **Dangerous Patterns** - `unwrap()` calls and silent error handling
|
|
5. **API Duplication** - Confusing `run()` method that duplicates `start_and_confirm()`
|
|
6. **Dead Code** - Unused fields and methods
|
|
7. **Broken Documentation** - Non-compiling README examples
|
|
|
|
## 🎯 Solution Strategy: Fully Synchronous API
|
|
|
|
**Decision**: Redesign the API to be fully synchronous for quick fixes and better resource management.
|
|
|
|
**Rationale**:
|
|
- Eliminates runtime creation anti-pattern
|
|
- Simplifies the API surface
|
|
- Reduces complexity and potential for async-related bugs
|
|
- Easier to test and debug
|
|
- Better performance for service management operations
|
|
|
|
## 📝 Detailed Fix Plan
|
|
|
|
### Phase 1: Core API Redesign (2 hours)
|
|
|
|
#### 1.1 Fix Trait Definition
|
|
- **File**: `src/lib.rs`
|
|
- **Action**: Remove all `async` keywords from trait methods
|
|
- **Remove**: `run()` method (duplicate of `start_and_confirm()`)
|
|
- **Simplify**: Timeout handling in synchronous context
|
|
|
|
#### 1.2 Update ZinitServiceManager
|
|
- **File**: `src/zinit.rs`
|
|
- **Action**:
|
|
- Remove runtime creation anti-pattern
|
|
- Use single shared runtime or blocking operations
|
|
- Remove `execute_async_with_timeout` (dead code)
|
|
- Remove unused `socket_path` field
|
|
- Replace `unwrap()` calls with proper error handling
|
|
- Fix silent error handling in `remove()` method
|
|
|
|
#### 1.3 Update LaunchctlServiceManager
|
|
- **File**: `src/launchctl.rs`
|
|
- **Action**:
|
|
- Remove runtime creation from every method
|
|
- Use `tokio::task::block_in_place` for async operations
|
|
- Remove `run()` method duplication
|
|
- Fix silent error handling patterns
|
|
|
|
### Phase 2: Complete SystemdServiceManager (3 hours)
|
|
|
|
#### 2.1 Implement Core Functionality
|
|
- **File**: `src/systemd.rs`
|
|
- **Action**: Replace all placeholder implementations with real systemd integration
|
|
- **Methods to implement**:
|
|
- `start()` - Use `systemctl start`
|
|
- `stop()` - Use `systemctl stop`
|
|
- `restart()` - Use `systemctl restart`
|
|
- `status()` - Parse `systemctl status` output
|
|
- `logs()` - Use `journalctl` for log retrieval
|
|
- `list()` - Parse `systemctl list-units`
|
|
- `remove()` - Use `systemctl disable` and remove unit files
|
|
|
|
#### 2.2 Add Unit File Management
|
|
- **Action**: Implement systemd unit file creation and management
|
|
- **Features**:
|
|
- Generate `.service` files from `ServiceConfig`
|
|
- Handle user vs system service placement
|
|
- Proper file permissions and ownership
|
|
|
|
### Phase 3: Error Handling & Safety (1 hour)
|
|
|
|
#### 3.1 Remove Dangerous Patterns
|
|
- **Action**: Replace all `unwrap()` calls with proper error handling
|
|
- **Files**: All implementation files
|
|
- **Pattern**: `unwrap()` → `map_err()` with descriptive errors
|
|
|
|
#### 3.2 Fix Silent Error Handling
|
|
- **Action**: Replace `let _ = ...` patterns with proper error handling
|
|
- **Strategy**: Log errors or propagate them appropriately
|
|
|
|
### Phase 4: Testing & Documentation (1 hour)
|
|
|
|
#### 4.1 Fix README Examples
|
|
- **File**: `README.md`
|
|
- **Action**: Update all code examples to match synchronous API
|
|
- **Add**: Proper error handling examples
|
|
- **Add**: Platform-specific usage notes
|
|
|
|
#### 4.2 Update Tests
|
|
- **Action**: Ensure all tests work with synchronous API
|
|
- **Fix**: Remove async test patterns where not needed
|
|
- **Add**: Error handling test cases
|
|
|
|
## 🔧 Implementation Details
|
|
|
|
### Runtime Strategy for Async Operations
|
|
|
|
Since we're keeping the API synchronous but some underlying operations (like zinit-client) are async:
|
|
|
|
```rust
|
|
// Use a lazy static runtime for async operations
|
|
use once_cell::sync::Lazy;
|
|
use tokio::runtime::Runtime;
|
|
|
|
static ASYNC_RUNTIME: Lazy<Runtime> = Lazy::new(|| {
|
|
Runtime::new().expect("Failed to create async runtime")
|
|
});
|
|
|
|
// In methods that need async operations:
|
|
fn some_method(&self) -> Result<T, ServiceManagerError> {
|
|
ASYNC_RUNTIME.block_on(async {
|
|
// async operations here
|
|
})
|
|
}
|
|
```
|
|
|
|
### SystemdServiceManager Implementation Strategy
|
|
|
|
```rust
|
|
// Use std::process::Command for systemctl operations
|
|
fn run_systemctl(&self, args: &[&str]) -> Result<String, ServiceManagerError> {
|
|
let output = std::process::Command::new("systemctl")
|
|
.args(args)
|
|
.output()
|
|
.map_err(|e| ServiceManagerError::Other(format!("systemctl failed: {}", e)))?;
|
|
|
|
if !output.status.success() {
|
|
let stderr = String::from_utf8_lossy(&output.stderr);
|
|
return Err(ServiceManagerError::Other(format!("systemctl error: {}", stderr)));
|
|
}
|
|
|
|
Ok(String::from_utf8_lossy(&output.stdout).to_string())
|
|
}
|
|
```
|
|
|
|
## ⏱️ Timeline
|
|
|
|
- **Phase 1**: 2 hours - Core API redesign
|
|
- **Phase 2**: 3 hours - SystemdServiceManager implementation
|
|
- **Phase 3**: 1 hour - Error handling & safety
|
|
- **Phase 4**: 1 hour - Testing & documentation
|
|
|
|
**Total Estimated Time**: 7 hours
|
|
|
|
## ✅ Success Criteria
|
|
|
|
1. **No Runtime Creation Anti-Pattern** - Single shared runtime or proper blocking
|
|
2. **Complete Linux Support** - Fully functional SystemdServiceManager
|
|
3. **No Placeholder Code** - All methods have real implementations
|
|
4. **No Dangerous Patterns** - No `unwrap()` or silent error handling
|
|
5. **Clean API** - No duplicate methods, clear purpose for each method
|
|
6. **Working Documentation** - All README examples compile and run
|
|
7. **Comprehensive Tests** - All tests pass and cover error cases
|
|
|
|
## 🚀 Post-Fix Validation
|
|
|
|
1. **Compile Check**: `cargo check` passes without warnings
|
|
2. **Test Suite**: `cargo test` passes all tests
|
|
3. **Documentation**: `cargo doc` generates without errors
|
|
4. **Example Validation**: README examples compile and run
|
|
5. **Performance Test**: No resource leaks under repeated operations
|
|
|
|
## 📦 Dependencies to Add
|
|
|
|
```toml
|
|
[dependencies]
|
|
once_cell = "1.19" # For lazy static runtime
|
|
```
|
|
|
|
This plan ensures production readiness while maintaining the quick-fix approach requested by the team lead.
|