sal/MONOREPO_CONVERSION_PLAN.md
Mahmoud-Emad a35edc2030 docs: Update MONOREPO_CONVERSION_PLAN.md with text package status
- Marked the `text` package as production-ready in the
  conversion plan.
- Added quality metrics achieved for the `text` package,
  including test coverage, security features, and error
  handling.
- Updated the success metrics checklist to reflect the
  `text` package's completion.
2025-06-19 14:51:30 +03:00

22 KiB

SAL Monorepo Conversion Plan

🎯 Objective

Convert the SAL (System Abstraction Layer) project from a single-crate structure with modules in src/ to a proper Rust monorepo with independent packages, following Rust best practices for workspace management.

📊 Current State Analysis

Current Structure

sal/
├── Cargo.toml (single package + workspace with vault, git)
├── src/
│   ├── lib.rs (main library)
│   ├── bin/herodo.rs (binary)
│   ├── mycelium/ (module)
│   ├── net/ (module)
│   ├── os/ (module)
│   ├── postgresclient/ (module)
│   ├── process/ (module)
│   ├── redisclient/ (module)
│   ├── rhai/ (module - depends on ALL others, now imports git from sal-git)
│   ├── text/ (module)
│   ├── vault/ (module)
│   ├── virt/ (module)
│   └── zinit_client/ (module)
├── vault/ (converted package) ✅ COMPLETED
├── git/ (converted package) ✅ COMPLETED
├── redisclient/ (converted package) ✅ COMPLETED

Issues with Current Structure

  1. Monolithic dependencies: All external crates are listed in root Cargo.toml even if only used by specific modules
  2. Tight coupling: All modules are compiled together, making it hard to use individual components
  3. Testing complexity: Cannot test individual packages in isolation
  4. Version management: Cannot version packages independently
  5. Build inefficiency: Changes to one module trigger rebuilds of entire crate

🏗️ Target Architecture

Final Monorepo Structure

sal/
├── Cargo.toml (workspace only)
├── git/           (sal-git package)
├── mycelium/      (sal-mycelium package)
├── net/           (sal-net package)
├── os/            (sal-os package)
├── postgresclient/ (sal-postgresclient package)
├── process/       (sal-process package)
├── redisclient/   (sal-redisclient package)
├── text/          (sal-text package)
├── vault/         (sal-vault package) ✅ already done
├── virt/          (sal-virt package)
├── zinit_client/  (sal-zinit-client package)
├── rhai/          (sal-rhai package - aggregates all others)
└── herodo/        (herodo binary package)

📋 Detailed Conversion Plan

Phase 1: Analysis & Dependency Mapping

  • Analyze each package's source code for dependencies
    • Examine imports and usage in each src/ package
    • Identify external crates actually used by each module
  • Map inter-package dependencies
    • Identify which packages depend on other packages within the project
  • Identify shared vs package-specific dependencies
    • Categorize dependencies as common across packages or specific to individual packages
  • Create dependency tree and conversion order
    • Determine the order for converting packages based on their dependency relationships

Phase 2: Package Structure Design

  • Design workspace structure
    • Keep packages at root level (not in src/ or crates/ subdirectory)
    • Follow Rust monorepo best practices
  • Plan individual package Cargo.toml structure
    • Design template for individual package Cargo.toml files
    • Include proper metadata (name, version, description, etc.)
  • Handle version management strategy
    • Use unified versioning (0.1.0) across all packages initially
    • Plan for independent versioning in the future
  • Plan rhai module handling
    • The rhai module depends on ALL other packages
    • Convert it last as an aggregation package

Phase 3: Incremental Package Conversion

Convert packages in dependency order (leaf packages first):

3.1 Leaf Packages (no internal dependencies)

  • redisclient → sal-redisclient PRODUCTION-READY IMPLEMENTATION
    • Independent package with comprehensive test suite
    • Rhai integration moved to redisclient package with real functionality
    • Environment configuration and connection management
    • Old src/redisclient/ removed and references updated
    • Test infrastructure moved to redisclient/tests/
    • Code review completed: All functionality working correctly
    • Real implementations: Redis operations, connection pooling, error handling
    • Production features: Builder pattern, Unix socket support, automatic reconnection
  • text → sal-text PRODUCTION-READY IMPLEMENTATION
    • Independent package with comprehensive test suite (23 tests: 13 unit + 10 Rhai)
    • Rhai integration moved to text package with real functionality
    • Text processing utilities: dedent, prefix, name_fix, path_fix
    • Old src/text/ removed and references updated
    • Test infrastructure moved to text/tests/ with real behavior validation
    • Code review completed: All functionality working correctly
    • Real implementations: TextReplacer with regex, TemplateBuilder with Tera
    • Production features: Unicode handling, file operations, security sanitization
    • README documentation: Comprehensive package documentation added
    • Integration verified: Herodo integration and test suite integration confirmed
  • mycelium → sal-mycelium PRODUCTION-READY IMPLEMENTATION
    • Independent package with comprehensive test suite (22 tests)
    • Rhai integration moved to mycelium package with real functionality
    • HTTP client for async Mycelium API operations
    • Old src/mycelium/ removed and references updated
    • Test infrastructure moved to mycelium/tests/
    • Code review completed: All functionality working correctly
    • Real implementations: Node info, peer management, routing, messaging
    • Production features: Base64 encoding, timeout handling, error management
    • README documentation: Simple, comprehensive package documentation added
    • Integration verified: Herodo integration and test suite integration confirmed
  • net → sal-net
  • os → sal-os

3.2 Mid-level Packages (depend on leaf packages)

  • git → sal-git (depends on redisclient) PRODUCTION-READY IMPLEMENTATION
    • Independent package with comprehensive test suite (45 tests)
    • Rhai integration moved to git package with real functionality
    • Circular dependency resolved (direct redis client implementation)
    • Old src/git/ removed and references updated
    • Test infrastructure moved to git/tests/rhai/
    • Code review completed: All placeholder code eliminated
    • Security enhancements: Credential helpers, URL masking, environment configuration
    • Real implementations: git_clone, GitTree operations, credential handling
    • Production features: Structured logging, configurable Redis connections, error handling
  • process → sal-process (depends on text)
  • zinit_client → sal-zinit-client

3.3 Higher-level Packages

  • virt → sal-virt (depends on process, os)
  • postgresclient → sal-postgresclient (depends on virt)

3.4 Aggregation Package

  • rhai → sal-rhai (depends on ALL other packages)

3.5 Binary Package

  • herodo → herodo (binary package)

Phase 4: Cleanup & Validation

  • Clean up root Cargo.toml
    • Remove old dependencies that are now in individual packages
    • Keep only workspace configuration
  • Remove old src/ modules
    • After confirming all packages work independently
  • Update documentation
    • Update README.md with new structure
    • Update examples to use new package structure
  • Validate builds
    • Ensure all packages build independently
    • Ensure workspace builds successfully
    • Run all tests

🔧 Implementation Strategy

Package Conversion Template

For each package conversion:

  1. Create package directory (e.g., git/)
  2. Create Cargo.toml with:
    [package]
    name = "sal-{package}"
    version = "0.1.0"
    edition = "2021"
    authors = ["PlanetFirst <info@incubaid.com>"]
    description = "SAL {Package} - {description}"
    repository = "https://git.threefold.info/herocode/sal"
    license = "Apache-2.0"
    
    [dependencies]
    # Only dependencies actually used by this package
    
  3. Move source files from src/{package}/ to {package}/src/
  4. Update imports in moved files
  5. Add to workspace in root Cargo.toml
  6. Test package builds independently
  7. Update dependent packages to use new package

Advanced Package Conversion (Git Package Example)

For packages with Rhai integration and complex dependencies:

  1. Handle Rhai Integration:

    • Move rhai wrappers from src/rhai/{package}.rs to {package}/src/rhai.rs
    • Add rhai dependency to package Cargo.toml
    • Update main SAL rhai module to import from new package
    • Export rhai module from package lib.rs
  2. Resolve Circular Dependencies:

    • Identify circular dependency patterns (e.g., package → sal → redisclient)
    • Implement direct dependencies or minimal client implementations
    • Remove dependency on main sal crate where possible
  3. Comprehensive Testing:

    • Create {package}/tests/ directory with separate test files
    • Keep source files clean (no inline tests)
    • Add both Rust unit tests and Rhai integration tests
    • Move package-specific rhai script tests to {package}/tests/rhai/
  4. Update Test Infrastructure:

    • Update run_rhai_tests.sh to find tests in new locations
    • Update documentation to reflect new test paths
    • Ensure both old and new test locations are supported during transition
  5. Clean Migration:

    • Remove old src/{package}/ directory completely
    • Remove package-specific tests from main SAL test files
    • Update all import references in main SAL crate
    • Verify no broken references remain
  6. Code Review & Quality Assurance:

    • Apply strict code review criteria (see Code Review section)
    • Eliminate all placeholder code (TODO, FIXME, assert!(true))
    • Implement real functionality with proper error handling
    • Add security features (credential handling, URL masking, etc.)
    • Ensure comprehensive test coverage with meaningful assertions
    • Validate production readiness with real-world scenarios

Dependency Management Rules

  • Minimize dependencies: Only include crates actually used by each package
  • Use workspace dependencies: For common dependencies, consider workspace-level dependency management
  • Version consistency: Keep versions consistent across packages for shared dependencies

🧪 Testing Strategy

Package-level Testing

  • Rust Unit Tests: Each package should have tests in {package}/tests/ directory
    • Keep source files clean (no inline #[cfg(test)] modules)
    • Separate test files for different modules (e.g., git_tests.rs, git_executor_tests.rs)
    • Tests should be runnable independently: cd {package} && cargo test
    • Security tests: Credential handling, environment configuration, error scenarios
    • Integration tests: Real-world scenarios with actual external dependencies
    • Configuration tests: Environment variable handling, fallback behavior
  • Rhai Integration Tests: For packages with rhai wrappers
    • Rust tests for rhai function registration in {package}/tests/rhai_tests.rs
    • Rhai script tests in {package}/tests/rhai/ directory
    • Include comprehensive test runner scripts
    • Real functionality tests: Validate actual behavior, not dummy implementations
    • Error handling tests: Invalid inputs, network failures, environment constraints

Integration Testing

  • Workspace-level tests for cross-package functionality
  • Test Infrastructure Updates:
    • Update run_rhai_tests.sh to support both old (rhai_tests/) and new ({package}/tests/rhai/) locations
    • Ensure smooth transition during conversion process
  • Documentation Updates: Update test documentation to reflect new paths

Validation Checklist

Basic Functionality

  • Each package builds independently
  • All packages build together in workspace
  • All existing tests pass
  • Examples work with new structure
  • herodo binary still works
  • Rhai integration works for converted packages
  • Test infrastructure supports new package locations
  • No circular dependencies exist
  • Old source directories completely removed
  • All module references updated (check both imports AND function calls)
  • Integration testing verified (herodo scripts work, test suite integration)
  • Package README created (simple, comprehensive documentation)
  • Documentation updated for new structure

Code Quality & Production Readiness

  • Zero placeholder code: No TODO, FIXME, or stub implementations
  • Real functionality: All functions implement actual behavior
  • Comprehensive testing: Unit, integration, and rhai script tests
  • Security features: Credential handling, URL masking, secure configurations
  • Error handling: Structured logging, graceful fallbacks, meaningful error messages
  • Environment resilience: Graceful handling of network/system constraints
  • Configuration management: Environment variables, fallback values, validation
  • Test integrity: All tests validate real behavior, no trivial passing tests
  • Performance: Reasonable build times and runtime performance
  • Documentation: Updated README, configuration guides, security considerations

🚨 Risk Mitigation

Potential Issues

  1. Circular dependencies: Carefully analyze dependencies to avoid cycles
  2. Feature flags: Some packages might need conditional compilation
  3. External git dependencies: Handle external dependencies like kvstore
  4. Build performance: Monitor build times after conversion

Rollback Plan

  • Keep original src/ structure until full validation
  • Use git branches for incremental changes
  • Test each phase thoroughly before proceeding

📚 Lessons Learned (Git Package Conversion)

Key Insights from Git Package Implementation

  1. Rhai Integration Complexity: Moving rhai wrappers to individual packages provides better cohesion but requires careful dependency management
  2. Circular Dependency Resolution: Main SAL crate depending on packages that depend on SAL creates cycles - resolve by implementing direct dependencies
  3. Test Organization: Separating tests into dedicated directories keeps source files clean and follows Rust best practices
  4. Infrastructure Updates: Test runners and documentation need updates to support new package locations
  5. Comprehensive Validation: Need both Rust unit tests AND rhai script tests to ensure full functionality

Best Practices Established

  • Source File Purity: Keep source files identical to original, move all tests to separate files
  • Comprehensive Test Coverage: Include unit tests, integration tests, and rhai script tests
  • Dependency Minimization: Implement minimal clients rather than depending on main crate
  • Smooth Transition: Support both old and new test locations during conversion
  • Documentation Consistency: Update all references to new package structure

Critical Lessons from Mycelium Conversion

  1. Thorough Reference Updates: When removing old modules, ensure ALL references are updated:

    • Found and fixed critical regression in src/rhai/mod.rs where old module references remained
    • Must check both import statements AND function calls for old module paths
    • Integration tests caught this regression before production deployment
  2. README Documentation: Each package needs simple, comprehensive documentation:

    • Include both Rust API and Rhai usage examples
    • Document all available functions with clear descriptions
    • Provide setup requirements and testing instructions
  3. Integration Verification: Always verify end-to-end integration:

    • Test herodo integration with actual script execution
    • Verify test suite integration with run_rhai_tests.sh
    • Confirm all functions are accessible in production environment

🔍 Code Review & Quality Assurance Process

Strict Code Review Criteria Applied

Based on the git package conversion, establish these mandatory criteria for all future conversions:

1. Code Quality Standards

  • No low-quality or rushed code: All logic must be clear, maintainable, and follow conventions
  • Professional implementations: Real functionality, not placeholder code
  • Proper error handling: Comprehensive error types with meaningful messages
  • Security considerations: Credential handling, URL masking, secure configurations

2. No Nonsense Policy

  • No unused variables or imports: Clean, purposeful code only
  • No redundant functions: Every function serves a clear purpose
  • No unnecessary changes: All modifications must add value

3. Regression Prevention

  • All existing functionality preserved: No breaking changes
  • Comprehensive testing: Both unit tests and integration tests
  • Backward compatibility: Smooth transition for existing users

4. Zero Placeholder Code

  • No TODO/FIXME comments: All code must be production-ready
  • No stub implementations: Real functionality only
  • No assert!(true) tests: All tests must validate actual behavior

5. Test Integrity Requirements

  • Real behavior validation: Tests must verify actual functionality
  • Meaningful assertions: No trivial passing tests
  • Environment resilience: Graceful handling of network/system constraints
  • Comprehensive coverage: Unit, integration, and rhai script tests

Git Package Quality Metrics Achieved

  • 45 comprehensive tests (all passing)
  • Zero placeholder code violations
  • Real functionality implementation (git_clone, credential helpers, etc.)
  • Security features (URL masking, credential scripts, environment config)
  • Production-ready error handling (structured logging, graceful fallbacks)
  • Environment resilience (network failures handled gracefully)

Mycelium Package Quality Metrics Achieved

  • 22 comprehensive tests (all passing - 10 unit + 12 Rhai integration)
  • Zero placeholder code violations
  • Real functionality implementation (HTTP client, base64 encoding, timeout handling)
  • Security features (URL encoding, secure error messages, parameter validation)
  • Production-ready error handling (async operations, graceful fallbacks)
  • Environment resilience (network failures handled gracefully)
  • Integration excellence (herodo integration, test suite integration)

Text Package Quality Metrics Achieved

  • 23 comprehensive tests (all passing - 13 unit + 10 Rhai integration)
  • Zero placeholder code violations
  • Real functionality implementation (text processing, regex replacement, template rendering)
  • Security features (filename sanitization, path normalization, input validation)
  • Production-ready error handling (file operations, template errors, regex validation)
  • Environment resilience (unicode handling, large file processing)
  • Integration excellence (herodo integration, test suite integration)
  • API design excellence (builder patterns, fluent interfaces, comprehensive documentation)

Specific Improvements Made During Code Review

  1. Eliminated Placeholder Code:

    • Replaced dummy git_clone function with real GitTree-based implementation
    • Removed all assert!(true) placeholder tests
    • Implemented actual credential helper functionality
  2. Enhanced Security:

    • Implemented secure credential helper scripts with proper cleanup
    • Added Redis URL masking for sensitive data in logs
    • Replaced hardcoded configurations with environment variables
  3. Improved Test Quality:

    • Replaced fake tests with real behavior validation
    • Added comprehensive error handling tests
    • Implemented environment-resilient test scenarios
    • Fixed API usage bugs (Vec vs single GitRepo)
  4. Production Features:

    • Added structured logging with appropriate levels
    • Implemented configurable Redis connections with fallbacks
    • Enhanced error messages with meaningful context
    • Added comprehensive documentation with security considerations
  5. Code Quality Enhancements:

    • Eliminated unused imports and variables
    • Improved error handling with custom error types
    • Added proper resource cleanup (temporary files, connections)
    • Implemented defensive programming with validation and fallbacks

📈 Success Metrics

Basic Functionality Metrics

  • All packages build independently (git , vault , mycelium , text , others pending)
  • Workspace builds successfully
  • All tests pass
  • Build times are reasonable or improved
  • Individual packages can be used independently
  • Clear separation of concerns between packages
  • Proper dependency management (no unnecessary dependencies)

Quality & Production Readiness Metrics

  • Zero placeholder code violations across all packages (git , vault , mycelium , text , others pending)
  • Comprehensive test coverage (22+ tests per package) (git , mycelium , text , others pending)
  • Real functionality implementation (no dummy/stub code) (git , vault , mycelium , text , others pending)
  • Security features implemented (credential handling, URL masking) (git , mycelium , text , others pending)
  • Production-ready error handling (structured logging, graceful fallbacks) (git , mycelium , text , others pending)
  • Environment resilience (network failures handled gracefully) (git , mycelium , text , others pending)
  • Configuration management (environment variables, secure defaults) (git , mycelium , text , others pending)
  • Code review standards met (all strict criteria satisfied) (git , vault , mycelium , text , others pending)
  • Documentation completeness (README, configuration, security guides) (git , mycelium , text , others pending)
  • Performance standards (reasonable build and runtime performance) (git , vault , mycelium , text , others pending)

Git Package Achievement (Reference Standard)

  • 45 comprehensive tests (unit, integration, security, rhai)
  • Real git operations (clone, repository management, credential handling)
  • Security enhancements (credential helpers, URL masking, environment config)
  • Production features (structured logging, configurable connections, error handling)
  • Code quality score: 10/10 (exceptional production readiness)