add some documentation for blue book

docs/.collection (new file, 0 lines)
docs/README.md (new file, 67 lines)

@@ -0,0 +1,67 @@
# Horus Documentation

**Hierarchical Orchestration Runtime for Universal Scripts**

Horus is a distributed job execution system with three layers: Coordinator, Supervisor, and Runner.

## Quick Links

- **[Getting Started](./getting-started.md)** - Install and run your first job
- **[Architecture](./architecture.md)** - System design and components
- **[Etymology](./ethymology.md)** - The meaning behind the name

## Components

### Coordinator
Workflow orchestration engine for DAG-based execution.

- [Overview](./coordinator/overview.md)

### Supervisor
Job dispatcher with authentication and routing.

- [Overview](./supervisor/overview.md)
- [Authentication](./supervisor/auth.md)
- [OpenRPC API](./supervisor/openrpc.json)

### Runners
Job executors for different workload types.

- [Runner Overview](./runner/overview.md)
- [Hero Runner](./runner/hero.md) - Heroscript execution
- [SAL Runner](./runner/sal.md) - System operations
- [Osiris Runner](./runner/osiris.md) - Database operations

## Core Concepts

### Jobs
Units of work executed by runners. Each job contains:
- Target runner ID
- Payload (script/command)
- Cryptographic signature
- Optional timeout and environment variables

### Workflows
Multi-step DAGs executed by the Coordinator. Steps can:
- Run in parallel or sequence
- Pass data between steps
- Target different runners
- Handle errors and retries

### Signatures
All jobs must be cryptographically signed:
- Ensures job authenticity
- Prevents tampering
- Enables authorization

## Use Cases

- **Automation**: Execute system tasks and scripts
- **Data Pipelines**: Multi-step ETL workflows
- **CI/CD**: Build, test, and deployment pipelines
- **Infrastructure**: Manage cloud resources and containers
- **Integration**: Connect systems via scripted workflows

## Repository

[git.ourworld.tf/herocode/horus](https://git.ourworld.tf/herocode/horus)
@@ -1,15 +1,185 @@

# Architecture

Horus is a hierarchical orchestration runtime with three layers: Coordinator, Supervisor, and Runner.

## Overview

```
┌─────────────────────────────────────────────────────────┐
│                       Coordinator                       │
│            (Workflow Engine - DAG Execution)            │
│                                                         │
│  • Parses workflow definitions                          │
│  • Resolves dependencies                                │
│  • Dispatches ready steps                               │
│  • Tracks workflow state                                │
└────────────────────┬────────────────────────────────────┘
                     │ OpenRPC (HTTP/Mycelium)
                     │
┌────────────────────▼────────────────────────────────────┐
│                       Supervisor                        │
│            (Job Dispatcher & Authenticator)             │
│                                                         │
│  • Verifies job signatures                              │
│  • Routes jobs to runners                               │
│  • Manages runner registry                              │
│  • Tracks job lifecycle                                 │
└────────────────────┬────────────────────────────────────┘
                     │ Redis Queue Protocol
                     │
┌────────────────────▼────────────────────────────────────┐
│                        Runners                          │
│                    (Job Executors)                      │
│                                                         │
│   ┌──────────┐      ┌──────────┐      ┌──────────┐      │
│   │   Hero   │      │   SAL    │      │  Osiris  │      │
│   │  Runner  │      │  Runner  │      │  Runner  │      │
│   └──────────┘      └──────────┘      └──────────┘      │
└─────────────────────────────────────────────────────────┘
```

- The user / client talks to the Coordinator over an OpenRPC interface, using either regular HTTP transport or Mycelium.
- The Coordinator talks to the Supervisor over an OpenRPC interface, using either regular HTTP transport or Mycelium.
- The Supervisor talks to runners over a Redis-based job execution protocol.

## Layers

### 1. Coordinator (Optional)
**Purpose:** Workflow orchestration and DAG execution

**Responsibilities:**
- Parse and validate workflow definitions
- Execute DAG-based flows
- Manage step dependencies
- Route jobs to appropriate supervisors
- Handle multi-step workflows

**Use When:**
- You need multi-step workflows
- Jobs have dependencies
- Parallel execution is required
- Complex data pipelines

[→ Coordinator Documentation](./coordinator/overview.md)

### 2. Supervisor (Required)
**Purpose:** Job admission, authentication, and routing

**Responsibilities:**
- Receive jobs via OpenRPC interface
- Verify cryptographic signatures
- Route jobs to appropriate runners
- Manage runner registry
- Track job status and results

**Features:**
- OpenRPC API for job management
- HTTP and Mycelium transport
- Signature-based authentication
- Runner health monitoring

[→ Supervisor Documentation](./supervisor/overview.md)

### 3. Runners (Required)
**Purpose:** Execute actual job workloads

**Available Runners:**
- **Hero Runner**: Executes heroscripts via the Hero CLI
- **SAL Runner**: System operations (OS, K8s, cloud, etc.)
- **Osiris Runner**: Database operations with Rhai scripts

**Common Features:**
- Redis queue-based job polling
- Signature verification
- Timeout support
- Environment variable handling

[→ Runner Documentation](./runner/overview.md)

## Communication Protocols

### Client ↔ Coordinator
- **Protocol:** OpenRPC
- **Transport:** HTTP or Mycelium
- **Operations:** Submit workflow, check status, retrieve results

### Coordinator ↔ Supervisor
- **Protocol:** OpenRPC
- **Transport:** HTTP or Mycelium
- **Operations:** Create job, get status, retrieve logs

### Supervisor ↔ Runner
- **Protocol:** Redis Queue
- **Transport:** Redis pub/sub and lists
- **Operations:** Push job, poll queue, store result
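
As a concrete illustration of this last hop, here is a minimal sketch of the Supervisor's side of the Redis exchange, using the queue and result keys documented in the [runner overview](./runner/overview.md) (`runner:{runner_id}:jobs` and `job:{job_id}:result`). It assumes jobs are serialized to JSON and uses the `redis` crate purely for illustration; the actual wire format may differ.

```rust
use redis::Commands;

// Sketch of the Supervisor -> Runner hop over Redis lists, using the
// key names from the runner overview. JSON serialization and the exact
// payload shape are assumptions made for illustration.
fn dispatch_job(
    conn: &mut redis::Connection,
    runner_id: &str,
    job_id: &str,
    job_json: &str,
) -> redis::RedisResult<Option<String>> {
    // Push the serialized job onto the target runner's queue.
    let queue = format!("runner:{}:jobs", runner_id);
    conn.lpush::<_, _, ()>(&queue, job_json)?;

    // Later, read back whatever result the runner stored for this job.
    let result_key = format!("job:{}:result", job_id);
    conn.get(result_key)
}
```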

## Job Flow

### Simple Job (No Coordinator)
```
1. Client → Supervisor: create_job()
2. Supervisor: Verify signature
3. Supervisor → Redis: Push to runner queue
4. Runner ← Redis: Pop job
5. Runner: Execute job
6. Runner → Redis: Store result
7. Client ← Supervisor: get_job_result()
```

### Workflow (With Coordinator)
```
1. Client → Coordinator: submit_workflow()
2. Coordinator: Parse DAG
3. Coordinator: Identify ready steps
4. Coordinator → Supervisor: create_job() for each ready step
5. Supervisor → Runner: Route via Redis
6. Runner: Execute and return result
7. Coordinator: Update workflow state
8. Coordinator: Dispatch next ready steps
9. Repeat until workflow complete
```

## Security Model

### Authentication
- Jobs must be cryptographically signed
- Signatures verified at the Supervisor layer
- Public key infrastructure for identity

### Authorization
- Runners only execute signed jobs
- Signature verification before execution
- Untrusted jobs rejected

### Transport Security
- Optional TLS for HTTP transport
- End-to-end encryption via Mycelium
- No plaintext credentials

[→ Authentication Details](./supervisor/auth.md)

## Deployment Patterns

### Minimal Setup
```
Redis + Supervisor + Runner(s)
```
Single machine, simple job execution.

### Distributed Setup
```
Redis Cluster + Multiple Supervisors + Runner Pool
```
High availability, load balancing.

### Full Orchestration
```
Coordinator + Multiple Supervisors + Runner Pool
```
Complex workflows, multi-step pipelines.

## Design Principles

1. **Hierarchical**: Clear separation of concerns across layers
2. **Secure**: Signature-based authentication throughout
3. **Scalable**: Horizontal scaling at each layer
4. **Observable**: Comprehensive logging and status tracking
5. **Flexible**: Multiple runners for different workload types
docs/coordinator/overview.md (new file, 145 lines)

@@ -0,0 +1,145 @@

# Coordinator Overview

The Coordinator is the workflow orchestration layer in Horus. It executes DAG-based flows by managing job dependencies and dispatching ready steps to supervisors.

## Architecture

```
Client → Coordinator → Supervisor(s) → Runner(s)
```

## Responsibilities

### 1. **Workflow Management**
- Parse and validate DAG workflow definitions
- Track workflow execution state
- Manage step dependencies

### 2. **Job Orchestration**
- Determine which steps are ready to execute
- Dispatch jobs to appropriate supervisors
- Handle step failures and retries

### 3. **Dependency Resolution**
- Track step completion
- Resolve data dependencies between steps
- Pass outputs from completed steps to dependent steps

### 4. **Multi-Supervisor Coordination**
- Route jobs to specific supervisors
- Handle supervisor failures
- Load balance across supervisors

## Workflow Definition

Workflows are defined as Directed Acyclic Graphs (DAGs):

```yaml
workflow:
  name: "data-pipeline"
  steps:
    - id: "fetch"
      runner: "hero"
      payload: "!!http.get url:'https://api.example.com/data'"

    - id: "process"
      runner: "sal"
      depends_on: ["fetch"]
      payload: |
        let data = input.fetch;
        let processed = process_data(data);
        processed

    - id: "store"
      runner: "osiris"
      depends_on: ["process"]
      payload: |
        let model = osiris.model("results");
        model.create(input.process);
```

## Features

### DAG Execution
- Parallel execution of independent steps
- Sequential execution of dependent steps
- Automatic dependency resolution

### Error Handling
- Step-level retry policies
- Workflow-level error handlers
- Partial workflow recovery

### Data Flow
- Pass outputs between steps
- Transform data between steps
- Aggregate results from parallel steps

### Monitoring
- Real-time workflow status
- Step-level progress tracking
- Execution metrics and logs

## Workflow Lifecycle

1. **Submission**: Client submits workflow definition
2. **Validation**: Coordinator validates DAG structure
3. **Scheduling**: Determine ready steps (no pending dependencies; see the sketch after this list)
4. **Dispatch**: Send jobs to supervisors
5. **Tracking**: Monitor step completion
6. **Progression**: Execute next ready steps
7. **Completion**: Workflow finishes when all steps complete
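
The scheduling step boils down to a simple invariant: given each step's `depends_on` list and the set of completed step IDs, the ready steps are those whose dependencies have all completed. The sketch below illustrates that idea only; it is not the Coordinator's actual implementation, and the `Step` type is a simplified stand-in.

```rust
use std::collections::HashSet;

// Simplified stand-in for a workflow step (illustration only).
struct Step {
    id: String,
    depends_on: Vec<String>,
}

// A step is "ready" when it has not run yet and every dependency has
// completed. This mirrors the DAG scheduling described above.
fn ready_steps<'a>(steps: &'a [Step], completed: &HashSet<String>) -> Vec<&'a Step> {
    steps
        .iter()
        .filter(|s| !completed.contains(&s.id))
        .filter(|s| s.depends_on.iter().all(|d| completed.contains(d)))
        .collect()
}
```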

## Use Cases

### Data Pipelines
```
Extract → Transform → Load
```

### CI/CD Workflows
```
Build → Test → Deploy
```

### Multi-Stage Processing
```
Fetch Data → Process → Validate → Store → Notify
```

### Parallel Execution
```
        ┌─ Task A ─┐
Start ──┼─ Task B ─┼── Aggregate → Finish
        └─ Task C ─┘
```

## Configuration

```bash
# Start coordinator
coordinator --port 9090 --redis-url redis://localhost:6379

# With multiple supervisors
coordinator --port 9090 \
  --supervisor http://supervisor1:8080 \
  --supervisor http://supervisor2:8080
```

## API

The Coordinator exposes an OpenRPC API (an example request is sketched after this list):

- `submit_workflow`: Submit a new workflow
- `get_workflow_status`: Check workflow progress
- `list_workflows`: List all workflows
- `cancel_workflow`: Stop a running workflow
- `get_workflow_logs`: Retrieve execution logs
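
A `submit_workflow` call over HTTP might look like the following sketch, assuming standard JSON-RPC 2.0 framing. The method name comes from the list above; the parameter shape is abbreviated and assumed, since the authoritative schema lives in the OpenRPC spec.

```rust
use serde_json::json;

// Hypothetical JSON-RPC 2.0 request to the Coordinator. The endpoint,
// framing, and parameter shape are assumptions for illustration.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let request = json!({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "submit_workflow",
        "params": { "workflow": { "name": "data-pipeline", "steps": [] } }
    });

    let response = reqwest::Client::new()
        .post("http://localhost:9090")
        .json(&request)
        .send()
        .await?
        .text()
        .await?;

    println!("{}", response);
    Ok(())
}
```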

## Advantages

- **Declarative**: Define what to do, not how
- **Scalable**: Parallel execution across multiple supervisors
- **Resilient**: Automatic retry and error handling
- **Observable**: Real-time status and logging
- **Composable**: Reuse workflows as steps in larger workflows

docs/getting-started.md (new file, 186 lines)

@@ -0,0 +1,186 @@

# Getting Started with Horus

Quick start guide to running your first Horus job.

## Prerequisites

- Redis server running
- Rust toolchain installed
- Horus repository cloned

## Installation

### Build from Source

```bash
# Clone repository
git clone https://git.ourworld.tf/herocode/horus
cd horus

# Build all components
cargo build --release

# Binaries will be in target/release/
```

## Quick Start

### 1. Start Redis

```bash
# Using Docker
docker run -d -p 6379:6379 redis:latest

# Or install locally
redis-server
```

### 2. Start a Runner

```bash
# Start Hero runner
./target/release/herorunner my-runner

# Or SAL runner
./target/release/runner_sal my-sal-runner

# Or Osiris runner
./target/release/runner_osiris my-osiris-runner
```

### 3. Start the Supervisor

```bash
./target/release/supervisor --port 8080
```

### 4. Submit a Job

Using the Supervisor client:

```rust
use hero_supervisor_client::SupervisorClient;
use hero_job::Job;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = SupervisorClient::new("http://localhost:8080")?;

    let job = Job::new(
        "my-runner",
        "print('Hello from Horus!')".to_string(),
    );

    let result = client.create_job(job).await?;
    println!("Job ID: {}", result.id);

    Ok(())
}
```
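
To fetch the output once the job has run, the job flow in [architecture.md](./architecture.md) names a `get_job_result()` call; continuing the example above, a follow-up might look like this. The exact client method and result fields are assumptions based on the `JobResult` structure in [job-format.md](./job-format.md).

```rust
// Assumed client method and field names; see the job flow in
// architecture.md and the JobResult struct in job-format.md.
let job_result = client.get_job_result(&result.id).await?;
println!("Output: {}", job_result.output);
```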

## Example Workflows

### Simple Heroscript Execution

```heroscript
// Job payload
print("Hello World")
!!git.list
```

### SAL System Operation

```rhai
// List files in directory
let files = os.list_dir("/tmp");
for file in files {
    print(file);
}
```

### Osiris Data Storage

```rhai
// Store user data
let users = osiris.model("users");
let user = users.create(#{
    name: "Alice",
    email: "alice@example.com"
});
print(`Created user: ${user.id}`);
```

## Architecture Overview

```
┌──────────────┐
│ Coordinator  │  (Optional: for workflows)
└──────┬───────┘
       │
┌──────▼───────┐
│  Supervisor  │  (Job dispatcher)
└──────┬───────┘
       │
       │ Redis
       │
┌──────▼───────┐
│   Runners    │  (Job executors)
│  - Hero      │
│  - SAL       │
│  - Osiris    │
└──────────────┘
```

## Next Steps

- [Architecture Details](./architecture.md)
- [Runner Documentation](./runner/overview.md)
- [Supervisor API](./supervisor/overview.md)
- [Coordinator Workflows](./coordinator/overview.md)
- [Authentication](./supervisor/auth.md)

## Common Issues

### Runner Not Receiving Jobs

1. Check the Redis connection
2. Verify the runner ID matches the job target
3. Check the supervisor logs

### Job Signature Verification Failed

1. Ensure the job is properly signed
2. Verify the public key is registered
3. Check the signature format

### Timeout Errors

1. Increase the job timeout value
2. Check runner resource availability
3. Optimize the job payload

## Development

### Running Tests

```bash
# All tests
cargo test

# Specific component
cargo test -p hero-supervisor
cargo test -p runner-hero
```

### Debug Mode

```bash
# Enable debug logging
RUST_LOG=debug ./target/release/supervisor --port 8080
```

## Support

- Documentation: [docs.ourworld.tf/horus](https://docs.ourworld.tf/horus)
- Repository: [git.ourworld.tf/herocode/horus](https://git.ourworld.tf/herocode/horus)
- Issues: Report on the repository

docs/job-format.md (new file, 179 lines)

@@ -0,0 +1,179 @@

# Job Format

Jobs are the fundamental unit of work in Horus.

## Structure

```rust
pub struct Job {
    pub id: String,                        // Unique job identifier
    pub runner_id: String,                 // Target runner ID
    pub payload: String,                   // Job payload (script/command)
    pub timeout: Option<u64>,              // Timeout in seconds
    pub env_vars: HashMap<String, String>, // Environment variables
    pub signatures: Vec<Signature>,        // Cryptographic signatures
    pub created_at: i64,                   // Creation timestamp
    pub status: JobStatus,                 // Current status
}
```

## Job Status

```rust
pub enum JobStatus {
    Pending,   // Queued, not yet started
    Running,   // Currently executing
    Completed, // Finished successfully
    Failed,    // Execution failed
    Timeout,   // Exceeded timeout
    Cancelled, // Manually cancelled
}
```

## Signature Format

```rust
pub struct Signature {
    pub public_key: String, // Signer's public key
    pub signature: String,  // Cryptographic signature
    pub algorithm: String,  // Signature algorithm (e.g., "ed25519")
}
```

## Creating a Job

### Minimal Job

```rust
use hero_job::Job;

let job = Job::new(
    "my-runner",
    "print('Hello World')".to_string(),
);
```

### With Timeout

```rust
let job = Job::builder()
    .runner_id("my-runner")
    .payload("long_running_task()")
    .timeout(300) // 5 minutes
    .build();
```

### With Environment Variables

```rust
use std::collections::HashMap;

let mut env_vars = HashMap::new();
env_vars.insert("API_KEY".to_string(), "secret".to_string());
env_vars.insert("ENV".to_string(), "production".to_string());

let job = Job::builder()
    .runner_id("my-runner")
    .payload("deploy_app()")
    .env_vars(env_vars)
    .build();
```

### With Signature

```rust
use hero_job::{Job, Signature};

let job = Job::builder()
    .runner_id("my-runner")
    .payload("important_task()")
    .signature(Signature {
        public_key: "ed25519:abc123...".to_string(),
        signature: "sig:xyz789...".to_string(),
        algorithm: "ed25519".to_string(),
    })
    .build();
```
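
For reference, here is a minimal sketch of how a signature like the one above could be produced with the `ed25519-dalek` crate, since ed25519 is the algorithm these docs recommend. The key encoding and the exact bytes Horus signs (shown here as the raw payload) are assumptions for illustration, not the project's actual signing code.

```rust
use ed25519_dalek::{Signer, SigningKey, Verifier, VerifyingKey};
use rand::rngs::OsRng;

// Illustrative ed25519 signing of a job payload (ed25519-dalek v2 API).
// Exactly which bytes Horus signs, and how keys are encoded, is assumed.
fn main() {
    let signing_key: SigningKey = SigningKey::generate(&mut OsRng);
    let payload = b"important_task()";

    let signature = signing_key.sign(payload);

    // The supervisor-side check: verify against the signer's public key.
    let verifying_key: VerifyingKey = signing_key.verifying_key();
    assert!(verifying_key.verify(payload, &signature).is_ok());

    println!("signature: {}", hex::encode(signature.to_bytes()));
}
```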

## Payload Format

The payload format depends on the target runner:

### Hero Runner
Heroscript content:
```heroscript
!!git.list
print("Repositories listed")
!!docker.ps
```

### SAL Runner
Rhai script with SAL modules:
```rhai
let files = os.list_dir("/tmp");
for file in files {
    print(file);
}
```

### Osiris Runner
Rhai script with Osiris database:
```rhai
let users = osiris.model("users");
let user = users.create(#{
    name: "Alice",
    email: "alice@example.com"
});
```

## Job Result

```rust
pub struct JobResult {
    pub job_id: String,
    pub status: JobStatus,
    pub output: String,        // Stdout
    pub error: Option<String>, // Stderr or error message
    pub exit_code: Option<i32>,
    pub started_at: Option<i64>,
    pub completed_at: Option<i64>,
}
```

## Best Practices

### Timeouts
- Always set timeouts for jobs
- Default: 60 seconds
- Long-running jobs: Set an appropriate timeout
- Infinite jobs: Use separate monitoring

### Environment Variables
- Don't store secrets in env vars in production
- Use vault/secret management instead
- Keep env vars minimal
- Document required variables

### Signatures
- Always sign jobs in production
- Use strong algorithms (ed25519)
- Rotate keys regularly
- Store private keys securely

### Payloads
- Keep payloads concise
- Validate input data
- Handle errors gracefully
- Log important operations

## Validation

Jobs are validated before execution:

1. **Structure**: All required fields present
2. **Signature**: Valid cryptographic signature
3. **Runner**: Target runner exists and is available
4. **Payload**: Non-empty payload
5. **Timeout**: Reasonable timeout value

Invalid jobs are rejected before execution.
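
A minimal sketch of what such pre-execution validation could look like, mirroring the five checks above. Types are simplified stand-ins for the `Job` struct shown earlier, and the signature check is a stub (the real check is cryptographic).

```rust
// Mirrors the validation checks listed above; illustration only.
fn validate(
    signatures_valid: bool, // stub for the cryptographic signature check
    runner_exists: bool,
    payload: &str,
    timeout: Option<u64>,
) -> Result<(), String> {
    if !signatures_valid {
        return Err("invalid or missing signature".into());
    }
    if !runner_exists {
        return Err("target runner not available".into());
    }
    if payload.trim().is_empty() {
        return Err("empty payload".into());
    }
    if let Some(t) = timeout {
        if t == 0 || t > 86_400 {
            return Err("unreasonable timeout".into());
        }
    }
    Ok(())
}
```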

docs/runner/hero.md (new file, 71 lines)

@@ -0,0 +1,71 @@

# Hero Runner

Executes heroscripts using the Hero CLI tool.

## Overview

The Hero runner pipes job payloads directly to `hero run -s` via stdin, making it ideal for executing Hero automation tasks and heroscripts.
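
In outline, that stdin piping could look like the sketch below. It is illustrative only: the actual runner also applies timeouts, environment variables, and signature checks before spawning the process.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

// Sketch of piping a heroscript payload to `hero run -s` via stdin,
// as described above. Error handling and timeouts are omitted.
fn run_heroscript(payload: &str) -> std::io::Result<std::process::Output> {
    let mut child = Command::new("hero")
        .args(["run", "-s"])
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    if let Some(mut stdin) = child.stdin.take() {
        stdin.write_all(payload.as_bytes())?;
    } // stdin is dropped here, closing the pipe so `hero` sees EOF

    child.wait_with_output()
}
```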

## Features

- **Heroscript Execution**: Direct stdin piping to `hero run -s`
- **No Temp Files**: Secure execution without filesystem artifacts
- **Environment Variables**: Full environment variable support
- **Timeout Support**: Respects job timeout settings
- **Signature Verification**: Cryptographic job verification

## Usage

```bash
# Start the runner
herorunner my-hero-runner

# With custom Redis
herorunner my-hero-runner --redis-url redis://custom:6379
```

## Job Payload

The payload should contain the heroscript content:

```heroscript
!!git.list
print("Repositories listed")
!!docker.ps
```

## Examples

### Simple Print
```heroscript
print("Hello from heroscript!")
```

### Hero Actions
```heroscript
!!git.list
!!docker.start name:"myapp"
```

### With Environment Variables
```json
{
  "payload": "print(env.MY_VAR)",
  "env_vars": {
    "MY_VAR": "Hello World"
  }
}
```

## Requirements

- `hero` CLI must be installed and in PATH
- Redis server accessible
- Valid job signatures

## Error Handling

- **Hero CLI Not Found**: Returns an error if the `hero` command is unavailable
- **Timeout**: Kills the process if the timeout is exceeded
- **Non-zero Exit**: Returns an error with the hero CLI output
- **Invalid Signature**: Rejects the job before execution
docs/runner/osiris.md (new file, 142 lines)

@@ -0,0 +1,142 @@

# Osiris Runner

Database-backed runner for structured data storage and retrieval.

## Overview

The Osiris runner executes Rhai scripts with access to a model-based database system, enabling structured data operations and persistence.

## Features

- **Rhai Scripting**: Execute Rhai scripts with Osiris database access
- **Model-Based Storage**: Define and use data models
- **CRUD Operations**: Create, read, update, delete records
- **Query Support**: Search and filter data
- **Schema Validation**: Type-safe data operations
- **Transaction Support**: Atomic database operations

## Usage

```bash
# Start the runner
runner_osiris my-osiris-runner

# With custom Redis
runner_osiris my-osiris-runner --redis-url redis://custom:6379
```

## Job Payload

The payload should contain a Rhai script using Osiris operations:

```rhai
// Example: Store data
let model = osiris.model("users");
let user = model.create(#{
    name: "Alice",
    email: "alice@example.com",
    age: 30
});
print(user.id);

// Example: Retrieve data
let found = model.get(user.id);
print(found.name);
```

## Examples

### Create Model and Store Data
```rhai
// Define model
let posts = osiris.model("posts");

// Create record
let post = posts.create(#{
    title: "Hello World",
    content: "First post",
    author: "Alice",
    published: true
});

print(`Created post with ID: ${post.id}`);
```

### Query Data
```rhai
let posts = osiris.model("posts");

// Find by field
let published = posts.find(#{
    published: true
});

for post in published {
    print(post.title);
}
```

### Update Records
```rhai
let posts = osiris.model("posts");

// Get record
let post = posts.get("post-123");

// Update fields
post.content = "Updated content";
posts.update(post);
```

### Delete Records
```rhai
let posts = osiris.model("posts");

// Delete by ID
posts.delete("post-123");
```

### Transactions
```rhai
osiris.transaction(|| {
    let users = osiris.model("users");
    let posts = osiris.model("posts");

    let user = users.create(#{ name: "Bob" });
    let post = posts.create(#{
        title: "Bob's Post",
        author_id: user.id
    });

    // Both operations commit together
});
```

## Data Models

Models are defined dynamically through Rhai scripts:

```rhai
let model = osiris.model("products");

// The model automatically handles:
// - ID generation
// - Timestamps (created_at, updated_at)
// - Schema validation
// - Indexing
```

## Requirements

- Redis server accessible
- Osiris database configured
- Valid job signatures
- Sufficient storage for data operations

## Use Cases

- **Configuration Storage**: Store application configs
- **User Data**: Manage user profiles and preferences
- **Workflow State**: Persist workflow execution state
- **Metrics & Logs**: Store structured logs and metrics
- **Cache Management**: Persistent caching layer
docs/runner/overview.md (new file, 96 lines)

@@ -0,0 +1,96 @@

# Runners Overview

Runners are the execution layer in the Horus architecture. They receive jobs from the Supervisor via Redis queues and execute the actual workload.

## Architecture

```
Supervisor → Redis Queue → Runner → Execute Job → Return Result
```

## Available Runners

Horus provides three specialized runners:

### 1. **Hero Runner**
Executes heroscripts using the Hero CLI ecosystem.

**Use Cases:**
- Running Hero automation tasks
- Executing heroscripts from job payloads
- Integration with Hero CLI tools

**Binary:** `herorunner`

[→ Hero Runner Documentation](./hero.md)

### 2. **SAL Runner**
System Abstraction Layer runner for system-level operations.

**Use Cases:**
- OS operations (file, process, network)
- Infrastructure management (Kubernetes, VMs)
- Cloud provider operations (Hetzner)
- Database operations (Redis, Postgres)

**Binary:** `runner_sal`

[→ SAL Runner Documentation](./sal.md)

### 3. **Osiris Runner**
Database-backed runner for data storage and retrieval using Rhai scripts.

**Use Cases:**
- Structured data storage
- Model-based data operations
- Rhai script execution with database access

**Binary:** `runner_osiris`

[→ Osiris Runner Documentation](./osiris.md)

## Common Features

All runners implement the `Runner` trait and provide:

- **Job Execution**: Process jobs from Redis queues
- **Signature Verification**: Verify job signatures before execution
- **Timeout Support**: Respect job timeout settings
- **Environment Variables**: Pass environment variables to jobs
- **Error Handling**: Comprehensive error reporting
- **Logging**: Structured logging for debugging

## Runner Protocol

Runners communicate with the Supervisor using a Redis-based protocol (the runner's side of the loop is sketched after this list):

1. **Job Queue**: Supervisor pushes jobs to `runner:{runner_id}:jobs`
2. **Job Processing**: Runner pops job, validates signature, executes
3. **Result Storage**: Runner stores result in `job:{job_id}:result`
4. **Status Updates**: Runner updates job status throughout execution
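
A minimal sketch of that loop from the runner's side, using the key names above and the `redis` crate's blocking pop. Job parsing and actual execution are stubbed, since they depend on the runner type; the real runners also verify signatures and enforce timeouts before executing.

```rust
use redis::Commands;

// Illustrative runner loop: block on the job queue, "execute" the job,
// and store the result under the documented result key.
fn run_loop(conn: &mut redis::Connection, runner_id: &str) -> redis::RedisResult<()> {
    let queue = format!("runner:{}:jobs", runner_id);
    loop {
        // BRPOP returns (key, value); a timeout of 0 blocks indefinitely.
        let (_key, job_json): (String, String) = conn.brpop(&queue, 0.0)?;

        // Parse and execute the job (stubbed; depends on the runner type).
        let job_id = "job-id-from-payload"; // placeholder: parsed from job_json
        let output = format!("executed: {}", job_json);

        conn.set::<_, _, ()>(format!("job:{}:result", job_id), output)?;
    }
}
```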

## Starting a Runner

```bash
# Hero Runner
herorunner <runner_id> [--redis-url <url>]

# SAL Runner
runner_sal <runner_id> [--redis-url <url>]

# Osiris Runner
runner_osiris <runner_id> [--redis-url <url>]
```

## Configuration

All runners accept:

- `runner_id`: Unique identifier for the runner (required)
- `--redis-url`: Redis connection URL (default: `redis://localhost:6379`)

## Security

- Jobs must be cryptographically signed
- Runners verify signatures before execution
- Untrusted jobs are rejected
- Environment variables should not contain sensitive data in production
docs/runner/sal.md (new file, 123 lines)

@@ -0,0 +1,123 @@

# SAL Runner

System Abstraction Layer runner for system-level operations.

## Overview

The SAL runner executes Rhai scripts with access to system abstraction modules for OS operations, infrastructure management, and cloud provider interactions.

## Features

- **Rhai Scripting**: Execute Rhai scripts with SAL modules
- **System Operations**: File, process, and network management
- **Infrastructure**: Kubernetes, VM, and container operations
- **Cloud Providers**: Hetzner and other cloud integrations
- **Database Access**: Redis and Postgres client operations
- **Networking**: Mycelium and network configuration

## Available SAL Modules

### Core Modules
- **sal-os**: Operating system operations
- **sal-process**: Process management
- **sal-text**: Text processing utilities
- **sal-net**: Network operations

### Infrastructure
- **sal-virt**: Virtualization management
- **sal-kubernetes**: Kubernetes cluster operations
- **sal-zinit-client**: Zinit process manager

### Storage & Data
- **sal-redisclient**: Redis operations
- **sal-postgresclient**: PostgreSQL operations
- **sal-vault**: Secret management

### Networking
- **sal-mycelium**: Mycelium network integration

### Cloud Providers
- **sal-hetzner**: Hetzner cloud operations

### Version Control
- **sal-git**: Git repository operations

## Usage

```bash
# Start the runner
runner_sal my-sal-runner

# With custom Redis
runner_sal my-sal-runner --redis-url redis://custom:6379
```

## Job Payload

The payload should contain a Rhai script using SAL modules:

```rhai
// Example: List files
let files = os.list_dir("/tmp");
print(files);

// Example: Process management
let pid = process.spawn("ls", ["-la"]);
let output = process.wait(pid);
print(output);
```

## Examples

### File Operations
```rhai
// Read file
let content = os.read_file("/path/to/file");
print(content);

// Write file
os.write_file("/path/to/output", "Hello World");
```

### Kubernetes Operations
```rhai
// List pods
let pods = k8s.list_pods("default");
for pod in pods {
    print(pod.name);
}
```

### Redis Operations
```rhai
// Set value
redis.set("key", "value");

// Get value
let val = redis.get("key");
print(val);
```

### Git Operations
```rhai
// Clone repository
git.clone("https://github.com/user/repo", "/tmp/repo");

// Get status
let status = git.status("/tmp/repo");
print(status);
```

## Requirements

- Redis server accessible
- System permissions for requested operations
- Valid job signatures
- SAL modules available in runtime

## Security Considerations

- SAL operations have system-level access
- Jobs must be from trusted sources
- Signature verification is mandatory
- Limit runner permissions in production
docs/supervisor/overview.md (new file, 88 lines)

@@ -0,0 +1,88 @@

# Supervisor Overview

The Supervisor is the job dispatcher layer in Horus. It receives jobs, verifies signatures, and routes them to appropriate runners.

## Architecture

```
Client → Supervisor → Redis Queue → Runner
```

## Responsibilities

### 1. **Job Admission**
- Receive jobs via OpenRPC interface
- Validate job structure and required fields
- Verify cryptographic signatures

### 2. **Authentication & Authorization**
- Verify job signatures using public keys
- Ensure jobs are from authorized sources
- Reject unsigned or invalid jobs

### 3. **Job Routing**
- Route jobs to appropriate runner queues
- Maintain runner registry
- Load balance across available runners

### 4. **Job Management**
- Track job status and lifecycle
- Provide job query and listing APIs
- Store job results and logs

### 5. **Runner Management**
- Register and track available runners
- Monitor runner health and availability
- Handle runner disconnections

## OpenRPC Interface

The Supervisor exposes an OpenRPC API for job management (a client example follows the method lists):

### Job Operations
- `create_job`: Submit a new job
- `get_job`: Retrieve job details
- `list_jobs`: List all jobs
- `delete_job`: Remove a job
- `get_job_logs`: Retrieve job execution logs

### Runner Operations
- `register_runner`: Register a new runner
- `list_runners`: List available runners
- `get_runner_status`: Check runner health
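
For example, `create_job` is the call the [getting started guide](../getting-started.md) makes through the `hero_supervisor_client` crate; a compact version is repeated here for reference. Only `SupervisorClient::new` and `create_job` appear in these docs, so treat anything beyond that as an assumption.

```rust
use hero_supervisor_client::SupervisorClient;
use hero_job::Job;

// create_job via the client crate, as shown in getting-started.md.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = SupervisorClient::new("http://localhost:8080")?;
    let job = Job::new("my-runner", "print('hi')".to_string());
    let result = client.create_job(job).await?;
    println!("submitted job {}", result.id);
    Ok(())
}
```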

## Job Lifecycle

1. **Submission**: Client submits job via OpenRPC
2. **Validation**: Supervisor validates structure and signature
3. **Queueing**: Job pushed to runner's Redis queue
4. **Execution**: Runner processes job
5. **Completion**: Result stored in Redis
6. **Retrieval**: Client retrieves result via OpenRPC

## Transport Options

The Supervisor supports multiple transport layers:

- **HTTP**: Standard HTTP/HTTPS transport
- **Mycelium**: Peer-to-peer encrypted transport

## Configuration

```bash
# Start supervisor
supervisor --port 8080 --redis-url redis://localhost:6379

# With Mycelium
supervisor --port 8080 --mycelium --redis-url redis://localhost:6379
```

## Security

- All jobs must be cryptographically signed
- Signatures verified before job admission
- Public key infrastructure for identity
- Optional TLS for HTTP transport
- End-to-end encryption via Mycelium

[→ Authentication Documentation](./auth.md)