revert: rename back to docs directory
docs/.collection
Normal file
@@ -0,0 +1 @@
horus
docs/README.md
Normal file
@@ -0,0 +1,67 @@

# Horus Documentation

**Hierarchical Orchestration Runtime for Universal Scripts**

Horus is a distributed job execution system with three layers: Coordinator, Supervisor, and Runner.

## Quick Links

- **[Getting Started](./getting-started.md)** - Install and run your first job
- **[Architecture](./architecture.md)** - System design and components
- **[Etymology](./ethymology.md)** - The meaning behind the name

## Components

### Coordinator
Workflow orchestration engine for DAG-based execution.

- [Overview](./coordinator/coordinator.md)

### Supervisor
Job dispatcher with authentication and routing.

- [Overview](./supervisor/supervisor.md)
- [Authentication](./supervisor/auth.md)
- [OpenRPC API](./supervisor/openrpc.json)

### Runners
Job executors for different workload types.

- [Runner Overview](./runner/runners.md)
- [Hero Runner](./runner/hero.md) - Heroscript execution
- [SAL Runner](./runner/sal.md) - System operations
- [Osiris Runner](./runner/osiris.md) - Database operations

## Core Concepts

### Jobs
Units of work executed by runners. Each job contains:
- Target runner ID
- Payload (script/command)
- Cryptographic signature
- Optional timeout and environment variables
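
For illustration, here is a job assembled with the `hero_job` builder described in [Job Format](./job-format.md) (a sketch reusing that document's API):

```rust
use hero_job::Job;

// Target runner, payload, and an explicit timeout; a signature is
// attached before submission (see ./job-format.md).
let job = Job::builder()
    .runner_id("my-runner")
    .payload("print('hello')")
    .timeout(60)
    .build();
```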

### Workflows
Multi-step DAGs executed by the Coordinator. Steps can:
- Run in parallel or sequence
- Pass data between steps
- Target different runners
- Handle errors and retries

### Signatures
All jobs must be cryptographically signed:
- Ensures job authenticity
- Prevents tampering
- Enables authorization

## Use Cases

- **Automation**: Execute system tasks and scripts
- **Data Pipelines**: Multi-step ETL workflows
- **CI/CD**: Build, test, and deployment pipelines
- **Infrastructure**: Manage cloud resources and containers
- **Integration**: Connect systems via scripted workflows

## Repository

[git.ourworld.tf/herocode/horus](https://git.ourworld.tf/herocode/horus)

docs/architecture.md
Normal file
@@ -0,0 +1,185 @@

# Architecture

Horus is a hierarchical orchestration runtime with three layers: Coordinator, Supervisor, and Runner.

## Overview

```
┌─────────────────────────────────────────────────────────┐
│                       Coordinator                       │
│            (Workflow Engine - DAG Execution)            │
│                                                         │
│  • Parses workflow definitions                          │
│  • Resolves dependencies                                │
│  • Dispatches ready steps                               │
│  • Tracks workflow state                                │
└────────────────────┬────────────────────────────────────┘
                     │ OpenRPC (HTTP/Mycelium)
                     │
┌────────────────────▼────────────────────────────────────┐
│                       Supervisor                         │
│             (Job Dispatcher & Authenticator)             │
│                                                          │
│  • Verifies job signatures                               │
│  • Routes jobs to runners                                │
│  • Manages runner registry                               │
│  • Tracks job lifecycle                                  │
└────────────────────┬─────────────────────────────────────┘
                     │ Redis Queue Protocol
                     │
┌────────────────────▼─────────────────────────────────────┐
│                        Runners                           │
│                    (Job Executors)                       │
│                                                          │
│   ┌──────────┐  ┌──────────┐  ┌──────────┐               │
│   │   Hero   │  │   SAL    │  │  Osiris  │               │
│   │  Runner  │  │  Runner  │  │  Runner  │               │
│   └──────────┘  └──────────┘  └──────────┘               │
└──────────────────────────────────────────────────────────┘
```

## Layers

### 1. Coordinator (Optional)

**Purpose:** Workflow orchestration and DAG execution

**Responsibilities:**
- Parse and validate workflow definitions
- Execute DAG-based flows
- Manage step dependencies
- Route jobs to appropriate supervisors
- Handle multi-step workflows

**Use When:**
- You need multi-step workflows
- Jobs have dependencies
- Parallel execution is required
- Complex data pipelines

[→ Coordinator Documentation](./coordinator/coordinator.md)

### 2. Supervisor (Required)

**Purpose:** Job admission, authentication, and routing

**Responsibilities:**
- Receive jobs via OpenRPC interface
- Verify cryptographic signatures
- Route jobs to appropriate runners
- Manage runner registry
- Track job status and results

**Features:**
- OpenRPC API for job management
- HTTP and Mycelium transport
- Signature-based authentication
- Runner health monitoring

[→ Supervisor Documentation](./supervisor/supervisor.md)

### 3. Runners (Required)

**Purpose:** Execute actual job workloads

**Available Runners:**
- **Hero Runner**: Executes heroscripts via Hero CLI
- **SAL Runner**: System operations (OS, K8s, cloud, etc.)
- **Osiris Runner**: Database operations with Rhai scripts

**Common Features:**
- Redis queue-based job polling
- Signature verification
- Timeout support
- Environment variable handling

[→ Runner Documentation](./runner/runners.md)

## Communication Protocols

### Client ↔ Coordinator
- **Protocol:** OpenRPC
- **Transport:** HTTP or Mycelium
- **Operations:** Submit workflow, check status, retrieve results

### Coordinator ↔ Supervisor
- **Protocol:** OpenRPC
- **Transport:** HTTP or Mycelium
- **Operations:** Create job, get status, retrieve logs

### Supervisor ↔ Runner
- **Protocol:** Redis Queue
- **Transport:** Redis pub/sub and lists
- **Operations:** Push job, poll queue, store result
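
To make the OpenRPC hops concrete, here is a sketch of a `job.run` call from a client to the Supervisor over HTTP. The method name and parameter shape come from [the OpenRPC spec](./supervisor/openrpc.json); the endpoint URL and the use of `reqwest` are illustrative assumptions:

```rust
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // JSON-RPC request; `job.run` takes { secret, job } per openrpc.json.
    let request = json!({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "job.run",
        "params": [{
            "secret": "<api key>",
            "job": {} // a full Job object in practice, see docs/job-format.md
        }]
    });

    let response: serde_json::Value = reqwest::Client::new()
        .post("http://localhost:8080") // supervisor address: an assumption
        .json(&request)
        .send()
        .await?
        .json()
        .await?;
    println!("{response}");
    Ok(())
}
```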

## Job Flow

### Simple Job (No Coordinator)
```
1. Client → Supervisor: create_job()
2. Supervisor: Verify signature
3. Supervisor → Redis: Push to runner queue
4. Runner ← Redis: Pop job
5. Runner: Execute job
6. Runner → Redis: Store result
7. Client ← Supervisor: get_job_result()
```

### Workflow (With Coordinator)
```
1. Client → Coordinator: submit_workflow()
2. Coordinator: Parse DAG
3. Coordinator: Identify ready steps
4. Coordinator → Supervisor: create_job() for each ready step
5. Supervisor → Runner: Route via Redis
6. Runner: Execute and return result
7. Coordinator: Update workflow state
8. Coordinator: Dispatch next ready steps
9. Repeat until workflow complete
```

## Security Model

### Authentication
- Jobs must be cryptographically signed
- Signatures verified at Supervisor layer
- Public key infrastructure for identity

### Authorization
- Runners only execute signed jobs
- Signature verification before execution
- Untrusted jobs rejected

### Transport Security
- Optional TLS for HTTP transport
- End-to-end encryption via Mycelium
- No plaintext credentials

[→ Authentication Details](./supervisor/auth.md)

## Deployment Patterns

### Minimal Setup
```
Redis + Supervisor + Runner(s)
```
Single machine, simple job execution.

### Distributed Setup
```
Redis Cluster + Multiple Supervisors + Runner Pool
```
High availability, load balancing.

### Full Orchestration
```
Coordinator + Multiple Supervisors + Runner Pool
```
Complex workflows, multi-step pipelines.

## Design Principles

1. **Hierarchical**: Clear separation of concerns across layers
2. **Secure**: Signature-based authentication throughout
3. **Scalable**: Horizontal scaling at each layer
4. **Observable**: Comprehensive logging and status tracking
5. **Flexible**: Multiple runners for different workload types

docs/coordinator/coordinator.md
Normal file
@@ -0,0 +1,145 @@

# Coordinator Overview

The Coordinator is the workflow orchestration layer in Horus. It executes DAG-based flows by managing job dependencies and dispatching ready steps to supervisors.

## Architecture

```
Client → Coordinator → Supervisor(s) → Runner(s)
```

## Responsibilities

### 1. **Workflow Management**
- Parse and validate DAG workflow definitions
- Track workflow execution state
- Manage step dependencies

### 2. **Job Orchestration**
- Determine which steps are ready to execute
- Dispatch jobs to appropriate supervisors
- Handle step failures and retries

### 3. **Dependency Resolution**
- Track step completion
- Resolve data dependencies between steps
- Pass outputs from completed steps to dependent steps

### 4. **Multi-Supervisor Coordination**
- Route jobs to specific supervisors
- Handle supervisor failures
- Load balance across supervisors

## Workflow Definition

Workflows are defined as Directed Acyclic Graphs (DAGs):

```yaml
workflow:
  name: "data-pipeline"
  steps:
    - id: "fetch"
      runner: "hero"
      payload: "!!http.get url:'https://api.example.com/data'"

    - id: "process"
      runner: "sal"
      depends_on: ["fetch"]
      payload: |
        let data = input.fetch;
        let processed = process_data(data);
        processed

    - id: "store"
      runner: "osiris"
      depends_on: ["process"]
      payload: |
        let model = osiris.model("results");
        model.create(input.process);
```

## Features

### DAG Execution
- Parallel execution of independent steps
- Sequential execution of dependent steps
- Automatic dependency resolution

### Error Handling
- Step-level retry policies
- Workflow-level error handlers
- Partial workflow recovery

### Data Flow
- Pass outputs between steps
- Transform data between steps
- Aggregate results from parallel steps

### Monitoring
- Real-time workflow status
- Step-level progress tracking
- Execution metrics and logs

## Workflow Lifecycle

1. **Submission**: Client submits workflow definition
2. **Validation**: Coordinator validates DAG structure
3. **Scheduling**: Determine ready steps (no pending dependencies; a sketch follows this list)
4. **Dispatch**: Send jobs to supervisors
5. **Tracking**: Monitor step completion
6. **Progression**: Execute next ready steps
7. **Completion**: Workflow finishes when all steps complete
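
To make the scheduling step concrete, here is a minimal sketch of ready-step selection (illustrative only; the Coordinator's internal types are not part of this document):

```rust
use std::collections::HashSet;

struct Step {
    id: String,
    depends_on: Vec<String>,
}

/// A step is ready when it has not completed yet and every one of its
/// dependencies has. (A real scheduler would also skip steps that are
/// already dispatched or running.)
fn ready_steps<'a>(steps: &'a [Step], completed: &HashSet<String>) -> Vec<&'a Step> {
    steps
        .iter()
        .filter(|s| !completed.contains(&s.id))
        .filter(|s| s.depends_on.iter().all(|d| completed.contains(d)))
        .collect()
}
```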

## Use Cases

### Data Pipelines
```
Extract → Transform → Load
```

### CI/CD Workflows
```
Build → Test → Deploy
```

### Multi-Stage Processing
```
Fetch Data → Process → Validate → Store → Notify
```

### Parallel Execution
```
        ┌─ Task A ─┐
Start ──┼─ Task B ─┼── Aggregate → Finish
        └─ Task C ─┘
```

## Configuration

```bash
# Start coordinator
coordinator --port 9090 --redis-url redis://localhost:6379

# With multiple supervisors
coordinator --port 9090 \
  --supervisor http://supervisor1:8080 \
  --supervisor http://supervisor2:8080
```

## API

The Coordinator exposes an OpenRPC API:

- `submit_workflow`: Submit a new workflow
- `get_workflow_status`: Check workflow progress
- `list_workflows`: List all workflows
- `cancel_workflow`: Stop a running workflow
- `get_workflow_logs`: Retrieve execution logs
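
As a sketch, a `submit_workflow` request as it might look on the wire; the JSON-RPC envelope is standard, but the exact parameter shape is an assumption, since this document does not pin it down:

```rust
use serde_json::json;

// Hypothetical request body; the workflow object mirrors the YAML
// definition above. Verify the parameter shape against the running API.
let request = json!({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "submit_workflow",
    "params": [{
        "workflow": {
            "name": "data-pipeline",
            "steps": [
                { "id": "fetch", "runner": "hero",
                  "payload": "!!http.get url:'https://api.example.com/data'" }
            ]
        }
    }]
});
```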

## Advantages

- **Declarative**: Define what to do, not how
- **Scalable**: Parallel execution across multiple supervisors
- **Resilient**: Automatic retry and error handling
- **Observable**: Real-time status and logging
- **Composable**: Reuse workflows as steps in larger workflows

docs/ethymology.md
Normal file
@@ -0,0 +1,91 @@

# HORUS — The Meaning Behind the Name

*Hierarchical Orchestration Runtime for Universal Scripts*

---

## 1. Why “Horus”?

**Horus** is one of the oldest and most symbolic deities of ancient Egypt:
a god of the **sky, perception, order, and dominion**.

In mythology, Horus *is* the sky itself;
his **right eye is the sun** (clarity, authority),
his **left eye the moon** (rhythm, balance).

This symbolism aligns perfectly with a system built to supervise, coordinate, and execute distributed workloads.

---

## 2. Symbolic Mapping to the Architecture

- **Sky** → the compute fabric itself
- **Solar eye (sun)** → supervisor layer (visibility, authentication, authority)
- **Lunar eye (moon)** → coordinator layer (workflow rhythms, stepwise order)
- **Falcon wings** → runners (swift execution of tasks)
- **Battle against chaos** → ordering and normalizing raw jobs

Horus is an archetype of **oversight**, **correct action**, and **restoring balance**—all fundamental qualities of an agentic execution system.

---

## 3. The Name as a Backronym

**H O R U S**

**H**ierarchical
**O**rchestration
**R**untime for
**U**niversal
**S**cripts

This describes the system exactly:
a runtime that receives jobs, authenticates them, orchestrates workflows, and executes scripts across distributed runners.

---

## 4. Why It Fits This Stack

The stack consists of:

- **Job** – the incoming intent
- **Supervisor** – verifies, authenticates, admits
- **Coordinator** – plans, arranges, sequences
- **Runner** – executes scripts
- **SAL** – system-level script engine
- **Osiris** – object-level storage & retrieval engine

All of this is unified by the central logic of *oversight, orchestration, and action*.

Horus expresses these ideas precisely:

- Observation → validation & monitoring
- Order → workflow coordination
- Action → script execution
- Sky → the domain that contains all processes beneath it

---

## 5. Visual & Conceptual Identity

**Themes:**
- The Eye of Horus → observability, correctness, safety
- Falcon → agile execution
- Sky → the domain of computation
- Light (sun/moon) → insight, clarity, cycle

**Palette concepts:**
- Gold + deep blue
- Light on dark (sun in sky)
- Single-line geometric Eye (modernized)

The name offers both deep mythic roots and clean, modern branding potential.

---

## 6. Narrative Summary

**HORUS** is the execution sky:
the domain where jobs arrive, gain form, and become actions.
It brings clarity to chaos, structure to tasks, and order to distributed systems.

It is not just a name.
It is the story of a system that sees clearly, acts decisively, and orchestrates wisely.

---

docs/getting-started.md
Normal file
@@ -0,0 +1,186 @@

# Getting Started with Horus

Quick start guide to running your first Horus job.

## Prerequisites

- Redis server running
- Rust toolchain installed
- Horus repository cloned

## Installation

### Build from Source

```bash
# Clone repository
git clone https://git.ourworld.tf/herocode/horus
cd horus

# Build all components
cargo build --release

# Binaries will be in target/release/
```

## Quick Start

### 1. Start Redis

```bash
# Using Docker
docker run -d -p 6379:6379 redis:latest

# Or install locally
redis-server
```

### 2. Start a Runner

```bash
# Start Hero runner
./target/release/herorunner my-runner

# Or SAL runner
./target/release/runner_sal my-sal-runner

# Or Osiris runner
./target/release/runner_osiris my-osiris-runner
```

### 3. Start the Supervisor

```bash
./target/release/supervisor --port 8080
```

### 4. Submit a Job

Using the Supervisor client:

```rust
use hero_supervisor_client::SupervisorClient;
use hero_job::Job;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = SupervisorClient::new("http://localhost:8080")?;

    let job = Job::new(
        "my-runner",
        "print('Hello from Horus!')".to_string(),
    );

    let result = client.create_job(job).await?;
    println!("Job ID: {}", result.id);

    Ok(())
}
```

## Example Workflows

### Simple Heroscript Execution

```heroscript
print("Hello World")
!!git.list
```

### SAL System Operation

```rhai
// List files in directory
let files = os.list_dir("/tmp");
for file in files {
    print(file);
}
```

### Osiris Data Storage

```rhai
// Store user data
let users = osiris.model("users");
let user = users.create(#{
    name: "Alice",
    email: "alice@example.com"
});
print(`Created user: ${user.id}`);
```

## Architecture Overview

```
┌──────────────┐
│ Coordinator  │  (Optional: For workflows)
└──────┬───────┘
       │
┌──────▼───────┐
│  Supervisor  │  (Job dispatcher)
└──────┬───────┘
       │
       │ Redis
       │
┌──────▼───────┐
│   Runners    │  (Job executors)
│  - Hero      │
│  - SAL       │
│  - Osiris    │
└──────────────┘
```

## Next Steps

- [Architecture Details](./architecture.md)
- [Runner Documentation](./runner/runners.md)
- [Supervisor API](./supervisor/supervisor.md)
- [Coordinator Workflows](./coordinator/coordinator.md)
- [Authentication](./supervisor/auth.md)

## Common Issues

### Runner Not Receiving Jobs

1. Check Redis connection
2. Verify runner ID matches job target
3. Check supervisor logs

### Job Signature Verification Failed

1. Ensure job is properly signed
2. Verify public key is registered
3. Check signature format

### Timeout Errors

1. Increase job timeout value
2. Check runner resource availability
3. Optimize job payload

## Development

### Running Tests

```bash
# All tests
cargo test

# Specific component
cargo test -p hero-supervisor
cargo test -p runner-hero
```

### Debug Mode

```bash
# Enable debug logging
RUST_LOG=debug ./target/release/supervisor --port 8080
```

## Support

- Documentation: [docs.ourworld.tf/horus](https://docs.ourworld.tf/horus)
- Repository: [git.ourworld.tf/herocode/horus](https://git.ourworld.tf/herocode/horus)
- Issues: Report on the repository
docs/glossary.md
Normal file
@@ -0,0 +1,6 @@

# Terminology

- Flow: A workflow that is executed by the coordinator.
- Job: A unit of work that is executed by a runner.
- Supervisor: A job dispatcher that routes jobs to the appropriate runners.
- Runner: A job executor that runs the actual job steps.

docs/job-format.md
Normal file
@@ -0,0 +1,179 @@

# Job Format

Jobs are the fundamental unit of work in Horus.

## Structure

```rust
pub struct Job {
    pub id: String,                        // Unique job identifier
    pub runner_id: String,                 // Target runner ID
    pub payload: String,                   // Job payload (script/command)
    pub timeout: Option<u64>,              // Timeout in seconds
    pub env_vars: HashMap<String, String>, // Environment variables
    pub signatures: Vec<Signature>,        // Cryptographic signatures
    pub created_at: i64,                   // Creation timestamp
    pub status: JobStatus,                 // Current status
}
```

## Job Status

```rust
pub enum JobStatus {
    Pending,   // Queued, not yet started
    Running,   // Currently executing
    Completed, // Finished successfully
    Failed,    // Execution failed
    Timeout,   // Exceeded timeout
    Cancelled, // Manually cancelled
}
```

## Signature Format

```rust
pub struct Signature {
    pub public_key: String, // Signer's public key
    pub signature: String,  // Cryptographic signature
    pub algorithm: String,  // Signature algorithm (e.g., "ed25519")
}
```
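
For illustration, producing such a signature with the `ed25519-dalek` and `hex` crates (a sketch: the canonical byte representation that gets signed is defined by the supervisor, so `canonical_bytes` below is a placeholder):

```rust
use ed25519_dalek::{Signer, SigningKey};
use rand::rngs::OsRng;

// Generate a keypair and sign the job's canonical bytes.
let signing_key = SigningKey::generate(&mut OsRng);
let canonical_bytes = b"...canonical job representation...";
let signature = signing_key.sign(canonical_bytes);

// Hex-encode for embedding in the `Signature` struct above.
let signature_hex = hex::encode(signature.to_bytes());
let public_key_hex = hex::encode(signing_key.verifying_key().to_bytes());
```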

## Creating a Job

### Minimal Job

```rust
use hero_job::Job;

let job = Job::new(
    "my-runner",
    "print('Hello World')".to_string(),
);
```

### With Timeout

```rust
let job = Job::builder()
    .runner_id("my-runner")
    .payload("long_running_task()")
    .timeout(300) // 5 minutes
    .build();
```

### With Environment Variables

```rust
use std::collections::HashMap;

let mut env_vars = HashMap::new();
env_vars.insert("API_KEY".to_string(), "secret".to_string());
env_vars.insert("ENV".to_string(), "production".to_string());

let job = Job::builder()
    .runner_id("my-runner")
    .payload("deploy_app()")
    .env_vars(env_vars)
    .build();
```

### With Signature

```rust
use hero_job::{Job, Signature};

let job = Job::builder()
    .runner_id("my-runner")
    .payload("important_task()")
    .signature(Signature {
        public_key: "ed25519:abc123...".to_string(),
        signature: "sig:xyz789...".to_string(),
        algorithm: "ed25519".to_string(),
    })
    .build();
```

## Payload Format

The payload format depends on the target runner:

### Hero Runner
Heroscript content:
```heroscript
!!git.list
print("Repositories listed")
!!docker.ps
```

### SAL Runner
Rhai script with SAL modules:
```rhai
let files = os.list_dir("/tmp");
for file in files {
    print(file);
}
```

### Osiris Runner
Rhai script with Osiris database:
```rhai
let users = osiris.model("users");
let user = users.create(#{
    name: "Alice",
    email: "alice@example.com"
});
```

## Job Result

```rust
pub struct JobResult {
    pub job_id: String,
    pub status: JobStatus,
    pub output: String,          // Stdout
    pub error: Option<String>,   // Stderr or error message
    pub exit_code: Option<i32>,
    pub started_at: Option<i64>,
    pub completed_at: Option<i64>,
}
```

## Best Practices

### Timeouts
- Always set timeouts for jobs
- Default: 60 seconds
- Long-running jobs: Set an appropriate timeout
- Infinite jobs: Use separate monitoring

### Environment Variables
- Don't store secrets in env vars in production
- Use vault/secret management instead
- Keep env vars minimal
- Document required variables

### Signatures
- Always sign jobs in production
- Use strong algorithms (ed25519)
- Rotate keys regularly
- Store private keys securely

### Payloads
- Keep payloads concise
- Validate input data
- Handle errors gracefully
- Log important operations

## Validation

Jobs are validated before execution:

1. **Structure**: All required fields present
2. **Signature**: Valid cryptographic signature
3. **Runner**: Target runner exists and is available
4. **Payload**: Non-empty payload
5. **Timeout**: Reasonable timeout value

Invalid jobs are rejected before execution.
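
A minimal sketch of such a pre-execution check, written against the `Job` struct above (illustrative; the supervisor's actual validation code is not shown here):

```rust
use std::collections::HashSet;

fn validate(job: &Job, known_runners: &HashSet<String>) -> Result<(), String> {
    if job.payload.is_empty() {
        return Err("empty payload".into());
    }
    if job.signatures.is_empty() {
        return Err("job is unsigned".into()); // signature bytes are checked separately
    }
    if !known_runners.contains(&job.runner_id) {
        return Err(format!("unknown runner: {}", job.runner_id));
    }
    if let Some(t) = job.timeout {
        if t == 0 || t > 24 * 60 * 60 {
            return Err(format!("unreasonable timeout: {t}s"));
        }
    }
    Ok(())
}
```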

docs/runner/hero.md
Normal file
@@ -0,0 +1,71 @@

# Hero Runner

Executes heroscripts using the Hero CLI tool.

## Overview

The Hero runner pipes job payloads directly to `hero run -s` via stdin, making it ideal for executing Hero automation tasks and heroscripts.
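
A sketch of that mechanism using `std::process` (illustrative; the actual runner additionally enforces timeouts and injects environment variables):

```rust
use std::io::Write;
use std::process::{Command, Stdio};

fn run_heroscript(payload: &str) -> std::io::Result<std::process::Output> {
    // Spawn `hero run -s` and pipe the payload via stdin: no temp files.
    let mut child = Command::new("hero")
        .args(["run", "-s"])
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;
    child
        .stdin
        .as_mut()
        .expect("stdin was piped")
        .write_all(payload.as_bytes())?;
    child.wait_with_output()
}
```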

## Features

- **Heroscript Execution**: Direct stdin piping to `hero run -s`
- **No Temp Files**: Secure execution without filesystem artifacts
- **Environment Variables**: Full environment variable support
- **Timeout Support**: Respects job timeout settings
- **Signature Verification**: Cryptographic job verification

## Usage

```bash
# Start the runner
herorunner my-hero-runner

# With custom Redis
herorunner my-hero-runner --redis-url redis://custom:6379
```

## Job Payload

The payload should contain the heroscript content:

```heroscript
!!git.list
print("Repositories listed")
!!docker.ps
```

## Examples

### Simple Print
```heroscript
print("Hello from heroscript!")
```

### Hero Actions
```heroscript
!!git.list
!!docker.start name:"myapp"
```

### With Environment Variables
```json
{
  "payload": "print(env.MY_VAR)",
  "env_vars": {
    "MY_VAR": "Hello World"
  }
}
```

## Requirements

- `hero` CLI must be installed and in PATH
- Redis server accessible
- Valid job signatures

## Error Handling

- **Hero CLI Not Found**: Returns error if `hero` command unavailable
- **Timeout**: Kills process if timeout exceeded
- **Non-zero Exit**: Returns error with hero CLI output
- **Invalid Signature**: Rejects job before execution

docs/runner/osiris.md
Normal file
@@ -0,0 +1,142 @@

# Osiris Runner

Database-backed runner for structured data storage and retrieval.

## Overview

The Osiris runner executes Rhai scripts with access to a model-based database system, enabling structured data operations and persistence.

## Features

- **Rhai Scripting**: Execute Rhai scripts with Osiris database access
- **Model-Based Storage**: Define and use data models
- **CRUD Operations**: Create, read, update, delete records
- **Query Support**: Search and filter data
- **Schema Validation**: Type-safe data operations
- **Transaction Support**: Atomic database operations

## Usage

```bash
# Start the runner
runner_osiris my-osiris-runner

# With custom Redis
runner_osiris my-osiris-runner --redis-url redis://custom:6379
```

## Job Payload

The payload should contain a Rhai script using Osiris operations:

```rhai
// Example: Store data
let model = osiris.model("users");
let user = model.create(#{
    name: "Alice",
    email: "alice@example.com",
    age: 30
});
print(user.id);

// Example: Retrieve data
let found = model.get(user.id);
print(found.name);
```

## Examples

### Create Model and Store Data
```rhai
// Define model
let posts = osiris.model("posts");

// Create record
let post = posts.create(#{
    title: "Hello World",
    content: "First post",
    author: "Alice",
    published: true
});

print(`Created post with ID: ${post.id}`);
```

### Query Data
```rhai
let posts = osiris.model("posts");

// Find by field
let published = posts.find(#{
    published: true
});

for post in published {
    print(post.title);
}
```

### Update Records
```rhai
let posts = osiris.model("posts");

// Get record
let post = posts.get("post-123");

// Update fields
post.content = "Updated content";
posts.update(post);
```

### Delete Records
```rhai
let posts = osiris.model("posts");

// Delete by ID
posts.delete("post-123");
```

### Transactions
```rhai
osiris.transaction(|| {
    let users = osiris.model("users");
    let posts = osiris.model("posts");

    let user = users.create(#{ name: "Bob" });
    let post = posts.create(#{
        title: "Bob's Post",
        author_id: user.id
    });

    // Both operations commit together
});
```

## Data Models

Models are defined dynamically through Rhai scripts:

```rhai
let model = osiris.model("products");

// Model automatically handles:
// - ID generation
// - Timestamps (created_at, updated_at)
// - Schema validation
// - Indexing
```

## Requirements

- Redis server accessible
- Osiris database configured
- Valid job signatures
- Sufficient storage for data operations

## Use Cases

- **Configuration Storage**: Store application configs
- **User Data**: Manage user profiles and preferences
- **Workflow State**: Persist workflow execution state
- **Metrics & Logs**: Store structured logs and metrics
- **Cache Management**: Persistent caching layer

docs/runner/runners.md
Normal file
@@ -0,0 +1,96 @@

# Runners Overview

Runners are the execution layer in the Horus architecture. They receive jobs from the Supervisor via Redis queues and execute the actual workload.

## Architecture

```
Supervisor → Redis Queue → Runner → Execute Job → Return Result
```

## Available Runners

Horus provides three specialized runners:

### 1. **Hero Runner**
Executes heroscripts using the Hero CLI ecosystem.

**Use Cases:**
- Running Hero automation tasks
- Executing heroscripts from job payloads
- Integration with Hero CLI tools

**Binary:** `herorunner`

[→ Hero Runner Documentation](./hero.md)

### 2. **SAL Runner**
System Abstraction Layer runner for system-level operations.

**Use Cases:**
- OS operations (file, process, network)
- Infrastructure management (Kubernetes, VMs)
- Cloud provider operations (Hetzner)
- Database operations (Redis, Postgres)

**Binary:** `runner_sal`

[→ SAL Runner Documentation](./sal.md)

### 3. **Osiris Runner**
Database-backed runner for data storage and retrieval using Rhai scripts.

**Use Cases:**
- Structured data storage
- Model-based data operations
- Rhai script execution with database access

**Binary:** `runner_osiris`

[→ Osiris Runner Documentation](./osiris.md)

## Common Features

All runners implement the `Runner` trait and provide:

- **Job Execution**: Process jobs from Redis queues
- **Signature Verification**: Verify job signatures before execution
- **Timeout Support**: Respect job timeout settings
- **Environment Variables**: Pass environment variables to jobs
- **Error Handling**: Comprehensive error reporting
- **Logging**: Structured logging for debugging

## Runner Protocol

Runners communicate with the Supervisor using a Redis-based protocol (a minimal sketch follows the list):

1. **Job Queue**: Supervisor pushes jobs to `runner:{runner_id}:jobs`
2. **Job Processing**: Runner pops job, validates signature, executes
3. **Result Storage**: Runner stores result in `job:{job_id}:result`
4. **Status Updates**: Runner updates job status throughout execution
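
The sketch below shows that loop with the `redis` crate, using the queue and result keys named above (illustrative; a real runner also verifies signatures, enforces timeouts, and publishes status updates):

```rust
fn poll_loop(runner_id: &str) -> redis::RedisResult<()> {
    let client = redis::Client::open("redis://127.0.0.1:6379")?;
    let mut con = client.get_connection()?;
    loop {
        // Block until a job lands on runner:{runner_id}:jobs.
        let (_queue, job_json): (String, String) = redis::cmd("BRPOP")
            .arg(format!("runner:{runner_id}:jobs"))
            .arg(0) // 0 = block indefinitely
            .query(&mut con)?;

        // ... verify signature, execute payload, capture output ...
        let job_id = "parsed-from-job_json"; // placeholder
        let result_json = r#"{"status":"completed","output":""}"#;

        // Store the result where the supervisor expects it.
        redis::cmd("SET")
            .arg(format!("job:{job_id}:result"))
            .arg(result_json)
            .query::<()>(&mut con)?;
    }
}
```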

## Starting a Runner

```bash
# Hero Runner
herorunner <runner_id> [--redis-url <url>]

# SAL Runner
runner_sal <runner_id> [--redis-url <url>]

# Osiris Runner
runner_osiris <runner_id> [--redis-url <url>]
```

## Configuration

All runners accept:
- `runner_id`: Unique identifier for the runner (required)
- `--redis-url`: Redis connection URL (default: `redis://localhost:6379`)

## Security

- Jobs must be cryptographically signed
- Runners verify signatures before execution
- Untrusted jobs are rejected
- Environment variables should not contain sensitive data in production

docs/runner/sal.md
Normal file
@@ -0,0 +1,123 @@

# SAL Runner

System Abstraction Layer runner for system-level operations.

## Overview

The SAL runner executes Rhai scripts with access to system abstraction modules for OS operations, infrastructure management, and cloud provider interactions.

## Features

- **Rhai Scripting**: Execute Rhai scripts with SAL modules
- **System Operations**: File, process, and network management
- **Infrastructure**: Kubernetes, VM, and container operations
- **Cloud Providers**: Hetzner and other cloud integrations
- **Database Access**: Redis and Postgres client operations
- **Networking**: Mycelium and network configuration

## Available SAL Modules

### Core Modules
- **sal-os**: Operating system operations
- **sal-process**: Process management
- **sal-text**: Text processing utilities
- **sal-net**: Network operations

### Infrastructure
- **sal-virt**: Virtualization management
- **sal-kubernetes**: Kubernetes cluster operations
- **sal-zinit-client**: Zinit process manager

### Storage & Data
- **sal-redisclient**: Redis operations
- **sal-postgresclient**: PostgreSQL operations
- **sal-vault**: Secret management

### Networking
- **sal-mycelium**: Mycelium network integration

### Cloud Providers
- **sal-hetzner**: Hetzner cloud operations

### Version Control
- **sal-git**: Git repository operations

## Usage

```bash
# Start the runner
runner_sal my-sal-runner

# With custom Redis
runner_sal my-sal-runner --redis-url redis://custom:6379
```

## Job Payload

The payload should contain a Rhai script using SAL modules:

```rhai
// Example: List files
let files = os.list_dir("/tmp");
print(files);

// Example: Process management
let pid = process.spawn("ls", ["-la"]);
let output = process.wait(pid);
print(output);
```

## Examples

### File Operations
```rhai
// Read file
let content = os.read_file("/path/to/file");
print(content);

// Write file
os.write_file("/path/to/output", "Hello World");
```

### Kubernetes Operations
```rhai
// List pods
let pods = k8s.list_pods("default");
for pod in pods {
    print(pod.name);
}
```

### Redis Operations
```rhai
// Set value
redis.set("key", "value");

// Get value
let val = redis.get("key");
print(val);
```

### Git Operations
```rhai
// Clone repository
git.clone("https://github.com/user/repo", "/tmp/repo");

// Get status
let status = git.status("/tmp/repo");
print(status);
```

## Requirements

- Redis server accessible
- System permissions for requested operations
- Valid job signatures
- SAL modules available in runtime

## Security Considerations

- SAL operations have system-level access
- Jobs must be from trusted sources
- Signature verification is mandatory
- Limit runner permissions in production

docs/supervisor/auth.md
Normal file
@@ -0,0 +1,28 @@

# Supervisor Authentication

The supervisor has two authentication systems:

1. An authentication system based on scoped symmetric API keys.
2. Verification of the signatures over a job's canonical representation.

The first controls access to the supervisor API; the second authenticates the signatories of a job, so that runners can implement access control based on those signatories.

## API Key Management

API keys are used to authenticate requests to the supervisor. They are created using the `auth.key.create` method and can be listed using the `key.list` method.

## API Key Scopes

API keys have a scope that determines what actions they can perform. The following scopes are available:

- `admin`: Full access to all supervisor methods.
- `registrar`: Access to methods related to job registration and management.
- `user`: Access to methods related to job execution and management.

## API Key Usage

API keys are passed in the `Authorization` header of the request, in the format `Bearer <key>`.
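
For example, attaching the key with an HTTP client (a sketch; the `jobs.list` method comes from the OpenRPC spec in these docs, while the supervisor address and the use of `reqwest` are assumptions):

```rust
use serde_json::json;

async fn list_jobs(api_key: &str) -> Result<(), reqwest::Error> {
    // The API key travels in the Authorization header as `Bearer <key>`.
    let body = json!({ "jsonrpc": "2.0", "id": 1, "method": "jobs.list", "params": [] });
    let resp = reqwest::Client::new()
        .post("http://localhost:8080") // supervisor address: an assumption
        .bearer_auth(api_key)
        .json(&body)
        .send()
        .await?;
    println!("status: {}", resp.status());
    Ok(())
}
```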

## API Key Rotation

API keys can be rotated using the `key.remove` method. This will invalidate the old key and create a new one.

docs/supervisor/openrpc.json
Normal file
@@ -0,0 +1,391 @@

{
  "openrpc": "1.3.2",
  "info": {
    "title": "Hero Supervisor OpenRPC API",
    "version": "1.0.0",
    "description": "OpenRPC API for managing Hero Supervisor runners and jobs. Job operations follow the convention: 'jobs.' for general operations and 'job.' for specific job operations."
  },
  "components": {
    "schemas": {
      "Job": {
        "type": "object",
        "properties": {
          "id": { "type": "string" },
          "caller_id": { "type": "string" },
          "context_id": { "type": "string" },
          "payload": { "type": "string" },
          "runner": { "type": "string" },
          "executor": { "type": "string" },
          "timeout": { "type": "number" },
          "env_vars": { "type": "object" },
          "created_at": { "type": "string" },
          "updated_at": { "type": "string" }
        },
        "required": ["id", "caller_id", "context_id", "payload", "runner", "executor", "timeout", "env_vars", "created_at", "updated_at"]
      }
    }
  },
  "methods": [
    {
      "name": "list_runners",
      "description": "List all registered runners",
      "params": [],
      "result": {
        "name": "runners",
        "schema": {
          "type": "array",
          "items": { "type": "string" }
        }
      }
    },
    {
      "name": "register_runner",
      "description": "Register a new runner to the supervisor with secret authentication",
      "params": [
        {
          "name": "params",
          "schema": {
            "type": "object",
            "properties": {
              "secret": { "type": "string" },
              "name": { "type": "string" },
              "queue": { "type": "string" }
            },
            "required": ["secret", "name", "queue"]
          }
        }
      ],
      "result": {
        "name": "result",
        "schema": { "type": "null" }
      }
    },
    {
      "name": "jobs.create",
      "description": "Create a new job without queuing it to a runner",
      "params": [
        {
          "name": "params",
          "schema": {
            "type": "object",
            "properties": {
              "secret": { "type": "string" },
              "job": { "$ref": "#/components/schemas/Job" }
            },
            "required": ["secret", "job"]
          }
        }
      ],
      "result": {
        "name": "job_id",
        "schema": { "type": "string" }
      }
    },
    {
      "name": "jobs.list",
      "description": "List all jobs",
      "params": [],
      "result": {
        "name": "jobs",
        "schema": {
          "type": "array",
          "items": { "$ref": "#/components/schemas/Job" }
        }
      }
    },
    {
      "name": "job.run",
      "description": "Run a job on the appropriate runner and return the result",
      "params": [
        {
          "name": "params",
          "schema": {
            "type": "object",
            "properties": {
              "secret": { "type": "string" },
              "job": { "$ref": "#/components/schemas/Job" }
            },
            "required": ["secret", "job"]
          }
        }
      ],
      "result": {
        "name": "result",
        "schema": {
          "oneOf": [
            {
              "type": "object",
              "properties": {
                "success": { "type": "string" }
              },
              "required": ["success"]
            },
            {
              "type": "object",
              "properties": {
                "error": { "type": "string" }
              },
              "required": ["error"]
            }
          ]
        }
      }
    },
    {
      "name": "job.start",
      "description": "Start a previously created job by queuing it to its assigned runner",
      "params": [
        {
          "name": "params",
          "schema": {
            "type": "object",
            "properties": {
              "secret": { "type": "string" },
              "job_id": { "type": "string" }
            },
            "required": ["secret", "job_id"]
          }
        }
      ],
      "result": {
        "name": "result",
        "schema": { "type": "null" }
      }
    },
    {
      "name": "job.status",
      "description": "Get the current status of a job",
      "params": [
        {
          "name": "job_id",
          "schema": { "type": "string" }
        }
      ],
      "result": {
        "name": "status",
        "schema": {
          "type": "object",
          "properties": {
            "job_id": { "type": "string" },
            "status": {
              "type": "string",
              "enum": ["created", "queued", "running", "completed", "failed", "timeout"]
            },
            "created_at": { "type": "string" },
            "started_at": { "type": ["string", "null"] },
            "completed_at": { "type": ["string", "null"] }
          },
          "required": ["job_id", "status", "created_at"]
        }
      }
    },
    {
      "name": "job.result",
      "description": "Get the result of a completed job (blocks until result is available)",
      "params": [
        {
          "name": "job_id",
          "schema": { "type": "string" }
        }
      ],
      "result": {
        "name": "result",
        "schema": {
          "oneOf": [
            {
              "type": "object",
              "properties": {
                "success": { "type": "string" }
              },
              "required": ["success"]
            },
            {
              "type": "object",
              "properties": {
                "error": { "type": "string" }
              },
              "required": ["error"]
            }
          ]
        }
      }
    },
    {
      "name": "remove_runner",
      "description": "Remove a runner from the supervisor",
      "params": [
        {
          "name": "actor_id",
          "schema": { "type": "string" }
        }
      ],
      "result": {
        "name": "result",
        "schema": { "type": "null" }
      }
    },
    {
      "name": "start_runner",
      "description": "Start a specific runner",
      "params": [
        {
          "name": "actor_id",
          "schema": { "type": "string" }
        }
      ],
      "result": {
        "name": "result",
        "schema": { "type": "null" }
      }
    },
    {
      "name": "stop_runner",
      "description": "Stop a specific runner",
      "params": [
        {
          "name": "actor_id",
          "schema": { "type": "string" }
        },
        {
          "name": "force",
          "schema": { "type": "boolean" }
        }
      ],
      "result": {
        "name": "result",
        "schema": { "type": "null" }
      }
    },
    {
      "name": "get_runner_status",
      "description": "Get the status of a specific runner",
      "params": [
        {
          "name": "actor_id",
          "schema": { "type": "string" }
        }
      ],
      "result": {
        "name": "status",
        "schema": { "type": "object" }
      }
    },
    {
      "name": "get_all_runner_status",
      "description": "Get status of all runners",
      "params": [],
      "result": {
        "name": "statuses",
        "schema": {
          "type": "array",
          "items": { "type": "object" }
        }
      }
    },
    {
      "name": "start_all",
      "description": "Start all runners",
      "params": [],
      "result": {
        "name": "results",
        "schema": {
          "type": "array",
          "items": {
            "type": "array",
            "items": { "type": "string" }
          }
        }
      }
    },
    {
      "name": "stop_all",
      "description": "Stop all runners",
      "params": [
        {
          "name": "force",
          "schema": { "type": "boolean" }
        }
      ],
      "result": {
        "name": "results",
        "schema": {
          "type": "array",
          "items": {
            "type": "array",
            "items": { "type": "string" }
          }
        }
      }
    },
    {
      "name": "get_all_status",
      "description": "Get status of all runners (alternative format)",
      "params": [],
      "result": {
        "name": "statuses",
        "schema": {
          "type": "array",
          "items": {
            "type": "array",
            "items": { "type": "string" }
          }
        }
      }
    },
    {
      "name": "job.stop",
      "description": "Stop a running job",
      "params": [
        {
          "name": "params",
          "schema": {
            "type": "object",
            "properties": {
              "secret": { "type": "string" },
              "job_id": { "type": "string" }
            },
            "required": ["secret", "job_id"]
          }
        }
      ],
      "result": {
        "name": "result",
        "schema": { "type": "null" }
      }
    },
    {
      "name": "job.delete",
      "description": "Delete a job from the system",
      "params": [
        {
          "name": "params",
          "schema": {
            "type": "object",
            "properties": {
              "secret": { "type": "string" },
              "job_id": { "type": "string" }
            },
            "required": ["secret", "job_id"]
          }
        }
      ],
      "result": {
        "name": "result",
        "schema": { "type": "null" }
      }
    },
    {
      "name": "rpc.discover",
      "description": "OpenRPC discovery method - returns the OpenRPC document describing this API",
      "params": [],
      "result": {
        "name": "openrpc_document",
        "schema": { "type": "object" }
      }
    }
  ]
}

docs/supervisor/supervisor.md
Normal file
@@ -0,0 +1,88 @@

# Supervisor Overview

The Supervisor is the job dispatcher layer in Horus. It receives jobs, verifies signatures, and routes them to appropriate runners.

## Architecture

```
Client → Supervisor → Redis Queue → Runner
```

## Responsibilities

### 1. **Job Admission**
- Receive jobs via OpenRPC interface
- Validate job structure and required fields
- Verify cryptographic signatures

### 2. **Authentication & Authorization**
- Verify job signatures using public keys
- Ensure jobs are from authorized sources
- Reject unsigned or invalid jobs

### 3. **Job Routing**
- Route jobs to appropriate runner queues
- Maintain runner registry
- Load balance across available runners

### 4. **Job Management**
- Track job status and lifecycle
- Provide job query and listing APIs
- Store job results and logs

### 5. **Runner Management**
- Register and track available runners
- Monitor runner health and availability
- Handle runner disconnections

## OpenRPC Interface

The Supervisor exposes an OpenRPC API for job management:

### Job Operations
- `create_job`: Submit a new job
- `get_job`: Retrieve job details
- `list_jobs`: List all jobs
- `delete_job`: Remove a job
- `get_job_logs`: Retrieve job execution logs

### Runner Operations
- `register_runner`: Register a new runner
- `list_runners`: List available runners
- `get_runner_status`: Check runner health
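
For instance, a `register_runner` request body following the parameter schema in [openrpc.json](./openrpc.json) (a sketch of the wire format; the queue name shown follows the `runner:{runner_id}:jobs` convention from the runner docs):

```rust
use serde_json::json;

// Parameters follow the `register_runner` schema: secret, name, queue.
let request = json!({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "register_runner",
    "params": [{
        "secret": "<api key>",
        "name": "my-sal-runner",
        "queue": "runner:my-sal-runner:jobs"
    }]
});
```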

## Job Lifecycle

1. **Submission**: Client submits job via OpenRPC
2. **Validation**: Supervisor validates structure and signature
3. **Queueing**: Job pushed to runner's Redis queue
4. **Execution**: Runner processes job
5. **Completion**: Result stored in Redis
6. **Retrieval**: Client retrieves result via OpenRPC

## Transport Options

The Supervisor supports multiple transport layers:

- **HTTP**: Standard HTTP/HTTPS transport
- **Mycelium**: Peer-to-peer encrypted transport

## Configuration

```bash
# Start supervisor
supervisor --port 8080 --redis-url redis://localhost:6379

# With Mycelium
supervisor --port 8080 --mycelium --redis-url redis://localhost:6379
```

## Security

- All jobs must be cryptographically signed
- Signatures verified before job admission
- Public key infrastructure for identity
- Optional TLS for HTTP transport
- End-to-end encryption via Mycelium

[→ Authentication Documentation](./auth.md)