# Architecture

Horus is a hierarchical orchestration runtime with three layers: Coordinator, Supervisor, and Runner.

## Overview

```
┌─────────────────────────────────────────────────────────┐
│                       Coordinator                       │
│            (Workflow Engine - DAG Execution)            │
│                                                         │
│  • Parses workflow definitions                          │
│  • Resolves dependencies                                │
│  • Dispatches ready steps                               │
│  • Tracks workflow state                                │
└────────────────────┬────────────────────────────────────┘
                     │ OpenRPC (HTTP/Mycelium)
                     │
┌────────────────────▼────────────────────────────────────┐
│                       Supervisor                        │
│            (Job Dispatcher & Authenticator)             │
│                                                         │
│  • Verifies job signatures                              │
│  • Routes jobs to runners                               │
│  • Manages runner registry                              │
│  • Tracks job lifecycle                                 │
└────────────────────┬────────────────────────────────────┘
                     │ Redis Queue Protocol
                     │
┌────────────────────▼────────────────────────────────────┐
│                         Runners                         │
│                     (Job Executors)                     │
│                                                         │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐               │
│  │   Hero   │  │   SAL    │  │  Osiris  │               │
│  │  Runner  │  │  Runner  │  │  Runner  │               │
│  └──────────┘  └──────────┘  └──────────┘               │
└─────────────────────────────────────────────────────────┘
```

## Layers

### 1. Coordinator (Optional)
**Purpose:** Workflow orchestration and DAG execution

**Responsibilities:**
- Parse and validate workflow definitions
- Execute DAG-based flows
- Manage step dependencies
- Route jobs to appropriate supervisors
- Handle multi-step workflows

**Use When:**
- You need multi-step workflows
- Jobs depend on the results of other jobs
- Independent steps should run in parallel
- You are building complex data pipelines

[→ Coordinator Documentation](./coordinator/overview.md)

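To make "parse and validate workflow definitions" concrete, here is a minimal sketch of dependency validation. It assumes a workflow can be reduced to a mapping from step name to the steps it depends on; that shape, the step names, and the `validate_workflow` helper are illustrative assumptions, not Horus's actual workflow format or API. It uses Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def validate_workflow(steps: dict[str, list[str]]) -> list[str]:
    """Return one valid execution order, or raise ValueError.

    `steps` maps a step name to the names of the steps it depends on.
    This mapping is a hypothetical stand-in for a real workflow definition.
    """
    # Every dependency must refer to a step that is actually defined.
    for step, deps in steps.items():
        for dep in deps:
            if dep not in steps:
                raise ValueError(f"step {step!r} depends on unknown step {dep!r}")
    # A workflow is only a DAG if a topological order exists.
    try:
        return list(TopologicalSorter(steps).static_order())
    except CycleError as exc:
        raise ValueError(f"workflow contains a cycle: {exc.args[1]}") from exc


workflow = {"fetch": [], "transform": ["fetch"], "load": ["transform"]}
order = validate_workflow(workflow)
```

Rejecting unknown dependencies and cycles up front means a malformed workflow fails at submission time rather than stalling halfway through execution.
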
### 2. Supervisor (Required)
**Purpose:** Job admission, authentication, and routing

**Responsibilities:**
- Receive jobs via OpenRPC interface
- Verify cryptographic signatures
- Route jobs to appropriate runners
- Manage runner registry
- Track job status and results

**Features:**
- OpenRPC API for job management
- HTTP and Mycelium transport
- Signature-based authentication
- Runner health monitoring

[→ Supervisor Documentation](./supervisor/overview.md)

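The runner registry and health monitoring responsibilities can be pictured with a small sketch. Everything here is an assumption made for illustration: the `RunnerRegistry` class, the heartbeat-based health check, the `max_silence` threshold, and the `runner:<name>` queue naming are hypothetical and do not describe the Supervisor's real implementation.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunnerRegistry:
    """Hypothetical registry: tracks when each runner was last heard from."""

    max_silence: float = 30.0  # seconds without a heartbeat before a runner is unhealthy
    last_seen: dict[str, float] = field(default_factory=dict)

    def heartbeat(self, runner: str, now: Optional[float] = None) -> None:
        # Record that the runner is alive; `now` is injectable for testing.
        self.last_seen[runner] = time.monotonic() if now is None else now

    def is_healthy(self, runner: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        seen = self.last_seen.get(runner)
        return seen is not None and now - seen <= self.max_silence

    def route(self, job: dict, now: Optional[float] = None) -> str:
        """Return the (assumed) queue name for the runner the job asks for."""
        runner = job["runner"]
        if not self.is_healthy(runner, now):
            raise LookupError(f"no healthy runner named {runner!r}")
        return f"runner:{runner}"


registry = RunnerRegistry()
registry.heartbeat("sal", now=100.0)
queue_name = registry.route({"runner": "sal"}, now=110.0)
```

Refusing to route to a runner that has gone silent keeps jobs from piling up in a queue that nothing is polling.
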
### 3. Runners (Required)
**Purpose:** Execute actual job workloads

**Available Runners:**
- **Hero Runner**: Executes heroscripts via Hero CLI
- **SAL Runner**: System operations (OS, K8s, cloud, etc.)
- **Osiris Runner**: Database operations with Rhai scripts

**Common Features:**
- Redis queue-based job polling
- Signature verification
- Timeout support
- Environment variable handling

[→ Runner Documentation](./runner/overview.md)

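Timeout support and environment variable handling can be sketched as follows. This is an illustration only: the job fields (`script`, `env`, `timeout`) and the `execute` helper are hypothetical, and a Python child process stands in for the heroscript or Rhai workloads that real runners execute.

```python
import os
import subprocess
import sys


def execute(job: dict) -> dict:
    """Run a job's script in a child process with its own env and a time limit.

    The job layout is an assumption for this sketch, not the real job schema.
    """
    # Job-specific variables are layered on top of the runner's environment.
    env = {**os.environ, **job.get("env", {})}
    try:
        proc = subprocess.run(
            [sys.executable, "-c", job["script"]],
            env=env,
            capture_output=True,
            text=True,
            timeout=job.get("timeout", 30),  # seconds; child is killed on expiry
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "output": ""}
    status = "ok" if proc.returncode == 0 else "error"
    return {"status": status, "output": proc.stdout.strip()}


result = execute({
    "script": "import os; print(os.environ['GREETING'])",
    "env": {"GREETING": "hello"},
    "timeout": 10,
})
```

Running each job in a separate process lets the runner enforce the timeout by killing the child, without affecting its own polling loop.
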
## Communication Protocols

### Client ↔ Coordinator
- **Protocol:** OpenRPC
- **Transport:** HTTP or Mycelium
- **Operations:** Submit workflow, check status, retrieve results

### Coordinator ↔ Supervisor
- **Protocol:** OpenRPC
- **Transport:** HTTP or Mycelium
- **Operations:** Create job, get status, retrieve logs

### Supervisor ↔ Runner
- **Protocol:** Redis Queue
- **Transport:** Redis pub/sub and lists
- **Operations:** Push job, poll queue, store result

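OpenRPC describes JSON-RPC 2.0 APIs, so the two upper links exchange JSON-RPC messages regardless of transport. The sketch below shows what such an envelope could look like. The `create_job` method name comes from the job flow in this document; the parameter fields and the `rpc_request` / `rpc_result` helpers are assumptions for illustration, and the real schema is defined in the Supervisor documentation.

```python
import json
from itertools import count

_ids = count(1)  # request ids let a client match responses to requests


def rpc_request(method: str, params: dict) -> str:
    """Encode a JSON-RPC 2.0 request."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}
    )


def rpc_result(raw: str):
    """Decode a JSON-RPC 2.0 response, raising if it carries an error object."""
    message = json.loads(raw)
    if "error" in message:
        err = message["error"]
        raise RuntimeError(f"RPC error {err['code']}: {err['message']}")
    return message["result"]


# Hypothetical parameters; only the method name appears in this document.
request = rpc_request("create_job", {"runner": "hero", "payload": "example"})
```

The same encoded request can be sent over HTTP or Mycelium, since the envelope is independent of the transport that carries it.
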
## Job Flow

### Simple Job (No Coordinator)
```
1. Client → Supervisor: create_job()
2. Supervisor: Verify signature
3. Supervisor → Redis: Push to runner queue
4. Runner ← Redis: Pop job
5. Runner: Execute job
6. Runner → Redis: Store result
7. Client ← Supervisor: get_job_result()
```

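The seven steps above can be traced end to end with an in-memory simulation. This is a sketch under loud assumptions: a `deque` and a `dict` stand in for Redis lists and stored results, the signature check is reduced to a presence test, and "execution" is a placeholder string operation. Only the `create_job` and `get_job_result` names come from this document; the job fields and `runner_poll` are hypothetical.

```python
from collections import deque

queues: dict[str, deque] = {"hero": deque()}  # stand-in for Redis lists
results: dict[str, str] = {}                  # stand-in for results stored in Redis


def create_job(job: dict) -> str:
    # Steps 1-3: receive the job, check it is signed, push it to the runner queue.
    if not job.get("signature"):  # placeholder for real signature verification
        raise PermissionError("unsigned job rejected")
    queues[job["runner"]].append(job)
    return job["id"]


def runner_poll(runner: str) -> None:
    # Steps 4-6: pop one job, execute it, store the result.
    if queues[runner]:
        job = queues[runner].popleft()
        results[job["id"]] = job["payload"].upper()  # placeholder "execution"


def get_job_result(job_id: str):
    # Step 7: returns None while the job is still queued.
    return results.get(job_id)


job_id = create_job(
    {"id": "job-1", "runner": "hero", "payload": "hello", "signature": "sig"}
)
pending = get_job_result(job_id)  # nothing stored yet
runner_poll("hero")
done = get_job_result(job_id)
```

Because the client and the runner only meet through the queue and the result store, the client has to ask for the result separately instead of receiving it from `create_job`.
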
### Workflow (With Coordinator)
```
1. Client → Coordinator: submit_workflow()
2. Coordinator: Parse DAG
3. Coordinator: Identify ready steps
4. Coordinator → Supervisor: create_job() for each ready step
5. Supervisor → Runner: Route via Redis
6. Runner: Execute and return result
7. Coordinator: Update workflow state
8. Coordinator: Dispatch next ready steps
9. Repeat until workflow complete
```

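The "identify ready steps, dispatch, update state, repeat" loop in steps 3 to 9 can be sketched with the standard `graphlib` module (Python 3.9+). This is an illustration, not the Coordinator's scheduler: the `run_workflow` helper and the step names are hypothetical, and the `execute` callback runs synchronously where a real Coordinator would call `create_job()` on a Supervisor and wait for the result.

```python
from graphlib import TopologicalSorter
from typing import Callable


def run_workflow(
    steps: dict[str, list[str]], execute: Callable[[str], None]
) -> list[list[str]]:
    """Run steps in dependency order and return the waves that were dispatched.

    `steps` maps each step to the steps it depends on (assumed format).
    """
    sorter = TopologicalSorter(steps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    waves = []
    while sorter.is_active():
        # Every step whose dependencies are all done is ready right now;
        # steps in the same wave are independent and could run in parallel.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        for step in ready:
            execute(step)      # stand-in for dispatching a job
            sorter.done(step)  # update workflow state, unblocking dependents
    return waves


executed: list[str] = []
waves = run_workflow(
    {"fetch": [], "clean": ["fetch"], "stats": ["fetch"], "report": ["clean", "stats"]},
    executed.append,
)
```

In this example `clean` and `stats` land in the same wave because both depend only on `fetch`, which is where a DAG-based flow gains parallelism over a plain sequence.
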
## Security Model

### Authentication
- Jobs must be cryptographically signed
- Signatures are verified at the Supervisor layer
- Identity is established through public key infrastructure

### Authorization
- Runners only execute signed jobs
- Runners verify the signature before execution
- Untrusted jobs are rejected

### Transport Security
- Optional TLS for HTTP transport
- End-to-end encryption via Mycelium
- No plaintext credentials

[→ Authentication Details](./supervisor/auth.md)

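The sign-then-verify pattern behind this model can be sketched as below. Note the substitution: the document describes public key signatures, but this sketch uses HMAC-SHA256 with a shared secret purely so that it runs on the Python standard library alone. It shows the shape of the check (canonicalize the job, sign it, reject on mismatch), not the actual scheme or key format, and all names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical(job: dict) -> bytes:
    # Serialize deterministically so signer and verifier hash identical bytes.
    return json.dumps(job, sort_keys=True, separators=(",", ":")).encode()


def sign(job: dict, key: bytes) -> str:
    # HMAC is a symmetric stand-in for the public key signature described above.
    return hmac.new(key, canonical(job), hashlib.sha256).hexdigest()


def verify(job: dict, signature: str, key: bytes) -> bool:
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(sign(job, key), signature)


key = b"demo-secret"  # illustrative only; never hard-code real keys
job = {"runner": "hero", "payload": "example"}
signature = sign(job, key)

accepted = verify(job, signature, key)
tampered = verify({**job, "payload": "changed"}, signature, key)
```

Any change to the job after signing changes the canonical bytes, so the tampered job fails verification and would be rejected before it reaches execution.
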
## Deployment Patterns

### Minimal Setup
```
Redis + Supervisor + Runner(s)
```
Single machine, simple job execution.

### Distributed Setup
```
Redis Cluster + Multiple Supervisors + Runner Pool
```
High availability, load balancing.

### Full Orchestration
```
Coordinator + Multiple Supervisors + Runner Pool
```
Complex workflows, multi-step pipelines.

## Design Principles

1. **Hierarchical**: Clear separation of concerns across layers
2. **Secure**: Signature-based authentication throughout
3. **Scalable**: Horizontal scaling at each layer
4. **Observable**: Comprehensive logging and status tracking
5. **Flexible**: Multiple runners for different workload types