# Coordinator Overview

The Coordinator is the workflow orchestration layer in Horus. It executes DAG-based flows by managing job dependencies and dispatching ready steps to supervisors.

## Architecture

```
Client → Coordinator → Supervisor(s) → Runner(s)
```

## Responsibilities

### 1. Workflow Management

- Parse and validate DAG workflow definitions
- Track workflow execution state
- Manage step dependencies

### 2. Job Orchestration

- Determine which steps are ready to execute
- Dispatch jobs to appropriate supervisors
- Handle step failures and retries

### 3. Dependency Resolution

- Track step completion
- Resolve data dependencies between steps
- Pass outputs from completed steps to dependent steps

### 4. Multi-Supervisor Coordination

- Route jobs to specific supervisors
- Handle supervisor failures
- Load balance across supervisors
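The dependency-resolution rule can be sketched in a few lines: a step is ready once every step listed in its `depends_on` has completed. The following is an illustrative Python sketch, not the Horus implementation:

```python
def ready_steps(steps, completed):
    """Return ids of steps whose dependencies are all completed."""
    return [
        step["id"]
        for step in steps
        if step["id"] not in completed
        and all(dep in completed for dep in step.get("depends_on", []))
    ]

# The three-step pipeline from the example below.
steps = [
    {"id": "fetch"},
    {"id": "process", "depends_on": ["fetch"]},
    {"id": "store", "depends_on": ["process"]},
]

print(ready_steps(steps, set()))       # ['fetch']
print(ready_steps(steps, {"fetch"}))   # ['process']
```

Each time a step completes, the Coordinator re-evaluates this predicate to unblock the next wave of steps.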

## Workflow Definition

Workflows are defined as Directed Acyclic Graphs (DAGs):

```yaml
workflow:
  name: "data-pipeline"
  steps:
    - id: "fetch"
      runner: "hero"
      payload: "!!http.get url:'https://api.example.com/data'"

    - id: "process"
      runner: "sal"
      depends_on: ["fetch"]
      payload: |
        let data = input.fetch;
        let processed = process_data(data);
        processed

    - id: "store"
      runner: "osiris"
      depends_on: ["process"]
      payload: |
        let model = osiris.model("results");
        model.create(input.process);
```
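Validation means confirming the step graph really is acyclic (and that every dependency refers to a defined step). One common way to check this is Kahn's topological sort: a workflow is a valid DAG iff the sort visits every step. This is an illustrative approach, not necessarily how Horus implements it:

```python
from collections import defaultdict, deque

def validate_dag(steps):
    """Raise ValueError if the step graph has unknown deps or a cycle."""
    ids = {s["id"] for s in steps}
    indegree = {sid: 0 for sid in ids}
    dependents = defaultdict(list)
    for s in steps:
        for dep in s.get("depends_on", []):
            if dep not in ids:
                raise ValueError(f"unknown dependency: {dep}")
            indegree[s["id"]] += 1
            dependents[dep].append(s["id"])
    # Repeatedly remove steps with no remaining dependencies.
    queue = deque(sid for sid, d in indegree.items() if d == 0)
    visited = 0
    while queue:
        sid = queue.popleft()
        visited += 1
        for nxt in dependents[sid]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if visited != len(ids):
        raise ValueError("workflow contains a cycle")

validate_dag([
    {"id": "fetch"},
    {"id": "process", "depends_on": ["fetch"]},
    {"id": "store", "depends_on": ["process"]},
])  # no error: the example pipeline is a valid DAG
```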

## Features

### DAG Execution

- Parallel execution of independent steps
- Sequential execution of dependent steps
- Automatic dependency resolution

### Error Handling

- Step-level retry policies
- Workflow-level error handlers
- Partial workflow recovery

### Data Flow

- Pass outputs between steps
- Transform data between steps
- Aggregate results from parallel steps
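As a rough sketch of the data-flow model, a step's input can be assembled by collecting each upstream output under its producer's id, matching the `input.fetch` / `input.process` references in the payloads above (illustrative only, not the Horus wire format):

```python
def build_input(step, outputs):
    """Collect upstream outputs for one step, keyed by producer step id."""
    return {dep: outputs[dep] for dep in step.get("depends_on", [])}

# Hypothetical output of a completed "fetch" step.
outputs = {"fetch": {"status": 200, "body": "..."}}
step = {"id": "process", "depends_on": ["fetch"]}

print(build_input(step, outputs))
# {'fetch': {'status': 200, 'body': '...'}}
```

A step with several dependencies (such as the Aggregate node in the diagram below) would receive one entry per parallel producer in the same way.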

### Monitoring

- Real-time workflow status
- Step-level progress tracking
- Execution metrics and logs

## Workflow Lifecycle

1. **Submission**: Client submits the workflow definition
2. **Validation**: Coordinator validates the DAG structure
3. **Scheduling**: Determine ready steps (no pending dependencies)
4. **Dispatch**: Send jobs to supervisors
5. **Tracking**: Monitor step completion
6. **Progression**: Execute the next ready steps
7. **Completion**: Workflow finishes when all steps complete
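Steps 3–7 amount to a scheduling loop. A minimal single-threaded sketch follows; the `dispatch` and `wait_done` callbacks stand in for real supervisor communication and are assumptions, not Horus APIs:

```python
def run_workflow(steps, dispatch, wait_done):
    """Drive a workflow to completion: schedule, dispatch, track, repeat."""
    pending = {s["id"]: s for s in steps}
    completed = set()
    while pending:
        # Scheduling: steps whose dependencies are all completed.
        ready = [sid for sid, s in pending.items()
                 if all(d in completed for d in s.get("depends_on", []))]
        if not ready:
            raise RuntimeError("no ready steps: cycle or lost dependency")
        for sid in ready:                  # Dispatch the whole ready wave.
            dispatch(pending[sid])
        for sid in ready:                  # Tracking: wait, then progress.
            wait_done(sid)
            completed.add(sid)
            del pending[sid]
    return completed

order = []
done = run_workflow(
    [{"id": "fetch"},
     {"id": "process", "depends_on": ["fetch"]},
     {"id": "store", "depends_on": ["process"]}],
    dispatch=lambda step: order.append(step["id"]),
    wait_done=lambda sid: None,   # pretend the supervisor reported success
)
print(order)   # ['fetch', 'process', 'store']
```

A real coordinator would await completions asynchronously rather than blocking on each wave, but the scheduling logic is the same.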

## Use Cases

### Data Pipelines

Extract → Transform → Load

### CI/CD Workflows

Build → Test → Deploy

### Multi-Stage Processing

Fetch Data → Process → Validate → Store → Notify

### Parallel Execution

```
        ┌─ Task A ─┐
Start ──┼─ Task B ─┼── Aggregate → Finish
        └─ Task C ─┘
```

## Configuration

```bash
# Start coordinator
coordinator --port 9090 --redis-url redis://localhost:6379

# With multiple supervisors
coordinator --port 9090 \
  --supervisor http://supervisor1:8080 \
  --supervisor http://supervisor2:8080
```

## API

The Coordinator exposes an OpenRPC API:

- `submit_workflow`: Submit a new workflow
- `get_workflow_status`: Check workflow progress
- `list_workflows`: List all workflows
- `cancel_workflow`: Stop a running workflow
- `get_workflow_logs`: Retrieve execution logs
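Since OpenRPC describes JSON-RPC 2.0 methods, a `submit_workflow` call is an ordinary JSON-RPC request. The method name comes from the list above; the exact parameter shape is an assumption based on the workflow YAML earlier in this page:

```python
import json

# Hypothetical submit_workflow request body (parameter shape assumed).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "submit_workflow",
    "params": {
        "workflow": {
            "name": "data-pipeline",
            "steps": [
                {"id": "fetch", "runner": "hero",
                 "payload": "!!http.get url:'https://api.example.com/data'"},
            ],
        }
    },
}
print(json.dumps(request, indent=2))
```

The client would POST this body to the Coordinator and poll `get_workflow_status` with the returned workflow id.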

## Advantages

- **Declarative**: Define what to do, not how
- **Scalable**: Parallel execution across multiple supervisors
- **Resilient**: Automatic retry and error handling
- **Observable**: Real-time status and logging
- **Composable**: Reuse workflows as steps in larger workflows