Timur Gordon 32af77cd17
refactor(admin): migrate API tab to <hero-api-docs> web component
Drops the hand-rolled OpenRPC explorer (~800 lines of HTML/JS/CSS) in
favor of <hero-api-docs> from hero_admin_lib (commit 630f702 in
hero_website_framework). The component is wired via the same
/static/shared/* mount already used for <hero-connection-status>.

Removed:
- templates/index.html  API TAB markup (api-accordion, api-detail-panel)
- static/js/dashboard.js  API_NS_ICONS + 14 api*/initApiDocs helpers
                          + hashchange branch for #api/... (the component
                          owns hash routing via its default hash-prefix)
- static/css/dashboard.css  all .api-* selectors (component scopes
                            styles to its own Shadow DOM)

Added:
- <hero-api-docs spec-url=".../openrpc.json" rpc-url=".../rpc">
- <script src=".../static/shared/js/api-docs.js"> in base.html

Net: -800 lines, build clean, no orphaned static-asset paths.

Refs hero_proc#108 / hero_skills#262.
2026-05-19 03:16:22 +02:00

hero_proc

A lightweight process supervisor with dependency management, similar to systemd but simpler.

Quick Start

Install and Run

service proc start --update --reset

Use the CLI

hero_proc service list
hero_proc service status my-service
hero_proc service start my-service
hero_proc service stop my-service

The web admin dashboard is available via hero_proc_admin on the admin socket.


Documentation


Features

  • Dependency Graph: Services declare dependencies (requires, after, wants, conflicts)
  • State Machine: Explicit states (Inactive, Blocked, Starting, Running, Stopping, Success, Exited, Failed)
  • Process Groups: Signals sent to process groups, handling sh -c child processes correctly
  • Health Checks: TCP, HTTP, and exec-based health checks with retries
  • Ordered Shutdown: Dependents stop before their dependencies
  • Hot Reload: Reload configuration without full restart
  • Secrets Management: Encrypted secret storage with Forgejo sync (init, pull, push)
  • Scheduled Actions: Cron-based scheduling for recurring tasks
  • PTY Attach: Live terminal attach to running processes via WebSocket
  • Web Admin Dashboard: Real-time service management UI with charts, logs, events, and bulk operations
  • TUI Dashboard: Interactive terminal UI for service management (ratatui-based)
  • Fully Embedded UI: All assets (Bootstrap, Chart.js, icons) compiled into the binary — no CDN or network required
  • OpenRPC API: 92 JSON-RPC 2.0 methods over Unix socket
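
To make the "TCP health check with retries" feature concrete, here is a minimal standard-library sketch. It is not hero_proc's implementation: the function name `tcp_health_check` and its parameters are hypothetical, and the real check may differ in timing and retry semantics.

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: returns true if a TCP connection to `addr`
/// succeeds on the first attempt or on one of `retries` further attempts.
fn tcp_health_check(addr: SocketAddr, timeout: Duration, retries: u32, pause: Duration) -> bool {
    for attempt in 0..=retries {
        // `connect_timeout` requires a non-zero timeout.
        if TcpStream::connect_timeout(&addr, timeout).is_ok() {
            return true;
        }
        // Wait before the next attempt, but not after the last one.
        if attempt < retries {
            sleep(pause);
        }
    }
    false
}
```

A check like this only proves that something accepts connections on the port; HTTP and exec-based checks can verify more.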

Architecture

hero_proc_server (daemon)
    | unix socket (IPC + JSON-RPC 2.0)
    v
hero_proc (CLI/TUI)        hero_proc_admin (web admin dashboard)
                         | unix socket (admin.sock)
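
The daemon speaks JSON-RPC 2.0 over its socket. As a rough illustration of what travels over that link, the sketch below builds a request envelope by hand. The envelope fields come from the JSON-RPC 2.0 specification; the helper name `jsonrpc_request` is hypothetical, the method name `system.ping` used in the usage note is an assumption, and how messages are framed on the socket is not covered here.

```rust
/// Hypothetical helper: builds a JSON-RPC 2.0 request object as a string.
/// `params_json` must already be valid JSON (for example "{}"), and
/// `method` must not contain characters that need JSON escaping.
fn jsonrpc_request(id: u64, method: &str, params_json: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"{}","params":{}}}"#,
        id, method, params_json
    )
}
```

Calling `jsonrpc_request(1, "system.ping", "{}")` yields `{"jsonrpc":"2.0","id":1,"method":"system.ping","params":{}}`. In practice the generated client in hero_proc_sdk builds these envelopes for you.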

Crate Structure

crates/
  hero_proc_sdk/               # OpenRPC client SDK — generated client + builders + factory
  hero_proc_server/            # Process supervisor daemon (JSON-RPC 2.0 via Unix socket)
  hero_proc/                   # Command-line interface + TUI
  hero_proc_admin/             # Web admin dashboard (Axum + Askama + Bootstrap)
  hero_proc_lib/               # SQLite persistence layer (jobs, runs, secrets, logging, services)
  hero_proc_examples/          # Runnable SDK usage examples
  hero_proc_test/              # Integration test suite + stress tests

Dependency Graph

        hero_proc_sdk (no internal deps)
           ^         ^        ^         ^
           |         |        |         |
        server      CLI      UI       lib

Every other crate depends on hero_proc_sdk. There are no cross-dependencies between server, CLI, UI (hero_proc_admin), or lib.

Ports and Sockets

Component          Binding               Default
hero_proc_server   Unix socket (IPC)     $HERO_SOCKET_DIR/hero_proc/rpc.sock
hero_proc_admin    Unix socket (admin)   $HERO_SOCKET_DIR/hero_proc/admin.sock

Core Concepts

Concept   Role                                                            Lifetime
Action    Executable template (script + interpreter + config)             Stored, reusable
Service   Supervision unit: desired state + auto-restart                  Ongoing, supervisor-managed
Job       Single execution of an action                                   Transient
Run       Universal grouping unit: groups jobs under a single lifecycle   Transient

Service

A service is a supervision unit (like a systemd unit). It declares a desired state and references one or more actions. The supervisor continuously reconciles reality with the desired state:

  • start — supervisor ensures the service is running; restarts on crash
  • stop — supervisor ensures the service is stopped
  • ignore — supervisor does not manage this service
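
The three desired states map naturally onto a small reconciliation function. The sketch below is an illustration only: the type names `Desired` and `Step` and the function `reconcile` are hypothetical, and the real supervisor also has to handle intermediate states such as Starting and Stopping.

```rust
/// Desired state declared by a service (names follow the list above).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Desired {
    Start,
    Stop,
    Ignore,
}

/// What one reconciliation pass decides to do (hypothetical).
#[derive(Debug, PartialEq)]
enum Step {
    Spawn,
    Terminate,
    Nothing,
}

/// Compares the desired state with reality and picks the next step.
fn reconcile(desired: Desired, is_running: bool) -> Step {
    match (desired, is_running) {
        // Should run but does not (never started, or crashed): start it.
        (Desired::Start, false) => Step::Spawn,
        // Should be stopped but still runs: stop it.
        (Desired::Stop, true) => Step::Terminate,
        // Already in the desired state, or the service is ignored.
        _ => Step::Nothing,
    }
}
```

Running this function repeatedly is what gives restart-on-crash behaviour: a crashed service shows up as `(Start, false)` on the next pass.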

Action

An action is a reusable executable template: a script, its interpreter, environment, timeout, retry policy, and dependency edges. An action can declare depends_on edges to other actions to order execution within a service.

Job

A job is a single execution of an action. Jobs can be one-shot (run and exit) or long-running processes (is_process = true), where exiting is treated as failure. Each job tracks phase (pending -> running -> succeeded/failed), PID, exit code, and logs.
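
The difference between one-shot jobs and long-running processes shows up at the moment the process exits. The sketch below encodes that rule; the names `Phase` and `phase_after_exit` are hypothetical, and treating exit code 0 as success for one-shot jobs is the usual convention assumed here rather than something stated above.

```rust
/// Job phases as listed above.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Phase {
    Pending,
    Running,
    Succeeded,
    Failed,
}

/// Phase a running job moves to once its process exits.
fn phase_after_exit(is_process: bool, exit_code: i32) -> Phase {
    if is_process {
        // A long-running process is expected to stay up, so any exit is a failure.
        Phase::Failed
    } else if exit_code == 0 {
        // Assumption: a one-shot job succeeds when it exits with code 0.
        Phase::Succeeded
    } else {
        Phase::Failed
    }
}
```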

Run

A run is the universal execution grouping unit. It serves two roles:

  1. Service run — created automatically when a service is started. Named service_{name}, with service_id pointing back to the owning service. If a service has 3 actions, starting it creates 1 run with 3 jobs.
  2. Ad-hoc run — standalone execution of a set of actions (e.g., build pipelines, one-off tasks). Name is required. service_id is None.

A run can depend on other runs by ID — the supervisor will not start it until all dependency runs have reached "ok". Status progression: created → waiting_deps → starting → running → ok | error | halted.
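
The two roles and the dependency gate can be sketched with plain types. This is an illustration under assumptions: the `Run` struct, the `u64` ID type, and both helper functions are hypothetical and do not mirror the SDK's generated types.

```rust
use std::collections::HashMap;

/// Run statuses in the order given above.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum RunStatus {
    Created,
    WaitingDeps,
    Starting,
    Running,
    Ok,
    Error,
    Halted,
}

/// Hypothetical in-memory view of a run.
#[allow(dead_code)]
struct Run {
    name: String,
    service_id: Option<u64>, // Some(..) for a service run, None for an ad-hoc run
    depends_on: Vec<u64>,    // IDs of runs that must reach Ok first
}

/// Name given to the run created when a service starts.
fn service_run_name(service: &str) -> String {
    format!("service_{service}")
}

/// A run may start only when every dependency run has reached Ok.
/// An unknown dependency ID counts as not satisfied.
fn deps_satisfied(run: &Run, statuses: &HashMap<u64, RunStatus>) -> bool {
    run.depends_on
        .iter()
        .all(|id| statuses.get(id) == Some(&RunStatus::Ok))
}
```

A run whose dependencies are not yet satisfied would stay in waiting_deps; a dependency that ends in error or halted never satisfies the gate in this sketch.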

Principles

  • Run is the universal grouping unit: both ad-hoc executions and service starts create a Run. The service_id field distinguishes them.
  • Cascade delete: deleting a Run or Service deletes all associated Jobs. A Job belongs to exactly one Run.
  • Clean restart: when a Service is started, previous Jobs for that Service are removed from the database by default (can be overridden).
  • Provenance tracking: each Job records its service_id and action_id so the origin is always traceable.
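
The cascade-delete and provenance principles can be pictured with an in-memory map standing in for the database. Everything here is a hypothetical sketch (the `Job` struct, the `u64` IDs, and both functions); the real persistence layer is SQLite in hero_proc_lib.

```rust
use std::collections::HashMap;

/// Hypothetical job record carrying its provenance.
#[allow(dead_code)]
struct Job {
    run_id: u64,             // a job belongs to exactly one run
    service_id: Option<u64>, // None for jobs of ad-hoc runs
    action_id: u64,
}

/// Deleting a run removes all of its jobs. Returns how many were removed.
fn delete_run(jobs: &mut HashMap<u64, Job>, run_id: u64) -> usize {
    let before = jobs.len();
    jobs.retain(|_, job| job.run_id != run_id);
    before - jobs.len()
}

/// Deleting a service removes every job that records that service.
fn delete_service(jobs: &mut HashMap<u64, Job>, service_id: u64) -> usize {
    let before = jobs.len();
    jobs.retain(|_, job| job.service_id != Some(service_id));
    before - jobs.len()
}
```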

For the full data model specification, see docs/README.md.

CLI Commands

All CLI commands are organized into subcommand groups:

Service Management

hero_proc service list              # List all services
hero_proc service status <name>     # Show service status
hero_proc service start <name>      # Start a service
hero_proc service stop <name>       # Stop (cascades to dependents)
hero_proc service restart <name>    # Restart a service
hero_proc service kill <name>       # Send signal to service
hero_proc service add <name>        # Add a service at runtime
hero_proc service add-job <svc> ... # Add a job to a service
hero_proc service remove <name>     # Remove a service
hero_proc service logs <name>       # View service logs
hero_proc service why <name>        # Show why service is blocked
hero_proc service tree              # Show dependency tree

Job Management

hero_proc job list                  # List jobs
hero_proc job get <id>              # Get job details
hero_proc job create ...            # Create a job
hero_proc job delete <id>           # Delete a job
hero_proc job status <id>           # Job status
hero_proc job logs <id>             # Job logs
hero_proc job retry <id>            # Retry a failed job
hero_proc job cancel <id>           # Cancel a running job

Run Tracking

hero_proc run list                  # List runs
hero_proc run get <id>              # Get run details
hero_proc run logs <id>             # Run logs
hero_proc run stats                 # Run statistics

Log Management

hero_proc log query                 # Query logs
hero_proc log filter                # Filter logs
hero_proc log prune                 # Prune old logs
hero_proc log export                # Export logs

Secrets Management

hero_proc secret set <key> <val>    # Set a secret
hero_proc secret get <key>          # Get a secret
hero_proc secret list               # List secrets
hero_proc secret delete <key>       # Delete a secret
hero_proc secret init               # Initialize secrets store
hero_proc secret pull               # Pull secrets from Forgejo
hero_proc secret push               # Push secrets to Forgejo

Actions

hero_proc action list               # List actions
hero_proc action get <name>         # Get action details
hero_proc action set ...            # Register an action
hero_proc action delete <name>      # Delete an action

Scripts

hero_proc script scan               # Scan for scripts
hero_proc script list               # List registered scripts
hero_proc script get <name>         # Get script details
hero_proc script set ...            # Register a script
hero_proc script delete <name>      # Delete a script
hero_proc script run <name>         # Run a script

System

hero_proc system ping               # Check daemon connectivity
hero_proc system health             # Server health check
hero_proc system stats              # System statistics
hero_proc system shutdown [--force] # Shutdown daemon
hero_proc system reset [--force]    # Stop all, delete all configs
hero_proc system wipe               # Wipe all data
hero_proc system demo               # Create demo services
hero_proc system schedules          # List scheduled actions

Debug

hero_proc debug state               # Full graph state dump
hero_proc debug procs               # Process tree dump

Other

hero_proc attach <name>             # Attach to PTY of running process
hero_proc tui                       # Launch interactive TUI dashboard

Web Admin Dashboard

The hero_proc_admin crate provides a real-time web admin dashboard with tabs for:

  • Actions: Registered actions with interpreter, timeout, and tags
  • Jobs: Job instances with phase, status, and logs; includes statistics
  • Runs: Execution runs with status and job counts
  • Services: Service management, dependencies, and action mappings
  • Secrets: Encrypted configuration values
  • Logs: Query and filter system logs by source, level, and timestamp

All UI assets (Bootstrap 5.3.3, Bootstrap Icons) are embedded in the binary via rust-embed.

# Start server + admin dashboard
service proc start --update --reset

SDK Usage

hero_proc_sdk is builder-first. The four fluent builders (RetryPolicyBuilder, ActionBuilder, ServiceBuilder, RunBuilder) cover everything you do against hero_proc, and a single HeroProcFactory handle (hp) exposes both convenience helpers and the full RPC surface via Deref.

For the full reference see crates/hero_proc_sdk/README.md and BUILDERS.md.

Connect

use hero_proc_sdk::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let hp = hero_proc_factory().await?;          // local Unix socket
    let pong = hp.system_ping(SystemPingInput {}).await?;
    println!("server: {}", pong.version);
    Ok(())
}

Remote:

let hp = HeroProcFactory::builder().http("http://10.0.0.1:8080").connect().await?;

Run a long-running daemon — ServiceBuilder

let svc = ServiceBuilder::new("api")
    .action(ActionBuilder::new("api", "node server.js")
        .env("PORT", "8080")
        .retry_builder(|b| b.max_attempts(10).delay_ms(2_000).backoff(true))
        .build())
    .requires(&["postgres"])
    .build();

hp.start_service("api", svc, 60).await?;          // register + start + wait

Presets when the full builder is overkill: simple_service, oneshot_service, system_service, sleep_service.
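
The retry settings in the ServiceBuilder example (`max_attempts`, `delay_ms`, `backoff`) suggest a schedule of waits between attempts. The sketch below shows one plausible reading and is an assumption throughout: it treats `backoff(true)` as doubling the delay after each failed attempt, which the SDK may or may not do, and `retry_delays_ms` is a hypothetical name.

```rust
/// Delays (in milliseconds) between consecutive attempts.
/// Assumption: with `backoff` enabled, the delay doubles after each failure.
fn retry_delays_ms(max_attempts: u32, delay_ms: u64, backoff: bool) -> Vec<u64> {
    // N attempts are separated by N - 1 waits.
    (0..max_attempts.saturating_sub(1))
        .map(|i| {
            if backoff {
                // Cap the shift so it cannot overflow; saturate the product.
                delay_ms.saturating_mul(1u64 << i.min(62))
            } else {
                delay_ms
            }
        })
        .collect()
}
```

Under this assumption, `retry_delays_ms(4, 2_000, true)` gives waits of 2000, 4000, and 8000 ms. Check BUILDERS.md for the policy the SDK actually applies.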

Submit a one-shot or batch — RunBuilder

// Trivial one-liner
let handle = hp.submit_oneshot("backup", "rsync -av /data /backup").await?;
hp.wait_run(handle.run_id, 300).await?;

// Multi-step batch with concurrency cap, mixed interpreters, auto-cleanup
let handle = RunBuilder::new("daily-checks")
    .max_concurrency(3)
    .add_inline_script_with(
        "row_count",
        "import sqlite3; print(sqlite3.connect('/data/app.db').execute('SELECT COUNT(*) FROM users').fetchone()[0])",
        Interpreter::Python3,
    )
    .add_inline_script_with("big_files", "ls /var/log | where size > 100mb", Interpreter::Nushell)
    .add_inline_script("notify", "curl -X POST https://hooks.example.com/done")
    .submit(&hp).await?;

The supervisor honours max_concurrency (1..=100) per run, walks the actions array in submission order, skips past dependency-blocked jobs, and auto-cleans inline actions when the run reaches ok. Defaults applied automatically when any inline action is present: cap=5, cleanup_on_success=true.
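
The scheduling rule described above (submission order, skip blocked jobs, respect the cap) fits in a few lines. This is a sketch of the rule, not the supervisor's code: `PendingJob`, `effective_cap`, and `pick_startable` are hypothetical names, and returning no cap when nothing was requested and no inline action is present is a placeholder, since that case is not specified here.

```rust
/// Hypothetical view of a job that has not started yet.
struct PendingJob<'a> {
    name: &'a str,
    blocked: bool, // true while a dependency has not finished
}

/// Validates a requested cap (1..=100) and applies the inline-action default of 5.
fn effective_cap(requested: Option<usize>, has_inline_action: bool) -> Result<Option<usize>, String> {
    match requested {
        Some(n) if (1..=100).contains(&n) => Ok(Some(n)),
        Some(n) => Err(format!("max_concurrency {n} is outside 1..=100")),
        None if has_inline_action => Ok(Some(5)),
        None => Ok(None), // not specified above; left open in this sketch
    }
}

/// Walks the queue in submission order, skips blocked jobs,
/// and returns at most as many names as there are free slots.
fn pick_startable<'a>(queue: &[PendingJob<'a>], running: usize, cap: usize) -> Vec<&'a str> {
    let free_slots = cap.saturating_sub(running);
    queue
        .iter()
        .filter(|job| !job.blocked)
        .take(free_slots)
        .map(|job| job.name)
        .collect()
}
```

With a cap of 3 and one job already running, a queue of `a`, `b` (blocked), `c`, `d` starts `a` and `c`; `d` waits for a free slot and `b` waits for its dependency.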

Interpreters: Bash (default), Sh, Python3, Node, Bun, Nushell, Exec, Ai, Mcp.

Convenience helpers on hp

hp.wait_run(run_id, secs).await?;             // poll to terminal state
hp.wait_job(job_id, secs).await?;
hp.wait_service_running("api", secs).await?;

hp.tail("api", 50).await?;                    // structured logs by service
hp.job_tail(job_id, 100).await?;              // one job's stdout/stderr
hp.search("api.*", 200).await?;               // wildcard search
hp.recent_errors(Some("api.*"), 50).await?;   // loglevel >= 3

Every generated RPC method is also available directly on hp (~109 methods, type-safe Input/Output structs).

Environment Variables

Required

Variable   Description
WEBROOT    Base URL of the hero_proc admin dashboard (e.g. http://127.0.0.1:9998/).

Optional

Variable               Default                               Description
HERO_PROC_LOG_LEVEL    info                                  Log level: trace, debug, info, warn, error
HERO_PROC_CONFIG_DIR   ~/hero/cfg/hero_proc                  Service config directory
HERO_PROC_SOCKET       $HERO_SOCKET_DIR/hero_proc/rpc.sock   Unix socket path
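
The socket default above implies a simple lookup order: an explicit HERO_PROC_SOCKET wins, otherwise the path is derived from HERO_SOCKET_DIR. The sketch below expresses that as a pure function so it can be tested without touching the environment. The function name `resolve_rpc_socket` is hypothetical, and what happens when neither variable is set is not specified here, so the sketch returns None in that case.

```rust
use std::path::PathBuf;

/// Resolves the RPC socket path from the two environment values.
/// Empty strings are treated as unset (an assumption of this sketch).
fn resolve_rpc_socket(hero_proc_socket: Option<&str>, hero_socket_dir: Option<&str>) -> Option<PathBuf> {
    if let Some(path) = hero_proc_socket.filter(|p| !p.is_empty()) {
        return Some(PathBuf::from(path));
    }
    hero_socket_dir
        .filter(|dir| !dir.is_empty())
        .map(|dir| PathBuf::from(dir).join("hero_proc").join("rpc.sock"))
}
```

In a real program the arguments would come from `std::env::var("HERO_PROC_SOCKET").ok().as_deref()` and `std::env::var("HERO_SOCKET_DIR").ok().as_deref()`.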

Shutdown Ordering

Services are stopped in reverse dependency order:

Example: database <- app <- worker (app depends on database; worker depends on app)

Startup order:   database -> app -> worker
Shutdown order:  worker -> app -> database

When stopping a single service, dependents are stopped first:

  • hero_proc service stop database stops worker, then app, then database
  • Dependencies are NOT auto-stopped (other services may need them)
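
The ordering rules above amount to a topological sort of the dependency graph. The sketch below reproduces the database/app/worker example with standard-library types. It is an illustration, not the supervisor's algorithm: the `Graph` alias and all function names are hypothetical, and the graph is assumed to be acyclic.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Maps each service to the services it depends on.
type Graph = BTreeMap<&'static str, Vec<&'static str>>;

/// Depth-first visit: emit all dependencies of `node` before `node` itself.
fn visit(node: &'static str, g: &Graph, seen: &mut BTreeSet<&'static str>, out: &mut Vec<&'static str>) {
    if !seen.insert(node) {
        return; // already emitted
    }
    for &dep in g.get(node).into_iter().flatten() {
        visit(dep, g, seen, out);
    }
    out.push(node);
}

/// Dependencies first, dependents last.
fn startup_order(g: &Graph) -> Vec<&'static str> {
    let mut seen = BTreeSet::new();
    let mut out = Vec::new();
    for &node in g.keys() {
        visit(node, g, &mut seen, &mut out);
    }
    out
}

/// Reverse of the startup order: dependents stop before their dependencies.
fn shutdown_order(g: &Graph) -> Vec<&'static str> {
    let mut order = startup_order(g);
    order.reverse();
    order
}

/// Stopping one service: its transitive dependents first, then the service itself.
/// Its own dependencies are left running.
fn stop_order(g: &Graph, target: &'static str) -> Vec<&'static str> {
    let mut affected: BTreeSet<&'static str> = BTreeSet::from([target]);
    loop {
        let before = affected.len();
        for (svc, deps) in g {
            if deps.iter().any(|d| affected.contains(*d)) {
                affected.insert(*svc);
            }
        }
        if affected.len() == before {
            break; // no new dependents found
        }
    }
    shutdown_order(g)
        .into_iter()
        .filter(|svc| affected.contains(*svc))
        .collect()
}
```

With `database`, `app` (depends on `database`), and `worker` (depends on `app`), `stop_order(&g, "database")` returns worker, app, database, while `stop_order(&g, "worker")` returns only worker.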

Development

Start / Stop

service proc start --update --reset   # Install, build, and start
service proc stop                     # Graceful shutdown
service proc start --clear            # Wipe state and restart fresh

Integration Tests

cargo test --test shutdown -- --nocapture --test-threads=1  # Shutdown tests