Refactor home scripts to be wrappers around component scripts

- Simplified build.sh to call individual component build scripts
- Refactored run.sh to call component run scripts with live log display
- Each service gets color-coded prefix ([SUPER], [OSIRS], [COORD])
- Run script keeps running and displays interleaved logs from all services
- Added proper signal handling for graceful shutdown of all services
- Updated README with new architecture and usage examples
Timur Gordon
2025-11-04 17:05:41 +01:00
parent ac6020d883
commit 4aa1a20010
8 changed files with 1780 additions and 0 deletions

scripts/README.md Normal file

@@ -0,0 +1,181 @@
# Hero System Scripts
This directory contains wrapper scripts for building and running the Hero system components.
## Overview
The Hero system consists of three main components:
- **Hero Coordinator**: Manages job coordination and routing
- **Hero Supervisor**: Manages runner processes and job execution
- **Osiris Runner**: Executes jobs using the Osiris protocol
## Architecture
The scripts in this directory are **wrappers** that delegate to individual component scripts:
- `home/scripts/build.sh` → calls `build.sh` in each component repo
- `home/scripts/run.sh` → calls `run.sh` in each component repo
- Each component's `run.sh` → calls its own `build.sh` before running
This ensures each component is self-contained and can be built/run independently.
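As an illustration of the "build, then run" delegation described above, a component's `run.sh` could look roughly like the following. This is a hypothetical sketch, not the contents of any component repo: the function name `build_then_run` and its arguments are assumptions made for the example.

```bash
#!/bin/bash
# Hypothetical sketch of a component-level run.sh (not the actual file).
# Assumes build.sh lives in the directory given as the first argument.

# Build first, then replace the shell process with the service binary.
build_then_run() {
    local script_dir=$1   # directory containing build.sh
    local binary=$2       # path to the built executable
    shift 2
    "$script_dir/build.sh" || return 1
    # exec keeps the PID stable: the PID of this script becomes the PID
    # of the service, so a wrapper can record and signal it directly.
    exec "$binary" "$@"
}
```

Using `exec` here is a design choice for the sketch: it means whatever PID the wrapper records for the script is also the PID of the running service.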
## Scripts
### build.sh
Wrapper that builds all Hero system components by calling their individual build scripts.
**Usage:**
```bash
./build.sh
```
This calls:
- `herocoordinator/scripts/build.sh`
- `supervisor/scripts/build.sh`
- `runner_rust/scripts/build.sh`
### run.sh
Manages Hero system services.
**Usage:**
```bash
./run.sh [command]
```
**Commands:**
- `start` - Start all services (default); each service builds itself first, then live logs stream until Ctrl+C
- `stop` - Stop all services
- `restart` - Restart all services
- `status` - Show status of all services
**Examples:**
```bash
./run.sh start    # Start all services (builds automatically, then streams logs)
./run.sh status   # Check status (from another terminal)
./run.sh stop     # Stop all services
./run.sh restart  # Restart all services
```
**Note:** Each component's `run.sh` automatically calls its `build.sh` before starting, so you don't need to build manually.
## Components
### Coordinator
- **Binary:** `herocoordinator/target/release/herocoordinator`
- **Port:** 8081 (configurable via `COORDINATOR_PORT`)
- **Purpose:** Coordinates job execution across the system
### Supervisor
- **Binary:** `supervisor/target/release/supervisor`
- **Port:** 3030 (configurable via `SUPERVISOR_PORT`)
- **Purpose:** Manages runners and job dispatch via OpenRPC
### Osiris Runner
- **Binary:** `runner_rust/target/release/runner_osiris`
- **Purpose:** Executes Osiris-specific jobs
## Configuration
Set environment variables before running:
```bash
# Redis connection
export REDIS_URL="redis://127.0.0.1:6379"
# Service ports
export COORDINATOR_PORT=8081
export SUPERVISOR_PORT=3030
# Logging
export LOG_LEVEL=info # Options: trace, debug, info, warn, error
```
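A script that consumes these variables might apply the documented defaults as sketched below. This is an assumption about how the values could be read, not a copy of the component scripts; `validate_log_level` is a hypothetical helper.

```bash
# Sketch: apply the documented defaults only when a variable is unset or empty.
: "${REDIS_URL:=redis://127.0.0.1:6379}"
: "${COORDINATOR_PORT:=8081}"
: "${SUPERVISOR_PORT:=3030}"
: "${LOG_LEVEL:=info}"
export REDIS_URL COORDINATOR_PORT SUPERVISOR_PORT LOG_LEVEL

# Hypothetical helper: accept only the log levels listed above.
validate_log_level() {
    case "$1" in
        trace|debug|info|warn|error) return 0 ;;
        *) return 1 ;;
    esac
}
```

A caller could then fail early with something like `validate_log_level "$LOG_LEVEL" || exit 1`.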
## Logs and PIDs
- **Logs:** `/Users/timurgordon/code/git.ourworld.tf/herocode/home/logs/`
  - `coordinator.log`
  - `supervisor.log`
  - `osiris_runner.log`
- **PIDs:** `/Users/timurgordon/code/git.ourworld.tf/herocode/home/pids/`
  - `coordinator.pid`
  - `supervisor.pid`
  - `osiris_runner.pid`
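
To check by hand whether the process recorded in one of these PID files is still alive, a small helper along the following lines can be used. It is a sketch only: `pid_file_alive` is a hypothetical name, and it uses `kill -0` where `run.sh` itself uses `ps -p`.

```bash
# Sketch: report whether the process recorded in a PID file is alive.
# kill -0 delivers no signal; it only tests that the PID exists and
# that the caller is allowed to signal it.
pid_file_alive() {
    local pid_file=$1
    [ -f "$pid_file" ] || return 1
    local pid
    pid=$(cat "$pid_file")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}
```

For example, `pid_file_alive path/to/supervisor.pid && echo "supervisor is up"`, with the path adjusted to the PID directory listed above.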
## Prerequisites
1. **Redis** must be running:
   ```bash
   redis-server
   ```
2. **Rust toolchain** must be installed:
   ```bash
   rustc --version
   cargo --version
   ```
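
The prerequisites above can also be checked up front with a short preflight function. The sketch below is illustrative; `require_cmds` is a hypothetical helper and is not part of the scripts in this directory.

```bash
# Sketch: verify that required tools are on PATH before starting anything.
require_cmds() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" > /dev/null 2>&1; then
            echo "missing required command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A wrapper could call `require_cmds redis-cli cargo rustc || exit 1` before doing any work.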
## Troubleshooting
### Services won't start
1. Check if binaries are built (paths are relative to `home/scripts`):
   ```bash
   ls -la ../../herocoordinator/target/release/herocoordinator
   ls -la ../../supervisor/target/release/supervisor
   ls -la ../../runner_rust/target/release/runner_osiris
   ```
2. If missing, build them:
   ```bash
   ./build.sh
   ```
### Check logs
```bash
# `./run.sh start` streams all service logs in the foreground.
# To follow an individual log file (paths relative to home/scripts):
tail -f ../logs/coordinator.log
tail -f ../logs/supervisor.log
tail -f ../logs/osiris_runner.log
```
### Redis not running
```bash
# Start Redis
redis-server
# Or in background
redis-server --daemonize yes
```
### Port already in use
Change the port via environment variables:
```bash
COORDINATOR_PORT=8082 SUPERVISOR_PORT=3031 ./run.sh start
```
## Development Workflow
```bash
# 1. Make code changes
# ... edit code ...
# 2. Rebuild all components (build.sh takes no arguments)
./build.sh
# 3. Restart services (live logs stream in the foreground)
./run.sh restart
# 4. Check status from another terminal
./run.sh status
```

scripts/build.sh Executable file

@@ -0,0 +1,20 @@
#!/bin/bash
set -e
# Hero System Build Script - Wrapper
# Calls build.sh in each component repo
# Base directory; can be overridden from the environment
HERO_BASE="${HERO_BASE:-/Users/timurgordon/code/git.ourworld.tf/herocode}"
echo "Building Hero System components..."
# Build coordinator
"$HERO_BASE/herocoordinator/scripts/build.sh"
# Build supervisor
"$HERO_BASE/supervisor/scripts/build.sh"
# Build osiris runner
"$HERO_BASE/runner_rust/scripts/build.sh"
echo "✅ All Hero System components built successfully"

scripts/run.sh Normal file → Executable file

@@ -0,0 +1,222 @@
#!/bin/bash
set -e
# Hero System Startup Script - Wrapper
# Runs individual run.sh scripts in background with logging
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Base directory
HERO_BASE="${HERO_BASE:-/Users/timurgordon/code/git.ourworld.tf/herocode}"
# Log directory
LOG_DIR="$HERO_BASE/home/logs"
mkdir -p "$LOG_DIR"
# PID file directory
PID_DIR="$HERO_BASE/home/pids"
mkdir -p "$PID_DIR"
# Function to print colored messages
log_info() {
    echo -e "${BLUE}[INFO]${NC} $1"
}
log_success() {
    echo -e "${GREEN}[SUCCESS]${NC} $1"
}
log_warning() {
    echo -e "${YELLOW}[WARNING]${NC} $1"
}
log_error() {
    echo -e "${RED}[ERROR]${NC} $1"
}
# Function to check if Redis is running
check_redis() {
    log_info "Checking Redis connection..."
    if redis-cli ping > /dev/null 2>&1; then
        log_success "Redis is running"
        return 0
    else
        log_error "Redis is not running. Please start Redis first."
        log_info "You can start Redis with: redis-server"
        return 1
    fi
}
# Function to start a component with log prefix
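start_component() {
    local name=$1
    local script=$2
    local log_file=$3
    local pid_file=$4
    local prefix=$5
    log_info "Starting $name..."
    # Start the script and send its output through a prefix filter.
    # Process substitution is used instead of a pipeline so that $! below
    # is the PID of the component script, not of the filter loop.
    "$script" > >(
        while IFS= read -r line; do
            printf '%s\n' "$line" >> "$log_file"   # plain copy for the log file
            echo -e "${prefix}${NC} $line"         # prefixed copy for the terminal
        done
    ) 2>&1 &
    local pid=$!
    echo "$pid" > "$pid_file"
    sleep 2
    if ps -p "$pid" > /dev/null 2>&1; then
        log_success "$name started (PID: $pid)"
    else
        log_error "Failed to start $name"
        return 1
    fi
}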
# Function to stop a process
stop_process() {
    local name=$1
    local pid_file=$2
    if [ -f "$pid_file" ]; then
        local pid
        pid=$(cat "$pid_file")
        if ps -p "$pid" > /dev/null 2>&1; then
            log_info "Stopping $name (PID: $pid)..."
            kill "$pid" 2>/dev/null || true
            sleep 2
            # Force kill if still running
            if ps -p "$pid" > /dev/null 2>&1; then
                log_warning "Force killing $name..."
                kill -9 "$pid" 2>/dev/null || true
            fi
            rm -f "$pid_file"
            log_success "$name stopped"
        else
            log_info "$name is not running"
            rm -f "$pid_file"
        fi
    else
        log_info "$name is not running"
    fi
}
# Function to show status
show_status() {
    echo ""
    log_info "Hero System Status:"
    echo ""
    for service in coordinator supervisor osiris_runner; do
        local pid_file="$PID_DIR/${service}.pid"
        if [ -f "$pid_file" ]; then
            local pid
            pid=$(cat "$pid_file")
            # The colored marker glyph was lost in the source; "●" is a stand-in
            if ps -p "$pid" > /dev/null 2>&1; then
                echo -e " ${GREEN}●${NC} $service (PID: $pid)"
            else
                echo -e " ${RED}●${NC} $service (dead, PID file exists)"
            fi
        else
            echo -e " ${RED}●${NC} $service (not running)"
        fi
    done
    echo ""
}
# Function to stop all services
stop_all() {
    log_info "Stopping Hero System..."
    stop_process "Osiris Runner" "$PID_DIR/osiris_runner.pid"
    stop_process "Supervisor" "$PID_DIR/supervisor.pid"
    stop_process "Coordinator" "$PID_DIR/coordinator.pid"
    log_success "All services stopped"
}
# Trap handler for cleanup on exit
cleanup() {
    echo ""
    log_warning "Received exit signal, stopping all services..."
    stop_all
    exit 0
}
# Main script
case "${1:-start}" in
    start)
        log_info "Starting Hero System..."
        # Set up trap for SIGINT and SIGTERM
        trap cleanup SIGINT SIGTERM
        # Check Redis
        if ! check_redis; then
            exit 1
        fi
        # # Start services
        # start_component "Coordinator" \
        #     "$HERO_BASE/herocoordinator/scripts/run.sh" \
        #     "$LOG_DIR/coordinator.log" \
        #     "$PID_DIR/coordinator.pid" \
        #     "${GREEN}[COORD]" || exit 1
        start_component "Supervisor" \
            "$HERO_BASE/supervisor/scripts/run.sh" \
            "$LOG_DIR/supervisor.log" \
            "$PID_DIR/supervisor.pid" \
            "${BLUE}[SUPER]" || exit 1
        start_component "Osiris Runner" \
            "$HERO_BASE/runner_rust/scripts/run.sh" \
            "$LOG_DIR/osiris_runner.log" \
            "$PID_DIR/osiris_runner.pid" \
            "${YELLOW}[OSIRS]" || exit 1
        echo ""
        log_success "Hero System started successfully!"
        show_status
        echo ""
        log_info "Displaying live logs from all services..."
        log_info "Press Ctrl+C to stop all services"
        echo ""
        # Keep running and wait for signals
        wait
        ;;
    stop)
        stop_all
        ;;
    restart)
        stop_all
        sleep 2
        # Quote $0 so a script path containing spaces still works
        "$0" start
        ;;
    status)
        show_status
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status}"
        echo ""
        echo "Commands:"
        echo "  start   - Start all Hero services with live log display (default)"
        echo "  stop    - Stop all Hero services"
        echo "  restart - Restart all Hero services"
        echo "  status  - Show status of all services"
        echo ""
        echo "Note: 'start' displays live logs from all services with color-coded prefixes."
        echo "      Each service's run.sh will build itself before starting."
        exit 1
        ;;
esac