rename worker to actor
@@ -1,10 +1,10 @@
 # Hero Supervisor Protocol
 
-This document describes the Redis-based protocol used by the Hero Supervisor for job management and worker communication.
+This document describes the Redis-based protocol used by the Hero Supervisor for job management and actor communication.
 
 ## Overview
 
-The Hero Supervisor uses Redis as a message broker and data store for managing distributed job execution. Jobs are stored as Redis hashes, and communication with workers happens through Redis lists (queues).
+The Hero Supervisor uses Redis as a message broker and data store for managing distributed job execution. Jobs are stored as Redis hashes, and communication with actors happens through Redis lists (queues).
 
 ## Redis Namespace
 
@@ -22,7 +22,7 @@ hero:job:{job_id}
 **Job Hash Fields:**
 - `id`: Unique job identifier (UUID v4)
 - `caller_id`: Identifier of the client that created the job
-- `worker_id`: Target worker identifier
+- `actor_id`: Target actor identifier
 - `context_id`: Execution context identifier
 - `script`: Script content to execute (Rhai or HeroScript)
 - `timeout`: Execution timeout in seconds
@@ -35,8 +35,8 @@ hero:job:{job_id}
 - `env_vars`: Environment variables as JSON object (optional)
 - `prerequisites`: JSON array of job IDs that must complete before this job (optional)
 - `dependents`: JSON array of job IDs that depend on this job completing (optional)
-- `output`: Job execution result (set by worker)
-- `error`: Error message if job failed (set by worker)
+- `output`: Job execution result (set by actor)
+- `error`: Error message if job failed (set by actor)
 - `dependencies`: List of job IDs that this job depends on
 
 ### Job Dependencies
@@ -47,19 +47,19 @@ Jobs can have dependencies on other jobs, which are stored in the `dependencies`
 
 Jobs are queued for execution using Redis lists:
 ```
-hero:work_queue:{worker_id}
+hero:work_queue:{actor_id}
 ```
 
-Workers listen on their specific queue using `BLPOP` for job IDs to process.
+Actors listen on their specific queue using `BLPOP` for job IDs to process.
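
A note on ordering, offered as a minimal sketch rather than the project's implementation: `LPUSH` inserts at the head of a Redis list and `BLPOP` removes from the head, so with the commands shown an actor receives the most recently queued job ID first. The model below uses a `VecDeque` in place of a live Redis list; `WorkQueue` and `drain_order` are hypothetical names introduced only for illustration.

```rust
use std::collections::VecDeque;

/// In-memory stand-in for one Redis list such as `hero:work_queue:{actor_id}`.
/// Hypothetical type for illustration; not part of the Hero Supervisor API.
#[derive(Default)]
pub struct WorkQueue {
    items: VecDeque<String>,
}

impl WorkQueue {
    /// Mirrors `LPUSH`: insert at the head and return the new length.
    pub fn lpush(&mut self, job_id: &str) -> usize {
        self.items.push_front(job_id.to_string());
        self.items.len()
    }

    /// Mirrors the non-blocking part of `BLPOP`: remove from the head.
    pub fn lpop(&mut self) -> Option<String> {
        self.items.pop_front()
    }

    /// Mirrors `LLEN`.
    pub fn llen(&self) -> usize {
        self.items.len()
    }
}

/// Push the given job IDs in order, then drain the queue the way an actor would.
pub fn drain_order(job_ids: &[&str]) -> Vec<String> {
    let mut queue = WorkQueue::default();
    for id in job_ids {
        queue.lpush(id);
    }
    let mut seen = Vec::new();
    while let Some(id) = queue.lpop() {
        seen.push(id);
    }
    seen
}
```

Under these assumptions, `drain_order(&["job-1", "job-2", "job-3"])` returns the IDs in the order `job-3`, `job-2`, `job-1`. If first-in, first-out processing is wanted, Redis also provides `RPUSH` and `BRPOP`; the protocol text does not say which order is intended.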
 
 ### Stop Queues
 
 Job stop requests are sent through dedicated stop queues:
 ```
-hero:stop_queue:{worker_id}
+hero:stop_queue:{actor_id}
 ```
 
-Workers monitor these queues to receive stop requests for running jobs.
+Actors monitor these queues to receive stop requests for running jobs.
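
One way an actor might check its stop queue for the job it is currently running is sketched below. It assumes the scan-then-remove pattern that the Redis commands `LRANGE` and `LREM` allow, so that stop requests addressed to other jobs stay queued. The stop queue is modelled as a `VecDeque`, and `take_stop_request` is a hypothetical helper, not a function from the codebase.

```rust
use std::collections::VecDeque;

/// In-memory stand-in for the list at `hero:stop_queue:{actor_id}`.
/// Returns true and removes one matching entry if a stop was requested
/// for `current_job_id`; requests for other jobs are left in place.
pub fn take_stop_request(stop_queue: &mut VecDeque<String>, current_job_id: &str) -> bool {
    // Equivalent of scanning `LRANGE key 0 -1` for the current job ID.
    match stop_queue.iter().position(|id| id == current_job_id) {
        Some(index) => {
            // Equivalent of `LREM key 1 job_id`: drop the first match only.
            stop_queue.remove(index);
            true
        }
        None => false,
    }
}
```

With a queue holding `job-9` and `job-7`, calling `take_stop_request(&mut queue, "job-7")` returns `true` and leaves only `job-9` behind, so the request for the other job is not lost.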
 
 ### Reply Queues
 
@@ -68,7 +68,7 @@ For synchronous job execution, dedicated reply queues are used:
 hero:reply:{job_id}
 ```
 
-Workers send results to these queues when jobs complete.
+Actors send results to these queues when jobs complete.
 
 ## Job Lifecycle
 
@@ -79,20 +79,20 @@ Client -> Redis: HSET hero:job:{job_id} {job_fields}
 
 ### 2. Job Submission
 ```
-Client -> Redis: LPUSH hero:work_queue:{worker_id} {job_id}
+Client -> Redis: LPUSH hero:work_queue:{actor_id} {job_id}
 ```
 
 ### 3. Job Processing
 ```
-Worker -> Redis: BLPOP hero:work_queue:{worker_id}
-Worker -> Redis: HSET hero:job:{job_id} status "started"
-Worker: Execute script
-Worker -> Redis: HSET hero:job:{job_id} status "finished" output "{result}"
+Actor -> Redis: BLPOP hero:work_queue:{actor_id}
+Actor -> Redis: HSET hero:job:{job_id} status "started"
+Actor: Execute script
+Actor -> Redis: HSET hero:job:{job_id} status "finished" output "{result}"
 ```
 
 ### 4. Job Completion (Async)
 ```
-Worker -> Redis: LPUSH hero:reply:{job_id} {result}
+Actor -> Redis: LPUSH hero:reply:{job_id} {result}
 ```
 
 ## API Operations
@@ -110,7 +110,7 @@ supervisor.list_jobs() -> Vec<String>
 supervisor.stop_job(job_id) -> Result<(), SupervisorError>
 ```
 **Redis Operations:**
-- `LPUSH hero:stop_queue:{worker_id} {job_id}` - Send stop request
+- `LPUSH hero:stop_queue:{actor_id} {job_id}` - Send stop request
 
 ### Get Job Status
 ```rust
@@ -131,20 +131,20 @@ supervisor.get_job_logs(job_id) -> Result<Option<String>, SupervisorError>
 
 ### Run Job and Await Result
 ```rust
-supervisor.run_job_and_await_result(job, worker_id) -> Result<String, SupervisorError>
+supervisor.run_job_and_await_result(job, actor_id) -> Result<String, SupervisorError>
 ```
 **Redis Operations:**
 1. `HSET hero:job:{job_id} {job_fields}` - Store job
-2. `LPUSH hero:work_queue:{worker_id} {job_id}` - Submit job
+2. `LPUSH hero:work_queue:{actor_id} {job_id}` - Submit job
 3. `BLPOP hero:reply:{job_id} {timeout}` - Wait for result
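
The three operations above can be sketched end to end. This is a minimal model under stated assumptions: `Broker` is a hypothetical in-memory stand-in for Redis built on `Mutex` and `Condvar`, `run_and_await` and `serve_one_job` are illustrative names rather than the `supervisor` API, and script execution is replaced by echoing the script text back as the result.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::{Duration, Instant};

#[derive(Default)]
struct State {
    hashes: HashMap<String, HashMap<String, String>>,
    lists: HashMap<String, VecDeque<String>>,
}

/// Hypothetical in-memory stand-in for the Redis hashes and lists used here.
#[derive(Default)]
pub struct Broker {
    state: Mutex<State>,
    changed: Condvar,
}

impl Broker {
    /// Mirrors `HSET key field value`.
    pub fn hset(&self, key: &str, field: &str, value: &str) {
        let mut s = self.state.lock().unwrap();
        s.hashes.entry(key.into()).or_default().insert(field.into(), value.into());
    }

    /// Mirrors `HGET key field`.
    pub fn hget(&self, key: &str, field: &str) -> Option<String> {
        let s = self.state.lock().unwrap();
        s.hashes.get(key)?.get(field).cloned()
    }

    /// Mirrors `LPUSH key value`, then wakes any blocked `blpop` caller.
    pub fn lpush(&self, key: &str, value: &str) {
        let mut s = self.state.lock().unwrap();
        s.lists.entry(key.into()).or_default().push_front(value.into());
        self.changed.notify_all();
    }

    /// Mirrors `BLPOP key timeout`: wait for a value or return `None` on timeout.
    pub fn blpop(&self, key: &str, timeout: Duration) -> Option<String> {
        let deadline = Instant::now() + timeout;
        let mut s = self.state.lock().unwrap();
        loop {
            if let Some(value) = s.lists.get_mut(key).and_then(|l| l.pop_front()) {
                return Some(value);
            }
            let remaining = deadline.checked_duration_since(Instant::now())?;
            s = self.changed.wait_timeout(s, remaining).unwrap().0;
        }
    }
}

/// Client side: store the job, queue it, then block on the reply queue.
pub fn run_and_await(
    broker: &Broker,
    job_id: &str,
    actor_id: &str,
    script: &str,
    timeout: Duration,
) -> Result<String, String> {
    broker.hset(&format!("hero:job:{job_id}"), "script", script); // 1. store job
    broker.lpush(&format!("hero:work_queue:{actor_id}"), job_id); // 2. submit job
    broker
        .blpop(&format!("hero:reply:{job_id}"), timeout) // 3. wait for result
        .ok_or_else(|| format!("timed out waiting for {job_id}"))
}

/// Actor side, reduced to a single job: pop an ID, "execute", record, reply.
pub fn serve_one_job(broker: &Broker, actor_id: &str) {
    let queue = format!("hero:work_queue:{actor_id}");
    if let Some(job_id) = broker.blpop(&queue, Duration::from_secs(5)) {
        let job_key = format!("hero:job:{job_id}");
        broker.hset(&job_key, "status", "started");
        // Placeholder for script execution: echo the script text as the result.
        let result = broker.hget(&job_key, "script").unwrap_or_default();
        broker.hset(&job_key, "status", "finished");
        broker.hset(&job_key, "output", &result);
        broker.lpush(&format!("hero:reply:{job_id}"), &result);
    }
}

/// Run one client and one actor thread against a shared in-memory broker.
pub fn demo() -> Result<String, String> {
    let broker = Arc::new(Broker::default());
    let actor_broker = Arc::clone(&broker);
    let actor = thread::spawn(move || serve_one_job(&actor_broker, "actor-1"));
    let result = run_and_await(&broker, "job-1", "actor-1", "40 + 2", Duration::from_secs(5));
    actor.join().unwrap();
    result
}
```

Because the reply key includes the job ID, each caller blocks on its own list and cannot receive another job's result. A caller whose `BLPOP` times out gets no reply even if the actor finishes later, so in that case the job hash is the place to look for the eventual `status` and `output`.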
 
-## Worker Protocol
+## Actor Protocol
 
 ### Job Processing Loop
 ```rust
 loop {
     // 1. Wait for job
-    job_id = BLPOP hero:work_queue:{worker_id}
+    job_id = BLPOP hero:work_queue:{actor_id}
     
     // 2. Get job details
     job_data = HGETALL hero:job:{job_id}
@@ -153,8 +153,8 @@ loop {
     HSET hero:job:{job_id} status "started"
     
     // 4. Check for stop requests
-    if LLEN hero:stop_queue:{worker_id} > 0 {
-        stop_job_id = LPOP hero:stop_queue:{worker_id}
+    if LLEN hero:stop_queue:{actor_id} > 0 {
+        stop_job_id = LPOP hero:stop_queue:{actor_id}
         if stop_job_id == job_id {
             HSET hero:job:{job_id} status "error" error "stopped"
             continue
@@ -175,15 +175,15 @@ loop {
 ```
 
 ### Stop Request Handling
-Workers should periodically check the stop queue during long-running jobs:
+Actors should periodically check the stop queue during long-running jobs:
 ```rust
-if LLEN hero:stop_queue:{worker_id} > 0 {
-    stop_requests = LRANGE hero:stop_queue:{worker_id} 0 -1
+if LLEN hero:stop_queue:{actor_id} > 0 {
+    stop_requests = LRANGE hero:stop_queue:{actor_id} 0 -1
     if stop_requests.contains(current_job_id) {
         // Stop current job execution
         HSET hero:job:{current_job_id} status "error" error "stopped_by_request"
         // Remove stop request
-        LREM hero:stop_queue:{worker_id} 1 current_job_id
+        LREM hero:stop_queue:{actor_id} 1 current_job_id
         return
     }
 }
@@ -193,17 +193,17 @@ if LLEN hero:stop_queue:{worker_id} > 0 {
 
 ### Job Timeouts
 - Client sets timeout when creating job
-- Worker should respect timeout and stop execution
+- Actor should respect timeout and stop execution
 - If timeout exceeded: `HSET hero:job:{job_id} status "error" error "timeout"`
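
A minimal sketch of how an actor could bound execution time, assuming the script runs on a helper thread and the actor waits on a channel. `run_with_timeout` and `Outcome` are hypothetical names, and the `work` closure stands in for script execution; none of this is taken from the Hero Supervisor code.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Fields an actor would write back with `HSET hero:job:{job_id} ...`.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Finished { output: String },
    Error { error: String },
}

/// Run `work` on a helper thread and stop waiting after `timeout`.
/// `work` is a placeholder closure for script execution.
pub fn run_with_timeout<F>(work: F, timeout: Duration) -> Outcome
where
    F: FnOnce() -> String + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails harmlessly if the receiver has already given up.
        let _ = tx.send(work());
    });
    match rx.recv_timeout(timeout) {
        Ok(output) => Outcome::Finished { output },
        // No result within the limit: record the error string the protocol names.
        Err(mpsc::RecvTimeoutError::Timeout) => Outcome::Error { error: "timeout".into() },
        // The sender was dropped without a result, for example because `work` panicked.
        Err(mpsc::RecvTimeoutError::Disconnected) => Outcome::Error {
            error: "execution failed".into(),
        },
    }
}
```

The helper thread is not cancelled when the wait times out; it keeps running until `work` returns. Actually stopping execution, as the list above asks, would need cooperative cancellation inside the script engine, which this sketch does not model.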
 
-### Worker Failures
-- If worker crashes, job remains in "started" status
+### Actor Failures
+- If actor crashes, job remains in "started" status
 - Monitoring systems can detect stale jobs and retry
-- Jobs can be requeued: `LPUSH hero:work_queue:{worker_id} {job_id}`
+- Jobs can be requeued: `LPUSH hero:work_queue:{actor_id} {job_id}`
 
 ### Redis Connection Issues
 - Clients should implement retry logic with exponential backoff
-- Workers should reconnect and resume processing
+- Actors should reconnect and resume processing
 - Use Redis persistence to survive Redis restarts
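
The retry advice above can be illustrated with a small helper. This is a sketch under assumptions: `retry_with_backoff` is a hypothetical function, the operation is any closure returning a `Result` (a reconnect attempt, for instance), and the delay doubles after each failure up to a cap. Jitter, which is often added in practice, is left out to keep the example short.

```rust
use std::thread;
use std::time::Duration;

/// Retry `op` with exponential backoff: wait `base`, then 2x, 4x, ... capped at `max_delay`.
/// `op` receives the 1-based attempt number. At least one attempt is always made.
/// Returns the first success, or the last error once `max_attempts` is reached.
pub fn retry_with_backoff<T, E, F>(
    mut op: F,
    max_attempts: u32,
    base: Duration,
    max_delay: Duration,
) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    let mut delay = base;
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the most recent error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                thread::sleep(delay);
                delay = (delay * 2).min(max_delay);
                attempt += 1;
            }
        }
    }
}
```

A caller might wrap its connection attempt as `retry_with_backoff(|_| connect(), 5, Duration::from_millis(100), Duration::from_secs(5))`, where `connect` is whatever function the client uses to open its Redis connection.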
 
 ## Monitoring and Observability
@@ -211,10 +211,10 @@ if LLEN hero:stop_queue:{worker_id} > 0 {
 ### Queue Monitoring
 ```bash
 # Check work queue length
-LLEN hero:work_queue:{worker_id}
+LLEN hero:work_queue:{actor_id}
 
 # Check stop queue length  
-LLEN hero:stop_queue:{worker_id}
+LLEN hero:stop_queue:{actor_id}
 
 # List all jobs
 KEYS hero:job:*
@@ -228,7 +228,7 @@ HGETALL hero:job:{job_id}
 - Jobs completed per second
 - Average job execution time
 - Queue depths
-- Worker availability
+- Actor availability
 - Error rates by job type
 
 ## Security Considerations
@@ -237,7 +237,7 @@ HGETALL hero:job:{job_id}
 - Use Redis AUTH for authentication
 - Enable TLS for Redis connections
 - Restrict Redis network access
-- Use Redis ACLs to limit worker permissions
+- Use Redis ACLs to limit actor permissions
 
 ### Job Security
 - Validate script content before execution
@@ -265,8 +265,8 @@ HGETALL hero:job:{job_id}
 - Batch similar jobs when possible
 - Implement job prioritization if needed
 
-### Worker Optimization
-- Pool worker connections to Redis
+### Actor Optimization
+- Pool actor connections to Redis
 - Use async I/O for Redis operations
 - Implement graceful shutdown handling
-- Monitor worker resource usage
+- Monitor actor resource usage