Maxime Van Hees
2025-08-14 14:14:34 +02:00
parent 04a1af2423
commit 0ebda7c1aa
59 changed files with 6950 additions and 354 deletions

docs/JOBS_QUICKSTART.md Normal file

@@ -0,0 +1,209 @@
# Jobs Quickstart: Create and Send a Simple Job to the Supervisor
This guide shows what a simple job looks like, how to construct one, and how to submit it to the Supervisor. It covers:
- The minimal fields a job needs
- Picking an actor via script type
- Submitting a job using the Rust API
- Submitting a job via the OpenRPC server over Unix IPC (and WS)
Key references:
- [rust.ScriptType](core/job/src/lib.rs:16) determines the target actor queue
- [rust.Job](core/job/src/lib.rs:87) is the canonical job payload stored in Redis
- [rust.JobBuilder::new()](core/job/src/builder.rs:47), [rust.JobBuilder::caller_id()](core/job/src/builder.rs:79), [rust.JobBuilder::context_id()](core/job/src/builder.rs:74), [rust.JobBuilder::script_type()](core/job/src/builder.rs:69), [rust.JobBuilder::script()](core/job/src/builder.rs:84), [rust.JobBuilder::timeout()](core/job/src/builder.rs:94), [rust.JobBuilder::build()](core/job/src/builder.rs:158)
- [rust.SupervisorBuilder::new()](core/supervisor/src/lib.rs:124), [rust.SupervisorBuilder::build()](core/supervisor/src/lib.rs:267)
- [rust.Supervisor::create_job()](core/supervisor/src/lib.rs:642), [rust.Supervisor::start_job()](core/supervisor/src/lib.rs:658), [rust.Supervisor::run_job_and_await_result()](core/supervisor/src/lib.rs:672), [rust.Supervisor::get_job_output()](core/supervisor/src/lib.rs:740)
- Redis key namespace: [rust.NAMESPACE_PREFIX](core/job/src/lib.rs:13)
## 1) What is a “simple job”?
A simple job is the minimal unit of work that an actor can execute. At minimum, you must provide:
- caller_id: String (identifier of the requester; often a public key)
- context_id: String (the “circle” or execution context)
- script: String (the code to run; Rhai for OSIS/SAL; HeroScript for V/Python)
- script_type: ScriptType (OSIS | SAL | V | Python)
- timeout: Duration (optional; default used if not set)
The job's script_type selects the actor and thus the queue. See [rust.ScriptType::actor_queue_suffix()](core/job/src/lib.rs:29) for the mapping.
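To make the field list concrete, here is a minimal, standard-library-only sketch of a job with those five fields. `SimpleJob`, `simple_job`, and `ASSUMED_DEFAULT_TIMEOUT` are hypothetical names used for illustration only; the real type is [rust.Job](core/job/src/lib.rs:87), which may carry more fields, and the real default timeout is not stated in this guide.

```rust
use std::time::Duration;

/// Hypothetical mirror of the script types listed above.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum ScriptType {
    OSIS,
    SAL,
    V,
    Python,
}

/// Hypothetical minimal job; not the real hero_job::Job.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct SimpleJob {
    caller_id: String,
    context_id: String,
    script: String,
    script_type: ScriptType,
    timeout: Duration,
}

/// Assumption: 30 seconds is a placeholder, not the library's actual default.
const ASSUMED_DEFAULT_TIMEOUT: Duration = Duration::from_secs(30);

/// Builds a job, rejecting empty required fields and filling in the timeout.
fn simple_job(
    caller_id: &str,
    context_id: &str,
    script: &str,
    script_type: ScriptType,
    timeout: Option<Duration>,
) -> Result<SimpleJob, String> {
    if caller_id.is_empty() {
        return Err("caller_id is required".to_string());
    }
    if context_id.is_empty() {
        return Err("context_id is required".to_string());
    }
    if script.is_empty() {
        return Err("script is required".to_string());
    }
    Ok(SimpleJob {
        caller_id: caller_id.to_string(),
        context_id: context_id.to_string(),
        script: script.to_string(),
        script_type,
        timeout: timeout.unwrap_or(ASSUMED_DEFAULT_TIMEOUT),
    })
}

fn main() {
    let job = simple_job("02abc...caller", "02def...context", "40 + 2", ScriptType::SAL, None)
        .expect("all required fields are set");
    println!("{:?}", job);
}
```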
## 2) Choosing the actor by ScriptType
- OSIS: Rhai script, sequential non-blocking
- SAL: Rhai script, blocking async, concurrent
- V: HeroScript via V engine
- Python: HeroScript via Python engine
Pick the script_type that matches your script/runtime requirements. See design summary in [core/docs/architecture.md](core/docs/architecture.md).
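The relationship between script type, script language, and queue can be sketched as follows. The language mapping (Rhai for OSIS/SAL, HeroScript for V/Python) comes from the list above; the queue suffixes and the `example:` prefix are assumptions made up for this sketch, so check [rust.ScriptType::actor_queue_suffix()](core/job/src/lib.rs:29) and [rust.NAMESPACE_PREFIX](core/job/src/lib.rs:13) for the real values.

```rust
/// Hypothetical mirror of the four script types described above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ScriptType {
    OSIS,
    SAL,
    V,
    Python,
}

impl ScriptType {
    /// Script language per actor, as listed in this guide.
    fn script_language(self) -> &'static str {
        match self {
            ScriptType::OSIS | ScriptType::SAL => "Rhai",
            ScriptType::V | ScriptType::Python => "HeroScript",
        }
    }

    /// Assumption: these suffix strings are placeholders, not the real ones.
    fn assumed_queue_suffix(self) -> &'static str {
        match self {
            ScriptType::OSIS => "osis",
            ScriptType::SAL => "sal",
            ScriptType::V => "v",
            ScriptType::Python => "python",
        }
    }
}

/// Joins a (hypothetical) namespace prefix with the actor's queue suffix.
fn actor_queue_key(assumed_prefix: &str, script_type: ScriptType) -> String {
    format!("{}{}", assumed_prefix, script_type.assumed_queue_suffix())
}

fn main() {
    for st in [ScriptType::OSIS, ScriptType::SAL, ScriptType::V, ScriptType::Python] {
        println!(
            "{:?}: {} script, queue key {}",
            st,
            st.script_language(),
            actor_queue_key("example:", st)
        );
    }
}
```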
## 3) Build and submit a job using the Rust API
This is the most direct, strongly-typed integration. You will:
1) Build a Supervisor
2) Construct a Job (using the “core” job builder for explicit caller_id/context_id)
3) Submit it with either:
- create_job + start_job (two-step)
- run_job_and_await_result (one-shot request-reply)
Note: We deliberately use the core job builder (hero_job) so we can set caller_id explicitly via [rust.JobBuilder::caller_id()](core/job/src/builder.rs:79).
Example Rhai script (returns 42):
```rhai
40 + 2
```
Rust example (two-step create + start + poll output):
```rust
use hero_supervisor::{SupervisorBuilder, ScriptType};
use hero_job::JobBuilder as CoreJobBuilder;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // 1) Build a Supervisor
    let supervisor = SupervisorBuilder::new()
        .redis_url("redis://127.0.0.1/")
        .build()
        .await?;

    // 2) Build a Job (using the core job builder to set caller_id and context_id)
    let job = CoreJobBuilder::new()
        .caller_id("02abc...caller")     // required
        .context_id("02def...context")   // required
        .script_type(ScriptType::SAL)    // select the SAL actor
        .script("40 + 2")                // simple Rhai script
        .timeout(std::time::Duration::from_secs(10))
        .build()?;                       // returns hero_job::Job
    let job_id = job.id.clone();

    // 3a) Store the job in Redis
    supervisor.create_job(&job).await?;

    // 3b) Start the job (pushes the ID to the actor's Redis queue)
    supervisor.start_job(&job_id).await?;

    // 3c) Fetch the output. It is None until the actor has finished,
    //     so retry this call or poll status via get_job_status.
    if let Some(output) = supervisor.get_job_output(&job_id).await? {
        println!("Job {} output: {}", job_id, output);
    } else {
        println!("Job {} has no output yet", job_id);
    }

    Ok(())
}
```
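Because the two-step flow returns as soon as the job ID is queued, a single output fetch right after starting the job may find nothing yet. One way to handle that is a bounded retry loop. The sketch below is synchronous and standard-library-only so it can run on its own; `poll_for_output` is a hypothetical helper, and the closure stands in for a call such as `supervisor.get_job_output(&job_id)`. In async code you would presumably await the fetch and use an async sleep instead.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `fetch` up to `max_attempts` times, sleeping `interval` between
/// attempts, and returns the first output it sees (or None if none arrives).
fn poll_for_output<F>(mut fetch: F, interval: Duration, max_attempts: u32) -> Option<String>
where
    F: FnMut() -> Option<String>,
{
    for attempt in 0..max_attempts {
        if let Some(output) = fetch() {
            return Some(output);
        }
        // Do not sleep after the final attempt.
        if attempt + 1 < max_attempts {
            sleep(interval);
        }
    }
    None
}

fn main() {
    // Stand-in for the real fetch: no output on the first two calls.
    let mut calls = 0;
    let output = poll_for_output(
        || {
            calls += 1;
            if calls >= 3 {
                Some("42".to_string())
            } else {
                None
            }
        },
        Duration::from_millis(10),
        5,
    );
    println!("output = {:?} after {} calls", output, calls);
}
```

Capping the number of attempts keeps the caller from waiting forever if the actor is down; a cap derived from the job's own timeout is a reasonable starting point.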
Rust example (one-shot request-reply):
```rust
use hero_supervisor::{SupervisorBuilder, ScriptType};
use hero_job::JobBuilder as CoreJobBuilder;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let supervisor = SupervisorBuilder::new()
        .redis_url("redis://127.0.0.1/")
        .build()
        .await?;

    let job = CoreJobBuilder::new()
        .caller_id("02abc...caller")
        .context_id("02def...context")
        .script_type(ScriptType::SAL)
        .script("40 + 2")
        .timeout(std::time::Duration::from_secs(10))
        .build()?;

    // Creates the job, dispatches it to the correct actor queue,
    // and waits for a reply on the dedicated reply queue.
    let output = supervisor.run_job_and_await_result(&job).await?;
    println!("Synchronous output: {}", output);

    Ok(())
}
```
References used in this flow:
- [rust.SupervisorBuilder::new()](core/supervisor/src/lib.rs:124), [rust.SupervisorBuilder::build()](core/supervisor/src/lib.rs:267)
- [rust.JobBuilder::caller_id()](core/job/src/builder.rs:79), [rust.JobBuilder::context_id()](core/job/src/builder.rs:74), [rust.JobBuilder::script_type()](core/job/src/builder.rs:69), [rust.JobBuilder::script()](core/job/src/builder.rs:84), [rust.JobBuilder::timeout()](core/job/src/builder.rs:94), [rust.JobBuilder::build()](core/job/src/builder.rs:158)
- [rust.Supervisor::create_job()](core/supervisor/src/lib.rs:642), [rust.Supervisor::start_job()](core/supervisor/src/lib.rs:658), [rust.Supervisor::get_job_output()](core/supervisor/src/lib.rs:740)
- [rust.Supervisor::run_job_and_await_result()](core/supervisor/src/lib.rs:672)
## 4) Submit a job via the OpenRPC server (Unix IPC or WebSocket)
The OpenRPC server exposes JSON-RPC 2.0 methods which proxy to the Supervisor:
- Types: [rust.JobParams](interfaces/openrpc/server/src/types.rs:6)
- Methods registered in [interfaces/openrpc/server/src/lib.rs](interfaces/openrpc/server/src/lib.rs:117)
Unix IPC launcher and client:
- Server: [interfaces/unix/server/src/main.rs](interfaces/unix/server/src/main.rs)
- Client: [interfaces/unix/client/src/main.rs](interfaces/unix/client/src/main.rs)
Start the IPC server:
```bash
cargo run -p hero-unix-server -- \
  --socket /tmp/baobab.ipc \
  --db-path ./db
```
Create a job (JSON-RPC, IPC):
```bash
cargo run -p hero-unix-client -- \
  --socket /tmp/baobab.ipc \
  --method create_job \
  --params '{
    "script": "40 + 2",
    "script_type": "SAL",
    "caller_id": "02abc...caller",
    "context_id": "02def...context",
    "timeout": 10
  }'
```
This returns the job_id. Then start the job:
```bash
cargo run -p hero-unix-client -- \
  --socket /tmp/baobab.ipc \
  --method start_job \
  --params '["<job_id_from_create>"]'
```
Fetch output (optional):
```bash
cargo run -p hero-unix-client -- \
  --socket /tmp/baobab.ipc \
  --method get_job_output \
  --params '["<job_id_from_create>"]'
```
Notes:
- The “run_job” JSON-RPC method exists but is not yet wired to the full request-reply flow; for now, prefer create_job + start_job + get_job_output.
- JobParams fields are defined in [rust.JobParams](interfaces/openrpc/server/src/types.rs:6).
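For readers talking to the server without the bundled client, it may help to see what a JSON-RPC 2.0 request for these methods looks like on the wire. The envelope fields (`jsonrpc`, `id`, `method`, `params`) come from the JSON-RPC 2.0 specification; the parameter names mirror the CLI examples above. The helper names are hypothetical, the exact framing used over the socket is not covered here, and in a real project you would likely use a JSON library rather than hand-built strings.

```rust
/// Escapes the characters that must be escaped inside a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Wraps already-serialized params in a JSON-RPC 2.0 request envelope.
fn json_rpc_request(id: u64, method: &str, params_json: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"{}","params":{}}}"#,
        id,
        json_escape(method),
        params_json
    )
}

/// Serializes the create_job params used in the CLI example above.
fn create_job_params(
    script: &str,
    script_type: &str,
    caller_id: &str,
    context_id: &str,
    timeout_secs: u64,
) -> String {
    format!(
        r#"{{"script":"{}","script_type":"{}","caller_id":"{}","context_id":"{}","timeout":{}}}"#,
        json_escape(script),
        json_escape(script_type),
        json_escape(caller_id),
        json_escape(context_id),
        timeout_secs
    )
}

fn main() {
    let params = create_job_params("40 + 2", "SAL", "02abc...caller", "02def...context", 10);
    println!("{}", json_rpc_request(1, "create_job", &params));

    // start_job and get_job_output take the job ID as a positional parameter.
    let id_params = format!(r#"["{}"]"#, json_escape("<job_id_from_create>"));
    println!("{}", json_rpc_request(2, "start_job", &id_params));
}
```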
## 5) What happens under the hood
- The job is serialized to Redis under the namespace [rust.NAMESPACE_PREFIX](core/job/src/lib.rs:13)
- The Supervisor picks the actor queue from [rust.ScriptType::actor_queue_suffix()](core/job/src/lib.rs:29) and LPUSHes your job ID
- The actor BLPOPs its queue, loads the job, executes your script, and stores the result back into the Redis job hash
- For synchronous flows, Supervisor waits on a dedicated reply queue until the result arrives via [rust.Supervisor::run_job_and_await_result()](core/supervisor/src/lib.rs:672)
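The steps above can be modelled in-process to see how the pieces fit together. This sketch is not the Supervisor's implementation: it replaces Redis with a shared `HashMap` (the job hash) and two `std::sync::mpsc` channels (the actor queue and the reply queue), and `fake_execute` is a stand-in that only sums integers instead of running Rhai. All names are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

/// job_id -> field name -> value, standing in for the Redis job hash.
type JobStore = Arc<Mutex<HashMap<String, HashMap<String, String>>>>;

/// Stand-in for a script engine: sums integers separated by '+'.
fn fake_execute(script: &str) -> Result<String, String> {
    let mut total: i64 = 0;
    for term in script.split('+') {
        total += term
            .trim()
            .parse::<i64>()
            .map_err(|e| format!("bad term {:?}: {}", term, e))?;
    }
    Ok(total.to_string())
}

/// Stores the job, queues its ID, and waits for the actor's reply.
fn run_job_and_wait(
    store: &JobStore,
    job_id: &str,
    script: &str,
    timeout: Duration,
) -> Result<String, String> {
    // Store the job (the "create" step).
    let mut fields = HashMap::new();
    fields.insert("script".to_string(), script.to_string());
    store.lock().unwrap().insert(job_id.to_string(), fields);

    let (queue_tx, queue_rx) = mpsc::channel::<String>();
    let (reply_tx, reply_rx) = mpsc::channel::<String>();

    // The actor blocks on its queue, loads the job, runs it, stores the result.
    let actor_store = Arc::clone(store);
    let actor = thread::spawn(move || {
        if let Ok(id) = queue_rx.recv() {
            let script = actor_store
                .lock()
                .unwrap()
                .get(&id)
                .and_then(|f| f.get("script").cloned())
                .unwrap_or_default();
            let output = fake_execute(&script).unwrap_or_else(|e| format!("error: {}", e));
            if let Some(f) = actor_store.lock().unwrap().get_mut(&id) {
                f.insert("output".to_string(), output.clone());
            }
            let _ = reply_tx.send(output);
        }
    });

    // Push the job ID onto the actor queue (the "start" step), then wait.
    queue_tx.send(job_id.to_string()).map_err(|e| e.to_string())?;
    let result = reply_rx
        .recv_timeout(timeout)
        .map_err(|e| format!("no reply: {}", e));
    let _ = actor.join();
    result
}

fn main() {
    let store: JobStore = Arc::new(Mutex::new(HashMap::new()));
    let output = run_job_and_wait(&store, "job-1", "40 + 2", Duration::from_secs(1));
    println!("{:?}", output);
}
```

The timeout on the reply wait plays the same role as the job's `timeout` field: without it, a caller would block indefinitely if no actor were consuming the queue.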
## 6) Minimal scripts by actor type
- OSIS/SAL (Rhai):
- "40 + 2"
- "let x = 21; x * 2"
- You can access injected context variables such as CALLER_ID, CONTEXT_ID (see architecture doc in [core/docs/architecture.md](core/docs/architecture.md)).
- V/Python (HeroScript):
- Provide a valid HeroScript snippet appropriate for the selected engine and your deployment.
## 7) Troubleshooting
- Ensure Redis is running and reachable at the configured URL
- SAL vs OSIS: pick SAL if your script is blocking/IO-heavy and needs concurrency; otherwise OSIS is fine for sequential non-blocking tasks
- If using OpenRPC IPC, ensure the socket path matches between server and client
- For lifecycle of actors (starting/restarting/health checks), see [core/supervisor/README.md](core/supervisor/README.md)
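For the socket-path check, a quick way to rule out a path mismatch is to try connecting to the socket directly. The following sketch uses only the standard library (Unix only); `check_ipc_socket` is a hypothetical helper, and `/tmp/baobab.ipc` is the path used in the examples above.

```rust
use std::os::unix::net::UnixStream;
use std::path::Path;

/// Tries to connect to a Unix domain socket and reports why it failed, if it did.
fn check_ipc_socket(path: &Path) -> Result<(), String> {
    UnixStream::connect(path)
        .map(|_stream| ()) // the connection is dropped immediately
        .map_err(|e| format!("cannot connect to {}: {}", path.display(), e))
}

fn main() {
    match check_ipc_socket(Path::new("/tmp/baobab.ipc")) {
        Ok(()) => println!("IPC server is accepting connections"),
        Err(e) => println!("{}", e),
    }
}
```

A "No such file or directory" error usually points to a wrong path or a server that has not started, while "Connection refused" suggests a stale socket file left behind by a server that is no longer running.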