specs/architecture.md
Normal file
@@ -0,0 +1,77 @@

## per user

runs in a container or VM, one per user

- zinit
- herocoordinator
  - think of it as a DAG workflow manager
  - manages jobs that are sent around to different nodes
- mycelium address range (part of mycelium on the host)
- herodb
  - state manager
  - redis protocol / primitives
  - fs backend (mem, and always-append in the future)
  - encryption & decryption primitives
  - key mgmt for encryption (creation, deletion)
  - openrpc admin features: user management, role-based access control
- postgresql + postgrest
- AI agent (TBD)

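To make the "DAG workflow manager" idea concrete, here is a minimal sketch of how a coordinator could turn a job graph into batches that are safe to dispatch. The job names and the `dispatch_order` helper are hypothetical and not part of the spec; the ordering itself uses Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical job graph: each key is a job, each value is the set of
# jobs it depends on. None of these names come from the spec.
jobs = {
    "fetch": set(),
    "build": {"fetch"},
    "test": {"build"},
    "package": {"build"},
    "deploy": {"test", "package"},
}

def dispatch_order(graph):
    """Return batches of jobs; every job in a batch has all its dependencies done."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        batches.append(ready)
        sorter.done(*ready)  # a real coordinator would wait for the jobs to finish
    return batches

batches = dispatch_order(jobs)
print(batches)
```

Jobs inside one batch do not depend on each other, so under this sketch they could be sent to different nodes in parallel.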
```mermaid
%%{init: {"theme":"dark"}}%%
graph TD
    subgraph Per Node System
        N[Node] --> OS(Run on top of ZOS4 or Ubuntu or in a VM)

        subgraph On Node
            OS --> SV(Supervisors)
            OS --> ZN(Zinit)
            OS --> R(Runners)
            OS --> PGN(Some Nodes: PostgreSQL + Postgrest)
            OS --> HDN(Each Node: Herodb)

            subgraph Supervisors Responsibilities
                SV --> SV_MR(Manage runners & scheduling for the node)
                SV --> SV_MJ(Monitor & schedule jobs)
                SV --> SV_RU(Check resource usage)
                SV --> SV_TO(Checks on timeout)
            end

            subgraph Runners Characteristics
                R --> R_LV(V/Python & Rust)
                R --> R_FORK(Uses fork per runner for scalability)
                R --> R_COUNT(Some runners can only run 1, others more)
                R --> R_CONTEXT(Some runners are per context)
            end
        end

        SV -- "Manage" --> R
        SV -- "Schedule jobs via" --> ZN
        ZN -- "Starts" --> R
        R -- "Interacts with" --> PGN
        R -- "Interacts with" --> HDN
    end
```

## per node

- runs on top of ZOS4 or Ubuntu, or in a VM
- supervisors
  - manage the runners on the node and their scheduling
  - monitor & schedule jobs, check resource usage, check for timeouts
- zinit
- runners (scheduled in zinit by the supervisor)
  - V/Python & Rust
  - use one fork (process) per runner for scalability
  - some runners can only run 1, others more
  - some runners are per context
- some nodes will have postgresql + postgrest
- each node has herodb

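The supervisor's timeout check can be sketched roughly as follows. This is an illustration only: it starts the runner directly with `subprocess` as a stand-in for scheduling it through zinit, and the `run_job` helper and its return values are hypothetical.

```python
import subprocess
import sys

def run_job(cmd, timeout_s):
    """Start one runner process for a job and enforce a wall-clock timeout."""
    proc = subprocess.Popen(cmd)  # stand-in for "schedule the runner in zinit"
    try:
        code = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()  # the job exceeded its time budget; stop the runner
        proc.wait()  # reap the killed process so it does not linger as a zombie
        return "timeout"
    return "ok" if code == 0 else "failed"

# A job that finishes immediately, and one that sleeps past its timeout.
fast = run_job([sys.executable, "-c", "pass"], timeout_s=10)
slow = run_job([sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=0.5)
print(fast, slow)
```

The sketch reports the outcome and does nothing else; whether and how a timed-out job is rescheduled is left to whatever sits above the supervisor.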

REMARK

- each rhai script or heroscript running on a node can use herodb if needed (careful: this data can and will be lost), but it cannot communicate with anything outside of the node

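Because herodb is described as speaking the redis protocol, a script on the node could in principle reach it with any Redis client. As a rough illustration of what goes over the wire, the sketch below encodes a command as a RESP array of bulk strings. The `encode_command` helper and the key name are hypothetical, and which commands herodb actually supports is not stated in this spec.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]  # array header: number of elements
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))  # bulk string: length, then bytes
    return b"".join(out)

# Hypothetical key name; SET is assumed here, not confirmed for herodb.
wire = encode_command("SET", "job:42:status", "running")
print(wire)
```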
specs/hercoordinator.md
Normal file
@@ -0,0 +1,16 @@

will have an openrpc interface

- start, stop, delete, and list DAGs
- query a DAG and its status

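OpenRPC describes JSON-RPC 2.0 APIs, so calls to this interface would be JSON-RPC request objects. The sketch below shows what such requests might look like. The method names (`dag.start`, `dag.status`) and the `dag_id` parameter are assumptions made for illustration; the spec only lists the operations.

```python
import itertools
import json

_ids = itertools.count(1)  # simple per-process request id generator

def rpc_request(method, **params):
    """Build a JSON-RPC 2.0 request object with named parameters."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}

# Hypothetical method and parameter names, not taken from the spec.
start = rpc_request("dag.start", dag_id="nightly-build")
status = rpc_request("dag.status", dag_id="nightly-build")

payload = json.dumps(start)  # this string is what would be sent to the server
print(payload)
```

The `id` field lets the caller match each response to its request, which matters once several DAG queries are in flight at the same time.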

## remarks for supervisor

- no retry
- no dependencies

## inspiration

- DAGU