specs/architecture.md
@@ -0,0 +1,77 @@

## per user

runs in a container or VM, one per user

- zinit
- herocoordinator
  - think of it as a DAG workflow manager
  - manages jobs that are sent to different nodes
- mycelium address range (part of mycelium on host)
- herodb
    - state manager
    - redis protocol / primitives
    - fs backend (mem and always-append in future)
    - encryption & decryption primitives
    - key mgmt for encryption (creation, deletion)
    - openrpc admin features: user management, role-based access control
- postgresql + postgrest
- AI Agent TBD

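To make the "DAG workflow manager" role of herocoordinator more concrete, here is a minimal sketch of how a set of jobs with dependencies could be grouped into waves that are safe to send out to nodes. It only uses Python's standard `graphlib`; the job names and the `execution_waves` helper are hypothetical and not part of the spec, and the real herocoordinator may order work differently.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies):
    """Group jobs into waves.

    Every job in a wave has all of its dependencies in earlier waves,
    so the jobs of one wave could be sent to different nodes in parallel.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


# Hypothetical workflow: job name -> set of jobs it depends on.
workflow = {
    "fetch": set(),
    "build": {"fetch"},
    "test": {"build"},
    "docs": {"fetch"},
    "publish": {"test", "docs"},
}

waves = execution_waves(workflow)
for index, wave in enumerate(waves, start=1):
    print(f"wave {index}: {', '.join(wave)}")
```

In this sketch a wave is only released once the previous one is done, which is the simplest possible policy; a real coordinator would more likely release each job as soon as its own dependencies finish.
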
```mermaid
%%{init: {"theme":"dark"}}%%
graph TD
    subgraph Per Node System
        N[Node] --> OS(Run on top of ZOS4 or Ubuntu or in a VM)

        subgraph On Node
            OS --> SV(Supervisors)
            OS --> ZN(Zinit)
            OS --> R(Runners)
            OS --> PGN(Some Nodes: PostgreSQL + Postgrest)
            OS --> HDN(Each Node: Herodb)

            subgraph Supervisors Responsibilities
                SV --> SV_MR(Manage runners & scheduling for the node)
                SV --> SV_MJ(Monitor & schedule jobs)
                SV --> SV_RU(Check resource usage)
                SV --> SV_TO(Checks on timeout)
            end

            subgraph Runners Characteristics
                R --> R_LV(V/Python & Rust)
                R --> R_FORK(Uses fork per runner for scalability)
                R --> R_COUNT(Some runners can only run 1, others more)
                R --> R_CONTEXT(Some runners are per context)
            end
        end

        SV -- "Manage" --> R
        SV -- "Schedule jobs via" --> ZN
        ZN -- "Starts" --> R
        R -- "Interacts with" --> PGN
        R -- "Interacts with" --> HDN
    end
```

## per node

- runs on top of ZOS4 or Ubuntu or in a VM
- supervisors
  - manage the runners on the node and their scheduling
  - monitor & schedule jobs, check resource usage, check for timeouts
- zinit
- runners (scheduled in zinit by the supervisor)
    - V/Python & Rust
    - use a fork (process) per runner for scalability
    - some runners can only run 1, others more
    - some runners are per context
- some nodes will have postgresql + postgrest
- each node has herodb

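The supervisor duties listed above (one process per runner, checking for timeouts) can be sketched roughly as follows. This is only an illustration using Python's standard `subprocess` module: the `run_job` helper and its result fields are hypothetical, and in the actual design runners are started through zinit rather than spawned directly like this.

```python
import subprocess
import sys


def run_job(command, timeout_s):
    """Run one job in its own process and report how it ended."""
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising on timeout.
        return {"status": "timeout", "exit_code": None, "output": ""}
    status = "ok" if completed.returncode == 0 else "failed"
    return {
        "status": status,
        "exit_code": completed.returncode,
        "output": completed.stdout,
    }


# Two throwaway jobs: one finishes quickly, one exceeds its time budget.
fast = run_job([sys.executable, "-c", "print('done')"], timeout_s=10)
slow = run_job([sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=0.5)
print(fast["status"], slow["status"])
```

The helper reports the outcome and does nothing else with a timed-out job; what happens next is left to whoever submitted it.
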
REMARK

- each Rhai script or heroscript running on a node can use herodb if needed (careful: this data can and will be lost), but it cannot communicate with anyone outside of the node

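Since herodb is described as exposing the redis protocol, a script on the node would talk to it with ordinary redis-style commands. The sketch below shows how such a command is framed on the wire in RESP (an array of bulk strings). The `encode_command` helper and the key name `job:1` are hypothetical, and which commands herodb actually supports is not stated in this spec. Anything written this way should be treated as scratch state, given the warning above that node-local data can be lost.

```python
def encode_command(*parts):
    """Encode a command as a RESP array of bulk strings."""
    chunks = [b"*%d\r\n" % len(parts)]  # array header: number of elements
    for part in parts:
        data = part if isinstance(part, bytes) else str(part).encode("utf-8")
        # bulk string: byte length, then the payload, each ended by CRLF
        chunks.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(chunks)


# Hypothetical key and value, used only to show the framing.
set_cmd = encode_command("SET", "job:1", "running")
get_cmd = encode_command("GET", "job:1")
print(set_cmd)
print(get_cmd)
```
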
							
								
								
									
specs/hercoordinator.md
@@ -0,0 +1,16 @@

the herocoordinator will have an openrpc interface

- start, stop, delete, list a DAG
- query the DAG and its status

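OpenRPC describes JSON-RPC 2.0 APIs, so the operations above would most likely travel as JSON-RPC requests. The following sketch builds such requests; the method names (`dag.start`, `dag.status`, `dag.list`) and the `name` parameter are placeholders chosen for illustration, since the spec does not define them yet.

```python
import itertools
import json

_request_ids = itertools.count(1)  # simple increasing request ids


def rpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request as a JSON string."""
    request = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        request["params"] = params
    return json.dumps(request)


# Hypothetical method and parameter names; the real interface is not specified.
start = rpc_request("dag.start", {"name": "nightly-build"})
status = rpc_request("dag.status", {"name": "nightly-build"})
listing = rpc_request("dag.list")
print(start)
print(status)
print(listing)
```
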
## remarks for supervisor

- no retry
- no dependencies

## inspiration

- DAGU