Audit alignment with hero_skills + assess migration to upgraded hero_rpc #41
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Context
Over the past week, Kristof did a substantial alignment pass on
mainto bring hero_logic in line with current Hero ecosystem standards. While that was happening, feature work was happening on side branches (feat/39-logic-modeletc.) that don't yet have Kristof's fixes.Meanwhile,
hero_rpcandhero_skillsgot a major upgrade (parent META hero_skills#262 — now closed): canonical scaffolder, OSchema codegen with Python/JS/Rhai/Rust SDKs, hero_rpc2 trait pattern,service.tomlas source of truth,hero_admin_libfor the admin UI, the newhero_servicetemplate repo as a clone-me reference.Friend reported problems running
hero_logicand concerns that skills/template/best-practices weren't being followed. This issue is the audit + reconciliation.Kristof's recent alignment commits on
main(reference)0b4e1cc— addservice.tomlmanifests, useservice_base!()macro, drop legacy build scripts165eeaa— pin Hero ecosystem deps to 0.6.0, normalize rust-versionbe4afcd— graceful SIGTERM/SIGINT shutdown on_server4be2dc1— wire Hero file logger into tracing on both binariesb69697c— normalize rust-version + Axum 0.8 curly-brace routes470e464— migrate path resolution toherolib_core::basehelpers2ed6a80— bump hero RPC/lib deps, addhero_servicepatch, dropherolib_ai+ stale reports4158f97— split hero_logic into_server,_sdk,_clicrates (canonical naming)a854695— migratehero_logic_admintohero_admin_libpatterns0b3b473—hero_logic_adminroutes use typed SDK client07ed752— parallelize RPC calls in admin route handlersNote:
hero_lifecyclerename (washero_service) happened inhero_rpc#55after2ed6a80. The patch reference may need flipping.What to do
1. Verify
mainworks end-to-endFrom a clean clone of
main:cargo build --workspacecleanlab infocheck— 0 findings, 0 errorslab service hero_logic --installbuilds and installs the 3 binaries (hero_logic,hero_logic_server,hero_logic_admin)lab service hero_logic --start— server + admin bind their sockets--info/--info --jsonon every binary returns the embeddedservice.tomlhero_skills/skills/hero/service/hero_service_check_fix.md§7:/health,/openrpc.jsonnon-empty,/.well-known/heroservice.json, JSON-RPCsystem.pingIf any of these fail, fix in
main(or open a PR againstmain).2. Audit against current canonical references
Compare hero_logic's
mainagainst:hero_proc— canonical wiring referencehero_service— canonical template repohero_skills/skills/hero/service/*— the standardsFlag any divergences in a comment: missing pieces, stale dep refs (e.g.
hero_service→hero_lifecyclerename), legacy patterns that should be migrated.3. Assess migration to upgraded hero_rpc — read-only proposal
hero_logic currently uses traditional hand-rolled SDK + hand-written OpenRPC + Axum + Askama. The upgraded
hero_rpcnow generates all of that (server, SDK, OpenRPC spec, Rust/JS/Python/Rhai bindings) from an.oschemasource of truth.Don't migrate in this issue. Do evaluate:
Write the assessment as a comment on this issue. Do not actually migrate without sign-off.
Acceptance
mainpasses all the §1 checks (or PR opened with fixes againstmainif any failed).hero_logicfrommainend-to-end without surprises.Out of scope
feat/39-logic-modelfeature branch — separate concern; will be reconciled by mergingmainforward later.§1 —
mainend-to-end verificationWorked from a worktree of
origin/mainat07ed752on a branchaudit-41. Three fixes were needed beforemaincould clear the §1 checks; they're onaudit-41and I'll open them as a PR shortly.What I found
cargo build --workspace(with siblinghero_website_framework)cargo metadatafrom canonical lab coderoot (~/hero/code/hero_logic)hero_admin_libpath dep needs sibling not presentlab infocheckhero_logic_servermissingservice_base!()lab service hero_logic --install--infocheck fails (root-caused by canonical clone being ondevelopment+ path-dep failure + missing macro)--info/--info --jsonper binary/health,/openrpc.json,/.well-known/heroservice.jsonon rpc.sockprotocol = "openrpc"/health,/.well-known/heroservice.jsonon admin.sockprotocol = "http"system.pinghero_rpc_server(current OServer) doesn't register anysystem.*builtins. See open items belowFixes landed on
audit-418f51afd—fix(server): use service_base!() macro.hero_logic_server/src/main.rswas hand-rolling theSERVICE_TOML+BUILD_NRconsts the macro emits.lab infocheckflags any binary missing the macro call; replacing the hand-roll withuse herolib_core::service_base; service_base!();brings the server in line with the CLI and admin crates (both already used it). Diff: −17 +2.8d035e4—fix(deps): hero_admin_lib as git dep, not relative path. Root Cargo.toml hadhero_admin_lib = { path = "../hero_website_framework/crates/hero_admin_lib" }. The canonical lab-managed coderoot at~/hero/code/has no siblinghero_website_frameworkclone, socargo metadatafails out of the box. Switched to the git dep bothhero_procand thehero_servicetemplate use:{ git = "https://forge.ourworld.tf/lhumina_code/hero_website_framework.git", branch = "development" }.6499aac—fix(service.toml): mirror hero_proc — each per-crate file is the full service. Each per-crateservice.tomlpreviously listed only its own binary.hero_proc's pattern (cli, server, admin all carry the same[[binaries]]list, differing only in[service.crate]) is whatlabrelies on: it reads<bin> --info --jsonfrom one binary and expects the full binary inventory back. Updated all three files. Also normalizedhero_logic_admin's[service]block back to the canonical "Hero Logic" copy from the server —[service]describes the SERVICE, not the crate.Open items not fixed on
audit-41These came up during §1 but are not hero_logic bugs — flagging here so they don't surprise the next reader.
lab service hero_logic --startstill doesn't iterate companion binaries.hero_skills/crates/lab/src/service/service_manager.rshas a hardcodedSERVICE_MAPlisting every known service's companion daemons (hero_proc→[hero_proc_server, hero_proc_admin], etc.).hero_logicisn't in the map, soresolve_service("hero_logic")returns[hero_logic](just the CLI), and--startbails withkind = "cli", not a long-running service. The code already carries aLONG-TERMTODO to replaceSERVICE_MAPwith dynamic discovery from each binary's embeddedservice.toml. Short-term fix: a one-line entry inSERVICE_MAPforhero_logic. Worth its own follow-up issue againsthero_skills.system.pingsmoke test fails becausehero_rpc_serverdoesn't register anysystem.*builtins. The skillhero_service_check_fix.md§7 expectsPOST /rpc {"method":"system.ping"}to succeed;hero_rpc2adds it (seeexample/recipe_sdk_rpc2/tests/debug_curl.rs:18registeringsystem.ping), but the OServer thathero_logic_serverruns on doesn't. This is a framework gap — either OServer needs a defaultsystem.ping, or §7 needs to acknowledge "system.* methods require hero_rpc2." Cross-references the §3 OSchema/hero_rpc2 migration question.PATH_ROOTpanic whenhero_procspawns the server.lab service hero_logic_server --startpanics inherolib_core::base::paths::path_root()withPATH_ROOT is not set — run lab user init first. The user's~/hero/cfg/has noinit.shorhero_cfg.toml, solab user init/shell-initwas never run.hero_proc_serveritself runs fine withoutPATH_ROOTbecause it never callspath_root();hero_logic_serverdoes, transitively viaresolve_socket_dir → path_var → path_root. For verification I exportedPATH_ROOT=~/heroandHERO_SOCKET_DIR=~/hero/var/socketsand started the binaries directly — all socket smoke tests then passed. Worth a docs note inhero_service_check_fix.md§7 thatlab user initis a prerequisite, or apath_root()fallback to$HOME/herowhen unset.Manual smoke-test transcript (with
PATH_ROOT/HERO_SOCKET_DIRset)Acceptance
PR opening against
mainfor the three commits above. After that lands,mainclears every §1 check that doesn't depend on the three external items above.PR: #42
§2 — Divergence audit vs.
hero_proc+hero_servicetemplate +hero_skillsservice skillsReading order: I followed the META authority order (child issue comments > umbrella > skills > reference services). Reference snapshots used:
hero_proc@f0ba646,hero_service@518448c,hero_skills@e95b7ec,hero_rpc@155768b,hero_lib@5b3e47a0(all onorigin/developmentas of audit time).The findings are grouped by severity, each with a one-line fix recommendation.
A. Already-fixed in PR #42 (resurfaced for tracking)
These are §1 fixes I rolled into the PR; flagging here so the audit list is complete.
hero_logic_server/src/main.rswas missingservice_base!()— hand-rolledSERVICE_TOML+BUILD_NRconsts instead of using the canonical macro. CLI + admin already used it. Fix: PR #42 commit8f51afd.Cargo.tomlusedhero_admin_lib = { path = "../hero_website_framework/..." }—hero_procand thehero_servicetemplate both use a git dep onbranch = "development". The relative path brokecargo metadatain the canonical lab coderoot. Fix: PR #42 commit8d035e4.service.tomlfiles listed only their own binary —hero_proc's pattern (every per-crate file carries the full[[binaries]]list, only[service.crate]differs) is whatlabreads from<bin> --info --json. Fix: PR #42 commit6499aac.B. Real divergences NOT in PR #42 — actionable
B1. Missing canonical CI workflow
.forgejo/workflows/lab-publish.yamlhero_procjust landed (f0ba646) the canonicallab-publish.yamlthat hero_skills standardized on (see hero_skills#268). The commit message says explicitly: "Drop this file verbatim into any hero_ repo's.forgejo/workflows/and it Just Works, provided the repo has FORGEJO_TOKEN in its Actions secrets."*hero_logic has only the old
build-linux.yaml(tag-triggered, manualBINARIESshell loop). It is the same template the META calls out as superseded.Fix: copy
hero_proc/.forgejo/workflows/lab-publish.yamlverbatim intohero_logic/.forgejo/workflows/lab-publish.yaml. Keepbuild-linux.yamlfor now (per hero_proc's transition plan: it only fires onv*tags so it doesn't race with the push-to-mainpublish path).B2. Stale, untracked
src/logic/at the repo rootThere's a
/src/logic/{rpc.rs, rpc_generated.rs, types.rs, types_generated.rs, osis_server_generated.rs}checked in at the workspace root. It is not a workspace member, noCargo.tomlreferences it, and the same generated files are duplicated undercrates/hero_logic_server/src/logic/{core,server}/. Almost certainly a leftover from before the4158f97split into_server/_sdk/_clicrates.Fix:
git rm -r src/from the repo root.B3.
lab servicedoesn't know abouthero_logic(SERVICE_MAPentry missing)Flagged in the §1 comment too — repeating here for the audit list.
hero_skills/crates/lab/src/service/service_manager.rs::SERVICE_MAPenumerates every servicelab service <name> --startknows about.hero_logicisn't there, so--startand--statusresolve to just the CLI binary and bail.Fix: PR against
hero_skillsadding("hero_logic", "hero_logic", &["hero_logic_server", "hero_logic_admin"])and the short alias("logic", "hero_logic", &["hero_logic_server", "hero_logic_admin"]). The long-term TODO in that same file is to replaceSERVICE_MAPwith dynamic discovery from--info --json— that would let the §2-PR-42 service.toml work end-to-end withoutSERVICE_MAPupkeep. Worth a follow-up issue againsthero_skills.B4.
PURPOSE.mduses unregistered short aliaslogicPURPOSE.mddocuments:logicis the short alias, but it isn't inSERVICE_MAPeither (see B3). Even after B3 lands, the alias is only useful forlabusers — Forgejo readers see the docs first.Fix: change all
lab service logicreferences inPURPOSE.mdtolab service hero_logic(matcheshero_proc's ownPURPOSE.mdstyle). Optional: keeplogicas a short alias inSERVICE_MAP.B5. No
README.mdhero_servicetemplate andhero_procboth ship aREADME.md.hero_logichas onlyPURPOSE.md. README is the first thing rendered on the Forgejo project page.Fix: add a minimal
README.mdpointing toPURPOSE.md+ thelab service hero_logiclifecycle commands.B6. No
hero_logic_examplescrateThe skill marks Examples as Optional, but both
hero_proc(crates/hero_proc_examples) and thehero_servicetemplate (crates/hero_service_examples) ship one as a clone-me reference for end-users. hero_logic has none — friend would have nowhere to look for "how do I drive this from Rust."Fix: stub
crates/hero_logic_examples/with one runnableexamples/01_connect.rsmirroringhero_service_examples/examples/01_connect.rs. Low priority; add when there's an actual use case.B7.
[workspace.lints]block is absenthero_proc/Cargo.tomlcarries a long[workspace.lints.clippy]allow list that suppresses noisy lints uniformly across the workspace.hero_logichas none — the workspace currently builds with 8 unused-code warnings (backend_online,play_detail_handler,pretty_json_or_string,rpc_call, etc.) that would be obvious clippy noise once lints are turned on.Fix: copy
hero_proc's[workspace.lints.clippy]block intohero_logic/Cargo.toml. Then either remove the truly-dead code (play_detail_handler,pretty_json_or_string,rpc_call,backend_online) or silence them with#[allow]if they're scaffolding for in-flight work.B8.
[workspace.package].versionis unsethero_procdeclaresversion = "0.6.0"in[workspace.package]; each crate then usesversion.workspace = true.hero_logic's[workspace.package]has noversionfield, so each crate hard-codesversion = "0.1.0". Two consequences:laband admin's/healthalready report0.1.0, but bumping requires touching three Cargo.tomls.Fix: add
version = "0.1.0"to[workspace.package], change each crate'sversion = "0.1.0"toversion.workspace = true.B9.
hero_admin_libis not a workspace dep (just a Cargo.toml entry)After PR #42 the root Cargo.toml declares
hero_admin_lib = { git = "...", branch = "development" }in[workspace.dependencies], andcrates/hero_logic_admin/Cargo.tomluseshero_admin_lib = { workspace = true }— that's fine and matches hero_proc.Issue: the
[workspace.dependencies]hero_admin_libentry is now the ONLY dep without an explicitversion = "0.6.0"pin (every other Hero ecosystem dep there does). Either pin or leave;hero_procdoesn't pin itshero_admin_libgit dep either, so this is fine — flagging for completeness.No fix needed. Documented for the next reader.
B10. The
inspectorfeature is enabled onhero_rpc_serverand serves HTML atGET /of rpc.sockcrates/hero_logic_server/Cargo.toml:hero_rpc_server = { workspace = true, features = ["inspector"] }. That feature embeds an HTML inspector UI on the rpc socket'sGET /route. hero_logic_admin already serves the admin UI on a separateadmin.sockand embeds<hero-api-docs>fromhero_admin_lib— same purpose, better integration. The inspector at rpc.sock'sGET /is dead weight on a service that already ships a real admin.(Not the deprecated
hero_inspectorcrate — different concept. Still redundant.)Fix: drop
features = ["inspector"]fromhero_logic_server's dep onhero_rpc_server. Saves therust-embed+mime_guessdeps. Verify nothing in admin proxies through it first.C. Items the issue body asked about — already clean
Recording the negative findings so they don't get re-investigated.
[patch]reference tohero_service/hero_lifecyclerename — the issue body warned the patch "may need flipping." There is no[patch]section anywhere in hero_logic, andhero_serviceis not in any crate's deps. Already cleaned up (probably between2ed6a80and now). No action.hero_inspectorcrate references —grepreturns nothing. No action.Makefile,scripts/*.sh,service_hero_logic.nu— none present. ✓hero_logic/rpc.sockandhero_logic/admin.sock. Conform tohero_socketsskill. ✓_ui/_clientlegacy crate names — none. ✓be4afcd. ✓tracing— all three binaries wireHeroTracingLayerinto thetracing_subscriberregistry. ✓service_base!()+validate_service_toml+handle_info_flag+print_startup_banner+prepare_sockets— present on every server/admin binary (after PR #42 for the server). ✓hero_admin_libadmin patterns — admin useshero_admin_lib::assets::shared_static_handler,hero_admin_lib::middleware::base_path_middleware,hero_admin_lib::routes::{health_response, heroservice_manifest},hero_admin_lib::socket::bind_admin_socket. Conforms tohero_admin_connection_status/hero_ui_dashboardskills. ✓Priority
If you only do three things from this list: B1 (canonical CI), B3 (lab
SERVICE_MAPentry — pure unblocker), B2 (stalesrc/). The rest is polish.§3 — OSchema / upgraded
hero_rpcmigration assessment (read-only)TL;DR
Viable, low-risk, but not urgent. Recommend opening as its own follow-up issue and timing it for after
hero_rpc#54/55/56land so we migrate onto a settled target rather than a moving one.The framing in the issue body — "hero_logic currently uses traditional hand-rolled SDK + hand-written OpenRPC + Axum + Askama" — is partly outdated. hero_logic is already on the OSchema codegen path; it's the older version of it. The "migration" is really an upgrade from the old codegen API to the new one, not a wholesale rewrite.
What hero_logic already has on the codegen side
types_generated.rs,types.rsfor hand impls)crates/hero_logic_server/build.rsviaOschemaBuildConfig::new().domain("logic", …).generate_server().nested_layout()osis_server_generated.rs)rpc_generated.rs)rpc.rs)logicservice.*)specs/openrpc.json)hero_rpc_derive::openrpc_client!macrocrates/hero_logic_sdk/src/lib.rsLogicServiceClientfor declared methods, hand-rolledLogicRpcClientfor auto-generated OSIS CRUD (workflow.list,play.list, etc.) the SDK doesn't expose typed wrappers forcrates/hero_logic_admin/src/{main,rpc_client}.rsSo hero_logic is sitting on roughly the codegen state hero_service@
518448c(the canonical template) is on. The "hand-rolled" footprint that would shrink in a migration is small and well-located.What's actually different in the upgraded path (per hero_service template + META #262)
Comparing
hero_service@518448c(canonical template) andhero_logic@main:OschemaBuildConfiggains new fields that hero_logic doesn't use yet:sdk_dir,client_crate_dir,server_crate_dir,server_types_crate,sdk_types_crate,with_wasm(). The hero_service config splits types so a sharedhero_servicecrate holds them, consumed by bothhero_service_serverandhero_service_sdk. Today hero_logic's types live duplicated incrates/hero_logic_server/src/logic/core/types_generated.rs(server) and would also appear incrates/hero_logic_sdk/if the SDK ever needed them. This is the main structural change.HeroLifecycle+OServer::run_cli. hero_service template: vs hero_logic today:HeroLifecycleplugs into the renamedhero_lifecyclecrate (washero_serviceper hero_rpc#55).LogicRpcClientexists to paper over. Once those wrappers land, the entirerpc_client.rs(28 lines) and the comment inAppState::sdk()("Raw transport — keep for the auto-generated OSIS CRUD whose typed wrappers don't live in the SDK yet") go away.crates/hero_logic_server/sdk/{js,python}/— looks like an in-progress version of this already.with_wasm()adds the WASM-compatible type variants (types_wasm_generated.rsin hero_service template).#[rpc(server, client)]macros" — that's a tier-after-tier follow-up.What does NOT migrate
These stay untouched regardless of codegen path:
engine/(Python executor, DAG runner) — pure domain code.seed.rs+seed_flows/— built-in flow definitions.tracing_layer.rs—HeroTracingLayerwirestracinginto the Hero file logger.services/— auxiliary handlers.crates/hero_logic_admin/templates/(Askama) + 1719-lineroutes.rs— admin UI lives here, codegen doesn't touch it.crates/hero_logic/src/main.rs— CLI clap tree, hand-written, calls the generated SDK but doesn't get generated itself.logicservice.*methods (Python execution, span pushes, play lifecycle) — these have business logic OSchema CRUD wouldn't capture.So the migration footprint is roughly:
build.rs+src/main.rsof the server + reorganizing where generated types live + deletingadmin/src/rpc_client.rs. Maybe ~300 LOC delta, plus path-dep adjustments.What could block the migration
I looked for these specifically since the issue asked. Net: nothing fundamental.
Serialize/Deserializeover JSON.service.tomlhassse = falseeverywhere.python_sourceb64-prefix encoding is handled in client code (hero_logic_sdk::decode_python_source), not in serde.inspectorfeature (HTML atGET /on rpc.sock) — but flagged in §2 audit as redundant since admin exists. Drop, don't migrate.HeroRequestContext)OServer::rundoes its own routing.The one non-blocker but quirk worth noting: hero_logic's
python_sourcefield carries an optionalb64:prefix because Python source with mixed newlines/quotes is annoying to round-trip cleanly through JSON. The decoder lives inhero_logic_sdk::decode_python_source. After migration this would need to remain a hand-written helper in the SDK (codegen won't infer it) — fine, but worth knowing the helper has to be preserved.Cost estimate
OschemaBuildConfigfieldsmain.rs→HeroLifecycle::run_clipatternhero_logictypes crate (if we follow hero_service template's split)_server,_sdk,_adminadmin/src/rpc_client.rs+ switch admin to typed wrappers everywhereworkflow.list,play.list,example.list(currently the SDK gap)sdk_dirTotal: roughly 1–2 days of focused work, single person. Reversible if anything blows up — the OschemaBuildConfig changes are additive and the runtime change is one main.rs.
Recommendation
hero_logic, titled something like "Upgrade to hero_service template codegen pattern (HeroLifecycle + new OschemaBuildConfig)". Reference this comment for the assessment.hero_rpc#54(extend scaffolder),hero_rpc#55(codegen alignment +hero_lifecyclerename + hero_rpc2 vendor),hero_rpc#56(modularize generator). Doing the work before those land means iterating against a moving target; doing it after means a single clean PR onto a stable target with the hero_service template as the reference.#[rpc(server, client)]). That's a separate follow-up after hero_logic lands on the upgradedhero_rpcfirst; trying to do both in one migration triples the surface area.The hand-rolled
LogicRpcClientdeletion alone makes this worth doing eventually — every new admin route currently has to choose between "typed SDK call" and "rawrpc_callJSON wrangling," and the choice depends on which methods the SDK happens to expose typed wrappers for. That's exactly the kind of split the upgraded codegen erases.Out of scope for #41: actually doing any of the above. This comment is the assessment the acceptance asked for.
hero_logictolabSERVICE_MAP (audit follow-up) #271system.pingbuiltin (skill smoke test expects it) #80lab user initprerequisite in service skills (PATH_ROOT requirement) #272Closing — audit complete.
§1 fixes landed directly on
main(commits8f51afd,8d035e4,6499aac). PR #42 superseded by the direct merge.§2 divergences broken out into:
§1 framework gaps filed:
system.pingbuiltinlab user initPATH_ROOT prerequisite (hero_proc#109 covers managed-daemon case)§3 OSchema upgrade filed as hero_logic#44. Now unblocked from the hero_rpc side; do after #43 cleanup lands.