fix(server): recompute combined OpenRPC spec on every rpc.discover #40

Merged
timur merged 1 commit from fix/rpc-discover-stale-cache into development 2026-05-05 11:26:24 +00:00
Owner

Summary

Closes lhumina_code/hero_rpc#32.

UnifiedState.combined_openrpc: Arc<Value> was a single snapshot of the merged OpenRPC spec, built once at serve() time and cloned out of the Arc on every rpc.discover / /openrpc.json request.

Today that snapshot can't actually drift mid-process — entries are read once at serve() and the builder is consumed — but the structure is the kind that breaks the moment runtime domain registration is added, and the issue's "Expected" section explicitly asks for "no caching layer between what the code says and what the discover endpoint returns." This PR removes that layer.

What changed

  • Replace combined_openrpc: Arc<Value> with domain_specs: Arc<Vec<Value>> — still a snapshot of each domain's spec at serve() time, but small.
  • Add compute_combined_openrpc(state) and a pure build_combined_openrpc(title, version, description, domain_specs) helper. Both rpc.discover and /openrpc.json now call through this helper on every request.
  • Recomputation cost: a few Value clones plus appending the management-method list. Microseconds. rpc.discover is called rarely (router scanner cycle), so the perf delta is irrelevant.
  • Unit tests for build_combined_openrpc: every domain's methods appear; second call reflects subsequent input changes; empty input still emits management methods.

What this does NOT fix

The Observed section of #32 mentions a hero_logic binary built Apr 20 that omits LogicService.play_start from rpc.discover despite the methods being in the .oschema. That's a build-time concern, not a runtime cache concern. include_str! captures whatever was in openrpc.json at compile time. If the spec is regenerated from .oschema after the binary was last built, the running binary still serves the old embedded content. No change to rpc.discover can fix that — only rebuilding the binary will.

A separate follow-up could be a build.rs in each service that runs the hero_rpc_generator against the .oschema so openrpc.json is always rebuilt as part of cargo build. That's out of scope here. Filing a tracking issue if you want.

Test plan

  • cargo build -p hero_rpc_server
  • cargo test -p hero_rpc_server --lib compute_combined_openrpc — 3/3 pass
  • cargo test --workspace --lib — full suite green
  • End-to-end smoke against a live unified server (out of scope — no e2e test infra in this crate today)

🤖 Generated with Claude Code

## Summary Closes lhumina_code/hero_rpc#32. `UnifiedState.combined_openrpc: Arc<Value>` was a single snapshot of the merged OpenRPC spec, built once at `serve()` time and cloned out of the Arc on every `rpc.discover` / `/openrpc.json` request. Today that snapshot can't actually drift mid-process — `entries` are read once at `serve()` and the builder is consumed — but the structure is the kind that breaks the moment runtime domain registration is added, and the issue's "Expected" section explicitly asks for "no caching layer between what the code says and what the discover endpoint returns." This PR removes that layer. ## What changed - Replace `combined_openrpc: Arc<Value>` with `domain_specs: Arc<Vec<Value>>` — still a snapshot of each domain's spec at `serve()` time, but small. - Add `compute_combined_openrpc(state)` and a pure `build_combined_openrpc(title, version, description, domain_specs)` helper. Both `rpc.discover` and `/openrpc.json` now call through this helper on every request. - Recomputation cost: a few `Value` clones plus appending the management-method list. Microseconds. `rpc.discover` is called rarely (router scanner cycle), so the perf delta is irrelevant. - Unit tests for `build_combined_openrpc`: every domain's methods appear; second call reflects subsequent input changes; empty input still emits management methods. ## What this does NOT fix The `Observed` section of #32 mentions a hero_logic binary built Apr 20 that omits `LogicService.play_start` from `rpc.discover` despite the methods being in the `.oschema`. **That's a build-time concern, not a runtime cache concern.** `include_str!` captures whatever was in `openrpc.json` at compile time. If the spec is regenerated from `.oschema` after the binary was last built, the running binary still serves the old embedded content. No change to `rpc.discover` can fix that — only rebuilding the binary will. A separate follow-up could be a `build.rs` in each service that runs the `hero_rpc_generator` against the `.oschema` so `openrpc.json` is always rebuilt as part of `cargo build`. That's out of scope here. Filing a tracking issue if you want. ## Test plan - [x] `cargo build -p hero_rpc_server` - [x] `cargo test -p hero_rpc_server --lib compute_combined_openrpc` — 3/3 pass - [x] `cargo test --workspace --lib` — full suite green - [ ] End-to-end smoke against a live unified server (out of scope — no e2e test infra in this crate today) 🤖 Generated with [Claude Code](https://claude.com/claude-code)
fix(server): recompute combined OpenRPC spec on every rpc.discover
All checks were successful
Test / test (pull_request) Successful in 1m43s
4d09a25d07
`UnifiedState.combined_openrpc: Arc<Value>` was a snapshot built once at
serve() time from the per-domain include_str!'d specs. With current
architecture (entries are snapshotted into the dispatch map at serve()
and the builder is consumed) the snapshot can't actually drift mid-
process — but the structure is the kind that breaks the moment anyone
adds runtime domain registration, and hero_rpc#32 explicitly asks for
"no cache layer between what the code says and what discover returns."

Replace with `domain_specs: Arc<Vec<Value>>` (still a snapshot of each
domain's spec at serve() time) and recompute the combined OpenRPC
document on every `rpc.discover` and `/openrpc.json` request. Cost is a
few `Value` clones plus appending the management methods — microseconds.

Extract the merge to a free `build_combined_openrpc(...)` helper and
unit-test it: spec from every registered domain appears, second call
reflects subsequent input changes, empty input emits only management
methods.

This does NOT address the binary-vs-disk drift the issue's "Observed"
section mentions (a binary built before openrpc.json was updated will
serve the old `include_str!`'d content). That's a build-time concern —
the runtime discover endpoint can only serve what the binary embeds.

Refs lhumina_code/hero_rpc#32

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Author
Owner

Filed #41 for the proper root-cause fix (build.rs regenerates openrpc.json from .oschema, so include_str! is always current). Once #41 lands across consumer service crates, the per-request recompute in this PR becomes unnecessary and can be reverted to a startup snapshot — that revert is the last step of #41.

Keeping this PR as the tactical fix in the meantime.

Filed #41 for the proper root-cause fix (build.rs regenerates openrpc.json from .oschema, so `include_str!` is always current). Once #41 lands across consumer service crates, the per-request recompute in this PR becomes unnecessary and can be reverted to a startup snapshot — that revert is the last step of #41. Keeping this PR as the tactical fix in the meantime.
timur merged commit 3f50397c65 into development 2026-05-05 11:26:24 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
lhumina_code/hero_rpc!40
No description provided.