Combined deploy — all PRs merged + Dockerfile.prod + TFGrid deployment #43

Closed
mik-tf wants to merge 102 commits from development_combined_deploy into development
Owner

Summary

Combines all deployment work from PRs #26, #31, #32, #34, #36, #40, #42 into a single branch for the first full Hero OS container deployment on TFGrid.

Production container (Dockerfile.prod)

  • Two-stage build: Rust builder compiles ALL 20+ service binaries, runtime is debian:bookworm-slim (no Rust toolchain)
  • Services built via docker/build-services.sh: hero_auth, hero_books, hero_embedder, hero_fossil, hero_indexer, hero_inspector, hero_os, hero_osis, hero_proxy, hero_redis, hero_voice + hero_indexer_ui
  • ONNX Runtime bundled for hero_embedder (load-dynamic)
  • WASM frontends built (hero_os_ui shell + hero_archipelagos islands)
  • Service TOMLs stripped of [build]/[install] sections (binaries are pre-built)
  • zinit TOMLs removed from user profile (zinit is infrastructure, started by entrypoint)
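The [build]/[install] stripping can be sketched as a small awk filter. This is a hypothetical stand-in for `docker/strip-build-sections.sh`, assuming section headers sit alone on a line:

```shell
#!/bin/sh
# Sketch: drop [build] and [install] sections from a service TOML,
# since the production image ships pre-built binaries.
strip_sections() {
  awk '
    # On each section header, decide whether to skip until the next one.
    /^\[/ { skip = ($0 == "[build]" || $0 == "[install]") }
    !skip { print }
  ' "$1"
}
```

Everything under a skipped header is dropped until the next `[section]` line; all other sections pass through unchanged.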

CI pipeline

  • build-container.yaml switched to Dockerfile.prod
  • Tag push produces :version + :dev (no :latest until production-ready)
  • Manual dispatch produces :version only
  • Removed broken build-macos.yaml (no runner) and redundant build-prod-container.yaml
  • Disabled tag trigger on build-linux.yaml (missing build_lib.sh functions, manual only)
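The tag-selection rule above can be sketched as a small helper (hypothetical; the real logic lives inside `build-container.yaml`, and the event names here are placeholders):

```shell
# Sketch: compute image tags from the CI trigger type.
# "tag_push" / "dispatch" are assumed event labels, not real CI values.
image_tags() {
  version="$1" event="$2"
  if [ "$event" = "tag_push" ]; then
    # tag push: versioned tag plus :dev (no :latest until production-ready)
    printf '%s\n' ":$version" ":dev"
  else
    # manual dispatch: versioned tag only
    printf '%s\n' ":$version"
  fi
}
```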

TFGrid deployment (deploy/single-vm/)

  • OpenTofu config with grid_scheduler for node selection
  • Multi-environment support: envs/dev/ and envs/prod/ with per-env tfvars
  • setup.sh handles Docker install on zinit VMs (no systemd — uses dockerd directly)
  • fuse-overlayfs storage driver, data-root on /data disk
  • Web gateway (HTTPS) via TFGrid name proxy
  • Makefile orchestration: make all ENV=prod for full deploy

Service fixes

  • hero_indexer: corrected repo URL in TOML
  • hero_books: switched to Unix socket (unix://__HERO_VAR__/sockets/hero_books.sock)
  • hero_osis/hero_os: set HERO_CONTEXTS=default for auth
  • Container CI: DinD checkout, SSH forwarding, proper build context

PRs included

  • #26 fix: container CI pipeline
  • #31 fix: hero_indexer repo URL
  • #32 fix: container build (included in #34)
  • #34 production container with pre-built binaries
  • #36 switch hero_books to Unix socket
  • #40 fix auth (HERO_CONTEXTS=default)
  • #42 TFGrid deployment

Closes #25
Closes #27
Closes #29
Closes #30
Closes #35
Closes #41

Test plan

  • CI green (build.yaml + build-container.yaml)
  • Container image pushed as hero_zero:0.1.0 + hero_zero:dev
  • TFGrid deploy: make all ENV=prod
  • All 20 services start inside container
  • Gateway responds at heroos.gent02.grid.tf
fix: replace actions/checkout with git clone in container build CI
Some checks failed
Build and Test / build (pull_request) Has been cancelled
6149d0c847
actions/checkout@v4 fails in docker:24-dind (alpine) due to
glibc/musl mismatch. Replace with manual git clone using
FORGEJO_TOKEN for auth. Also removes nodejs dependency since
checkout action is no longer used.

Fixes both build-container and create-release jobs.

Closes #25

Co-Authored-By: mik-tf <mik@threefold.io>
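The token-authenticated clone can be sketched as follows. The host and repo path are placeholders; only the URL shape (token as basic-auth user) is the point:

```shell
# Sketch: build a token-authenticated clone URL instead of using
# actions/checkout. FORGEJO_TOKEN comes from CI secrets.
authed_url() {
  token="$1" url="$2"
  # insert the token as the basic-auth user in the https URL
  printf '%s\n' "$url" | sed "s#^https://#https://${token}@#"
}
# git clone "$(authed_url "$FORGEJO_TOKEN" https://git.example.org/org/repo.git)"
```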
fix: pass SSH_PRIVATE_KEY via env block to preserve newlines
Some checks failed
Build and Test / build (pull_request) Has been cancelled
5158b1a887
Direct ${{ secrets }} interpolation in run blocks mangles multi-line
SSH keys. Pass via env: block instead, matching the pattern used in
build.yaml which works.

Co-Authored-By: mik-tf <mik@threefold.io>
fix: improve SSH setup — skip ssh-keyscan, add debug output
Some checks failed
Build and Test / build (pull_request) Has been cancelled
3ca3e2b267
ssh-keyscan may hang in a DinD container. Use an ssh config with
StrictHostKeyChecking accept-new instead. Add error output to identify
which step fails.

Co-Authored-By: mik-tf <mik@threefold.io>
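The ssh-config replacement for ssh-keyscan might look like this (a sketch; the directory is parameterised here, and the real workflow presumably writes `~/.ssh/config`):

```shell
# Sketch: write an ssh config that avoids interactive host-key prompts
# without running ssh-keyscan (which can hang in DinD).
setup_ssh_config() {
  dir="$1"   # normally "$HOME/.ssh"; a parameter here for testability
  mkdir -p "$dir" && chmod 700 "$dir"
  cat > "$dir/config" <<'EOF'
Host *
    StrictHostKeyChecking accept-new
    LogLevel ERROR
EOF
  chmod 600 "$dir/config"
}
```

`accept-new` trusts a host key on first contact but still rejects a changed key, which is a reasonable middle ground for ephemeral CI runners.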
fix: add nodejs to apk install — Docker actions require Node
Some checks failed
Build and Test / build (pull_request) Has been cancelled
ff901bdb3b
docker/setup-buildx-action, docker/login-action, and
docker/build-push-action are JavaScript actions that need
Node.js in the runner. Without it, they fail with exit 127
("node: not found").

Co-Authored-By: mik-tf <mik@threefold.io>
fix: clone zinit in Dockerfile for zinit_sdk path dependency
Some checks failed
Build and Test / build (pull_request) Has been cancelled
2a4a2fd98b
The workspace Cargo.toml has a path dependency on
../zinit/crates/zinit_sdk. In the Docker build context this
resolves to /build/zinit/ which must be cloned before cargo
build can proceed.

Co-Authored-By: mik-tf <mik@threefold.io>
fix: update Dockerfile to build actual workspace binaries
All checks were successful
Build and Test / build (pull_request) Successful in 6m21s
2a02b76fc1
The Dockerfile referenced a hero_zero binary that doesn't exist
in this workspace. The workspace produces hero_services_server,
hero_services, and hero_services_ui. Updated to build and copy
the actual binaries.

Simplified the builder stage — removed aspirational hero_zero
install-service loop and zinit install steps that depend on
non-existent binaries.

Co-Authored-By: mik-tf <mik@threefold.io>
fix: correct hero_indexer repo URL from hero_index_server to hero_indexer
All checks were successful
Build and Test / build (pull_request) Successful in 5m22s
003141c3fb
The TOML referenced lhumina_code/hero_index_server which is an empty
repo. The actual repo is lhumina_code/hero_indexer.

Co-Authored-By: mik-tf <mik@threefold.io>
fix: correct Dockerfile binary names, CI pipeline, and add entrypoint
All checks were successful
Build and Test / build (pull_request) Successful in 6m26s
05b6c3ff98
- Dockerfile: fix binary names (hero_services_openrpc, not hero_zero),
  build zinit workspace, use rust:slim-bookworm runtime with g++ for
  services that need C++ at install time
- CI workflow: manual git clone (actions/checkout fails in alpine DinD),
  explicit dockerd startup, SSH key via env block to prevent multiline
  mangling, StrictHostKeyChecking accept-new
- Entrypoint: start zinit_openrpc, wait for socket, launch
  hero_services_openrpc with user profile. Generic SSH key permission
  fix for any mounted key type.

Co-Authored-By: mik-tf <mik@threefold.io>
fix: use flock to prevent race condition on shared repo installs
All checks were successful
Build and Test / build (pull_request) Successful in 5m25s
a9b13c11a6
When multiple services share the same git repo (e.g. zinit_openrpc
and zinit_http both use geomind_code/zinit), their install oneshots
race on the same directory. The second install starts milliseconds
after the first finishes and fails with exit 128 (git lock conflict).

Wrap clone_or_update_sh in flock so concurrent installs serialize
their git operations on the same repo directory.

Fixes #33

Co-Authored-By: mik-tf <mik@threefold.io>
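The serialization described above can be sketched with flock on a per-repo lock file. `clone_or_update_sh` is the project's helper; the wrapper shape below is an assumption:

```shell
# Sketch: serialize concurrent git operations on one repo directory.
# flock blocks until no other install holds this repo's lock.
with_repo_lock() {
  repo_dir="$1"; shift
  mkdir -p "$repo_dir"
  flock "$repo_dir/.install.lock" "$@"
}
# with_repo_lock /tmp/repos/zinit clone_or_update_sh geomind_code/zinit
```

Two install oneshots targeting the same directory then run their git operations back to back instead of racing for the index lock.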
feat: add production container with pre-built service binaries
All checks were successful
Build and Test / build (pull_request) Successful in 6m23s
0f01466789
Production Dockerfile (Dockerfile.prod) compiles ALL hero service
binaries at build time, producing a slim debian:bookworm-slim image
with no Rust toolchain or SSH keys needed at runtime.

- docker/build-services.sh: clones and builds 12 service repos
- docker/strip-build-sections.sh: removes [build]/[install] TOML
  sections so orchestrator starts services without install oneshots
- build-prod-container.yaml: CI workflow for production image builds

Co-Authored-By: mik-tf <mik@threefold.io>
fix: preserve pre-built binaries in production mode
All checks were successful
Build and Test / build (pull_request) Successful in 6m21s
e422ebe918
- stop_and_clean: skip binary deletion when no services have [build]
  sections (production containers with pre-baked binaries)
- build-services.sh: add build_cargo() for repos needing direct cargo
  builds (hero_os: skip WASM/Dioxus frontend), fix status tracking
- strip-build-sections.sh: also remove "install" from profile actions
  to prevent orchestrator from writing install oneshots

Tested: 19/21 services running in production container.
Remaining: hero_indexer (TOML naming mismatch) and hero_books (blocked
by hero_indexer dependency).

Co-Authored-By: mik-tf <mik@threefold.io>
fix: split hero_indexer into openrpc/http to match repo binary names
All checks were successful
Build and Test / build (pull_request) Successful in 6m23s
d2f96f7aed
The hero_indexer repo now builds hero_indexer_openrpc + hero_indexer_http
(not a single hero_indexer binary). Split the TOML accordingly and update
all depends_on references in hero_books, hero_indexer_ui, hero_osis_openrpc.

Refs #29

Co-Authored-By: mik-tf <mik@threefold.io>
fix: remove hardcoded port from hero_books TOML
All checks were successful
Build and Test / build (pull_request) Successful in 5m25s
3bf8866d01
hero_books already listens on both TCP (default 8883) and Unix socket
(~/hero/var/sockets/hero_books_server.sock) automatically. Remove the
explicit --port 8883 and ports = [8883] so the binary uses its defaults
and hero_proxy can route via the existing Unix socket.

Closes #35

Co-Authored-By: mik-tf <mik@threefold.io>
fix: disable kill_others in production mode to prevent restart cascades
All checks were successful
Build and Test / build (pull_request) Successful in 5m24s
45ea909e95
When no [build] section exists (production container with pre-built
binaries), kill_others is unnecessary since there are no stale processes.
The flag causes the _http services to kill the processes on each other's
ports during simultaneous startup, exhausting zinit's retry budget.

Refs #33

Co-Authored-By: mik-tf <mik@threefold.io>
fix: add ONNX Runtime, fix zinit double-start, fix hero_books embedder URL
All checks were successful
Build and Test / build (pull_request) Successful in 6m23s
d6c8ff3eb3
Production container fixes for 20/20 services:
- Download ONNX Runtime 1.23.2 for hero_embedder (uses load-dynamic dlopen)
- Remove zinit TOMLs from production profile (infrastructure, not user service)
- Fix hero_books embedder URL to use Unix socket instead of broken HTTP URL

Co-Authored-By: mik-tf <mik@threefold.io>
docs: add production container section to README
All checks were successful
Build and Test / build (pull_request) Successful in 5m25s
a4e6ee54d5
Document the production container image (pull, run, tags), 20 service
list, startup notes, key ports, CI build process, and architecture.
Rename existing Docker section to "Development Container".

Co-Authored-By: mik-tf <mik@threefold.io>
Add hero_os_ui Dioxus shell and hero_archipelagos standalone island
builds to the production Docker image. This enables the Hero OS desktop
environment with all island apps (settings, books, calendar, contacts,
etc.) to be served by hero_os_http.

Changes:
- Add wasm32-unknown-unknown target, wasm-pack, and dioxus-cli to builder
- New docker/build-wasm.sh script (builds shell + 37 islands)
- Copy WASM assets to runtime stage
- Set HERO_OS_ASSETS/HERO_OS_ISLANDS env vars for hero_os_http
- Pin zinit to 9d21ba5 (workaround rust-version inheritance bug)
- Expose port 8804 (hero_os_http)

Tested locally: 22 services running, shell served at /, 31/37 islands
built successfully.

Closes #37

Co-Authored-By: mik-tf <mik@threefold.io>
The frontend (hero_os_ui) uses context "default" when calling RPC
methods, but hero_osis_openrpc defaults to context "root". This
causes a socket path mismatch — the frontend calls /rpc/default but
the socket only exists at .../root/hero_osis_openrpc.sock.

Set HERO_CONTEXTS=default in both service TOMLs to align with the
frontend expectation.

Closes #39
OpenTofu config with grid_name_proxy for HTTPS via TFGrid web gateway,
Docker-based setup/update scripts, multi-environment Makefile (prod/dev).

Closes #41
- Mnemonic set as TF_VAR_mnemonic in ~/hero/cfg/env/env.sh, not in tfvars
- credentials.auto.tfvars now infra config only (node_id, cpu, memory, etc.)
- FORGEJO_TOKEN passed from local env to VM via Makefile SSH
- app.env reduced to non-secret config (image, port)
- Use canonical FORGEJO_TOKEN name per env_secrets skill
Makefile auto-sources TF_VAR_mnemonic and FORGEJO_TOKEN from
~/hero/cfg/env/env.sh — no manual sourcing needed, just make all ENV=prod.
Example files trimmed to only what they contain, no secret references.
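The auto-sourcing step can be sketched as below. The file path and variable names are from the text above; the guard-with-`:?` shape is an assumption about how the check might be done:

```shell
# Sketch: source secrets from the hero env file if present, then fail
# loudly if either variable the deploy needs is still unset.
load_deploy_env() {
  env_file="${1:-$HOME/hero/cfg/env/env.sh}"
  # shellcheck disable=SC1090
  [ -f "$env_file" ] && . "$env_file"
  : "${TF_VAR_mnemonic:?set in $env_file}"
  : "${FORGEJO_TOKEN:?set in $env_file}"
}
```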
fix: switch CI to Dockerfile.prod (all binaries pre-built, no runtime compilation)
All checks were successful
Build and Test / build (pull_request) Successful in 5m15s
e4368da245
fix: manual dispatch tags only :version, tag push adds :latest
Some checks failed
Build Linux / build-linux (linux-arm64, true, aarch64-unknown-linux-gnu) (push) Failing after 4s
Build Linux / build-linux (linux-amd64, false, x86_64-unknown-linux-musl) (push) Failing after 4s
Build and Test / build (pull_request) Successful in 7m23s
Build Container / Build and Push Container (push) Successful in 32m32s
Build Container / Create Release (push) Failing after 1s
Build macOS / build-macos (push) Has been cancelled
decdf1fe56
fix: clean up CI workflows, use :dev tag instead of :latest
Some checks failed
Build and Test / build (pull_request) Has been cancelled
0104cff1d8
- Delete build-macos.yaml (no macOS runner available)
- Delete build-prod-container.yaml (redundant with build-container.yaml)
- Disable tag trigger on build-linux.yaml (missing build_lib.sh functions)
- Change build-container.yaml image tag from :latest to :dev
fix: use hero_zero:dev image tag across all deploy scripts
Some checks failed
Build and Test / build (pull_request) Successful in 5m15s
Build Container / Build and Push Container (push) Failing after 35s
Build Container / Create Release (push) Has been skipped
47d11a06da
fix: install fuse-overlayfs in CI for Docker daemon startup
Some checks failed
Build and Test / build (pull_request) Successful in 6m39s
Build Container / Build and Push Container (push) Successful in 24m36s
Build Container / Create Release (push) Failing after 1s
f3c7caf20c
Runner lacks overlay2 privileges — explicitly install fuse-overlayfs
and configure Docker to use it as storage driver.
fix: hero_os_http bind 0.0.0.0 + hero_books build skip UI crate
Some checks failed
Build and Test / build (pull_request) Successful in 5m12s
Build Container / Build and Push Container (push) Successful in 29m28s
Build Container / Create Release (push) Failing after 2s
642c0f8da6
- hero_os_http: add --host 0.0.0.0 so port 8804 is reachable via
  Docker port forwarding (was 127.0.0.1, caused 502 from gateway)
- hero_books: switch from build_repo to build_cargo with -p hero_books
  to skip hero_books_ui WASM/Dioxus crate that fails in Docker build
fix: create-release CI job + gitignore tfstate
Some checks failed
Build and Test / build (pull_request) Successful in 6m9s
Build Container / Build and Push Container (push) Successful in 26m15s
Build Container / Create Release (push) Failing after 1s
326adad31a
- Add alpine container image to create-release job (was missing, causing git/curl not found)
- Remove unnecessary checkout step (release only needs env vars)
- Add terraform.tfstate to deploy gitignore
- Fix comment: :latest → :dev

Co-Authored-By: mik-tf <mik@threefold.io>
fix: create-release JSON payload (use jq for proper escaping)
All checks were successful
Build and Test / build (pull_request) Successful in 6m41s
Build Container / Build and Push Container (push) Successful in 27m29s
Build Container / Create Release (push) Successful in 2s
d44503aaa9
Multiline BODY variable broke the JSON string literal in curl -d.
Use jq to construct the payload with proper newline escaping.

Co-Authored-By: mik-tf <mik@threefold.io>
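The jq construction might look like this. Field names are assumed to match the Forgejo release API; the point is passing the multiline body as a jq variable so it is escaped, never interpolated raw into the JSON literal:

```shell
# Sketch: build the release JSON with jq so a multiline BODY is
# escaped correctly instead of breaking the string literal.
release_payload() {
  tag="$1" body="$2"
  jq -n --arg tag "$tag" --arg body "$body" \
    '{tag_name: $tag, name: $tag, body: $body}'
}
# curl -d "$(release_payload v0.1.0 "$BODY")" ...   (sketch)
```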
mik-tf changed title from WIP: Combined deploy — all PRs merged + Dockerfile.prod + TFGrid deployment to Combined deploy — all PRs merged + Dockerfile.prod + TFGrid deployment 2026-02-27 20:45:46 +00:00
fix: route gateway through hero_proxy (port 8805) instead of hero_os_http (8804)
All checks were successful
Build and Test / build (pull_request) Successful in 6m13s
840a935870
The TFGrid gateway was pointing to hero_os_http (port 8804), which is a
static WASM file server (GET only). WASM auth RPCs use relative URLs
resolved against the page origin, so POST requests hit the static server
and got HTTP 405. Switch to hero_proxy_http (port 8805), which routes
all paths to the correct backend services via Unix sockets.

Co-Authored-By: mik-tf <mik@threefold.io>
fix: hero_os_http socket path to match hero_proxy routing
All checks were successful
Build and Test / build (pull_request) Successful in 5m16s
Build Container / Build and Push Container (push) Successful in 29m3s
Build Container / Create Release (push) Successful in 1s
e27243253a
hero_proxy routes /hero_os/* to ~/hero/var/sockets/hero_os.sock but
hero_os_http was defaulting to hero_os_http.sock (binary name). Add
explicit --bind to the service TOML so it creates the socket the
proxy expects. Without this, /hero_os/ returns 502.

Co-Authored-By: mik-tf <mik@threefold.io>
fix: increase Docker daemon startup timeout to 60s in CI
All checks were successful
Build and Test / build (pull_request) Successful in 6m14s
e4634efee6
DinD (Docker-in-Docker) with fuse-overlayfs can take longer than 30s
to initialize on shared CI runners. 60s matches industry standard and
our own deploy/setup.sh timeout.

Co-Authored-By: mik-tf <mik@threefold.io>
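The startup wait can be sketched as a generic poll; `docker info` is the real probe, the helper shape is an assumption:

```shell
# Sketch: poll a readiness command once per second until it succeeds
# or the timeout (in seconds) expires.
wait_for() {
  timeout="$1"; shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    [ "$elapsed" -ge "$timeout" ] && return 1
    sleep 1
    elapsed=$((elapsed + 1))
  done
}
# wait_for 60 docker info   # DinD + fuse-overlayfs can need close to 60s
```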
fix: revert hero_os_http socket override — use default hero_os_http.sock
All checks were successful
Build and Test / build (pull_request) Successful in 6m14s
Build Container / Build and Push Container (push) Successful in 23m4s
Build Container / Create Release (push) Successful in 1s
bc4ad5f1a1
The Dioxus frontend uses base_path="hero_os_http", so all asset paths
are absolute at /hero_os_http/assets/.... The proxy routes /hero_os_http/*
to hero_os_http.sock, which only works when the binary uses its default
socket name. The previous --bind override to hero_os.sock broke asset
loading (white screen).

Correct access URL: /hero_os_http/ (not /hero_os/)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: always configure Docker fuse-overlayfs on TFGrid VMs
All checks were successful
Build and Test / build (pull_request) Successful in 5m16s
133192be86
TFGrid VMs often have Docker pre-installed but configured with the
default overlayfs driver, which fails on virtiofs. The setup script
now checks the current storage driver and reconfigures to
fuse-overlayfs + /data when needed, regardless of whether Docker
was freshly installed or already present.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: add "root" context to hero_osis_openrpc for auth support
Some checks failed
Build and Test / build (pull_request) Successful in 6m39s
Build Container / Build and Push Container (push) Failing after 1m6s
Build Container / Create Release (push) Has been skipped
9366e88d8b
The Hero OS UI sends auth RPC calls (authservice.get_challenge, etc.)
to the "root" context via hero_osis. With only "default" configured,
hero_osis_openrpc never creates sockets/root/hero_osis_openrpc.sock,
causing 502 "Cannot connect to backend socket" on login.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: add admin seeding and per-repo branch overrides for Docker builds
Some checks failed
Build and Test / build (pull_request) Failing after 8s
abb73618f2
- Add HERO_OSIS_SEED_DIR to hero_osis_openrpc service TOML for automatic
  admin user provisioning on first boot (idempotent)
- Add per-repo branch overrides in build-services.sh (HERO_OS_BRANCH,
  HERO_OSIS_BRANCH env vars) so Docker builds can use fix branches
  without waiting for PRs to merge
- Add seed data COPY step in Dockerfile.prod (copies from hero_osis
  data/seed/ directory into /root/hero/var/seed/)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: pin zinit to development_fix_herolib_core_dep branch
Some checks failed
Build and Test / build (pull_request) Has been cancelled
d3c51ca4f4
Zinit's development branch is broken — commit 11a6bd4 added
herolib_core as a workspace dep in zinit_sdk but never defined it
in the root Cargo.toml. This breaks cargo resolution for all
downstream consumers.

Pin to the fix branch until zinit#25 is merged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: update zinit binary names after upstream restructure
All checks were successful
Build and Test / build (pull_request) Successful in 5m13s
28884afddd
zinit codebase was restructured: zinit_openrpc → zinit_server,
zinit_http → zinit_ui, --web-port → --http. Update Dockerfiles
and entrypoint to match.
fix: rename hero_redis service TOMLs to match upstream binary names
All checks were successful
Build and Test / build (pull_request) Successful in 5m16s
46828651ea
hero_redis renamed its binaries: hero_redis_openrpc → hero_redis_server,
hero_redis_http → hero_redis_ui. Update service TOMLs, dependency
references, and build script accordingly.
fix: add HERO_BOOKS_BRANCH build arg for per-repo branch override
All checks were successful
Build and Test / build (pull_request) Successful in 5m22s
32eb597913
fix: update service TOMLs and build script for upstream binary renames
All checks were successful
Build and Test / build (pull_request) Successful in 5m15s
703c16f7d2
hero_voice, hero_indexer repos renamed their binaries from *_openrpc/*_http
to *_server/*_ui convention. Update TOMLs and build-services.sh to match.
Also switch hero_inspector to build_cargo to bypass broken Makefile.

Affected:
- hero_voice_openrpc → hero_voice_server
- hero_voice_http → hero_voice_ui
- hero_indexer_openrpc → hero_indexer_server
- hero_indexer_http removed (merged into server)
- hero_inspector: build_repo → build_cargo
fix: update depends_on references for hero_indexer rename
All checks were successful
Build and Test / build (pull_request) Successful in 5m13s
be75bce54d
hero_osis_openrpc and hero_books had depends_on referencing the old
hero_indexer_openrpc service name. Updated to hero_indexer_server.
fix: update hero_inspector TOMLs, add proxy default service, add HERO_PROXY_BRANCH
All checks were successful
Build and Test / build (pull_request) Successful in 5m15s
a483c013c8
- Rename hero_inspector TOMLs to match upstream binary names (_server/_ui)
- Revert hero_inspector to build_repo (development already has Makefile fix)
- Add HERO_PROXY_DEFAULT_SERVICE=hero_os_http to proxy TOML for SPA routing
- Add HERO_PROXY_BRANCH build arg to Dockerfile.prod
fix: hero_inspector build — use build_cargo + symlink zinit for path deps
All checks were successful
Build and Test / build (pull_request) Successful in 5m15s
88b29c9894
hero_inspector's Makefile has build_lib.sh dependencies that fail in
Docker. Switch to build_cargo and symlink /build/zinit into SRC_DIR
so zinit_sdk relative path deps resolve correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: health checks use unix socket probing + branch overrides for all fixes
All checks were successful
Build and Test / build (pull_request) Successful in 5m12s
fe082030b5
Health check generation now supports a `socket` field in service TOMLs.
When present, uses `curl --unix-socket` instead of TCP port probing.
Fixes 4 failing health checks (voice_ui, inspector_ui, redis_ui,
embedder_http) which bind Unix sockets but had TCP-only health probes.

Branch overrides updated:
- hero_os: development_ui_polish (system menu, maximize, tests)
- hero_osis: development_fix_missing_domains (communication, media, job)
- hero_proxy: development_default_service (SPA fallback routing)
- hero_inspector: development_fix_makefile_target (Docker build fix)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
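The probe selection can be sketched as below. The `socket` field name comes from the commit; the `/health` URL path and the helper name are placeholders:

```shell
# Sketch: choose the curl probe based on whether the service TOML
# declared a `socket` field. Emits the command rather than running it.
probe_cmd() {
  socket="$1" port="$2"
  if [ -n "$socket" ]; then
    printf 'curl -fsS --unix-socket %s http://localhost/health\n' "$socket"
  else
    printf 'curl -fsS http://127.0.0.1:%s/health\n' "$port"
  fi
}
```

Services that bind only a Unix socket (voice_ui, inspector_ui, redis_ui, embedder_http) get the `--unix-socket` form; everything else keeps TCP probing.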
fix: correct hero_inspector binary and package names
All checks were successful
Build and Test / build (pull_request) Successful in 5m15s
80f2f84ad8
The hero_inspector crates produce binaries named hero_inspector_openrpc
and hero_inspector_http, not hero_inspector_server and hero_inspector_ui.
Fix service TOMLs and build script to use the actual binary names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: correct hero_embedder_http socket path to match binary name
All checks were successful
Build and Test / build (pull_request) Successful in 5m18s
6a1ef6d86c
The hero_embedder_ui binary creates hero_embedder_ui.sock, not
hero_embedder_http.sock. Fix health check socket path to match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: add fallback binary search in build scripts
All checks were successful
Build and Test / build (pull_request) Successful in 5m14s
8f88a1f0c5
Add fallback to check repo-local target/release/ when binary is not
found at CARGO_TARGET_DIR/release/. Handles edge cases where Docker
cache mounts cause cargo to use an alternate target directory.
Also switch hero_books from build_cargo to build_repo for Makefile install.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: update hero_books binary name to hero_books_server
All checks were successful
Build and Test / build (pull_request) Successful in 5m13s
ccbca54856
The hero_books repo renamed its main binary from hero_books to
hero_books_server. Update build script and service TOML to match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
chore: update HERO_OS_BRANCH to development_consolidated
All checks were successful
Build and Test / build (pull_request) Successful in 6m13s
37d8dc1f9e
Point to the new consolidated branch (PR #19) which merges
UI polish, SPA routing, and projects island PRs (#11, #15, #18).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: set hero_proxy_http to listen on port 6666 (public-facing)
All checks were successful
Build and Test / build (pull_request) Successful in 5m14s
c46cc86d5e
In Docker container, hero_proxy_http is the public-facing reverse proxy
and should listen on port 6666 (mapped externally). The hero_ports registry
assigns 8805 for single-machine use, but in container mode 6666 is correct.

Add HERO_PROXY_PORT=6666 env var so the binary uses the right port.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat: add HERO_ARCHIPELAGOS_BRANCH support in WASM build
All checks were successful
Build and Test / build (pull_request) Successful in 6m9s
4c72bcf93d
- Dockerfile.prod: add ARG HERO_ARCHIPELAGOS_BRANCH=development_consolidated
  and pass to build-wasm.sh
- build-wasm.sh: use HERO_ARCHIPELAGOS_BRANCH (falls back to BRANCH)
  when cloning hero_archipelagos for standalone island builds

This ensures standalone book/other island WASMs are built from
development_consolidated which has the books iframe→views fix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fix: map host APP_PORT to container PROXY_PORT (6666) in setup.sh
All checks were successful
Build and Test / build (pull_request) Successful in 5m11s
6c4ddea1b4
The hero_proxy_http service listens on container port 6666 (HERO_PROXY_PORT),
not 8805. The Docker run script was mapping 8805:8805, so nothing was
listening on the mapped container port and the gateway returned 502s.

Introduces PROXY_PORT variable (default 6666) so the mapping becomes
APP_PORT:PROXY_PORT (e.g. 8805:6666).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
test: add gateway smoke tests for routing verification
All checks were successful
Build and Test / build (pull_request) Successful in 5m15s
ec26c0cf50
Add tests/smoke_gateway.sh with 7 checks covering SPA routing (via
HERO_PROXY_DEFAULT_SERVICE), prefix routing, JSON-RPC proxy, and direct
osis prefix routing. Update deploy/single-vm/Makefile test target to
invoke the script with the deployed gateway URL.

Root path GET / correctly expects 200 (hero_proxy returns JSON service
listing), while /login and /hero_os_http/ assert text/html.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
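One of the checks might be sketched as follows. The helper name and split between fetching and asserting are assumptions, not the real `tests/smoke_gateway.sh`:

```shell
# Sketch: compare an observed HTTP status against the expected one.
# The curl line below (commented) shows how the status would be fetched.
assert_status() {
  path="$1" expected="$2" actual="$3"
  if [ "$actual" = "$expected" ]; then
    echo "ok $path"
  else
    echo "FAIL $path: expected $expected got $actual"; return 1
  fi
}
# actual=$(curl -s -o /dev/null -w '%{http_code}' "$GATEWAY_URL/")  # sketch
# assert_status / 200 "$actual"   # proxy returns a JSON service listing
```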
feat: update branch ARGs for Books WASM island migration
All checks were successful
Build and Test / build (pull_request) Successful in 5m17s
2afc5af2a8
- HERO_BOOKS_BRANCH: development → development_wasm_access
  (adds hero_books_http.sock + /admin/rpc for proxy routing)
- HERO_ARCHIPELAGOS_BRANCH: development_consolidated → development_books_wasm_views
  (replaces iframe with native Dioxus views, adds [package.metadata.island])

Previous HERO_ARCHIPELAGOS_BRANCH=development_consolidated did not exist
on remote, causing build-wasm.sh to silently skip all island builds.
fix: use socat bridge for hero_books routing + revert broken branch
All checks were successful
Build and Test / build (pull_request) Successful in 5m13s
0dbe1d3edb
- Revert HERO_BOOKS_BRANCH from development_wasm_access to development
  (development_wasm_access has broken lib.rs missing module files)
- Add socat to runtime image
- Add hero_books_socket_bridge zinit service: bridges hero_books.sock
  → hero_books_server.sock so hero_proxy /hero_books/ routes correctly
  (depends_on hero_books so it starts after the server is healthy)
chore: add CACHE_BUST ARG to force rebuild with updated git deps
All checks were successful
Build and Test / build (pull_request) Successful in 6m37s
bf19b8d8cf
development_consolidated now has the nav bar — bump CACHE_BUST=2 to
invalidate the Docker registry cache so hero_os_ui picks up the
new hero_archipelagos_books commit.
fix: add unlink-early to socat socket bridge to survive container restarts
Some checks failed
Build and Test / build (pull_request) Has been cancelled
2d18e59977
The hero_books_socket_bridge socat service failed on container restart
because the hero_books.sock file persisted in the data volume from the
previous run. Adding `unlink-early` causes socat to remove the stale
socket file before binding, preventing the EADDRINUSE failure.
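Put together, the bridge service definition looks roughly like this (a sketch; the key names and socket paths are assumptions based on conventions mentioned elsewhere in this PR, not the actual TOML):

```toml
# Hypothetical hero_books_socket_bridge service definition.
# unlink-early makes socat delete a stale hero_books.sock left in the data
# volume before binding, so container restarts don't hit EADDRINUSE.
name = "hero_books_socket_bridge"
exec = "socat UNIX-LISTEN:__HERO_VAR__/sockets/hero_books.sock,fork,unlink-early UNIX-CONNECT:__HERO_VAR__/sockets/hero_books_server.sock"
depends_on = ["hero_books"]
```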

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
feat: point HERO_BOOKS_BRANCH to development_wasm_access for http socket + admin/rpc
Some checks failed
Build and Test / build (pull_request) Has been cancelled
117cbcdb1a
fix: revert HERO_BOOKS_BRANCH to development (wasm_access fails Rust 1.93 type inference)
Some checks failed
Build and Test / build (pull_request) Has been cancelled
d0bf1c07a9
- Dockerfile.prod: add HERO_ARCHIPELAGOS_BRANCH ARG (default: development_consolidated)
  and pass it to build-wasm.sh
- build-wasm.sh: use HERO_ARCHIPELAGOS_BRANCH env var (fallback to BRANCH) when
  cloning hero_archipelagos

development_consolidated consolidates all archipelagos improvements (CI, MDX,
PDF, islands) including the books iframe → Dioxus WASM view router migration
previously on development_books_wasm_views (PR#37, now closed and merged in).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Merge: set HERO_ARCHIPELAGOS_BRANCH=development_consolidated
All checks were successful
Build and Test / build (pull_request) Successful in 5m16s
3161ed64e8
Keeps all branch overrides from remote (HERO_PROXY_BRANCH, HERO_INSPECTOR_BRANCH)
and switches HERO_ARCHIPELAGOS_BRANCH from development_books_wasm_views to
development_consolidated — the single consolidated PR that now includes
the Books WASM view router migration (PR#37 merged in and closed).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fix: restore HERO_BOOKS_BRANCH=development_wasm_access + add Chromium
All checks were successful
Build and Test / build (pull_request) Successful in 7m30s
05a02c533b
- HERO_BOOKS_BRANCH=development_wasm_access: restores hero_books_http.sock
  creation + /admin/rpc endpoint. Previously reverted due to Rust 1.93 type
  inference issues, safe to re-enable now that builder is pinned to 1.92.
- Add chromium + dependencies to runtime stage for PDF generation
  (hero_books pdfservice requires headless Chrome via headless_chrome crate)
- Bump CACHE_BUST=3 to force rebuild with new branch

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fix: set HERO_ARCHIPELAGOS_BRANCH=development_books_wasm_views, bump CACHE_BUST=4
All checks were successful
Build and Test / build (pull_request) Successful in 5m14s
1bb60f47f2
development_consolidated didn't exist on hero_archipelagos remote.
Correct branch for Books WASM island migration is development_books_wasm_views.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fix: pre-install binaryen so dx build finds wasm-opt without downloading
All checks were successful
Build and Test / build (pull_request) Successful in 5m16s
a8813096e5
dx build --release tries to download wasm-opt from GitHub during the build,
which fails in the sandboxed Docker build environment. Installing binaryen
via apt puts wasm-opt on PATH so dx uses it directly.

Also bumps CACHE_BUST=5 to force re-clone.
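The fix amounts to one extra apt layer in the builder stage (a sketch; anything beyond the binaryen package itself is an assumption):

```dockerfile
# wasm-opt ships in Debian's binaryen package; with it on PATH, dx skips
# its GitHub download attempt entirely.
RUN apt-get update \
 && apt-get install -y --no-install-recommends binaryen \
 && rm -rf /var/lib/apt/lists/*
```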

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
feat: add reliable/reliable-clean Makefile targets for prod image builds
Some checks failed
Build and Test / build (pull_request) Failing after 0s
986dd6834e
Two modes:
  make reliable TAG=reliable26       — with layer cache (normal, fast)
  make reliable-clean TAG=reliable26 — --no-cache (when Dockerfile changed)

Build caching (~20-25 min saved) comes from BuildKit mount caches for the
cargo registry, git deps, and target/. Use --no-cache only when apt deps or
tool versions change.
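The mount caches follow the usual BuildKit pattern for Rust builds; a sketch (the target paths are assumptions, /usr/local/cargo being the default CARGO_HOME in the rust images):

```dockerfile
# type=cache mounts persist across builds independently of the layer cache,
# so cargo's registry, git checkouts, and incremental build artifacts
# survive even when an earlier layer is invalidated.
RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/usr/local/cargo/git \
    --mount=type=cache,target=/build/target \
    cargo build --release
```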

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
chore: bump CACHE_BUST=6 to pick up hero_os default window fix
All checks were successful
Build and Test / build (pull_request) Successful in 5m24s
e8b2422454
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
hero_proxy development_default_service now redirects /login and other
unprefixed paths to /hero_os_http/{path} instead of proxying transparently.
This fixes the /hero_os_http/hero_os_http double-prefix bug in the browser.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
chore: resolve merge, bump CACHE_BUST to 7 for hero_proxy redirect fix
All checks were successful
Build and Test / build (pull_request) Successful in 5m16s
20d8315eef
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fix: bump CACHE_BUST=8 for loading overlay fix
All checks were successful
Build and Test / build (pull_request) Successful in 7m30s
8abdd41056
hero_os development_consolidated: add use_effect to dismiss #loading
overlay once WASM app mounts (was permanently blocking rendered app).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
chore: bump builder image to rust:1.93-bookworm
All checks were successful
Build and Test / build (pull_request) Successful in 5m16s
802e7d58f5
Per hero_ecosystem standard — Rust 1.93 is now the minimum.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- hero_inspector: packages are hero_inspector_server/hero_inspector_ui
  (not hero_inspector_openrpc/hero_inspector_http)
- hero_fossil: use build_cargo to bypass Makefile build_lib.sh issues
  in container; binaries renamed to hero_foundry_*

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
chore: bump CACHE_BUST=9 for hero_archipelagos webdav rename fix
Some checks failed
Build and Test / build (pull_request) Failing after 0s
0bdd57ab14
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
chore: bump CACHE_BUST=10 for hero_os fix (archipelagos development_consolidated)
All checks were successful
Build and Test / build (pull_request) Successful in 5m14s
df6a320196
Cherry-picked the herofossil_webdav_client rename into the
development_consolidated branch of hero_archipelagos, which
hero_os_ui references. hero_os workspace resolution now succeeds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: update service TOML exec paths for renamed binaries
Some checks failed
Build and Test / build (pull_request) Failing after 0s
10b124bca3
hero_osis: openrpc→server, http→ui
hero_inspector: openrpc→server, http→ui
hero_fossil: hero_fossil→hero_foundry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: use hero_foundry_server (not hero_foundry CLI) in fossil service TOML
Some checks failed
Build and Test / build (pull_request) Failing after 0s
3038091551
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat: add hero_biz to container build pipeline
All checks were successful
Build and Test / build (pull_request) Successful in 5m15s
687ad73ee9
- Add build_repo entry in build-services.sh
- Create hero_biz.toml service definition (port 8881)
- Expose port 8881 in Dockerfile.prod
- CACHE_BUST=11

Note: hero_biz starts but cannot reach hero_osis backend
(no HTTP API bridge exists). See hero_biz#6.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: use inspector branch with HTTP fallback and MCP fixes
All checks were successful
Build and Test / build (pull_request) Successful in 5m15s
17247831f6
Switch HERO_INSPECTOR_BRANCH to development_fix_openrpc_http_fallback
which includes HTTP fallback for _server.sock probing and optional
ConnectInfo for MCP handler behind Unix socket proxy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: use build_cargo for hero_biz to bypass Makefile in Docker
All checks were successful
Build and Test / build (pull_request) Successful in 5m15s
dacba097bd
hero_biz Makefile depends on build_lib.sh which isn't available in
the Docker builder. Switch to build_cargo for direct cargo build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
refactor: rename service TOMLs to _server/_ui naming convention
Some checks failed
Build and Test / build (pull_request) Has been cancelled
d50308d91b
Rename all service TOML files and internal names from legacy
_openrpc/_http suffixes to _server/_ui per Hero ecosystem skills.
Update all depends_on, run_after references, socket paths, and
default service names. Bump CACHE_BUST to 14.
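Illustratively, a renamed service definition changes along these lines (a sketch; the keys, flag, and socket path are assumptions drawn from conventions mentioned in this PR, not an actual file from the repo):

```toml
# Before: hero_osis_http.toml — after the rename: hero_osis_ui.toml
name = "hero_osis_ui"                          # was hero_osis_http
exec = "hero_osis_ui --socket __HERO_VAR__/sockets/hero_osis_ui.sock"
depends_on = ["hero_osis_server"]              # was hero_osis_openrpc
```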

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: add embedder/osis basepath branches, bump CACHE_BUST to 15
Some checks failed
Build and Test / build (pull_request) Failing after 0s
4be1715259
Pick up development_fix_basepath_regex branches for hero_embedder
and hero_osis which fix the BASE_PATH JS regex (_http -> _ui).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: add fossil branch, use merged inspector, bump CACHE_BUST to 16
All checks were successful
Build and Test / build (pull_request) Successful in 5m13s
788d79d378
- HERO_INSPECTOR_BRANCH back to development (PR #4 merged)
- HERO_FOSSIL_BRANCH=development_fix_openrpc_json (adds /openrpc.json)
- CACHE_BUST=16

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: use inspector PR branch with 405 fix, bump CACHE_BUST to 18
All checks were successful
Build and Test / build (pull_request) Successful in 5m16s
e3df89b44e
HERO_INSPECTOR_BRANCH back to development_fix_openrpc_http_fallback
(PR #5) which includes the HTTP 405 fallback fix needed for
hero_os_http /openrpc.json discovery.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comprehensive smoke tests + fix RPC proxy routing
Some checks failed
Build and Test / build (pull_request) Has been cancelled
58025f9124
- smoke_container.sh: 57 tests covering zinit status, sockets, health,
  RPC proxy, OpenRPC spec, auth flow, and inspector discovery
- smoke_gateway.sh: updated for _ui naming, added auth/inspector tests
- Makefile: added smoke-container and smoke-gateway targets
- Dockerfile.prod: CACHE_BUST=20 for hero_os RPC proxy fix

Auth dispatch via osis is a known skip (spec aggregated but not dispatched).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix smoke tests: remove raw JSON-RPC health checks, fix MCP/auth/inspector
All checks were successful
Build and Test / build (pull_request) Successful in 5m16s
3c62a443d2
- Gateway: remove raw JSON-RPC sockets from HTTP health checks (502 expected)
- Gateway: fix MCP test (POST, not GET)
- Gateway: fix inspector API (use JSON-RPC services.list, skip if scanning)
- Both: mark auth dispatch as known skip (osis aggregates spec but doesn't dispatch)
- Container: fix inspector test to use HTTP over UDS (not raw socat)
- Container: remove the hero_embedder_server.sock check (the service doesn't create one)
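The container-side checks boil down to speaking HTTP over a Unix domain socket. A minimal sketch of that pattern (the helper name and the example socket path are hypothetical, not taken from smoke_container.sh):

```shell
#!/bin/sh
# Probe an HTTP endpoint over a Unix domain socket. curl tunnels the HTTP
# request through the socket; the "localhost" host in the URL is only a
# placeholder and is ignored.
check_uds() {
  sock="$1" path="$2"
  if curl -sf --max-time 5 --unix-socket "$sock" "http://localhost$path" >/dev/null; then
    echo "PASS $path"
  else
    echo "FAIL $path"
  fi
}

# Example run against a socket that does not exist (path is hypothetical):
check_uds /tmp/no-such.sock /openrpc.json   # → FAIL /openrpc.json
```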

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
chore: bump CACHE_BUST=21 for hero_rpc 2-part dispatch fix
Some checks failed
Build and Test / build (pull_request) Failing after 0s
c7b0710f77
Rebuilds hero_osis_server with updated hero_rpc dependency that supports
2-part method names (Type.method) in Unix socket dispatch, fixing all
RPC calls from the WASM UI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: auth smoke tests now expect pass (not skip)
All checks were successful
Build and Test / build (pull_request) Successful in 5m15s
c07c1cc643
With the hero_rpc 2-part dispatch fix, auth methods should work.
Changed auth test fallback from skip to fail so broken auth is
caught as a regression.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
chore: bump CACHE_BUST=22 for two-phase dispatch fix
All checks were successful
Build and Test / build (pull_request) Successful in 5m12s
e810a9068f
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
chore: bump CACHE_BUST=23 for error distinction fix
Some checks failed
Build and Test / build (pull_request) Has been cancelled
6e2a58faff
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: update auth smoke tests to use device_sid parameter
All checks were successful
Build and Test / build (pull_request) Successful in 5m16s
fb23726314
The AuthService.login method now requires device_sid instead of metadata.
Updated both smoke tests to pass device_sid: "smoke-test".
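For reference, the request shape the updated tests send is roughly the following (only the payload structure comes from the commit messages; the method name pairing and the socket path in the comment are assumptions):

```shell
#!/bin/sh
# Hypothetical JSON-RPC 2.0 call: a 2-part method name (Type.method) with
# the new device_sid parameter replacing metadata.
payload='{"jsonrpc":"2.0","id":1,"method":"AuthService.login","params":{"device_sid":"smoke-test"}}'

# The smoke tests POST this over a Unix socket, roughly:
#   curl -s --unix-socket __HERO_VAR__/sockets/hero_proxy.sock \
#        -H 'Content-Type: application/json' -d "$payload" http://localhost/
printf '%s\n' "$payload"
```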

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix hero_fossil socket name for hero_proxy routing
Some checks failed
Build and Test / build (pull_request) Has been cancelled
44eab665cb
Rename hero_fossil_server.sock to hero_fossil.sock so hero_proxy
can discover it via path prefix matching. hero_proxy searches for
exact, _http, and _ui suffixes but never _server sockets.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bump CACHE_BUST=24 for Settings dock fix and Files 404 fix
All checks were successful
Build and Test / build (pull_request) Successful in 5m24s
f056ed154a
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix smoke tests: hero_fossil socket renamed from hero_fossil_server
All checks were successful
Build and Test / build (pull_request) Successful in 5m16s
2d934e0175
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mik-tf closed this pull request 2026-03-05 13:34:43 +00:00