Infrastructure Sync & Production Deployment #12

Closed
opened 2026-03-10 15:44:27 +00:00 by mik-tf · 0 comments
Owner

The Hero OS Docker build (herodev, herodemo) was built using pinned local copies of repos like hero_rpc, hero_lib, and others. This let us move fast on UI work without worrying about upstream breaking changes. As a result, the running containers were based on old commits while the remote development branches had moved ahead significantly.

This issue brought everything back in sync and established a reliable build/deploy pipeline.

Follow-up: #13 — Smoke Tests, Production Deployment & Cleanup

Related:

  • #5 — Production Readiness Plan (completed)
  • #11 — Remaining UI/UX Work
  • #9 — Form Quality Parity

Result

Both herodev and herodemo are live and fully working. All services running, zero failures.

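| Tier | Gateway | Port | Image | Container | Status |
|------|---------|------|-------|-----------|--------|
| dev | `herodev.gent02.grid.tf` | 8805 | `hero_zero:dev` | `herodev` | 31 services, HTTP 200 |
| demo | `herodemo.gent02.grid.tf` | 8806 | `hero_zero:demo` | `herodemo` | 31 services, HTTP 200 |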

Both containers run the same image: sha256:38164c7235e707f982b6b697e2e09dbdbd1c023e0a37104dbcf276b3f62b55d6

Commits — All on development branch (pushed)

hero_services (lhumina_code/hero_services, branch: development):

| Commit | Description |
|--------|-------------|
| `5e99dfa` | Remove Dockerfile.dev, document local build pipeline |
| `38d54f1` | Fast local build pipeline + container naming fix |
| `b5e2735` | Restore --config-dir in entrypoint |
| `79f3b61` | Fix zinit_server CLI: remove --config-dir flag |
| `0ec2acc` | SSH→HTTPS migration, Docker patches, wasm-opt, service CLI fixes |

zinit (geomind_code/zinit, branch: development):

| Commit | Description |
|--------|-------------|
| `850b849` | Use OnceLock for config-dir instead of unsafe env var |
| `40f380a` | Implement config-dir scanning in service.reload |
| `6611e66` | Restore socket module and fix Option\<String\> Display in zinit_pid1 |

Other repos — all on development, no changes needed (compiled from latest HEAD):
hero_auth, hero_biz, hero_books, hero_cloud, hero_embedder, hero_forge_ui, hero_foundry, hero_indexer, hero_indexer_ui, hero_inspector, hero_os, hero_osis, hero_proxy, hero_redis, hero_shrimp, hero_voice

Exception: hero_aibroker on development_theme_sync (45408a1)
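
To make the zinit change in 850b849 more concrete, here is a minimal sketch of holding a `--config-dir` value in a `std::sync::OnceLock` instead of passing it through a process environment variable. The names `CONFIG_DIR`, `set_config_dir`, `config_dir` and the `/etc/zinit` fallback are hypothetical and chosen only for illustration; the actual zinit code may be structured differently.

```rust
use std::path::{Path, PathBuf};
use std::sync::OnceLock;

// Hypothetical global; the real zinit code may name and scope this differently.
static CONFIG_DIR: OnceLock<PathBuf> = OnceLock::new();

/// Record the directory passed via `--config-dir`.
/// Only the first call succeeds; later calls get their value back as `Err`.
pub fn set_config_dir(dir: PathBuf) -> Result<(), PathBuf> {
    CONFIG_DIR.set(dir)
}

/// Read the configured directory from any thread.
/// The fallback path is an assumption made for this sketch.
pub fn config_dir() -> &'static Path {
    CONFIG_DIR
        .get()
        .map(PathBuf::as_path)
        .unwrap_or_else(|| Path::new("/etc/zinit"))
}
```

In this sketch, argument parsing would call `set_config_dir` once at startup, and any later code (for example a reload handler) would call `config_dir()`. `OnceLock` permits a single initialization and safe shared reads afterwards, so no `unsafe` block is needed.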

Fast Local Build Pipeline

Replaces the old Dockerfile.dev (deleted). Compiles inside rust:1.93-bookworm containers with volume-mounted source for correct glibc compatibility.

make deploy        # Full pipeline: dist → pack → push → deploy to herodev (~10 min)
make demo          # Promote :dev → :demo and push

See README.md for full documentation.

Completed Tasks

  • Sync all repos to development branches
  • SSH→HTTPS migration for cargo git deps (5 repos)
  • Docker [patch] removal
  • zinit OnceLock fix (--config-dir, thread-safe, no unsafe code)
  • Entrypoint auto-starts all services after reload
  • Container naming (heroos → CONTAINER_NAME per environment)
  • Fast local build pipeline (build-local.sh + Dockerfile.pack)
  • Delete Dockerfile.dev (replaced by local build)
  • README updated with build/deploy documentation
  • Deploy to herodev (:dev) — verified
  • Deploy to herodemo (:demo) — verified
  • All code committed and pushed to development
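
The SSH→HTTPS migration listed above amounts to rewriting git URLs of the form `ssh://forge.ourworld.tf/...` to `https://forge.ourworld.tf/...`. The sketch below shows one way such a rewrite could be expressed. The function name `ssh_to_https` is hypothetical, and the handling of an optional `user@` prefix is an assumption. The actual migration may well have been done by editing the affected Cargo.toml files directly.

```rust
/// Rewrite an `ssh://` git URL on `host` to its `https://` form.
/// Returns `None` if the URL is not an ssh URL for exactly that host.
/// URLs with an explicit port are not handled in this sketch.
pub fn ssh_to_https(url: &str, host: &str) -> Option<String> {
    let rest = url.strip_prefix("ssh://")?;
    // Drop an optional `user@` prefix such as `git@` (assumed form).
    let rest = match rest.split_once('@') {
        Some((user, tail)) if !user.contains('/') => tail,
        _ => rest,
    };
    let path = rest.strip_prefix(host)?;
    // Require a path separator so a longer host name is not accepted by accident.
    if !path.starts_with('/') {
        return None;
    }
    Some(format!("https://{host}{path}"))
}
```

Called with `"ssh://forge.ourworld.tf/geomind_code/zinit"` and the host `"forge.ourworld.tf"`, the function yields the matching `https://` URL. Inputs that are already HTTPS or that point at a different host yield `None`, so they can be left untouched.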

Key Fixes

  1. SSH→HTTPS migration — all ssh://forge.ourworld.tf git refs → https://
  2. Fast local build — build-local.sh + Dockerfile.pack (~10 min vs 40-60 min)
  3. zinit OnceLock — thread-safe --config-dir replacing unsafe env var
  4. Entrypoint auto-start — services auto-start after zinit reload
  5. Container naming — heroos → per-env names from app.env
  6. hero_proxy auth fix — filter unexpanded shell vars, bypass auth on /health
  7. hero_cloud .await fix — removed .await on sync call
  8. Service CLI lifecycle — serve subcommand for new OServer::run_cli() pattern
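
As an illustration of item 6, the sketch below shows one possible reading of "filter unexpanded shell vars" and "bypass auth on /health". It treats a config value that still looks like `$NAME` or `${NAME}` as not configured, and exempts the `/health` path from authentication. All function names here are hypothetical, and `HERO_SECRET` in the usage note is an invented variable name. The real hero_proxy logic is not shown in this issue and may differ.

```rust
/// True if `value` still looks like an unexpanded shell reference,
/// e.g. `$TOKEN` or `${TOKEN}`. A real secret that starts with `$`
/// would also match, which is a limitation of this simple check.
pub fn is_unexpanded(value: &str) -> bool {
    let v = value.trim();
    v.starts_with('$') && v.len() > 1
}

/// Treat missing, blank, or unexpanded values as "not configured".
pub fn configured(value: Option<&str>) -> Option<&str> {
    value
        .map(str::trim)
        .filter(|v| !v.is_empty() && !is_unexpanded(v))
}

/// Health checks skip authentication; every other path requires it.
pub fn requires_auth(path: &str) -> bool {
    path != "/health"
}
```

With these helpers, a value such as `"${HERO_SECRET}"` that reached the proxy without being expanded is ignored rather than used as a literal credential. A probe against `/health` can then succeed without credentials.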
mik-tf added this to the ACTIVE project 2026-03-10 15:46:20 +00:00
Reference
lhumina_code/home#12