[seed] Reproducible media seeding — photos / songs / videos files + copy-step + runbook §7.3 #48

Open
opened 2026-04-30 16:30:50 +00:00 by mik-tf · 0 comments

Goal

Make the demo media (photos, songs, videos) reproducible from this repo. Today the herodemo VM has photos / songs / videos seeded somehow (probably from a backup tarball or a one-off SCP), but a fresh deploy following docs/ops/DEPLOYMENT.md produces empty Library archipelagos.

Current gap

| Asset | In repo today | On herodemo today | Reproducible from a fresh deploy? |
|---|---|---|---|
| data/media/photos/photo_NN.jpg | ✓ jpgs present | (state currently unknown; Photos UI shows empty during Apr 30 session) | ✗ no copy-step from data/media/ to ~/hero/var/hero_foundry/webdav/<ctx>/Photos/ |
| data/media/songs/clip_*.ogg | ✓ oggs present | ? | ✗ same: no copy-step |
| data/media/videos/*.mp4 | ✗ directory does not exist | ✓ Big Buck Bunny, Sintel, Tears of Steel mp4 in webdav/{root,default,geomind,incubaid}/Videos/ | ✗ no source files in repo, no copy-step |

data/seed/seed_install.sh only handles Office-doc generation + libraries.txt for hero_books today.

Acceptance criteria

  • Add data/media/videos/ with one or more trimmed sample mp4s (~5-10 MB each is fine). Two acceptable approaches — pick one:
    • (a) Commit a couple of small CC-licensed clips directly (e.g. trim Big Buck Bunny down to 30 sec).
    • (b) data/media/videos/MANIFEST.txt listing one entry per file with <filename> <sha256> <download-url>. Commit nothing else; let seed_install.sh download and verify on first run.
  • Extend data/seed/seed_install.sh with a seed_media sub-routine that:
    • Reads a list of contexts to seed (default: default, plus any caller-supplied list)
    • For each context, copies data/media/photos/*.jpg → ~/hero/var/hero_foundry/webdav/<ctx>/Photos/
    • Same for data/media/songs/*.ogg → webdav/<ctx>/Songs/ (or Music/, whatever the archipelago expects; verify against current foundry layout)
    • Same for data/media/videos/*.mp4 → webdav/<ctx>/Videos/
    • Idempotent — re-running refreshes / overwrites; never deletes user-uploaded files
  • Update runbook §7 with a §7.3 Media seeding subsection documenting:
    • Where files live in repo
    • Where they land on disk
    • Which service serves them (hero_foundry)
    • How to add more samples
    • How to seed for additional contexts beyond default
  • Idempotence verified: run seed_install.sh twice on a fresh VM; the second run leaves the seeded tree unchanged (same files, same contents).
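
The scripted parts of the criteria above (the approach (b) manifest check and the per-context copy step) could look roughly like the sketch below. It is a starting point under stated assumptions, not the final implementation: `MEDIA_SRC`, `WEBDAV_ROOT`, `fetch_videos` and `copy_kind` are hypothetical names, the `Songs/` target still has to be checked against the foundry layout, and "default, plus any caller-supplied list" is read here as "always seed `default`, then any extra contexts passed as arguments". It also assumes `sha256sum` and `curl` are available on the VM.

```shell
#!/usr/bin/env bash
# Sketch only. MEDIA_SRC, WEBDAV_ROOT, fetch_videos and copy_kind are
# hypothetical names; only seed_media and the paths come from the issue.
set -euo pipefail

MEDIA_SRC="${MEDIA_SRC:-data/media}"
WEBDAV_ROOT="${WEBDAV_ROOT:-$HOME/hero/var/hero_foundry/webdav}"

# Approach (b): each MANIFEST.txt line is "<filename> <sha256> <download-url>".
# Downloads files that are missing, then checks every listed file's checksum.
fetch_videos() {
  local dir="$MEDIA_SRC/videos" name sum url actual
  [ -f "$dir/MANIFEST.txt" ] || return 0      # approach (a): nothing to fetch
  while read -r name sum url || [ -n "$name" ]; do
    [ -n "$name" ] || continue                # skip blank lines
    [ -f "$dir/$name" ] || curl -fsSL -o "$dir/$name" "$url"
    actual="$(sha256sum "$dir/$name" | cut -d' ' -f1)"
    if [ "$actual" != "$sum" ]; then
      echo "checksum mismatch: $name" >&2
      return 1
    fi
  done < "$dir/MANIFEST.txt"
}

# copy_kind <src-subdir> <extension> <dest-dir>
# Overwrites same-named files in dest-dir and never deletes anything else there.
copy_kind() {
  local src_dir="$MEDIA_SRC/$1" ext="$2" dest="$3" f
  [ -d "$src_dir" ] || return 0               # e.g. videos/ not added yet
  mkdir -p "$dest"
  for f in "$src_dir"/*."$ext"; do
    [ -e "$f" ] || continue                   # glob matched nothing
    cp -p "$f" "$dest/"                       # -p keeps mtimes stable across runs
  done
}

# seed_media [ctx ...]: seeds "default" plus any extra contexts given.
seed_media() {
  local ctx
  fetch_videos
  for ctx in default "$@"; do
    copy_kind photos jpg "$WEBDAV_ROOT/$ctx/Photos"
    copy_kind songs  ogg "$WEBDAV_ROOT/$ctx/Songs"   # or Music/, to be verified
    copy_kind videos mp4 "$WEBDAV_ROOT/$ctx/Videos"
  done
}
```

Called as `seed_media geomind incubaid`, this would seed `default`, `geomind` and `incubaid`. Because the copy only ever writes files that exist under `data/media/`, a second run rewrites the same files with the same contents and leaves user uploads alone.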

Open question for the implementer

Does the Photos / Videos archipelago island read only from foundry's webdav (filesystem listing), or does it also need OSIS metadata records (in hero_osis_media) for each file? If the latter, this issue overlaps with the OSIS-seed issue and we may need a small seed_media_osis step here too. Verify before implementing.

Out of scope

  • OSIS-record seeding for non-media domains (Persons, Companies, Projects, etc.) — see the OSIS seed issue.
  • Documenting the update flow — see the update-runbook issue.

Signed-off-by: mik-tf

Reference
lhumina_code/hero_demo#48