feat(#28): ask_user + replay-based step memoization #31

Merged
timur merged 2 commits from feature/issue-28-ask-user-resume into development 2026-05-13 15:37:03 +00:00
Closes #28.

What this lands

The full resume-by-replay model end-to-end. A flow can call flow.pause(name, schema, ui) or one of the ask_user.* helpers; the subprocess exits cleanly with status awaiting_resume, every previously-completed @flow step's output is persisted, and a later play_resume(play_sid, resume_id, payload) re-spawns the subprocess with both caches primed — the same call site returns the cached payload, every cached step short-circuits, and the flow continues from where it paused without re-firing side-effects.
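
As a rough illustration of the model described above, here is a minimal, self-contained toy. It does not use the real `hero_tracing` SDK: `ToyRun`, `AwaitingResume`, `charge_card`, and the string keys are all hypothetical stand-ins, and the two "spawns" are simulated in-process rather than as subprocesses.

```python
class AwaitingResume(Exception):
    """Hypothetical stand-in for the SDK's _AwaitingResume."""

    def __init__(self, resume_id):
        super().__init__(resume_id)
        self.resume_id = resume_id


class ToyRun:
    """One simulated spawn, holding the caches primed from earlier runs."""

    def __init__(self, step_cache=None, resume_cache=None):
        self.step_cache = dict(step_cache or {})
        self.resume_cache = dict(resume_cache or {})

    def step(self, key, fn):
        # A cached step short-circuits: fn is not called again.
        if key in self.step_cache:
            return self.step_cache[key]
        out = fn()
        self.step_cache[key] = out
        return out

    def pause(self, resume_id):
        # With a cached payload the call site returns it; otherwise we bail out.
        if resume_id in self.resume_cache:
            return self.resume_cache[resume_id]
        raise AwaitingResume(resume_id)


side_effects = []


def charge_card():
    side_effects.append("charged")  # the side effect we must not repeat
    return {"receipt": 1}


def my_flow(run):
    receipt = run.step("charge_card", charge_card)
    answer = run.pause("confirm_shipping")
    return {"receipt": receipt, "answer": answer}


def demo():
    side_effects.clear()
    status, pending = "success", None
    first = ToyRun()
    try:
        my_flow(first)
    except AwaitingResume as exc:
        status, pending = "awaiting_resume", exc.resume_id
    # "Resume": re-run from the top with both caches primed.
    second = ToyRun(step_cache=first.step_cache, resume_cache={pending: "yes"})
    result = my_flow(second)
    return status, result, list(side_effects)
```

Calling `demo()` runs the flow twice from the top; the second pass reuses the cached step output, so `charge_card` is invoked only during the first pass.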

Surface delta

Schema (crates/hero_logic/schemas/logic/logic.oschema)

  • ExecutionStatus: + awaiting_resume
  • SpanStatus: + replayed
  • Play: + pending_resumes, received_resumes, step_outputs, total_cost_usd
  • New ResumeRequest rootobject
  • New RPC methods: play_resume, play_pending_resumes
  • play_start / play_run_async: + prefill_resumes_json param

Python SDK (crates/hero_logic/sdk/python/hero_tracing.py)

  • flow.pause(...) primitive
  • ask_user.text / number / choice / multi_choice / confirm helpers
  • Step memoization: deterministic step_key from (parent_path, flow_name, sorted kwargs)
  • _AwaitingResume exception → boot stub exits 75
  • Replay-cache loader (_STEP_CACHE, _RESUME_CACHE) primed from JSON files the executor stages before spawn
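
One way to get a deterministic key from `(parent_path, flow_name, sorted kwargs)` is to serialize the triple canonically and hash it. The sketch below is an assumption about the approach, not the SDK's actual code: the SHA-256 digest, the separators, and the `default=str` fallback are illustrative choices.

```python
import hashlib
import json


def step_key(parent_path, flow_name, kwargs):
    """Return a stable key for one step invocation (hypothetical helper)."""
    canonical = json.dumps(
        [parent_path, flow_name, kwargs],
        sort_keys=True,          # kwarg order must not change the key
        separators=(",", ":"),   # avoid whitespace differences
        default=str,             # assumption: stringify non-JSON values
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this shape, the same call made with keyword arguments in a different order yields the same key, while the same flow called under a different parent path yields a different one.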

Span socket

  • New step_output event → appends to Play.step_outputs
  • New pause event → appends idempotent ResumeRequest to Play.pending_resumes
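
The idempotent append for `pause` events could look roughly like the following. The real handler lives in Rust; this Python sketch only illustrates the "idempotent on id" behaviour, and the dict shape with an `id` field is an assumption.

```python
def append_pending_resume(pending_resumes, request):
    """Append request unless one with the same id is already pending.

    Returns True if the list was modified, False if the event was a repeat.
    """
    if any(existing["id"] == request["id"] for existing in pending_resumes):
        return False
    pending_resumes.append(request)
    return True
```

A repeated `pause` event for the same id then leaves `pending_resumes` unchanged.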

Executor

  • Stages replay JSON files in the per-Play workdir, sets HERO_REPLAY_STEP_OUTPUTS_FILE / HERO_REPLAY_RESUMES_FILE
  • Boot stub catches _AwaitingResume and exits 75
  • Outcome handler maps exit-75 → awaiting_resume (no completed_at stamp)
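
The exit-code contract can be sketched with a throwaway child process. Only the value 75 and the `awaiting_resume` status come from this PR; the stub body, `map_exit_code`, and the `"success"` / `"failed"` labels are hypothetical.

```python
import subprocess
import sys

EXIT_AWAITING_RESUME = 75  # EX_TEMPFAIL

# Minimal stand-in for the boot stub: the flow raises, the stub exits 75.
BOOT_STUB = """
import sys

class _AwaitingResume(Exception):
    pass

def main():
    raise _AwaitingResume("confirm_shipping")

try:
    main()
except _AwaitingResume:
    sys.exit(75)
"""


def map_exit_code(code):
    """Map a child exit code to a status label (labels are illustrative)."""
    if code == 0:
        return "success"
    if code == EXIT_AWAITING_RESUME:
        return "awaiting_resume"  # not finished, so no completed_at stamp
    return "failed"


def run_stub():
    proc = subprocess.run([sys.executable, "-c", BOOT_STUB])
    return map_exit_code(proc.returncode)
```

Using a dedicated exit code lets the parent tell "paused on purpose" apart from both success and a crash without parsing output.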

Admin UI (hero_logic_admin)

  • /plays/{sid} → dedicated play detail page (the legacy redirect to the unified editor lives at /plays/{sid}/workflow for backward-compat)
  • Bottom-bar resizable + collapsible island with three tabs: Logs (live span events) · Pending resumes (per-kind form rendering, POSTs LogicService.play_resume via the /rpc proxy) · Events (chronicle of every asked/answered resume). State persists in localStorage.

E2E drivers (examples/)

  • pause_resume_demo.py — full pause → resume → replay; asserts the side-effecting step ran exactly once across both spawns.
  • pause_resume_prefill.py — headless via prefill_resumes_json (two pauses pre-answered, run completes in one pass).
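
The headless path can be pictured as seeding the resume cache before the first spawn, so no pause ever raises. In this toy the prefill JSON is assumed to be an object mapping pause name to payload; the real `prefill_resumes_json` shape is not shown in this PR, and `run_flow` and the pause names are hypothetical.

```python
import json


class AwaitingResume(Exception):
    """Hypothetical stand-in for the SDK's _AwaitingResume."""


def run_flow(prefill_resumes_json=None):
    # Assumption: prefill JSON is {"<pause name>": <payload>, ...}.
    resume_cache = json.loads(prefill_resumes_json) if prefill_resumes_json else {}

    def pause(name):
        if name in resume_cache:
            return resume_cache[name]
        raise AwaitingResume(name)

    try:
        color = pause("pick_color")
        qty = pause("pick_quantity")
    except AwaitingResume as exc:
        return {"status": "awaiting_resume", "waiting_on": str(exc)}
    return {"status": "success", "result": f"{qty} x {color}"}
```

Without prefill the toy stops at the first pause; with both answers supplied it finishes in one pass.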

Tests

New integration test:

pause_exits_75_records_pending_resume_then_replay_uses_cached_payload

proves the load-bearing contract end-to-end with a real python3 subprocess: exit 75 on first run, pending resume + step output persisted, replay returns the cached answer, and the side-effecting step does NOT re-run.

All 50 lib tests pass. (tests/e2e_create_event.rs fails to compile because service_agent_v3.py was never added in 4170665 — pre-existing on development, not caused by this PR.)

Test plan

  • cargo build --workspace clean
  • cargo test -p hero_logic --lib — 50 tests pass including the new pause/replay integration test
  • Bring up hero_logic_server locally and run python3 examples/pause_resume_demo.py → assert it prints ✓ pause→resume→replay round trip passed
  • python3 examples/pause_resume_prefill.py → assert it prints ✓ prefill_resumes_json round trip passed
  • Open /plays/{sid} for a paused play in the admin UI, fill the form in the Pending resumes tab, submit, watch the play complete

🤖 Generated with Claude Code

Closes #28. Implements the full resume-by-replay model end-to-end:
flows can call flow.pause(...) (or one of the ask_user.* helpers)
to ask a question, exit cleanly with state persisted, and resume
later via play_resume(play_sid, resume_id, payload) — a fresh
subprocess re-runs from the top with every previously-completed
step replayed from cache instead of re-executed.

Schema (crates/hero_logic/schemas/logic/logic.oschema):
- Add `awaiting_resume` to ExecutionStatus + `replayed` to SpanStatus.
- Add `ResumeRequest` type.
- Add `pending_resumes`, `received_resumes`, `step_outputs`,
  `total_cost_usd` fields to Play.
- Add RPC methods `play_resume`, `play_pending_resumes`. Extend
  `play_start` / `play_run_async` with `prefill_resumes_json` so
  non-interactive callers (E2E drivers, benchmarks) can pre-answer
  every pause and stay headless.

Python SDK (crates/hero_logic/sdk/python/hero_tracing.py):
- Add `flow.pause(name, schema, ui)` + `ask_user.text / number /
  choice / multi_choice / confirm` helpers.
- Add step memoization: every `@flow` call computes a deterministic
  step_key from (parent path, flow name, sorted-kwarg JSON) and
  short-circuits with the cached output on replay.
- Add `_AwaitingResume` exception bubbled by `flow.pause` when no
  cached payload exists; the boot stub catches it and exits with
  EXIT_AWAITING_RESUME (75 / EX_TEMPFAIL).
- Add replay-cache loader that pulls step_outputs + received_resumes
  from JSON files the executor stages before spawn.

Span socket (crates/hero_logic/src/engine/span_socket.rs):
- Add `step_output` event type → appends to Play.step_outputs.
- Add `pause` event type → appends a ResumeRequest to
  Play.pending_resumes (idempotent on id).
- Recognize "replayed" as a SpanStatus value.

Executor (crates/hero_logic/src/engine/python_executor.rs):
- Stage replay JSON files in the per-Play workdir and point the
  Python SDK at them via HERO_REPLAY_STEP_OUTPUTS_FILE /
  HERO_REPLAY_RESUMES_FILE.
- Boot stub catches `_AwaitingResume` and exits 75 cleanly.
- New ExecuteOptions fields: replay_step_outputs_json,
  replay_resumes_json.
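
On the SDK side, reading the staged files might look like the sketch below. The two environment variable names come from this commit; the helper names, the "missing file means empty cache" behaviour, and the assumption that each file holds a JSON object are illustrative only.

```python
import json
import os


def load_replay_cache(env_var):
    """Load one staged replay file, or return an empty cache if absent."""
    path = os.environ.get(env_var)
    if not path or not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def prime_caches():
    """Return (step_cache, resume_cache) primed from the staged files."""
    step_cache = load_replay_cache("HERO_REPLAY_STEP_OUTPUTS_FILE")
    resume_cache = load_replay_cache("HERO_REPLAY_RESUMES_FILE")
    return step_cache, resume_cache
```

On a first run neither variable is set, so both caches start empty and every step executes normally.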

RPC handlers (crates/hero_logic/src/logic/server/rpc.rs):
- play_resume: validate payload, move pending→received, flip
  Play.status to Running, respawn with primed caches. Returns
  {"ok":true,"resumed_at":<ms>}.
- play_pending_resumes: read open requests.
- play_start / play_run_async: accept prefill_resumes_json and
  seed Play.received_resumes from it.
- play_cancel: allow cancellation from awaiting_resume.
- spawn_python_flow: snapshot Play.step_outputs + received_resumes
  into ExecuteOptions before spawn. Map exit-code 75 →
  AwaitingResume (no completed_at stamp).
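
The state transition performed by `play_resume` can be summarized in a few lines. The real handler is Rust and validates the payload against the request's schema; this Python toy only checks that a payload is present, uses plain dicts with assumed field names, and omits the respawn step.

```python
import time


def play_resume(play, resume_id, payload):
    """Move one request from pending to received and mark the play running."""
    pending = play["pending_resumes"]
    match = next((r for r in pending if r["id"] == resume_id), None)
    if match is None:
        raise KeyError(f"no pending resume with id {resume_id!r}")
    if payload is None:
        # Placeholder for real schema validation.
        raise ValueError("payload must not be empty")
    pending.remove(match)
    now_ms = int(time.time() * 1000)
    play["received_resumes"].append({**match, "payload": payload})
    play["status"] = "running"
    # The real handler would respawn the subprocess with primed caches here.
    return {"ok": True, "resumed_at": now_ms}
```

The return value mirrors the `{"ok":true,"resumed_at":<ms>}` shape described above.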

Admin UI (crates/hero_logic_admin):
- Route /plays/{sid} back to the dedicated play detail page (this
  is where the bottom-bar island lives per PRD). Add
  /plays/{sid}/workflow as a legacy redirect.
- Plumb pending_resumes + received_resumes through play_detail
  template.
- Bottom-bar island with three tabs: Logs (live span events),
  Pending resumes (per-kind form rendering: text / number / choice /
  multi_choice / confirm → POSTs LogicService.play_resume via the
  /rpc proxy), Events (full ResumeRequest history). Resizable +
  collapsible; state persists in localStorage.

E2E driver scripts (examples/):
- pause_resume_demo.py: full pause → resume → replay round trip,
  asserts pick_record runs exactly once across both subprocess
  spawns.
- pause_resume_prefill.py: headless via prefill_resumes_json — two
  pauses pre-answered, play completes in a single pass without
  ever entering awaiting_resume.

Tests:
- New integration test
  `pause_exits_75_records_pending_resume_then_replay_uses_cached_payload`
  proves the load-bearing contract end-to-end with real python3:
  exit 75 on first run, pending resume + step output persisted,
  replay returns the cached answer, and the side-effecting step
  does NOT re-run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The auto-generated OSIS CRUD methods expose their argument under
'data', not 'obj'. Caught when running the example scripts against
a live server.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
timur merged commit ccaa6df004 into development 2026-05-13 15:37:03 +00:00
Reference
lhumina_code/hero_logic!31