feat: Logic / LogicVersion / Play model + single logic view (supersedes #38) #39

Open
opened 2026-05-14 13:10:41 +00:00 by timur · 2 comments

Final shape after the discussion in #38. This issue captures the full target state — schema, SDK, execution, UI — and is the implementation reference.

TL;DR

  • A Logic is a named, versioned concept. Each LogicVersion carries the typed I/O + source — different versions can have different signatures.
  • Every function call to another Logic creates a Play as a child of the caller's Play. Plays form a tree mirroring the invocation tree.
  • Primitive code (RPC calls, stdlib, raw lines) lives in the source; the UI parses it for visualization. No runtime span records.
  • Two UI views only: a dashboard listing Logics, and one logic view per Logic with a bottom play bar.

The system collapses to three rootobjects (Logic, LogicVersion, Play), one navigation rule (click a sub-Logic → open its view), and one execution rule (every invocation = a child Play).


1. Data model

Rootobjects

Logic = {
    sid: str
    name: str                   @index
    description: str            @index
    current_version_sid: str
    versions: [str]             # newest first
    tags: [str]
    created_at: otime
    updated_at: otime
}

LogicVersion = {
    sid: str
    logic_sid: str
    version_label: str          @index    # "v1", "v3.1-optimized"
    notes: str                            # version-specific notes
    inputs:  [FlowField]                  # typed input signature for THIS version
    outputs: [FlowField]                  # typed output signature for THIS version
    python_source: str
    created_at: otime
    updated_at: otime
}

Play = {
    sid: str
    logic_sid: str
    logic_version_sid: str
    name: str                   @index
    parent_play_sid: str        # empty for top-level
    sub_play_sids: [str]        # child Plays in invocation order

    status: PlayStatus          # pending | running | awaiting_resume | success | failed | cancelled | timed_out
    input_data: str             # JSON
    output_data: str            # JSON (accumulates)
    error_message: str

    started_at: u64
    completed_at: u64
    duration_ms: u64

    # captured stdout/stderr from this play's own body (not children's)
    logs: str

    # pause/resume + memoization
    pending_resumes: [ResumeRequest]
    received_resumes: str       # JSON {resume_id: payload}
    step_outputs: str           # JSON {step_key: output} — caches child play results
    prefill_only: bool

    # cost aggregates (lifted from RPC responses at transport layer)
    total_tokens_prompt: u32
    total_tokens_completion: u32
    total_cost_usd: f64

    created_at: otime
}

Value types

FlowField = {
    name, field_type, description, required, default
}

LogicExample [rootobject] = {
    sid: str
    logic_sid: str                       # always set
    logic_version_sid: str               # "" = applies to any version
    name: str                @index
    description: str         @index
    input_values: str                    # JSON {input_name: value}
    tags: [str]
    created_at: otime
    updated_at: otime
}

ResumeRequest = {
    id, name, schema, ui,
    asked_at, payload, resumed_at
}

PlayStatus = "pending" | "running" | "awaiting_resume" | "success" | "failed" | "cancelled" | "timed_out"
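For illustration, the status set above can be written as a Python enum. The terminal/non-terminal split is an assumption inferred from the rest of this issue (the walked example treats awaiting_resume as non-terminal), and `is_terminal` is a hypothetical helper name, not part of the schema.

```python
from enum import Enum


class PlayStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    AWAITING_RESUME = "awaiting_resume"
    SUCCESS = "success"
    FAILED = "failed"
    CANCELLED = "cancelled"
    TIMED_OUT = "timed_out"


# Assumption: these four are the states a Play never leaves.
TERMINAL = frozenset({
    PlayStatus.SUCCESS,
    PlayStatus.FAILED,
    PlayStatus.CANCELLED,
    PlayStatus.TIMED_OUT,
})


def is_terminal(status: PlayStatus) -> bool:
    """Hypothetical helper: True when the Play can no longer change state."""
    return status in TERMINAL
```

A UI could use such a predicate to decide whether to show the Cancel button.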

Deleted from today's schema

  • Workflow, WorkflowVersion rootobjects → renamed to Logic, LogicVersion.
  • Example rootobject → renamed to LogicExample rootobject; gains logic_version_sid for per-version scoping ("" = applies to all versions).
  • Benchmark rootobject + PerRunResult value type → benchmark stats are derived on the fly from queries over Plays.
  • Span value type, SpanKind and SpanStatus enums → gone entirely. No span concept.
  • Play.spans, Play.parent_span_id → gone (replaced by parent_play_sid + sub_play_sids).

Stored-data migration

  • Each existing Workflow becomes a Logic (same SID). Add #[serde(alias = "Workflow")] if the type tag is persisted; otherwise straight rename.
  • Each existing WorkflowVersion becomes a LogicVersion (same SID).
  • Existing Workflow.inputs/Workflow.outputs migrate to the current LogicVersion (since that's what the latest source matches). Older versions migrate with empty inputs/outputs; the editor can re-populate from a static parse or the user fills them.
  • For Workflows whose python_source contains multiple @flow-decorated functions: walk the AST, extract each @flow def into a fresh standalone Logic record, rewrite the parent's source to call them by name (e.g. logic.invoke("model_call", …)). The migration script seeds the new Logic records and updates the parent's source. One-shot at upgrade.
  • Existing Example records: rename in place to LogicExample. Existing workflow_sid/workflow_version_sid fields become logic_sid/logic_version_sid (serde aliases preserve back-compat). No data loss.
  • Existing Benchmark records: drop. Stats re-derive from Plays.
  • Existing Play records:
    • rename workflow_sid → logic_sid and workflow_version_sid → logic_version_sid (with serde aliases for back-compat)
    • drop spans entirely (the info we keep, status/timing/cost, already lives on Play directly). Map parent_span_id → parent_play_sid: set it to "" if it was a span on the same play, or to the parent play's sid if the play was launched with play_run_async(parent_span_id=…)
    • existing plays without sub_play_sids populated are inspectable as terminal records but won't have the new tree-drill-in; new plays after the migration have it.
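The AST walk for multi-function Workflows could start from something like the sketch below. It only locates top-level @flow-decorated functions and returns their source text; seeding Logic records and rewriting the parent's call sites are left out. `find_flow_functions` and `SAMPLE` are hypothetical names for illustration.

```python
import ast


def find_flow_functions(source: str) -> dict[str, str]:
    """Return {function_name: source_text} for each top-level @flow def."""
    tree = ast.parse(source)
    found = {}
    for node in tree.body:
        if not isinstance(node, ast.FunctionDef):
            continue
        for dec in node.decorator_list:
            # Accept both the bare `@flow` and the called `@flow(...)` form.
            target = dec.func if isinstance(dec, ast.Call) else dec
            if isinstance(target, ast.Name) and target.id == "flow":
                found[node.name] = ast.get_source_segment(source, node)
                break
    return found


SAMPLE = '''
from hero_tracing import flow

@flow(name="model_call")
def model_call(model, messages):
    return {"ok": True}

def helper(x):
    return x

@flow
def service_agent(prompt):
    return model_call("m", [prompt])
'''

flow_defs = find_flow_functions(SAMPLE)
```

Undecorated helpers such as `helper` are skipped, so the migration would still need a policy for code shared between the extracted functions.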

2. SDK surface (Python)

from hero_tracing import logic, ask_user

Decorator

@logic(name="select_services",
       description="…",
       inputs={"prompt": {"type":"string","required":True,"description":"…"},
               "catalog": {"type":"object","required":True}},
       outputs={"services": {"type":"array","description":"chosen service names"}},
       entry=True)
def select_services(prompt, catalog):
    ...

Every call to a @logic-decorated function creates a child Play of the currently-executing Play. The first decorated call (the entry) is the top-level Play. Recursion is the only composition primitive.

Invocation

result = logic.invoke("model_call", model="…", messages=[...])

Resolves by name against the Logic library. The child Play's logic_sid = the resolved Logic; logic_version_sid defaults to the resolved Logic's current_version_sid. The parent's sub_play_sids gets the child's sid appended in invocation order.

from <logic_name> import <logic_name> works too (sugar; meta-path resolver), but the canonical surface is logic.invoke(...).

Pause / resume

choice = ask_user.choice("Which version?", options=["new", "old"])
answer = logic.pause("approve_release", ui={"kind": "confirm", "prompt": "Ship?"})

logic.pause puts THIS Play into awaiting_resume. The ResumeRequest is recorded only on the current Play; a parent keeps blocking in its own play_wait on the child, and awaiting_resume propagates up the chain as described in section 3. Resumption posts to play_resume(child_play_sid, …).

Logging

logic.log("starting attempt 2")

Appends to the current Play's logs field with a timestamp. Replaces today's flow.current_span.log(…).

Errors

raise logic.Failed("model returned empty array")

Marks the Play failed with the given message (no traceback). Any other exception is captured with full traceback in error_message.
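A minimal sketch of that distinction, assuming a stand-in `Failed` class and a hypothetical `run_body` helper (the real SDK also re-raises so the parent sees the error, which is omitted here):

```python
import traceback


class Failed(Exception):
    """Stand-in for logic.Failed: a deliberate, message-only failure."""


def run_body(body):
    """Run a Play body; return (status, output, error_message)."""
    try:
        return "success", body(), ""
    except Failed as exc:
        # Deliberate failure: keep just the message.
        return "failed", None, str(exc)
    except Exception:
        # Unexpected failure: keep the full traceback text.
        return "failed", None, traceback.format_exc()
```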

Gone from the SDK

  • flow.step, flow.span — make a function instead.
  • flow.current_span.tag — write to logs or use logic.log.
  • instrument(client) — replaced by transport-level auto-trace.

The flow name is retained as an alias of logic for one release so stored python_source keeps working without immediate rewrite. (from hero_tracing import flow as logic is effectively how migrated sources read.)


3. Execution model

Spawn

Every logic.invoke(...) call:

  1. Resolves the name to a Logic + LogicVersion.
  2. Creates a child Play row: parent_play_sid = current play's sid, logic_sid + logic_version_sid set, input_data = the JSON of kwargs, status = running.
  3. Appends child's sid to parent's sub_play_sids.
  4. Executes the child's python_source in the SAME subprocess (in-process; no per-call subprocess fork). The child's body runs to completion.
  5. On clean return: sets child's output_data, status=success, completed_at, duration_ms. Returns the value to the caller.
  6. On logic.Failed or any exception: sets status=failed, error_message. Re-raises so the parent sees it (unless caught).

The top-level Play (the one started by LogicService.play_start) runs in its own subprocess as today. Sub-Plays do NOT fork subprocesses by default — they're records of in-process invocations.

Transport-level auto-trace

The JSON-RPC transport used by every generated Hero client wraps each call:

  • On request: stamps start time.
  • On response: stamps end time. If the response has usage.prompt_tokens / usage.completion_tokens (the aibroker shape), looks up the model's price and increments the current Play's total_tokens_* and total_cost_usd.
  • On error: propagates the exception to the caller; the current Play's status flows from there.

No per-call record is persisted. The aggregates on the Play are the only runtime trace. Per-call detail (params, result, exact timing of one of N similar calls) is reconstructable from source + logs if needed for debugging.
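A sketch of the response-side hook, under stated assumptions: responses are plain dicts, the model name is available as `response["model"]`, and `PRICES` is a hypothetical price table (USD per 1,000 tokens). Timing stamps are omitted since nothing per-call is persisted.

```python
from dataclasses import dataclass

# Hypothetical price table: USD per 1,000 tokens as (prompt, completion).
PRICES = {"example-model": (0.5, 1.5)}


@dataclass
class PlayTotals:
    total_tokens_prompt: int = 0
    total_tokens_completion: int = 0
    total_cost_usd: float = 0.0


def traced_call(play, send, method, params):
    """Call send(method, params) and lift usage from the response onto play."""
    response = send(method, params)  # exceptions propagate to the caller
    usage = response.get("usage") or {}
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    if prompt or completion:
        # Assumption: unknown models are counted but priced at zero.
        in_price, out_price = PRICES.get(response.get("model"), (0.0, 0.0))
        play.total_tokens_prompt += prompt
        play.total_tokens_completion += completion
        play.total_cost_usd += (prompt * in_price + completion * out_price) / 1000
    return response
```

Responses without a usage block pass through untouched, so non-AI RPC calls add nothing to the aggregates.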

Pause / resume

Each Play has its own resume state. When logic.pause(...) runs:

  1. SDK emits a pause event over the per-Play UDS socket with a deterministic resume_id (logic name + call sequence within this Play).
  2. Subprocess exits with code 75 if there's no cached answer; the executor reads the pause event, appends it to THIS Play's pending_resumes, and sets status=awaiting_resume.
  3. If this Play is a child of another running Play, the parent's play_wait (blocking call) detects the child entered awaiting_resume, stays parked, and propagates awaiting_resume up the chain.
  4. play_resume(play_sid, resume_id, payload) writes the answer to the matching Play's received_resumes, flips its status back to running, respawns its subprocess. The chain rewakens naturally.
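The SDK side of steps 1 and 2 might look like the sketch below. The resume_id format is a simplified assumption (only "logic name + call sequence" is specified), and `emit_event` is a hypothetical callback standing in for the per-Play UDS socket.

```python
import sys

PAUSE_EXIT_CODE = 75  # exit code named in step 2 above


def make_resume_id(logic_name: str, call_seq: int) -> str:
    """Assumed format; the real id scheme may carry more detail."""
    return f"{logic_name}#{call_seq}"


def pause(logic_name, call_seq, received_resumes, emit_event):
    """Return the cached answer, or emit a pause event and exit with 75."""
    resume_id = make_resume_id(logic_name, call_seq)
    if resume_id in received_resumes:
        # Replay after play_resume: the answer is already recorded.
        return received_resumes[resume_id]
    emit_event({"type": "pause", "resume_id": resume_id})
    sys.exit(PAUSE_EXIT_CODE)
```

Because the id is deterministic, the respawned subprocess reaches the same call, finds the answer in received_resumes, and continues without exiting again.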

Each Play has its own step_outputs cache (memoization of child Play results keyed by (version_sid, parent_path, child_logic_name, child_input_args)). On replay, child invocations whose key is in the cache short-circuit to the cached output without re-executing the child. Side-effects don't double-fire.
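One way to build such a key is to hash a canonical JSON encoding of the four components. Hashing with SHA-256 is an assumption made for this sketch, as are the helper names `step_key` and `invoke_memoized`; the issue only fixes which components go into the key.

```python
import hashlib
import json


def step_key(version_sid, parent_path, child_logic_name, child_input_args):
    """Build a stable cache key from the four components named above."""
    canonical = json.dumps(
        [version_sid, parent_path, child_logic_name, child_input_args],
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def invoke_memoized(step_outputs, key, run_child):
    """Return the cached output for key, running the child only on a miss."""
    if key not in step_outputs:
        step_outputs[key] = run_child()
    return step_outputs[key]
```

Sorting dict keys before hashing makes the key independent of keyword-argument order, so a replay with the same inputs hits the cache.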

Wall-clock + sandbox

Same as today's Tier 0 (hero_logic#14): per-subprocess wall-clock budget, address-space cap, fd cap, scrubbed env. Applies to the top-level Play's subprocess. Sub-Plays inherit by virtue of being in the same subprocess.


4. UI

Routes (final)

| URL | View |
|---|---|
| / | Dashboard (list of Logics) |
| /logics/{sid} | The logic view (only) |
| /logics/{sid}?play={play_sid} | Logic view with a Play overlaid |
| /api/plays/{sid}/overlay | JSON endpoint for live overlay polling |
| /rpc | JSON-RPC proxy to backend |

Everything else gets removed: /workflows/*, /examples, /plays, /plays/{sid}, the top toolbar on the editor.

Dashboard

A simple table or card grid:

| name | description | latest run | success rate (last 50) |
|---|---|---|---|
| service_agent | Self-contained AI agent… | 2 min ago (success) | 75% |
| model_call | Single AI model call… | 5 min ago (success) | 98% |
| ai_chat | Lower-level chat completion | 1 hr ago (failed) | 85% |

Plus a + New Logic button.

Logic view layout

┌──────────────────────────────────────────────────────────────────────┐
│  LEFT (info)        │   MIDDLE (flow / code)   │  RIGHT (stats)      │
│                     │                          │                     │
│  ↶ Breadcrumb       │   [Graph] [Code] [Split] │  ┌────────────────┐ │
│                     │                          │  │ Benchmark v3   │ │
│  Title              │   ▼ Flow                 │  │ runs: 50       │ │
│  Description        │   (sub-logics + RPC      │  │ success: 75%   │ │
│  Inputs (declared)  │    icons + control       │  │ p50 dur: 7.2s  │ │
│  Outputs (declared) │    flow viz)             │  │ avg cost: $.02 │ │
│  Versions           │                          │  └────────────────┘ │
│                     │                          │                     │
├─────────────────────┴──────────────────────────┴─────────────────────┤
│  PLAY BAR (always)                                                   │
│  Inputs + Examples + Plays │ Trace + Pause forms │ Output    [▶ Run] │
└──────────────────────────────────────────────────────────────────────┘

Left sidebar — info

  • Breadcrumb (when scoped to a sub-Logic mid-Play; shows ancestor logic names).
  • Title (Logic.name), description.
  • Inputs: declared field list — name, type, required, description. No value inputs here.
  • Outputs: declared field list — name, type, description. (Moved here from today's right sidebar.)
  • Versions: list of LogicVersions with the current one marked. Click to switch.

Middle — flow / code

  • Flow view (default): renders the Play tree visualization for the currently-overlaid Play (or, when idle, the most recent Play of this Logic). Static source parse provides structural backdrop (loops, conditionals, RPC call locations).
    • Each sub-Play renders as a drillable node — click navigates to /logics/{child_logic_sid}?play={child_play_sid}.
    • Each recognized RPC call from source renders as a leaf node with service.method label — click opens a side popover (shows runtime cost if it accumulated tokens; otherwise just the static source line).
    • Loops render as a bracket grouping consecutive sub-Plays of the same Logic; conditionals as branch points where only the taken path's sub-Plays are filled.
    • Replayed sub-Plays render with a dashed border + ↻ marker (status=success but populated via cached output).
  • Code view: Monaco bound to the current LogicVersion.python_source. Save → creates a new LogicVersion.
  • Split view: half each, click-to-jump-to-source from a flow node.
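The loop-bracket rule (consecutive sub-Plays of the same Logic) can be sketched with `itertools.groupby`. The input shape, a list of `(logic_name, play_sid)` pairs, is an assumption for illustration.

```python
from itertools import groupby


def group_consecutive(sub_plays):
    """Collapse runs of adjacent sub-Plays of the same Logic.

    sub_plays: list of (logic_name, play_sid) in invocation order.
    Returns a list of (logic_name, [play_sid, ...]).
    """
    return [
        (name, [sid for _, sid in run])
        for name, run in groupby(sub_plays, key=lambda p: p[0])
    ]
```

This only brackets back-to-back repeats of one Logic; a loop body that calls several Logics per iteration would need the static source parse to recover the grouping.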

Right sidebar — stats

Two modes:

  • Idle (no overlay): Benchmark card for current_version_sid. Stats derived on the fly from a query over recent Plays of this version. Refresh button.
  • Overlay active: This Play's stats. Status, duration, tokens, cost, attempts (count of sub-Plays of the same Logic, if it loops). Cancel button if the play is non-terminal.
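Deriving the idle-mode card from Plays might look like this sketch. Which statuses count as a "run" is an assumption (here: success, failed, timed_out; cancelled and in-flight Plays are excluded), and Plays are modeled as plain dicts.

```python
from statistics import median


def benchmark(plays):
    """Derive card stats from Play dicts (status, duration_ms, total_cost_usd)."""
    # Assumption: only these statuses count as completed runs.
    finished = [p for p in plays
                if p["status"] in ("success", "failed", "timed_out")]
    if not finished:
        return {"runs": 0, "success_rate": None,
                "p50_duration_ms": None, "avg_cost_usd": None}
    ok = sum(1 for p in finished if p["status"] == "success")
    return {
        "runs": len(finished),
        "success_rate": ok / len(finished),
        "p50_duration_ms": median(p["duration_ms"] for p in finished),
        "avg_cost_usd": sum(p["total_cost_usd"] for p in finished) / len(finished),
    }
```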

Bottom play bar — three columns

Left column

  • Inputs — one field per declared input, type-appropriate widget. Values are local-only (not stored on the Logic).
  • Examples ▾ (collapsible list of saved LogicExample records for this Logic). Click → populate inputs. "Save current as example" button.
  • Plays ▾ — collapsible list of recent Plays of this Logic. Click → overlay that Play.
  • ▶ Run button — validates inputs, calls play_start, overlays the new Play.

Middle column

  • Live trace as the Play executes: each new sub-Play creation + each completed sub-Play appended as a row (timestamp + name + status + duration).
  • When the overlaid Play (or one of its descendants in the chain) is in awaiting_resume: pause-form banner at the TOP of the column, accent border, persistent until answered. Forms render per ResumeRequest.ui.kind: text / number / choice / multi_choice / confirm. Submit posts play_resume with the matching play_sid (could be the overlaid play or a child along the chain).

Right column

  • output_data rendered as it accumulates. If the Play's LogicVersion.outputs declares fields, render one labeled card per output. Else raw JSON.

Sub-logic mid-Play navigation

Click a sub-Play node in the flow → navigate to /logics/{child_logic_sid}?play={child_play_sid}. The whole logic view rerenders for the child:

  • Left sidebar shows the child's metadata + breadcrumb (service_agent / select_services).
  • Middle shows the child's flow view rendered from the child Play's own data.
  • Right shows the child Play's stats.
  • Play bar's left column shows the child Logic's declared inputs prefilled from the child Play's input_data, plus the child's examples and plays. ▶ Run launches a fresh standalone top-level Play of the child Logic with those inputs (different from the overlaid child Play).
  • Play bar's middle shows the child Play's live trace (its own sub_play_sids + recognized RPC calls). If a pause is pending on the child or its descendants, the form renders here.
  • Play bar's right column shows the child Play's output.

Breadcrumb click → navigate up. Browser back works. URLs share.
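The breadcrumb can be derived by walking parent_play_sid up to the top-level Play. In this sketch Plays are plain dicts that carry a `logic_name` directly, which is a simplification: the real Play stores logic_sid, so a name lookup would be needed.

```python
def breadcrumb(plays, play_sid):
    """Return ancestor logic names, root first, ending at play_sid's Logic.

    plays: {sid: {"logic_name": str, "parent_play_sid": str}}
    """
    names = []
    seen = set()
    sid = play_sid
    while sid and sid not in seen:   # the guard stops on corrupt cyclic data
        seen.add(sid)
        play = plays[sid]
        names.append(play["logic_name"])
        sid = play["parent_play_sid"]
    return " / ".join(reversed(names))
```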


5. Walked example: service_agent

The existing service_agent workflow (8 @flow functions in one file) migrates to 8 separate Logic records:

service_agent       (entry; calls fetch_catalog, select_services, …)
fetch_catalog       (one logic.invoke or stdlib calls)
select_services     (calls logic.invoke("model_call", …) inside)
compile_stubs
service_code_gen    (calls logic.invoke("model_call", …))
script_execution
debug_feedback
summarize

Initial state — dashboard

Dashboard lists all 8 Logics. User clicks service_agent.

Logic view — idle

  • Left sidebar: title service_agent, description, inputs (prompt, code_gen_model), outputs (summary), versions (v3 current).
  • Middle: flow view shows fetch_catalog → select_services → for attempt in range(3): {service_code_gen → script_execution → debug_feedback} → summarize. Each is a clickable sub-Logic node.
  • Right sidebar: benchmark card. "v3: 75% success, p50 6.8s, avg cost $0.018."
  • Play bar: input fields empty, Examples = ["Calendar event", "Find contact", "Marketplace tokens"], Plays = ["02hq failed 2h ago", "02ca success 1d ago"], ▶ Run.

User picks "Calendar event" example, hits ▶ Run

Inputs auto-fill (prompt="Create a calendar event …", code_gen_model=""). Run → play_start → new top-level Play 02j7 created → subprocess spawns → Play overlays the view.

Right sidebar switches to "Play 02j7 — running 0.4s — 0 tokens".

Middle flow view starts updating live:

service_agent (Play 02j7)              running   0.4s
└─ fetch_catalog (Play 02j8)            ok        63ms

Then more arrive — select_services Play starts; child Plays accumulate.

Play bar middle column logs each event chronologically. Right column starts filling as output_data accumulates.

A pause fires inside select_services

The Play for select_services (say 02j9) emits a pause event. Its subprocess exits 75. Its pending_resumes gets the new ResumeRequest. Its status → awaiting_resume. The parent service_agent Play sees its child entered awaiting_resume (via play_wait returning non-terminal) and propagates: parent's status → awaiting_resume as well.

Play bar middle column shifts to show:

┃ ┃ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ┃
┃ ┃ ⏸ Pending pause in select_services (Play 02j9) ┃
┃ ┃                                              ┃
┃ ┃ hero_osis_calendar has these rootobjects matching 'event':
┃ ┃   ◉ Event                                    ┃
┃ ┃   ○ RecurringEvent                           ┃
┃ ┃   ○ Reminder                                 ┃
┃ ┃   ○ Cancel                                   ┃
┃ ┃   [Submit]                                   ┃
┃ ┃ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ┃

User clicks "Event", submits. The form posts play_resume(02j9, "select_services#0@ask_user.choice#0", "Event"). Server respawns select_services's subprocess with the cached answer; it returns through the chain → parent unblocks → execution continues.

User clicks service_code_gen node mid-Play

Page navigates to /logics/{service_code_gen_sid}?play=02jb (the child Play sid).

  • Left sidebar: service_code_gen metadata + breadcrumb service_agent / service_code_gen.
  • Middle: flow view for Play 02jb — shows its sub-Plays (model_call Play 02jc — running) and the source's primitive re.search(...) and ast.parse(...) calls as markers.
  • Right sidebar: stats for Play 02jb — running, 4s, 1284 prompt + 320 completion, $0.012.
  • Play bar: inputs prefilled (prompt="…", services=[…]); examples = service_code_gen's saved examples; ▶ Run starts a fresh standalone play of service_code_gen with those inputs.

Click service_agent in the breadcrumb → back to parent.

Play completes

Top-level Play 02j7 reaches status=success. Output column fills in: {"summary": "Calendar event 'Standup' created for tomorrow 10am."}. Right sidebar shows final stats: duration 11.4s, total cost $0.024. Plays list in the play bar prepends 02j7 success just now.


6. Implementation phases

  1. Schema migration — define new oschema, generate types, write the migration script that walks existing Workflows/WorkflowVersions/Examples/Benchmarks/Plays and converts them. Run the script as part of the upgrade.
  2. SDK rename + collapse: rename @flow → @logic and flow.* → logic.*. Drop flow.step, instrument(), span events. Add logic.log(). Each logic.invoke() creates a child Play.
  3. Runtime: child Plays — extend the executor to create a Play row per logic invocation. Hook the per-Play span socket (now per-Play event socket) so each Play receives its own events directly. Maintain parent_play_sid and sub_play_sids.
  4. Transport-level auto-trace — patch the openrpc transport once. Aibroker cost lifted onto the current Play's totals.
  5. Pause-chain propagation — top-level play_wait returns when a descendant Play enters awaiting_resume. play_resume works on any Play in the chain.
  6. Static source parser — Python source → flow graph (logic invocations + RPC calls + loop/branch structure). Used for the idle flow view.
  7. Logic view + play bar UI rebuild — three-region body + bottom three-column bar. Move toolbar functionality into the play bar. Move outputs from right sidebar to left.
  8. Sub-Logic navigation — URL routing, breadcrumb stack, scoped views per child Play.
  9. Benchmark stats derived from queries — drop the Benchmark rootobject; right sidebar runs play.list_by_version filtered queries.
  10. Cleanups — delete play_detail.html, examples.html, plays.html, workflows.html, and their handlers. 301 redirects from old URLs.

7. Acceptance

  • A @logic function call creates a child Play row whose parent_play_sid matches and whose sid appears in the parent's sub_play_sids.
  • The flow view of a Play renders sub-Plays as drillable nodes and source-parsed RPC calls as leaf nodes.
  • A pause inside a nested logic propagates awaiting_resume up to the top-level Play; answering it via play_resume(child_play_sid, …) resumes the chain.
  • Step memoization works at the Play level: replaying a Play whose step_outputs is populated short-circuits sub-Play invocations to their cached outputs.
  • Aibroker calls inside any Play accumulate total_tokens_* and total_cost_usd on that Play (not the parent).
  • / is the dashboard, /logics/{sid} is the only other view. /workflows/*, /examples, /plays, /plays/{sid} all 301 or 404.
  • The logic view has three regions (left info / middle flow / right stats) plus the bottom play bar. The top toolbar is gone.
  • Breadcrumb-click-back from a child Logic returns to the parent's logic view, with the parent's Play overlay still active.
  • Stored python_source written before the rename keeps loading (flow is exported as an alias of logic).
  • Stored Workflow/WorkflowVersion records load as Logic/LogicVersion via serde aliases.

8. Out of scope (future)

  • Visual code editor: drag-and-drop sub-Logic blocks, RPC blocks; conditional/loop/parallel visualization with data-flow lines; two-way graph↔code binding.
  • Auto-instrument generated RPC clients at build time (rather than at transport-level runtime).
  • Cross-process sub-Plays (spawned in their own subprocess) for isolation/sandboxing or distribution. The Play tree model supports this; the implementation phase defers it.

Supersedes #38.

# feat: Logic / LogicVersion / Play model + single logic view (supersedes #38) Final shape after the discussion in #38. This issue captures the full target state — schema, SDK, execution, UI — and is the implementation reference. ## TL;DR - A **Logic** is a named, versioned concept. Each **LogicVersion** carries the typed I/O + source — different versions can have different signatures. - Every function call to another Logic creates a **Play** as a child of the caller's Play. Plays form a tree mirroring the invocation tree. - Primitive code (RPC calls, stdlib, raw lines) lives in the source; the UI parses it for visualization. No runtime span records. - Two UI views only: a dashboard listing Logics, and one logic view per Logic with a bottom play bar. The system collapses to three rootobjects (Logic, LogicVersion, Play), one navigation rule (click a sub-Logic → open its view), and one execution rule (every invocation = a child Play). --- ## 1. Data model ### Rootobjects ``` Logic = { sid: str name: str @index description: str @index current_version_sid: str versions: [str] # newest first tags: [str] created_at: otime updated_at: otime } LogicVersion = { sid: str logic_sid: str version_label: str @index # "v1", "v3.1-optimized" notes: str # version-specific notes inputs: [FlowField] # typed input signature for THIS version outputs: [FlowField] # typed output signature for THIS version python_source: str created_at: otime updated_at: otime } Play = { sid: str logic_sid: str logic_version_sid: str name: str @index parent_play_sid: str # empty for top-level sub_play_sids: [str] # child Plays in invocation order status: PlayStatus # pending | running | awaiting_resume | success | failed | cancelled | timed_out input_data: str # JSON output_data: str # JSON (accumulates) error_message: str started_at: u64 completed_at: u64 duration_ms: u64 # captured stdout/stderr from this play's own body (not children's) logs: str # pause/resume + memoization pending_resumes: 
[ResumeRequest] received_resumes: str # JSON {resume_id: payload} step_outputs: str # JSON {step_key: output} — caches child play results prefill_only: bool # cost aggregates (lifted from RPC responses at transport layer) total_tokens_prompt: u32 total_tokens_completion: u32 total_cost_usd: f64 created_at: otime } ``` ### Value types ``` FlowField = { name, field_type, description, required, default } LogicExample [rootobject] = { sid: str logic_sid: str # always set logic_version_sid: str # "" = applies to any version name: str @index description: str @index input_values: str # JSON {input_name: value} tags: [str] created_at: otime updated_at: otime } ResumeRequest = { id, name, schema, ui, asked_at, payload, resumed_at } PlayStatus = "pending" | "running" | "awaiting_resume" | "success" | "failed" | "cancelled" | "timed_out" ``` ### Deleted from today's schema - `Workflow`, `WorkflowVersion` rootobjects → renamed to `Logic`, `LogicVersion`. - `Example` rootobject → renamed to `LogicExample` rootobject; gains `logic_version_sid` for per-version scoping (`""` = applies to all versions). - `Benchmark` rootobject + `PerRunResult` value type → benchmark stats are derived on the fly from queries over Plays. - `Span` value type, `SpanKind` and `SpanStatus` enums → gone entirely. No span concept. - `Play.spans`, `Play.parent_span_id` → gone (replaced by `parent_play_sid` + `sub_play_sids`). ### Stored-data migration - Each existing `Workflow` becomes a `Logic` (same SID). Add `#[serde(alias = "Workflow")]` if the type tag is persisted; otherwise straight rename. - Each existing `WorkflowVersion` becomes a `LogicVersion` (same SID). - Existing `Workflow.inputs`/`Workflow.outputs` migrate to the **current** `LogicVersion` (since that's what the latest source matches). Older versions migrate with empty `inputs`/`outputs`; the editor can re-populate from a static parse or the user fills them. 
- For Workflows whose `python_source` contains multiple `@flow`-decorated functions: walk the AST, extract each `@flow` `def` into a fresh standalone `Logic` record, rewrite the parent's source to call them by name (e.g. `logic.invoke("model_call", …)`). The migration script seeds the new Logic records and updates the parent's source. One-shot at upgrade. - Existing `Example` records: rename in place to `LogicExample`. Existing `workflow_sid`/`workflow_version_sid` fields become `logic_sid`/`logic_version_sid` (serde aliases preserve back-compat). No data loss. - Existing `Benchmark` records: drop. Stats re-derive from Plays. - Existing `Play` records: - rename `workflow_sid` → `logic_sid` and `workflow_version_sid` → `logic_version_sid` (with serde aliases for back-compat) - flatten `spans` into nothing (info we keep — status/timing/cost — already lives on Play directly; status of `parent_span_id` → `parent_play_sid` set to `""` if it was a span on the same play, or to the parent play's sid if the play was launched with `play_run_async(parent_span_id=…)`) - existing plays without `sub_play_sids` populated are inspectable as terminal records but won't have the new tree-drill-in; new plays after the migration have it. --- ## 2. SDK surface (Python) ```python from hero_tracing import logic, ask_user ``` ### Decorator ```python @logic(name="select_services", description="…", inputs={"prompt": {"type":"string","required":True,"description":"…"}, "catalog": {"type":"object","required":True}}, outputs={"services": {"type":"array","description":"chosen service names"}}, entry=True) def select_services(prompt, catalog): ... ``` Every call to a `@logic`-decorated function creates a child Play of the currently-executing Play. The first decorated call (the entry) is the top-level Play. Recursion is the only composition primitive. ### Invocation ```python result = logic.invoke("model_call", model="…", messages=[...]) ``` Resolves by name against the Logic library. 
The child Play's `logic_sid` = the resolved Logic; `logic_version_sid` defaults to the resolved Logic's `current_version_sid`. The parent's `sub_play_sids` gets the child's sid appended in invocation order. `from <logic_name> import <logic_name>` works too (sugar; meta-path resolver), but the canonical surface is `logic.invoke(...)`. ### Pause / resume ```python choice = ask_user.choice("Which version?", options=["new", "old"]) answer = logic.pause("approve_release", ui={"kind": "confirm", "prompt": "Ship?"}) ``` `logic.pause` puts THIS Play into `awaiting_resume` (only the current play; its parent stays in `running` and its own `play_wait` on the child blocks). Resumption posts to `play_resume(child_play_sid, …)`. ### Logging ```python logic.log("starting attempt 2") ``` Appends to the current Play's `logs` field with a timestamp. Replaces today's `flow.current_span.log(…)`. ### Errors ```python raise logic.Failed("model returned empty array") ``` Marks the Play `failed` with the given message (no traceback). Any other exception is captured with full traceback in `error_message`. ### Gone from the SDK - `flow.step`, `flow.span` — make a function instead. - `flow.current_span.tag` — write to `logs` or use `logic.log`. - `instrument(client)` — replaced by transport-level auto-trace. The `flow` name is retained as an alias of `logic` for one release so stored python_source keeps working without immediate rewrite. (`from hero_tracing import flow as logic` is effectively how migrated sources read.) --- ## 3. Execution model ### Spawn Every `logic.invoke(...)` call: 1. Resolves the name to a `Logic` + `LogicVersion`. 2. Creates a child Play row: `parent_play_sid` = current play's sid, `logic_sid` + `logic_version_sid` set, `input_data` = the JSON of kwargs, `status` = `running`. 3. Appends child's sid to parent's `sub_play_sids`. 4. Executes the child's `python_source` in the SAME subprocess (in-process; no per-call subprocess fork). The child's body runs to completion. 
5. On clean return: sets child's `output_data`, `status=success`, `completed_at`, `duration_ms`. Returns the value to the caller. 6. On `logic.Failed` or any exception: sets `status=failed`, `error_message`. Re-raises so the parent sees it (unless caught). The top-level Play (the one started by `LogicService.play_start`) runs in its own subprocess as today. Sub-Plays do NOT fork subprocesses by default — they're records of in-process invocations. ### Transport-level auto-trace The JSON-RPC transport used by every generated Hero client wraps each call: - On request: stamps start time. - On response: stamps end time. If the response has `usage.prompt_tokens` / `usage.completion_tokens` (the aibroker shape), looks up the model's price and increments the current Play's `total_tokens_*` and `total_cost_usd`. - On error: propagates the exception to the caller; the current Play's status flows from there. No per-call record is persisted. The aggregates on the Play are the only runtime trace. Per-call detail (params, result, exact timing of one of N similar calls) is reconstructable from source + logs if needed for debugging. ### Pause / resume Each Play has its own resume state. When `logic.pause(...)` runs: 1. SDK emits a `pause` event over the per-Play UDS socket with a deterministic `resume_id` (logic name + call sequence within this Play). 2. Subprocess exits with code 75 if there's no cached answer; the executor reads the pause event, appends to the THIS PLAY's `pending_resumes`, sets `status=awaiting_resume`. 3. If this Play is a child of another running Play, the parent's `play_wait` (blocking call) detects the child entered `awaiting_resume`, stays parked, and propagates `awaiting_resume` up the chain. 4. `play_resume(play_sid, resume_id, payload)` writes the answer to the matching Play's `received_resumes`, flips its status back to `running`, respawns its subprocess. The chain rewakens naturally. 
Each Play has its own `step_outputs` cache (memoization of child Play results keyed by `(version_sid, parent_path, child_logic_name, child_input_args)`). On replay, child invocations whose key is in the cache short-circuit to the cached output without re-executing the child. Side-effects don't double-fire.

### Wall-clock + sandbox

Same as today's Tier 0 (`hero_logic#14`): per-subprocess wall-clock budget, address-space cap, fd cap, scrubbed env. Applies to the top-level Play's subprocess. Sub-Plays inherit by virtue of being in the same subprocess.

---

## 4. UI

### Routes (final)

| URL | View |
|---|---|
| `/` | Dashboard — list of Logics |
| `/logics/{sid}` | The logic view (only) |
| `/logics/{sid}?play={play_sid}` | Logic view with a Play overlaid |
| `/api/plays/{sid}/overlay` | JSON endpoint for live overlay polling |
| `/rpc` | JSON-RPC proxy to backend |

Everything else gets removed: `/workflows/*`, `/examples`, `/plays`, `/plays/{sid}`, the top toolbar on the editor.

### Dashboard

A simple table or card grid:

| name | description | latest run | success rate (last 50) |
|---|---|---|---|
| service_agent | Self-contained AI agent… | 2 min ago (success) | 75% |
| model_call | Single AI model call… | 5 min ago (success) | 98% |
| ai_chat | Lower-level chat completion | 1 hr ago (failed) | 85% |

Plus a `+ New Logic` button.
### Logic view layout

```
┌──────────────────────────────────────────────────────────────────────┐
│ LEFT (info)         │ MIDDLE (flow / code)     │ RIGHT (stats)       │
│                     │                          │                     │
│ ↶ Breadcrumb        │ [Graph] [Code] [Split]   │ ┌────────────────┐  │
│                     │                          │ │ Benchmark v3   │  │
│ Title               │ ▼ Flow                   │ │ runs: 50       │  │
│ Description         │   (sub-logics + RPC      │ │ success: 75%   │  │
│ Inputs (declared)   │    icons + control       │ │ p50 dur: 7.2s  │  │
│ Outputs (declared)  │    flow viz)             │ │ avg cost: $.02 │  │
│ Versions            │                          │ └────────────────┘  │
│                     │                          │                     │
├─────────────────────┴──────────────────────────┴─────────────────────┤
│ PLAY BAR (always)                                                    │
│ Inputs + Examples + Plays │ Trace + Pause forms │ Output [▶ Run]     │
└──────────────────────────────────────────────────────────────────────┘
```

### Left sidebar — info

- Breadcrumb (when scoped to a sub-Logic mid-Play; shows ancestor logic names).
- Title (`Logic.name`), description.
- Inputs: declared field list — name, type, required, description. **No value inputs here.**
- Outputs: declared field list — name, type, description. (Moved here from today's right sidebar.)
- Versions: list of LogicVersions with the current one marked. Click to switch.

### Middle — flow / code

- **Flow view** (default): renders the Play tree visualization for the currently-overlaid Play (or, when idle, the most recent Play of this Logic). Static source parse provides structural backdrop (loops, conditionals, RPC call locations).
  - Each sub-Play renders as a drillable node — click navigates to `/logics/{child_logic_sid}?play={child_play_sid}`.
  - Each recognized RPC call from source renders as a ⚡ leaf node with `service.method` label — click opens a side popover (shows runtime cost if it accumulated tokens; otherwise just the static source line).
  - Loops render as a bracket grouping consecutive sub-Plays of the same Logic; conditionals as branch points where only the taken path's sub-Plays are filled.
  - Replayed sub-Plays render with a dashed border + ↻ marker (status=`success` but populated via cached output).
- **Code view**: Monaco bound to the current `LogicVersion.python_source`. Save → creates a new LogicVersion.
- **Split view**: half each, click-to-jump-to-source from a flow node.

### Right sidebar — stats

Two modes:

- **Idle** (no overlay): Benchmark card for `current_version_sid`. Stats derived on the fly from a query over recent Plays of this version. Refresh button.
- **Overlay active**: This Play's stats. Status, duration, tokens, cost, attempts (count of sub-Plays of the same Logic, if it loops). Cancel button if the play is non-terminal.

### Bottom play bar — three columns

#### Left column

- **Inputs** — one field per declared input, type-appropriate widget. Values are local-only (not stored on the Logic).
- **Examples ▾** — collapsible list of saved `Logic.examples`. Click → populate inputs. "Save current as example" button.
- **Plays ▾** — collapsible list of recent Plays of this Logic. Click → overlay that Play.
- **▶ Run button** — validates inputs, calls `play_start`, overlays the new Play.

#### Middle column

- Live trace as the Play executes: each new sub-Play creation + each completed sub-Play appended as a row (timestamp + name + status + duration).
- When the overlaid Play (or one of its descendants in the chain) is in `awaiting_resume`: pause-form banner at the TOP of the column, accent border, persistent until answered. Forms render per `ResumeRequest.ui.kind`: text / number / choice / multi_choice / confirm. Submit posts `play_resume` with the matching `play_sid` (could be the overlaid play or a child along the chain).

#### Right column

- `output_data` rendered as it accumulates. If `Logic.outputs` declares fields, render one labeled card per output. Else raw JSON.

### Sub-logic mid-Play navigation

Click a sub-Play node in the flow → navigate to `/logics/{child_logic_sid}?play={child_play_sid}`.
The whole logic view rerenders for the child:

- Left sidebar shows the child's metadata + breadcrumb (`service_agent / select_services`).
- Middle shows the child's flow view rendered from the child Play's own data.
- Right shows the child Play's stats.
- Play bar's left column shows the child Logic's declared inputs prefilled from the child Play's `input_data`, plus the child's examples and plays. ▶ Run launches a fresh standalone top-level Play of the child Logic with those inputs (different from the overlaid child Play).
- Play bar's middle shows the child Play's live trace (its own sub_play_sids + recognized RPC calls). If a pause is pending on the child or its descendants, the form renders here.
- Play bar's right column shows the child Play's output.

Breadcrumb click → navigate up. Browser back works. URLs share.

---

## 5. Walked example: service_agent

The existing `service_agent` workflow (8 `@flow` functions in one file) migrates to 8 separate Logic records:

```
service_agent         (entry; calls fetch_catalog, select_services, …)
fetch_catalog         (one logic.invoke or stdlib calls)
select_services       (calls logic.invoke("model_call", …) inside)
compile_stubs
service_code_gen      (calls logic.invoke("model_call", …))
script_execution
debug_feedback
summarize
```

### Initial state — dashboard

Dashboard lists all 8 Logics. User clicks `service_agent`.

### Logic view — idle

- Left sidebar: title `service_agent`, description, inputs (prompt, code_gen_model), outputs (summary), versions (v3 current).
- Middle: flow view shows `fetch_catalog → select_services → for attempt in range(3): {service_code_gen → script_execution → debug_feedback} → summarize`. Each is a clickable sub-Logic node.
- Right sidebar: benchmark card. "v3: 75% success, p50 6.8s, avg cost $0.018."
- Play bar: input fields empty, Examples = ["Calendar event", "Find contact", "Marketplace tokens"], Plays = ["02hq failed 2h ago", "02ca success 1d ago"], ▶ Run.
### User picks "Calendar event" example, hits ▶ Run

Inputs auto-fill (`prompt="Create a calendar event …"`, `model=""`). Run → `play_start` → new top-level Play `02j7` created → subprocess spawns → Play overlays the view.

Right sidebar switches to "Play 02j7 — running 0.4s — 0 tokens".

Middle flow view starts updating live:

```
service_agent (Play 02j7)        running 0.4s
└─ fetch_catalog (Play 02j8)     ok      63ms
```

Then more arrive — `select_services` Play starts; child Plays accumulate. Play bar middle column logs each event chronologically. Right column starts filling as `output_data` accumulates.

### A pause fires inside `select_services`

The Play for `select_services` (say `02j9`) emits a pause event. Its subprocess exits 75. Its `pending_resumes` gets the new ResumeRequest. Its status → `awaiting_resume`.

The parent `service_agent` Play sees its child entered `awaiting_resume` (via `play_wait` returning non-terminal) and propagates: parent's status → `awaiting_resume` as well.

Play bar middle column shifts to show:

```
┃ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ┃
┃ ⏸ Pending pause in select_services (Play 02j9)               ┃
┃                                                              ┃
┃ hero_osis_calendar has these rootobjects matching 'event':   ┃
┃   ◉ Event                                                    ┃
┃   ○ RecurringEvent                                           ┃
┃   ○ Reminder                                                 ┃
┃   ○ Cancel                                                   ┃
┃   [Submit]                                                   ┃
┃ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ┃
```

User clicks "Event", submits. The form posts `play_resume(02j9, "select_services#0@ask_user.choice#0", "Event")`. Server respawns `select_services`'s subprocess with the cached answer; it returns through the chain → parent unblocks → execution continues.

### User clicks `service_code_gen` node mid-Play

Page navigates to `/logics/{service_code_gen_sid}?play=02jb` (the child Play sid).

- Left sidebar: `service_code_gen` metadata + breadcrumb `service_agent / service_code_gen`.
- Middle: flow view for Play 02jb — shows its sub-Plays (`model_call` Play 02jc — running) and the source's primitive `re.search(...)` and `ast.parse(...)` calls as ⚡ markers.
- Right sidebar: stats for Play 02jb — `running, 4s, 1284 prompt + 320 completion, $0.012`.
- Play bar: inputs prefilled (`prompt="…"`, `services=[…]`); examples = service_code_gen's saved examples; ▶ Run starts a fresh standalone play of service_code_gen with those inputs.

Click `service_agent` in the breadcrumb → back to parent.

### Play completes

Top-level Play 02j7 reaches `status=success`. Output column fills in: `{"summary": "Calendar event 'Standup' created for tomorrow 10am."}`. Right sidebar shows final stats: duration 11.4s, total cost $0.024. Plays list in the play bar prepends `02j7 success just now`.

---

## 6. Implementation phases

1. **Schema migration** — define new oschema, generate types, write the migration script that walks existing Workflows/WorkflowVersions/Examples/Benchmarks/Plays and converts them. Run the script as part of the upgrade.
2. **SDK rename + collapse** — rename `@flow` → `@logic`, `flow.*` → `logic.*`. Drop `flow.step`, `instrument()`, span events. Add `logic.log()`. Each `logic.invoke()` creates a child Play.
3. **Runtime: child Plays** — extend the executor to create a Play row per logic invocation. Hook the per-Play span socket (now per-Play event socket) so each Play receives its own events directly. Maintain `parent_play_sid` and `sub_play_sids`.
4. **Transport-level auto-trace** — patch the openrpc transport once. Aibroker cost lifted onto the current Play's totals.
5. **Pause-chain propagation** — top-level `play_wait` returns when a descendant Play enters `awaiting_resume`. `play_resume` works on any Play in the chain.
6. **Static source parser** — Python source → flow graph (logic invocations + RPC calls + loop/branch structure). Used for the idle flow view.
7. **Logic view + play bar UI rebuild** — three-region body + bottom three-column bar. Move toolbar functionality into the play bar. Move outputs from right sidebar to left.
8. **Sub-Logic navigation** — URL routing, breadcrumb stack, scoped views per child Play.
9. **Benchmark stats derived from queries** — drop the Benchmark rootobject; right sidebar runs `play.list_by_version` filtered queries.
10. **Cleanups** — delete `play_detail.html`, `examples.html`, `plays.html`, `workflows.html`, and their handlers. 301 redirects from old URLs.

## 7. Acceptance

- A `@logic` function call creates a child Play row whose `parent_play_sid` matches and whose sid appears in the parent's `sub_play_sids`.
- The flow view of a Play renders sub-Plays as drillable nodes and source-parsed RPC calls as ⚡ leaf nodes.
- A pause inside a nested logic propagates `awaiting_resume` up to the top-level Play; answering it via `play_resume(child_play_sid, …)` resumes the chain.
- Step memoization works at the Play level: replaying a Play whose `step_outputs` is populated short-circuits sub-Play invocations to their cached outputs.
- Aibroker calls inside any Play accumulate `total_tokens_*` and `total_cost_usd` on that Play (not the parent).
- `/` is the dashboard, `/logics/{sid}` is the only other view. `/workflows/*`, `/examples`, `/plays`, `/plays/{sid}` all 301 or 404.
- The logic view has three regions (left info / middle flow / right stats) plus the bottom play bar. The top toolbar is gone.
- Breadcrumb-click-back from a child Logic returns to the parent's logic view, with the parent's Play overlay still active.
- Stored python_source written before the rename keeps loading (`flow` is exported as an alias of `logic`).
- Stored Workflow/WorkflowVersion records load as Logic/LogicVersion via serde aliases.

## 8. Out of scope (future)

- Visual code editor: drag-and-drop sub-Logic blocks, RPC blocks; conditional/loop/parallel visualization with data-flow lines; two-way graph↔code binding.
- Auto-instrument generated RPC clients at build time (rather than at transport-level runtime).
- Cross-process sub-Plays (spawned in their own subprocess) for isolation/sandboxing or distribution. The Play tree model supports this; the implementation phase defers it.
---

Supersedes #38.

Handoff — phases 1–3 + partial 10 landed in PR #40

PR: #40 (https://forge.ourworld.tf/lhumina_code/hero_logic/pulls/40)
Branch: feat/39-logic-model (4 commits ahead of development). Do not branch off again — continue on this branch.

git fetch origin
git checkout feat/39-logic-model
git pull

PR will stay open (not merged) until the UI rebuild lands so the operator can smoke-test end-to-end before any of this hits development.


What is done

| Phase | Commit | Summary |
|---|---|---|
| 1 — schema | 5b5a36b | Workflow→Logic, WorkflowVersion→LogicVersion, Example→LogicExample. inputs/outputs moved from Logic onto LogicVersion (per-version signatures). Span / SpanKind / SpanStatus / Benchmark / PerRunResult deleted. Play.spans/parent_span_id replaced with parent_play_sid + sub_play_sids. RPC handlers ported (workflow_*→logic_*, benchmark_*/pick_version/span_push dropped). span_socket.rs slimmed to a per-Play event listener for log/output/step_output/pause/cost events only. python_executor.rs field renames + new HERO_LOGIC_* env vars. seed.rs emits new RPC names; inputs/outputs go on LogicVersion. Wipe-and-reseed migration (no AST splitter). |
| 2 — SDK | 736c7fa | logic namespace as alias of flow (one-release back-compat). logic.log(text) emits a Play-level log event. Module docstring rewritten for the issue #39 event protocol. |
| 3 — runtime child Plays | eb06676 | Every logic.invoke(name, **kwargs) now creates a child Play row via play.set (no new RPC needed), pushes a _current_play_sid contextvar, executes in-process, then finalizes the row. _invoke_sid_cache remembers the resolved (logic_sid, version_sid). Memoization short-circuits via _STEP_CACHE on replays. |
| 10 (partial) | 0f3ccd7, d10ac9b | Removed 6 dead admin handlers + span-based integration tests + the orphaned tests/e2e_create_event.rs (referenced a service_agent_v3.py that doesn't exist). |

Test state: cargo test --lib -p hero_logic → 29 passed / 0 failed. cargo build → 2 harmless warnings (backend_online field, will clear in Phase 7).

What is broken right now (expected)

  • Admin UI is non-functional. Templates still hard-code workflow_sid / workflow_version_sid / workflow.set / workflowversion.set / example.* / /workflows/* / /plays/* routes that no longer exist server-side. The templates compile (they're typed against their own DTO structs, not the Rust schema types), but every RPC call from the JS will 404 / return method-not-found. Do not try to fix piecemeal — the UI rebuild (Phase 7) replaces these wholesale per the issue spec.
  • seed_flows/service_agent.py has 8 @flow-decorated functions in one file. Per the issue spec, these should be split into 8 separate Logic records (one file each). Currently they still seed as one big Logic. The flow=logic alias means the file still loads, but the Play tree will only show one root Play instead of 8 child Plays per top-level call.
  • flow.step / flow.span / flow.current_span / instrument() — kept as a back-compat surface but their events are now silently dropped by the per-Play event listener. flow.current_span.log(...) calls in existing seed_flows are no-ops.
  • Cost aggregates (Play.total_tokens_*, total_cost_usd) are wired on the listener side (the cost event handler exists) but nothing emits them yet — needs Phase 4.

Critical files to read first

  1. crates/hero_logic/schemas/logic/logic.oschema — source of truth for types + RPC. Edit this and regen via cargo build before changing any handler.
  2. PRD.md — full design spec (the version still in the repo is pre-#39, but most of it is still accurate; #39's issue body is the canonical override).
  3. crates/hero_logic/src/logic/server/rpc.rs — handwritten RPC handlers. New methods that need adding: see Phase 5 below.
  4. crates/hero_logic/src/engine/span_socket.rs — minimal per-Play event listener. New event types are easy to add: add a variant to PlayEvent + a handler arm in apply_event_to_play.
  5. crates/hero_logic/src/engine/python_executor.rs — subprocess spawn + sandbox. The integration_tests module is gone; new e2e tests should go here or in tests/.
  6. crates/hero_logic/sdk/python/hero_tracing.py — the Python SDK. _Flow.invoke (line ~1100) is where the new child-Play creation happens. _create_child_play, _append_sub_play_sid, _finalize_child_play helpers live just above.

Remaining work (the punch list)

Per the issue body, in dependency order:

Phase 4 — transport-level auto-trace

  • Patch the SDK's _logic_rpc (or whatever RPC transport _resolve_flow_entry ends up using for aibroker calls; check router_python_dir generated clients) so that on every RPC response: if the response shape matches the aibroker usage.{prompt_tokens, completion_tokens} shape, look up the model's price (modelsconfig.yml from hero_aibroker) and emit a {"type":"cost", "prompt_tokens":..., "completion_tokens":..., "cost_usd":...} event to the per-Play socket. The listener already handles this event and accumulates onto Play.total_*.
  • No per-call records persisted. Per-Play totals only.
  • Touch: hero_tracing.py (transport wrapper or post-call hook).
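The Phase 4 hook can be pictured as a pure function from an RPC response to an optional `cost` event. This is a sketch under assumptions: `cost_event_from_response` and `PRICES_PER_MTOK` are hypothetical names, the price table is made up (the real prices would come from hero_aibroker's modelsconfig.yml, whose layout is not shown in this issue), and sending the event over the per-Play socket is left out.

```python
from typing import Any, Optional

# Hypothetical price table in USD per 1M tokens (illustrative values only).
PRICES_PER_MTOK = {"example-model": {"prompt": 1.0, "completion": 2.0}}


def cost_event_from_response(model: str, response: Any) -> Optional[dict]:
    """Build a cost event if the response carries an aibroker-style usage block."""
    if not isinstance(response, dict):
        return None
    usage = response.get("usage")
    if not isinstance(usage, dict):
        return None
    prompt = usage.get("prompt_tokens")
    completion = usage.get("completion_tokens")
    if not isinstance(prompt, int) or not isinstance(completion, int):
        return None  # not the usage shape we recognise: emit nothing
    # Unknown models are priced at zero here; that fallback is an assumption.
    price = PRICES_PER_MTOK.get(model, {"prompt": 0.0, "completion": 0.0})
    cost_usd = (prompt * price["prompt"] + completion * price["completion"]) / 1_000_000
    return {
        "type": "cost",
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "cost_usd": cost_usd,
    }


event = cost_event_from_response(
    "example-model", {"usage": {"prompt_tokens": 1000, "completion_tokens": 500}}
)
```

Keeping the shape check separate from the socket write makes the hook easy to unit-test, and responses from non-aibroker services simply yield `None`.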

Phase 5 — pause-chain propagation

  • play_wait on a parent must return when any descendant Play enters awaiting_resume (not just when the parent itself does). Currently play_wait only polls the named Play's status. New behavior: walk sub_play_sids recursively each poll and treat any descendant's awaiting_resume as a non-terminal signal that propagates.
  • play_resume(play_sid, ...) already accepts any Play sid; verify it correctly respawns the child's subprocess (not the parent's). Look at spawn_python_flow call in play_resume — it uses play.logic_sid so this should already work, but worth a test.
  • Each Play has its own pending_resumes / received_resumes / step_outputs. The schema already supports this.
  • Touch: crates/hero_logic/src/logic/server/rpc.rs (play_wait body).
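The recursive walk described for `play_wait` could look roughly like the sketch below. It is written in Python for illustration even though the handler lives in `rpc.rs`. `find_awaiting_descendant` and the `get_play` lookup callback are hypothetical; only the `status` and `sub_play_sids` field names come from the schema.

```python
from typing import Callable, Optional


def find_awaiting_descendant(
    play_sid: str, get_play: Callable[[str], dict]
) -> Optional[str]:
    """Return the sid of the first Play (self or descendant) in awaiting_resume."""
    stack = [play_sid]
    seen = set()  # guards against accidental cycles in sub_play_sids
    while stack:
        sid = stack.pop()
        if sid in seen:
            continue
        seen.add(sid)
        play = get_play(sid)
        if play["status"] == "awaiting_resume":
            return sid
        # Reverse so children are visited in invocation order.
        stack.extend(reversed(play.get("sub_play_sids", [])))
    return None


# Tiny in-memory stand-in for the Play store, using sids from the walked example.
PLAYS = {
    "02j7": {"status": "running", "sub_play_sids": ["02j8", "02j9"]},
    "02j8": {"status": "success", "sub_play_sids": []},
    "02j9": {"status": "awaiting_resume", "sub_play_sids": []},
}
waiting_sid = find_awaiting_descendant("02j7", PLAYS.__getitem__)
```

Returning the sid rather than a boolean lets the caller hand the UI the exact Play that `play_resume` should target.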

Phase 6 — static Python source parser → flow graph

  • New helper (probably Python, exposed via a subprocess or a Rust ast-py-like crate) that walks LogicVersion.python_source AST and emits a JSON graph of: logic.invoke target names, recognized RPC client method calls (look for Hero*Client(...) instances + their method calls), loop/conditional structure.
  • Used by the idle flow view in the UI as a structural backdrop.
  • Suggested API: logic_parse_graph(version_sid) -> str (returns JSON). Could be pre-computed at logicversion_set time and cached on the record, but a new field would require a schema bump.
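For the static parser, Python's standard `ast` module is enough to find `logic.invoke("name", ...)` calls and to note whether they sit inside a loop. The sketch below covers only that slice; RPC client detection and conditionals are omitted. `InvokeCollector`, `parse_graph`, and the node dict shape are hypothetical and are not the proposed `logic_parse_graph` API.

```python
import ast


class InvokeCollector(ast.NodeVisitor):
    """Collect logic.invoke / flow.invoke calls whose first argument is a string literal."""

    def __init__(self):
        self.loop_depth = 0
        self.nodes = []

    def _visit_loop(self, node):
        # Anything visited below a for/while counts as "in a loop".
        self.loop_depth += 1
        self.generic_visit(node)
        self.loop_depth -= 1

    visit_For = _visit_loop
    visit_While = _visit_loop

    def visit_Call(self, node):
        func = node.func
        is_invoke = (
            isinstance(func, ast.Attribute)
            and func.attr == "invoke"
            and isinstance(func.value, ast.Name)
            and func.value.id in ("logic", "flow")
        )
        if is_invoke and node.args:
            first = node.args[0]
            if isinstance(first, ast.Constant) and isinstance(first.value, str):
                self.nodes.append(
                    {"target": first.value, "line": node.lineno, "in_loop": self.loop_depth > 0}
                )
        self.generic_visit(node)  # keep walking into nested calls


def parse_graph(python_source: str) -> list:
    collector = InvokeCollector()
    collector.visit(ast.parse(python_source))
    return collector.nodes


SOURCE = '''
catalog = logic.invoke("fetch_catalog")
for attempt in range(3):
    code = logic.invoke("service_code_gen", prompt="x")
logic.invoke("summarize")
'''
graph = parse_graph(SOURCE)
```

Invocations whose target name is computed at runtime are skipped by this approach, so the UI would still need the live Play tree to show them.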

Phase 7 — Logic view + play bar UI (THE BIG ONE)

The issue body has a detailed layout spec. Rough sketch:

  • New route: /logics/{sid} and /logics/{sid}?play={play_sid}
  • New template: logic_view.html — three regions:
    • Left: breadcrumb + title + description + Inputs (declared from LogicVersion) + Outputs (declared from LogicVersion) + Versions list
    • Middle: Flow view (default — renders Play tree + RPC call markers from the static parser), Code view (Monaco bound to python_source), Split view. Each sub-Play in the flow view is a drillable node.
    • Right: stats card. Idle → success rate + p50 duration + avg cost over recent N Plays of the current version (Phase 9). Overlay → this Play's stats.
  • New template: logic_view.html bottom play bar — three columns:
    • Left: Inputs (per-field widgets driven by LogicVersion.inputs), Examples dropdown (LogicExample records), Plays dropdown (play_list_for_logic recent), ▶ Run button
    • Middle: live trace (each sub-Play creation/completion as a row), pause-form banner at top when awaiting_resume is pending (renders per ResumeRequest.ui.kind)
    • Right: output_data rendered as labeled cards per declared output field
  • Delete /workflows/*, /plays/*, /examples routes + their templates (workflows.html, workflow_editor.html, plays.html, play_detail.html, examples.html). The workflow_editor.html is the closest model for the new logic_view.html — start by copy-renaming and reshaping rather than from scratch.

Phase 8 — sub-Logic navigation

  • Click a sub-Play node in the flow view → navigate to /logics/{child_logic_sid}?play={child_play_sid}. Breadcrumb shows ancestor logic names (walk parent_play_sid chain). Browser back works.
  • Mostly JS in the template.
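The breadcrumb is a walk up `parent_play_sid` until it is empty. The sketch below shows the data logic in Python, although the real implementation would mostly be template JS. `breadcrumb`, `get_play`, and `get_logic_name` are hypothetical helpers standing in for the RPC lookups.

```python
from typing import Callable, List


def breadcrumb(
    play_sid: str,
    get_play: Callable[[str], dict],
    get_logic_name: Callable[[str], str],
) -> List[str]:
    """Return ancestor logic names from the root Play down to play_sid."""
    names = []
    seen = set()  # defensive: stop if the parent chain ever loops
    sid = play_sid
    while sid and sid not in seen:
        seen.add(sid)
        play = get_play(sid)
        names.append(get_logic_name(play["logic_sid"]))
        sid = play.get("parent_play_sid", "")  # empty for top-level Plays
    return list(reversed(names))


# In-memory stand-ins; the logic sids here are made up for the example.
PLAYS = {
    "02j7": {"logic_sid": "L1", "parent_play_sid": ""},
    "02jb": {"logic_sid": "L2", "parent_play_sid": "02j7"},
}
LOGIC_NAMES = {"L1": "service_agent", "L2": "service_code_gen"}
crumbs = breadcrumb("02jb", PLAYS.__getitem__, LOGIC_NAMES.__getitem__)
```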

Phase 9 — stats from Play queries

  • Idle right-sidebar card runs play_list_for_version (already added in Phase 1) + a few play_status calls to compute success rate / p50 duration / avg cost over the last 50 Plays. Pure read-side aggregation; no new RPC needed beyond what exists.
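The read-side aggregation for the benchmark card is small enough to sketch. This assumes the Play list arrives newest first and that only `success` and `failed` Plays count toward the stats; both are assumptions, as is the `benchmark_stats` name. The field names `status`, `duration_ms`, and `total_cost_usd` come from the issue body.

```python
import statistics
from typing import List, Optional


def benchmark_stats(plays: List[dict], limit: int = 50) -> Optional[dict]:
    """Success rate, median duration and average cost over the most recent Plays."""
    recent = plays[:limit]  # assumes newest-first ordering
    finished = [p for p in recent if p["status"] in ("success", "failed")]
    if not finished:
        return None  # nothing to show yet
    successes = sum(1 for p in finished if p["status"] == "success")
    return {
        "runs": len(finished),
        "success_rate": successes / len(finished),
        "p50_duration_ms": statistics.median(p["duration_ms"] for p in finished),
        "avg_cost_usd": sum(p["total_cost_usd"] for p in finished) / len(finished),
    }


sample = [
    {"status": "success", "duration_ms": 100, "total_cost_usd": 0.01},
    {"status": "failed", "duration_ms": 200, "total_cost_usd": 0.02},
    {"status": "success", "duration_ms": 300, "total_cost_usd": 0.03},
    {"status": "success", "duration_ms": 400, "total_cost_usd": 0.04},
    {"status": "running", "duration_ms": 0, "total_cost_usd": 0.0},
]
stats = benchmark_stats(sample)
```

Excluding non-terminal Plays keeps a currently running Play from dragging the median duration toward zero.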

Phase 10 remainder

  • Delete the 4 dead template files (workflows.html, plays.html, play_detail.html, examples.html) + their Template-derive structs + route registrations in crates/hero_logic_admin/src/main.rs / routes.rs
  • Dashboard at /: list of Logics with latest-run + success-rate columns. Replaces the current index.
  • 301 redirects from old URLs (/workflows/{sid} → /logics/{sid}, /plays/{sid} → /logics/{logic_sid}?play={play_sid}).
  • Update examples/ driver scripts (pause_resume_demo.py, pause_resume_prefill.py) to use the new logic_* RPC method names.
  • Split seed_flows/service_agent.py into 8 per-logic files + update seed.rs to register all 8.
  • Migrate seed_flows from flow.* to logic.* (cosmetic but matches the spec; flow alias keeps them working either way for one release).

Useful commands

# build + test the library
cargo test --lib -p hero_logic

# regenerate types after editing the oschema (just builds — codegen runs in build.rs)
cargo build -p hero_logic

# run the server in foreground for UI testing
make dev

# run the admin UI in foreground
make dev-ui

# wipe the local OSIS data (since the schema reshaped — old data is unreadable)
rm -rf ~/.hero/var/osis-data/hero_logic/  # or equivalent — check OServer's data dir
hero_proc service hero_logic restart

Gotchas

  • OTOML round-trip drops empty arrays/strings on some shapes. The _create_child_play / _append_sub_play_sid helpers in the SDK do read-modify-write on the parent Play to update sub_play_sids — if you see lost field updates after a save, suspect OTOML serialization. The encode_python_source_b64 workaround in python_executor.rs is a precedent.
  • Generated handlers expect trigger fns to exist on OsisLogic — when you add a new rootobject to the oschema, the build will fail with "no function found" for {rootobject}_trigger_* until you add no-op trigger stubs to the bottom of rpc.rs (look for the existing impl OsisLogic { ... } block).
  • flow and logic are the same singleton: logic is flow evaluates to True in Python. Don't break this; the alias is the one-release back-compat contract.
  • Don't edit *_generated.rs / openrpc.json — they're rewritten by build.rs on every cargo build. Edit the oschema and regen.

Original issue spec stands as the canonical target. PR #40 covers ~30% of it (the foundational reshape); the UI rebuild is the bulk of what's left.


Handoff — session 2 (Phases 4-10 + lifecycle migration; UI regressed visually)

PR #40, branch feat/39-logic-model (10 new commits stacked on top of the
session-1 foundation — phases 1-3 + partial 10). 33/33 lib tests green,
clean build, pushed.

Commits this session

| Commit | Phase / scope |
|---|---|
| `3d3418c` | **Phase 4**: Dropped the auto-trace plan after discussion. Added `Play.stats` (JSON field with documented `tokens_prompt / tokens_completion / cost_usd / duration_ms / calls` shape) + a `stats` event type on the per-Play socket + `logic.record_stats(...)` SDK helper. `model_call` now looks up aibroker pricing once per subprocess (via `models.config` RPC) and reports its own consumption. UI aggregators sum across the Play subtree on read. |
| `294d6bb` | **Phase 5**: `play_wait` walks `sub_play_sids` recursively and stays parked while any descendant is in `awaiting_resume`. Bounded walk, cycle-safe. |
| `0c06097` | **Phase 6**: `engine/source_parser.rs` spawns `python3` with an embedded AST parser piped on stdin → JSON flow graph (invoke / rpc / pause / loop / if / try nodes). New `logic_parse_graph(version_sid) -> str` RPC. |
| `93ce00f` | **Phases 7-10 (BIG, REGRESSION HERE)**: Wholesale UI rewrite to the issue's "three regions + bottom play bar" diagram. New `logic_view.html` template + new dashboard. All five old templates deleted (`workflows.html`, `workflow_editor.html`, `plays.html`, `play_detail.html`, `examples.html`) plus `workflow_editor.js` + `hero_logic_graph.js`. 301 redirects from legacy URLs. **The new UI is visually much worse than the one it replaced**; see "Open issue" below. |
| `bf055d2` | **Phase 10 remainder**: `examples/pause_resume_*.py` + `seed_flows/*.py` renamed `flow.*` → `logic.*`. |
| `b4e6d67` | **SDK clean rewrite**: Per "no legacy code" instruction: dropped the `flow` alias, `_Span` / `_NullSpan` / `_current_span` / `flow.step` / `flow.span` / `flow.current_span` / `instrument()` / `_ClientProxy` / `_bootstrap_run`. Replaced `_current_span` (span-based) with `_current_frame` (`_LogicFrame`, just carries the deterministic call-tree state). Dropped `prefill_only` / `prefill_resumes_json` + `HERO_LOGIC_PREFILL_ONLY` (per user's "plays should be plays, no special mode" feedback; headless callers now loop on `play_resume` like everyone else). Env vars renamed `HERO_FLOW_*` → `HERO_LOGIC_*`. Marker attr `__hero_flow__` → `__hero_logic__`. Schema type `FlowField` → `LogicField`. |
| `ed4d6fe` | **service_agent split**: Was a 607-line monolith; now an ~80-line orchestrator that calls 7 sub-Logics via `logic.invoke(...)`. New seed_flows files: `fetch_catalog.py`, `select_services.py`, `compile_stubs.py`, `script_execution.py`, `debug_feedback.py`, `summarize.py` (service_code_gen + model_call already separate). `seed.rs` gets 6 new compact `BuiltInFlow` entries with their inputs/outputs metadata. |
| `72bae09` | **UI e2e tests**: 20 hero_browser MCP test cases under `testcases/`, a master `run_all.md` runner, and a repo-local `.claude/skills/run_ui_tests/SKILL.md`. Also fixed the admin socket name mismatch (`admin.sock` → `ui.sock`; this was reverted in the next phase, see below). |
| `311c3da` | **Lifecycle migration**: Aligned to `lab` + `service.toml`. Each crate has a `service.toml` at its root; `main.rs` uses `service_base!()` + `validate_service_toml` + `handle_info_flag` + `print_startup_banner` + `prepare_sockets` from `herolib_core::base`. Deleted `Makefile`, `buildenv.sh`, `crates/hero_logic/scripts/build_lib.sh`. Dropped the `hero_logic` selfstart CLI binary (per the `hero_service_test` skill: selfstart-only binaries provide no value; `lab service` covers it). Moved `src/bin/hero_logic_server.rs` → `src/main.rs` so `include_str!("../service.toml")` resolves correctly. Reverted admin socket name to canonical `admin.sock`. README rewritten end-to-end: no `make` references, all `lab` commands. **Cargo.toml has a `[patch."https://forge.ourworld.tf/lhumina_code/hero_lib.git"]` block pointing at the local sibling checkout**, needed because the upstream `herolib_ai` references streaming types (`ChatChunkStream`, `StreamingClient`, `StreamError`) that `hero_aibroker_sdk`'s phase-9 per-domain refactor dropped. The local hero_lib checkout has a small workaround patch (`crates/ai/src/client.rs` + `error.rs`) that gates those refs off. Drop both the patch and the local edits once upstream realigns. |
| `de952a3` | **Admin TCP bind + cheap dashboard**: `hero_router` doesn't proxy `web`-protocol services (only `openrpc`), so the dashboard wasn't reachable via the router URL. Added a TCP bind at `127.0.0.1:9820` alongside the UDS so a browser can hit `http://127.0.0.1:9820/` directly. Used the same hyper http1 `serve_connection` pattern as the UDS (axum 0.8's `axum::serve` hung silently, possibly a peer-shutdown handling quirk). Also dropped the per-Logic "latest play + success rate" lookup from the dashboard `index_handler`: with ~1100 plays in OSIS and no `logic_sid` index on the `play` rootobject, `play_list_for_logic` scans every record per call, which made the dashboard take ~60s; latest-play info lives on the per-Logic view where it's worth the cost. Dashboard now renders in ~130 ms. |
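
The Phase 5 behaviour listed above (commit 294d6bb: a bounded, cycle-safe walk over `sub_play_sids` that looks for `awaiting_resume`) could look roughly like the sketch below. It is illustrative only: the in-memory `plays` dict stands in for OSIS lookups, the `status` field name and the `MAX_VISITED` bound are assumptions, and the function name is hypothetical.

```python
from collections import deque

# Assumed safety bound; the actual limit used by play_wait is not stated in the issue.
MAX_VISITED = 10_000


def subtree_awaiting_resume(plays: dict, root_sid: str,
                            max_visited: int = MAX_VISITED) -> bool:
    """Return True if the root Play or any descendant is in 'awaiting_resume'.

    `plays` maps sid -> {"status": str, "sub_play_sids": [str]} (hypothetical shape).
    """
    seen = set()
    queue = deque([root_sid])
    while queue and len(seen) < max_visited:
        sid = queue.popleft()
        if sid in seen:
            continue  # already visited: guards against cycles and duplicate sids
        seen.add(sid)
        play = plays.get(sid)
        if play is None:
            continue  # dangling sid: skip rather than fail the whole poll
        if play.get("status") == "awaiting_resume":
            return True
        queue.extend(play.get("sub_play_sids", []))
    return False
```

A poll loop would keep the parent parked while this returns True, and only treat the parent's own terminal status as final once it returns False.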

Open issue — UI visual regression

Phase 7's UI rewrite (93ce00f) is the problem. I followed the issue
spec's "three regions + bottom play bar" diagram literally and the
result is visually much worse than the old workflow_editor.html it
replaced — see the side-by-side the user posted in the session
transcript. Specifically the new UI lost:

  • The top toolbar with editable title + description + Start button
    + Plays dropdown + stats chips (RUNS / OK / MEDIAN / ALL).
  • The polished nested play-tree rendering with status dots + per-row
    ms timings + indentation (e.g. attempt 1 → Service Code Gen → Model Call). The new "flow tab" relies on logic_parse_graph for
    the static backdrop and on descendants_json for overlay anchoring,
    but the visual hierarchy is missing.
  • The typed-inputs editor (add field, type dropdown, multi-line value,
    delete) — replaced with a read-only declared-fields list in the
    left sidebar plus a separate "play bar" inputs column.
  • The labeled-outputs panel with per-output cards rendered inline
    with the canvas — replaced with a small "Output" section in the
    bottom-right of the play bar.

The new template/JS also has runtime errors visible in the
screenshots:

  • BASE_PATH is not defined (the stats card init runs before the
    <script> block that defines BASE_PATH).
  • Source parse unavailable: syntax error at line 1: invalid decimal literal (Python 3.14's stricter literal parsing rejects something
    in service_code_gen.py's source the parser is being handed —
    needs a closer look in engine/source_parser.rs).
  • The ▶ Run button isn't visible in the screenshot (likely the
    bottom play bar is overflowing the viewport or being CSS-clipped).

Recommended path forward (for the next agent)

The session-1 instruction was "this is a single canonical issue, don't
fight it"; the session-2 read of the spec produced a worse UI. The
old workflow_editor.html design is the better starting point: a
single canvas with top toolbar (Start + Plays + stats), three columns
(INPUTS / FLOW+SOURCE / OUTPUT), all the right interaction surfaces
on one screen. To recover:

  1. git show 4c9168e:crates/hero_logic_admin/templates/workflow_editor.html > crates/hero_logic_admin/templates/logic_view.html
    (likewise workflow_editor.js → keep as-is, hero_logic_graph.js →
    keep as-is; the four list templates (workflows.html, plays.html,
    play_detail.html, examples.html) can be restored if you still want
    the list pages, else skip).
  2. Mechanically rename wf_sid/wf_name/wf_description → logic_* in
    the template, rewrite /workflows/ → /logics/. Done partially
    already in the dead branch I just reverted.
  3. RPC call renames: workflow.set → logic.set, workflowversion.set →
    logicversion.set, LogicService.workflow_* → LogicService.logic_*,
    example.* → logicexample.*, LogicService.example_* →
    LogicService.logic_example_*, flow_library_search →
    logic_library_search. Drop LogicService.workflow_clone (no
    equivalent; use logic_create_version differently or wire from
    the dashboard).
  4. The hard part: the old hero_logic_graph.js renders the play
    tree from Play.spans (gone in #39). The renderer needs to be
    adapted to walk Play.sub_play_sids (with descendants fetched
    client-side or batched server-side via the existing
    /api/plays/{sid}/overlay endpoint, which already returns
    {play, descendants}). The static fallback can stay as
    logic_parse_graph's output (the parser is new in commit 0c06097
    and works).
  5. Adapt the cost / token chips to read Play.stats (a JSON string —
    parse and sum across descendants).
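
Step 5 above (parse each `Play.stats` JSON string and sum across descendants) can be sketched as follows. This is an illustrative sketch only: the stat keys come from the Phase 4 description, while the function names, the dict shape of a Play, and the choice to treat empty or malformed stats as zeros are assumptions. Whether `duration_ms` should be summed or read from the root Play alone is not settled in the handoff; the sketch simply sums every key.

```python
import json

# Keys documented for Play.stats in Phase 4.
STAT_KEYS = ("tokens_prompt", "tokens_completion", "cost_usd", "duration_ms", "calls")


def parse_stats(raw: str) -> dict:
    """Parse one Play.stats JSON string; empty, invalid or missing values count as 0."""
    try:
        data = json.loads(raw) if raw else {}
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {key: data.get(key) or 0 for key in STAT_KEYS}


def sum_subtree_stats(play: dict, descendants: list) -> dict:
    """Sum stats over a Play and its descendants (the {play, descendants} overlay shape)."""
    totals = dict.fromkeys(STAT_KEYS, 0)
    for item in [play, *descendants]:
        stats = parse_stats(item.get("stats", ""))
        for key in STAT_KEYS:
            totals[key] += stats[key]
    return totals
```

The same aggregation could run client-side in the chip renderer or server-side in the overlay handler; the sketch stays neutral on that choice.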

Files & state at the time of handoff:

  • crates/hero_logic_admin/templates/logic_view.html — the BAD new template (Phase 7). Replace with the restored editor.
  • crates/hero_logic_admin/templates/index.html — also new and uses LogicCard struct from routes.rs. The old version listed both workflows + plays; restore if desired.
  • crates/hero_logic_admin/static/js/ — only bootstrap.bundle.min.js now. Restore workflow_editor.js + hero_logic_graph.js from 4c9168e.
  • crates/hero_logic_admin/src/routes.rs — speaks the new schema correctly via logic.* RPCs but only renders the bad layout. To swap UIs: restore the old routes.rs from 4c9168e, then apply the same mechanical RPC-name renames listed in (3) above. I had a partial-go at this (reverted before committing) so the sed script in the session transcript is a starting point.
  • crates/hero_logic_admin/src/main.rs — routes for the new layout. Reset the route table to point at the old handlers (workflows_handler, workflow_editor_*_handler, plays_handler, examples_handler, play_detail_handler) once routes.rs is restored.
  • Cargo.toml — the workspace patches hero_lib to a local sibling checkout. Do not drop this until upstream herolib_ai republishes against the phase-9 hero_aibroker_sdk shape. The local hero_lib's crates/ai/src/client.rs and error.rs carry a small workaround patch that gates chat_stream / with_streaming_socket off; tracking it in the README note. There's also a Cargo.toml.hero_builder_backup left in the working tree (not committed) from the policy enforcement — safe to delete.

Everything in commits 3d3418c..bf055d2 and b4e6d67..311c3da (i.e.
schema, SDK, runtime, lifecycle, service split, e2e tests, dependency
work) is solid and matches the issue spec. Only the UI in 93ce00f
needs to be redone from the better starting point.

Smoke-test commands:

lab --install
hero_proc service stop hero_logic && hero_proc service start hero_logic
curl -s -I http://127.0.0.1:9820/    # dashboard (~130ms)
hero_logic_server --info             # service.toml introspection