Eight targeted fixes from engine correctness work, Gateway coverage, and CLI scaffolding.

Fix: parallel <Loop> iterations no longer starve until total run quiescence

Ready parallel <Loop> iterations could not advance until the entire run graph went quiet. All three scheduling paths shared the same bug: the legacy engine ladder returned await-trigger for any in-flight task or schedule-retry for any pending task before ever reaching its ready-loop handling, the workflow session gated loop progress behind a global-idle check, and the WorkflowDriver’s Promise.all batch barrier held all completions until the batch finished. Two changes land together, and two existing guarantees are preserved:
  • Legacy engine ladder and workflow session now advance ready loops whenever a loop node is complete, without waiting for unrelated in-flight or pending tasks to settle.
  • WorkflowDriver processes task completions incrementally and re-evaluates decisions after each completion, so a finished loop task can unblock the next iteration before the rest of the batch lands.
  • Continue-as-new handoffs remain quiescence-only: the engine still waits for full quiet before handing off to the next run segment.
  • Unhandled task failures retain precedence over further iterations, preserving the existing error propagation guarantee.
Fixes #267. PR #271.
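
The reordered decision ladder can be illustrated with a minimal sketch. It is not the engine's code: the RunSnapshot shape, the decide function, and the fail, advance-loop, continue-as-new, and idle decision names are hypothetical. Only await-trigger and schedule-retry are named in this entry, and their relative order here is an assumption.

```typescript
// Hypothetical snapshot of a run; field names are illustrative, not the engine's types.
interface RunSnapshot {
  failedTasks: string[];       // unhandled task failures
  readyLoops: string[];        // loop nodes whose current iteration is complete
  inFlightTasks: number;       // tasks currently executing
  pendingTasks: number;        // tasks waiting on a retry
  wantsContinueAsNew: boolean; // run segment asked for a handoff
}

type Decision =
  | { kind: "fail"; taskId: string }
  | { kind: "advance-loop"; loopId: string }
  | { kind: "await-trigger" }
  | { kind: "schedule-retry" }
  | { kind: "continue-as-new" }
  | { kind: "idle" };

function decide(snapshot: RunSnapshot): Decision {
  // Unhandled failures keep precedence over any further iteration.
  const [failed] = snapshot.failedTasks;
  if (failed !== undefined) return { kind: "fail", taskId: failed };

  // Ready loops are checked before the in-flight and pending rungs,
  // so unrelated work no longer starves them.
  const [loopId] = snapshot.readyLoops;
  if (loopId !== undefined) return { kind: "advance-loop", loopId };

  if (snapshot.inFlightTasks > 0) return { kind: "await-trigger" };
  if (snapshot.pendingTasks > 0) return { kind: "schedule-retry" };

  // Continue-as-new stays quiescence-only: reachable only when nothing is running or pending.
  if (snapshot.wantsContinueAsNew) return { kind: "continue-as-new" };
  return { kind: "idle" };
}
```

In this sketch the bug corresponds to placing the readyLoops check below the two task-count checks, which is why any in-flight or pending task blocked loop progress.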

Fix: Gateway streams persisted events from detached runs

streamRunEvents returned only heartbeats for runs started with bunx smithers-orchestrator up -d, because the Gateway host had no visibility into events written by the detached process. A built-in out-of-process event bridge now tails _smithers_events for runs the Gateway host did not execute. The bridge is on by default and configurable via the outOfProcessEventBridge (boolean) and outOfProcessEventBridgePollMs (number) Gateway options. Detached runs now deliver real event frames to connected clients without any changes to the workflow or the detached runner. Fixes #254. PR #266.
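
The tailing idea can be sketched as a cursor-based poll. This is an illustration under assumptions, not the Gateway's implementation: the EventRow shape, the seq cursor column, and createEventTail are hypothetical, and the real _smithers_events schema is not described in this entry.

```typescript
// Hypothetical row shape; the real _smithers_events columns may differ.
interface EventRow {
  seq: number;     // assumed monotonically increasing sequence number
  runId: string;
  payload: string;
}

// Returns a poll function that yields only rows newer than the last one seen.
// `readAfter` stands in for a database query such as "rows with seq > cursor".
function createEventTail(readAfter: (seq: number) => EventRow[]): () => EventRow[] {
  let cursor = 0;
  return function poll(): EventRow[] {
    const rows = readAfter(cursor);
    for (const row of rows) {
      // Advance the cursor so the next poll skips rows already delivered.
      if (row.seq > cursor) cursor = row.seq;
    }
    return rows;
  };
}
```

A bridge built this way would call poll on a timer, presumably at the interval given by outOfProcessEventBridgePollMs, and forward each returned row to connected clients as an event frame.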

Feature: bunx smithers-orchestrator gateway command

A new bunx smithers-orchestrator gateway command starts the multi-run Gateway RPC and WebSocket control plane headlessly, backed by the workspace database. It exposes listRuns, streamRunEvents, and streamDevTools and prints the workspace and DB paths it serves on startup. This is distinct from bunx smithers-orchestrator up --serve, which runs a single workflow and adds a lightweight serve layer. Use gateway when you need the full /v1/rpc/* control plane without launching a workflow run. Fixes #255. PR #268.

Fix: default agent scaffolding no longer ships broken providers

bunx smithers-orchestrator init generated an agents.ts that placed opencode-over-Anthropic-API providers in the default smart and smartTool pools. Those providers are non-functional for most users out of the box and caused runs to fail with cryptic errors rather than a clear diagnostic. Generated agents.ts files now lead smart and smartTool with a working Claude subscription provider, and the opencode-over-Anthropic-API entry is removed from the default pools. If init cannot find any usable provider, it now fails with NO_USABLE_AGENTS instead of writing a configuration that cannot run. Fixes #236. PR #270.

Feature: workflow input schemas in inspect and generated skills

bunx smithers-orchestrator inspect now returns the machine-readable JSON schema for each workflow’s input alongside the run summary. The schema is also surfaced in generated skill docs: real field names, types, defaults, enums, and descriptions appear in the skill file rather than a generic placeholder. Fixes #258. PR #272.

Fix: observability ships its Docker Compose stack assets

The @smithers/observability package was missing the Docker Compose stack files that bunx smithers-orchestrator observability expects at a known path inside the package. The command exited with a confusing path error rather than a clear prerequisite message. The stack assets are now included in the published package at the path the CLI resolves. The prerequisite error message was updated to name Docker Compose explicitly, and the docs were reconciled with the actual command surface. Fixes #262. PR #269.

Docs: Aspects budgets marked declarative, not yet enforced

The costBudget, tokenBudget, and latencySlo fields on <Aspect> were documented as active runtime constraints. They are accepted and stored but not currently enforced by the engine. The docs now label them as declarative scaffolding; runtime enforcement is tracked in #273. Fixes #265. PR #274.

Fix: CI gates restored on main

Two CI failures on main are resolved:
  • A typecheck error in the .smithers/ workspace used by the real-stack-e2e workflow blocked the CI test job (commit ad0b69a4).
  • Bare CLI invocations in the 0.23.1 changelog (without the bunx smithers-orchestrator prefix) failed the normalize-bunx docs gate (commit d7747b15).

Fix: CLI agent answers survive captured-stdout truncation

Long CLI agent runs overflow the 200 KB captured-stdout cap. The capture kept the head of the stream-JSON output, dropping the terminal result event, so text extraction fell back to tool-result fragments and the engine’s context-free JSON repair persisted schema-valid but amnesiac rows. This surfaced live as investigation outputs claiming the task “was not present in the available context.” Four changes land together:
  • The capture now keeps the stream tail for CLI agents and reports truncation in its result.
  • The agent prefers the live interpreter’s completed answer, parsed before the cap applies, whenever stdout was truncated.
  • Token usage falls back to the completed event when the per-message usage lines were cut.
  • A warning event fires on truncation, and the JSON-repair prompt includes the original task.
Fixes #277. Commit 291a99157e.
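
Why keeping the tail matters can be shown with a small sketch. It assumes a character-based cap for simplicity (the real cap is 200 KB of bytes), and createTailCapture is a hypothetical name, not the engine's capture API.

```typescript
// Keeps only the last `maxChars` characters written and records whether anything was dropped.
function createTailCapture(maxChars: number) {
  let buffer = "";
  let truncated = false;
  return {
    write(chunk: string): void {
      buffer += chunk;
      if (buffer.length > maxChars) {
        // Drop the oldest output so the end of the stream survives.
        buffer = buffer.slice(buffer.length - maxChars);
        truncated = true;
      }
    },
    result(): { text: string; truncated: boolean } {
      return { text: buffer, truncated };
    },
  };
}
```

Because the terminal result event is the last thing a CLI agent writes, a tail-keeping buffer retains it where a head-keeping buffer would discard it. The truncated flag lets the caller emit a warning and prefer an answer parsed before the cap applied.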

Fix: bunx smithers-orchestrator output resolves camelCase output tables

bunx smithers-orchestrator output RUN_ID NODE_ID printed null for nodes whose output table schema key is camelCase. Node state stores the schema key verbatim while the physical table is snake_case, so the raw JSON lookup found nothing. The lookup now tries the stored name first and falls back to the snake_case translation only when no physical table with the stored name exists. A real camelCase physical table with a missing row still returns null rather than another table’s row. Fixes #276. Commit 61fbf19edd.
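
The lookup order described above can be sketched as follows. The function names and the snake_case conversion rule are assumptions for illustration; the CLI's actual translation may handle more cases.

```typescript
// Simple camelCase to snake_case conversion; assumed, not the CLI's exact rule.
function toSnakeCase(name: string): string {
  return name.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}

// Resolves which physical table to read for a stored schema key.
// The stored name wins whenever a physical table with that exact name exists.
function resolveOutputTable(storedName: string, physicalTables: ReadonlySet<string>): string | null {
  if (physicalTables.has(storedName)) return storedName;
  const snake = toSnakeCase(storedName);
  return physicalTables.has(snake) ? snake : null;
}
```

Checking the stored name first is what keeps a real camelCase table with a missing row from silently returning a row out of its snake_case neighbor.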

Fix: observability package dts build

apps/observability’s tsup --dts-only build failed because four metric names (snapshotsCaptured, runForksCreated, replaysStarted, snapshotDuration) were listed in the d.ts export block without declarations, keeping the repo’s faults CI job red on every branch. The declarations now exist and pnpm -r build passes. Fixes #275. Commit 1b3ccb8e73.

Docs: agent-operated CLI framing and reference additions

The docs now state explicitly that a coding agent operates the Smithers CLI on the human’s behalf. The guide, quickstart, CLI cheatsheet, component pages, MCP server docs, and the agent skill were updated to make the division of responsibility concrete: the agent runs every command itself, relays approval and human-task prompts to the human in conversation, and submits the resolving command (approve, deny, human answer, signal) without asking the human to type Smithers commands. Installation remains the one step a human may run by hand. The reference docs also document the @smithers-orchestrator/review package and the sandbox:up / sandbox:down GCP VM scripts in the package-configuration tables. The generated llms.txt, llms-full.txt, and skill bundles were regenerated to include these changes.