0.17.0
0.17.0 ships Gateway v1: the Gateway’s RPC schema, scope vocabulary, error codes, and OpenAPI document are now a stable, separately published package, `@smithers-orchestrator/gateway`, so external clients (bots,
dashboards, SDKs in other languages) can build against the contract instead
of the server source. 0.15.0 launched the Gateway; 0.17.0 commits to its
wire shape.
Alongside the contract release, the CLI grows a real account/credential
surface (agents add wizard, multi-account isolation via configDir,
machine-readable JSON stdout, a new chat create subcommand), <Task>
gains an auto-hijack prop for resumable agent handoffs, and the
Gateway picks up the rest of its production hardening (slowloris timeouts,
RFC-compliant 413 / Bearer parsing, cross-instance DB transaction
coordination). GitHub Actions CI returns, this time with a
single-Effect-version guardrail and a new fault-injection E2E lane.
Breaking changes
- `pi-plugin` moved to its own workspace package. The runtime previously lived in `packages/smithers/src/pi-plugin/` and was re-exported from the `smithers-orchestrator/pi-plugin` subpath. It is now `packages/pi-plugin/`, published as `@smithers-orchestrator/pi-plugin`. The `smithers-orchestrator/pi-plugin` and `smithers-orchestrator/pi-extension` subpath re-exports have been deleted. Update imports:
Gateway v1 — Stable RPC Contract as a Standalone Package
0.15.0 shipped the Gateway. 0.17.0 ships the contract. The Gateway’s RPC schema, scope vocabulary, error codes, and OpenAPI document now live in their own workspace package, `@smithers-orchestrator/gateway`, and are the single
source of truth that both the in-process server and any external client
codegen against. The server no longer carries its own copy of the method or
scope tables; clients no longer have to scrape the source to learn the wire
shape.
If you’re building a Slack bot, a status dashboard, a CI integration, or a
typed client SDK in another language, this is the package you import.
What ships
- 18 stable v1 RPC methods. `launchRun`, `resumeRun`, `cancelRun`, `hijackRun`, `rewindRun`, `submitApproval`, `submitSignal`, `getRun`, `listRuns`, `streamRunEvents`, `streamDevTools`, `getNodeOutput`, `getNodeDiff`, `cronList`, `cronCreate`, `cronDelete`, `cronRun`.
- Typed request and response shapes for every method (e.g. `LaunchRunRequest`, `LaunchRunResponse`).
- JSON Schema for every params and result type, exported as data so clients can validate without a TypeScript compiler.
- A v1 error vocabulary (`InvalidRequest`, `InvalidInput`, `Unauthorized`, `Forbidden`, `RunNotFound`, `NodeNotFound`, `IterationNotFound`, `NodeHasNoOutput`, `FrameOutOfRange`, `SeqOutOfRange`, `Busy`, `RateLimited`, `PayloadTooLarge`, `BackpressureDisconnect`, `UnsupportedSandbox`, `VcsError`, `RewindFailed`, `Internal`), each pinned to an HTTP status.
- A typed scope vocabulary (see below) with a hierarchy aware of legacy `read`/`execute`/`approve`/`admin` grants for backward compatibility.
- Generated `openapi.yaml` at `packages/gateway/openapi.yaml`, re-derived by `bun run generate:openapi` and CI-checked.
- 18 reference docs under `docs/rpc/`, one per method.
Importing the contract
The package has three subpath exports; pick the smallest surface that fits.
Calling a method over HTTP
Every method has a stable URL of the form `POST /v1/rpc/<method>`.
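As a minimal sketch of that URL shape (not the package's own client), the helper below assembles the pieces of an HTTP call. `buildRpcRequest`, the example host, the `runId` parameter, and the assumption that params travel as a JSON body alongside a Bearer token are all illustrative; the per-method reference pages are the authority on the real request envelope.

```typescript
// Hypothetical helper: builds the URL and fetch() options for a Gateway v1 call.
// Assumptions (not confirmed by the release notes): params are sent as a JSON
// body, and the token is presented with the Bearer scheme.
interface RpcHttpRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildRpcRequest(
  baseUrl: string,
  method: string,
  params: unknown,
  token: string,
): RpcHttpRequest {
  // Stable URL form from the release notes: POST /v1/rpc/<method>
  const trimmed = baseUrl.replace(/\/+$/, "");
  return {
    url: `${trimmed}/v1/rpc/${encodeURIComponent(method)}`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(params),
    },
  };
}

// "runId" is a placeholder field name used only for this example.
const req = buildRpcRequest(
  "https://gateway.example.com/",
  "getRun",
  { runId: "run_123" },
  "TOKEN",
);
```

Passing `req.url` and `req.init` to `fetch()` would send the request; keeping construction separate from sending makes the URL shape easy to unit-test.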
Calling a method over WebSocket
The same methods work over a single authenticated WebSocket connection. This is the right transport for clients that want to follow events instead of polling.
The scope vocabulary
@smithers-orchestrator/gateway/auth/scopes exports the v1 scope vocabulary.
Each method declares the scope it requires; hasGatewayScope() performs the
match and understands legacy grants:
| Scope | What it grants |
|---|---|
| `run:read` | Read run state, summaries, event streams, node outputs and diffs. |
| `run:write` | Launch, resume, and cancel runs. |
| `run:admin` | Elevated run control: hijack and rewind. |
| `approval:submit` | Submit approval decisions. |
| `signal:submit` | Submit workflow signals. |
| `cron:read` | List cron schedules. |
| `cron:write` | Create, delete, and trigger cron schedules. |
| `observability:read` | Read DevTools and other observability streams. |
run:admin implies run:write implies run:read. Same for
cron:write over cron:read. The legacy read/execute/approve/admin
grants from 0.15 / 0.16 are still accepted — admin matches everything,
execute covers run:write / signal:submit / cron:write, and so on —
so existing tokens keep working.
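The matching rules above can be sketched as a small expansion table. This is an illustrative re-implementation, not the package's `hasGatewayScope()`: the function names are hypothetical, and the mappings for the legacy `read` and `approve` grants are assumptions, since the notes only spell out `admin` and `execute`.

```typescript
type Scope =
  | "run:read" | "run:write" | "run:admin"
  | "approval:submit" | "signal:submit"
  | "cron:read" | "cron:write" | "observability:read";
type LegacyGrant = "read" | "execute" | "approve" | "admin";

const ALL_SCOPES: Scope[] = [
  "run:read", "run:write", "run:admin", "approval:submit",
  "signal:submit", "cron:read", "cron:write", "observability:read",
];

// Direct implications stated in the notes: admin > write > read, cron write > read.
const IMPLIES: Partial<Record<Scope, Scope[]>> = {
  "run:admin": ["run:write"],
  "run:write": ["run:read"],
  "cron:write": ["cron:read"],
};

const LEGACY: Record<LegacyGrant, Scope[]> = {
  admin: ALL_SCOPES,
  execute: ["run:write", "signal:submit", "cron:write"],
  // Assumed mappings; the release notes only say "and so on".
  read: ["run:read", "cron:read", "observability:read"],
  approve: ["approval:submit"],
};

// Expand one grant into every scope it covers, following implications.
function expandGrant(grant: Scope | LegacyGrant): Set<Scope> {
  const covered = new Set<Scope>();
  const queue: Scope[] =
    grant in LEGACY ? [...LEGACY[grant as LegacyGrant]] : [grant as Scope];
  while (queue.length > 0) {
    const scope = queue.pop()!;
    if (covered.has(scope)) continue;
    covered.add(scope);
    queue.push(...(IMPLIES[scope] ?? []));
  }
  return covered;
}

// Hypothetical stand-in for the real matcher.
function grantsScope(granted: Array<Scope | LegacyGrant>, required: Scope): boolean {
  return granted.some((g) => expandGrant(g).has(required));
}
```

Under these assumptions a token holding only `run:admin` can call read methods, while a token holding only `run:read` cannot launch runs.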
Validating params at the edge
Every method publishes its JSON Schema through `getGatewayRpcDefinition()`,
which lets a gateway, proxy, or client validate payloads without a generated
SDK:
Generating clients in other languages
The `openapi.yaml` is the same contract in OpenAPI 3.1 form. Anything that
consumes OpenAPI works:
The YAML is generated from `packages/gateway/src/rpc/index.ts` and
checked at PR time, so it cannot drift from the runtime.
Server adoption
packages/server/src/gateway.js no longer ships its own method or scope
table — it imports getRequiredScopeForGatewayMethod, hasGatewayScope,
and the per-method JSON Schemas straight from
@smithers-orchestrator/gateway. A new 283-line contract test
(packages/server/tests/gateway-v1-contract.test.jsx) pins the wire shape
end-to-end so server changes can’t silently break the published v1
contract.
New: <Task hijack> — resumable agent handoffs
<Task> gains two new props for handing the active session over to an
interactive agent without losing run state:
- `hijack: boolean`: when set, the engine records a hijack handoff before the task starts running. The next CLI invocation can pick up the same agent session interactively.
- `onHijackExit: "complete" | "reopen"`: controls what happens when the hijacked session closes. `complete` finishes the task as if it ran to completion; `reopen` restarts the task on the next resume.
cliEngine /
hijackEngine).
New: CLI control plane
The CLI grows a proper account/credentials surface so multiple subscriptions can run side-by-side and so machine-readable output stays clean.
@smithers-orchestrator/accounts package
A new workspace package backing per-provider credentials in
~/.smithers/accounts.json:
Interactive smithers agents add
.smithers/agents.ts if present so the new account
flows into existing workflows without manual edits.
configDir and apiKey for multi-account isolation
All CLI agents (ClaudeCodeAgent, CodexAgent, GeminiAgent, KimiAgent)
accept configDir to point at a per-account profile, and ClaudeCodeAgent
accepts apiKey to opt into API billing instead of subscription auth.
Two subscriptions can run concurrently in the same process:
configDir is wired through to the right per-CLI environment variable
(CLAUDE_CONFIG_DIR, CODEX_HOME, GEMINI_DIR, KIMI_SHARE_DIR).
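To make the isolation concrete, here is a rough sketch of how a per-account `configDir` could be turned into a child-process environment. The agent-to-variable pairing follows the order listed above; `envForAccount` and the example directories are hypothetical and are not the agents' actual implementation.

```typescript
// Environment variable each CLI reads for its profile directory
// (pairing inferred from the order given in the release notes).
const CONFIG_DIR_ENV = {
  ClaudeCodeAgent: "CLAUDE_CONFIG_DIR",
  CodexAgent: "CODEX_HOME",
  GeminiAgent: "GEMINI_DIR",
  KimiAgent: "KIMI_SHARE_DIR",
} as const;

type CliAgentName = keyof typeof CONFIG_DIR_ENV;

// Hypothetical helper: returns a copy of baseEnv with the agent's profile
// variable pointed at configDir. The base environment is never mutated, so
// two accounts can be prepared side by side in one process.
function envForAccount(
  agent: CliAgentName,
  configDir: string,
  baseEnv: Record<string, string | undefined> = {},
): Record<string, string | undefined> {
  return { ...baseEnv, [CONFIG_DIR_ENV[agent]]: configDir };
}

// Example directories are placeholders.
const workEnv = envForAccount("ClaudeCodeAgent", "/home/me/.claude-work", { PATH: "/usr/bin" });
const personalEnv = envForAccount("ClaudeCodeAgent", "/home/me/.claude-personal", { PATH: "/usr/bin" });
```

Because each call produces its own environment object, the two subscriptions never share a profile directory even when spawned from the same parent process.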
smithers chat create
A one-task auto-hijacked chat that drops you into an interactive Claude
Code, Codex, or Gemini session inside a Smithers run. Built on top of the
new Task hijack prop and now assembled from package-level primitives so
the command does not need to import the top-level smithers-orchestrator
bundle just to create its inline workflow:
{} — control returns to
the workflow and the task completes.
MDX and generated workflow packs
The CLI now owns its Bun MDX plugin locally (apps/cli/src/mdx-plugin.js)
instead of importing the plugin through the public smithers-orchestrator
package. Generated workflow packs also include @types/node alongside the
React and MDX type pins, which keeps freshly initialized packs typecheckable
when workflows use Node globals or built-ins.
smithers init --agents-only
Scaffold the agents file without generating workflows. Useful when you
already have your own workflow pack and just want Smithers to manage
credentials:
JSON-clean stdout
A new stderr logger routes progress, warnings, and diagnostic output to stderr so commands invoked with `--format json` produce a parseable stdout
stream:
apps/cli/tests/json-stdout-contract.test.js) pins the
invariant for every JSON-emitting CLI surface.
New: Server hardening
Three production-relevant fixes land alongside the gateway package extraction:
- Slowloris mitigation. `GatewayOptions` and `ServerOptions` gain `headersTimeout` (default 30s) and `requestTimeout` (default 60s). Sockets that stall mid-headers or mid-body are now closed instead of held open indefinitely.
- HTTP 413 PayloadTooLarge. Oversized request bodies now return `413 PayloadTooLarge` instead of the prior `400 INVALID_INPUT`, aligning with RFC 7231.
- RFC 6750 Bearer parsing. The auth-scheme match is now case-insensitive, so `bearer`, `Bearer`, and `BEARER` all parse. Affects `gateway.js`, `index.js`, and `serve.js`.
$ref cycles.
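The case-insensitive Bearer match described in this section can be sketched as follows. `parseBearerToken` is a hypothetical name and the regular expression is one reasonable reading of the RFC 6750 token syntax; the server's actual parser may differ in detail.

```typescript
// Hypothetical sketch of case-insensitive Authorization header parsing.
// Returns the token, or null when the header is missing or uses another scheme.
function parseBearerToken(header: string | undefined): string | null {
  if (!header) return null;
  // The "i" flag makes the scheme match "bearer", "Bearer", and "BEARER".
  // The capture group follows the token68-style character set from RFC 6750.
  const match = /^Bearer +([A-Za-z0-9\-._~+\/]+=*)$/i.exec(header.trim());
  return match?.[1] ?? null;
}
```

Only the scheme is matched case-insensitively; the token itself is returned exactly as sent.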
New: DB schema evolution + transaction coordination
- `syncZodTableSchema(sqlite, tableName, schema, opts)` is a new export from `@smithers-orchestrator/db`. It replaces the bare `CREATE TABLE IF NOT EXISTS` path used at boot: when a Zod schema gains columns between releases, it issues `ALTER TABLE ADD COLUMN` so existing databases catch up without a manual migration. `createSmithers()` and external Smithers initialization use it automatically; user-managed custom tables can opt in.
- Cross-instance transaction coordination. Multiple `SmithersDb` instances that share a single underlying `sqlite` client now coordinate transaction depth, owning thread, and turn acquisition through a global `WeakMap` keyed on the client. This eliminates the `BEGIN IMMEDIATE` collisions that could deadlock concurrent runs on a shared DB connection.
- Migrations run before index creation. `SqlMessageStorage` now executes `MIGRATION_STATEMENTS` before `CREATE INDEX`, so indexes that reference newly added columns no longer fail on boot for databases that pre-date those columns.
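The additive-migration idea behind `syncZodTableSchema` can be illustrated with a pure planning function. This is a sketch under stated assumptions, not the library's implementation: `planAddColumns` and `ColumnSpec` are hypothetical, and the existing column names are assumed to have been read beforehand (in SQLite, for example, via `PRAGMA table_info`).

```typescript
// Hypothetical column description; the real function derives this from a Zod schema.
interface ColumnSpec {
  name: string;
  sqlType: string;
}

// Compare the columns a table already has with the columns the schema wants,
// and emit one ALTER TABLE statement per missing column. Existing columns are
// never dropped or altered, so the plan is safe to run on every boot.
function planAddColumns(
  table: string,
  existingColumns: string[],
  desired: ColumnSpec[],
): string[] {
  const have = new Set(existingColumns.map((c) => c.toLowerCase()));
  return desired
    .filter((col) => !have.has(col.name.toLowerCase()))
    .map((col) => `ALTER TABLE "${table}" ADD COLUMN "${col.name}" ${col.sqlType}`);
}

// Table and column names below are placeholders.
const plan = planAddColumns("runs", ["id", "status"], [
  { name: "id", sqlType: "TEXT" },
  { name: "status", sqlType: "TEXT" },
  { name: "priority", sqlType: "INTEGER" },
]);
```

Running the same plan against an up-to-date table yields no statements, which is what makes this approach idempotent.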
New: Memory id encoding
namespaceToString now percent-encodes : and % inside id, and
parseNamespace decodes them on the way back. IDs containing colons —
e.g. { kind: "workflow", id: "task:subtask:0" } — round-trip without
ambiguity:
kind collapsing
to global) is documented and unchanged.
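The id encoding described in this section can be sketched with two small functions. `encodeId` and `decodeId` are hypothetical names that cover only the id segment; how `namespaceToString` joins `kind` and `id` is not shown here.

```typescript
// Percent-encode the two characters that would make an id ambiguous.
// "%" is encoded first so the "%" introduced by "%3A" is not encoded again.
function encodeId(id: string): string {
  return id.replace(/%/g, "%25").replace(/:/g, "%3A");
}

// Decode in a single pass so that "%253A" becomes the literal text "%3A"
// rather than being decoded twice into ":".
function decodeId(encoded: string): string {
  return encoded.replace(/%(25|3A)/gi, (_match, hex: string) =>
    hex === "25" ? "%" : ":",
  );
}

const encoded = encodeId("task:subtask:0");
const decoded = decodeId(encoded);
```

The single-pass decode is the important design choice: decoding `%25` and `%3A` in two separate steps would corrupt ids that contain a literal `%3A`.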
New: Shared metrics and schema exports
Metrics for memory, OpenAPI tools, scorers, and time travel are now owned by the observability package and re-exported from the packages that emit them. This keeps Prometheus metric instances single-sourced while preserving the existing subpath imports:
- Memory: fact reads/writes, recall queries, recall duration, and message saves.
- OpenAPI tools: tool calls, call errors, and duration.
- Scorers: started, finished, failed, and duration.
- Time travel: snapshots captured, run forks created, replays started, and snapshot duration.
@smithers-orchestrator/db/internal-schema; the package-level
@smithers-orchestrator/memory/schema and
@smithers-orchestrator/scorers/schema subpaths continue to re-export the
same tables for callers that already import them there. This removes schema
ownership cycles between DB, memory, and scorers without changing the table
names.
New: <ExtractPrompt> workflow component
A library-level workflow component for building prompts with a Socratic
drafter. The drafter loops via <LoopUntilScored> with stakes-based
thresholds (high → 1.0, low → 0.7) and an RCTF (Role / Context / Task /
Format) scaffold. Approved prompts are persisted to one of three pluggable
caches:
- `MarkdownPromptCache` (default): writes `.smithers/cache/prompts/{slug}.md`.
- `SqlitePromptCache`: single-file durable cache.
- `MemoryPromptCache`: in-process, for tests.
docs/components/extract-prompt.mdx with a design note in
docs/design-prompts/extract-prompt-design.md.
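The stakes-based loop can be pictured with a small synchronous sketch. Only the two thresholds (high → 1.0, low → 0.7) come from the notes above; `draftUntilScored`, the iteration cap, and the toy drafter and scorer are hypothetical and do not reflect the component's real props.

```typescript
type Stakes = "high" | "low";

// Thresholds as stated in the release notes.
const THRESHOLDS: Record<Stakes, number> = { high: 1.0, low: 0.7 };

interface DraftResult {
  text: string;
  score: number;
  attempts: number;
  approved: boolean;
}

// Keep drafting until the score reaches the threshold for the given stakes,
// or the (assumed) iteration cap is hit. Returns the last draft either way.
function draftUntilScored(
  stakes: Stakes,
  draft: (attempt: number) => string,
  score: (text: string, attempt: number) => number,
  maxIterations = 5,
): DraftResult {
  const threshold = THRESHOLDS[stakes];
  let last: DraftResult = { text: "", score: 0, attempts: 0, approved: false };
  for (let attempt = 1; attempt <= maxIterations; attempt++) {
    const text = draft(attempt);
    const s = score(text, attempt);
    last = { text, score: s, attempts: attempt, approved: s >= threshold };
    if (last.approved) break;
  }
  return last;
}

// Toy drafter and scorer: the score improves by 0.4 per attempt, capped at 1.
const toyDraft = (attempt: number) => `draft #${attempt}`;
const toyScore = (_text: string, attempt: number) => Math.min(1, attempt * 0.4);

const lowStakes = draftUntilScored("low", toyDraft, toyScore);
const highStakes = draftUntilScored("high", toyDraft, toyScore);
```

With the toy scorer, the low-stakes prompt is accepted one iteration earlier than the high-stakes one, which is the trade-off the thresholds are meant to express.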
New: Type-level safety
The runtime ships first-class type tests so refactors that break agent and workflow contracts fail at `tsc` time rather than at runtime:
- `AgentLike` assignability suite in `packages/agents/src/__type-tests__/AgentLike.assignability.test-d.ts` pins every concrete agent class (`AmpAgent`, `AnthropicAgent`, `ClaudeCodeAgent`, `CodexAgent`, `ForgeAgent`, `GeminiAgent`, `KimiAgent`, `OpenAIAgent`, `PiAgent`) to the `AgentLike` interface.
- Workflow input test at `packages/smithers/src/__type-tests__/workflow-input.test-d.ts` pins schema-driven input narrowing.
- `AgentGenerateOptions` is extracted into its own module (`packages/agents/src/BaseCliAgent/AgentGenerateOptions.ts`) so it can be consumed independently of `BaseCliAgent`.
- Graph-level agent and scorer types no longer import the concrete agents or scorers packages. `packages/graph` now carries the structural `AgentLike` capability shape and `ScorersMap`, and `packages/components` imports scorer types from `@smithers-orchestrator/graph/types`. That keeps the graph and component contracts type-safe without pulling runtime package dependencies into lower-level packages.
New: Fault-injection E2E matrix
0.17.0 adds a private `@smithers-orchestrator/e2e` workspace package with
the fault matrix from ticket 0022. The suite is organized as reusable
fault primitives, per-case tests, explicit latency/RSS budgets, and a flake
log:
- Harness primitives: `killProcess`, `dropWebSocket`, `freezeSqliteLock`, `stallSandbox`, `skewClock`, `corruptHeartbeat`, and `takeoverRun`.
- Fault cases: 30 case files covering crash/recovery, waiting approvals, waiting events/timers, supervisor takeover races, continue-as-new lineage, inspector truthfulness, reconnect-after-seq, time-travel scrub/rewind, Gateway RPC, WebSocket drops, subscriber fanout memory, webhook signature rejection, cron/manual overlap, unsafe replay approvals, approval-scope denial, diff review mode, scorer gating, and live-stream soak behavior.
- Budgets: `e2e/budgets/latency.json` and `e2e/budgets/memory.json` define enforced ceilings such as the 10-minute per-PR wall time, 2-hour nightly soak wall time, reconnect latency, and RSS caps for live stream / subscriber fanout tests.
- Explicit blockers: JJHub/runtime-dependent cases for auth persistence, browser automation, file/VCS pointer integrity, secret redaction, network policy, plus the cron-driver and long-lived JJHub soak cases, are present as skipped tests with their unblockers documented in the file.
- `.github/workflows/faults.yml` runs the per-PR fault subset with `pnpm --filter @smithers-orchestrator/e2e test:faults`.
- `.github/workflows/faults-nightly.yml` runs the soak lane on a daily cron with `SMITHERS_E2E_SOAK=1`.
Fixes
Engine reliability and retries
- `SmithersError` details preserved across the Effect promise boundary. `executeTask` previously dropped `failureRetryable` and other error metadata when an effect was awaited as a promise. The flag now survives, so deterministic errors stop being retried.
- `failureRetryable=false` is honored. Agent-thrown SmithersErrors that declare themselves non-retryable now short-circuit the engine’s retry loop instead of being attempted up to `maxAttempts` times.
- New `AGENT_CONFIG_INVALID` error code. Used for deterministic config failures (expired/invalid CLI credentials, “LLM not set”, “unknown model”) and marked non-retryable by default. Agents proactively detect expired CLI credentials and classify auth-failure patterns from CLI output.
- Kimi mid-stream session crashes recover. When kimi exits with the “To resume this session: kimi -r <id>” hint, the engine surfaces `discardResumeSession: true` so the next attempt starts fresh instead of deterministically replaying the broken session.
- Kimi OAuth tokens auto-refresh. `KimiAgent` refreshes expired access tokens against the stored `refresh_token` before falling back to interactive `kimi login`. Concurrent refresh calls dedupe within a process; “no refresh_token stored” and “refresh failed” are reported separately so operators can tell whether to re-authenticate or investigate the auth service.
- Resume no longer false-positives on VCS revision drift. Workflows that commit mid-execution (e.g. the kanban merge step) and projects with co-located git+jj repositories now resume cleanly. The strict type and revision checks have been relaxed to record-only.
- Kimi’s interactive resume hint is suppressed in stderr. It used to drown out real errors. JSON output extraction now also tolerates arrays, byte-order marks, and markdown code fences.
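The retry behavior in the first three items can be sketched as a loop that stops as soon as an error declares itself non-retryable. This is an illustration only: `runWithRetries`, `configInvalid`, and the synchronous task signature are hypothetical, and the engine's real loop runs inside Effect.

```typescript
// Shape of the metadata the loop looks for on a thrown error.
interface RetryHint {
  failureRetryable?: boolean;
}

function isNonRetryable(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as RetryHint).failureRetryable === false
  );
}

// Hypothetical retry loop: retries up to maxAttempts, but returns immediately
// when the error carries failureRetryable === false.
function runWithRetries<T>(
  task: (attempt: number) => T,
  maxAttempts: number,
): { attempts: number; value?: T; error?: unknown } {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { attempts: attempt, value: task(attempt) };
    } catch (err) {
      lastError = err;
      if (isNonRetryable(err)) return { attempts: attempt, error: err };
    }
  }
  return { attempts: maxAttempts, error: lastError };
}

// Builds an error tagged like the deterministic config failures described above.
function configInvalid(message: string): Error & RetryHint & { code: string } {
  return Object.assign(new Error(message), {
    code: "AGENT_CONFIG_INVALID",
    failureRetryable: false,
  });
}

const deterministic = runWithRetries(() => {
  throw configInvalid("unknown model");
}, 5);

const transient = runWithRetries((attempt) => {
  if (attempt < 3) throw new Error("temporary network failure");
  return "ok";
}, 5);
```

The deterministic failure consumes a single attempt, while the transient one is retried until it succeeds.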
Type fixes
- `SmithersCtx.output()` / `outputMaybe()` / `latest()` now narrow by table schema. The accessors were typed as `(table, key) => OutputRow` even when the workflow declared an output schema, so callers had to cast to get the real row shape that the runtime was already returning. They are now generic over the table parameter and return `OutputForTable<Schema, Table>`. No runtime change: only the types catch up to what the runtime was always producing.
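A simplified model of this narrowing is shown below. It is a sketch, not the real `SmithersCtx`: the schema is written as plain TypeScript types instead of Zod, the accessors take only a table name (the real ones also take a key), and `makeCtx` plus this definition of `OutputForTable` are stand-ins invented for the example.

```typescript
// Hypothetical output schema: table name -> row shape.
type ExampleSchema = {
  plan: { steps: string[] };
  review: { approved: boolean };
};

// Stand-in for the real helper type: look up the row shape for one table.
type OutputForTable<S, T extends keyof S> = S[T];

// Minimal context whose accessors are generic over the table parameter,
// so the return type follows the table name passed in.
function makeCtx<S>(rows: Partial<S>) {
  return {
    output<T extends keyof S>(table: T): OutputForTable<S, T> {
      const row = rows[table];
      if (row === undefined) throw new Error(`no output for ${String(table)}`);
      return row as OutputForTable<S, T>;
    },
    outputMaybe<T extends keyof S>(table: T): OutputForTable<S, T> | undefined {
      return rows[table] as OutputForTable<S, T> | undefined;
    },
  };
}

const ctx = makeCtx<ExampleSchema>({ plan: { steps: ["lint", "test"] } });

// Typed as { steps: string[] } without a cast.
const plan = ctx.output("plan");
const stepCount: number = plan.steps.length;

// Typed as { approved: boolean } | undefined.
const review = ctx.outputMaybe("review");
```

With the old `OutputRow` style of typing, `plan.steps` would have needed a cast; here the compiler derives it from the table name.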
Other fixes
- Test infrastructure: `execSync` `maxBuffer` raised to 10 MB to keep CLI smoke tests from truncating output.
- Gateway edge-case assertions now match the hardened behavior. The previously skipped oversized-body case now asserts `413 PayloadTooLarge`, and lowercase/uppercase Bearer auth-scheme variants now authenticate case-insensitively before falling through to method validation.
- Agent usage extraction tests live with the agents package. The token usage regression suite moved out of `packages/graph/tests` and into `packages/agents/tests`, matching the package that owns `extractUsageFromOutput`.
Refactors
- Kanban workflow ticket discovery now walks `.smithers/tickets/` recursively up to depth 4, skipping hidden directories and `README.md`. Subdirectories such as `smithers/`, `jjhub/`, and `gui/` surface as work items. Review filtering scopes match `nodeId.startsWith("${slug}:review:")` instead of a global `reviewer-*` field, eliminating cross-ticket review pollution.
- Sandbox child workflow execution is injected instead of imported. `executeSandbox()` now receives `executeChildWorkflow` through `ExecuteSandboxOptions`, and graph extraction loads the sandbox executor and engine child-workflow executor together at the call site. This removes a direct sandbox-to-engine import while preserving sandbox child workflow behavior.
- Lower-level package dependencies were tightened. Observability no longer imports agents or driver just to build the metric catalog; graph no longer depends on agents; components no longer depend on scorers for shared scorer types; and memory/scorers schema exports now route through DB-owned internal schema definitions.
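The ticket discovery and review scoping rules from the first item can be sketched over an in-memory directory tree. Everything here is illustrative: the `Entry` type and both function names are hypothetical, tickets are assumed to be `.md` files, and the depth is assumed to count the tickets root as level 1.

```typescript
// In-memory stand-in for a directory listing: entries with children are directories.
type Entry = { name: string; children?: Entry[] };

// Walk the tree, skipping hidden directories and README.md, and stop
// descending once maxDepth levels have been visited.
function discoverTickets(
  dir: Entry[],
  prefix = "",
  depth = 1,
  maxDepth = 4,
): string[] {
  const found: string[] = [];
  for (const entry of dir) {
    if (entry.children) {
      if (entry.name.startsWith(".") || depth >= maxDepth) continue;
      found.push(
        ...discoverTickets(entry.children, `${prefix}${entry.name}/`, depth + 1, maxDepth),
      );
    } else if (entry.name.endsWith(".md") && entry.name !== "README.md") {
      found.push(prefix + entry.name);
    }
  }
  return found;
}

// Scope review nodes to a single ticket by node id prefix, as described above.
function reviewsForTicket(nodeIds: string[], slug: string): string[] {
  return nodeIds.filter((id) => id.startsWith(`${slug}:review:`));
}

// File and node names below are placeholders.
const tickets = discoverTickets([
  { name: "0001-login.md" },
  { name: "README.md" },
  { name: ".done", children: [{ name: "0000-old.md" }] },
  { name: "gui", children: [{ name: "0002-cursor.md" }] },
]);

const reviews = reviewsForTicket(
  ["0001-login:review:1", "0001-login-v2:review:1", "0001-login:implement"],
  "0001-login",
);
```

Because the prefix includes the trailing `:review:` segment, a ticket whose slug merely starts with another ticket's slug is not matched.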
Internal / chore
- GitHub Actions CI returns. `.github/workflows/ci.yml` runs three jobs on PRs and main pushes: typecheck (with the Effect single-version check), lint, and test. The 0.16.0 changelog’s note that CI was managed externally no longer applies.
- `scripts/check-single-effect-version.mjs` guardrail. Scans `pnpm-lock.yaml`, `bun.lock`, and resolved node_modules paths and fails if more than one `effect` version is reachable from the CLI’s dependency tree. Prevents transitive Effect version drift.
- `scripts/check-dependency-boundaries.mjs` guardrail. Parses imports across root scripts, `packages/*`, `apps/*`, and `e2e` with the TypeScript parser and fails when a workspace imports a package it has not declared. Runtime files must declare dependencies in `dependencies`; tests, configs, and scripts can use `devDependencies`. `pnpm test` and CI now run this check via `pnpm check:deps`.
- Test coverage backfill. Ten new or expanded suites covering VCS round-trips against real `jj` repos (dirty/untracked trees, symlinks, multi-MB binaries, shell meta-args), DevTools delta apply edge cases, reconciler/graph end-to-end, CLI signal handling and flag validation, and first-time test suites for the scheduler, driver, sandbox, memory, and time-travel packages.
- `ralph.ts` review-plan-implement helper added at the repo root. Loops `claude` and `codex` until the reviewer returns `LGTM`. Run with `bun ralph.ts`.
- Tickets archived. Completed tickets under `.smithers/tickets/` (gateway reference deployment, CLI JSON stdout, pi-tui dependency, smithers workspace typecheck/AgentLike, root validation gaps, gui devtools snapshot, gui cursor ghost) moved to `.done/`. New hardening tickets 0024–0027 from the 2026-04-25 review are added.
- Misc. Dropped unused `@types/diff` (the `diff` package ships its own types). `.gitignore` now ignores `.claude/`, `.claude/scheduled_tasks.lock`, `.kanban-reports/`, and `tui-buffer.txt`.
Docs
- 18 RPC method reference pages under `docs/rpc/`.
- `docs/contributing/checks.mdx` describes the local `pnpm verify` loop.
- `docs/deployment/reference.mdx` lays out the reference deployment.
- macOS GUI download instructions in `docs/llms-core.txt`.
- `<ExtractPrompt>` and `<LoopUntilScored>` component docs.
- `e2e/README.md` documents the fault matrix layout, per-PR and nightly commands, budget files, and flake-promotion rule.
- Update `docs/integrations/pi-integration.mdx` import paths to `@smithers-orchestrator/pi-plugin`.
- Add `agents.add`/`agents.list`/`agents.remove`/`agents.test` and `chat.create` to the CLI TOON catalog in `docs/cli/overview.mdx`, and document `configDir`/`apiKey` in `docs/integrations/cli-agents.mdx`.
- Add a credentials/accounts user guide.
- Document `<Task hijack>` and `onHijackExit` in `docs/components/task.mdx`, and add `AGENT_CONFIG_INVALID`/`TASK_HIJACK_UNSUPPORTED` to `docs/reference/errors.mdx`.
- Note id-colon encoding in `docs/concepts/memory.mdx`.
- Describe ticket discovery rules in the kanban workflow doc.