Required Checks
Run these checks before promoting a workflow module.

CI (`.github/workflows/ci.yml`) runs `check:effect`, `check:deps`, `typecheck`, and `test` on pull requests and pushes to main. `check:architecture` is a local pre-promotion gate and is not enforced by CI.
Persistence
Use one SQLite database file per deployment and place it on durable storage, or run against PostgreSQL for managed, multi-connection storage (see PostgreSQL and PGlite below). The internal tables are created and migrated idempotently on startup on either backend.

Recommended database practices:

- Back up the database file and its WAL files together.
- Keep `PRAGMA foreign_keys = ON`; Smithers enables it during schema setup and uses referential checks for core run artifacts.
- Keep run IDs stable across resume attempts.
- Use the Gateway event stream sequence numbers for reconnects; clients should resume from the last seen `seq`.
- Avoid manually deleting internal rows. Delete whole runs through supported administrative paths so dependent frames, node diffs, and audit rows stay consistent.
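The reconnect guidance above can be sketched as a small cursor that remembers the last seen `seq` and drops replayed events. This is a minimal illustration, not the Gateway client API: the `GatewayEvent` shape and the `SeqCursor` class are hypothetical, and only the `seq` field comes from the guidance itself.

```typescript
// Hypothetical event shape: only `seq` is taken from the guidance above.
interface GatewayEvent {
  seq: number;
  type: string;
}

// Tracks the highest sequence number seen so a reconnect can resume from it.
class SeqCursor {
  private lastSeq: number | null = null;

  // Returns true when the event is new and should be processed.
  accept(event: GatewayEvent): boolean {
    if (this.lastSeq !== null && event.seq <= this.lastSeq) {
      return false; // duplicate or replayed event after a reconnect
    }
    this.lastSeq = event.seq;
    return true;
  }

  // Value to send when reconnecting; null means nothing has been seen yet.
  resumeFrom(): number | null {
    return this.lastSeq;
  }
}
```

On reconnect, a client would pass `resumeFrom()` to whatever resume parameter the stream accepts and keep calling `accept` so that overlapping events are processed only once.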
PostgreSQL and PGlite
`createSmithersPostgres(schemas, opts)` runs the same durable engine, with the same crash-and-resume guarantees, on PostgreSQL or an embedded PGlite through the SQL dialect seam in `packages/db/src/dialect.js`. Point it at managed Postgres with `{ provider: "postgres", connectionString }`, pass a node-postgres connection config with `{ provider: "postgres", connection }`, or run an in-process PGlite with `{ provider: "pglite", dataDir }`. The factory is async and returns the same `createSmithers` API plus a `close()` teardown.
`pg`, `@electric-sql/pglite`, and `@electric-sql/pglite-socket` are optional dependencies installed only when you take this path; the default synchronous `bun:sqlite` path needs none of them. On Postgres, rely on database-native backups and size the connection pool for your deployment; the SQLite file-and-WAL backup guidance above does not apply.
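One way to choose between the option shapes described above is to derive them from the environment at startup. The sketch below covers two of the three shapes (`connectionString` and PGlite `dataDir`). The `PostgresBackendOptions` type name and both environment variable names are assumptions made for this example and are not part of Smithers.

```typescript
// Mirrors two of the option shapes described above; the type name is hypothetical.
type PostgresBackendOptions =
  | { provider: "postgres"; connectionString: string }
  | { provider: "pglite"; dataDir: string };

// Hypothetical environment variable names, chosen for this sketch only.
// A connection URL takes precedence over a PGlite data directory.
function backendOptionsFromEnv(
  env: Record<string, string | undefined>,
): PostgresBackendOptions {
  const url = env.SMITHERS_DATABASE_URL;
  if (url) {
    return { provider: "postgres", connectionString: url };
  }
  const dataDir = env.SMITHERS_PGLITE_DIR;
  if (dataDir) {
    return { provider: "pglite", dataDir };
  }
  // Fail fast rather than silently picking a backend.
  throw new Error("Set SMITHERS_DATABASE_URL or SMITHERS_PGLITE_DIR");
}
```

The returned object would be passed as the `opts` argument of `createSmithersPostgres(schemas, opts)`. Because the factory is async, await it and call `close()` during shutdown.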
Access Control
Expose the Gateway only behind TLS. Use scoped bearer grants for automation and short TTLs for human-triggered actions.

Recommended scopes by client type:

| Client | Typical scopes |
|---|---|
| Run dashboard | run:read, approval:read |
| Launch automation | run:read, run:write |
| Approval inbox | run:read, approval:read, approval:submit |
| Operator tools | run:read, run:write, approval:submit |
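The table above can be expressed as data plus a simple scope check. This is an illustrative sketch only: the scope strings come from the table, while the `ClientType` keys and the `hasScopes` helper are hypothetical and do not describe how the Gateway enforces grants.

```typescript
type Scope = "run:read" | "run:write" | "approval:read" | "approval:submit";

// Hypothetical identifiers for the four client rows in the table above.
type ClientType = "dashboard" | "launchAutomation" | "approvalInbox" | "operatorTools";

// Scope sets copied from the table above.
const CLIENT_SCOPES: Record<ClientType, readonly Scope[]> = {
  dashboard: ["run:read", "approval:read"],
  launchAutomation: ["run:read", "run:write"],
  approvalInbox: ["run:read", "approval:read", "approval:submit"],
  operatorTools: ["run:read", "run:write", "approval:submit"],
};

// True only when every required scope is present in the grant.
function hasScopes(granted: readonly Scope[], required: readonly Scope[]): boolean {
  const grantedSet = new Set<Scope>(granted);
  return required.every((scope) => grantedSet.has(scope));
}
```

Keeping the mapping in one place makes it easy to review that, for example, launch automation can never submit approvals.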
Execution Boundary
Sandbox workers run in an isolated environment so that untrusted workflow code cannot reach the Gateway database or host filesystem. The concrete controls are:

- request and result bundles are written under the run sandbox directory
- bundle manifests are size-bounded
- patch and artifact paths are checked against path traversal
- produced diffs require review unless `autoAcceptDiffs` is enabled
- sandbox records and events are persisted for audit
Configure `allowNetwork`, container images, environment variables, ports, volumes, and CPU or memory limits per worker. Verify that your chosen runtime (Docker, Kubernetes, etc.) actually enforces these limits before running untrusted code. For high-risk code generation, run sandbox workers in a separate account, namespace, or machine with no ambient production credentials.
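To make the path traversal control concrete, here is a minimal sketch of one common way to validate a relative patch or artifact path before joining it to a sandbox directory. It is not Smithers' implementation. The `normalizeBundlePath` helper is hypothetical, and it checks path strings only, so symlinks inside the sandbox would still need a separate check.

```typescript
// Returns a normalized relative path, or null when the candidate is unsafe.
// Hypothetical helper for illustration; operates on strings only.
function normalizeBundlePath(candidate: string): string | null {
  if (candidate.length === 0 || candidate.includes("\0")) return null;

  // Treat backslashes as separators so Windows-style input cannot slip through.
  const unified = candidate.replace(/\\/g, "/");

  // Reject absolute paths and drive-letter paths outright.
  if (unified.startsWith("/") || /^[A-Za-z]:/.test(unified)) return null;

  const segments: string[] = [];
  for (const segment of unified.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      // Climbing above the sandbox root is a traversal attempt.
      if (segments.length === 0) return null;
      segments.pop();
      continue;
    }
    segments.push(segment);
  }
  return segments.length > 0 ? segments.join("/") : null;
}
```

A caller would reject the whole bundle when any entry returns `null`, rather than skipping the offending path and applying the rest.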
Secrets
Never pass long-lived credentials through workflow input. Prefer short-lived tokens from the caller, scoped environment injection at the worker boundary, or a secret manager mounted only into the worker process that needs it.

Operational rules:

- Do not store provider keys in SQLite rows, run input, task output, or event payloads.
- Redact logs before forwarding them to shared observability sinks.
- Split launch permissions from approval permissions for workflows that can write files, create pull requests, or deploy.
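The log redaction rule above can be sketched as a recursive filter applied to structured log records before they leave the process. The list of sensitive field names and the `redact` helper are assumptions made for this example. A key-based filter like this does not catch secrets embedded in free-text messages, so treat it as one layer rather than a complete control.

```typescript
// Hypothetical list of field names treated as sensitive; adjust to your payloads.
const SENSITIVE_KEYS = new Set([
  "authorization",
  "apikey",
  "api_key",
  "token",
  "password",
  "secret",
]);

// Returns a deep copy with sensitive fields replaced; the input is not mutated.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      // Match case-insensitively so "Authorization" and "apiKey" are caught.
      out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? "[REDACTED]" : redact(inner);
    }
    return out;
  }
  return value;
}
```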
Cache Policy
Use cache policy keys deliberately:

- `scope: "run"` keeps reuse inside one run.
- `scope: "workflow"` shares reuse across runs of the same workflow.
- `scope: "global"` shares reuse across workflow names.
- `ttlMs` bounds staleness; expired cache rows are treated as misses and refreshed.
- `version` should change whenever prompt, model, provider, tool behavior, or output semantics change.
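The following sketch illustrates the semantics of these policy keys. It is not how Smithers derives cache keys internally. The `scope`, `ttlMs`, and `version` names follow the policy keys above, while the context shape, row shape, and both helper functions are hypothetical.

```typescript
type CacheScope = "run" | "workflow" | "global";

// Field names follow the policy keys above; the other shapes are hypothetical.
interface CachePolicy {
  scope: CacheScope;
  ttlMs: number;
  version: string;
}
interface CacheContext {
  runId: string;
  workflowName: string;
  taskKey: string;
}
interface CacheRow {
  createdAtMs: number;
}

// Builds a lookup key whose prefix widens as the scope widens.
// Including the version means a version bump never hits old entries.
function cacheKey(policy: CachePolicy, ctx: CacheContext): string {
  const prefix =
    policy.scope === "run"
      ? `run:${ctx.runId}`
      : policy.scope === "workflow"
        ? `workflow:${ctx.workflowName}`
        : "global";
  return [prefix, ctx.taskKey, `v=${policy.version}`].join("|");
}

// An expired row counts as a miss, so the caller recomputes and refreshes it.
function isFresh(row: CacheRow, policy: CachePolicy, nowMs: number): boolean {
  return nowMs - row.createdAtMs < policy.ttlMs;
}
```

Under this model, two runs of the same workflow share an entry only with `scope: "workflow"` or `scope: "global"`, and changing `version` invalidates reuse without deleting any rows.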
Audit Trail
For incident review, preserve:

- Gateway access logs
- Smithers run events
- rows in `_smithers_time_travel_audit`, which record workflow rewind/replay events
- sandbox bundle metadata and review decisions
- approval decisions, notes, and actor IDs
- deployment version and workflow module revision
Release Checklist
Before a production release:

- CI is green on typecheck, dependency checks, and tests.
- Database backups have been restored in a staging environment.
- Gateway tokens are scoped and have bounded TTLs.
- Sandbox runtime enforcement has been tested against the intended threat model.
- Approval paths have a named owner and a fallback owner (see Approval).
- Logs are retained long enough to investigate delayed workflow failures.