monitor - Smithers

monitor is an operator that watches one running Smithers run. It captures the run’s live state, diagnoses its health, answers the questions an operator always asks (what was each question and its answer, what did every approval gate decide, what did each task output, what code did each task change), optionally applies the smallest safe self-fix, and renders a self-contained HTML report. Reach for it when a run is in flight and you want a clear picture of what it is doing, why it is paused, or where it broke. It is the single-run companion to monitor-smithers, which is a fleet-wide watchdog. Run monitor against one run id; run monitor-smithers to sweep every run for trouble.

smithers monitor                       # the most recent active run
smithers monitor RUN_ID                # a specific run
smithers monitor RUN_ID --autofix      # also let it apply a safe fix (behind an approval gate)
smithers monitor RUN_ID --no-ui        # do not auto-open the run's custom UI

By default smithers monitor also auto-opens the monitored run’s own custom UI in your browser when that workflow has one (monitoring a vcs run pops open the vcs UI, an implement run the implement UI, and so on). It reuses bunx smithers-orchestrator ui under the hood, which starts a local Gateway automatically; a workflow with no UI is simply skipped. Pass --no-ui to suppress it, and note it is skipped for machine output (--format json).

smithers monitor runs the bundled monitor workflow. It accepts every bunx smithers-orchestrator up flag (-d, --hot, --serve, --resume, …) plus the monitor-specific knobs below.

Inputs

| Input | Type | Default | Meaning |
| --- | --- | --- | --- |
| `targetRunId` | string \| null | `null` | Run to monitor. Null auto-selects the most recent active run. The positional `RUN_ID` sets this. |
| `autofix` | boolean | `false` | Let the monitor apply the smallest safe self-fix and resume the run. |
| `requireApproval` | boolean | `true` | With `autofix`, pause for a human approval gate before any fix. |
| `staleMinutes` | number | `15` | A non-terminal run idle past this many minutes is treated as stuck. |
| `title` | string \| null | `null` | Optional report title. |

input.runId is reserved for the monitor run’s own id, so the run being watched is named targetRunId.
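
As a rough illustration of how these inputs and their defaults fit together, the sketch below models them in TypeScript. The names `MonitorInput`, `resolveInput`, and `isStuck` are hypothetical and not part of Smithers, and measuring idleness from the last recorded event is an assumption about how `staleMinutes` is applied.

```typescript
// Hypothetical model of the monitor inputs; field names and defaults
// mirror the Inputs table, everything else is illustrative.
interface MonitorInput {
  targetRunId: string | null;
  autofix: boolean;
  requireApproval: boolean;
  staleMinutes: number;
  title: string | null;
}

const DEFAULTS: MonitorInput = {
  targetRunId: null,
  autofix: false,
  requireApproval: true,
  staleMinutes: 15,
  title: null,
};

// Fill in any input the caller did not provide.
function resolveInput(partial: Partial<MonitorInput> = {}): MonitorInput {
  return { ...DEFAULTS, ...partial };
}

// Assumption: "idle" is the time since the run's last recorded event.
// A terminal run is never stuck; a non-terminal run is stuck only once
// it has been idle for strictly more than staleMinutes.
function isStuck(
  isTerminal: boolean,
  lastEventAtMs: number,
  nowMs: number,
  staleMinutes: number,
): boolean {
  if (isTerminal) return false;
  const idleMinutes = (nowMs - lastEventAtMs) / 60000;
  return idleMinutes > staleMinutes;
}
```

For example, `resolveInput({ autofix: true })` keeps `requireApproval` at `true`, so a fix would still wait behind the approval gate.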

Stages

gather (deterministic, no agent) shells out to the read-only CLI surface and normalizes the result: inspect --format json --full-output for the node tree, why --json for the blocker, recent events --json, scores, and the human inbox. Any failure degrades to a thin snapshot rather than throwing.
diagnose (agents.smartTool) reads the snapshot and uses its own shell tools to read ground truth (output, diff, node, events). It returns a health bucket plus the answers to the operator’s standing questions and a list of concrete recommended actions. Diagnosis is read-only.
approve-fix (<Approval>, only when autofix is on and a self-fix exists) pauses for a human to authorize the repair. Clear it with bunx smithers-orchestrator approve.
fix (agents.smartTool, only when authorized) applies the smallest safe repair (retry a flaky node, resume a stalled run, correct a bad input) and resumes the run. It never approves gates, answers questions, cancels, or takes outward actions on a human’s behalf.
report (agents.smart) renders one self-contained HTML document covering health, questions and answers, approval decisions, task outputs, code diffs, scores, recommended actions, and an operator cheat-sheet.
artifact (deterministic) writes the report to artifacts/monitor/<runId>.html.
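
The gather stage's "degrade rather than throw" behavior can be sketched roughly as follows. This is not the monitor's actual code: the `Snapshot` shape, the `Readers` interface, and the `safely` helper are hypothetical, and the injected reader functions stand in for the real CLI calls so the example stays self-contained.

```typescript
// Hypothetical snapshot shape; the real gather output may differ.
interface Snapshot {
  runId: string;
  thin: boolean; // true when at least one read failed
  inspect: unknown;
  why: unknown;
  events: unknown[];
  errors: string[];
}

// Stand-ins for the read-only CLI calls (inspect, why, events).
interface Readers {
  inspect: () => unknown;
  why: () => unknown;
  events: () => unknown[];
}

// Run one read; on failure record the error and return the fallback.
function safely<T>(label: string, read: () => T, fallback: T, errors: string[]): T {
  try {
    return read();
  } catch (err) {
    errors.push(`${label}: ${err instanceof Error ? err.message : String(err)}`);
    return fallback;
  }
}

// Collect whatever can be read; never throw to the caller.
function gatherSnapshot(runId: string, readers: Readers): Snapshot {
  const errors: string[] = [];
  const inspect = safely<unknown>("inspect", readers.inspect, null, errors);
  const why = safely<unknown>("why", readers.why, null, errors);
  const events = safely<unknown[]>("events", readers.events, [], errors);
  return { runId, thin: errors.length > 0, inspect, why, events, errors };
}
```

Isolating each read means one failing command costs only its own slice of the snapshot, which is one plausible way to end up with a "thin" result instead of an exception.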

By default (autofix off) the monitor is entirely read-only: it observes, diagnoses, and reports without touching the run.
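
The conditions that decide which stages run can be summarized in a small sketch. The function and type names are hypothetical. Two behaviors here are assumptions the text does not confirm: that a fix proceeds without a gate when `requireApproval` is `false`, and that report and artifact still run when a fix is declined.

```typescript
type Stage = "gather" | "diagnose" | "approve-fix" | "fix" | "report" | "artifact";

interface GateState {
  autofix: boolean;
  requireApproval: boolean;
  hasSelfFix: boolean; // did diagnose propose a safe self-fix?
  approved: boolean;   // did a human clear the approval gate?
}

// Returns the stages that would run, in order, for a given state.
function plannedStages(state: GateState): Stage[] {
  const stages: Stage[] = ["gather", "diagnose"];
  const wantsFix = state.autofix && state.hasSelfFix;

  // The gate appears only when a fix is on the table and approval is required.
  if (wantsFix && state.requireApproval) stages.push("approve-fix");

  // Assumption: with requireApproval off, the fix counts as authorized.
  const authorized = wantsFix && (!state.requireApproval || state.approved);
  if (authorized) stages.push("fix");

  // Assumption: the report is produced whether or not a fix ran.
  stages.push("report", "artifact");
  return stages;
}
```

With `autofix` off the result is always gather, diagnose, report, artifact, which matches the read-only default described above.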

How it answers the standing questions

The diagnose step assembles these from real evidence, and the report renders each as its own section. You can also read any of them directly:

What was each question and its answer? Every HumanTask / ask-human / Signal the run raised, with the answer and who gave it. Read it yourself with bunx smithers-orchestrator why RUN_ID (the pending prompt) and bunx smithers-orchestrator human inbox.
What did each approval gate decide? Every <Approval> gate, its decision, the note, and who decided. Read it with bunx smithers-orchestrator events RUN_ID --type approval --json or the gate’s bunx smithers-orchestrator output RUN_ID NODE_ID.
What did a task output? bunx smithers-orchestrator output RUN_ID NODE_ID prints the task’s validated output row as JSON.
What code did a task change? bunx smithers-orchestrator diff RUN_ID NODE_ID prints the unified diff the task wrote to the workspace.
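
If you script these lookups, it can help to keep the question-to-command mapping in one place. The sketch below only assembles argument lists from the commands named above and does not run anything; `commandsFor` and the `Question` type are hypothetical helpers, not part of Smithers.

```typescript
type Question = "questions" | "approvals" | "output" | "diff";

const CLI = ["bunx", "smithers-orchestrator"];

// Build the argv list(s) that answer one standing question.
// output and diff are per-node, so they require a node id.
function commandsFor(question: Question, runId: string, nodeId?: string): string[][] {
  switch (question) {
    case "questions":
      return [
        [...CLI, "why", runId],
        [...CLI, "human", "inbox"],
      ];
    case "approvals":
      return [[...CLI, "events", runId, "--type", "approval", "--json"]];
    case "output":
    case "diff":
      if (!nodeId) throw new Error(`${question} needs a node id`);
      return [[...CLI, question, runId, nodeId]];
  }
}
```

Returning argument arrays rather than one joined string keeps run ids and node ids from being reinterpreted by a shell if you later pass them to a process spawner.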

Watch it live

The report is a point-in-time snapshot. For a live view, drive the run’s own observability surface while the monitor (or the target run) executes:

bunx smithers-orchestrator ps                            # active / paused / recent runs
bunx smithers-orchestrator inspect RUN_ID --watch        # live run state, nodes, gates
bunx smithers-orchestrator logs RUN_ID -f                # tail the event log
bunx smithers-orchestrator events RUN_ID --json          # full event history (NDJSON)
bunx smithers-orchestrator why RUN_ID                    # why it is paused / blocked / failed
bunx smithers-orchestrator scores RUN_ID                 # scorer results
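
Because the event history arrives as NDJSON (one JSON object per line), it is straightforward to post-process. The sketch below parses such text and counts events by type. The only thing it assumes about an event is a string `type` field, which is itself a guess based on the `--type approval` filter; `RunEvent`, `parseNdjson`, and `countByType` are hypothetical names.

```typescript
// Assumed minimal event shape: a string `type` plus arbitrary other fields.
interface RunEvent {
  type: string;
  [key: string]: unknown;
}

// Parse NDJSON text: one JSON value per non-blank line.
// Lines without a string `type` are skipped; malformed JSON throws.
function parseNdjson(text: string): RunEvent[] {
  const events: RunEvent[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    const parsed: unknown = JSON.parse(trimmed);
    if (
      typeof parsed === "object" &&
      parsed !== null &&
      typeof (parsed as { type?: unknown }).type === "string"
    ) {
      events.push(parsed as RunEvent);
    }
  }
  return events;
}

// Tally how many events of each type were seen.
function countByType(events: RunEvent[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const event of events) {
    counts[event.type] = (counts[event.type] ?? 0) + 1;
  }
  return counts;
}
```

You could feed it the captured output of the events command and inspect, say, how many approval events a run has produced so far.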

Custom UI

Two UIs are in play. By default smithers monitor opens the monitored run’s own custom UI (see above). monitor itself also ships a custom Gateway UI (.smithers/ui/monitor.tsx) that visualizes the monitor run: health badge, the questions and answers, approval decisions, task outputs, code diffs, recommended actions, and the rendered HTML report embedded inline. Open it for a monitor run id with:

smithers gateway --port 7331                      # serve every workflow + its UI
bunx smithers-orchestrator ui MONITOR_RUN_ID      # open the monitor's own UI

The UI is built with the gateway-react hooks (useGatewayRunEvents, useGatewayNodeOutput, useGatewayActions). See Custom Workflow UIs to author your own.

Hot mode vs. normal mode

When you iterate on the monitor’s own prompts, run it with --hot:

smithers monitor RUN_ID --hot

Normal mode runs the workflow exactly as written. Editing the source does not affect an in-flight run, and resuming after a source change is blocked because the engine treats the edited source as a different workflow.
Hot mode (--hot) applies edits to prompt wording and task bodies on the next render frame while finished tasks stay persisted. Use it while authoring or tuning. Changing an output schema or a task id’s shape still needs a fresh run, so keep task ids stable and data-derived.
Either way the target run being watched is unaffected. monitor only reads it (and, with autofix, applies the one repair you authorize).
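
To illustrate what "stable and data-derived" task ids can mean in practice, the sketch below contrasts an id built from an array position with one built from the data itself. The `ChangedFile` type, both functions, and the `review` prefix are hypothetical, and whether Smithers accepts these exact characters in a task id is an assumption.

```typescript
interface ChangedFile {
  path: string;
}

// Fragile: the id depends on where the item sits in the list, so
// reordering or inserting items gives existing work a different id.
function indexBasedId(index: number): string {
  return `review-${index}`;
}

// Stable: the id depends only on the item's own data, so the same
// file maps to the same task id no matter how the list is ordered.
function dataDerivedId(file: ChangedFile): string {
  return `review:${file.path}`;
}
```

An id that survives reordering lets already finished tasks keep matching their persisted results while you edit prompts around them.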
