monitor-smithers is a watchdog over your Smithers fleet. It reads the live run
table, sorts every run into a health bucket, and, only when something is wrong,
produces concrete escalation actions (which gate to clear, which run to triage,
which to cancel). Reach for it on a schedule or whenever you want a quick
“is anything stuck?” sweep of in-flight runs.
Stages
- poll: deterministic (no agent). Shells out to `bunx smithers-orchestrator ps --format json`, parses it, and normalises each run into `{ runId, status, ageMinutes, lastEvent }`. Any failure degrades to an empty list rather than throwing.
- classify: a cheap/fast agent sorts the runs into buckets: `healthy`, `stuck`, `blocked`, `failed`, `overBudget`.
- triage: runs only when a non-healthy run exists. A smart agent surfaces the pending approval/question for blocked runs (`bunx smithers-orchestrator why`, `bunx smithers-orchestrator approve`), recommends the `triage-run` workflow for stuck/failed runs, and returns a digest.
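
The poll stage's "parse, normalise, degrade to an empty list" behaviour can be sketched roughly as below. This is a minimal illustration, not the workflow's actual code: the function name `normaliseRuns` and the raw field names (`id`, `status`, `updatedAtMs`, `lastEvent`) are assumptions, since the real shape of the `ps --format json` output is not documented here. Only the target shape `{ runId, status, ageMinutes, lastEvent }` comes from the description above.

```typescript
// Target shape named in the poll stage description.
interface Run {
  runId: string;
  status: string;
  ageMinutes: number;
  lastEvent: string | null;
}

// Hypothetical helper: turns the raw JSON text printed by the CLI into Run[].
// The raw field names (id, status, updatedAtMs, lastEvent) are ASSUMED for
// illustration; adjust them to whatever the CLI actually emits.
function normaliseRuns(rawJson: string, nowMs: number): Run[] {
  try {
    const parsed: unknown = JSON.parse(rawJson);
    if (!Array.isArray(parsed)) return []; // unexpected shape: degrade, don't throw
    return parsed.map((row: any) => {
      const elapsedMs = nowMs - Number(row.updatedAtMs);
      return {
        runId: String(row.id),
        status: String(row.status),
        // Whole minutes since the assumed timestamp; 0 if it is missing or invalid.
        ageMinutes: Number.isFinite(elapsedMs)
          ? Math.max(0, Math.floor(elapsedMs / 60_000))
          : 0,
        lastEvent: typeof row.lastEvent === "string" ? row.lastEvent : null,
      };
    });
  } catch {
    // Malformed JSON or a malformed row: return an empty list rather than throwing.
    return [];
  }
}
```

Returning `[]` on any failure means the later stages simply see no runs when the CLI output is unreadable, so a broken poll never fails the monitor itself.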
Inputs
| Input | Type | Default |
|---|---|---|
| staleMinutes | number | 15 (a run idle past this many minutes is treated as stuck) |
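
As a rough illustration of the `staleMinutes` threshold, the sketch below flags runs whose idle time exceeds it. Classification in this workflow is done by an agent, so this is not how the workflow decides; it only shows the rule the input expresses. The helper name `staleRunIds` is hypothetical, and treating `ageMinutes` as the run's idle time is an assumption.

```typescript
// Minimal run shape for this example; ageMinutes is ASSUMED to be idle time.
interface RunSummary {
  runId: string;
  status: string;
  ageMinutes: number;
}

// Hypothetical helper: ids of runs idle for more than staleMinutes.
// The default of 15 mirrors the documented default of the input.
function staleRunIds(runs: RunSummary[], staleMinutes = 15): string[] {
  return runs
    .filter((run) => run.ageMinutes > staleMinutes) // "past" the threshold, so strictly greater
    .map((run) => run.runId);
}
```

With the default, a run idle for exactly 15 minutes is not yet flagged; one idle for 16 minutes is.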
Monitor it
`digest` and `actions` tell you exactly which runs need a human and what to do first.