Smithers ships a built-in MCP stdio server. When you pass --mcp to the CLI, it speaks the Model Context Protocol over stdin/stdout instead of acting as an interactive CLI. Any MCP-aware client can connect, discover your workflows, start runs, watch progress, resolve approvals, and revert bad attempts, all through structured, machine-readable tool calls.

Use the MCP server when you want an AI agent to drive Smithers autonomously. Use the HTTP Server when you need REST endpoints for human-written code or webhooks.

Setup

Start the server

smithers --mcp
By default this starts the semantic surface — a stable, structured set of tools designed for AI agent consumption. The semantic surface is what this page documents. Two additional surfaces are available via --surface:
# Semantic tools only (default)
smithers --mcp --surface semantic

# Raw CLI-mirroring tools only
smithers --mcp --surface raw

# Both surfaces registered on the same server
smithers --mcp --surface both
Use --surface raw only when you need direct CLI parity. The semantic surface is strongly preferred for new integrations: every tool returns a consistent { ok, data, error } envelope and uses validated Zod schemas for both input and output.

Register with Claude Code

smithers mcp add
smithers mcp add writes the server entry to the appropriate MCP config file for the detected agent. Pass --agent to target a specific client, --no-global to install project-locally, or --command to override the launch command:
smithers mcp add --agent claude-code
smithers mcp add --no-global
smithers mcp add --command "pnpm smithers --mcp"

Register manually

For clients that read a JSON config directly, add an entry like this:
{
  "mcpServers": {
    "smithers": {
      "command": "smithers",
      "args": ["--mcp"]
    }
  }
}
For project-scoped installs (e.g. a monorepo where Smithers is a dev dependency):
{
  "mcpServers": {
    "smithers": {
      "command": "pnpm",
      "args": ["smithers", "--mcp"]
    }
  }
}

Tool Registration

When the server starts it calls registerSemanticTools, which loops over the tool definitions produced by createSemanticToolDefinitions and registers each one via server.registerTool. Every tool carries:
  • inputSchema — a Zod object schema describing accepted parameters.
  • outputSchema — a Zod schema for the structured response envelope.
  • annotations — MCP annotation metadata (readOnlyHint, destructiveHint, idempotentHint, openWorldHint).
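
The registration loop described above can be pictured roughly as follows. This is a sketch, not the Smithers source: ToolDefinition, ToolConfig, ToolServer, and registerAll are hypothetical stand-ins, and the schemas are typed as unknown here where the real code uses Zod objects.

```typescript
// Hypothetical stand-ins for illustration; the real types come from Smithers
// and the MCP SDK, and the real schemas are Zod objects rather than `unknown`.
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
  openWorldHint?: boolean;
}

type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

interface ToolDefinition {
  name: string;
  inputSchema: unknown;
  outputSchema: unknown;
  annotations: ToolAnnotations;
  handler: ToolHandler;
}

interface ToolConfig {
  inputSchema: unknown;
  outputSchema: unknown;
  annotations: ToolAnnotations;
}

interface ToolServer {
  registerTool(name: string, config: ToolConfig, handler: ToolHandler): void;
}

// Register every definition once and return the registered names in order.
function registerAll(server: ToolServer, definitions: ToolDefinition[]): string[] {
  const names: string[] = [];
  for (const def of definitions) {
    server.registerTool(
      def.name,
      {
        inputSchema: def.inputSchema,
        outputSchema: def.outputSchema,
        annotations: def.annotations,
      },
      def.handler,
    );
    names.push(def.name);
  }
  return names;
}
```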

Structured tool envelope

Every tool returns the same top-level shape:
{
  ok: boolean;
  data?: { ... };     // present on success
  error?: {           // present on failure
    code: string;
    message: string;
    details?: Record<string, unknown> | null;
    docsUrl?: string | null;
  };
}
The response is also echoed as a text content block so clients that do not parse structuredContent still receive the JSON payload.
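
A client that only reads the text content block can still recover the envelope by parsing it as JSON. The sketch below shows one way to do that; unwrapEnvelope and SmithersToolError are hypothetical helper names, and the "UNKNOWN" fallback code is an assumption for envelopes that carry neither data nor error.

```typescript
// Envelope shape as documented above; `T` is the tool-specific data payload.
interface ToolError {
  code: string;
  message: string;
  details?: Record<string, unknown> | null;
  docsUrl?: string | null;
}

interface ToolEnvelope<T> {
  ok: boolean;
  data?: T;
  error?: ToolError;
}

// Hypothetical error class for client code; not part of Smithers.
class SmithersToolError extends Error {
  code: string;
  details: Record<string, unknown> | null;

  constructor(code: string, message: string, details: Record<string, unknown> | null) {
    super(`${code}: ${message}`);
    this.name = "SmithersToolError";
    this.code = code;
    this.details = details;
  }
}

// Parse the JSON text block and return `data`, or throw when `ok` is false.
function unwrapEnvelope<T>(jsonText: string): T {
  const envelope = JSON.parse(jsonText) as ToolEnvelope<T>;
  if (!envelope.ok || envelope.data === undefined) {
    // Assumed fallback for a malformed envelope with no error object.
    const err = envelope.error ?? { code: "UNKNOWN", message: "Envelope had no data or error" };
    throw new SmithersToolError(err.code, err.message, err.details ?? null);
  }
  return envelope.data;
}
```

Throwing on ok: false keeps the happy path free of repeated checks, while the error code stays available for branching in a catch block.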

Tool annotations

| Annotation | Tools | Meaning |
| --- | --- | --- |
| readOnlyHint: true | Most query tools | Tool does not modify state |
| readOnlyHint: false, openWorldHint: true | run_workflow | Launches external processes |
| readOnlyHint: false, destructiveHint: true, idempotentHint: false | resolve_approval, revert_attempt | Mutates persisted state irreversibly |
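
A client can use these hints to decide how much supervision a call needs. The policy below is a hypothetical sketch (Smithers does not prescribe one), and the Gate type and gateFor name are made up for illustration. It follows the MCP schema's stated default of treating an unset destructiveHint as true, so tools that are not read-only are gated conservatively.

```typescript
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
  openWorldHint?: boolean;
}

// "auto": call freely; "notify": call and report; "confirm": ask a human first.
type Gate = "auto" | "notify" | "confirm";

// Hypothetical client-side policy derived from the annotation hints.
function gateFor(annotations: ToolAnnotations): Gate {
  if (annotations.readOnlyHint === true) return "auto";
  // An unset destructiveHint is treated as true, the conservative reading.
  const destructive = annotations.destructiveHint ?? true;
  return destructive ? "confirm" : "notify";
}
```

Hints are advisory metadata, so a policy like this is a convenience for the client rather than a security boundary.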

Tool Reference

list_workflows

List all Smithers workflows discovered in the working directory.
Input: none
Output:
{
  workflows: Array<{
    id: string;
    displayName: string;
    entryFile: string;
    sourceType: "seeded" | "user" | "generated";
  }>;
}
Use the returned id values as the workflowId parameter for run_workflow.

run_workflow

Start or resume a discovered workflow.
Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| workflowId | string | required | Workflow ID from list_workflows |
| input | Record<string, unknown> | {} | Workflow input object |
| prompt | string |  | Shorthand: sets input.prompt when input is not provided |
| runId | string | auto | Custom run ID |
| resume | boolean | false | Resume an existing run; requires runId |
| force | boolean | false | Force-start even if a run with this ID already exists |
| waitForTerminal | boolean | false | Block until the run reaches a terminal state |
| waitForStartMs | number | 1000 | For background launches, how long to wait for the run row to appear in the database |
| maxConcurrency | number |  | Max concurrent nodes |
| rootDir | string |  | Root directory for tool sandboxing and path resolution |
| logDir | string |  | Directory for log files |
| allowNetwork | boolean | false | Allow network access in bash tool |
| maxOutputBytes | number |  | Cap on node output size |
| toolTimeoutMs | number |  | Per-tool call timeout |
| hot | boolean | false | Enable hot-reloading of the workflow file |
Output:
{
  workflow: { id, displayName, entryFile, sourceType };
  runId: string;
  launchMode: "background" | "waited";
  requestedResume: boolean;
  status: string;
  observedRun: RunSummary | null;
  result: { runId, status, output?, error? } | null;
}
Background vs. waited launch

By default (waitForTerminal: false) the tool fires the workflow and returns immediately with launchMode: "background". The observedRun field reflects the run state polled during waitForStartMs. Use watch_run to track progress after launch.

Set waitForTerminal: true to block until the workflow finishes. In that case the result field is populated and launchMode is "waited".

Run option forwarding

rootDir, logDir, allowNetwork, maxOutputBytes, toolTimeoutMs, and hot are forwarded verbatim to the engine’s runWorkflow call. They override any values baked into the workflow file.
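
The parameter rules for run_workflow can be mirrored on the client to catch mistakes before a call is made. The sketch below restates three documented rules: prompt only applies when input is absent, resume requires runId, and waitForTerminal selects the launch mode. The function names are hypothetical, and the server still performs its own validation.

```typescript
// Subset of the run_workflow parameters that the rules below depend on.
interface RunWorkflowArgs {
  workflowId: string;
  input?: Record<string, unknown>;
  prompt?: string;
  runId?: string;
  resume?: boolean;
  waitForTerminal?: boolean;
}

// `prompt` is shorthand for input.prompt, used only when `input` is not provided.
function effectiveInput(args: RunWorkflowArgs): Record<string, unknown> {
  if (args.input !== undefined) return args.input;
  return args.prompt !== undefined ? { prompt: args.prompt } : {};
}

// The launch mode the response is documented to report for these arguments.
function expectedLaunchMode(args: RunWorkflowArgs): "background" | "waited" {
  return args.waitForTerminal === true ? "waited" : "background";
}

// Collect problems that would make the call invalid.
function checkRunWorkflowArgs(args: RunWorkflowArgs): string[] {
  const problems: string[] = [];
  if (args.workflowId.trim() === "") problems.push("workflowId is required");
  if (args.resume === true && args.runId === undefined) {
    problems.push("resume requires runId");
  }
  return problems;
}
```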

list_runs

List recent runs with summary data.
Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| limit | number (1–200) | 20 | Max runs to return |
| status | string |  | Filter by status (running, finished, failed, etc.) |
Output:
{
  runs: RunSummary[];
}
RunSummary fields: runId, workflowName, workflowPath, parentRunId, status, createdAtMs, startedAtMs, finishedAtMs, heartbeatAtMs, activeNodeId, activeNodeLabel, pendingApprovalCount, waitingTimers, countsByState.

get_run

Get the full detail record for a specific run, including steps, approvals, timers, loop state, lineage, config, and error.
Input:
| Parameter | Type | Description |
| --- | --- | --- |
| runId | string | Run ID |
Output:
{
  run: RunSummary & {
    steps: Array<{ nodeId, iteration, state, lastAttempt, updatedAtMs, outputTable, label }>;
    approvals: PendingApproval[];
    loops: Array<{ loopId, iteration, maxIterations }>;
    continuedFromRunIds: string[];
    activeDescendantRunId: string | null;
    config: unknown | null;
    error: unknown | null;
  };
}

watch_run

Poll a run at a fixed interval until it reaches a terminal state or a timeout expires.
Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runId | string | required | Run to watch |
| intervalMs | number | 1000 | Poll interval (minimum enforced by runtime) |
| timeoutMs | number | 30000 | Wall-clock budget before giving up |
Output:
{
  runId: string;
  intervalMs: number;
  pollCount: number;
  reachedTerminal: boolean;
  timedOut: boolean;
  finalRun: RunSummary;
  snapshots: Array<{ observedAtMs: number; run: RunSummary }>;
}
When timedOut is true the run is still active; call watch_run again or increase timeoutMs. Terminal statuses are finished, failed, and cancelled.
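
Re-issuing watch_run after a timeout is easy to wrap in a loop. The sketch below assumes a caller-supplied watchRun function that performs the actual tool call and returns the data payload; that function, watchUntilTerminal, and the maxCalls cap are all hypothetical.

```typescript
// Only the fields of the watch_run output that the loop inspects.
interface WatchResult {
  reachedTerminal: boolean;
  timedOut: boolean;
  finalRun: { status: string };
}

// Hypothetical adapter around the client's tool-call mechanism.
type WatchRun = (args: { runId: string; timeoutMs: number }) => Promise<WatchResult>;

// Call watch_run repeatedly until the run is terminal or the call budget is spent.
async function watchUntilTerminal(
  watchRun: WatchRun,
  runId: string,
  timeoutMs: number = 30_000,
  maxCalls: number = 10,
): Promise<WatchResult> {
  for (let call = 0; call < maxCalls; call++) {
    const result = await watchRun({ runId, timeoutMs });
    if (result.reachedTerminal) return result;
    // Not terminal yet: the run is still active, so watch again.
  }
  throw new Error(`Run ${runId} did not reach a terminal state after ${maxCalls} watch_run calls`);
}
```

Capping the number of calls keeps an agent from watching a stuck run forever; explain_run is the better next step once the cap is hit.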

explain_run

Return a structured diagnosis explaining why a run is blocked, waiting, or stale.
Input:
| Parameter | Type | Description |
| --- | --- | --- |
| runId | string | Run ID |
Output:
{
  diagnosis: {
    runId: string;
    status: string;
    summary: string;
    generatedAtMs: number;
    blockers: Array<{
      kind: string;
      nodeId: string;
      iteration: number | null;
      reason: string;
      waitingSince: number;
      unblocker: string;
      context?: string;
      signalName?: string | null;
      dependencyNodeId?: string | null;
      firesAtMs?: number | null;
      remainingMs?: number | null;
      attempt?: number | null;
      maxAttempts?: number | null;
    }>;
    currentNodeId: string | null;
  };
}
The summary field is a human-readable sentence. blockers lists every node currently preventing progress, with unblocker describing what action or event would unblock it.
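
For logs or status messages, the blockers array can be flattened into one line per blocker. The sketch below is illustrative: describeBlockers is a hypothetical helper, and it assumes waitingSince is an epoch timestamp in milliseconds, which the schema above does not state explicitly.

```typescript
// Subset of the blocker fields used for the summary line.
interface Blocker {
  kind: string;
  nodeId: string;
  iteration: number | null;
  reason: string;
  waitingSince: number; // assumed to be epoch milliseconds
  unblocker: string;
}

// Produce one human-readable line per blocker, including how long it has waited.
function describeBlockers(blockers: Blocker[], nowMs: number): string[] {
  return blockers.map((b) => {
    const waitedSec = Math.max(0, Math.round((nowMs - b.waitingSince) / 1000));
    const where = b.iteration === null ? b.nodeId : `${b.nodeId}#${b.iteration}`;
    return `[${b.kind}] ${where}: ${b.reason} Waiting ${waitedSec}s. Next step: ${b.unblocker}`;
  });
}
```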

list_pending_approvals

List approvals that are waiting for a human decision, optionally filtered by run, workflow, or node.
Input:
| Parameter | Type | Description |
| --- | --- | --- |
| runId | string | Filter by run ID |
| workflowName | string | Filter by workflow name |
| nodeId | string | Filter by node ID |
All parameters are optional. Omit all to list every pending approval across all runs.
Output:
{
  approvals: Array<{
    runId: string;
    nodeId: string;
    iteration: number;
    status: string;
    requestedAtMs: number | null;
    decidedAtMs: number | null;
    note: string | null;
    decidedBy: string | null;
    request: unknown;
    decision: unknown;
    autoApproved?: boolean;
    workflowName: string | null;
    runStatus: string | null;
    nodeLabel: string | null;
  }>;
}

resolve_approval

Approve or deny a pending approval. This tool is destructive and non-idempotent.
Input:
| Parameter | Type | Description |
| --- | --- | --- |
| action | "approve" or "deny" | Required. Decision to record |
| runId | string | Filter to a specific run |
| workflowName | string | Filter by workflow name |
| nodeId | string | Filter by node ID |
| iteration | number | Filter by loop iteration |
| note | string | Optional note to record with the decision |
| decidedBy | string | Identity of the decision-maker |
| decision | unknown | Structured decision payload passed back to the workflow |

Ambiguity guard

If the filters match zero approvals, the tool errors with INVALID_INPUT. If the filters match more than one approval, the tool errors with INVALID_INPUT and returns the list of matches in details.matches; add runId, nodeId, or iteration to narrow the selection. The tool never guesses when multiple approvals match.

Output:
{
  action: "approve" | "deny";
  approval: PendingApproval;   // with updated status, decidedAtMs, note, decidedBy
  run: RunSummary | null;
}
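
When the ambiguity guard fires, a client can read the candidates out of details.matches and retry with tighter filters. The sketch below is hedged on one point the page does not specify: it assumes each entry in details.matches carries runId, nodeId, and iteration, as the approval objects elsewhere on this page do. ApprovalRef and ambiguousMatches are hypothetical names.

```typescript
interface ToolError {
  code: string;
  message: string;
  details?: Record<string, unknown> | null;
}

// Assumed minimal shape of each entry in details.matches.
interface ApprovalRef {
  runId: string;
  nodeId: string;
  iteration: number;
}

function isApprovalRef(value: unknown): value is ApprovalRef {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["runId"] === "string" &&
    typeof v["nodeId"] === "string" &&
    typeof v["iteration"] === "number"
  );
}

// Return the candidate approvals from an ambiguous INVALID_INPUT error,
// or an empty list for any other error (including the zero-match case).
function ambiguousMatches(error: ToolError): ApprovalRef[] {
  if (error.code !== "INVALID_INPUT") return [];
  const matches: unknown = error.details?.["matches"];
  return Array.isArray(matches) ? matches.filter(isApprovalRef) : [];
}
```

An agent can present the returned candidates to a human, or re-call resolve_approval with the runId, nodeId, and iteration of the intended entry.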

get_node_detail

Get enriched detail for a single node, including all attempts, tool calls, token usage, scorer results, and validated output.
Input:
| Parameter | Type | Description |
| --- | --- | --- |
| runId | string | required |
| nodeId | string | required |
| iteration | number | Loop iteration (default: latest) |
Output:
{
  detail: {
    node: { runId, nodeId, iteration, state, lastAttempt, updatedAtMs, outputTable, label };
    status: string;
    durationMs: number | null;
    attemptsSummary: { total, failed, cancelled, succeeded, waiting };
    attempts: unknown[];
    toolCalls: unknown[];
    tokenUsage: unknown;
    scorers: unknown[];
    output: {
      validated: unknown | null;
      raw: unknown | null;
      source: "cache" | "output-table" | "none";
      cacheKey: string | null;
    };
    limits: {
      toolPayloadBytesHuman: number;
      validatedOutputBytesHuman: number;
    };
  };
}

revert_attempt

Revert the workspace and frame history back to the state captured at a specific attempt. This is destructive and non-idempotent.
Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runId | string | required | Run containing the node |
| nodeId | string | required | Node to revert |
| iteration | number | 0 | Loop iteration |
| attempt | number | required | Attempt number to revert to (must be ≥ 1) |
Output:
{
  runId: string;
  nodeId: string;
  iteration: number;
  attempt: number;
  success: boolean;
  error?: string;
  jjPointer?: string;
  run: RunSummary | null;
}
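
Because revert_attempt is destructive, it can help to check the arguments and the result explicitly on the client. The sketch below applies the documented default (iteration 0) and the attempt ≥ 1 rule; requiring attempt to be an integer is an added assumption, and both function names are hypothetical. Note that the output carries its own success flag in addition to the envelope's ok.

```typescript
interface RevertArgs {
  runId: string;
  nodeId: string;
  iteration?: number;
  attempt: number;
}

// Fill in the documented default and reject attempt numbers below 1.
function normalizeRevertArgs(args: RevertArgs): Required<RevertArgs> {
  // Integer check is an assumption; the docs only state attempt must be >= 1.
  if (!Number.isInteger(args.attempt) || args.attempt < 1) {
    throw new RangeError("attempt must be an integer >= 1");
  }
  return {
    runId: args.runId,
    nodeId: args.nodeId,
    iteration: args.iteration ?? 0,
    attempt: args.attempt,
  };
}

// Subset of the revert_attempt output used to detect a failed revert.
interface RevertResult {
  success: boolean;
  error?: string;
}

// Return a failure message, or null when the revert succeeded.
function revertFailure(result: RevertResult): string | null {
  if (result.success) return null;
  return result.error ?? "revert failed without an error message";
}
```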

list_artifacts

List structured output artifacts produced by nodes in a run.
Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runId | string | required | Run ID |
| nodeId | string |  | Limit to a specific node |
| includeRaw | boolean | false | Include raw (pre-validation) output values |
Output:
{
  artifacts: Array<{
    artifactId: string;   // "<runId>:<nodeId>:<iteration>"
    kind: "node-output";
    runId: string;
    nodeId: string;
    iteration: number;
    label: string | null;
    state: string;
    outputTable: string | null;
    source: "cache" | "output-table" | "none";
    cacheKey: string | null;
    value: unknown | null;
    rawValue?: unknown | null;   // only when includeRaw=true
  }>;
}
Only nodes that have an outputTable and a non-none output source are included.
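
The inclusion rule and the artifactId format can be written down as two small functions. This is a client-side restatement of the documented behavior for illustration, not the server's code; NodeOutputInfo, isListedArtifact, and artifactIdFor are hypothetical names.

```typescript
type OutputSource = "cache" | "output-table" | "none";

// The node fields that decide whether an artifact is listed and how it is named.
interface NodeOutputInfo {
  runId: string;
  nodeId: string;
  iteration: number;
  outputTable: string | null;
  source: OutputSource;
}

// A node appears in list_artifacts only with an outputTable and a non-none source.
function isListedArtifact(node: NodeOutputInfo): boolean {
  return node.outputTable !== null && node.source !== "none";
}

// artifactId follows the "<runId>:<nodeId>:<iteration>" pattern.
function artifactIdFor(node: NodeOutputInfo): string {
  return `${node.runId}:${node.nodeId}:${node.iteration}`;
}
```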

get_chat_transcript

Return the structured agent chat transcript for a run, grouped by attempts.
Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runId | string | required | Run ID |
| all | boolean | false | Include all attempts, not just those with known output events |
| includeStderr | boolean | true | Include stderr messages |
| tail | number |  | Return only the last N messages |
Output:
{
  runId: string;
  attempts: Array<{
    attemptKey: string;
    nodeId: string;
    iteration: number;
    attempt: number;
    state: string;
    startedAtMs: number;
    finishedAtMs: number | null;
    cached: boolean;
    meta: unknown | null;
  }>;
  messages: Array<{
    id: string;
    attemptKey: string;
    nodeId: string;
    iteration: number;
    attempt: number;
    role: "user" | "assistant" | "stderr";
    stream: "stdout" | "stderr" | null;
    timestampMs: number;
    text: string;
    source: "prompt" | "event" | "responseText";
  }>;
}
Messages are sorted by timestampMs. Use tail to limit context window usage when transcripts are long.
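
Since messages arrive as one flat, time-ordered list, a client that wants a per-attempt view can bucket them by attemptKey. The sketch below is one way to do that; groupByAttempt is a hypothetical helper, and it relies on the documented timestampMs ordering being preserved within each bucket.

```typescript
// Subset of the message fields needed for grouping.
interface TranscriptMessage {
  attemptKey: string;
  role: "user" | "assistant" | "stderr";
  timestampMs: number;
  text: string;
}

// Bucket messages by attemptKey, keeping their original (time-sorted) order.
function groupByAttempt(messages: TranscriptMessage[]): Map<string, TranscriptMessage[]> {
  const groups = new Map<string, TranscriptMessage[]>();
  for (const message of messages) {
    const bucket = groups.get(message.attemptKey);
    if (bucket) {
      bucket.push(message);
    } else {
      groups.set(message.attemptKey, [message]);
    }
  }
  return groups;
}
```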

get_run_events

Return the raw structured event history for a run with optional filtering.
Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runId | string | required | Run ID |
| afterSeq | number |  | Only events with seq greater than this value |
| limit | number (1–10000) | 200 | Max events to return |
| nodeId | string |  | Filter to events for a specific node |
| types | string[] |  | Filter to specific event types (e.g. ["NodeFinished", "NodeFailed"]) |
| sinceTimestampMs | number |  | Only events at or after this timestamp |
Output:
{
  runId: string;
  events: Array<{
    runId: string;
    seq: number;
    timestampMs: number;
    type: string;
    payload: unknown | null;
  }>;
}
Paginate using afterSeq: pass the seq of the last received event to fetch the next page.
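
The afterSeq cursor lends itself to a simple fetch-all loop. The sketch below assumes a caller-supplied getRunEvents function that performs the tool call and returns the data payload, and it assumes events within a page arrive in ascending seq order; both that function and fetchAllEvents are hypothetical.

```typescript
// Subset of the event fields; runId, timestampMs, and payload are omitted here.
interface RunEvent {
  seq: number;
  type: string;
}

// Hypothetical adapter around the client's tool-call mechanism.
type GetRunEvents = (args: {
  runId: string;
  afterSeq?: number | undefined;
  limit: number;
}) => Promise<{ events: RunEvent[] }>;

// Fetch every event for a run by advancing afterSeq page by page.
async function fetchAllEvents(
  getRunEvents: GetRunEvents,
  runId: string,
  pageSize: number = 200,
): Promise<RunEvent[]> {
  const all: RunEvent[] = [];
  let afterSeq: number | undefined;
  for (;;) {
    const page = await getRunEvents({ runId, afterSeq, limit: pageSize });
    const last = page.events[page.events.length - 1];
    if (last === undefined) break; // empty page: nothing left
    all.push(...page.events);
    afterSeq = last.seq; // cursor is the seq of the last received event
    if (page.events.length < pageSize) break; // short page: reached the end
  }
  return all;
}
```

For a run that is still active, keeping the final afterSeq around lets a later call pick up only the events recorded since.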

Usage Examples

List workflows and start a run

> list_workflows {}

{
  "ok": true,
  "data": {
    "workflows": [
      { "id": "bugfix", "displayName": "bugfix", "entryFile": "./workflows/bugfix.tsx", "sourceType": "user" }
    ]
  }
}

> run_workflow { "workflowId": "bugfix", "prompt": "Fix the auth token expiry bug" }

{
  "ok": true,
  "data": {
    "runId": "smi_abc123",
    "launchMode": "background",
    "status": "running",
    ...
  }
}

Watch until complete

> watch_run { "runId": "smi_abc123", "timeoutMs": 120000 }

{
  "ok": true,
  "data": {
    "reachedTerminal": true,
    "timedOut": false,
    "finalRun": { "status": "waiting-approval", ... }
  }
}

Resolve a pending approval

> list_pending_approvals { "runId": "smi_abc123" }

{
  "ok": true,
  "data": {
    "approvals": [
      { "nodeId": "deploy", "iteration": 0, "nodeLabel": "Deploy to production", ... }
    ]
  }
}

> resolve_approval { "action": "approve", "runId": "smi_abc123", "nodeId": "deploy", "decidedBy": "alice", "note": "Looks good" }

{
  "ok": true,
  "data": {
    "action": "approve",
    "approval": { "status": "approved", "decidedAtMs": 1707500100000, ... },
    "run": { "status": "running", ... }
  }
}

Debug a blocked run

> explain_run { "runId": "smi_abc123" }

{
  "ok": true,
  "data": {
    "diagnosis": {
      "summary": "Run is waiting for a human approval on node 'deploy'.",
      "blockers": [
        {
          "kind": "approval",
          "nodeId": "deploy",
          "reason": "Node requires human approval before proceeding.",
          "unblocker": "Call resolve_approval with action=approve or action=deny."
        }
      ]
    }
  }
}

Revert a failed attempt

> get_node_detail { "runId": "smi_abc123", "nodeId": "analyze" }

{
  "ok": true,
  "data": {
    "detail": {
      "attemptsSummary": { "total": 3, "failed": 2, "succeeded": 1 },
      ...
    }
  }
}

> revert_attempt { "runId": "smi_abc123", "nodeId": "analyze", "attempt": 1 }

{
  "ok": true,
  "data": {
    "success": true,
    "run": { "status": "running", ... }
  }
}

Error Codes

All errors follow the structured envelope. Common codes:
| Code | Meaning |
| --- | --- |
| RUN_NOT_FOUND | No run exists with the given ID |
| INVALID_INPUT | Missing required field, failed validation, or ambiguous approval filter |
| WORKFLOW_MISSING_DEFAULT | Workflow file has no default export |
| WORKFLOW_NOT_FOUND | No workflow matches the given ID |