MCP Server - Smithers

Smithers ships a built-in MCP stdio server. When started with --mcp, the CLI speaks the Model Context Protocol over stdin/stdout instead of running interactively. Any MCP-aware client can connect, discover workflows, start runs, watch progress, resolve approvals, and revert bad attempts through structured tool calls. Use the MCP server when an AI agent should drive Smithers autonomously. Use the HTTP Server when human-written code or webhooks need REST endpoints.

Setup

Start the server

smithers --mcp

This starts the semantic surface: a stable, structured tool set designed for AI agents and documented on this page. The --surface flag selects which tool surface the server registers:

# Semantic tools only (default)
smithers --mcp --surface semantic

# Raw CLI-mirroring tools only
smithers --mcp --surface raw

# Both surfaces registered on the same server
smithers --mcp --surface both

Use --surface raw only for direct CLI parity. Prefer the semantic surface for new integrations: every tool returns a { ok, data, error } envelope with Zod-validated input and output schemas.

Register with Claude Code

smithers mcp add

smithers mcp add writes the server entry to the MCP config file for the detected agent. Pass --agent to target a specific client, --no-global to install project-locally, or --command to override the launch command:

smithers mcp add --agent claude-code
smithers mcp add --no-global
smithers mcp add --command "pnpm smithers --mcp"

Register manually

For clients that read JSON config directly:

{
  "mcpServers": {
    "smithers": {
      "command": "smithers",
      "args": ["--mcp"]
    }
  }
}

Project-scoped install (e.g. a monorepo where Smithers is a dev dependency):

{
  "mcpServers": {
    "smithers": {
      "command": "pnpm",
      "args": ["smithers", "--mcp"]
    }
  }
}

Tool Registration

On start, the server calls registerSemanticTools, which loops over createSemanticToolDefinitions and registers each via server.registerTool. Every tool carries:

inputSchema — Zod object describing accepted parameters.
outputSchema — Zod schema for the structured response envelope.
annotations — MCP annotation metadata (readOnlyHint, destructiveHint, idempotentHint, openWorldHint).

Structured tool envelope

Every tool returns the same shape:

{
  ok: boolean;
  data?: { ... };     // present on success
  error?: {           // present on failure
    code: string;
    message: string;
    details?: Record<string, unknown> | null;
    docsUrl?: string | null;
  };
}

The response is also echoed as a text content block, so clients that do not parse structuredContent still receive the JSON payload.
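A client can narrow the envelope with a small helper. The sketch below is illustrative only: ToolEnvelope, ToolError, unwrap, and parseTextEnvelope are hypothetical client-side names, not part of Smithers. Only the field names are taken from the shape above.

```typescript
// Hypothetical client-side types mirroring the envelope shape documented above.
interface ToolError {
  code: string;
  message: string;
  details?: Record<string, unknown> | null;
  docsUrl?: string | null;
}

interface ToolEnvelope<T> {
  ok: boolean;
  data?: T;
  error?: ToolError;
}

// Return `data` on success; otherwise throw an Error that carries the error code.
function unwrap<T>(envelope: ToolEnvelope<T>): T {
  if (envelope.ok && envelope.data !== undefined) {
    return envelope.data;
  }
  const code = envelope.error?.code ?? "UNKNOWN";
  const message = envelope.error?.message ?? "tool call failed without an error body";
  throw new Error(`${code}: ${message}`);
}

// Fallback for clients that only read the text content block:
// the text is assumed to be the same envelope serialized as JSON.
function parseTextEnvelope<T>(text: string): ToolEnvelope<T> {
  return JSON.parse(text) as ToolEnvelope<T>;
}
```

Throwing on ok: false keeps calling code on the success path and surfaces the code field, which is what a caller usually branches on.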

Tool annotations

Annotation	Tools	Meaning
`readOnlyHint: true`	Most query tools	Tool does not modify state
`readOnlyHint: false, openWorldHint: true`	`run_workflow`	Launches external processes
`readOnlyHint: false, destructiveHint: true, idempotentHint: false`	`resolve_approval`, `revert_attempt`	Mutates persisted state irreversibly

Tool Reference

list_workflows

List all Smithers workflows discovered in the working directory.

Input: none

Output:

{
  workflows: Array<{
    id: string;
    metadataVersion: number;
    displayName: string;
    entryFile: string;
    sourceType: string;
    description: string;
    tags: string[];
    aliases: string[];
  }>;
}

Use the returned id values as the workflowId parameter for run_workflow.

run_workflow

Start or resume a discovered workflow. Input:

Parameter	Type	Default	Description
`workflowId`	`string`	required	Workflow ID from `list_workflows`
`input`	`Record<string, unknown>`	`{}`	Workflow input object
`prompt`	`string`	—	Shorthand: sets `input.prompt` when `input` is not provided
`runId`	`string`	auto	Custom run ID
`resume`	`boolean`	`false`	Resume an existing run; requires `runId`
`force`	`boolean`	`false`	Force-start even if a run with this ID already exists
`waitForTerminal`	`boolean`	`false`	Block until the run reaches a terminal state
`waitForStartMs`	`number`	`1000`	For background launches, how long to wait for the run row to appear in the database
`maxConcurrency`	`number`	—	Max concurrent nodes
`rootDir`	`string`	—	Root directory for tool sandboxing and path resolution
`logDir`	`string`	—	Directory for log files
`allowNetwork`	`boolean`	`false`	Allow network access in `bash` tool
`maxOutputBytes`	`number`	—	Cap on node output size
`toolTimeoutMs`	`number`	—	Per-tool call timeout
`hot`	`boolean`	`false`	Enable hot-reloading of the workflow file

Output:

{
  workflow: {
    id: string;
    metadataVersion: number;
    displayName: string;
    entryFile: string;
    sourceType: string;
    description: string;
    tags: string[];
    aliases: string[];
  };
  runId: string;
  launchMode: "background" | "waited";
  requestedResume: boolean;
  status: string;
  observedRun: RunSummary | null;
  result: { runId, status, output?, error? } | null;
}

Background vs. waited launch

By default (waitForTerminal: false) the tool fires the workflow and returns immediately with launchMode: "background". observedRun reflects the run state polled during waitForStartMs. Use watch_run to track progress. Set waitForTerminal: true to block until the workflow finishes; result is then populated and launchMode is "waited".

Run option forwarding

rootDir, logDir, allowNetwork, maxOutputBytes, toolTimeoutMs, and hot are forwarded verbatim to runWorkflow. They override values baked into the workflow file.
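The two launch modes imply different follow-up steps on the client. The sketch below shows one way to branch on launchMode. NextStep and nextStep are hypothetical names, and RunWorkflowData is only a subset of the documented output.

```typescript
// Subset of the run_workflow output documented above.
interface RunWorkflowData {
  runId: string;
  launchMode: "background" | "waited";
  status: string;
  result: { runId: string; status: string; output?: unknown; error?: unknown } | null;
}

// Hypothetical description of what the client should do next.
type NextStep =
  | { kind: "watch"; tool: "watch_run"; arguments: { runId: string } }
  | { kind: "done"; status: string; output: unknown };

// A waited launch already carries the result; a background launch
// only carries a runId, so the client follows up with watch_run.
function nextStep(data: RunWorkflowData): NextStep {
  if (data.launchMode === "waited" && data.result !== null) {
    return { kind: "done", status: data.result.status, output: data.result.output ?? null };
  }
  return { kind: "watch", tool: "watch_run", arguments: { runId: data.runId } };
}
```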

list_runs

List recent runs with summary data. Input:

Parameter	Type	Default	Description
`limit`	`number` (1–200)	`20`	Max runs to return
`status`	`string`	—	Filter by status (`running`, `finished`, `failed`, etc.)

Output:

{
  runs: RunSummary[];
}

RunSummary fields: runId, workflowName, workflowPath, parentRunId, status, createdAtMs, startedAtMs, finishedAtMs, heartbeatAtMs, activeNodeId, activeNodeLabel, pendingApprovalCount, waitingTimers, countsByState.

get_run

Get the full detail record for a specific run, including steps, approvals, timers, loop state, lineage, config, and error. Input:

Parameter	Type	Description
`runId`	`string`	Run ID

Output:

{
  run: RunSummary & {
    steps: Array<{ nodeId, iteration, state, lastAttempt, updatedAtMs, outputTable, label }>;
    approvals: PendingApproval[];
    loops: Array<{ loopId, iteration, maxIterations }>;
    continuedFromRunIds: string[];
    activeDescendantRunId: string | null;
    config: unknown | null;
    error: unknown | null;
  };
}

watch_run

Poll a run at a fixed interval until it reaches a terminal state or a timeout expires. Input:

Parameter	Type	Default	Description
`runId`	`string`	required	Run to watch
`intervalMs`	`number`	`1000`	Poll interval (minimum enforced by runtime)
`timeoutMs`	`number`	`30000`	Wall-clock budget before giving up

Output:

{
  runId: string;
  intervalMs: number;
  pollCount: number;
  reachedTerminal: boolean;
  timedOut: boolean;
  finalRun: RunSummary;
  snapshots: Array<{ observedAtMs: number; run: RunSummary }>;
}

When timedOut is true the run is still active — call watch_run again or raise timeoutMs. Terminal statuses: finished, failed, cancelled.
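Re-issuing watch_run after a timeout can be wrapped in a small loop. This is a minimal sketch: WatchRun stands in for whatever function the client uses to invoke the tool and return its data, and watchUntilTerminal and maxCalls are hypothetical names introduced here.

```typescript
// Subset of the watch_run output documented above.
interface WatchRunData {
  runId: string;
  reachedTerminal: boolean;
  timedOut: boolean;
  finalRun: { status: string };
}

// Hypothetical transport: invokes watch_run and resolves with its `data`.
type WatchRun = (args: { runId: string; timeoutMs: number }) => Promise<WatchRunData>;

// Call watch_run repeatedly until the run is terminal or maxCalls is spent.
// Returns the last observation either way so the caller can inspect it.
async function watchUntilTerminal(
  watchRun: WatchRun,
  runId: string,
  timeoutMs = 30_000,
  maxCalls = 10,
): Promise<WatchRunData> {
  let last: WatchRunData | undefined;
  for (let call = 0; call < maxCalls; call++) {
    last = await watchRun({ runId, timeoutMs });
    if (last.reachedTerminal) return last;
  }
  if (last === undefined) throw new Error("maxCalls must be at least 1");
  return last;
}
```

Capping the number of calls keeps an agent from waiting forever on a run that is blocked on an approval or timer; explain_run is the better tool at that point.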

explain_run

Return a structured diagnosis explaining why a run is blocked, waiting, or stale. Input:

Parameter	Type	Description
`runId`	`string`	Run ID

Output:

{
  diagnosis: {
    runId: string;
    status: string;
    summary: string;
    generatedAtMs: number;
    blockers: Array<{
      kind: string;
      nodeId: string;
      iteration: number | null;
      reason: string;
      waitingSince: number;
      unblocker: string;
      context?: string;
      signalName?: string | null;
      dependencyNodeId?: string | null;
      firesAtMs?: number | null;
      remainingMs?: number | null;
      attempt?: number | null;
      maxAttempts?: number | null;
    }>;
    currentNodeId: string | null;
  };
}

summary is a human-readable sentence. blockers lists every node preventing progress; unblocker describes what action or event would unblock it.

list_pending_approvals

List approvals that are waiting for a human decision, optionally filtered by run, workflow, or node. Input:

Parameter	Type	Description
`runId`	`string`	Filter by run ID
`workflowName`	`string`	Filter by workflow name
`nodeId`	`string`	Filter by node ID

All parameters are optional. Omit all of them to list every pending approval across all runs. Output:

{
  approvals: Array<{
    runId: string;
    nodeId: string;
    iteration: number;
    status: string;
    requestedAtMs: number | null;
    decidedAtMs: number | null;
    note: string | null;
    decidedBy: string | null;
    request: unknown;
    decision: unknown;
    autoApproved?: boolean;
    workflowName: string | null;
    runStatus: string | null;
    nodeLabel: string | null;
  }>;
}

resolve_approval

Approve or deny a pending approval. This tool is destructive and non-idempotent. Input:

Parameter	Type	Description
`action`	`"approve" \| "deny"`	required — decision to record
`runId`	`string`	Filter to a specific run
`workflowName`	`string`	Filter by workflow name
`nodeId`	`string`	Filter by node ID
`iteration`	`number`	Filter by loop iteration
`note`	`string`	Optional note to record with the decision
`decidedBy`	`string`	Identity of the decision-maker
`decision`	`unknown`	Structured decision payload passed back to the workflow

Ambiguity guard

Zero matches fail with INVALID_INPUT. More than one match also fails with INVALID_INPUT and returns the candidates in details.matches; add runId, nodeId, or iteration to narrow the selection. The tool never guesses when multiple approvals match.

Output:

{
  action: "approve" | "deny";
  approval: PendingApproval;   // with updated status, decidedAtMs, note, decidedBy
  run: RunSummary | null;
}
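The ambiguity guard described for this tool can be pictured as a filter followed by a count check. The code below is an illustrative re-implementation, not Smithers source: ApprovalKey, ApprovalFilter, Selection, and selectApproval are hypothetical names, and the error messages are invented for the example.

```typescript
// Fields of a pending approval that the documented filters can match on.
interface ApprovalKey {
  runId: string;
  nodeId: string;
  iteration: number;
  workflowName: string | null;
}

type ApprovalFilter = Partial<ApprovalKey>;

type Selection =
  | { ok: true; approval: ApprovalKey }
  | {
      ok: false;
      error: { code: "INVALID_INPUT"; message: string; details: { matches: ApprovalKey[] } | null };
    };

// Exactly one match succeeds; zero or several matches fail with INVALID_INPUT.
function selectApproval(pending: ApprovalKey[], filter: ApprovalFilter): Selection {
  const matches = pending.filter(
    (a) =>
      (filter.runId === undefined || a.runId === filter.runId) &&
      (filter.workflowName === undefined || a.workflowName === filter.workflowName) &&
      (filter.nodeId === undefined || a.nodeId === filter.nodeId) &&
      (filter.iteration === undefined || a.iteration === filter.iteration),
  );
  const [only] = matches;
  if (matches.length === 1 && only !== undefined) {
    return { ok: true, approval: only };
  }
  if (matches.length === 0) {
    return {
      ok: false,
      error: { code: "INVALID_INPUT", message: "No pending approval matches the filter", details: null },
    };
  }
  // Several candidates: report them instead of guessing.
  return {
    ok: false,
    error: { code: "INVALID_INPUT", message: "Multiple pending approvals match", details: { matches } },
  };
}
```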

get_node_detail

Get enriched detail for a single node, including all attempts, tool calls, token usage, scorer results, and validated output. Input:

Parameter	Type	Description
`runId`	`string`	required
`nodeId`	`string`	required
`iteration`	`number`	Loop iteration (default: latest)

Output:

{
  detail: {
    node: { runId, nodeId, iteration, state, lastAttempt, updatedAtMs, outputTable, label };
    status: string;
    durationMs: number | null;
    attemptsSummary: { total, failed, cancelled, succeeded, waiting };
    attempts: unknown[];
    toolCalls: unknown[];
    tokenUsage: unknown;
    scorers: unknown[];
    output: {
      validated: unknown | null;
      raw: unknown | null;
      source: "cache" | "output-table" | "none";
      cacheKey: string | null;
    };
    limits: {
      toolPayloadBytesHuman: number;
      validatedOutputBytesHuman: number;
    };
  };
}

revert_attempt

Revert the workspace and frame history back to the state captured at a specific attempt. This is destructive and non-idempotent. Input:

Parameter	Type	Default	Description
`runId`	`string`	required	Run containing the node
`nodeId`	`string`	required	Node to revert
`iteration`	`number`	`0`	Loop iteration
`attempt`	`number`	required	Attempt number to revert to (must be ≥ 1)

Output:

{
  runId: string;
  nodeId: string;
  iteration: number;
  attempt: number;
  success: boolean;
  error?: string;
  jjPointer?: string;
  run: RunSummary | null;
}

list_artifacts

List structured output artifacts produced by nodes in a run. Input:

Parameter	Type	Default	Description
`runId`	`string`	required	Run ID
`nodeId`	`string`	—	Limit to a specific node
`includeRaw`	`boolean`	`false`	Include raw (pre-validation) output values

Output:

{
  artifacts: Array<{
    artifactId: string;   // "<runId>:<nodeId>:<iteration>"
    kind: "node-output";
    runId: string;
    nodeId: string;
    iteration: number;
    label: string | null;
    state: string;
    outputTable: string | null;
    source: "cache" | "output-table" | "none";
    cacheKey: string | null;
    value: unknown | null;
    rawValue?: unknown | null;   // only when includeRaw=true
  }>;
}

Only nodes with an outputTable and a non-none output source are included.
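The inclusion rule and the artifactId format can be expressed in a few lines. This is a sketch for illustration: NodeOutput, artifactIdFor, and listArtifactIds are hypothetical names, and the code only mirrors the rule as stated here rather than the server's implementation.

```typescript
// Minimal per-node view, using field names from the output shape above.
interface NodeOutput {
  runId: string;
  nodeId: string;
  iteration: number;
  outputTable: string | null;
  source: "cache" | "output-table" | "none";
}

// Build an ID in the documented "<runId>:<nodeId>:<iteration>" format.
function artifactIdFor(node: NodeOutput): string {
  return `${node.runId}:${node.nodeId}:${node.iteration}`;
}

// Keep only nodes that have an outputTable and a non-"none" output source.
function listArtifactIds(nodes: NodeOutput[]): string[] {
  return nodes
    .filter((n) => n.outputTable !== null && n.source !== "none")
    .map(artifactIdFor);
}
```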

get_chat_transcript

Return the structured agent chat transcript for a run, grouped by attempts. Input:

Parameter	Type	Default	Description
`runId`	`string`	required	Run ID
`all`	`boolean`	`false`	Include all attempts, not just those with known output events
`includeStderr`	`boolean`	`true`	Include stderr messages
`tail`	`number`	—	Return only the last N messages

Output:

{
  runId: string;
  attempts: Array<{
    attemptKey: string;
    nodeId: string;
    iteration: number;
    attempt: number;
    state: string;
    startedAtMs: number;
    finishedAtMs: number | null;
    cached: boolean;
    meta: unknown | null;
  }>;
  messages: Array<{
    id: string;
    attemptKey: string;
    nodeId: string;
    iteration: number;
    attempt: number;
    role: "user" | "assistant" | "stderr";
    stream: "stdout" | "stderr" | null;
    timestampMs: number;
    text: string;
    source: "prompt" | "event" | "responseText";
  }>;
}

Messages are sorted by timestampMs. Use tail to limit context window usage on long transcripts.

get_run_events

Return the raw structured event history for a run with optional filtering. Input:

Parameter	Type	Default	Description
`runId`	`string`	required	Run ID
`afterSeq`	`number`	—	Only events with `seq` greater than this value
`limit`	`number` (1–10000)	`200`	Max events to return
`nodeId`	`string`	—	Filter to events for a specific node
`types`	`string[]`	—	Filter to specific event types (e.g. `["NodeFinished", "NodeFailed"]`)
`sinceTimestampMs`	`number`	—	Only events at or after this timestamp

Output:

{
  runId: string;
  events: Array<{
    runId: string;
    seq: number;
    timestampMs: number;
    type: string;
    payload: unknown | null;
  }>;
}

Paginate via afterSeq: pass the seq of the last received event to fetch the next page.
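Pagination with afterSeq can be sketched as a loop that feeds the last seq back into the next call. Assumptions in this example: GetRunEvents is a hypothetical function that invokes the tool and returns its data, events arrive in ascending seq order, and a page shorter than limit means there is nothing more to fetch.

```typescript
// Event shape as documented above.
interface RunEvent {
  runId: string;
  seq: number;
  timestampMs: number;
  type: string;
  payload: unknown;
}

interface EventsArgs {
  runId: string;
  limit: number;
  afterSeq?: number;
}

// Hypothetical transport: invokes get_run_events and resolves with its `data`.
type GetRunEvents = (args: EventsArgs) => Promise<{ runId: string; events: RunEvent[] }>;

// Fetch every event for a run, one page at a time.
async function fetchAllEvents(
  getRunEvents: GetRunEvents,
  runId: string,
  limit = 200,
): Promise<RunEvent[]> {
  const all: RunEvent[] = [];
  let afterSeq: number | undefined;
  for (;;) {
    const args: EventsArgs = afterSeq === undefined ? { runId, limit } : { runId, limit, afterSeq };
    const page = await getRunEvents(args);
    all.push(...page.events);
    const last = page.events[page.events.length - 1];
    // Assumed stop condition: a short (or empty) page is the final page.
    if (last === undefined || page.events.length < limit) return all;
    afterSeq = last.seq;
  }
}
```

For a run that is still active, a caller would keep the final afterSeq and poll again later instead of treating the short page as the end of the history.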

Usage Examples

List workflows and start a run

> list_workflows {}

{
  "ok": true,
  "data": {
    "workflows": [
      { "id": "bugfix", "displayName": "bugfix", "entryFile": "./workflows/bugfix.tsx", "sourceType": "user" }
    ]
  }
}

> run_workflow { "workflowId": "bugfix", "prompt": "Fix the auth token expiry bug" }

{
  "ok": true,
  "data": {
    "runId": "smi_abc123",
    "launchMode": "background",
    "status": "running",
    ...
  }
}

Watch until complete

> watch_run { "runId": "smi_abc123", "timeoutMs": 120000 }

{
  "ok": true,
  "data": {
    "reachedTerminal": true,
    "timedOut": false,
    "finalRun": { "status": "waiting-approval", ... }
  }
}

Resolve a pending approval

> list_pending_approvals { "runId": "smi_abc123" }

{
  "ok": true,
  "data": {
    "approvals": [
      { "nodeId": "deploy", "iteration": 0, "nodeLabel": "Deploy to production", ... }
    ]
  }
}

> resolve_approval { "action": "approve", "runId": "smi_abc123", "nodeId": "deploy", "decidedBy": "alice", "note": "Looks good" }

{
  "ok": true,
  "data": {
    "action": "approve",
    "approval": { "status": "approved", "decidedAtMs": 1707500100000, ... },
    "run": { "status": "running", ... }
  }
}

Debug a blocked run

> explain_run { "runId": "smi_abc123" }

{
  "ok": true,
  "data": {
    "diagnosis": {
      "summary": "Run is waiting for a human approval on node 'deploy'.",
      "blockers": [
        {
          "kind": "approval",
          "nodeId": "deploy",
          "reason": "Node requires human approval before proceeding.",
          "unblocker": "Call resolve_approval with action=approve or action=deny."
        }
      ]
    }
  }
}

Revert a failed attempt

> get_node_detail { "runId": "smi_abc123", "nodeId": "analyze" }

{
  "ok": true,
  "data": {
    "detail": {
      "attemptsSummary": { "total": 3, "failed": 2, "succeeded": 1 },
      ...
    }
  }
}

> revert_attempt { "runId": "smi_abc123", "nodeId": "analyze", "attempt": 1 }

{
  "ok": true,
  "data": {
    "success": true,
    "run": { "status": "running", ... }
  }
}

Error Codes

Errors follow the structured envelope. Common codes:

Code	Meaning
`RUN_NOT_FOUND`	No run exists with the given ID
`INVALID_INPUT`	Missing required field, failed validation, or ambiguous approval filter
`WORKFLOW_MISSING_DEFAULT`	Workflow file has no default export
`WORKFLOW_NOT_FOUND`	No workflow matches the given ID
