Error Handling

Smithers provides several mechanisms for handling errors at the task level. You can retry failed tasks, set timeouts, skip tasks conditionally, allow failures without halting the workflow, and route execution through fallback paths using <Branch>.

Retries

Set the retries prop to retry a task when it fails. The value is the number of additional attempts allowed after the initial one, so retries={2} means up to 3 attempts in total:

<Task id="analyze" output="analysis" agent={analyst} retries={2}>
  Analyze the codebase and return structured JSON.
</Task>

Each retry creates a new row in _smithers_attempts. Previous attempts are never overwritten. Between the failure and the next attempt, a NodeRetrying event is emitted. The task is marked failed only after all retries are exhausted.
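The attempt accounting described above can be sketched in plain TypeScript. This is an illustrative model under stated assumptions, not Smithers' implementation: `runWithRetries`, `AttemptRecord`, and the `"succeeded"` state name are hypothetical, and an in-memory array stands in for rows in _smithers_attempts.

```typescript
// Illustrative model of the `retries` prop; not Smithers' actual implementation.
// All names here (AttemptRecord, runWithRetries, "succeeded") are hypothetical.

type AttemptRecord = { attempt: number; ok: boolean; error?: string };

type RetryResult = {
  state: "succeeded" | "failed";
  attempts: AttemptRecord[]; // stands in for rows in _smithers_attempts
  events: string[];
};

function runWithRetries(task: (attempt: number) => void, retries: number): RetryResult {
  const attempts: AttemptRecord[] = [];
  const events: string[] = [];
  const maxAttempts = retries + 1; // retries={2} means up to 3 total attempts

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      task(attempt);
      attempts.push({ attempt, ok: true });
      return { state: "succeeded", attempts, events };
    } catch (err) {
      // Append a new record; earlier attempts are never overwritten.
      attempts.push({ attempt, ok: false, error: String(err) });
      // An event is emitted only when another attempt will follow.
      if (attempt < maxAttempts) events.push("NodeRetrying");
    }
  }
  // Reached only once every attempt has failed.
  return { state: "failed", attempts, events };
}

// A task that fails twice and succeeds on the third attempt.
const flaky = (attempt: number): void => {
  if (attempt < 3) throw new Error(`attempt ${attempt} failed`);
};

const recovered = runWithRetries(flaky, 2);
const exhausted = runWithRetries(flaky, 1);
console.log(recovered.state, recovered.attempts.length); // succeeded 3
console.log(exhausted.state, exhausted.attempts.length); // failed 2
```

With retries set to 2 the flaky task recovers on its third attempt; with retries set to 1 it is marked failed after two recorded attempts.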

Schema validation retries

Schema validation failures have their own retry mechanism, separate from the retries prop. When the agent returns JSON that does not match the output schema, Smithers sends up to 2 follow-up prompts with the validation errors appended. These schema retries happen within a single attempt and do not count against retries. If the schema retries also fail, the attempt fails and the retries mechanism takes over (if configured).
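To make the nesting concrete, here is a minimal sketch of one attempt with schema follow-ups. It is an assumption-laden model rather than Smithers' code: `runSingleAttempt`, `validateAnalysis`, `fakeAgent`, and the follow-up prompt wording are all hypothetical, and a hand-written check stands in for the real output schema.

```typescript
// Illustrative model of schema-validation retries inside ONE attempt.
// Hypothetical names throughout; a hand-written validator stands in for the output schema.

type Validation = { ok: true } | { ok: false; errors: string[] };

const MAX_SCHEMA_FOLLOW_UPS = 2; // "up to 2 follow-up prompts" per the text above

function validateAnalysis(raw: string): Validation {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, errors: ["output is not valid JSON"] };
  }
  const summary = (parsed as { summary?: unknown } | null)?.summary;
  return typeof summary === "string"
    ? { ok: true }
    : { ok: false, errors: ["summary: expected string"] };
}

// Runs a single attempt: one initial prompt plus at most two follow-ups.
function runSingleAttempt(agent: (prompt: string) => string, prompt: string) {
  const prompts: string[] = [];
  let current = prompt;
  for (let call = 0; call <= MAX_SCHEMA_FOLLOW_UPS; call++) {
    prompts.push(current);
    const result = validateAnalysis(agent(current));
    if (result.ok) return { ok: true, prompts };
    // Append the validation errors to the follow-up prompt.
    current = `${prompt}\n\nValidation errors:\n${result.errors.join("\n")}`;
  }
  return { ok: false, prompts }; // the attempt fails; `retries` (if set) takes over
}

// A fake agent that returns the wrong type until it has seen validation errors.
const fakeAgent = (p: string): string =>
  p.includes("Validation errors") ? '{"summary":"looks fine"}' : '{"summary":42}';

const schemaOutcome = runSingleAttempt(fakeAgent, "Analyze the codebase.");
console.log(schemaOutcome.ok, schemaOutcome.prompts.length); // true 2
```

In this model the whole loop belongs to a single attempt, so a task with retries={1} could in the worst case see up to six agent calls: three per attempt across two attempts.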

Timeouts

Set timeoutMs to limit how long a single attempt can take:

<Task id="analyze" output="analysis" agent={analyst} timeoutMs={60_000} retries={1}>
  Analyze the codebase.
</Task>

If the task exceeds the timeout, the attempt fails with a timeout error. If retries is set, the task will retry. This is useful for guarding against agent calls that hang indefinitely.
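The interaction between timeoutMs and retries can be modelled with a simulated clock. The sketch below is illustrative only: `simulateAttempts` is a hypothetical helper, and attempt durations are passed in as data instead of being measured with real timers.

```typescript
// Simulated-clock model of `timeoutMs` combined with `retries`.
// Hypothetical names; durations are supplied as data instead of real timers.

type AttemptOutcome = "ok" | "timeout";

function simulateAttempts(
  durationsMs: number[],
  timeoutMs: number,
  retries: number,
): AttemptOutcome[] {
  const outcomes: AttemptOutcome[] = [];
  // At most retries + 1 attempts are made.
  for (const duration of durationsMs.slice(0, retries + 1)) {
    // The limit applies to each attempt separately, not to the task as a whole.
    const outcome: AttemptOutcome = duration > timeoutMs ? "timeout" : "ok";
    outcomes.push(outcome);
    if (outcome === "ok") break; // no further attempts after a success
  }
  return outcomes;
}

// First attempt hangs for 90 s; the single retry finishes in 20 s.
const timeoutOutcomes = simulateAttempts([90_000, 20_000], 60_000, 1);
console.log(timeoutOutcomes.join(",")); // timeout,ok
```

Without retries, the same first attempt would leave the task failed after a single timeout.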

continueOnFail

By default, when a task fails (after exhausting all retries), the workflow stops. Set continueOnFail to allow subsequent tasks to proceed:

<Task id="optional-lint" output="lint" agent={linter} retries={1} continueOnFail>
  Run lint checks on the codebase.
</Task>

<Task id="report" output="report" agent={reporter}>
  Generate the final report.
</Task>

With continueOnFail, the report task will execute even if optional-lint fails. The failed task’s node state is failed, but the workflow continues. This is useful for non-critical steps like linting, optional analysis passes, or telemetry tasks.
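The difference between the default behavior and continueOnFail can be sketched as a small sequence runner. This is a simplified model, not Smithers' scheduler: `runSequence`, `StepSpec`, and the `"finished"` and `"pending"` state names are hypothetical.

```typescript
// Illustrative model of how a sequence reacts to a failed task.
// Hypothetical names; not Smithers' scheduler.

type StepSpec = { id: string; run: () => void; continueOnFail?: boolean };
type StepState = "pending" | "finished" | "failed";

function runSequence(steps: StepSpec[]): Record<string, StepState> {
  const states: Record<string, StepState> = {};
  for (const step of steps) states[step.id] = "pending";

  for (const step of steps) {
    try {
      step.run();
      states[step.id] = "finished";
    } catch {
      states[step.id] = "failed"; // the node state is failed either way
      if (!step.continueOnFail) break; // default: the workflow stops here
    }
  }
  return states;
}

const failingLint = (): void => {
  throw new Error("lint crashed");
};
const writeReport = (): void => {};

const tolerant = runSequence([
  { id: "optional-lint", run: failingLint, continueOnFail: true },
  { id: "report", run: writeReport },
]);
const strict = runSequence([
  { id: "optional-lint", run: failingLint },
  { id: "report", run: writeReport },
]);
console.log(tolerant, strict);
```

In the tolerant run the report step finishes even though optional-lint is failed; in the strict run the report step never leaves its pending state.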

skipIf

Set skipIf to conditionally skip a task at render time:

<Task
  id="deep-analysis"
  output="analysis"
  agent={analyst}
  skipIf={ctx.input.mode === "quick"}
>
  Run a thorough analysis of the codebase.
</Task>

When skipIf evaluates to true, the task is marked skipped immediately. It will not run even if the condition changes on a later render cycle.

Important: skipIf is evaluated during rendering, not during execution. For tasks that should only run after a prerequisite completes, use conditional rendering with ctx.outputMaybe() instead:

// Preferred: conditional rendering
const analysis = ctx.outputMaybe("analysis", { nodeId: "analyze" });

{analysis ? (
  <Task id="fix" output="fix" agent={fixer}>
    {`Fix these issues: ${analysis.summary}`}
  </Task>
) : null}
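The sticky nature of a skipped task, as described for skipIf, can be modelled as follows. This is a sketch under assumptions, not Smithers' reconciler: `applyRender` and the `"pending"` state name are hypothetical.

```typescript
// Illustrative model: `skipIf` is read at render time, and "skipped" is sticky.
// Hypothetical names; not Smithers' reconciler.

type NodeState = "pending" | "skipped";

function applyRender(
  states: Map<string, NodeState>,
  rendered: { id: string; skipIf: boolean }[],
): void {
  for (const node of rendered) {
    // Already skipped: a later render cannot revive the task.
    if (states.get(node.id) === "skipped") continue;
    states.set(node.id, node.skipIf ? "skipped" : "pending");
  }
}

const nodeStates = new Map<string, NodeState>();
applyRender(nodeStates, [{ id: "deep-analysis", skipIf: true }]); // first render
applyRender(nodeStates, [{ id: "deep-analysis", skipIf: false }]); // condition flipped
console.log(nodeStates.get("deep-analysis")); // skipped
```

This is why a condition that depends on another task's output is better expressed through conditional rendering: the task is simply not mounted until the prerequisite output exists, so it is never marked skipped prematurely.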

Branch for Error Recovery

Use <Branch> to route execution based on the outcome of a previous task. This is the primary pattern for fallback paths:

import { createSmithers, Task, Sequence, Branch } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers } = createSmithers({
  risky: z.object({
    ok: z.boolean(),
    message: z.string(),
  }),
  output: z.object({
    summary: z.string(),
  }),
});

const riskyAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "Attempt the operation. Return JSON with ok (boolean) and message (string).",
});

export default smithers((ctx) => {
  const risky = ctx.outputMaybe("risky", { nodeId: "risky" });
  const ok = risky?.ok ?? false;

  return (
    <Workflow name="error-recovery">
      <Sequence>
        <Task id="risky" output="risky" agent={riskyAgent} retries={2} timeoutMs={30_000}>
          Attempt the operation.
        </Task>

        <Branch
          if={ok}
          then={
            <Task id="summary" output="output">
              {{ summary: `Success: ${risky?.message}` }}
            </Task>
          }
          else={
            <Task id="summary" output="output">
              {{ summary: `Fallback: operation did not succeed` }}
            </Task>
          }
        />
      </Sequence>
    </Workflow>
  );
});

The <Branch> component evaluates its if prop at render time. On the first render, risky is undefined, so ok is false. The fallback does not run at that point, however, because the risky task appears earlier in the <Sequence> and runs first. After risky completes, the workflow re-renders, ok resolves to the actual value, and the appropriate branch is taken.
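The render, execute, re-render cycle described above can be traced with a small simulation. It is a simplified sketch, not the Smithers runtime: `render`, `runWorkflow`, `Outputs`, and `Plan` are hypothetical names, and the risky task is hard-coded to succeed.

```typescript
// Illustrative render/execute loop for the <Branch> example; hypothetical names.

type Outputs = {
  risky?: { ok: boolean; message: string };
  output?: { summary: string };
};
type Plan = { id: string; run: (o: Outputs) => void }[];

// "Render": derive the ordered task list from whatever outputs exist so far.
function render(outputs: Outputs, log: string[]): Plan {
  const ok = outputs.risky?.ok ?? false;
  log.push(`render ok=${ok}`);
  return [
    { id: "risky", run: (o) => { o.risky = { ok: true, message: "done" }; } },
    ok
      ? { id: "summary", run: (o) => { o.output = { summary: `Success: ${o.risky?.message}` }; } }
      : { id: "summary", run: (o) => { o.output = { summary: "Fallback: operation did not succeed" }; } },
  ];
}

function runWorkflow(): { outputs: Outputs; log: string[] } {
  const outputs: Outputs = {};
  const done = new Set<string>();
  const log: string[] = [];
  for (;;) {
    const plan = render(outputs, log);
    // Sequence semantics: run the first task that has not completed yet.
    const next = plan.find((task) => !done.has(task.id));
    if (!next) return { outputs, log };
    next.run(outputs);
    done.add(next.id);
    log.push(`ran ${next.id}`);
  }
}

const branchRun = runWorkflow();
console.log(branchRun.log.join(" | "));
// render ok=false | ran risky | render ok=true | ran summary | render ok=true
console.log(branchRun.outputs.output?.summary); // Success: done
```

The first render sees ok as false, yet the fallback never executes, because the sequence runs risky first and the branch is re-evaluated on the next render.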

Combining Patterns

Here is a workflow that combines multiple error handling patterns:

export default smithers((ctx) => {
  const analysis = ctx.outputMaybe("analysis", { nodeId: "analyze" });
  const lint = ctx.outputMaybe("lint", { nodeId: "lint" });

  return (
    <Workflow name="robust-pipeline">
      <Sequence>
        {/* Retries + timeout for the critical analysis step */}
        <Task id="analyze" output="analysis" agent={analyst} retries={3} timeoutMs={120_000}>
          Analyze the codebase thoroughly.
        </Task>

        {/* Optional lint step -- continues even if it fails */}
        {analysis ? (
          <Task id="lint" output="lint" agent={linter} retries={1} continueOnFail>
            {`Lint the files: ${analysis.filesAnalyzed.join(", ")}`}
          </Task>
        ) : null}

        {/* Skip the detailed report in quick mode */}
        {analysis ? (
          <Task
            id="report"
            output="report"
            agent={reporter}
            skipIf={ctx.input.mode === "quick"}
          >
            {`Generate a detailed report.
Analysis: ${analysis.summary}
Lint results: ${lint?.issues?.join(", ") ?? "lint skipped or failed"}`}
          </Task>
        ) : null}

        {/* Always produce a final summary */}
        {analysis ? (
          <Task id="final" output="output">
            {{ summary: analysis.summary, lintPassed: lint?.passed ?? null }}
          </Task>
        ) : null}
      </Sequence>
    </Workflow>
  );
});

Error Handling Summary

Mechanism	Prop	Effect
Retries	`retries={N}`	Retry up to N times after failure. Each attempt is recorded.
Timeout	`timeoutMs={N}`	Fail the attempt after N milliseconds. Combines with retries.
Continue on fail	`continueOnFail`	Let subsequent tasks run even if this task fails.
Skip	`skipIf={boolean}`	Skip the task at render time. Evaluated once per render cycle.
Branch	`<Branch if={...} then={...} else={...} />`	Route to different tasks based on a condition.
Conditional rendering	`{condition ? <Task /> : null}`	Mount tasks only when prerequisites are available.

Next Steps

Resumability — How failed runs can be resumed after fixing issues.
Debugging — Inspect failed attempts and error details.
Execution Model — How retries and node states work internally.
