Every <Task> in Smithers produces a structured output that is validated and persisted to SQLite. This guide explains the full validation flow, from schema definition through auto-retry on failure.

Schema-Driven Output

When you use createSmithers, each Zod schema becomes an output table. The agent must return JSON that matches the schema shape.
import { createSmithers, Task } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers } = createSmithers({
  analysis: z.object({
    summary: z.string(),
    issues: z.array(z.string()),
    risk: z.enum(["low", "medium", "high"]),
  }),
});

const analyst = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "Return JSON matching the schema exactly.",
});

export default smithers((ctx) => (
  <Workflow name="structured-output">
    <Task id="analyze" output="analysis" agent={analyst}>
      {`Analyze this codebase: ${ctx.input.target}.
Return JSON with:
- summary (string)
- issues (string[])
- risk ("low" | "medium" | "high")`}
    </Task>
  </Workflow>
));

The outputSchema Prop

The outputSchema prop provides additional schema information to the agent. When your <Task> children are a React or MDX element, Smithers auto-injects a schema prop containing a JSON example derived from the Zod schema:
import { z } from "zod";

const analysisSchema = z.object({
  summary: z.string(),
  risk: z.enum(["low", "medium", "high"]),
  files: z.array(z.string()),
});

// The AnalysisPrompt component receives props.schema automatically
<Task id="analyze" output="analysis" agent={analyst} outputSchema={analysisSchema}>
  <AnalysisPrompt repo={ctx.input.repoPath} />
</Task>
Inside your prompt component (MDX or TSX), props.schema is a formatted JSON string the agent can reference:
{/* prompts/analysis.mdx */}
Analyze the repository at {props.repo}.

Return JSON matching this schema:
{props.schema}
When the children are a plain string, there is no component to receive a schema prop, so describe the expected shape explicitly in the prompt text. The outputSchema prop still participates in validation and cache key computation.

Validation Flow

When a task executes, Smithers validates the agent’s output through the following steps:
  1. JSON extraction — The agent’s response is parsed for JSON. Smithers tries structured output first, then raw JSON, then code-fenced JSON blocks, and finally balanced-brace extraction. If no JSON is found, a follow-up prompt asks the agent to return just the JSON.
  2. Auto-populated column stripping — Columns that Smithers manages automatically (runId, nodeId, iteration) are stripped from the agent’s response before validation. This means the agent does not need to include these fields, and if it does include them, they are ignored.
  3. Schema validation — The extracted JSON is validated against the Zod schema (if outputSchema is set) and against the Drizzle table schema (column types and nullability).
  4. Auto-retry on failure — If validation fails, Smithers sends up to 2 schema-retry prompts to the agent with the Zod error details appended. This gives the agent a chance to self-correct:
    Your previous response did not match the expected schema.
    Errors:
    - issues: Expected array, received string
    - risk: Invalid enum value. Expected 'low' | 'medium' | 'high', received 'moderate'
    
    Please return valid JSON matching the schema.
    
  5. Persistence — On successful validation, the output is written to the SQLite table with runId, nodeId, and iteration auto-populated.
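The extraction fallbacks and the schema-retry loop described above can be sketched in plain TypeScript. This is an illustrative sketch, not Smithers' internal code: extractJson, balancedBraces, runWithSchemaRetries, and the Validation type are hypothetical names, the agent call is modeled as a synchronous function to keep the example short, and the structured-output path and the "return just the JSON" follow-up prompt are left out.

```typescript
// Hypothetical sketch of steps 1 and 4; names are assumptions, not Smithers APIs.
const FENCE = "`".repeat(3);
const FENCED = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

// Return the first balanced {...} span, ignoring braces inside string literals.
function balancedBraces(text: string): string | undefined {
  const start = text.indexOf("{");
  if (start === -1) return undefined;
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}" && --depth === 0) return text.slice(start, i + 1);
  }
  return undefined;
}

// Try raw JSON, then a code-fenced block, then balanced-brace extraction.
function extractJson(text: string): unknown {
  const candidates = [text.trim(), FENCED.exec(text)?.[1], balancedBraces(text)];
  for (const candidate of candidates) {
    if (candidate === undefined) continue;
    try {
      return JSON.parse(candidate);
    } catch {
      // Not valid JSON; fall through to the next strategy.
    }
  }
  return undefined;
}

type Validation =
  | { ok: true; value: unknown }
  | { ok: false; errors: string[] };

// Ask once, then send up to maxSchemaRetries follow-ups that list the errors.
function runWithSchemaRetries(
  ask: (prompt: string) => string, // a real agent call would be asynchronous
  validate: (value: unknown) => Validation,
  prompt: string,
  maxSchemaRetries = 2,
): unknown {
  let response = ask(prompt);
  for (let attempt = 0; ; attempt++) {
    const result = validate(extractJson(response));
    if (result.ok) return result.value;
    if (attempt >= maxSchemaRetries) {
      throw new Error("Schema validation failed: " + result.errors.join("; "));
    }
    response = ask(
      [
        "Your previous response did not match the expected schema.",
        "Errors:",
        ...result.errors.map((e) => "- " + e),
        "",
        "Please return valid JSON matching the schema.",
      ].join("\n"),
    );
  }
}
```

In this sketch the validate callback stands in for the Zod and table-schema checks of step 3; with the default of two retries, the agent is asked at most three times before the error is thrown.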

Auto-Populated Columns

Every output table has three columns that Smithers manages:
| Column | Type | Description |
| --- | --- | --- |
| runId | string | The current run’s ID |
| nodeId | string | The task’s id prop |
| iteration | integer | The current Ralph loop iteration (0 for non-loop tasks) |
These columns are:
  • Auto-added to the table when using createSmithers (you do not define them in your Zod schema).
  • Stripped from the agent’s response before validation. If the agent returns { "runId": "...", "summary": "..." }, the runId is silently removed.
  • Auto-populated when the row is written to SQLite.
This means your Zod schemas should only describe the business-relevant fields:
// Correct -- only business fields
const analysisSchema = z.object({
  summary: z.string(),
  issues: z.array(z.string()),
});

// The agent only needs to return:
// { "summary": "...", "issues": ["..."] }
// Smithers adds runId, nodeId, and iteration automatically.
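
The strip-then-populate behavior can be pictured with a small sketch. The helper names (stripAutoColumns, toPersistedRow) and the RunContext type are hypothetical and only illustrate the behavior described above; they are not part of the Smithers API.

```typescript
// Hypothetical sketch: names and types here are assumptions for illustration.
const AUTO_COLUMNS: readonly string[] = ["runId", "nodeId", "iteration"];

type Row = Record<string, unknown>;
type RunContext = { runId: string; nodeId: string; iteration: number };

// Drop any managed columns the agent happened to include.
function stripAutoColumns(output: Row): Row {
  const cleaned: Row = {};
  for (const [key, value] of Object.entries(output)) {
    if (!AUTO_COLUMNS.includes(key)) cleaned[key] = value;
  }
  return cleaned;
}

// Build the row to persist: business fields plus the managed columns.
function toPersistedRow(output: Row, ctx: RunContext): Row {
  return { ...stripAutoColumns(output), ...ctx };
}

const persisted = toPersistedRow(
  { runId: "made-up-by-agent", summary: "Looks fine", issues: [] },
  { runId: "run-1", nodeId: "analyze", iteration: 0 },
);
// persisted.runId is "run-1"; the agent-supplied value never reaches the table.
```

Because stripping happens before validation, a schema that only lists business fields never sees the managed columns, whether or not the agent sends them.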

Agent Prompt Tips

Help the agent produce valid output by being explicit in your prompts:
<Task id="analyze" output="analysis" agent={analyst}>
  {`Analyze the following code for security vulnerabilities.

Return a JSON object with these fields:
- summary (string): A one-paragraph overview
- issues (string[]): List of issues found, or empty array if none
- risk ("low" | "medium" | "high"): Overall risk assessment

Important: Return ONLY the JSON object, no additional text.`}
</Task>
Guidelines for effective prompts:
  • List every field with its type and expected format.
  • Use enum values explicitly (e.g., "low" | "medium" | "high").
  • Specify array item shapes for nested objects.
  • Include “or empty array” for optional array fields to avoid null responses.
  • Ask for “ONLY the JSON object” to reduce extraneous text that complicates parsing.
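
One way to apply these guidelines consistently is to generate the field list from a single description map, so the prompt wording does not drift from the schema. The sketch below assumes a hypothetical buildOutputInstructions helper and FieldSpec type; neither is provided by Smithers.

```typescript
// Hypothetical helper: builds the "Return a JSON object..." section of a prompt.
type FieldSpec = { type: string; description: string };

function buildOutputInstructions(fields: Record<string, FieldSpec>): string {
  const lines = Object.entries(fields).map(
    ([name, spec]) => `- ${name} (${spec.type}): ${spec.description}`,
  );
  return [
    "Return a JSON object with these fields:",
    ...lines,
    "",
    "Important: Return ONLY the JSON object, no additional text.",
  ].join("\n");
}

const outputInstructions = buildOutputInstructions({
  summary: { type: "string", description: "A one-paragraph overview" },
  issues: { type: "string[]", description: "List of issues found, or empty array if none" },
  risk: { type: '"low" | "medium" | "high"', description: "Overall risk assessment" },
});
```

The resulting string can be appended to the task-specific instructions inside the template literal passed as the Task's children.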

Static Mode Validation

Tasks without an agent prop are static — the children value is written directly to the database. Static outputs are still validated against the table schema:
// Static payload -- validated against the output table schema
<Task id="config" output="config">
  {{ environment: "production", version: 3 }}
</Task>
If the payload does not match the table schema (wrong types, missing required fields), the task fails immediately without retries.
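
As a rough mental model of the table-schema check, the sketch below validates a payload against a simplified column description covering only type and nullability. ColumnSpec, validateRow, and assertStaticPayload are hypothetical names for illustration; the real check runs against the Drizzle table definition.

```typescript
// Hypothetical sketch of a type and nullability check; not Smithers' implementation.
type ColumnSpec = { type: "text" | "integer" | "json"; notNull: boolean };

function validateRow(
  row: Record<string, unknown>,
  columns: Record<string, ColumnSpec>,
): string[] {
  const errors: string[] = [];
  for (const [name, spec] of Object.entries(columns)) {
    const value = row[name];
    if (value === undefined || value === null) {
      if (spec.notNull) errors.push(`${name}: required`);
      continue;
    }
    if (spec.type === "text" && typeof value !== "string") {
      errors.push(`${name}: expected text`);
    }
    if (spec.type === "integer" && !Number.isInteger(value)) {
      errors.push(`${name}: expected integer`);
    }
  }
  return errors;
}

// Static payloads get a single check: any error fails the task, with no retry.
function assertStaticPayload(
  row: Record<string, unknown>,
  columns: Record<string, ColumnSpec>,
): void {
  const errors = validateRow(row, columns);
  if (errors.length > 0) {
    throw new Error("Static output rejected: " + errors.join("; "));
  }
}

const configColumns: Record<string, ColumnSpec> = {
  environment: { type: "text", notNull: true },
  version: { type: "integer", notNull: true },
};

assertStaticPayload({ environment: "production", version: 3 }, configColumns); // passes
```

There is no agent to re-prompt in static mode, which is why a mismatch surfaces as an immediate failure rather than a retry.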

JSON Mode Columns

For columns that store arrays or complex objects, use Drizzle’s { mode: "json" } option when working with the manual Drizzle API. With createSmithers, Zod arrays and objects are automatically stored as JSON text columns:
// With createSmithers -- automatic
const { Workflow, smithers } = createSmithers({
  analysis: z.object({
    issues: z.array(z.string()), // stored as JSON text automatically
  }),
});

Combining Zod and Drizzle Schemas

When using the manual Drizzle API (without createSmithers), you can pair a Drizzle table with a Zod outputSchema for double validation:
import { sqliteTable, text, integer, primaryKey } from "drizzle-orm/sqlite-core";
import { z } from "zod";

const analysisTable = sqliteTable(
  "analysis",
  {
    runId: text("run_id").notNull(),
    nodeId: text("node_id").notNull(),
    summary: text("summary").notNull(),
    issues: text("issues", { mode: "json" }).$type<string[]>(),
    risk: integer("risk").notNull(),
  },
  (t) => ({
    pk: primaryKey({ columns: [t.runId, t.nodeId] }),
  }),
);

const analysisSchema = z.object({
  summary: z.string(),
  issues: z.array(z.string()),
  risk: z.number().int().min(1).max(10),
});

<Task id="analyze" output={analysisTable} outputSchema={analysisSchema} agent={analyst}>
  Analyze the codebase.
</Task>
Here, outputSchema validates the JSON structure (including the risk range), and the Drizzle table validates column types and nullability.

Next Steps

  • Error Handling — What happens when validation fails after all retries.
  • Patterns — Schema organization conventions for larger projects.
  • Data Model — Required columns and primary key conventions.