Give Agents Big, Coherent Tasks
Every task boundary is a context boundary. When one task ends and another begins, the agent forgets everything from the previous task. It starts fresh with only the prompt you give it and the structured output you pass in. So do not split one logical operation into four tiny tasks. You are not decomposing work; you are destroying context. Instead, give the agent one task and let it use its tools (read, edit, bash) to accomplish everything in one pass. Only split into multiple tasks when the context genuinely changes, you need an explicit checkpoint, or a later task depends on the structured output of an earlier one.
Use Measurable Stop Conditions for Loops
A Loop should stop based on a concrete, measurable signal, not a subjective judgment. Ask yourself: can a machine evaluate this condition without interpretation? Good stop conditions:
- Tests passing (boolean)
- Approval flag from a reviewer (boolean)
- Score exceeding a threshold (number comparison)
- All items in a list processed (array length check)
Always set maxIterations. A loop without a cap is a bug waiting to burn your API budget at 3 AM.
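The idea can be sketched in plain TypeScript, independent of any Smithers API. The `IterationResult` shape, `shouldStop`, and `runLoop` below are hypothetical names for illustration; the point is that the stop condition is a pure predicate over structured data and that the cap is always enforced.

```typescript
// Hypothetical shape of one iteration's structured result (not a Smithers type).
interface IterationResult {
  testsPassed: boolean;
  score: number;
}

// A measurable stop condition: a pure predicate a machine can evaluate.
function shouldStop(result: IterationResult, threshold: number): boolean {
  return result.testsPassed && result.score > threshold;
}

// Runs `step` until the predicate holds or the iteration cap is reached.
function runLoop(
  step: (iteration: number) => IterationResult,
  threshold: number,
  maxIterations: number,
): { iterations: number; stopped: boolean } {
  for (let i = 1; i <= maxIterations; i++) {
    if (shouldStop(step(i), threshold)) {
      return { iterations: i, stopped: true };
    }
  }
  // Cap reached without meeting the condition.
  return { iterations: maxIterations, stopped: false };
}

// Simulated agent whose tests start passing on the third attempt.
const improving = (i: number): IterationResult => ({ testsPassed: i >= 3, score: i * 30 });
const loopOutcome = runLoop(improving, 80, 5);

// Simulated agent that never passes: the cap ends the loop.
const stuck = (_i: number): IterationResult => ({ testsPassed: false, score: 0 });
const cappedOutcome = runLoop(stuck, 80, 5);
```

A condition such as "the code looks good enough" cannot be written as `shouldStop`, which is a quick test of whether it is measurable.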
Ask for Validation in Prompts
Do not assume the agent will run checks without being told. If you need tests, linting, or verification, say so explicitly in the prompt. Agents are literal-minded collaborators: they do what you ask, not what you hope.
Request Structured Reports
Design your structured output schemas to capture the data you need for downstream tasks and human inspection. Every field should be something you can act on programmatically. With structured fields you can branch on criticalIssues > 0, generate summary reports from structured data, track metrics across runs, and feed specific recommendations into a fix task. Free-text summaries give you none of that.
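As a rough illustration, assume a report shape like the one below. Only `criticalIssues` comes from the text; `warnings`, `recommendations`, `decideNextStep`, and `summarize` are hypothetical names, and a real workflow would define the shape as a Zod schema rather than a bare interface.

```typescript
// Hypothetical report shape; only criticalIssues is named in the text above.
interface ReviewReport {
  criticalIssues: number;
  warnings: number;
  recommendations: string[];
}

type NextStep = { kind: "fix"; instructions: string[] } | { kind: "done" };

// Branch on a number, and feed the recommendations into a fix task.
function decideNextStep(report: ReviewReport): NextStep {
  if (report.criticalIssues > 0) {
    return { kind: "fix", instructions: report.recommendations };
  }
  return { kind: "done" };
}

// Track metrics across runs from the same structured data.
function summarize(reports: ReviewReport[]): { runs: number; totalCritical: number } {
  return {
    runs: reports.length,
    totalCritical: reports.reduce((sum, r) => sum + r.criticalIssues, 0),
  };
}

const failing: ReviewReport = {
  criticalIssues: 2,
  warnings: 1,
  recommendations: ["Validate user input", "Handle the null case"],
};
const clean: ReviewReport = { criticalIssues: 0, warnings: 0, recommendations: [] };

const afterFailing = decideNextStep(failing);
const afterClean = decideNextStep(clean);
const totals = summarize([failing, clean]);
```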
Use outputSchema for Type Safety
The outputSchema prop validates the agent’s response against your Zod schema. It does three things:
- Validation: Responses are validated against the schema, with auto-retry on failure.
- Auto-injection: When children are JSX/MDX elements, props.schema is auto-injected with a JSON example.
- Cache key: The schema shape is part of the cache key, so schema changes invalidate stale caches.
If the agent returns { approved: "yes" } instead of { approved: true }, schema validation catches it and retries without burning a full task retry.
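The validate-then-retry behavior can be imitated in a few lines to show what it buys you. This is a sketch only: `parseReview` is a hand-written stand-in for Zod validation, and `validated` is a hypothetical helper, not how Smithers implements its retry.

```typescript
interface Review {
  approved: boolean;
}

// Stand-in for schema validation; a real workflow would use a Zod schema here.
function parseReview(raw: unknown): Review | null {
  if (typeof raw === "object" && raw !== null) {
    const approved = (raw as Record<string, unknown>).approved;
    if (typeof approved === "boolean") return { approved };
  }
  return null;
}

// Re-asks `produce` until the response validates or attempts run out.
function validated(
  produce: (attempt: number) => unknown,
  maxAttempts: number,
): { value: Review; attempts: number } {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const parsed = parseReview(produce(attempt));
    if (parsed !== null) return { value: parsed, attempts: attempt };
  }
  throw new Error(`response failed validation after ${maxAttempts} attempts`);
}

// First response has the wrong type, second one is valid.
const cannedResponses: unknown[] = [{ approved: "yes" }, { approved: true }];
const reviewResult = validated((attempt) => cannedResponses[attempt - 1], 2);
```

The malformed first response never reaches downstream code, so nothing has to guard against a string where a boolean was expected.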
Mark Side-Effect Tools
This is the single most important thing to get right with custom tools. If your tool mutates external state (calling an API, writing to a database, sending an email, creating a PR), you must set sideEffect: true on the tool definition. If the mutation is not safe to repeat, also set idempotent: false.
Why this matters: Smithers retries failed tasks. Without sideEffect: true, Smithers treats your tool as a pure read and replays it without warning. That means duplicate orders, duplicate emails, double charges. The sideEffect flag is how Smithers knows to warn the agent on retry and provide an idempotency key for deduplication.
File changes in the working tree are different: you can undo them with git reset. The built-in write, edit, and bash tools do not carry the side effect flag for exactly this reason.
The rule: if you cannot undo it with git reset, it is a side effect. Mark it.
For the full reference on sideEffect, idempotent, and ctx.idempotencyKey, see defineTool.
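To see why an idempotency key makes a retry safe, consider this sketch. It does not use the defineTool API: `EmailService` and `sendEmailTool` are hypothetical, and the `key` argument stands in for the ctx.idempotencyKey value mentioned above.

```typescript
// Hypothetical external service that deduplicates on an idempotency key.
class EmailService {
  private sentByKey = new Map<string, string>(); // idempotency key -> message id
  readonly outbox: { to: string; body: string }[] = [];

  send(to: string, body: string, idempotencyKey: string): string {
    const existing = this.sentByKey.get(idempotencyKey);
    if (existing !== undefined) return existing; // replay: no second delivery
    this.outbox.push({ to, body });
    const id = `msg-${this.outbox.length}`;
    this.sentByKey.set(idempotencyKey, id);
    return id;
  }
}

// Hypothetical tool handler; `key` stands in for ctx.idempotencyKey.
function sendEmailTool(service: EmailService, key: string): string {
  return service.send("user@example.com", "Your order shipped", key);
}

const emailService = new EmailService();
const firstCall = sendEmailTool(emailService, "task-42:send-email");
const retriedCall = sendEmailTool(emailService, "task-42:send-email");
```

Both calls return the same message id and only one email lands in the outbox. Without the key, the retry would have delivered a duplicate.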
Design for Resumability
Every long-running workflow will eventually crash. A network blip, a rate limit, a deploy that kills the process. If your workflow cannot resume from where it stopped, you are starting over from scratch every time. Three rules:
- Use deterministic task IDs. No timestamps, no random strings, no array indices. If the ID changes between renders, Smithers treats it as a different task.
- Make tasks idempotent where possible. If a task writes files, design it so re-running produces the same result. For custom tools that call external APIs, mark them as side effects so Smithers handles retries safely.
- Use deps for direct task handoff and ctx.outputMaybe() for orchestration decisions. This keeps prompt wiring terse while preserving explicit control-flow logic.
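For the first rule, one way to get deterministic IDs is to derive them from stable inputs such as a file path. The `taskId` helper below is a hypothetical sketch, not part of Smithers, and its slugging can collide for keys that differ only in punctuation.

```typescript
// Derives a task ID from stable inputs only: a fixed prefix plus a stable key.
// No timestamps, random strings, or array indices are involved.
function taskId(prefix: string, stableKey: string): string {
  const slug = stableKey
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of other characters
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  return `${prefix}:${slug}`;
}

const files = ["src/auth.ts", "src/db.ts"];

// Two renders that see the same files in a different order.
const firstRender = files.map((f) => taskId("review", f));
const secondRender = [...files].reverse().map((f) => taskId("review", f));

// Index-based IDs, by contrast, change when the order changes.
const indexBased = files.map((_f, i) => `review:${i}`);
```

Each file keeps the same ID in both renders, so a resumed run can match completed work to the right task even if the list was reordered.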
Keep Prompts and Schemas Separate from Logic
As your workflow grows, you will want to iterate on prompts without touching orchestration logic, and swap agents without changing schemas. Separate your concerns:
- schemas.ts: All Zod schemas in one file.
- agents.ts: Agent configuration (model, system prompt, tools).
- prompts/: MDX prompt templates.
- workflow.tsx: Composition only (how tasks connect, branch, and hand typed deps into steps).
If you are editing prompt text or schema definitions in workflow.tsx, something is wrong with your factoring.
Set Reasonable Timeouts and Retry Limits
Every agent task should have a timeout. Agent calls can hang due to rate limits, network issues, or unexpectedly long generation. A task without a timeout is a task that might run forever.
- Simple analysis tasks: 30-60 seconds timeout, 1-2 retries.
- Tool-using tasks (read, edit, bash): 2-5 minutes timeout, 1-2 retries.
- Large generation tasks: 5-10 minutes timeout, 0-1 retries.
- Non-critical tasks: add continueOnFail so failures do not block the workflow.
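One way to keep these limits consistent is a small table of presets. In the sketch below, `timeoutMs`, `retries`, `TaskKind`, and `limitsFor` are hypothetical names, so map them onto whatever props your tasks actually accept. The values are picks from within the ranges listed above, not recommendations from the library.

```typescript
type TaskKind = "analysis" | "toolUsing" | "largeGeneration";

interface TaskLimits {
  timeoutMs: number;
  retries: number;
}

// Values chosen from within the ranges listed above; tune them per workload.
const LIMITS: Record<TaskKind, TaskLimits> = {
  analysis: { timeoutMs: 60_000, retries: 2 },
  toolUsing: { timeoutMs: 5 * 60_000, retries: 2 },
  largeGeneration: { timeoutMs: 10 * 60_000, retries: 1 },
};

// Non-critical tasks additionally opt in to continueOnFail.
function limitsFor(
  kind: TaskKind,
  nonCritical = false,
): TaskLimits & { continueOnFail: boolean } {
  return { ...LIMITS[kind], continueOnFail: nonCritical };
}

const lintLimits = limitsFor("toolUsing");
const changelogLimits = limitsFor("largeGeneration", true);
```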
Use Caching for Iterative Development
You are going to iterate on prompts. A lot. Each iteration should not re-run every upstream task that already succeeded.
Example: Complete Review Loop
Here is a full example combining these practices: a review loop with structured output, measurable stop conditions, explicit validation instructions, and reasonable error handling:
Next Steps
- Review Loop — Production pattern for implement, validate, and review cycles.
- Patterns — Project structure and naming conventions.
- Structured Output — Schema validation details.
- Resumability — Deterministic IDs, safe retries, and resume behavior.
- Error Handling — Retries, timeouts, and fallback paths.