Writing a better prompt is the smallest lever. Getting an agent to reliably finish real work is a layered control system, and most of those layers live around the model, not inside the prompt. Smithers owns the outer layers so you can describe an outcome and let the system assemble the rest.

The layers

| Layer | What it controls | Where it lives in Smithers |
| --- | --- | --- |
| Prompt engineering | instructions, examples, role, output format, success criteria | the prompt .mdx a <Task> renders |
| Context engineering | what information, tools, memory, schemas, and state enter the model each step | the workflow graph + memory + typed outputs |
| Harness engineering | runtime, tools, conventions, permissions, retries, fresh-context loops | agents.ts, sandboxes, tools, repoCommands |
| Workflow engineering | order, parallelism, review loops, approvals, resumability, artifacts | the Smithers runtime itself |
| Backpressure | every desired behavior becomes a gate, test, eval, schema, reviewer, approval, or loop condition | Zod outputs, bunx smithers-orchestrator eval, <ReviewLoop>, <Approval>, traces |

The first four shape what the agent can do; backpressure decides whether the work is allowed to move forward. A workflow that just “tries its best and moves on” has no backpressure, and that is where unreliable agents come from.
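
To make the idea of a gate concrete, here is a minimal sketch in plain TypeScript. It does not use the Smithers API; GateResult, Gate, evaluateGates, Patch, and patchGates are hypothetical names invented for illustration. It only shows the shape of the idea: each desired behavior is a check, and the work advances only when every check passes.

```typescript
// Hypothetical types for illustration; none of these names come from Smithers.
type GateResult = { ok: boolean; reason?: string };
type Gate<T> = { name: string; check: (work: T) => GateResult };

// Run every gate and collect the failures.
// The work may advance only when the returned list is empty.
function evaluateGates<T>(work: T, gates: Gate<T>[]): string[] {
  const failures: string[] = [];
  for (const gate of gates) {
    const result = gate.check(work);
    if (!result.ok) {
      failures.push(`${gate.name}: ${result.reason ?? "no reason given"}`);
    }
  }
  return failures;
}

// Assumed example payload: a code change plus the outcome of its test run.
type Patch = { diff: string; testsPassed: boolean };

const patchGates: Gate<Patch>[] = [
  {
    name: "non-empty diff",
    check: (p) => (p.diff.length > 0 ? { ok: true } : { ok: false, reason: "diff is empty" }),
  },
  {
    name: "tests",
    check: (p) => (p.testsPassed ? { ok: true } : { ok: false, reason: "test suite failed" }),
  },
];

const failures = evaluateGates({ diff: "+ fix typo", testsPassed: false }, patchGates);
console.log(failures.length === 0 ? "advance" : `blocked: ${failures.join("; ")}`);
// prints "blocked: tests: test suite failed"
```

A workflow without backpressure corresponds to an empty gate list here: every piece of work advances, whatever its quality.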

Backpressure, concretely

Turn each success criterion into a verification signal, and pick the Smithers primitive that enforces it:
  • Schema: the step must return a given shape. Use a Zod output={...} on the <Task>.
  • Test: generated code must pass its tests. Use a function task that shells out to repoCommands.test.
  • Eval: an answer must satisfy examples or rubrics. Use bunx smithers-orchestrator eval with scorers.
  • Review: another agent (or a human) must approve. Use <ReviewLoop> or <Panel>.
  • Approval: a human signs off before a risky action. Use <Approval>.
  • Dependency: step B can’t start until step A has produced a field. Gate on ctx.outputMaybe(...).
  • Trace: tool calls, retries, and handoffs must be visible. Use observability and bunx smithers-orchestrator events.
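
The dependency gate in the list above can be sketched as follows. This is an illustration in plain TypeScript, not the Smithers ctx API: the Outputs class, its maybe method, PlanOutput, and runDeploy are hypothetical stand-ins, and the real ctx.outputMaybe(...) signature may differ. The point is only that the downstream step checks for the upstream field and refuses to run without it.

```typescript
// Hypothetical stand-in for a store of step outputs; not the Smithers ctx API.
class Outputs {
  private values = new Map<string, unknown>();

  set(step: string, value: unknown): void {
    this.values.set(step, value);
  }

  // Returns undefined instead of throwing when the step has produced nothing yet.
  maybe<T>(step: string): T | undefined {
    return this.values.get(step) as T | undefined;
  }
}

// Assumed output shape of an upstream "plan" step.
type PlanOutput = { branch?: string };

// The downstream step is gated on a field of the upstream output.
function runDeploy(outputs: Outputs): string {
  const plan = outputs.maybe<PlanOutput>("plan");
  if (plan?.branch === undefined) {
    return "blocked: plan.branch is not available";
  }
  return `deploying ${plan.branch}`;
}

const outputs = new Outputs();
console.log(runDeploy(outputs)); // prints "blocked: plan.branch is not available"

outputs.set("plan", { branch: "main" });
console.log(runDeploy(outputs)); // prints "deploying main"
```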
Loop until the gate passes (<Loop> / <Ralph until={…}>) rather than running once and hoping.
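
As a rough sketch of what looping until a gate passes means, the following plain TypeScript retries a step and feeds the gate's failures back into the next attempt. It is not how <Loop> or <Ralph> are implemented; loopUntilPass, checkReview, and stubAgent are hypothetical names, the hand-written validator stands in for a Zod schema, and the agent is a synchronous stub so the example stays self-contained.

```typescript
// Hypothetical helper types; the attempt receives the previous failures as feedback.
type Attempt<T> = (feedback: string[]) => T;
type Check<T> = (work: T) => string[]; // an empty list means the gate passed

// Retry until the check passes or the iteration budget runs out.
// At least one attempt is always made.
function loopUntilPass<T>(
  attempt: Attempt<T>,
  check: Check<T>,
  maxIterations: number,
): { work: T; iterations: number; passed: boolean } {
  let work = attempt([]);
  for (let i = 1; ; i++) {
    const feedback = check(work);
    if (feedback.length === 0) return { work, iterations: i, passed: true };
    if (i >= maxIterations) return { work, iterations: i, passed: false };
    work = attempt(feedback); // fresh attempt, informed by what failed
  }
}

// Hand-written schema gate, standing in for a Zod schema in this sketch.
function checkReview(value: unknown): string[] {
  if (typeof value !== "object" || value === null) return ["output is not an object"];
  const v = value as Record<string, unknown>;
  const failures: string[] = [];
  if (typeof v.summary !== "string") failures.push("summary must be a string");
  if (v.risk !== "low" && v.risk !== "high") failures.push('risk must be "low" or "high"');
  return failures;
}

// Stub agent: omits the risk field until it receives feedback.
const stubAgent: Attempt<unknown> = (feedback) =>
  feedback.length === 0
    ? { summary: "Bump dependencies" }
    : { summary: "Bump dependencies", risk: "low" };

const result = loopUntilPass<unknown>(stubAgent, checkReview, 3);
console.log(result.passed, result.iterations); // prints "true 2"
```

The iteration cap matters as much as the gate: when the budget runs out, the sketch returns passed: false so the caller can escalate instead of silently moving on.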

Smithers does the context engineering for you

You should not need to know any of the above to get a workflow. The create-workflow workflow is the entry point to the “context engineering for you” layer:
bunx smithers-orchestrator workflow run create-workflow \
  --prompt "Watch a landing request and auto-land it once CI is green"
It clarifies your ask into a spec, provisions the docs and skills the work needs (pulls the relevant llms-*.txt, finds the closest examples/ template, and installs worker skills via bunx smithers-orchestrator skills add), designs the graph, pauses for your approval, scaffolds the files, verifies the graph renders, and documents the result. You answer product questions; it produces the prompts, context, components, and gates.

This is the direction Smithers is heading: a concierge that takes a vague script, interrogates it, routes it to the right skills and workflows, adds backpressure, runs as much as it can, and reports legibly. The durable, observable, gated workflow is something you describe rather than hand-build.

Further reading

This approach builds on existing work in the field: Anthropic and OpenAI on prompting and on evaluating the model and the harness together; LangChain and LlamaIndex on context engineering; HumanLayer on harness engineering for coding agents; the Ralph loop on acceptance-driven, fresh-context iteration; and BAML on treating structured output as schema engineering.