Run smithers optimize to generate improved prompts for agent tasks via GEPA, verify the improvement against your eval suite, and save the result as a reusable artifact.
```shell
bunx smithers-orchestrator optimize workflow.tsx \
  --cases evals/smoke.jsonl \
  --suite smoke-gepa \
  --provider cerebras \
  --model gpt-oss-120b \
  --artifact .smithers/optimizations/smoke-gepa.json
```
smithers optimize runs the eval suite twice:
  1. baseline run with the workflow’s current prompts
  2. optimized run with GEPA-generated prompt patches applied
The command writes the artifact only when the optimized score improves by at least --min-improvement. Reports for both runs are written under .smithers/optimizations/reports unless --report-dir is set.
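The write-or-skip decision described above can be sketched as a small predicate. This is only an illustration, not the CLI's implementation: `RunSummary`, `shouldWriteArtifact`, and the `0.05` threshold are hypothetical, and the sketch assumes improvement is measured as the absolute difference between the two scores, which this page does not confirm.

```typescript
// Hypothetical summary of one eval run (field names are illustrative).
interface RunSummary {
  score: number;
  passed: number;
  total: number;
}

// Assumption: "improvement" is the absolute score difference.
// The real CLI may define it differently (for example, as a relative gain).
function shouldWriteArtifact(
  baseline: RunSummary,
  optimized: RunSummary,
  minImprovement: number,
): boolean {
  return optimized.score - baseline.score >= minImprovement;
}

const demoBaseline: RunSummary = { score: 0.1, passed: 0, total: 1 };
const demoOptimized: RunSummary = { score: 1, passed: 1, total: 1 };

// 0.05 is a made-up threshold standing in for --min-improvement.
console.log(shouldWriteArtifact(demoBaseline, demoOptimized, 0.05)); // true
console.log(shouldWriteArtifact(demoBaseline, demoBaseline, 0.05)); // false
```

Under this assumption, an optimized run that merely ties the baseline never produces an artifact for any positive threshold.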

Reuse an artifact

Apply the optimized prompts to future evals with --optimization:
```shell
bunx smithers-orchestrator eval workflow.tsx \
  --cases evals/smoke.jsonl \
  --suite smoke-optimized \
  --optimization .smithers/optimizations/smoke-gepa.json
```
The artifact patches only agent-backed <Task> prompts by nodeId. Workflow structure, output schemas, retries, approvals, and persistence behavior stay unchanged.
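To make "patches only prompts, by nodeId" concrete, here is a minimal sketch of that kind of patching. The `PromptPatch` and `TaskNode` types and the `applyPromptPatches` function are hypothetical; the real artifact schema and Smithers internals are not documented on this page and may look quite different.

```typescript
// Hypothetical shapes, for illustration only.
interface PromptPatch {
  nodeId: string;
  prompt: string;
}

interface TaskNode {
  nodeId: string;
  prompt: string;
  retries: number; // stands in for any non-prompt setting
}

// Replace the prompt of each task whose nodeId has a patch.
// Every other field, and every unpatched task, is returned unchanged.
function applyPromptPatches(tasks: TaskNode[], patches: PromptPatch[]): TaskNode[] {
  const promptByNodeId = new Map<string, string>();
  for (const patch of patches) {
    promptByNodeId.set(patch.nodeId, patch.prompt);
  }
  return tasks.map((task) => {
    const patched = promptByNodeId.get(task.nodeId);
    return patched === undefined ? task : { ...task, prompt: patched };
  });
}

const exampleTasks: TaskNode[] = [
  { nodeId: "answer", prompt: "Answer the question.", retries: 2 },
  { nodeId: "review", prompt: "Review the answer.", retries: 0 },
];
const examplePatched = applyPromptPatches(exampleTasks, [
  { nodeId: "answer", prompt: "Answer the question and cite the rubric." },
]);
console.log(examplePatched.map((task) => task.prompt));
```

Keying the lookup on `nodeId` means a patch for a node that no longer exists is simply ignored in this sketch; whether the real tool warns in that case is not stated here.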

Cerebras improvement demo

The following run shows a baseline failure corrected by a GEPA-generated patch. The baseline prompt did not include the required optimization token, so the eval failed. GEPA, using the Cerebras provider, generated a prompt patch that added the missing requirement, and the optimized eval passed.
```shell
CEREBRAS_API_KEY=... bunx smithers-orchestrator optimize workflow.tsx \
  --cases evals/opt.jsonl \
  --suite cerebras-proof \
  --provider cerebras \
  --model gpt-oss-120b \
  --artifact artifacts/optimized.json \
  --report-dir artifacts/reports \
  --format json
```
Observed result:
```json
{
  "baseline": { "score": 0.1, "passed": 0, "total": 1 },
  "optimized": { "score": 1, "passed": 1, "total": 1 },
  "improved": true,
  "provider": "cerebras",
  "model": "gpt-oss-120b"
}
```

Providers

smithers optimize accepts the same provider vocabulary Smithers uses for agents and accounts:
| Provider | Optimizer API | Required env | Default model |
| --- | --- | --- | --- |
| cerebras | OpenAI-compatible | CEREBRAS_API_KEY | gpt-oss-120b |
| openai-api, openai, openai-sdk, codex | OpenAI-compatible | OPENAI_API_KEY | gpt-5.3-codex |
| anthropic-api, anthropic, anthropic-sdk, claude-code, claude | Anthropic Messages API | ANTHROPIC_API_KEY | claude-opus-4-7 |
| gemini-api, gemini, antigravity | Gemini generateContent API | GEMINI_API_KEY or GOOGLE_API_KEY | gemini-3.1-pro-preview |
| kimi, moonshot | OpenAI-compatible Moonshot API | MOONSHOT_API_KEY | kimi-latest |
| opencode | OpenAI-compatible endpoint | SMITHERS_OPTIMIZER_API_KEY and SMITHERS_OPTIMIZER_BASE_URL | anthropic/claude-sonnet-4-5 |
| pi | OpenAI-compatible endpoint | SMITHERS_OPTIMIZER_API_KEY and SMITHERS_OPTIMIZER_BASE_URL | gpt-5.3-codex |
| amp, forge, openai-compatible | OpenAI-compatible endpoint | SMITHERS_OPTIMIZER_API_KEY and SMITHERS_OPTIMIZER_BASE_URL | pass --model when needed |
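A pre-flight check for the variables in the table can be sketched as follows. It is not part of Smithers: `REQUIRED_ENV` and `missingEnv` are hypothetical names, only one provider alias per row is included, and the environment is passed in as a plain object so the sketch does not depend on any particular runtime.

```typescript
// Required variables per provider, transcribed from the table above.
// Each inner array lists alternatives: any one of them satisfies that requirement.
const REQUIRED_ENV: Record<string, string[][]> = {
  cerebras: [["CEREBRAS_API_KEY"]],
  openai: [["OPENAI_API_KEY"]],
  anthropic: [["ANTHROPIC_API_KEY"]],
  gemini: [["GEMINI_API_KEY", "GOOGLE_API_KEY"]],
  kimi: [["MOONSHOT_API_KEY"]],
  opencode: [["SMITHERS_OPTIMIZER_API_KEY"], ["SMITHERS_OPTIMIZER_BASE_URL"]],
};

// Return a human-readable entry for each requirement that is not met.
// Providers missing from the map yield an empty list in this sketch.
function missingEnv(
  provider: string,
  env: Record<string, string | undefined>,
): string[] {
  const groups = REQUIRED_ENV[provider];
  if (groups === undefined) return [];
  return groups
    .filter((alternatives) => !alternatives.some((name) => Boolean(env[name])))
    .map((alternatives) => alternatives.join(" or "));
}

console.log(missingEnv("gemini", { GOOGLE_API_KEY: "example" })); // []
console.log(missingEnv("opencode", { SMITHERS_OPTIMIZER_API_KEY: "example" }));
// [ "SMITHERS_OPTIMIZER_BASE_URL" ]
```

In a Node or Bun script the second argument would typically be `process.env`.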
The CLI provider names (claude-code, codex, antigravity, gemini, kimi) map to their hosted API equivalents for optimization because GEPA needs a direct model call to propose prompt patches. Providers with no single hosted backend (opencode, pi, amp, forge) are still accepted through a generic OpenAI-compatible endpoint.

--provider heuristic is deterministic and makes no API call, so it is intended for local tests and fixtures. To control the patch it produces, place optimizationHints in each case’s metadata, for example:
```json
{
  "metadata": {
    "optimizationHints": {
      "answer": "Include the exact rubric requirement in the task prompt."
    }
  }
}
```
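If the cases file is generated by a script, hints like the ones above could be attached programmatically. The sketch below assumes a loosely typed case object: `EvalCase`, `withHints`, `toJsonl`, and the `id` field are hypothetical, and only the `metadata.optimizationHints` nesting is taken from this page.

```typescript
// Hypothetical case shape: only metadata.optimizationHints comes from the docs.
type EvalCase = {
  [key: string]: unknown;
  metadata?: { [key: string]: unknown };
};

// Return a copy of the case with optimizationHints set, keeping other metadata.
function withHints(evalCase: EvalCase, hints: Record<string, string>): EvalCase {
  return {
    ...evalCase,
    metadata: { ...(evalCase.metadata ?? {}), optimizationHints: hints },
  };
}

// Serialize cases as JSON Lines: one JSON object per line.
function toJsonl(cases: EvalCase[]): string {
  return cases.map((evalCase) => JSON.stringify(evalCase)).join("\n") + "\n";
}

const hinted = withHints(
  { id: "case-1", metadata: { owner: "docs" } }, // `id` and `owner` are made up
  { answer: "Include the exact rubric requirement in the task prompt." },
);
console.log(toJsonl([hinted]));
```

The `answer` key is copied from the example above; what the keys of `optimizationHints` refer to is not specified here, so treat that as an open assumption.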
Artifacts are Smithers JSON records with baseline score, optimized score, improvement, prompt patches, and linked eval reports.