Run bunx smithers-orchestrator optimize to generate improved prompts for agent tasks via GEPA, verify the improvement against your eval suite, and save the result as a reusable artifact.
bunx smithers-orchestrator optimize workflow.tsx \
  --cases evals/smoke.jsonl \
  --suite smoke-gepa \
  --provider cerebras \
  --model zai-glm-4.7 \
  --artifact .smithers/optimizations/smoke-gepa.json
bunx smithers-orchestrator optimize runs the eval suite twice:
  1. baseline run with the workflow’s current prompts
  2. optimized run with GEPA-generated prompt patches applied
The command writes the artifact only when the optimized score improves by at least --min-improvement. Reports for both runs are written under .smithers/optimizations/reports unless --report-dir is set.
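The gating rule above can be sketched as a small predicate. This is an illustration only: `RunSummary` and `shouldWriteArtifact` are hypothetical names, and the sketch assumes --min-improvement is compared against the absolute difference between the two scores, which the text above does not spell out.

```typescript
// Hypothetical types and helper for illustration; not part of the Smithers API.
interface RunSummary {
  score: number;
  passed: number;
  total: number;
}

// Assumption: --min-improvement is an absolute difference between the two scores.
function shouldWriteArtifact(
  baselineRun: RunSummary,
  optimizedRun: RunSummary,
  minImprovement: number,
): boolean {
  return optimizedRun.score - baselineRun.score >= minImprovement;
}

const baselineRun: RunSummary = { score: 0.5, passed: 1, total: 2 };
const optimizedRun: RunSummary = { score: 1, passed: 2, total: 2 };

console.log(shouldWriteArtifact(baselineRun, optimizedRun, 0.1)); // gain of 0.5 clears the threshold
console.log(shouldWriteArtifact(baselineRun, baselineRun, 0.1)); // no gain, so no artifact
```

Under this reading, a run that only matches the baseline never produces an artifact when the threshold is positive.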

Reuse an artifact

Apply the optimized prompts to future evals with --optimization:
bunx smithers-orchestrator eval workflow.tsx \
  --cases evals/smoke.jsonl \
  --suite smoke-optimized \
  --optimization .smithers/optimizations/smoke-gepa.json
The artifact patches only agent-backed <Task> prompts by nodeId. Workflow structure, output schemas, retries, approvals, and persistence behavior stay unchanged.
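As a rough mental model of patching by nodeId, the sketch below overlays patches on a table of prompts and ignores any patch whose nodeId has no matching task. The `PromptPatch` shape and the `applyPromptPatches` helper are hypothetical and only illustrate the behavior described above; the real artifact format may differ.

```typescript
// Hypothetical patch shape; the real artifact format may differ.
interface PromptPatch {
  nodeId: string;
  prompt: string;
}

// Returns a new prompt table with patches applied to matching node IDs only.
function applyPromptPatches(
  prompts: Record<string, string>,
  patches: PromptPatch[],
): Record<string, string> {
  const patched: Record<string, string> = { ...prompts };
  for (const patch of patches) {
    // Skip patches that target a node the workflow does not define.
    if (Object.prototype.hasOwnProperty.call(prompts, patch.nodeId)) {
      patched[patch.nodeId] = patch.prompt;
    }
  }
  return patched;
}

const currentPrompts: Record<string, string> = {
  answer: "Answer the question.",
  review: "Review the answer.",
};

const patchedPrompts = applyPromptPatches(currentPrompts, [
  { nodeId: "answer", prompt: "Answer the question and cite the rubric." },
  { nodeId: "missing", prompt: "Ignored: no such node." },
]);

console.log(patchedPrompts);
```

Only the prompt text for a matching node changes; the input table is left untouched and nothing else about the workflow is modeled here.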

Cerebras improvement demo

The following run shows a GEPA-generated patch correcting a baseline failure. The baseline prompt omitted the required optimization token, so the eval failed. GEPA, running on Cerebras, generated a prompt patch that added the missing requirement, and the optimized eval passed.
CEREBRAS_API_KEY=... bunx smithers-orchestrator optimize workflow.tsx \
  --cases evals/opt.jsonl \
  --suite cerebras-proof \
  --provider cerebras \
  --model zai-glm-4.7 \
  --artifact artifacts/optimized.json \
  --report-dir artifacts/reports \
  --format json
Observed result:
{
  "baseline": { "score": 0.1, "passed": 0, "total": 1 },
  "optimized": { "score": 1, "passed": 1, "total": 1 },
  "improved": true,
  "provider": "cerebras",
  "model": "zai-glm-4.7"
}
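A script could consume the --format json output along these lines. The field names mirror the observed result above; the `OptimizeResult` interface and the `summarize` helper are assumptions made for this sketch, and the real output may carry additional fields.

```typescript
// Field names follow the observed result above; the interfaces are a sketch.
interface RunSummary {
  score: number;
  passed: number;
  total: number;
}

interface OptimizeResult {
  baseline: RunSummary;
  optimized: RunSummary;
  improved: boolean;
  provider: string;
  model: string;
}

// Hypothetical helper: turns the JSON report into a one-line summary.
function summarize(json: string): string {
  const result = JSON.parse(json) as OptimizeResult;
  const delta = result.optimized.score - result.baseline.score;
  const verdict = result.improved ? "improved" : "not improved";
  return (
    `${result.provider}/${result.model}: ${verdict}, score delta ${delta.toFixed(2)} ` +
    `(${result.baseline.passed}/${result.baseline.total} -> ` +
    `${result.optimized.passed}/${result.optimized.total} cases passed)`
  );
}

const observedReport = JSON.stringify({
  baseline: { score: 0.1, passed: 0, total: 1 },
  optimized: { score: 1, passed: 1, total: 1 },
  improved: true,
  provider: "cerebras",
  model: "zai-glm-4.7",
});

console.log(summarize(observedReport));
```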

Providers

bunx smithers-orchestrator optimize accepts the same provider vocabulary Smithers uses for agents and accounts:
| Provider | Optimizer API | Required env | Default model |
| --- | --- | --- | --- |
| cerebras | OpenAI-compatible | CEREBRAS_API_KEY | zai-glm-4.7 |
| openai-api, openai, openai-sdk, codex | OpenAI-compatible | OPENAI_API_KEY | gpt-5.5 |
| anthropic-api, anthropic, anthropic-sdk, claude-code, claude | Anthropic Messages API | ANTHROPIC_API_KEY | claude-fable-5 |
| gemini-api, gemini, antigravity | Gemini generateContent API | GEMINI_API_KEY or GOOGLE_API_KEY | gemini-3.1-pro-preview |
| kimi, moonshot | OpenAI-compatible Moonshot API | MOONSHOT_API_KEY | kimi-k2.6 |
| opencode | OpenAI-compatible endpoint | SMITHERS_OPTIMIZER_API_KEY and SMITHERS_OPTIMIZER_BASE_URL | anthropic/claude-fable-5 |
| pi | OpenAI-compatible endpoint | SMITHERS_OPTIMIZER_API_KEY and SMITHERS_OPTIMIZER_BASE_URL | gpt-5.5 |
| amp, forge, openai-compatible | OpenAI-compatible endpoint | SMITHERS_OPTIMIZER_API_KEY and SMITHERS_OPTIMIZER_BASE_URL | pass --model when needed |
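Before launching a run, a wrapper script might check that the environment variables listed in the provider table above are set. The sketch below transcribes a few rows of that table; `REQUIRED_ENV` and `missingEnv` are hypothetical names, the lookup is deliberately abbreviated, and Smithers may perform its own validation.

```typescript
// Required environment variables for a few providers, transcribed from the table above.
// Each inner array lists alternatives ("or"); every outer entry must be satisfied ("and").
const REQUIRED_ENV: Record<string, string[][]> = {
  cerebras: [["CEREBRAS_API_KEY"]],
  openai: [["OPENAI_API_KEY"]],
  anthropic: [["ANTHROPIC_API_KEY"]],
  gemini: [["GEMINI_API_KEY", "GOOGLE_API_KEY"]],
  kimi: [["MOONSHOT_API_KEY"]],
  opencode: [["SMITHERS_OPTIMIZER_API_KEY"], ["SMITHERS_OPTIMIZER_BASE_URL"]],
};

// Hypothetical helper: returns a description of each unmet requirement.
function missingEnv(
  provider: string,
  env: Record<string, string | undefined>,
): string[] {
  const groups = REQUIRED_ENV[provider];
  if (groups === undefined) {
    return []; // Provider not covered by this abbreviated sketch.
  }
  return groups
    .filter((alternatives) => !alternatives.some((key) => Boolean(env[key])))
    .map((alternatives) => alternatives.join(" or "));
}

console.log(missingEnv("gemini", { GOOGLE_API_KEY: "example" }));
console.log(missingEnv("opencode", { SMITHERS_OPTIMIZER_API_KEY: "example" }));
```

In a real script the second argument would typically be `process.env`; an explicit object is used here so the sketch runs without any runtime-specific globals.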
The CLI provider names (claude-code, codex, antigravity, gemini, kimi) map to their hosted API equivalents for optimization because GEPA needs a direct model call to propose prompt patches. Providers with no single hosted backend (opencode, pi, amp, forge) are still accepted through a generic OpenAI-compatible endpoint.

--provider heuristic is deterministic, makes no API call, and is intended for local tests and fixtures. To control the patch it produces, place optimizationHints in each eval case’s metadata:
{
  "metadata": {
    "optimizationHints": {
      "answer": "Include the exact rubric requirement in the task prompt."
    }
  }
}
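When generating a cases file for the heuristic provider, the metadata shown above can be attached programmatically. In this sketch only the metadata.optimizationHints nesting comes from the example above; the `withOptimizationHints` helper and the `id` field on the case are hypothetical placeholders, since the rest of the case schema is not described here.

```typescript
// An eval case is treated as an opaque JSON object in this sketch.
type EvalCase = Record<string, unknown>;

// Hypothetical helper: returns a copy of the case with optimizationHints
// merged into its metadata, preserving any metadata already present.
function withOptimizationHints(
  evalCase: EvalCase,
  hints: Record<string, string>,
): EvalCase {
  const existing = (evalCase["metadata"] ?? {}) as Record<string, unknown>;
  return {
    ...evalCase,
    metadata: { ...existing, optimizationHints: hints },
  };
}

// "id" is a placeholder field, not a documented part of the case schema.
const hintedCase = withOptimizationHints(
  { id: "case-1", metadata: { owner: "docs" } },
  { answer: "Include the exact rubric requirement in the task prompt." },
);

// One JSON object per line, as a .jsonl cases file expects.
const jsonlLine = JSON.stringify(hintedCase);
console.log(jsonlLine);
```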
Artifacts are Smithers JSON records with baseline score, optimized score, improvement, prompt patches, and linked eval reports.