# Recommended Models
## Codex (gpt-5.3-codex) — Implementation

Codex is the strongest model for writing and modifying code. Use it for:

- Implementing features
- Fixing bugs
- Running and interpreting tests
- Refactoring code
- Fixing review issues
Use reasoning effort `high` by default, and `xhigh` for especially complex tasks (architectural refactors, multi-file changes with tricky dependencies).
## Claude Opus (claude-opus-4-6) — Planning and Review

Claude Opus is the strongest model for reasoning about architecture and evaluating code quality. Use it for:

- Research and codebase exploration
- Planning implementation steps
- Code review
- Report generation
- Orchestration logic and tool calling
## Claude Sonnet (claude-sonnet-4-5-20250929) — Simple Tasks

Sonnet is fast, cheap, and good enough for straightforward tasks. Use it for:

- Simple tool calling (reading files, running commands)
- Lightweight reviews where deep reasoning is not needed
- Report aggregation from structured data
- Tasks where a more expensive model would be wasteful
## Summary Table
| Task Type | Recommended Model | Why |
|---|---|---|
| Implementing code | Codex | Strongest at code generation |
| Reviewing code | Claude Opus + Codex (parallel) | Two models catch more issues |
| Research and planning | Claude Opus | Strongest at architectural reasoning |
| Running tests / validation | Codex | Good at interpreting build output |
| Simple tool calls | Claude Sonnet | Fast, cheap, sufficient |
| Report generation | Claude Sonnet or Opus | Depends on complexity |
| Ticket discovery | Codex or Claude Opus | Both work well for codebase analysis |
## CLI Agents vs AI SDK Agents

Smithers supports two ways to run each model:

### CLI Agents (subscription-based)

Use `ClaudeCodeAgent` and `CodexAgent` when you have a Claude Code or Codex subscription. The agent runs as a subprocess using the CLI binary, which provides its native tool ecosystem (file editing, shell access, etc.).
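As a minimal sketch (the package name, import path, and constructor options are assumptions, not the documented Smithers API):

```ts
// Hypothetical sketch: import path and option names are assumptions.
import { ClaudeCodeAgent, CodexAgent } from "smithers";

// Each agent spawns its CLI binary as a subprocess, so the model gets
// the CLI's native tools (file editing, shell access, etc.).
const claude = new ClaudeCodeAgent({ model: "claude-opus-4-6" });
const codex = new CodexAgent({ model: "gpt-5.3-codex" });
```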
### AI SDK Agents (API billing)

Use `ToolLoopAgent` from the `ai` package when you want API billing instead of a subscription, or when you want sandboxed tools from Smithers:
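A sketch under stated assumptions: `ToolLoopAgent` is taken from the text above, the model provider is the AI SDK's `@ai-sdk/anthropic` package, and the `tools` option plus the `sandboxTools` import are guesses at the Smithers API:

```ts
// Sketch only: the `tools` option shape and the `sandboxTools` import
// are assumptions about the Smithers API.
import { ToolLoopAgent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { sandboxTools } from "smithers"; // assumed export

const claude = new ToolLoopAgent({
  model: anthropic("claude-opus-4-6"), // billed per token via the API
  tools: sandboxTools(),               // Smithers' sandboxed tools
});
```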
### Dual-Agent Setup

The recommended pattern is to define both CLI and API versions and switch with an environment variable:
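A minimal sketch of the switch, building on the two definitions above; the `SMITHERS_USE_API` variable name and the constructor options are illustrative assumptions:

```ts
// Illustrative sketch: the SMITHERS_USE_API variable name and the
// constructor options are assumptions.
import { ClaudeCodeAgent } from "smithers";
import { ToolLoopAgent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

// CLI agent bills against the subscription; the API agent bills per token.
export const claude = process.env.SMITHERS_USE_API === "1"
  ? new ToolLoopAgent({ model: anthropic("claude-opus-4-6") })
  : new ClaudeCodeAgent({ model: "claude-opus-4-6" });
```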
## Assigning Models to Steps

In a typical workflow with a review loop, assign models by their strengths (a sketch follows the table):

| Step | Agent | Reasoning |
|---|---|---|
| Discover | codex | Good at codebase analysis and structured output |
| Research | claude | Strong at finding patterns and synthesizing information |
| Plan | claude | Best at architectural reasoning |
| Implement | codex | Strongest at writing code |
| Validate | codex | Good at running and interpreting tests |
| Review (parallel) | claude + codex | Two models catch different issue types |
| ReviewFix | codex | Fixing code is implementation work |
| Report | claude | Good at summarization |
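The table reads as a step-to-agent map. The object below is illustrative only: the step names mirror the table, but the shape of a Smithers workflow definition is an assumption, and `claude` and `codex` are the agents defined above.

```ts
// Illustrative mapping only; not the documented Smithers interface.
const stepAgents = {
  discover:  codex,           // codebase analysis, structured output
  research:  claude,          // finding patterns, synthesizing information
  plan:      claude,          // architectural reasoning
  implement: codex,           // strongest at writing code
  validate:  codex,           // running and interpreting tests
  review:    [claude, codex], // run in parallel; catch different issue types
  reviewFix: codex,           // fixing code is implementation work
  report:    claude,          // summarization
};
```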
## Codex Reasoning Effort

The `model_reasoning_effort` config controls how much thinking Codex does. Higher effort produces better results but is slower and more expensive (a configuration sketch follows the table):
| Level | Use when |
|---|---|
| `medium` | Simple, well-defined changes with clear instructions |
| `high` | Default. Most implementation and review tasks |
| `xhigh` | Complex architectural changes, multi-file refactors, tricky edge cases |
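As a sketch, assuming `CodexAgent` accepts a `config` object that is forwarded to the Codex CLI (an assumption; only the `model_reasoning_effort` key is named above):

```ts
// Sketch: the `config` option and its pass-through behavior are
// assumptions; `model_reasoning_effort` is the key named above.
import { CodexAgent } from "smithers";

const codexXhigh = new CodexAgent({
  model: "gpt-5.3-codex",
  config: { model_reasoning_effort: "xhigh" }, // default: "high"
});
```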
## Next Steps
- Implement-Review Loop — The recommended review loop pattern.
- CLI Agents — Full reference for `ClaudeCodeAgent`, `CodexAgent`, `GeminiAgent`.
- Built-in Tools — Sandboxed tools for AI SDK agents.